In the last post of the series about Sigstore, I will look at the most exciting part of the implementation – ephemeral keys, or what the Sigstore team calls keyless signing. The post will go over the second and third scenarios I outlined in Implementing Containers’ Secure Supply Chain with Sigstore Part 1 – Signing with Existing Keys and go deeper into the experience of validating artifacts and moving artifacts between registries.

Using Sigstore to Sign with Ephemeral Keys

Using Cosign to sign with ephemeral keys is still an experimental feature and will be released in v1.14.0 (see the following PR). Signing with ephemeral keys is relatively easy.

$ COSIGN_EXPERIMENTAL=1 cosign sign 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1
Generating ephemeral keys...
Retrieving signed certificate...
 Note that there may be personally identifiable information associated with this signed artifact.
 This may include the email address associated with the account with which you authenticate.
 This information will be used for signing this artifact and will be stored in public transparency logs and cannot be removed later.
 By typing 'y', you attest that you grant (or have permission to grant) and agree to have this information stored permanently in transparency logs.
Are you sure you want to continue? (y/[N]): y
Your browser will now be opened to:
https://oauth2.sigstore.dev/auth/auth?access_type=online&client_id=sigstore&code_challenge=e16i62r65TuJiklImxYFIr32yEsA74fSlCXYv550DAg&code_challenge_method=S256&nonce=2G9cB5h89SqGwYQG2ey5ODeaxO8&redirect_uri=http%3A%2F%2Flocalhost%3A33791%2Fauth%2Fcallback&response_type=code&scope=openid+email&state=2G9cB7iQ7BSXYQdKKe6xGOY2Rk8
Successfully verified SCT...
Warning: Tag used in reference to identify the image. Consider supplying the digest for immutability.
"562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample" appears to be a private repository, please confirm uploading to the transparency log at "https://rekor.sigstore.dev" [Y/N]: y
tlog entry created with index: 5133131
Pushing signature to: 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample

You are sent to authenticate using OpenID Connect (OIDC) via the browser. I used my GitHub account to authenticate.

Once authenticated, you are redirected back to localhost, where Cosign reads the code query string parameter from the URL and verifies the authentication.

Here is what the redirect URL looks like.

http://localhost:43219/auth/callback?code=z6dghpnzujzxn6xmfltyl6esa&state=2G9dbwwf9zCutX3mNevKWVd87wS

I have also pushed v2 and v3 of the image to the registry and signed them using the approach above. Here is the new state in my registry.

wdt_ID Artifact Tag Artifact Type Artifact Digest
1 v1 Image sha256:9bd049b6b470118cc6a02d58595b86107407c9e288c0d556ce342ea8acbafdf4
2 sha256-9bd049b6b470118cc6a02d58595b86107407c9e288c0d556ce342ea8acbafdf4.sig Signature sha256:483f2a30b765c3f7c48fcc93a7a6eb86051b590b78029a59b5c2d00e97281241
3 v2 Image sha256:d4d59b7e1eb7c55b0811c3dfd3571ab386afbe6d46dfcf83e06343e04ae888cb
4 sha256-d4d59b7e1eb7c55b0811c3dfd3571ab386afbe6d46dfcf83e06343e04ae888cb.sig Signature sha256:8c43d1944b4d0c3f0d7d6505ff4d8c93971ebf38fc60157264f957e3532d8fd7
5 v3 Image sha256:2e19bd9d9fb13c356c64c02c574241c978199bfa75fd0f46b62748f59fb84f0a
6 sha256:2e19bd9d9fb13c356c64c02c574241c978199bfa75fd0f46b62748f59fb84f0a.sig Signature sha256:cc2a674776dfe5f3e55f497080e7284a5bd14485cbdcf956ba3cf2b2eebc915f

If you look at the console output, you will also see that one of the lines mentions tlog in it. This is the index in Rekor transaction log where the signature’s receipt is stored. For the three signatures that I created, the indexes are:

5133131 for v1
5133528 for v2
and 5133614 for v3

That is it! I have signed my images with ephemeral keys, and I have the tlog entries that correspond to the signatures. It is a fast and easy experience.

Verifying Images Signed With Ephemeral Keys

Verifying the images signed with ephemeral keys is built into the Cosign CLI.

$ COSIGN_EXPERIMENTAL=1 cosign verify 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1 | jq . > flasksample-v1-ephemeral-verification.json
Verification for 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1 --
The following checks were performed on each of these signatures:
- The cosign claims were validated
- Existence of the claims in the transparency log was verified offline
- Any certificates were verified against the Fulcio roots.

The outputs from the verification of flasksample:v1, flasksample:v2, and flasksample:v3 are available on GitHub. Few things to note about the output from the verification.

  • The output JSON contains the logIndexas well as the logID, which I did assume I could use to search for the receipts in Rekor. I have some confusion about the logID purpose, but I will go into that a little later!
  • There is a body field that I assume is the actual signature. This JSON field is not yet documented and is hard to know with such a generic name.
  • The type field seems to be a free text field. I would expect it to be something more structured and the values to come from a list of possible and, most importantly, standardized types.

Search and Explore Rekor

The goal of my second scenario – Sign Container Images with Ephemeral Keys from Fulcio is not only to sign images with ephemeral keys but also to invalidate one of the signed artifacts. Unfortunately, documentation and the help output from the commands are scarce. Also, searching on Google how to invalidate a signature in Rekor yields no results. I decided to start exploring the Rekor logs to see if that may help.

There aren’t many commands that you can use in Rekor. The four things you can do are: get records; search by email, SHA or artifact; uploadentry or artifact; and verify entry or artifact. Using the information from the outputs in the previous section, I can get the entries for the three images I signed using the log indexes.

$ rekor-cli get --log-index 5133131 > flasksample-v1-ephemeral-logentry.json
$ rekor-cli get --log-index 5133528 > flasksample-v2-ephemeral-logentry.json
$ rekor-cli get --log-index 5133614 > flasksample-v3-ephemeral-logentry.json

The outputs from the above commands for flasksample:v1, flasksample:v2, and flasksample:v3 are available on GitHub.

I first noted that the log entries are not returned in JSON format by the Rekor CLI. This is different from what Cosign returns and is a bit inconsistent. Second, the log entries outputted by the Rekor CLI are not the same as the verification outputs returned by Cosign. Cosign verification output provides different information than the Rekor log entry. This begs the question: “How does Cosign get this information?” First, though, let’s see what else Rekor can give me.

I can use Rekor search to find all the log entries that I created. This will include the ones for the three images above and, theoretically, everything else I signed.

$ rekor-cli search --email toddysm_dev1@outlook.com
Found matching entries (listed by UUID):
24296fb24b8ad77aaf485c1d70f4ab76518483d5c7b822cf7b0c59e5aef0e032fb5ff4148d936353
24296fb24b8ad77a3f43ac62c8c7bab7c95951d898f2909855d949ca728ffd3426db12ff55390847
24296fb24b8ad77ac2334dfe2759c88459eb450a739f08f6a16f5fd275431fb42c693974af3d5576
24296fb24b8ad77a8f14877c718e228e315c14f3416dfffa8d5d6ef87ecc4f02f6e7ce5b1d5b4e95
24296fb24b8ad77a6828c6f9141b8ad38a3dca4787ab096dca59d0ba68ff881d6019f10cc346b660
24296fb24b8ad77ad54d6e9bb140780477d8beaf9d0134a45cf2ded6d64e4f0d687e5f30e0bb8c65
24296fb24b8ad77a888dc5890ac4f99fc863d3b39d067db651bf3324674b85a62e3be85066776310
24296fb24b8ad77a47fae5af8718673a2ef951aaf8042277a69e808f8f59b598d804757edab6a294
24296fb24b8ad77a7155046f33fdc71ce4e291388ef621d3b945e563cb29c2e3cd6f14b9ba1b3227
24296fb24b8ad77a5fc1952295b69ca8d6f59a0a7cbfbd30163c3a3c3a294c218f9e00c79652d476

Note that the result lists UUIDs that are different from the logID properties in the verification output JSON. You can get log entries using the UUID or the logIndex but not using the logID. The UUIDs are not present in the Cosign output mentioned in the previous section, while the logID is. However, it is unclear what the logID can be used for and why the UUID is not included in the Cosign output.

Rekor search command supposedly allows you to search by artifact and SHA. However, it is not documented what form those need to take. Using the image name or the image SHA yield no results.

$ rekor-cli search --artifact 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample
Error: invalid argument "562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample" for "--artifact" flag: Key: '' Error:Field validation for '' failed on the 'url' tag
$ rekor-cli search --sha sha256:9bd049b6b470118cc6a02d58595b86107407c9e288c0d556ce342ea8acbafdf4
no matching entries found
$ rekor-cli search --sha 9bd049b6b470118cc6a02d58595b86107407c9e288c0d556ce342ea8acbafdf4
no matching entries found

I think the above are the core search scenarios for container images (and other artifacts), but it seems they are either not implemented or not documented. Neither the Rekor GitHub repository, the Rekor public documentation, nor the Rekor Swagger have any more details on the search. I filed an issue for Rekor to ask how the artifacts search works.

Coming back to the main goal of invalidating a signed artifact, I couldn’t find any documentation on how to do that. The only apparent options to invalidate the artifacts are either uploading something to Rekor or removing the signature from Rekor. I looked at all options to upload entries or artifacts to Rekor, but the documentation mainly describes how to sign and upload entries using other types like SSH, X509, etc. It does seem to me that there is no capability in Rekor to say: “This artifact is not valid anymore”.

I thought that looking at how Rekor verifies signatures may help me understand the approach.

Verifying Signatures Using Rekor CLI

I decided to explore how the signatures are verified and reverse engineer the process to understand if an artifact signature can be invalidated. Rekor CLI has a verify command. My assumption was that Rekor’s verify command worked the same as the Cosign verify command. Unfortunately, that is not the case.

$ rekor-cli verify --artifact 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1
Error: invalid argument "562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1" for "--artifact" flag: Key: '' Error:Field validation for '' failed on the 'url' tag
$ rekor-cli verify --entry 24296fb24b8ad77a8f14877c718e228e315c14f3416dfffa8d5d6ef87ecc4f02f6e7ce5b1d5b4e95
Error: invalid argument "24296fb24b8ad77a8f14877c718e228e315c14f3416dfffa8d5d6ef87ecc4f02f6e7ce5b1d5b4e95" for "--entry" flag: Key: '' Error:Field validation for '' failed on the 'url' tag

Unfortunately, due to a lack of documentation and examples, I wasn’t able to figure out how this worked without browsing the code. While that kind of digging is always an option, I would expect an easier experience as an end user.

I was made aware of the following blog post, though. It describes how to handle account compromise. To put it in context, if my GitHub account is compromised, this blog post describes the steps I need to take to invalidate the artifacts. I do have two problems with this proposal:

  1. As you remember, in my scenario, I wanted to invalidate only the flasksample:v2 artifact, and not all artifacts signed with my account. If I follow the proposal in the blog post, I will invalidate everything signed with my GitHub account, which may result in outages.
  2. The proposal relies on the consumer of artifacts to constantly monitor the news for what is valid and what is not; which GitHub account is compromised and which one is not. This is unrealistic and puts too much manual burden on the consumer of artifacts. In an ideal scenario, I would expect the technology to solve this with a proactive way to notify the users if something is wrong rather than expect them to learn reactively.

At this point in time, I will call this scenario incomplete. Yes, I am able to sign with ephemeral keys, but this doesn’t seem unique in this situation. The ease around the key generation is what they seem to be calling attention to, and it does make signing much less intimidating to new users, but I could still generate a new SSH or GPG key every time I need to sign something. Trusting Fulcio’s root does not automatically increase my security – I would even argue the opposite. Making it easier for everybody to sign does not increase security, either. Let’s Encrypt already proved that. While Let’s Encrypt made an enormous contribution to our privacy and helped secure every small business site, the ease, and accessibility with which it works means that every malicious site now also has a certificate. The lock in the address bar is no longer a sign of security. We are all excited about the benefits, but I bet very few of us are also excited for this to help the bad guys. We need to think beyond the simple signing and ensure that the whole end-to-end experience is secure.

I will move to the last scenario now.

Promoting Sigstore Signed Images Between Registries

In the last scenario I wanted to test the promotion of images between registries. Let’s create a v4 of the image and sign it using an ephemeral key. Here are the commands with the omitted output.

$ docker build -t 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v4 .
$ docker push 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v4
$ COSIGN_EXPERIMENTAL=1 cosign sign 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v4

The Rekor log index for the signature is 5253114. I can use Crane to copy the image and the signature from AWS ECR into Azure ACR.

$ crane copy 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v4 tsmacrtestcssc.azurecr.io/flasksample:v4
$ crane copy 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:sha256-aa2690ed4a407ac8152d24017eb6955b01cbb0fc44afe170dadedc30da80640a.sig tsmacrtestcssc.azurecr.io/flasksample:sha256-aa2690ed4a407ac8152d24017eb6955b01cbb0fc44afe170dadedc30da80640a.sig

Also, let’s validate the ephemeral key signature using the image in Azure ACR.

$ COSIGN_EXPERIMENTAL=1 cosign verify tsmacrtestcssc.azurecr.io/flasksample:v4 | jq .
Verification for tsmacrtestcssc.azurecr.io/flasksample:v4 --
The following checks were performed on each of these signatures:
 - The cosign claims were validated
 - Existence of the claims in the transparency log was verified offline
 - Any certificates were verified against the Fulcio roots.

Next, I will sign the image with a key stored in Azure Key Vault and verify the signature.

$ cosign sign --key azurekms://tsm-kv-usw3-tst-cssc.vault.azure.net/sigstore-azure-test-key-ec tsmacrtestcssc.azurecr.io/flasksample:v4
Warning: Tag used in reference to identify the image. Consider supplying the digest for immutability.
Pushing signature to: tsmacrtestcssc.azurecr.io/flasksample
$ cosign verify --key azurekms://tsm-kv-usw3-tst-cssc.vault.azure.net/sigstore-azure-test-key-ec tsmacrtestcssc.azurecr.io/flasksample:v4
Verification for tsmacrtestcssc.azurecr.io/flasksample:v4 --
The following checks were performed on each of these signatures:
 - The cosign claims were validated
 - The signatures were verified against the specified public key
[{"critical":{"identity":{"docker-reference":"tsmacrtestcssc.azurecr.io/flasksample"},"image":{"docker-manifest-digest":"sha256:aa2690ed4a407ac8152d24017eb6955b01cbb0fc44afe170dadedc30da80640a"},"type":"cosign container image signature"},"optional":null}]

Everything worked as expected. This scenario was very smooth, and I was able to complete it in less than a minute.

Summary

So far, I have just scratched the surface of what the Sigstore project could accomplish. While going through the scenarios in these posts, I had a bunch of other thoughts, so I wanted to highlight a few below:

  • Sigstore is built on a good idea to leverage ephemeral keys for signing container images (and other software). However, just the ephemeral keys alone do not provide higher security if there is no better process to invalidate the signed artifacts. With traditional X509 certificates, one can use CRL (Certificate Revocation Lists) or OCSP (Online Certificate Status Protocol) to revoke certificates. Although they are critiqued a lot, the process of invalidating artifacts using ephemeral keys and Sigstore does not seem like an improvement at the moment. I look forward to the improvements in this area as further discussions happen.
  • Sigstore, like nearly all open-source projects, would benefit greatly from better documentation and consistency in the implementation. Inconsistent messages, undocumented features, myriad JSON schemas, multiple identifiers used for different purposes, variable naming conventions in JSONs, and unpredictable output from the command line tools are just a few things that can be improved. I understand that some of the implementation was driven by requirements to work with legacy registries but going forward, that can be simplified by using OCI references. The bigger the project grows, the harder it will become to fix those.
  • The experience that Cosign offers is what makes the project successful. Signing and verifying images using the legacy X.509 and the ephemeral keys is easy. Hiding the complexity behind a simple CLI is a great strategy to get adoption.

I tested Sigstore a year ago and asked myself the question: “How do I solve the SolarWinds exploit with Sigstore?” Unfortunately, Sigstore doesn’t make it easier to solve that problem yet. Having in mind my experience above, I would expect a lot of changes in the future as Sigstore matures.

Unfortunately, there is no viable alternative to Sigstore on the market today. Notary v1 (or Docker Content Trust) proved not flexible enough. Notary v2 is still in the works and has yet to show what it can do. However, the lack of alternatives does not automatically mean that we avoid the due diligence required for a security product of such importance.  Sigstore has had a great start, and this series proves to me that we’ve got a lot of work ahead of us as an industry to solve our software supply chain problems.

In my previous post, Implementing Containers’ Secure Supply Chain with Sigstore Part 1 – Signing with Existing Keys, I went over the Cosign experience of signing images with existing keys. As I concluded there, the signing was easy to achieve, with just a few hiccups here and there. It does seem that Cosign does a lot behind the scenes to make it easy. Though, after looking at the artifacts stored in the registry, I got curious of how the signatures and attestations are saved. Unfortunately, the Cosign specifications are a bit light on details, and it seems they were created after or evolved together with the implementation. Hence, I decided to go with the reverse-engineering approach to understand what is saved in the registry.

At this post’s end, I will validate the signatures using Cosign CLI and complete my first scenario.

The Mystery Behind Cosign Artifacts

First, to be able to store Cosign artifacts in a registry, you need to use an OCI-compliant registry. When Cosign signs a container image, an OCI artifact is created and pushed to the registry. Every OCI artifact has a manifest and layers. The manifest is standardized, but the layers can be anything that can be packed in a tarball. So, Cosign’s signature should be in the layer pushed to the registry, and the manifest should describe the signature artifact. For image signatures, Cosign tags the manifest with a tag that uses the following naming convention sha-<image-sha>.sig.

From the examples in my previous post, when Cosign signed the 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1 image, it created a new artifact and tagged it sha256-9bd049b6b470118cc6a02d58595b86107407c9e288c0d556ce342ea8acbafdf4.sig. Here are the details of the image that was signed.

And here are the details of the signature artifact.

What is Inside Cosign Signature Artifact?

I was curious about what the signature artifact looks like. Using Crane, I can pull the signature artifact. All files are available in my Github test repository.

# Sign into the registry
$ aws ecr get-login-password --region us-west-2 | crane auth login --username AWS --password-stdin 562077019569.dkr.ecr.us-west-2.amazonaws.com

# Pull the signature manifest
$ crane manifest 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:sha256-9bd049b6b470118cc6a02d58595b86107407c9e288c0d556ce342ea8acbafdf4.sig | jq . > flasksample-v1-signature-manifest.json

# Pull the signature artifact as a tarball and unpack it into ./sigstore-signature
$ crane pull 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:sha256-9bd049b6b470118cc6a02d58595b86107407c9e288c0d556ce342ea8acbafdf4.sig flasksample-v1-signature.tar.gz
$ mkdir sigstore-signature
$ tar -xvf flasksample-v1-signature.tar.gz -C ./sigstore-signature
$ cd sigstore-signature/
$ ls -al
total 20
drwxrwxr-x 2 toddysm toddysm 4096 Oct 14 10:08 .
drwxrwxr-x 3 toddysm toddysm 4096 Oct 14 10:07 ..
-rw-r--r-- 1 toddysm toddysm  272 Dec 31  1969 09b3e371137191b52fdd07bdf115824b2b297a2003882e68d68d66d0d35fe1fc.tar.gz
-rw-r--r-- 1 toddysm toddysm  319 Dec 31  1969 manifest.json
-rw-r--r-- 1 toddysm toddysm  248 Dec 31  1969 sha256:00ce5fed483997c24aa0834081ab1960283ee9b2c9d46912bbccc3f9d18e335d
$ tar -xvf 09b3e371137191b52fdd07bdf115824b2b297a2003882e68d68d66d0d35fe1fc.tar.gz 
tar: This does not look like a tar archive

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

Surprisingly to me, the inner tarball (09b3e371137191b52fdd07bdf115824b2b297a2003882e68d68d66d0d35fe1fc.tar.gz) does not seem to be a tarball, although it has the proper extensions. I assumed this was the actual signature blob, but I couldn’t confirm without knowing how to manipulate that archive. Interestingly, opening the file in a simple text editor reveals that it is a plain JSON file with an .tar.gz extension. Looking into the other two files  manifest.json and sha256:00ce5fed483997c24aa0834081ab1960283ee9b2c9d46912bbccc3f9d18e335d it looks like all the files are some kind of manifests, but not very clear what for. I couldn’t find any specification explaining the content of the layer and the meaning of the files inside it.

Interestingly, Cosign offers a tool to download the signature.

$ cosign download signature 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1 | jq . > flasksample-v1-cosign-signature.json

The resulting signature is available in the GitHub repository. The page linked above claims that you can verify the signature in another tool, but I couldn’t immediately find details on how to do that. I decided to leave this for some other time.

The following stand out from this experience.

  • First (and typically a red flag when I am evaluating software), why the JSON file has a tarball file extension? Normally, this is malicious practice, and it’s especially concerning in this context. I am sure it will get fixed, and an explanation will be provided now that I have filed an issue for it.
  • Why are there so many JSON files? Trying to look at all the available Cosign documentation, I couldn’t find any architectural or design papers that explain those decisions. It seems to me that those things got hacked on top of each other when a need arose. There may be GitHub issues discussing those design decisions, but I didn’t find any in a quick search.
  • The signature is not in any of the files that I downloaded. The signature is stored as an OCI annotation in the manifest called dev.cosignproject.cosign/signature. So, do I even need the rest of the artifact?
  • Last but not least, it seems that Cosign tool is the only one that understands how to work with the stored artifacts, which may result in a tool lock-in. Although there is a claim that I can verify the signature in a different tool, without specification, it will be hard to implement such a tool.

What is Inside Cosign Attestation Artifact?

Knowing how the Cosign signatures work, I would expect something similar for the attestations. The mental hierarchy I have built in my mind is the following:

+ Image
  - Image signature
  + Attestation
    - Attestation signature

Unfortunately, this is not the case. There is no separate artifact for the attestation signature. Here are the steps to download the attestation artifact.

# Pull the attestation manifest
$ crane manifest 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:sha256-9bd049b6b470118cc6a02d58595b86107407c9e288c0d556ce342ea8acbafdf4.att | jq . > flasksample-v1-attestation-manifest.json

# Pull the attestation artifact as a tarball and unpack it into ./sigstore-attestation
$ crane pull 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:sha256-9bd049b6b470118cc6a02d58595b86107407c9e288c0d556ce342ea8acbafdf4.sig flasksample-v1-attestation.tar.gz
$ mkdir sigstore-attestation
$ tar -xvf flasksample-v1-attestation.tar.gz -C ./sigstore-attestation/
sha256:728e26b36817753d90d7de8420dacf4fa1bcf746da2b54bb8c59cd047a682198
c9880779c90158a29f7a69f29c492551261e7a3936247fc75f225171064d6d32.tar.gz
ff626be9ff3158e9d2118072cd24481d990a5145d10109affec6064423d74cc4.tar.gz
manifest.json

All attestation files are available in my Github test repository. Knowing what was done for the signature, the results are somewhat what I would have expected. Also, I think I am getting a sense of the design by slowly reverse-engineering it.

The manifest.json file describes the archive. It points to the config sha256:728e26b36817753d90d7de8420dacf4fa1bcf746da2b54bb8c59cd047a682198 file and the two layers c9880779c90158a29f7a69f29c492551261e7a3936247fc75f225171064d6d32.tar.gz and ff626be9ff3158e9d2118072cd24481d990a5145d10109affec6064423d74cc4.tar.gz. I was not sure what the config file sha256:728e26b36817753d90d7de8420dacf4fa1bcf746da2b54bb8c59cd047a682198 is used for, so I decided to ignore it. The two layer JSONs (which were both JSON files, despite the tar.gz extensions) were more interesting, so I decided to dig more into them.

The first thing to note is that the layer JSONs (here and here) for the attestations have a different format from the layer JSON for the signature. While the signature seems to be something proprietary to Cosign, the attestations have a payload type application/vnd.in-toto+json, which hints at something more widely accepted. While this is not an official IANA media type, there is at least in-toto specification published. The payload looks a lot like a Base64 encoded string, so I gave it a try.

# This decodes the SLSA provenance attestation
$ cat ff626be9ff3158e9d2118072cd24481d990a5145d10109affec6064423d74cc4.tar.gz | jq -r .payload | base64 -d | jq . > ff626be9ff3158e9d2118072cd24481d990a5145d10109affec6064423d74cc4.tar.gz.payload.json

# And this decodes the SPDX SBOM
$ cat c9880779c90158a29f7a69f29c492551261e7a3936247fc75f225171064d6d32.tar.gz | jq -r .payload | base64 -d | jq . > c9880779c90158a29f7a69f29c492551261e7a3936247fc75f225171064d6d32.tar.gz.payload.json

Both files are available here and here. If I want to get the SBOM or the SLSA provenance, I need to get the  predicate value from the above JSONs, JSON decode it, and then use it if I want. I didn’t go into that because that was not part of my goals for this experiment.

Note one thing! As you remember from the beginning of the section, I expected to have signatures for the attestations, and I do! They are not where I expected though. The signatures are part of the layer JSONs (the ones with the strange extensions). If you want to extract the signatures, you need to get the signatures value from them.

# Extract the SBOM attestation signature
$ cat c9880779c90158a29f7a69f29c492551261e7a3936247fc75f225171064d6d32.tar.gz | jq -r .signatures > flasksample-v1-sbom-signatures.json

# Extract the SLSA provenance attestation signature
cat ff626be9ff3158e9d2118072cd24481d990a5145d10109affec6064423d74cc4.tar.gz | jq -r .signatures > flasksample-v1-slsa-signature.json

The SBOM signature and the SLSA provenance signature are available on GitHub. For whatever reason, the key ID is left blank.

Here are my takeaways from this experience.

  • Cosign uses a myriad of nonstandard JSON file formats to store signatures and attestations. These still need documentation and to be standardized (except the in-toto one).
  • To get to the data, I need to make several conversations from JSON to Base64 to JSON-encoded, which increases not only the computation power that I need to use but also the probability of errors and bugs, so I would recommend making that simpler.
  • All attestations are stored in a single OCI artifact, and there is no way to retrieve a single attestation based on its type. In my example, if I need to get only the SLSA provenance (785 bytes), I still need to download the SBOM, which is 1.5 MB. This is 1500 times more data than I need. The impact will be performance and bandwidth cost, and at scale, would make this solution the wrong one for me.

Verifying Signatures and Attestations with Cosign

Cosign CLI has commands for signature and attestation verification.

# This one verifies the image signature
$ cosign verify --key awskms:///61c124fb-bf47-4f95-a805-65dda7cd08ae 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1 > sigstore-verify-signature-output.json

Verification for 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1 --
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - The signatures were verified against the specified public key

# This one verifies the image attestations
$ cosign verify-attestation --key awskms:///61c124fb-bf47-4f95-a805-65dda7cd08ae 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1 > sigstore-verify-attestation-output.json

Verification for 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1 --
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - The signatures were verified against the specified public key

The output of the signature verification and the output of the attestation verification are available in Github. The signature verification is as expected. The attestation verification is more compelling, though. It is another JSON file in non-standard JSON. It seems as though the developer took the two JSONs from the blobs and concatenated them, perhaps unintentionally. Below is a screenshot of the output loaded in Visual Studio Code. Notice that there is no comma between lines 10 and 11. Also, the JSON objects are not wrapped in a JSON array. I’ve opened another issue for the non-standard JSON output from the verification command.

With this, I was able to complete my first scenario – Sign Container Images With Existing Keys Stored in a KMS.

Summary

Here is the summary for the second part of my experience:

  • It seems that the Sigstore implementation grew organically, and some specs were written after the implementation was done. Though many pieces are missing specifications and documentation, it will be hard to develop third-party tooling or even maintain the code easily until those are written. The more the project grows, the harder and slower it will be to add new capabilities, and the risk of unintended side effects and even security bugs will grow.
  • There are certain architectural choices that I would question. I have already mentioned the issue with saving all attestations in a single artifact and the numerous proprietary manifests and JSON files. I would also like to dissect the lack of separation between Cosign CLI and Cosign libraries. If they were separate, it would be easier to use them in third-party tooling to verify or sign artifacts.
  • Finally, the above two tell me that there will be a lot of incompatible changes in the product going forward. I would expect this from an MVP but not from a V1 product that I will use in production. If the team wants to move to a cleaner design and more flexible architecture, a lot of the current data formats will change. This, of course, can be hidden behind the Cosign CLI, but that means that I need to take a hard dependency on it. It will be interesting to understand the plan for verification scenarios and how Cosign can be integrated with various policies and admission controllers. Incompatible changes in the verification scenario can, unfortunately, result in production outages.

My biggest concern so far is the inconsistent approach to the implementation and the lax architectural principles and documentation of the design decisions. On the bright side, verifying the signatures and the attestations using the Cosign CLI was very easy and smooth.

In my next post, I will look at the ephemeral key signing scenario and the capabilities to revoke signed artifacts. I will also look at the last scenario that involves the promotion and re-signing of artifacts.

Photo by ammiel jr on Unsplash

Today, the secure supply chain for software is on top of mind for every CISO and enterprise leader. After the President’s Executive Order (EO), many efforts were spun off to secure the supply chain. One of the most prominent is, of course, Sigstore. I looked at Sigstore more than a year ago and was excited about the idea of ephemeral keys. I thought it might solve some common problems with signing. Like, for example, reducing the blast radius if a signing key is compromised or signing identity is stolen.

Over the past twelve months, I’ve spent a lot of time working on a secure supply chain for containers at Microsoft and gained a deep knowledge of the use cases and myriad of scenarios. At the same time, Sigstore gained popularity, and more and more companies started using it to secure their container supply chains. I’ve followed the project development and the growth in popularity. In recent weeks, I decided to take another deep look at the technology and evaluate how it will perform against a few core scenarios to secure container images against supply chain attacks.

This will be a three-part series going over the Sigstore experience for signing containers. In the first part, I will look at the experience of signing with existing long-lived keys as well as adding attestations like SBOMs and SLSA provenance documents. In the second part, I will go deeper into the artifacts created during the signing and reverse-engineer their purpose. In the third part, I will look at the signing experience with short-lived keys as well as promoting signatures between registries.

Before that, though, let’s look at some scenarios that I will use to guide my experiment.

Containers’ Supply Chain Scenarios

Every technology implementation (I believe) should start with user scenarios. Signing container images is not a complete scenario but a part of a larger experience. Below are the experiences that I would like to test as part of my experiment. Also, I will do this using the top two cloud vendors – AWS and Azure.

Sign Container Images With Existing Keys Stored in a KMS

In this scenario, I will sign the images with keys that are already stored in my cloud key management systems (ASKW KMS or Azure Key Vault). The goal here is to enable enterprises to use existing keys and key management infrastructure for signing. Many enterprises already use this legacy signing scenario for their software, so there is nothing revolutionary here except the additional artifacts.

  1. Build a v1 of a test image
  2. Push the v1 of the test image to a registry
  3. Sign the image with a key stored in a cloud KMS
  4. Generate an SBOM for the container image
  5. Sign the SBOM and push it to the registry
  6. Generate an SLSA provenance attestation
  7. Sign the SLSA provenance attestation and push it to the registry
  8. Pull and validate the SBOM
  9. Pull and validate the SLSA provenance attestation

A note! I will cheat with the SLSA provenance attestations because the SLSA tooling works better in CI/CD pipelines than with manual Docker build commands that I will use for my experiment.

Sign Container Images with Ephemeral Keys from Fulcio

In this scenario, I will test how the signing with ephemeral keys (what Sigstore calls keyless signing) improves the security of the containers’ supply chain. Keyless signing is a bit misleading because keys are still involved in generating the signature. The difference is that the keys are generated on-demand by Fulcio and have a short lifespan (I believe 10 min or so). I will not generate SBOMs and SLSA provenance attestations for this second scenario, but you can assume that this may also be part of it in a real-life application. Here is what I will do:

  1. Build a v1 of a test image
  2. Push the v1 of the test image to a registry
  3. Sign the image with an ephemeral key
  4. Build a v2 of the test image and repeat steps 2 and 3 for it
  5. Build a v3 of the test image and repeat steps 2 and 3 for it
  6. Invalidate the signature for v2 of the test image

The premise of this scenario is to test a temporary exploit of the pipeline. This is what happened with SolarWinds Supply Chain Compromise, and I would like to understand how we might be able to use Sigstore to prevent such an attack in the future or how it could reduce the blast radius. I don’t want to invalidate the signatures for v1 and v3 because this will be similar to the traditional signing approach with long-lived keys.

Acquire OSS Container Image and Re-Sign for Internal Use

This is a common scenario that I’ve heard from many customers. They import images from public registries, verify them, scan them, and then want to re-sign them with internal keys before allowing them for use. So, here is what I will do:

  1. Build an image
  2. Push it to the registry
  3. Sign it with an ephemeral key
  4. Import the image and the signature from one registry (ECR) into another (ACR)
    Those steps will simulate importing an image signed with an ephemeral key from an OSS registry like Docker Hub or GitHub Container Registry.
  5. Sign the image with a key from the cloud KMS
  6. Validate the signature with the cloud KMS certificate

Let’s get started with the experience.

Environment Set Up

To run the commands below, you will need to have AWS and Azure accounts. I have already created container registries and set up asymmetric keys for signing in both cloud vendors. I will not go over the steps for setting those up – you can follow the vendor’s documentation for that. I have also set up AWS and Azure CLIs so I can sign into the registries, run other commands against the registries and retrieve the keys from the command line. Once again, you can follow the vendor’s documentation to do that. Now, let’s go over the steps to set up Sigstore tooling.

Installing Sigstore Tooling

To go over the scenarios above, I will need to install the Cosign and Rekor CLIs. Cosign is used to sign the images and also interacts with Fulcio to obtain the ephemeral keys for signing. Rekor is the transparency log that keeps a record of the signatures done by Cosign using ephemeral keys.

When setting up automation for either signing or signature verification, you will need to install Cosign only as a tool. If you need to add or retrieve Rekor records that are not related to signing or attestation, you will need to install Rekor CLI.

You have several options to install Cosign CLI; however, the only documented option to install Rekor CLI is using Golang or building from source (for which you need Golang). One note: the installation instructions for all Sigstore tools are geared toward Golang developers.

The next thing is that on the Sigstore documentation site, I couldn’t find information on how to verify that the Cosign binaries I installed were the ones that Sigstore team produced. And the last thing that I noticed after installing the CLIs is the details I got about the binaries. Running cosign version and rekor-cli version gives the following output.

$ cosign version
  ______   ______        _______. __    _______ .__   __.
 /      | /  __  \      /       ||  |  /  _____||  \ |  |
|  ,----'|  |  |  |    |   (----`|  | |  |  __  |   \|  |
|  |     |  |  |  |     \   \    |  | |  | |_ | |  . `  |
|  `----.|  `--'  | .----)   |   |  | |  |__| | |  |\   |
 \______| \______/  |_______/    |__|  \______| |__| \__|
cosign: A tool for Container Signing, Verification and Storage in an OCI registry.

GitVersion:    1.13.0
GitCommit:     6b9820a68e861c91d07b1d0414d150411b60111f
GitTreeState:  "clean"
BuildDate:     2022-10-07T04:37:47Z
GoVersion:     go1.19.2
Compiler:      gc
Platform:      linux/amd64Sigstore documentation site
$ rekor-cli version
  ____    _____   _  __   ___    ____             ____   _       ___
 |  _ \  | ____| | |/ /  / _ \  |  _ \           / ___| | |     |_ _|
 | |_) | |  _|   | ' /  | | | | | |_) |  _____  | |     | |      | |
 |  _ <  | |___  | . \  | |_| | |  _ <  |_____| | |___  | |___   | |
 |_| \_\ |_____| |_|\_\  \___/  |_| \_\          \____| |_____| |___|
rekor-cli: Rekor CLI

GitVersion:    v0.12.2
GitCommit:     unknown
GitTreeState:  unknown
BuildDate:     unknown
GoVersion:     go1.18.2
Compiler:      gc
Platform:      linux/amd64

Cosign CLI provides details about the build of the binary, Rekor CLI does not. Using the above process to install the binaries may seem insecure, but this seems to be by design, as explained in Sigstore Issue #2300: Verify the binary downloads when installing from .deb (or any other binary release).

Here is the catch, though! I looked at the above experience as a novice user going through the Sigstore documentation. Of course, as with any other technical documentation, this one is incomplete and not updated with the implementation. There is no documentation on how to verify the Cosign binary, but there is one describing how to verify Rekor binaries. If you go to the Sigstore Github organization and specifically to the Cosign and Rekor release pages, you will see that they’ve published the signatures and the SBOMs for both tools. You will also find binaries for Rekor that you can download. So you can verify the signature of the release binaries before installing. Here is what I did for Rekor CLI version that I had downloaded:

$ COSIGN_EXPERIMENTAL=1 cosign verify-blob \
    --cert https://github.com/sigstore/rekor/releases/download/v0.12.2/rekor-cli-linux-amd64-keyless.pem \
    --signature https://github.com/sigstore/rekor/releases/download/v0.12.2/rekor-cli-linux-amd64-keyless.sig \
    https://github.com/sigstore/rekor/releases/download/v0.12.2/rekor-cli-linux-amd64

tlog entry verified with uuid: 38665ab8dc42600de87ed9374e86c83ac0d7d11f1a3d1eaf709a8ba0d9a7e781 index: 4228293
Verified OK

Verifying the Cosign binary is trickier, though, because you need to have Cosign already installed to verify it. Here is the output if you already have Cosign installed and you want to move to a newer version:

$ COSIGN_EXPERIMENTAL=1 cosign verify-blob \
    --cert https://github.com/sigstore/cosign/releases/download/v1.13.0/cosign-linux-amd64-keyless.pem 
    --signature https://github.com/sigstore/cosign/releases/download/v1.13.0/cosign-linux-amd64-keyless.sig 
    https://github.com/sigstore/cosign/releases/download/v1.13.0/cosign-linux-amd64

tlog entry verified with uuid: 6f1153edcc399b22b016709a218127fc7d5e9fb7071cd4812a9847bf13f65190 index: 4639787
Verified OK

If you are installing Cosign for the first time and downloading the binaries from the release page, you can follow a process similar to the one for verifying Rekor releases. I have submitted an issue to update the Cosign documentation with release verification instructions.

I would rate the installation experience no worse than any other tool geared toward hardcore engineers.

Let’s get into the scenarios.

Using Cosign to Sign Container Images with a KMS Key

Here are the two images that I will use for the first scenario:

$ docker images
REPOSITORY                                                 TAG       IMAGE ID       CREATED         SIZE
562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample   v1        b40ba874cb57   2 minutes ago   138MB
tsmacrtestcssc.azurecr.io/flasksample                      v1        b40ba874cb57   2 minutes ago   138MB

Using Cosign With a Key Stored in AWS KMS

Let’s go over the AWS experience first.

# Sign into the registry
$ aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 562077019569.dkr.ecr.us-west-2.amazonaws.com
Login Succeeded

# And push the image after that
$ docker push 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1

Signing the container image with the AWS key was relatively easy. Though, be careful when you omit the host and make sure you add that third backslash; otherwise, you will get errors. Here is what I got on the first attempt, which puzzled me a little.

$ cosign sign --key awskms://61c124fb-bf47-4f95-a805-65dda7cd08ae 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1
Error: signing [562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1]: getting signer: reading key: kms get: kms specification should be in the format awskms://[ENDPOINT]/[ID/ALIAS/ARN] (endpoint optional)
main.go:62: error during command execution: signing [562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1]: getting signer: reading key: kms get: kms specification should be in the format awskms://[ENDPOINT]/[ID/ALIAS/ARN] (endpoint optional)

$ cosign sign --key awskms://arn:aws:kms:us-west-2:562077019569:key/61c124fb-bf47-4f95-a805-65dda7cd08ae 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1
Warning: Tag used in reference to identify the image. Consider supplying the digest for immutability.
Error: signing [562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1]: recursively signing: signing digest: getting fetching default hash function: getting public key: operation error KMS: GetPublicKey, failed to parse endpoint URL: parse "https://arn:aws:kms:us-west-2:562077019569:key": invalid port ":key" after host
main.go:62: error during command execution: signing [562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1]: recursively signing: signing digest: getting fetching default hash function: getting public key: operation error KMS: GetPublicKey, failed to parse endpoint URL: parse "https://arn:aws:kms:us-west-2:562077019569:key": invalid port ":key" after host

Of course, when I typed the URIs correctly, the image was signed, and the signature got pushed to the registry.

$ cosign sign --key awskms:///61c124fb-bf47-4f95-a805-65dda7cd08ae 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1
Warning: Tag used in reference to identify the image. Consider supplying the digest for immutability.
Pushing signature to: 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample

Interestingly, I didn’t get the tag warning when using the Key ID incorrectly. I got it when I used the ARN incorrectly as well as when I used the Key ID correctly. Also, I struggled to interpret the error messages, which made me wonder about the consistency of the implementation, but I will cover more about that in the conclusions.

One nice thing was that I was able to copy the Key ID and the Key ARN and directly paste them into the URI without modification. Unfortunately, this was not the case with Azure Key Vault 🙁 .

Using Cosign to Sign Container Images With Azure Key Vault Key

According to the Cosign documentation, I had to set three environment variables to use keys stored in Azure Key Vault. It looks as if service principal is the only authentication option that Cosign implemented. So, I created one and gave it all the necessary permissions to Key Vault. I’ve also set the required environment variables with the service principal credentials.

As I hinted above, my first attempt to sign with a key stored in Azure Key Vault failed. Unlike the AWS experience, copying the key identifier from the Azure Portal and pasting it into the URI (without the https:// part) won’t do the job.

$ cosign sign --key azurekms://tsm-kv-usw3-tst-cssc.vault.azure.net/keys/sigstore-azure-test-key/91ca3fb133614790a51fc9c04bd96890 tsmacrtestcssc.azurecr.io/flasksample:v1
Error: signing [tsmacrtestcssc.azurecr.io/flasksample:v1]: getting signer: reading key: kms get: kms specification should be in the format azurekms://[VAULT_NAME][VAULT_URL]/[KEY_NAME]
main.go:62: error during command execution: signing [tsmacrtestcssc.azurecr.io/flasksample:v1]: getting signer: reading key: kms get: kms specification should be in the format azurekms://[VAULT_NAME][VAULT_URL]/[KEY_NAME]

If you decipher the help text that you get from the error message: kms specification should be in the format azurekms://[VAULT_NAME][VAULT_URL]/[KEY_NAME], you would assume that there are two ways to construct the URI:

  1. Using the key vault name and the key name like this
    azurekms://tsm-kv-usw3-tst-cssc/sigstore-azure-test-key
    The assumption is that Cosign automatically appends .vault.azure.net at the end.
  2. Using the key vault hostname (not URL or identifier) and the key name like this
    azurekms://tsm-kv-usw3-tst-cssc.vault.azure.net/sigstore-azure-test-key

The first one just hung for minutes and did not complete. I’ve tried it several times, but the behavior was consistent.

$ cosign sign --key azurekms://tsm-kv-usw3-tst-cssc/sigstore-azure-test-key tsmacrtestcssc.azurecr.io/flasksample:v1
Warning: Tag used in reference to identify the image. Consider supplying the digest for immutability.
^C
$

I assume the problem is that it tries to connect to a host named tsm-kv-usw3-tst-cssc , but it seems that it was not timing out. The hostname one brought me a step further. It seems that the call to Azure Key Vault was made, and I got the following error:

$ cosign sign --key azurekms://tsm-kv-usw3-tst-cssc.vault.azure.net/sigstore-azure-test-key tsmacrtestcssc.azurecr.io/flasksample:v1
Warning: Tag used in reference to identify the image. Consider supplying the digest for immutability.
Error: signing [tsmacrtestcssc.azurecr.io/flasksample:v1]: recursively signing: signing digest: signing the payload: keyvault.BaseClient#Sign: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="Forbidden" Message="The user, group or application 'appid=04b07795-xxxx-xxxx-xxxx-02f9e1bf7b46;oid=f4650a81-f57d-4fb3-870c-e84fe859f68a;numgroups=1;iss=https://sts.windows.net/08c1c649-bfdd-439e-8e5b-5ff31c72ce4e/' does not have keys sign permission on key vault 'tsm-kv-usw3-tst-cssc;location=westus3'. For help resolving this issue, please see https://go.microsoft.com/fwlink/?linkid=2125287" InnerError={"code":"ForbiddenByPolicy"}
main.go:62: error during command execution: signing [tsmacrtestcssc.azurecr.io/flasksample:v1]: recursively signing: signing digest: signing the payload: keyvault.BaseClient#Sign: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="Forbidden" Message="The user, group or application 'appid=04b07795-8ddb-461a-bbee-02f9e1bf7b46;oid=f4650a81-f57d-4fb3-870c-e84fe859f68a;numgroups=1;iss=https://sts.windows.net/08c1c649-bfdd-439e-8e5b-5ff31c72ce4e/' does not have keys sign permission on key vault 'tsm-kv-usw3-tst-cssc;location=westus3'. For help resolving this issue, please see https://go.microsoft.com/fwlink/?linkid=2125287" InnerError={"code":"ForbiddenByPolicy"}

Now, this was a very surprising error. And mainly because the AppId from the error message (04b07795-xxxx-xxxx-xxxx-02f9e1bf7b46) didn’t match the AppId (or Client ID) of the environment variable that I have set as per the Cosign documentation.

$ echo $AZURE_CLIENT_ID
a59eaa16-xxxx-xxxx-xxxx-dca100533b89

Note that I masked parts of the IDs for privacy reasons.

My first assumption was that the AppId from the error message was for my user account, with which I signed in using Azure CLI. This assumption turned out to be true. Not knowing the intended behavior, I filed an issue for the Sigstore team to clarify and document the Azure Key Vault authentication behavior. After restarting the terminal (it seems to restart is the norm in today’s software products 😉 ), I was able to move another step forward. Now, having only signed in with the service principal credentials, I got the following error:

$ cosign sign --key azurekms://tsm-kv-usw3-tst-cssc.vault.azure.net/sigstore-azure-test-key tsmacrtestcssc.azurecr.io/flasksample:v1
Warning: Tag used in reference to identify the image. Consider supplying the digest for immutability.
Error: signing [tsmacrtestcssc.azurecr.io/flasksample:v1]: recursively signing: signing digest: signing the payload: keyvault.BaseClient#Sign: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BadParameter" Message="Key and signing algorithm are incompatible. Key https://tsm-kv-usw3-tst-cssc.vault.azure.net/keys/sigstore-azure-test-key/91ca3fb133614790a51fc9c04bd96890 is of type 'RSA', and algorithm 'ES256' can only be used with a key of type 'EC' or 'EC-HSM'."
main.go:62: error during command execution: signing [tsmacrtestcssc.azurecr.io/flasksample:v1]: recursively signing: signing digest: signing the payload: keyvault.BaseClient#Sign: Failure responding to request: StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400 Code="BadParameter" Message="Key and signing algorithm are incompatible. Key https://tsm-kv-usw3-tst-cssc.vault.azure.net/keys/sigstore-azure-test-key/91ca3fb133614790a51fc9c04bd96890 is of type 'RSA', and algorithm 'ES256' can only be used with a key of type 'EC' or 'EC-HSM'."

Apparently, I have generated an incompatible key! Note that RSA keys are not supported by Cosign, as I documented in the following Sigstore documentation issue. After generating a new key, the signing finally succeeded.

$ cosign sign --key azurekms://tsm-kv-usw3-tst-cssc.vault.azure.net/sigstore-azure-test-key-ec tsmacrtestcssc.azurecr.io/flasksample:v1
Warning: Tag used in reference to identify the image. Consider supplying the digest for immutability.
Pushing signature to: tsmacrtestcssc.azurecr.io/flasksample

OK! I was able to get through the first three steps of Scenario 1: Sign Container Images With Existing Keys from KMS. Next, I will add some other artifacts to the image – aka attestations. I will use only one of the cloud vendors for that because I don’t expect differences in the experience.

Adding SBOM Attestation With Cosign

Using Syft, I can generate an SBOM for the container image that I have built. Then I can use Cosign to sign and push the SBOM to the registry. Keep in mind that you need to be signed into the registry to generate the SBOM. Below are the steps to generate the SBOM (nothing to do with Cosign). The SBOM generated is also available in my Github test repo.

# Sign into AWS ERC
$ aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 562077019569.dkr.ecr.us-west-2.amazonaws.com

# Generate the SBOM
$ syft packages 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1 -o spdx-json > flasksample-v1.spdx

Cosign CLI’s help shows the following message how to add an attestation to an image using AWS KMS key.

cosign attest --predicate <FILE> --type <TYPE> --key awskms://[ENDPOINT]/[ID/ALIAS/ARN] <IMAGE>

When I was running this test, there was no explanation of what the --type <TYPE> parameter was. I decided just to give it a try.

$ cosign attest --predicate flasksample-v1.spdx --type sbom --key awskms:///arn:aws:kms:us-west-2:562077019569:key/61c124fb-bf47-4f95-a805-65dda7cd08ae 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1
Error: signing 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1: invalid predicate type: sbom
main.go:62: error during command execution: signing 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1: invalid predicate type: sbom

Trying spdx-json as a type also doesn’t work. There were a couple of places here and here, where Cosign documentation spoke about custom predicate types, but none of the examples showed how to use the parameter. I decided to give it one last try.

$ cosign attest --predicate flasksample-v1.spdx --type "cosign.sigstore.dev/attestation/v1" --key awskms:///arn:aws:kms:us-west-2:562077019569:key/61c124fb-bf47-4f95-a805-65dda7cd08ae 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1
Error: signing 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1: invalid predicate type: cosign.sigstore.dev/attestation/v1
main.go:62: error during command execution: signing 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1: invalid predicate type: cosign.sigstore.dev/attestation/v1

Obviously, this was not yet documented, and it was not clear what values could be provided for it. Here the issue asking to clarify the purpose of the --type <TYPE> parameter. From the documentation examples, it seemed that this parameter could be safely omitted. So, I gave it a shot! Running the command without the parameter worked fine and pushed the attestation to the registry.

$ cosign attest --predicate flasksample-v1.spdx --key awskms:///arn:aws:kms:us-west-2:562077019569:key/61c124fb-bf47-4f95-a805-65dda7cd08ae 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1
Using payload from: flasksample-v1.spdx

One thing that I noticed with the attestation experience is that it pushed a single artifact with .att at the end of the tag. I will come back to this in the next post. Now, let’s push the SLSA attestation for this image.

Adding SLSA Attestation With Cosign

As I mentioned above, I will cheat with the SLSA attestation because I do all those steps manually and docker build doesn’t generate SLSA provenance. I will use this sample for the SLSA provenance attestation.

$ cosign attest --predicate flasksample-v1.slsa --key awskms:///arn:aws:kms:us-west-2:562077019569:key/61c124fb-bf47-4f95-a805-65dda7cd08ae 562077019569.dkr.ecr.us-west-2.amazonaws.com/flasksample:v1
Using payload from: flasksample-v1.slsa

Cosign did something, as we can see on the console as well as in the registry – the digest of the .att artifact changed.

The question, though, is what exactly happened?

In the next post of the series, I will go into detail about what is happening behind the scenes, where I will look deeper at the artifacts created by Cosign.

Summary

To summarize my experience so far, here is what I think.

  • As I mentioned above, the installation experience for the tools is no worse than any other tool targeted to engineers. Improvements in the documentation would be beneficial for the first-use experience, and I filed a few issues to help with that.
  • Signing with a key stored in AWS was easy and smooth. Unfortunately, the implementation followed the same pattern for Azure Key Vault. I think it would be better to follow the patterns for the specific cloud vendor. There is no expectation that each cloud vendor will follow the same naming, URI, etc. patterns; changing those may result in more errors than benefits for the user.
  • While Cosign hides a lot of the complexities behind the scenes, providing some visibility into what is happening will be good. For example, if you create a key in Azure Key Vault, Cosign CLI will automatically create a key that it supports. That will avoid the issue I encountered with the RSA keys, but it may not be the main scenario used in the enterprise.

Next time, I will spend some time looking at the artifacts created by Cosign and understanding their purpose, as well as how to verify those using Cosign and the keys stored in the KMS.

One important step in securing the supply chain for containers is preventing the use of “bad” images. I intentionally use the word “bad” here. For one enterprise, “bad” may mean “vulnerable”; for another, it may mean containing software with an unapproved license; for a third, it may be an image with a questionable signature; possibilities are many. Also, “bad” images may be OK to run in one environment (for example, the local machine of a developer for bug investigation) but not in another (for example, the production cluster). Lastly, the use of “bad” images needs to be verified in many phases of the supply chain – before release for internal use, before build, before deployment, and so on. The decision of whether a container image is “bad” cannot be made in advance and depends on the consumer of the image. One common way to prevent the use of “bad” images is the so-called quarantine pattern. The quarantine pattern prevents an image from being used unless certain conditions are met.

Let’s look at a few scenarios!

Scenarios That Will Benefit from a Quarantine Pattern

Pulling images from public registries and using those in your builds or for deployments is risky. Such public images may have vulnerabilities or malware included. Using them as base images or deploying them to your production clusters bypass any possible security checks, compromising your containers’ supply chain. For that reason, many enterprises ingest the images from a public registry into an internal registry where they can perform additional checks like vulnerability or malware scans. In the future, they may sign the images with an internal certificate, generate a Software Bill of Material (SBOM), add provenance data, or something else. Once those checks are done (or additional data about the image is generated), the image is released for internal use. The public images are in the quarantined registry before they are made available for use in the internal registry with “golden” (or “blessed 🙂 ) images.

Another scenario is where an image is used as a base image to build a new application image. Let’s say that two development teams use the  debian:bullseye-20220228 as a base image for their applications. The first application uses  libc-bin, while the second one doesn’t. libc-bin in that image has several critical and high severity vulnerabilities. The first team may not want to allow the use of the debian:bullseye-20220228 as a base image for their engineers, while the second one may be OK with it because the libc-bin vulnerabilities may not impact their application. You need to selectively allow the image to be used in the second team’s CI/CD pipeline but not in the first.

In the deployment scenario, teams may be OK deploying images with the developers’ signatures in the DEV environments, while the PROD ones should only accept images signed with the enterprise keys.

As you can see, deciding whether an image should be allowed for use or not is not a binary decision, and it depends on the intention of its use. In all scenarios above, an image has to be “quarantined” and restricted for certain use but allowed for another.

Options for Implementing the Quarantine Pattern

So, what are the options to implement the quarantine pattern for container images?

Using a Quarantine Flag and RBAC for Controlling the Access to an Image

This is the most basic but least flexible way to implement the quarantine pattern. Here is how it works!

When the image is pushed to the registry, the image is immediately quarantined, i.e. the quarantine flag on the image is set to  TRUE. A separate role like QuarantineReader is created in the registry and assigned to the actor or system allowed to perform tasks on the image while in quarantine. This role allows the actor or system to pull the image to perform the needed tasks. It also allows changing the quarantine flag from TRUE to FALSE when the task is completed.

The problem with this approach becomes obvious in the scenarios above. Take, for example, the ingestion of public images scenario. In this scenario, you have more than one actor that needs to modify the quarantine flag: the vulnerability scanner, the malware scanner, the signer, etc., before the images are released for internal use. All those tasks are done outside the registry, and some of them may run on a schedule or take a long time to complete (vulnerability and malware scans, for example). All those systems need to be assigned the QuarantineReader role and allowed to flip the flag when done. The problem, though, is that you need to synchronize between those services and only change the quarantine flag from TRUE to FALSE only after all the tasks are completed.

Managing concurrency between tasks is a non-trivial job. This complicates the implementation logic for the registry clients because they need to interact with each other or an external system that synchronizes all tasks and keeps track of their state. Unless you want to implement this logic into the registry, which I would not recommend.

One additional issue with this approach is its extensibility. What if you need to add one more task to the list of things that you want to do on the image before being allowed for use? You need to crack open the code and implement the hooks to the new system.

Lastly, some of the scenarios above are not possible at all. If you need to restrict access to the image to one team and not another, the only way to do it is to assign the QuarantineReader role to the first team. This is not optimal, though, because the meaning of the role is only to assign it so systems that will perform tasks to take the image out of quarantine and not use it for other purposes. Also, if you want to make decisions based on the content of vulnerability reports or SBOMs, this quarantine flag approach is not applicable at all.

Using Declarative Policy Approach

A more flexible approach is to use a declarative policy. The registry can be used to store all necessary information about the image, including vulnerability and malware reports, SBOMs, provenance information, and so on. Well, soon, registries will be able to do that 🙂 If your registry supports ORAS reference types, you can start saving those artifacts right now. In the future, thanks to the Reference Types OCI Working Group, every OCI-compliant registry should be able to do the same. How does that work?

When the image is initially pushed to the registry, no other artifacts are attached to it. Each individual system that needs to perform a task on the image can run on its own schedule. Once it completes the task, it pushes a reference type artifact to the registry with the subject of the image in question. Every time the image is pulled from the registry, the policy evaluates if the required reference artifacts are available; if not, the image is not allowed for use. You can define different policies for different situations as long as the policy engine understands the artifact types. Not only that, but you can even make decisions on the content of the artifacts as long as the policy engine is intelligent enough to interpret those artifacts.

Using the declarative policy approach, the same image will be allowed for use by clients with different requirements. Extending this is as simple as implementing a new policy, which in most cases doesn’t require coding.

Where Should the Policy Engine be Implemented?

Of course, the question that gets raised is where the policy engine should be implemented – as part of the registry or outside of it. I think the registries are intended to store the information and not make policy decisions. Think of a registry as yet another storage system – it has access control implemented but the only business logic it holds is how to manage the data. Besides that, there are already many policy engines available – OPA is the one that immediately comes to mind, that is flexible enough to enable this functionality relatively easily. Policy engines are already available, and different systems are already integrated with them. Adding one more engine as part of the registry will just increase the overhead of managing policies.

Summary

To summarise, using a declarative policy-based approach to control who should and shouldn’t be able to pull an artifact from the registry is more flexible and extensible. Adding capabilities to the policy engines to understand the artifact types and act on those will allow enterprises to develop custom controls tailored to their own needs. In the future, when policy engines can understand the content of each artifact, those policies will be able to evaluate SBOMs, vulnerability reports, and other content. This will open new opportunities to define fine-grained controls for the use of registry artifacts.