In my last post, Implementing Quarantine Pattern for Container Images, I wrote about how to implement a quarantine pattern for container images and how to use policies to prevent the deployment of an image that doesn’t meet certain criteria. In that post, I also mentioned that the quarantine flag (not to be confused with the quarantine pattern 🙂) has certain disadvantages. Since then, Steve Lasker has convinced me that the quarantine flag could be useful in certain scenarios. Many of those scenarios are new and will play a role in the containers’ secure supply chain improvements. Before we look at the scenarios, let’s revisit how the quarantine flag works.

What is the Container Image Quarantine Flag?

As you remember from the previous post, the quarantine flag is set on an image at the time the image is pushed to the registry. The expected workflow is shown in the flow diagram below.

The quarantine flag is set on the image for as long as the Quarantine Processor completes the actions and removes the image from quarantine. We will go into detail about what those actions can be later on in the post. The important thing to remember is that, while in quarantine, the image can be pulled only by the Quarantine Processor. Neither the Publisher nor the Consumer or other actor should be able to pull the image from the registry while in quarantine. The way this is achieved is through special permissions that are assigned to the Quarantine Processor that the other actors do not have. Such permissions can be quarantine pull, and quarantine push, and should allow pulling artifacts from and pushing artifacts to the registry while the image is in quarantine.

Inside the registry, you will have a mix of images that are in quarantine and images that are not. The quarantined ones can only be pulled by the Quarantine Processor, while others can be pulled by anybody who has access to the registry.

Quarantining images is a capability that needs to be implemented in the container registry. Though this is not a standard capability, very few, if any, container registries implement it. Azure Container Registry (ACR) has a quarantine feature that is in preview. As explained in the previous post, the quarantine flag’s limitations are still valid. Mainly, those are:

  • If you need to have more than one Quarantine Processor, you need to figure out a way to synchronize their operations. The Quarantine Processor who completes the last action should remove the quarantine flag.
  • Using asynchronous processing is hard to manage. The Quarantine Processor manages all the actions and changes the flag. If you have an action that requires asynchronous processing, the Quarantine Processor needs to wait for the action to complete to evaluate the result and change the flag.
  • Last, you should not set the quarantine flag once you remove it. If you do that, you may break a lot of functionality and bring down your workloads. The problem is that you do not have the granularity of control over who can and cannot pull the image except for giving them the Quarantine Processor role.

With all that said, though, if you have a single Quarantine Processor, the quarantine flag can be used to prepare the image for use. This can be very helpful in the secure supply chain scenarios for containers, where the CI/CD pipelines do not only push the images to the registries but also produce additional artifacts related to the images. Let’s look at a new build scenario for container images that you may want to implement.

Quarantining Images in the CI/CD Pipeline

The one place where the quarantine flag can prove useful is in the CI/CD pipeline used to produce a compliant image. Let’s assume that for an enterprise, a compliant image is one that is signed, has an SBOM that is also signed, and passed a vulnerability scan with no CRITICAL or HIGH severity vulnerabilities. Here is the example pipeline that you may want to implement.

In this case, the CI/CD agent is the one that plays the Quarantine Processor role and manages the quarantine flag. As you can see, the quarantine flag is automatically set in step 4 when the image is pushed. Steps 5, 6, 7, and 8 are the different actions performed on the image while it is in quarantine. While those actions are not complete, the image should not be pullable by any consumer. For example, some of those actions, like the vulnerability scan, may take a long time to complete. You don’t want a developer to accidentally pull the image before the vulnerability scan is done. If one of those actions fails for any reason, the image should stay in quarantine as non-compliant.

Protecting developers from pulling non-compliant images is just one of the scenarios that a quarantine flag can help with. Another one is avoiding triggers for workflows that are known to fail if the image is not compliant.

Using Events to Trigger Image Workflows

Almost every container registry has an eventing mechanism that allows you to trigger workflows based on events in the registry. Typically, you would use the image push event to trigger the deployment of your image for testing or production. In the above case, if your enterprise has a policy for only deploying images with signatures, SBOMs, and vulnerability reports, your deployment will fail if the deployment is triggered right after step 4. The deployment should be triggered after step 9, which will ensure that all the required actions on the image are performed before the deployment starts.

To avoid the triggering of the deployment, the image push event should be delayed till after step 9. A separate event quarantine push can be emitted in step 4 that can be used to trigger actions related to the quarantine of the image. Note of caution here, though! As we mentioned previously, synchronizing multiple actors who can act on the quarantine flag can be tricky. If the CI/CD pipeline is your Quarantine Processor, you may feel tempted to use the quarantine push event to trigger some other workflow or long-running action. An example of such action can be an asynchronous malware scanning and detonation action, which cannot be run as part of the CI/CD pipeline. The things to be aware of are:

  • To be able to pull the image, the malware scanner must also have the Quarantine Processor role assigned. This means that you will have more than one concurrent Quarantine Processor acting on the image.
  • The Quarantine Processor that finishes first will remove the quarantine flag or needs to wait for all other Quarantine Processors to complete. This, of course, adds complexity to managing the concurrency and various race conditions.

I would strongly suggest that you have only one Quarantine Processor and use it to manage all activities from it. Else, you can end up with inconsistent states of the images that do not meet your compliance criteria.

When Should Events be Fired?

We already mentioned in the previous section the various events you may need to implement in the registry:

  • A quarantine push event is used to trigger workflows that are related to images in quarantine.
  • An image push event is the standard event triggered when an image is pushed to the registry.

Here is a flow diagram of how those events should be fired.

This flow offers a logical sequence of events that can be used to trigger relevant workflows. The quarantine workflow should be trigerred by the quarantine push event, while all other workflows should be triggered by the image push event.

If you look at the current implementation of the quarantine feature in ACR, you will notice that both events are fired if the registry quarantine is not enabled (note that the feature is in preview, and functionality may change in the future). I find this behavior confusing. The reason, albeit philosophical, is simple – if the registry doesn’t support quarantine, then it should not send quarantine push events. The behavior should be consistent with any other registry that doesn’t have quarantine capability, and only the image push event should be fired.

What Data Should the Events Contain?

The consumers of the events should be able to make a decision on how to proceed based on the information in the event. The minimum information that needs to be provided in the event should be:

  • Timestamp
  • Event Type: quarantine or push
  • Repository
  • Image Tag
  • Image SHA
  • Actor

This information will allow the event consumers to subscribe to registry events and properly handle them.

Audit Logging for Quarantined Images

Because we are discussing a secure supply chain for containers, we should also think about traceability. For quarantine-enabled registries, a log message should be added at every point the status of the image is changed. Once again, this is something that needs to be implemented by the registry, and it is not standard behavior. At a minimum, you should log the following information:

  • When the image is put into quarantine (initial push)
    • Timestamp
    • Repository
    • Image Tag
    • Image SHA
    • Actor/Publisher
  • When the image is removed from quarantine (quarantine flag is removed)
    Note: if the image is removed from quarantine, the assumption is that is passed all the quarantine checks.

    • Timestamp
    • Repository
    • Image Tag
    • Image SHA
    • Actor/Quarantine Processor
    • Details
      Details can be free-form or semi-structured data that can be used by other tools in the enterprise.

One question that remains is whether a message should be logged if the quarantine does not pass after all actions are completed by the Quarantine Processor. It would be good to get the complete picture from the registry log and understand why certain images stay in quarantine forever. On the other side, though, the image doesn’t change its state (it is in quarantine anyway), and the registry needs to provide an API just to log the message. Because the API to remove the quarantine is not a standard OCI registry API, a single API can be provided to both remove the quarantine flag and log the audit message if the quarantine doesn’t pass. ACR quarantine feature uses the custom ACR API to do both.

Summary

To summarize, if implemented by a registry, the quarantine flag can be useful in preparing the image before allowing its wider use. The quarantine activities on the image should be done by a single Quarantine Processor to avoid concurrency and inconsistencies in the registry. The quarantine flag should be used only during the initial setup of the image before it is released for wider use. Reverting to a quarantine state after the image is published for wider use can be dangerous due to the lack of granularity for actor permissions. Customized policies should continue to be used for images that are published for wider use.

One important step in securing the supply chain for containers is preventing the use of “bad” images. I intentionally use the word “bad” here. For one enterprise, “bad” may mean “vulnerable”; for another, it may mean containing software with an unapproved license; for a third, it may be an image with a questionable signature; possibilities are many. Also, “bad” images may be OK to run in one environment (for example, the local machine of a developer for bug investigation) but not in another (for example, the production cluster). Lastly, the use of “bad” images needs to be verified in many phases of the supply chain – before release for internal use, before build, before deployment, and so on. The decision of whether a container image is “bad” cannot be made in advance and depends on the consumer of the image. One common way to prevent the use of “bad” images is the so-called quarantine pattern. The quarantine pattern prevents an image from being used unless certain conditions are met.

Let’s look at a few scenarios!

Scenarios That Will Benefit from a Quarantine Pattern

Pulling images from public registries and using those in your builds or for deployments is risky. Such public images may have vulnerabilities or malware included. Using them as base images or deploying them to your production clusters bypass any possible security checks, compromising your containers’ supply chain. For that reason, many enterprises ingest the images from a public registry into an internal registry where they can perform additional checks like vulnerability or malware scans. In the future, they may sign the images with an internal certificate, generate a Software Bill of Material (SBOM), add provenance data, or something else. Once those checks are done (or additional data about the image is generated), the image is released for internal use. The public images are in the quarantined registry before they are made available for use in the internal registry with “golden” (or “blessed 🙂 ) images.

Another scenario is where an image is used as a base image to build a new application image. Let’s say that two development teams use the  debian:bullseye-20220228 as a base image for their applications. The first application uses  libc-bin, while the second one doesn’t. libc-bin in that image has several critical and high severity vulnerabilities. The first team may not want to allow the use of the debian:bullseye-20220228 as a base image for their engineers, while the second one may be OK with it because the libc-bin vulnerabilities may not impact their application. You need to selectively allow the image to be used in the second team’s CI/CD pipeline but not in the first.

In the deployment scenario, teams may be OK deploying images with the developers’ signatures in the DEV environments, while the PROD ones should only accept images signed with the enterprise keys.

As you can see, deciding whether an image should be allowed for use or not is not a binary decision, and it depends on the intention of its use. In all scenarios above, an image has to be “quarantined” and restricted for certain use but allowed for another.

Options for Implementing the Quarantine Pattern

So, what are the options to implement the quarantine pattern for container images?

Using a Quarantine Flag and RBAC for Controlling the Access to an Image

This is the most basic but least flexible way to implement the quarantine pattern. Here is how it works!

When the image is pushed to the registry, the image is immediately quarantined, i.e. the quarantine flag on the image is set to  TRUE. A separate role like QuarantineReader is created in the registry and assigned to the actor or system allowed to perform tasks on the image while in quarantine. This role allows the actor or system to pull the image to perform the needed tasks. It also allows changing the quarantine flag from TRUE to FALSE when the task is completed.

The problem with this approach becomes obvious in the scenarios above. Take, for example, the ingestion of public images scenario. In this scenario, you have more than one actor that needs to modify the quarantine flag: the vulnerability scanner, the malware scanner, the signer, etc., before the images are released for internal use. All those tasks are done outside the registry, and some of them may run on a schedule or take a long time to complete (vulnerability and malware scans, for example). All those systems need to be assigned the QuarantineReader role and allowed to flip the flag when done. The problem, though, is that you need to synchronize between those services and only change the quarantine flag from TRUE to FALSE only after all the tasks are completed.

Managing concurrency between tasks is a non-trivial job. This complicates the implementation logic for the registry clients because they need to interact with each other or an external system that synchronizes all tasks and keeps track of their state. Unless you want to implement this logic into the registry, which I would not recommend.

One additional issue with this approach is its extensibility. What if you need to add one more task to the list of things that you want to do on the image before being allowed for use? You need to crack open the code and implement the hooks to the new system.

Lastly, some of the scenarios above are not possible at all. If you need to restrict access to the image to one team and not another, the only way to do it is to assign the QuarantineReader role to the first team. This is not optimal, though, because the meaning of the role is only to assign it so systems that will perform tasks to take the image out of quarantine and not use it for other purposes. Also, if you want to make decisions based on the content of vulnerability reports or SBOMs, this quarantine flag approach is not applicable at all.

Using Declarative Policy Approach

A more flexible approach is to use a declarative policy. The registry can be used to store all necessary information about the image, including vulnerability and malware reports, SBOMs, provenance information, and so on. Well, soon, registries will be able to do that 🙂 If your registry supports ORAS reference types, you can start saving those artifacts right now. In the future, thanks to the Reference Types OCI Working Group, every OCI-compliant registry should be able to do the same. How does that work?

When the image is initially pushed to the registry, no other artifacts are attached to it. Each individual system that needs to perform a task on the image can run on its own schedule. Once it completes the task, it pushes a reference type artifact to the registry with the subject of the image in question. Every time the image is pulled from the registry, the policy evaluates if the required reference artifacts are available; if not, the image is not allowed for use. You can define different policies for different situations as long as the policy engine understands the artifact types. Not only that, but you can even make decisions on the content of the artifacts as long as the policy engine is intelligent enough to interpret those artifacts.

Using the declarative policy approach, the same image will be allowed for use by clients with different requirements. Extending this is as simple as implementing a new policy, which in most cases doesn’t require coding.

Where Should the Policy Engine be Implemented?

Of course, the question that gets raised is where the policy engine should be implemented – as part of the registry or outside of it. I think the registries are intended to store the information and not make policy decisions. Think of a registry as yet another storage system – it has access control implemented but the only business logic it holds is how to manage the data. Besides that, there are already many policy engines available – OPA is the one that immediately comes to mind, that is flexible enough to enable this functionality relatively easily. Policy engines are already available, and different systems are already integrated with them. Adding one more engine as part of the registry will just increase the overhead of managing policies.

Summary

To summarise, using a declarative policy-based approach to control who should and shouldn’t be able to pull an artifact from the registry is more flexible and extensible. Adding capabilities to the policy engines to understand the artifact types and act on those will allow enterprises to develop custom controls tailored to their own needs. In the future, when policy engines can understand the content of each artifact, those policies will be able to evaluate SBOMs, vulnerability reports, and other content. This will open new opportunities to define fine-grained controls for the use of registry artifacts.

While working on a process of improving the container secure supply chain, I often need to go over the current challenges of patching container vulnerabilities. With the introduction of Automatic VM Patching, having those conversations are even more challenging because there is always the question: “Why can’t we patch containers the same way we patch VMs?” Really, why can’t we? First, let’s look at how VM and container workloads differ.

How do VM and Container Workloads Differ?

VM-based applications are considered legacy applications, and VMs fall under the category of Infrastructure-as-a-Service (IaaS) compute services. One of the main characteristics of IaaS compute services is the persistent local storage that can be used to save data on the VM. Typically the way you use the VMs for your application is as follows:

  • You choose a VM image from the cloud vendor’s catalog. The VM image specifies the OS and its version you want to run on the VM.
  • You create the VM from that image and specify the size of the VM. The size includes the vCPUs, memory, and persistent storage to be used by the VM.
  • You install the additional software you need for your application on the VM.

From this point onward, the VM workload state is saved to the persistent storage attached to the VM. Any changes to the OS (like patches) are also committed to the persistent storage, and next time the VM workload needs to be spun up, those are loaded from there. Here are the things to remember for VM-based workloads:

  • VM image is used only once when the VM workload is created.
  • Changes to the VM workload are saved to the persistent storage; the next time the VM is started, those changes are automatically loaded.
  • If a VM workload is moved to a different hardware, the changes will still be loaded from the persistent storage.

How do containers differ, though?

Whenever a new container workload is started, the container image is used to create the container (similar to the VM). If the container workload is stopped and started on the same VM or hardware, any changes to the container will also be automatically loaded. However, because orchestrators do not know whether the new workload will end up on the same VM (or hardware due to resource constraints), they do not stop but destroy the containers, and if a new one needs to be spun up, they use the container image again to create it.

That is a major distinction between VMs and containers. While the VM image is used only once when the VM workload is created, the container images are used repeatedly to re-create the container workload when moving from one place to another and increasing capacity. Thus, when a VM is patched, the patches will be saved to the VM’s persistent storage, while the container patches need to be available in the container image for the workloads to be always patched.

The bottom line is, unlike VMs, when you think of how to patch containers, you should target improvements in updating the container images.

A Timeline of a Container Image Patch

For this example, we will assume that we have an internal machine learning team that builds their application image using  python:3.10-bullseye as a base image. We will concentrate on the timelines for fixing the OpenSSL vulnerabilities CVE-2022-0778 and CVE-2022-1292. The internal application team’s dependency is OpenSSL <– Debian <– Python. Those are all Open Source Software (OSS) projects driven by their respective communities. Here is the timeline of fixes for those vulnerabilities by the OSS community.

2022-03-08: python:3.10.2-bullseye Released

Python publishes  puthon:3.10.2-bullseye container image. This is the last Python image before the CVE-2022-0778 OpenSSL vulnerability was fixed.

2022-03-15: OpenSSL CVE-2022-0778 Fixed

OpenSSL publishes fix for CVE-2022-0778 impacting versions 1.0.2 – 1.0.2zc, 1.1.1 – 1.1.1m, and 3.0.0 – 3.0.1.

2022-03-16: debian:bullseye-20220316 Released

Debian publishes  debian:bullseye-20220316 container image that includes a fix for CVE-2022-0778.

2022-03-18: python:3.10.3-bullseye Released

Python publishes  python:3.10.3-bullseye container image that includes a fix for CVE-2022-0778.

2022-05-03: OpenSSL CVE-2022-1292 Fixed

OpenSSL publishes fix for CVE-2022-1292 impacting versions 1.0.2 – 1.0.2zd, 1.1.1 – 1.1.1n, and 3.0.0 – 3.0.2.

2022-05-09: debian:bullseye-20220509 Released

Debian publishes debian:bullseye-20220316 container image that DOES NOT include a fix for CVE-2022-1292.

2022-05-27: debian:bullseye-20220527 Released

Debian publishes  debian:bullseye-20220527 container image that includes a fix for CVE-2022-1292.

2022-06-02: python:3.10.4-bullseye Released

Python publishes  python:3.10.4-bullseye container image that includes a fix for CVE-2022-1292.

There are a few important things to notice in this timeline:

  • CVE-2022-0778 was fixed in the whole chain within three days only.
  • In comparison, CVE-2022-1292 took 30 days to fix in the whole chain.
  • Also, in the case of CVE-2022-1292, Debian released a container image after the fix from OpenSSL was available, but that image DID NOT contain the fix.

The bottom line is:

  • Timelines for fixes by the OSS communities are unpredictable.
  • The latest releases of container images do not necessarily contain the latest software patches.

SLAs and the Typical Process for Fixing Container Vulnerabilities

The typical process teams use to fix vulnerabilities in container images is waiting for the fixes to appear in the upstream images. In our example, the machine learning team must wait for the fixes to appear in the python:3.10-bullseye image first, then rebuild their application image, test the new image, and re-deploy to their production workloads if tests pass. Let’s call this process wait-rebuild-test-redeploy (or WRTR if you like acronyms:)).

The majority of enterprises have established SLAs for fixing vulnerabilities. For those that have not established such yet, things will soon change due to the Executive Order for Improving the Nation’s Cybersecurity. Many enterprises model their patching processes based on the FedRAMP 30/90/180 rules specified in the FedRAMP Continuous Monitoring Strategy Guide. According to the FedRAMP rules, high severity vulnerabilities must be remediated within 30 days. CISA’s Operational Directive for Reducing the Risk of Known Exploited Vulnerabilities has much more stringent timelines of two weeks for vulnerabilities published in CISA’s Known Exploited Vulnerabilities Catalog.

Let’s see how the timelines for patching the abovementioned OpenSSL vulnerabilities fit into those SLAs for the machine learning team using the typical process for patching containers.

CVE-2022-0778 was published on March 15th, 2022. It is a high severity vulnerability, and according to the FedRAMP guidelines, the machine learning team has till April 14th, 2022, to fix the vulnerability in their application image. Considering that the python:3.10.3-bullseye image was published on March 18th, 2022, the machine learning team has 27 days to rebuild, test, and redeploy the image. This sounds like a reasonable time for those activities. Luckily, CVE-2022-0778 is not in the CISA’s catalog, but the team would still have 11 days for those activities if it was.

The picture with CVE-2022-1292 does not look so good, though. The vulnerability was published on May 3rd, 2022. It is a critical severity vulnerability, and according to the FedRAMP guidelines, the machine learning team has till June 2nd, 2022, to fix the vulnerability. Unfortunately, though, python:3.10.4-bullseye image was published on June 2nd, 2022. This means that the team needs to do the re-build, testing, and re-deployment on the same day the community published the image. Either the team needs to be very efficient with their processes or work around the clock that day to complete all the activities (after hoping the community will publish a fix for the python image before the SLA deadline). That is a very unrealistic expectation and also impacts the team’s morale. If by any chance, the vulnerability appeared on the CISA’s catalog (which luckily it did not), the team would not be able to fix it within the two-week SLA.

That proves that the wait-rebuild-test-redeploy (WRTR) process is ineffective in meeting the SLAs for fixing vulnerabilities in container images. But, what can you currently do to improve this and take control of the timelines?

Using Multi-Stage Builds to Fix Container Vulnerabilities

Until the container technology evolves and a more declarative way for patching container images is available, teams can use multi-stage builds to build their application images and fix the base image vulnerabilities. This is easily done in the CI/CD pipeline. This approach will also allow teams to control the timelines for vulnerability fixes and meet their SLAs. Here is an example how you can solve the issue with patching the above example:

FROM python:3.10.2-bullseye as baseimage

RUN apt-get update; \
     apt-get upgrade -y

RUN adduser appuser

FROM baseimage

USER appuser

WORKDIR /app

CMD [ "python", "--version" ]

In the above Dockerfile, the first stage of the build updates the base image with the latest patches. The second stage builds the application and runs it with the appropriate user permissions. Using this approach you awoid the wait part in the WRTD process above and you can always meet your SLAs with simple re-build of the image.

Of course, this approach also has drawbacks. One of its biggest issues is the level of control teams have over what patches are applied. Another one is that some teams do not want to include layers in their images that do not belong to the application (i.e. modify the base image layers). Those all are topics for another post 🙂

Photo by Webstacks on Unsplash

In Part 1 of the series Signatures, Key Management, and Trust in Software Supply Chains, I wrote about the basic concepts of identities, signatures, and attestation. In this one, I will expand on the house buying scenario, that I hinted about in Part 1, and will describe a few ways to exploit it in the physical world. Then, I will map this scenario to the digital world and delve into a few possible exploits. Throughout this, I will also suggest a few possible mitigations in both the physical as well as the digital world. The whole process as you may have already known is called threat modeling.

Exploiting Signatures Without Attestation in the Offline World

For the purpose of this scenario, we will assume that the parties involved are me and the title company. The document that needs to be signed is the deed (we can also call it the artifact). Here is a visual representation of the scenario:

Here is how the trust is established:

  • The title company has an inherent trust in the government.
  • This means that the title company will trust any government-issued identification like a driving license.
  • In my meeting with the title company, I present my driving license.
  • The title company verifies the driving license is legit and establishes trust in me.
  • Last, the title company trusts the signature that I use to sign the deed in front of them.
  • From here on, the title company trusts the deed to proceed with the transaction.

As we can see, establishing trust between the parties involves two important conditions – implicit trust in a central authority and verification of identity. Though, this process is easily exploitable with fake IDs (like fake driving license) as shown in the picture below.

In this case, an imposter can obtain a fake driving license and impersonate me in the transaction. If the title company can be fooled that the driving license is issued by the government, they can falsely establish trust in the imposter and allow him to sign the deed. From there on, the title company considers the deed trusted and continues with the transaction.

The problem here is with the verification step – the title company does not do a real-time verification if the driving license is legitimate. The verification step is done manually and offline by an employee of the title company and relies on her or his experience to recognize forged driving licenses. If this “gate” is passed, the signature on the deed becomes official and will not be verified anymore in the process.

There is one important step in this process that we didn’t mention yet. When the title company employee verifies the driving license, she or he also takes a photocopy of the driving license and attaches it to the documentation. This photocopy becomes part of the audit trail for the transaction if later on is discovered that the transaction needs to be reverted.

Exploiting Signatures Without Attestation in the Digital World

The above process is easily transferable to the digital world. In the following GitHub project I have an example of signing a simple text file artifact.txt. The example uses self-signed certificates for verifying the identity and the signature.

There are two folders in the repository. The real folder contains the files used to generate a key and X.509 certificate that is tied to my real identity and verified using my real domain name toddysm.com. The fake folder contains the files used to generate a key and X.509 certificate that is tied to an imposter identity that can be verified with a look-alike (or fake) domain. The look-alike domain uses homographs to replace certain characters in my domain name. If the imposter has ownership of the imposter domain, obtaining a trusted certificate with that domain name is easily achievable.

The dilemma you are presented with is, which certificate to trust – the one here or the one here. When you verify both certificates using the following commands:

openssl x509 -nameopt lname,utf8 -in [cert-file].crt -text -noout | grep Subject:
openssl x509 -nameopt lname,utf8 -in [cert-file].crt -text -noout | grep Issuer:

they both return visually indistinguishable information:

Subject: countryName=US, stateOrProvinceName=WA, localityName=Seattle, organizationName=Toddy Mladenov, commonName=toddysm.com, emailAddress=me@toddysm.com
Issuer: countryName=US, stateOrProvinceName=WA, localityName=Seattle, organizationName=Toddy Mladenov, commonName=toddysm.com, emailAddress=me@toddysm.com

It is the same as looking at two identical driving licenses, a legitimate one and a forged one, that have no visible differences.

The barrier for this exploit using PGP keys and SSH keys is even lower. While X.509 certificates need to be issued by a trusted certificate authority (CA), PGP and SSH keys can be issued by anybody. Here is a corresponding example of a valid PGP key and an imposter PGP key. Once again, which one would you trust?

Though, compromising CAs is not something that we can ignore. There are numerous examples where forged certificates issued by legitimate CAs are used:

Let’s also not forget that Stuxnet malware was signed by compromised JMicron and Realtec private keys. In the case of compromised CA, malicious actors don’t even need to use homographs to deceive the public – they can issue the certificate with the real name and domain.

Unlike the physical world though, the digital one misses the very important step of collecting audit information when the signature is verified. I will come back to that in the next post of the series where I plan to explore the various controls that can be put to increase security.

Based on the above though, it is obvious that the trust whether in a single entity or a central certificate authority (CA), has highly diminished in recent years.

Oh, and don’t trust the keys that I published on GitHub! 🙂 Anybody can copy them or generate new ones with my information – unfortunately obtaining that information is quite easy nowadays.

Exploiting Signatures With Attestation in the Offline World

Let’s look at the example I introduced in the previous post where more parties are involved in the process of selling my house. Here is the whole scenario!

Because I am unable to attend the signing of the documents, I need to issue a power of attorney for somebody to represent me. This person will be able to sign the documents on my behalf. First and foremost, I need to trust that person. But my trust in this person doesn’t automatically transfer to the title company that will handle the transaction. For the title company to trust my representative, the power of attorney needs to be attested by a certified notary. Only then will the title company trust the power of attorney document and accept the signature of my representative.

Here is the question: “How the introduction of the notary increases the security?” Note that I used the term “increase security”. While there is no 100% guarantee that this process will not fail…

By adding one more step to the process, we introduce an additional obstacle that reduces the probability for malicious activity to happen, which increases the security.

What the notary will eventually prevent is that my “representative” forcefully makes me sign the power of attorney. My security is compromised and now my evil representative can use the power of attorney to sell my house to himself for just a dollar. The purpose of the notary is to attest that I willfully signed the document and was present (and in good health) during the signing. Of course, this can easily be exploited if both, the representative and the notary are evil, as shown in the below diagram.

As you can see in this scenario, all parties have valid government-issued IDs that the title company trusts. However, the process is compromised if there is collusion between the malicious actor (evil representative) and the notary.

Other ways to exploit this process are if the notary or my representative are both or individually impersonated. The impersonation is described in the section above – Exploiting Signatures Without Attestation in the Offline World.

Exploiting Signatures With Attestation in the Digital World

There is a lot of talks recently about implementing attestation systems that will save signature receipts in an immutable ledger. This is presented as the silver bullet solution for signing software artifacts (check out the Sigstore project). Similar to the notary example in the previous section, this approach may increase security but it may also have a negative impact. Because they compare themselves to Let’s Encrypt, let me take a stab at how Let’s Encrypt impacted the security on the Web.

Before Let’s Encrypt, only owners that want to invest money to pay for valid certificates had HTTPS enabled on their websites. More importantly, though, browsers showed a clear indicator when a site was using plain HTTP protocol and not the secure one. From a user’s point of view it was easy to make the decision that if the browser address bar was red, you should not enter your username and password or your credit card. Recognizing malicious sites was relatively easy because malicious actors didn’t want to spend the money and time to get a valid certificate.

Let’s Encrypt (and the browser vendors) changed that paradigm. Being free, Let’s Encrypt allows anybody to issue a valid (and “trusted”??? 🤔) certificate and enable HTTPS for their site. Not only that but Let’s Encrypt made it so easy that you can get the certificate issued and deployed to your web server using automation within seconds. The only proof you need to provide is the ownership of the domain name for your server. At the same time, Google led the campaign to change the browser indicators to show a very mediocre lock icon in the address bar that nobody except maybe a few pays any attention to anymore. As a result, every malicious website now has HTTPS enabled and there is no indication in the browser to tell you that it is malicious. In essence, the lock gives you a false sense of security.

I would argue that Let’s Encrypt (and the browser vendors) in fact decreased the security on the web instead of increasing it. Let me be clear! While I think Let’s Encrypt (and the browser vendors) decreased the security, what they provide had a tremendous impact on privacy. Privacy should not be discounted! Though in marketing messages those two terms are used interchangeably and this is not for the benefit of the users.

In the digital world, the CA can play the role of the notary in the physical world. The CA verifies the identity of the entity that wants to sign artifacts and issues a “trusted” certificate. Similar to a physical world notary, the CA will issue a certificate for both legit as well as malicious actors, and unlike the physical world, the CA has very basic means to verify identities. In the case of Let’s Encrypt this is the domain ownership. In the case of Sigstore that will be a GitHub account. Everyone can easily buy a domain or register a GitHub account and get a valid certificate. This doesn’t mean though that you should trust it.

Summary

The takeaway from this post for you should be that every system can be exploited. We learn and create systems that reduce the opportunities for exploitation but that doesn’t make them bulletproof. Also, when evaluating technologies we should not only look at the shortcomings of the previous technology but also at the shortcoming of the new shiny one. Just adding attestation to the signatures will not be enough to make signatures more secure.

In the next post, I will look at some techniques that we can employ to make signatures and attestations more secure.

Photo by Erik Mclean on Unsplash

 

 

For the past few months, I’ve been working on a project for a secure software supply chain, and one topic that seems to always start passionate discussions is the software signatures. The President’s Executive Order on Improving the Nation’s Cybersecurity (EO) is a pivotal point for the industry. One of the requirements is for vendors to document the supply chain for software artifacts. Proving the provenance of a piece of software is a crucial part of the software supply chain, and signatures play a main role in the process. Though, there are conflicting views on how signatures should work. There is the traditional PKI (Public Key Infrastructure) approach that is well established in the enterprises, but there are other traditional and emerging technologies that are brought up in discussions. These include PGP key signatures, SSH key signatures, and the emerging ephemeral key (or keyless) signatures (here, here, and lately here).

While PKI is well established, the PKI shortcomings were outlined by Bruce Schneier and Carl Elisson more than 20 years ago in their paper. The new approaches are trying to overcome those shortcomings and democratize the signatures the same way Let’s Encrypt democratized HTTPS for websites. Though, the question is whether those new technologies improve security over PKI? And if so, how? In a series of posts, I will lay out my view of the problem and the pros and cons of using one or another signing approach, how the trust is established, and how to manage the signing keys. I will start with the basics using simple examples that relate to everyday life and map those to the world of digital signatures.

In this post, I will go over the identity, signature, and attestation concepts and explain why those matter when establishing trust.

What is Identity?

Think about your own experience. Your identity is you! You are identified by your gender, skin color, facial and body characteristics, thumbprint, iris print, hair color, DNA etc. Unless you have an identical twin, you are unique in the world. Even if you are identical twins, there are differences like thumbprints and iris prints that make you unique. The same is true for other entities like enterprises, organizations, etc. Organizations have names, tax numbers, government registrations, addresses, etc. As a general rule, changing your identity is hard if not impossible. You can have plastic surgery but you cannot change your DNA. The story may be a bit different for organizations that can rename themselves, get bought or sold, change headquarters, etc. but it is still pretty easy to uniquely identify organizations.

All the above points that identities are:

  • unique
  • and impossible (or very hard) to change

In the digital world, identities are an abstract concept. In my opinion, it is wrong to think that identities can be changed in both the physical and the digital world. Although we tend to think that they can be changed, this is not true – what can be changed is the way we prove our identity. We will cover that shortly but before that, let’s talk about trust.

If you are a good friend of mine, you may be willing to trust me but if you just met me, your level of trust will be pretty low. Trust is established based on historical evidence. The longer you know me, and the longer I behave honestly, the more you will be willing to trust me. Sometimes I may not be completely honest, or I may borrow some money from you and not return them. But I may buy you a beer every time we go out and offset that cost and you may be willing to forgive me. It is important to note that trust is very subjective, and while you may be very forgiving, another friend of mine maybe not. He may decide that I am not worth his trust and never borrow me money again.

How do We Prove Our Identity?

In the physical world, we prove our identity using papers like a driving license, a passport, an ID card, etc. Each one of those documents is issued for a purpose:

  • The driving license is mainly used to prove you can drive a motorized vehicle on the US streets. Unless it is an enhanced driving license, you (soon) will not be able to use it to board a domestic flight. However, you cannot cross borders with your driving license and you cannot use it to even rent a car in Europe (unless you have an international driving license).
  • To cross borders you need a passport. The passport is the only document that is recognized by border authorities in other countries that you visit. You cannot use your US driving license to cross the borders in Europe. The interesting part is that you do not need a driving license to get a passport or vice versa.
  • You also have your work badge. Your work badge identifies you as an employee of a particular organization. Despite the fact that you have a driving license and a passport, you cannot enter the buildings without your badge. However, to prove to your employer that you are who you are for them to issue you the badge, you must have a driving license or a passport.

In the digital world, there are similar concepts to prove our identity.

  • You can use a username, password and another factor (2FA/MFA token) to prove your identity to a particular system.
  • App secret that you can generate in a system can also be used to prove your identity.
  • OAuth or SSO (single sign-on) token issued by a third party is another way to prove your identity to a particular system. That system though needs to trust the third party.
  • SSH key can be an alternate way to prove your identity. You can use it in conjunction with username/password combination or separately.
  • You can use PGP key to prove your identity to an email recipient.
  • Or use a TLS certificate to prove the identity of your website.
  • And finally, you can use an X.509 certificate to prove your identity.

As you can see, similar to the physical world, in the digital world you have multiple ways to prove your identity to a system. You can use more than one way for a single system. The example that comes to mind is GitHub – you can use app secret or SSH key to push your changes to your repository.

How Does Trust Tie to the Concepts Above? Let’s say that I am a good developer. My code published on GitHub has a low level of bugs, it is well structured, well documented, easy to use, and updated regularly. You decide that you can trust my GitHub account. However, I also have DockerHub account that I am negligent with – I don’t update the containers regularly, they have a lot of vulnerabilities, and are sloppily built. Although you are my friend and you trust my GitHub account, you are not willing to trust my DockerHub account. This example shows that trust is not only subjective but also based on context.

OK, What Are Signatures?

Here is where things become interesting! In the physical world, a signature is a person’s name written in that person’s handwriting. Just the signature does not prove my identity. Wikipedia’s entry for signature defines the traditional function of a signature as follows:

…to permanently affix to a document a person’s uniquely personal, undeniable self-identification as physical evidence of that person’s personal witness and certification of the content of all, or a specified part, of the document.

The keyword above is self-identification. This word in the definition has a lot of implications:

  • First, as a signer, I can have multiple signatures that I would like to use for different purposes. I.e. my identity may use different signatures for different purposes.
  • Second, nobody attests to my signature. This means that the trust is put in a single entity – the signer.
  • Third, a malicious person can impersonate me and use my signature for nefarious purposes.

Interestingly though, we are willing to accept the signature as proof of identity depending on the level of trust we have in the signer. For example, if I borrow $50 from you and give you a receipt with my signature the I will pay you back in 30 days, you may be willing to accept it even if you don’t know me too much (i.e. your level of trust is relatively low). This is understandable because we decide to lower our level of trust to just self-identification. I can increase your level of trust if I show you my driving license that has my signature printed on it and you can compare both signatures. However, showing you my driver’s license is actually an attestation, which is covered in detail below.

In the digital world, to create a signature, you need a private key and to verify a signature, you need a public key (check the Digital Signature article on Wikipedia). The private and the public key are related and work in tandem – the private key signs the content and the public key verifies the signature. You own both but keep the private secret and publish the public to everybody to use. From the examples I have above, you can use PGP, SSH, and X.509 to sign content. However, they have differences:

  • PGP is a self-generated key-pair with additional details like name and email address included in the public certificate, that can be used for (pseudo)identification of the entity that signs the content. You can think of it as similar to a physical signature, where, in addition to the signature you verbally provide your name and home address as part of the signing process.
  • SSH is also a self-generated key pair but has no additional information attached. Think of it as the plain physical signature.
  • With X.509 you have a few options:
    • Self-generated key-pair similar to the PGP approach but you can provide more self-identifying information. When signing with such a private key you can assume that it is similar to the physical signature, where you verbally provide your name, address, and date of birth.
    • Domain Validated (DV) certificate that validates your ownership of a particular domain (this is exactly what Let’s Encrypt does). Think of this as similar to a physical signature where you verbally provide your name, address, and date of birth as well as show a utility bill with your name and address as part of the signing process.
    • Extended Validation (EV) certificate that validates your identity using legal documents. For example, this can be your passport as an individual or your state and tax registrations as an organization.
      Both, DV and EV X.509 certificates are issued by Certificate Authorities (CA), which are trusted authorities on the Internet or within the organization.

Note: X.509 is actually an ITU standard defining the format of public-key certificates and is at the basis of the PKI. The key pair can be generated using different algorithms. Though, the term X.509 is used (maybe incorrectly) as a synonym for the key-pair also.

Without any other variables in the mix, the level of trust that you may put on the above digital approaches would most probably be the following: (1-Lowest) SSH, (2) PGP and self-signed X.509, (3) DV X,509, and (4-Highest) EC X.509. Keep in mind that DV and EV X.509 are actually based on attestation, which is described next.

So, What is Attestation?

We finally came to it! Attestation, according to Meriam-Webster dictionary, is an official verification of something as true or authentic. In the physical world, one can increase the level of trust in a signature by having a Notary attest to the signature (lower level of trust) and adding government apostille (higher level of trust used internationally). In many states notaries are required (or highly encouraged) to keep a log for tracking purposes. While you may be OK with having only my signature on a paper for $50 loan, you certainly would want to have a notary attesting to a contract for selling your house to me for $500K. The level of trust in a signature increases when you add additional parties who attest to the signing process.

In the digital world, attestation is also present. As we’ve mentioned above, CAs act as the digital notaries who verify the identity of the signer and issue digital certificates. This is done for the DV and EV X.509 certificates only though. There is no attestation for PGP, SSH, and self-signed X.509 certificates. For digital signatures, there is one more traditional method of attestation – the Timestamp Authority (TSA). The TSA’s role is to provide an accurate timestamp of the signing to avoid tampering with the time by changing the clock on the computer where the signing occurs. Note that the TSA attests only for the accuracy of the timestamp of signing and not for the identity of the signer. One important thing to remember here is that without attestation you cannot fully trust the signature.

Here is a summary of the signing approaches and the level of trust we discussed so far.

Signing Keys and Trust

Signing Approach Level of Trust
SSH Key 1 - Lowest
PGP Key 2 - Low
X.509 Self-Signed 2 - Low
X.509 DV 3 - Medium
X.509 EV 4 - High

Now, that we’ve established the basics let’s talk about the validity period and why it matters.

Validity Period and Why it Matters?

Every identification document that you own in the physical world has an expiration date. OK, I lied! I have a German driving license that doesn’t have an expiration date. But this is an exception, and I can claim that I am one of the last who had that privilege – newer driving licenses in Germany have an expiration date. US driving licenses have an expiration date and an issue date. You need to renew your passport every five years in the US. Different factors determine why an identification document may expire. For a driving license, the reason may be that you lost some of your vision and you are not capable of driving anymore. For a passport, it may be because you moved to another country, became a citizen, and forfeit your right to be a US citizen.

Now, let’s look at physical signatures. Let’s say that I want to issue a power of attorney to you to represent me in the sale of my house while I am on a business trip for four weeks in Europe. I have two options:

  • Write you a power of attorney without an expiration date and have a notary attest to it (else nobody will believe you that you can represent me).
  • Write you a power of attorney that expires four weeks from today and have a notary attest to it.

Which one do you think is more “secure” for me? Of course the second one! The second power of attorney will give you only a limited period to sell my house. While this does not prevent you from selling it in a completely different transaction than the one I want, you are still given some time constraints. The counterparts in the transaction will check the power of attorney and will note the expiration date. If there is a final meeting four weeks and a day from now, that will require you to sign the final papers for the transaction, they should not allow you to do that because the power of attorney is not valid anymore.

Now, here is an interesting situation that often gets overlooked. Let’s say that I sign the power of attorney on Jan 1st, 2022. The power of attorney is valid till the end of day Jan 28th, 2022. I use my driving license to identify myself to the notary. My driving license has an expiration date of Jan 21st, 2022. Also, the notary’s license expires on Jan 24th, 2022. What is the last date that the power of attorney is valid? I will leave this exploration for one of the subsequent posts.

Time constraints are a basic measure to increase my security and prevent you from selling my house and pocketing the money later in the year. I will expand on this example in my next post where I will look at different ways to exploit signatures. But the basic lesson here is: the more time you have to exploit something, the higher probability there is for you to do so. Also, another lesson is: put an expiration date on all of your powers of attorney!

How does this look in the digital world?

  • SSH keys do not have expiration dates. Unless you provide the expiration date in the signature itself, the signature will be valid forever.
  • PGP keys have expiration dates a few years in the future. I just created a new key and it is set to expire on Jan 8th, 2026. If I sign an artifact with it and don’t provide an expiration date for the signature, it will be considered valid until Jan 8th, 2026.
  • X.509 certificates also have long expiration dates – 3, 12, or 24 months. Let’s Encrypt certificates have 3 months expiration dates. Root CA certificates have even longer expiration dates, which can be dangerous as we will explore in the future. Let’s Encrypt was the first to reduce the length of validity of their certificates to increase the security of certificate compromise because domains change hands quite often. Enterprises followed suit because the number of stolen enterprise certificates is growing.

Note: In the next post, I will expand a little bit more into the relationships between keys and signatures but for now, you can use them as the example above where I mention the various validity periods for documents used for the power of attorney.

Summary

If nothing else, here are the main takeaways that you should remember from this post:

  • Signatures cannot infer identities. Signatures can be forged even in the digital world.
  • One identity can have many signatures. Those signatures can be used for different purposes.
  • For a period of time, a signature can infer identity if it is attested to. However, the longer time passes, the lower the trust in this signature should be. Also, the period of time is subjective and dependent on the risk level of the signature consumer.
  • To increase security, signatures must expire. The shorter the expiration period, the higher the security (but also other constraints should be put in place).
  • Before trusting a signature, you should verify if the signed asset is still trustable. This is in line with the zero-trust principle for security: “Never trust, always verify!”.

Take a note that in the last bullet point, I intentionally use the term “asset is trustable” and not “signature is valid”. In the next post, I will go into more detail about what that means, how signatures can be exploited, and how context can provide value.

Featured image by StockSnap.

If you have missed the news lately, cybersecurity is one of the most discussed topics nowadays. From supply chain exploits to data leaks to business email compromise (BEC) there is no break – especially during the pandemic. Many (if not all) start with an account compromise. And if you ask any cybersecurity expert, they will tell you that the best way to protect your account is to use two-factor (or multi-factor) authentication. Well, let me tell you a secret – MFA sucks! Ask the Okta guys! Even they think MFA sucks. And they are a mobile security company. Though, Randall and I have different motives to make that claim.

By the way: 2FA stands for “two-factor authentication” while MFA stands for “multi-factor authentication”. I will use those two acronyms to save on some typing. And one more, TLA means a “three-letter acronym”.

Randall goes on and on in his post about why MFA sucks. Most of his points are valid! It is an annoying, and frustrating experience. I don’t know about slow, but I would argue against being pointless – it serves a purpose, a very good purpose. Where he is mainly wrong is thinking that the solution is yet another technology (Well, the whole point of his post is to market Okta’s new technology, so, he will get a pass for that 😉 ). This new technology will not address the source of the issue – that people are scared to use MFA. Take a look at the Twitter Account Security survey – why do you think only 2.3% (at the time of this writing) of all Twitter users have MFA enabled? Here are what I think the reasons are:

  • Complexity and lack of understanding of the technology
  • Fear of losing access to the accounts

I believe people are smart enough to grasp the benefits without too many explanations. What they are not clear is how to set it up and how to make sure they don’t lose access to their accounts. In general, my frustration is with how the technology vendors have implemented MFA – without any thought about the user experience. Let me illustrate what I mean with my own experience.

The Problem With Too Many MFAs

I have set up MFA on all my important accounts. The list is long: bank accounts, credit card accounts, stock brokerage accounts, government accounts (like taxes, DMV, etc), email accounts (like Office 365 and GMail), GitHub, Twitter, Facebook, you name it. Have been doing this for years already and the list continues to grow. I am also required by former and current employers to use MFA for my work accounts – also a long list. Here is a list of MFA methods I use (and I don’t claim this to be a comprehensive list):

  • SMS
  • email (several different emails)
  • authenticators apps (here the things are getting crazy)
    • Microsoft Authenticator
    • Google Authenticator (I started with this one)
    • Entrust
    • VIP Access
    • Lastpass Authenticator
  • other vendor-specific implementations (I really don’t know how to call those, but they have their own way to do it)
    • Apple
    • WordPress
    • Facebook
  • Yubikey (I have three of those and I will ignore those in this post because the hardware key experience has been the same since the … 90s or so)

You’ll think I am crazy, right? Why do I use so many authenticators and MFA methods? I don’t want to, but the problem is that I have to!

First of all, this all evolved over the last 7-8 years since I started using MFA. It started with Google Authenticator; then Microsoft Authenticator and the Lastpass one; then a few banks added email and SMS (not very secure but interestingly some very big financial institutions still don’t offer other options!!!) while others struck deals with specific vendors to use their own authenticators – hence, I had to install one-off apps for those accounts. A couple of years ago I bought Yubukeys to secure my password managers and some important work and bank accounts.

At some point in time, I decided to unify on a single method! Or a single authenticator app and the Yubikey. That turns out to be impossible! Despite the fact that there is a standard, the authenticator vendors do not give you an easy way to transfer the seeds from one app to another. The only way to do that is to go over tens of accounts and register the new authenticator. Who wants to do that! Also, a few of my financial institutions do not offer a way to use a different MFA method than the one they approved. So, I am stuck with the one-offs. Unless I want to change to different financial institutions. But… who wants to do that!

There is one more problem with the use of so many tools. Very often the systems that are set up with MFA ask you to “enter the code from your MFA app”. The question is: which MFA app have I registered for that system? There is an argument whether having details about the app used to generate the code will compromise the security. Some companies provide that information, while others don’t. If I am allowed to use a single authenticator app for all my accounts, I don’t mind not having information on what tool have I registered. But in the current situation, giving me a hint is a requirement for usability. Anyway, if my authenticator app is compromised, not telling me what app to use will make absolutely no difference.

Another problem with the authenticator apps is that (and this is prevalent in technology) the app thinks it knows best how to name things. If I for example have two accounts for GitHub and want to use MFA, the authenticators will show them both as GitHub. What if I have ten? Which code should I use for which account is really hard to figure out.

The Problem With Switching Phones

Wait! Should I say: “The Problem With Losing Phones?” No, the problem with losing phones is next!

This is where the complexity of the MFA approach starts to show up! I don’t think any of the above-mentioned vendors really thought the whole user experience thoroughly through. I will add Duo to that list because I used it and it also sucks. Also, I will make a note here that colleagues recommended me Twillio’s Authy but I am already so deep in my current (diverse) ecosystem that I have no desire to try one more of those.

My wife (who is not in technology) has an iPhone 7 and she strongly refuses to change it because of all the trouble she needs to go through to set up her apps on a new phone. I spent a lot of time convincing her to set up at least SMS-based MFA for her bank and credit card accounts. And I think that will be the extend that she will go to. After I switched from iPhone 7 to the latest (and greatest) iPhone 13, I completely understand her fear. (As a side note: changing your phone every year is really, really, really bad for the environment. Be environmentally friendly and use your gadgets for a longer time! I have set a goal to use my phones and other gadgets for at least 5 years.) It has been a few months already and I continue to use some of the authenticators on my old phone because it is such a pain to migrate them to the new one.

Let me quickly go over the experience one by one.

Moving MFA Apps From Phone to Phone

Moving Apple MFA to my new phone was fluent. At the end of the day they are the kings of the experiences and this was expected. Moving WordPress and Facebook was also relatively simple – as long as you manage to sign in to your WordPress and Facebook accounts on your new phone, you will start getting the prompts there.

Moving the Lastpass Authenticator should have been easy but they really screwed up the flow between the actual Lastpass app and the authenticator app. I was clicking like crazy back and forth, going in circles for quite some time until something magical happened and it started working on my new phone. For the accounts where I used Entrust I had to go and register the new phone. Inconvenient, but at least I had a self-service. The problems started appearing when I got to VIP Access – I have to call my financial institution because they are the only ones that can register it. This will mean at least one hour on the phone.

Now, let’s get to the big ones!

Google Authenticator apparently has export functionality that allows you to export the seeds and import them in your new phone. If you know about that, it works like a charm too but… I just recently learned about it from Dave Bittner and Joe Carrigan from The Cyberwire.

Microsoft Authenticator should have been the easiest one (they claim). As long as you are signed in with your Microsoft Account, you should be able to get all the codes on your new phone. Well, king of! This works for other MFA accounts except for Microsoft work and school accounts. With all due respect to my colleagues from Azure Active Directory – the work and school account move sucks! You just need to go and register the new phone with those. Really disappointing!

“Insecure” MFA Methods When Switching Phones

Let me write about the non-secure ways to use MFA!

As I mentioned above, I also use email and SMS for certain accounts. The email experience also sucks! OK, I will be honest – this is certainly my own problem and few people may have this one but this is my rant so I will go with it. I have created many email accounts collected over time (about the reasons for that in some other post). One or another email account is used for MFA depending on what email address I’ve used for registration on a particular website. Those emails are synchronized to a single email account but… the synchronization is on schedule – about 30 mins or so. Now, every MFA code normally expires within 10 mins. Either, I miss the time window to enter the code or I need to login into that particular email account to get the code or I need to force the sync manually (yeah, I can do that but it is annoying. Right, Randall?!). Switching my phone has nothing to do with my emails so – there is no impact.

And the last one – SMS! Well, SMS doesn’t suck! You heard me! SMS doesn’t suck! … most of the time. Sometimes you will not get the text message on time due to networking issues but it works perfectly 99.999% of the time. Oh, and if I switch my phone, it continues to work without any additional configurations, QR codes, or calling my telco – like magic 😉

The Problem With Losing Your Phone

Now, here is where things get serious! Or serious if you use mobile apps for MFA. If you lose your phone, you are screwed. You will be locked out of all your accounts or most of them.

Apple is fine if you are in the Apple ecosystem. I have a few MacBooks, iPad, and AppleWatch – at least one of them will get me the MFA code.

With Facebook, I am screwed unless I am signed in to one of my browsers. For a person like me who uses Facebook once in a blue moon, the probability is low, so if I lose my phone, I am screwed. (Maybe that will push me over the edge to finally stop using them 🙂 ). I assume the WordPress story will be the same as with Facebook. Oh, and have you ever tried to get Facebook support on the phone to help you unlock your account? 🙂

About the other ones…

Backup Codes (or The Problem With Backup Codes)

Well, most of the systems allow you to print a bunch of backup codes that you can store “safely” so if you get locked out you can “easily” sign back in. I emphasize the words “safely” and “easily”. Here is why!

Storing Backup Codes Safely

Define “safely”! The experts recommend that you print your backup codes on paper and store them “safely” offline. I assume that they mean putting it in my safe deposit box at home, right? Because I will need to have easy access to those when I get locked out. It is questionable how safe is that because robberies are not uncommon. I had a colleague who got robbed and he found his safe deposit box in the bushes behind his house – of course, empty! Also, most of those safe deposit boxes are not fire- and waterproof. So, you need to buy a fireproof safe deposit box and cement it in your basement so no grown person (I mean teenager or older) can take it out with a crowbar.

Or, they mean to put it in the safe deposit box in the bank. Where there is security and the probability of robbery is minuscule. But then, I need to run to the bank every time I get locked out.

In both cases, this is not easy. From both a logistics point of view and a usability point of view. At the end of the day, what if I am on a trip and lose my phone (which is a quite realistic scenario).

To avoid all this hassle, most of us find some workarounds. Here are a few for you (and be honest and admit that you do those too):

  • Saving the backup codes as files and putting them in a folder on your laptop.
  • Copying the backup codes and storing them in your password manager (together with your password – how secure is that? 🙂 )
  • Saving the backup codes as files and keeping them on a thumb drive in your drawer.
  • Saving them as files on your Dropbox, OneDrive, Google Drive, or another cloud drive.
  • You see where I am going with those…

To be honest, I even didn’t bother printing/saving the backup codes for some of my accounts (and not all of the systems offer that option), which I assume many of us do.

Even if I print them and store them in my safe, I need to print details of what account they belong to and if they get stolen, all my accounts will be compromised.

Storing the Seeds in the Cloud

Some of the authenticators like Microsoft Authenticator keep your seeds in the cloud. Authe was recommended to me for the same reason. The idea is that if you lose your phone, you can sign in to your authenticator on your new phone and it will sync your seeds on the new phone. Magical, right? Yes, if you are able to sign in to the authenticator on your new phone… without your MFA code. So, you are caught in this vicious circle that if you lose your phone, you will need an MFA code to sign in to your authenticator but you have no way to get the MFA code.

The Solution (Backed by the Numbers)

What are you left with if you lose your phone? The only two MFA methods that work for a lost phone are email and SMS (because even if I lose my phone, I can easily keep my number). They are the most insecure ones but have the lowest risk to get you locked out from your accounts.

I am not promoting the use of SMS and email as the second factor for authentication. But the numbers show that majority of the users who use MFA use SMS instead of an app or a hardware key (see the Twitter report). Let’s run this simple math:

  • Twitter has about 396.5M users.
  • 2.3% (at the time of this writing) use MFA for their Twitter account. This is ~9.12M MFA users.
  • 0.5% of those 9.12M use a hardware key. This is just 456K hardware key users.
  • 30.9% of those 9.12M use an auth app. This is 2.82M auth app users.
  • 79.6% of those 9.12M use SMS. This is 7.26M SMS users.

It would be nice if Twitter had a way to break those numbers down by occupation (although it will be a violation of privacy). Pretty sure they will show that the majority of people who use an auth app or a hardware key work in technology. The normal users who deem their account important protect them with SMS because SMS offers the easiest user experience.

One more thing about SMS. Because everybody is scared to lock themselves out from their accounts, people set up their authenticator app as the primary MFA tool but then they have SMS as the backup. This way, if they lose their phone, they can still gain access to their account using SMS. But as we know, security is as strong as the weakest link – in this case the SMS. The setup of an authenticator app just gives the false illusion of security.

Summary

Using more than one factor for authentication is a MUST. Using stronger authenticators would be nice but with the current experience will be hard to achieve. To convince more people to do that, companies need to offer a much friendlier experience to their users:

  • Freedom to choose authenticator app
  • Self-service
  • Easy recovery

Without those, the usage will be at the current Twitter numbers.

MFA is yet another technology developed without the user in mind. But unfortunately, a technology that is at the core of cybersecurity. It is a shame that the security vendors continue to produce all kinds of new technologies (with fancy names like SOAR, SIEM, EDR, XDR, ML, AI) without fixing the basic user experience.

In my previous post, Learn More About Your Home Network with Elastic SIEM – Part 1: Setting Up Elastic SIEM, I explained how you could set up Elastic SIEM on a Raspberry Pi[ad]. The next thing you would want to do is to collect the logs from your firewall and analyze them. Before I jump into the technical details, I should warn you that you may… not be able to do the steps below if you rely on consumer products or you use the equipment provided by your ISP.

Let me go on a short rant here! Every self-respected router vendor should allow firewall logs to be sent to an external system. A common approach is to use the SYSLOG protocol to collect the logs. If your router does not have this capability… well, I would suggest you buy a new, more advanced one.

I personally invested in a tiny Netgate SG-1100 box that runs the open-source PFSense router/firewall. You can, of course, install PFSense on your own hardware if you don’t want to buy a new device. PFSense allows you to configure up to three external log servers. Logstash, that we have configured in the previous post, can play the role of an SYSLOG server and send the events to Elasticsearch. Here is how simple the configuration of the PFSense log shipping looks:

The IP address 192.168.11.72 is the address of the Raspberry Pi, where the ELK SIEM is installed and 5140 is the port that Logstash uses to listen for incoming events. Thas is all you need to configure PFSense to send the logs to the ELK SIEM.

Our next step is to configure Logstash to collect the events from PFSense and feed them into an index in Elastic. The following project from Patrick Jennings will help you with the Logstash configuration. If you follow the instructions, you will see the new index show up in Kibana like this:

The last thing we need to do is to create a dashboard in Kibana to show the data collected from the firewall. Patrick Jennings’ project has pre-configured visualizations and a dashboard for the PFSense data. Unfortunately, when you import those, Kibana warns you that those need to be updated. The reason is that they use the old JSON format used by Kibana, and the latest versions require all objects to be described using the Newline Delimited NDJSON format (for more details, visit ndjson.org). The pfSense dashboard and visualization are available in my GitHub repository for Home SIEM.

Now, keep in mind that the pfSense logs will not feed into the SIEM functionality of the Elastic stack because it is not in the Elastic Common Schema (ECS) format. What we have created is just a dashboard that visualizes the firewall log data. Also, the dashboard and the visualizations are built using the pfSense data. If you use a different router/firewall, you will need to update the configuration to visualize the data, and things may not work out of the box. I would be curious to hear feedback on how other routers can send data to ELK.

In subsequent posts, I will describe how you can use Beats to get data from the machines connected to your local network and how you can dig deeper into the collected data.

 

 

 

 

 

Last night I had some free time to play with my network, and I ran  tcpdump out of curiosity. For a while, I’ve been interested to analyze what traffic is going through my home network, and the result of my test pushed me to get to work. I have a bunch of Raspberry Pi devices in my drawers, so, the simplest thing that I can do is get one and install Elastic SIEM on it. For those of you, who don’t know what SIEM is, it stands for Security Information and Event Management. My hope was that with it, I will be able to get a better understanding of the traffic on my home network.

Installing Elastic SIEM on Raspberry Pi

The first thing I had to do is to install the ELK stack on a Raspberry Pi. There are not too many good articles that explain how to set up Elastic SIEM on your Pi. According to Elastic, Elastic SIEM is generally available in Elastic Stack 7.6. So, installing the Elastic Stack should do the work.

A couple of notes before that:

  1. The first thing to keep in mind is that 8GB is the minimum requirement for the ELK stack. You can get around with 2GB Pi, but if you want to run the whole stack (Elasticsearch, Logstash, and Kibana) on a single device, make sure that you order Raspberry Pi 4 Model B Quad Core 64 Bit with 4GB[ad]. Even this one is a stretch if you collect a lot of information. A good option would be to split the stack over two devices: one for Elasticsearch and Kibana, and another one for the Logstash service
  2. Elastic has no builds for Raspbian. Hence, in the instructions below, I will use Debian packages and will describe the steps to install those on the Pi. This will require some custom configs and scripts, so be prepared for that. Well, this article is how to hack the installation and no warranties are provided 🙂
  3. You will not be able to use the ML functionality of Elasticsearch because it is not supported on the low-powered Raspbian device
  4. Below steps assume version 7.7.0 of the ELK stack. If you are installing a different version, make sure that you replace it accordingly in the commands below
  5. Last but not least (in the spirit of no warranties), Elasticsearch has a dependency on libc6that will be ignored and will break future updates. You have to deal with this at your own risk

Here the steps to follow.

Installing Elasticsearch on Raspberry Pi

  1. Set up your Raspberry Pi first. Here are the steps to set up your Rasberry Pi. The Raspberry Pi Imager makes it even easier to set up the OS. Once again, I would recommend using Raspberry Pi 4 Model B Quad Core 64 Bit with 4GB[ad] and a larger SD card[ad] to save logs for longer.
  2. Make sure all packages are up to date, and your Raspbian OS is fully patched.
    sudo apt-get update
    sudo apt-get upgrade
  3. Install the ZIP utility, we will need that later on for the Logstash configuration.
    sudo apt-get install zip
  4. Then, install the Java JRE because Elastic Stack requires it. You can use the open-source JRE to avoid licensing troubles with Oracle’s one.
    sudo apt-get install default-jre
  5. Once this is done, go ahead and download the Debian package for Elasticsearch. Make sure that you download the package with no JDK in it.
    wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.7.0-no-jdk-amd64.deb
  6. Once this is done, go ahead and install Elasticsearch using the package manager.
    sudo dpkg -i --force-all --ignore-depends=libc6 elasticsearch-7.7.0-no-jdk-amd64.deb
  7. Next, we need to configure Elasticsearch to use the installed JDK.
    sudo vi /etc/default/elasticsearch

    Set the JAVA_HOMEto the location of the JDK. Normally this is /usr/lib/jvm/default-java. You can also set the JAVA_HOMEto the same path in the /etc/environmentfile but this is not required.

  8. Last thing you need to do if to disable the ML XPack for Elasticsearch. Change the access mode to the /etc/elasticsearchdirectory first and edit the Elasticsearch configuration file.
    sudo chmod g+w /etc/elasticsearch
    sudo vi /etc/elasticsearch/elasticsearch.yml

    Change the xpack.ml.enabledsetting to falseas follows:

    xpack.ml.enabled: false

The above steps install and configure the Elasticsearch service on your Raspberry Pi. You can start the service with:

sudo service elasticsearch start

Or check its status with:

sudo service elasticsearch status

Installing Logstash on Raspberry Pi

Installing Logstash on the Raspberry Pi turned out to be a bit more problematic than Elasticsearch. Again, Elastic doesn’t have a Logstash package that targets ARM architecture and you need to install it manually. StackOverflow posts and GitHub issues were particularly helpful for that – I listed the two I used in the References at the end of this article. Here the steps:

  1. Download the Logstash Debian package from Elastic’s repository.
    wget https://artifacts.elastic.co/downloads/logstash/logstash-7.7.0.deb
  2. Install the downloaded package using the dpkg package installer.
    sudo dpkg -i logstash-7.7.0.deb
  3. If you run Logstash at this point and encounter error similar to logstash load error: ffi/ffi -- java.lang.NullPointerException: null get Alexandre Alouit’s fix from GitHub using.
    wget https://gist.githubusercontent.com/toddysm/6b4b9c63f32a3dfc476a725561fc23af/raw/06a2409df3eba5054d7266a8227b991a87837407/fix.sh
  4. Go to /usr/share/logstash/logstash-core/lib/jars and check the version of the jruby-complete-X.X.X.X.jarJAR
  5. Open the downloaded fix.sh, and replace the version of the jruby-complete-X.X.X.X.jaron line 11 with the one from your distribution. In my case, that was  jruby-complete-9.2.11.1.jar
  6. Change the permissions of the downloaded fix.shscript, and run it.
    chmod 755 fix.sh
    sudo ./fix.sh
  7. You can run Logstash with.
    sudo service logstash start

You can check the Logstash logs in /var/log/logstash/logstash-plain.logfor information on whether Logstash is successfully started.

Installing Kibana on Raspberry Pi

Installing Kibana had different challenges. The problem is that Kibana requires an older version of NodeJS, but the one that is packed with the Debian package doesn’t run on Raspbian. What you need to do is to replace the NodeJS version after you install the Debian package. Here the steps:

  1. Download the Kinabna Debian package from Elastic’s repository.
    wget https://artifacts.elastic.co/downloads/kibana/kibana-7.7.0-amd64.deb
  2. Install the downloaded package using the dpkg package installer.
    sudo dpkg -i --force-all kibana-7.7.0-amd64.deb
  3. Move the redistributed NodeJS to another folder (or delete it completely) and create a new empty directory nodein the Kibana installation directory.
    sudo mv /usr/share/kibana/node /usr/share/kibana/node.OLD
    sudo mkdir /usr/share/kibana/node
  4. Next, download version 10.19.0 of NodeJS. This is the required version of NodeJS for Kibana 7.7.0. If you are installing another version of Kibana, you may want to check what NodeJS version it requires. The best way to do that is to start the Kibana service and it will tell you.
    wget https://nodejs.org/download/release/v10.19.0/node-v10.19.0-linux-armv7l.tar.xz
  5. Unpack the TAR and move the content to the nodedirectory under the Kibana installation directory.
    sudo tar -xJvf node-v10.19.0-linux-armv7l.tar.xz
    sudo mv ./node-v10.19.0-linux-armv7l.tar.xz/* /usr/share/kibana/node
  6. You may also want to create symlinks for the NodeJS executable and its tools.
    sudo ln -s /usr/share/kibana/node/bin/node /usr/bin/node
    sudo ln -s /usr/share/kibana/node/bin/npm /usr/bin/npm
    sudo ln -s /usr/share/kibana/node/bin/npx /usr/bin/npx
  7. Configure Kibana to accept requests on any IP address on the device.
    sudo vi /etc/kibana/kibana.yml

    Set the server.hostsetting to 0.0.0.0like this:

    server.host: "0.0.0.0"
  8. You can run Kibana with.
    sudo service kibana start

Conclusion

Although not supported, you can run the complete ELK stack on a Raspberry Pi 4[ad] device. It is not the most trivial installation, but it is not so hard either. In the following posts, I will explain how you can use the Elastic SIEM to monitor the traffic on your network.

References

Here are some additional links that you may find useful:

For a while, I’ve been planning to build a cybersecurity research environment in the cloud that I can use to experiment with and research malicious cyber activities. Well, yesterday I received the following message on my cell phone:

Hello mate, your FEDEX package with tracking code GB-6412-GH83 is waiting for you to set delivery preferences: <url_goes_here>

My curiosity to follow the link was so tempting that I said: “Now is the time to get this sandbox working!” I will write about the scam above in another blog post, but in this one, I would like to explain what I needed in the cloud and how did I set it up.

Cybersecurity Research Needs

What (I think) I need from this sandbox environment? I know that my requirements will change over time when I get deeper in the various scenarios but for now, here is what I wanted to have:

  • First, and foremost, a dedicated network where I can click on malicious links and browse dark web sites without the fear that my laptop or local network will get infected. I also need to have the ability to split the network into subnets for different purposes
  • Next, I needed pre-built VM images for various purposes. Here some examples:
    • A Windows client machine to act as an unsuspicious user. Most probably, this VM will need to have Microsoft Office and other software like Acrobat Reader installed. Other more advanced software that will help track URLs and monitor process may also be required on this machine. I will go deeper into this in a separate post
    • Linux machine with networking tools that will allow me to better understand and monitor network traffic. Kali Linux may be the right choice, but Ubuntu and CentOS may also be helpful as different agents
    • I may also need some Windows Server and Linux server images to simulate enterprise environments
  • Very important for me was to be able to create the VM I need quickly, do the work I need, and tear it down after that. Automating the process of VM creation and set up was high up on the list
  • Also, I wanted to make sure that if I forget the VM running, it will be shut down automatically to save money

With those basic requirements, I got to work setting up the cloud environment. For my experiments, I choose Microsoft Azure because I have a good relationship with the Azure team and also some credits that I can use.

Segregating Network Access

As mentioned in the previous section, I need a separate network for the VMs to avoid any possibility of infecting my laptop. Now, the question that comes to mind is: Is a single virtual network with several subnets OK or not? Having in mind that I can destroy this environment at any time and recreate it (yes, this is the automation part), I decided to go with a single VNet and the following subnets in it:

  • Sandbox subnet to be used to spin up virtual machines that can simulate user behavior. Those will be either single VMs running Windows client and Microsoft Office or set of those if I want to observe the lateral movement within the network. I may also have a Linux machine with Wireshark installed on it to watch network traffic to the web sites that host the malicious pages
  • Honeypot subnet to be used to expose vulnerable devices to the internet. Those may be Windows Server Datacenter VMs or Linux servers without outdated patches and weaker or compromised passwords
  • Frontend subnet to be used to host exploit frameworks for red team scenarios. One example can be the Social Engineering Toolkit (SET). Also, simple redirection services or other apps can be placed in this subnet
  • Public subnet that is required in Azure if I need to set up any load balancers for the exploit apps

With all this in mind, I need to make sure that traffic between those subnets is not allowed. Hence, I had to set up a Network Security Group for each subnet to disable inbound and outbound VNet traffic.

Cybersecurity Virtual Machine Images

This can be an evergrowing list depending on the scenarios but below is the list of the first few I would like to start with. I will give them specific names based on the purpose I plan to use them for:

User VM

The purpose of the User VM is to simulate an office worker (think of receptionist, accountant or, admin). Typically such machines have a Windows Client OS installed on it as well as other office software like Word, Excel, PowerPoint, Outlook and Acrobat Reader. Different browsers like Edge, Chrome, and Firefox will be also useful to test.

At this point, the question that I need to answer is whether I would like to have any other software installed on this machine that will allow me to analyze the memory, reverse engineer binaries or, monitor network traffic on it. I decided to go with a clean user machine to be able to see the exact behavior the user sees and not impact it with the availability of other tools. One other reason I thought this would be a better approach is to avoid malware that checks for the existence of specialized software.

When I started building this VM image, I also had to decide whether I want to have any anti-virus software installed on it. And of course, the question is: “What anti-virus software?” My company is a Sophos Partner, and this would be my obvious choice. I decided to build two VM images to go with: one without anti-virus and one with.

User Analysis VM

This one is the same as the User VM but with added software for malware analysis. The minimum I need installed is:

  • Wireshark for network scanning
  • Cygwin for the necessary Linux tools
  • HEXDump viewer/editor
  • Decompiler

I am pretty sure the list will grow over time and I will keep a tap on what software I have installed or prefer to use. What my intent is to build Kali Linux type of VM but for Windows 🙂

Kali VM

This one is the obvious choice. It can be used not only for offensive capabilities but also to cover your identity when accessing malicious sites. It has all the necessary tools for hackers and it is available from the Microsoft Azure Marketplace.

Tor VM

One last VM type I would like to have is a machine on which I can install the Tor browser for private browsing. Similar to the Kali VM, it can be used for hiding the identity of the machine and the user who accesses the malicious sites. It can also be used to gain access to Dark Web sites and forums for cybersecurity research purposes.

Those are the VM images I plan for now. Later on, I may decide on more.

Automating the Security Research VM Creation

Ideally, I would like to be able to create a whole security research environment with a single script. While this is certainly possible, I will not be able to do it in an hour to load the above URL. I will slowly implement the automation based on my needs going forward. However, one thing that I will do immediately is to save regular snapshots of my VMs once I install new software. I will also need to version those.

I can use those snapshots to quickly spin up new VM with the required software if the previous one gets infected (ant this will certainly happen). So, for now, my automation will be only to create a VM from an Azure Disk snapshot.

Shutting Down the Security Research VMs

My last requirement was to shut down the security research VMs when I don’t need them. Like every other person in the world, I forget things, and if I forget the VMs running, I can incur some expenses that I would not be happy to pay. While I am working on full-fledged scheduling capability for Azure resources for customers, it is still in the works and I cannot use it yet. Hence, I will leverage the built-in Azure functionality for Dev/Test workloads and will schedule a daily shutdown at 10:00 PM PST time. This way, if I forget to turn off any VM, it will not continue to run all the time.

With all this, my plan is ready and I can move on to build my security research environment.

In my last post, I demonstrated how easy it is to create fake accounts on the major social networks. Now, let’s take a look at what can we do with those fake social network accounts. Also, let’s not forget that my goals here are to penetrate specific individual’s home network (in this case my own family’s home network) and gain access to files and passwords. To do that, I need to find as much information about this individual as possible, so I can discover a weak link to exploit.

1. Search Engines

The first and most obvious way to find anything on the web is, of course, Google. Have you ever Googled yourself? Search for “Toddy Mladenov” on Google yields about 4000 results (at the time of this writing). The first page gives me quite a few useful links:

  • My LinkedIn page
  • Some pages on sys-con.com (must be important if it is so highly ranked by Google). Will check it later
  • Link to my blog
  • Link to my Quora profile
  • Link to a book on Amazon
  • Two different Facebook links (interesting…)
  • Link to a site that can show me my email and phone number (wow! what more do I need 😀)
  • And a link to a site that can show me more about my business profile

The last two links look very intriguing. Whether we want it or not, there are thousands of companies that scrape the Web for information and sell our data for money. Of course, those two links point to such sites. Following those two links I can learn quite a lot about myself:

  • I have (at least) two emails – one with my own domain toddysm.com and one with Gmail. I can get two emails if I register for free on one of the sites or start a free trial on the other (how convenient 😀)
  • I am the CTO co-founder of Agitare Technologies, Inc
  • I get immediate access to my company’s location and phone number (for free and without any registrations)
  • Also, they give me more hints about emails t***@***.com (I can get the full email if I decide to join the free trial)
  • Clicking on the link to the company information additional information about the company – revenue, website as well as names of other employees in the company

OK, enough with those two sites. In just 5 minutes I got enough to start forming a plan. But those links made me curious. What will happen if I search for “Toddy Mladenov’s email address” and “Toddy Mladenov’s phone number” on Google? That doesn’t give me too much but points me to a few sites that do a specialized search like people search and reverse phone lookup. Following some links, I land on truepeoplesearch.com from where I can get my work email address. I also can learn that my real name is Metodi Mladenov. Another site that I learned about recently is pipl.com. A search on it, not surprisingly, returns my home address, my username (not sure for what) as well as the names of relatives and/or friends.

Doing an image search for “Toddy Mladenov” pulls out my picture as well as an old picture of my family.

Let’s do two more simple searches and be done with this. “Metodi Mladenov email address” and “Metodi Mladenov phone number” give my email addresses with Bellevue College and North Seattle College (NSC). I am sure that the NSC email address belongs to the target person because the picture matches. For the Bellevue College one, I can only assume.

Here is useful information I gathered so far with those searches:

  • Legal name
  • 3 email addresses (all work-related, 1 not confirmed)
  • 2 confirmed and 1 potential employer
  • Home address
  • Company name
  • Company address
  • Company phone number
  • Titles (at different employers)
  • Personal web site
  • Social media profiles (Facebook and LinkedIn so far)

Also, I have hints for two personal emails – one on toddysm.com and one on Gmail.

2. Social Media Sites

As you can assume, social media sites are full of information about you. If by now, you didn’t realize it, their business model is to sell that information (it doesn’t matter what Mark Zuckerberg tries to convince you). Now, let’s see what my fake accounts will offer.

Starting with Facebook, I sign in with my fake female account (SMJ) and type the URL to my personal Facebook page that Google returned. What I saw didn’t make me happy at all. Despite my ongoing efforts to lock down my privacy settings on Facebook, it turns out that anyone with a Facebook account can still see quite a lot about me. Mostly it was data from 2016 and before but enough to learn about my family and some friends. Using the fake account, I was able to see pictures of my family and the conversations that I had with friends around the globe. Of course, I was not able to see the list of my friends, but I was still able to click on some names from the conversations and photos and get to their profiles (that is smart, Facebook!). Not only that but Facebook was allowing the fake user to share these posts with other people, i.e., on her timeline (yet another smartie from the Facebook developers).

I was easily able to get to my wife’s profile (same last name, calling me “dear” in posts – easy to figure it out). From the pictures, I was able to see on my and my wife’s profile, I was also able to deduct that we have three daughters and that one of them is on Facebook, although under different last name. Unlike her parents (and like many other millennials), my daughter didn’t protect her list of friends – I was able to see all 43 of them without being her “friend.”

So far Facebook gave me valuable information about my target:

  • Family pictures
  • Wife’s name
  • Teenage daughter’s name
  • A couple of close friends

I also learned that my target and his wife are somehow privacy conscious but thanks to the sloppy privacy practices that Facebook follows, I am still able to gather important intelligence from it.

As we all know, teens are not so much on Facebook but on Instagram (like there is a big difference). Facebook though was nice enough to give me an entry point to continue my research. On to Instagram. My wife’s profile is private. Hence, I was not able to gather any more information about her from Instagram. It was different the case with my daughter – with the fake account I was able to see all her followers and people she follows. Crawling hundreds of followers manually is not something I want to do so, I decided to leave this research for later. I am pretty sure; I will be able to get something more out of her Instagram account. For now, I will just note both Instagram accounts and move on.

Next is LinkedIn. I signed in with the male (RJD) fake account that I created. The first interesting thing that I noticed is that I already had eight people accepting my invite (yes, the one I send by mistake). So, eight people blindly decided to connect with a fake account without a picture, job history details or any other detailed information whatsoever! For almost all of them I was able to see a personal email address, and for some even phone, and eventually address and birth date -pretty useful information, isn’t it? That made me think, how much data can I collect if I build a complete profile.

Getting back on track! Toddy Mladenov’s LinkedIn page gives me the whole work history, associations, and (oops) his birth date. The information confirmed that the emails found previous are most probably valid and can easily be used for a phishing campaign (you see where am I going with this 😀). Now, let’s look what my wife’s page can give me. I can say that I am really proud of her because contact info was completely locked (no email, birthdate or any other sensitive information). Of course, her full resume and work history were freely available. I didn’t expect to see my daughter on LinkedIn, but it turned out she already had a profile there – searching for her name yielded only two results in the US and only one in the area where we live. Doing some crosscheck with the information on Instagram and Facebook made me confident that I found her LinkedIn profile.

3. Government Websites

Government sites expose a lot of information about businesses and their owners. The Secretary of State site allows you to search any company name, and find not only details like legal name, address, and owners but also the owners’ personal addresses, emails and sometimes phone numbers. Department of Revenue web site also gives you the possibility to search for a company by address. Between those two, you can build a pretty good business profile.

Knowing the person’s physical address, the county’s site will give you a lot of details about their property, neighborhood, and neighbors (names are included). Google Maps and Zillow are two more tools that you can use to build a complete physical map of the property – something that you can leverage if you plan to penetrate the WiFi network (I will come back to this in one of my later posts where we will look at how to break into the WiFi network).

Closing Thoughts

I have always assumed that there will be a lot of information about me on the Internet, but until now I didn’t realize the extent to which this information can be used for malicious purposes. The other thing that I realized is that although I tried to follow the guidelines from Facebook and other social media sites to “protect” my information, it is not protected enough, which gives me no confidence that those companies will ever take care of our privacy (doesn’t matter what they say). Like many other people nowadays, I post things on the Internet without realizing the consequences of my actions.

Couple of notes for myself:

  • I need to figure out a way to remove from the web some of the personal information like emails and phone numbers. Not an easy task, especially in the US
  • Facebook, as always fails the privacy check. I have to find a way to hide those older posts from non-friends
  • I also need to educate my family on locking down their social media profiles (well, now that I have that information, I don’t want other people to use it)
  • Hide the birthday from my LinkedIn page

In my next post, I will start preparing a phishing campaign with the goal to discover my home network’s router IP. I will also try to post few suggestions, how you can restrict the access to information from your social media profiles – something that I believe, everyone should do.

Disclaimer: This post describes techniques for online reconnaissance and cyber attacks. The activities described in this post are performed with the consent of the impacted people or entities and are intended to improve the security of those people or entities as well as to bring awareness to the general public of the online threats existing today. Using the above steps without consent from the targeted parties is against the law. The reader of this post is fully responsible for any damage or legal actions against her or him, which may result from actions he or she has taken to perform malicious online activities against other people or entities.