In my last post, I discussed the basics of the secure Web protocol HTTPS (or HTTP-over-SSL), which should give you some idea of what HTTPS is and why it is important. In this post, though, I would like to go into detail on how the SSL (or TLS; I will use the latter in this post so that the new name sticks with you) part of HTTPS works, what cipher suites are, and why they are important. This post is targeted more towards software developers and IT professionals, although I will try to use language simple enough for everybody to understand.
The first thing I would like to ask you to do is to go to https://www.ssllabs.com/ssltest/analyze.html and get the SSL report for your website (or any other site for that matter). Keep the page open because I will refer to portions of it throughout the post. If you scroll down the page, you will see a Configuration section with a subsection about cipher suites; this is what I will explain. I have analyzed my own domain toddysm.com on the SSLLabs site and will use it as an example. One of the cipher suites that you will see for my domain (and most probably for yours) is TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, which will be the concrete example that we will decipher.
Now, let’s look at the so-called TLS handshake that happens when you try to load an HTTPS page from a website. Although the actual TLS 1.2 specification (RFC 5246) has a pretty detailed explanation of the handshake protocol, I find IBM’s Knowledge Center overview of the TLS handshake much simpler and clearer. Let’s look deeper into the steps one by one (the italicized step text is credited to IBM’s Overview of SSL or TLS Handshake):
Step 1: The client sends a “client hello” message that lists information like TLS version and the cipher suites supported by the client in the client’s order of preference. The message also contains a random byte string that is used in subsequent computations.
The most important thing to note is that the information sent as part of the “client hello” message is not encrypted (i.e. it is in plain text). Thus, if somebody is sniffing your traffic, they will know two pieces of information: 1.) the cipher suites your browser supports (in order of preference) and 2.) the random byte string used to create the master secret. It is also unfortunate that not all modern browsers (or TLS clients) tell you what their “order of preference” is. That is a shame, because the negotiated cipher suite may not be the most secure one, and the ability to change the order of preference may increase your security.
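If your TLS client happens to be a Python program, you can at least inspect the order of preference yourself. The snippet below is a minimal sketch using Python's standard `ssl` module; the `ECDHE+AESGCM` filter is just one possible choice, and the names it prints follow the OpenSSL convention (for example ECDHE-RSA-AES256-GCM-SHA384) rather than the names shown in the SSLLabs report.

```python
import ssl

# Build a client-side context with the library's default settings.
context = ssl.create_default_context()

# get_ciphers() lists the enabled cipher suites in order of priority.
offered = [suite["name"] for suite in context.get_ciphers()]
for name in offered[:5]:
    print(name)

# Optionally narrow the TLS 1.2 suites to ECDHE key exchange with AES-GCM.
# TLS 1.3 suites are configured separately and are not affected by this call.
context.set_ciphers("ECDHE+AESGCM")
restricted = [suite["name"] for suite in context.get_ciphers()]
```

The exact list depends on the OpenSSL build that your Python installation is linked against, so the output will differ from machine to machine.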
Step 2: The server responds with a “server hello” message that contains the cipher suite chosen by the server from the list provided by the client, the session ID, and another random byte string. The server also sends its digital certificate.
Once again, the message exchanged in this step is not encrypted. For somebody listening to your traffic, all this information may be of some value.
Step 3: The client verifies the server’s certificate.
This is done through a trusted third party called a Certificate Authority (CA), which has signed the server’s certificate. Alternatively, the server certificate itself may already be installed as a trusted certificate on the client’s machine.
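To see steps 2 and 3 from the client side, here is a small sketch with Python's standard `ssl` module. The default context verifies the certificate chain against the system's trusted Certificate Authorities and checks the host name. The helper `describe_connection` is a hypothetical name of my own, and it needs network access when you call it.

```python
import socket
import ssl

# The default client context requires a valid certificate chain (step 3)
# and also checks that the certificate matches the host name.
context = ssl.create_default_context()
verifies_chain = context.verify_mode == ssl.CERT_REQUIRED
checks_hostname = context.check_hostname

def describe_connection(host, port=443):
    """Hypothetical helper: connect and report what the server chose in step 2."""
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            name, protocol, bits = tls.cipher()
            return {
                "version": tls.version(),   # negotiated protocol version
                "cipher": name,             # cipher suite chosen by the server
                "bits": bits,               # size of the symmetric key
                "subject": tls.getpeercert().get("subject"),
            }
```

Calling `describe_connection("toddysm.com")` should return the negotiated protocol version and cipher suite, or raise `ssl.SSLCertVerificationError` if the certificate cannot be verified.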
Step 4: The client sends the random byte string that enables both the client and the server to compute the secret key to be used for encrypting subsequent message data. The random byte string itself is encrypted with the server’s public key.
The first thing to note here is that this is the first message that is encrypted: in this case with the server’s public key, so only the server can decrypt it. The next thing to note is the importance of yet another random byte string. This random byte string is also called the pre-master secret, and it is used together with the previous two random values to generate the so-called master secret. Depending on the sophistication of the algorithm used to generate this pre-master secret, your connection with the server may or may not be vulnerable.
Let’s look back at our SSLLabs scan and the different ciphers that your server supports (see picture below).
The example I took above is the fifth from the list and means the following (credits to Daniel Szpisjak for his explanation of TLS-RSA vs TLS-ECDHE-RSA vs static DH on StackExchange):
- Ephemeral Elliptic-Curve Diffie-Hellman (aka ECDHE) is the algorithm used to generate key pairs on both the client and the server. The public keys are exchanged in plain text during the handshake, in dedicated key exchange messages that follow the hello messages of steps 1 and 2; the private keys never leave the machine that generated them.
One important thing to note here is that the use of the ECDHE algorithm provides the so-called forward secrecy, which means that if the server’s long-term private key is compromised in the future, it cannot be used to decrypt previously recorded conversations.
- With ECDHE, the pre-master secret is the shared secret that both sides compute with the Diffie-Hellman algorithm. Unlike in the RSA key exchange that IBM’s step 4 describes, it is never sent over the wire, encrypted or otherwise. It is combined with the two random values from steps 1 and 2 and run through a hash-based pseudorandom function to generate a uniform master secret.
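The key property of Diffie-Hellman, that both sides arrive at the same shared secret while only public values cross the wire, can be sketched in a few lines. Note the assumptions: this is the classic finite-field variant rather than the elliptic-curve one that ECDHE uses, and the parameters `P` and `G` are toy values of my choosing that are far too small for real use.

```python
import secrets

# Toy public parameters, for illustration only (NOT secure).
P = 2**127 - 1   # a Mersenne prime used as the modulus
G = 3            # the public base

def make_key_pair():
    """Return a fresh (private, public) pair; fresh per session = ephemeral."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)              # safe to send in plain text
    return private, public

client_private, client_public = make_key_pair()
server_private, server_public = make_key_pair()

# Each side combines its own private key with the other side's public key.
client_shared = pow(server_public, client_private, P)
server_shared = pow(client_public, server_private, P)

print(client_shared == server_shared)  # True
```

An eavesdropper sees `P`, `G`, and both public values, but recovering the shared secret from those alone is the hard problem the algorithm relies on.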
The RSA part of the cipher suite also denotes two more things:
- The public key type used to authenticate the server in step 3 (and the client if client authentication is required – part of step 4)
- And the public key type used to sign the server’s ECDHE public key during the exchange
In this step the client uses the following information to compute the shared secret:
- Client’s ECDHE private key
- Server’s ECDHE public key
Then the client combines the shared secret with the random values from steps 1 and 2 to derive the uniform master secret, from which the keys that encrypt the traffic are generated. In our example, the derivation uses SHA384 (the last part of the cipher suite).
On the other side, the server uses the following information to come up with the same shared secret:
- Client’s ECDHE public key
- Server’s ECDHE private key
Then the server combines the shared secret with the same random values to derive the same uniform master secret, from which the keys that encrypt the traffic are generated.
It is important to note that when the ECDHE algorithm is used for key exchange, both the client and the server generate the master secret independently and never send it over the wire, which limits the opportunities for sniffing the secret.
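For the curious, here is a minimal sketch of how TLS 1.2 turns the pre-master secret and the two random values into the master secret, following the pseudorandom function described in RFC 5246 with SHA384 as the hash. The function names are my own, and the pre-master secret here is just random bytes standing in for the value that the key exchange would produce.

```python
import hashlib
import hmac
import os

def p_hash(secret, seed, length, hash_name="sha384"):
    """Expand `secret` into `length` bytes with repeated HMAC calls."""
    output = b""
    a = seed                                                # A(0) = seed
    while len(output) < length:
        a = hmac.new(secret, a, hash_name).digest()         # A(i) = HMAC(secret, A(i-1))
        output += hmac.new(secret, a + seed, hash_name).digest()
    return output[:length]

def derive_master_secret(pre_master_secret, client_random, server_random):
    """The master secret is the first 48 bytes of the PRF output."""
    seed = b"master secret" + client_random + server_random
    return p_hash(pre_master_secret, seed, 48)

# Placeholder inputs: in a real handshake these come from steps 1, 2 and 4.
client_random = os.urandom(32)
server_random = os.urandom(32)
pre_master_secret = os.urandom(48)

# Both sides hold the same three inputs, so they derive the same result.
client_side = derive_master_secret(pre_master_secret, client_random, server_random)
server_side = derive_master_secret(pre_master_secret, client_random, server_random)
print(client_side == server_side)  # True
```

Because the random values from the hello messages are mixed in, every session ends up with a different master secret.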
Now, let’s move forward and look at the remaining steps in the handshake and decipher the rest of the cipher suite…
Step 5: If the TLS server sent a “client certificate request”, the client sends a random byte string encrypted with the client’s private key, together with the client’s digital certificate, or a “no digital certificate alert”.
This step is used only for authentication, so that the server can make sure it communicates with the correct client. When browsing, this step is usually not required because browsers are typically not authenticated with certificates. However, if you develop services that need to authenticate their clients using certificates, this is an important step.
Note that the cipher suite is already agreed upon, so in this step the client should send a public key in the previously agreed format, in our case RSA.
Step 6: The TLS server verifies the client certificate.
This step is exactly the same as step 3 but on the server side.
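If you are building such a service in Python, the server side of steps 5 and 6 might be configured roughly as follows. This is a sketch under assumptions: `make_server_context` is a hypothetical helper, and the file paths you pass to it would be your own certificate, private key, and the CA bundle that issued your clients' certificates.

```python
import ssl

def make_server_context(cert_file=None, key_file=None, client_ca_file=None):
    """Hypothetical helper: a server context that demands a client certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # CERT_REQUIRED makes the server send a certificate request (step 5)
    # and reject clients whose certificate cannot be verified (step 6).
    context.verify_mode = ssl.CERT_REQUIRED
    if cert_file and key_file:
        # The server's own certificate and private key (used in step 2).
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # The CA certificates trusted to have issued client certificates.
        context.load_verify_locations(cafile=client_ca_file)
    return context

context = make_server_context()
```

Without the certificate files the context cannot serve real connections; the point is only to show where client authentication is switched on.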
Step 7: The TLS client sends the server a “finished” message.
Step 8: The TLS server sends the client a “finished” message.
Those two steps just confirm from both sides that the handshake is complete and both parties can start exchanging traffic securely.
Step 9: For the duration of the TLS session, the server and the client can now exchange messages that are symmetrically encrypted with the shared secret key.
The communication between the TLS server and the TLS client is now encrypted using symmetric keys (derived from the master secret) that both parties generated independently. For this communication, the AES-256 symmetric-key encryption algorithm is used in our example. This is contrary to the popular belief that the traffic is encrypted with the public key of the server on the client side and decrypted with the private key on the server side.
To complete the example, we need to explain two more parts:
- GCM (Galois/Counter Mode) is the mode of operation for symmetric-key block ciphers like AES. It is an authenticated encryption mode, which means that it provides both data confidentiality and authenticity on its own. Older cipher suites use the CBC mode of operation instead and rely on a separate HMAC for authenticity. GCM is quite efficient and widely used.
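- SHA384 is the Secure Hash Algorithm variant used by the handshake’s pseudorandom function to derive the master secret and the session keys, and to verify the handshake messages. In GCM cipher suites the integrity of each message is protected by GCM itself; in older CBC cipher suites the hash in this position is also used in the HMAC that protects every message.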
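To pull all the parts together, the cipher suite name can be split into its components with a few lines of code. This is a simple sketch that only handles names shaped like our example (key exchange and authentication before `_WITH_`, then cipher, mode, and hash); `parse_cipher_suite` is my own helper, not part of any library.

```python
def parse_cipher_suite(name):
    """Split a TLS 1.2 style cipher suite name into its building blocks."""
    if not name.startswith("TLS_") or "_WITH_" not in name:
        raise ValueError("unsupported cipher suite name: " + name)
    handshake, bulk = name[len("TLS_"):].split("_WITH_", 1)
    # Before _WITH_: key exchange, then (optionally) the authentication type.
    key_exchange, _, authentication = handshake.partition("_")
    # After _WITH_: cipher and key size, then the mode, then the hash.
    *cipher_parts, mode, hash_name = bulk.split("_")
    return {
        "key_exchange": key_exchange,
        "authentication": authentication or key_exchange,
        "cipher": "_".join(cipher_parts),
        "mode": mode,
        "hash": hash_name,
    }

parsed = parse_cipher_suite("TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384")
for part, value in parsed.items():
    print(part, "=", value)
```

Suites that do not follow this pattern, such as the ones based on ChaCha20-Poly1305 or the shorter TLS 1.3 names, would need extra handling.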
Keep in mind that the flow above describes the superset of steps for establishing a secure TLS channel. Depending on the TLS version and the agreed-upon cipher suite, some information may be ignored during the exchange, which may make your communication vulnerable.
If you want to get really technical about how the handshake works, you can read Moserware’s post that dates back to 2009 http://www.moserware.com/2009/06/first-few-milliseconds-of-https.html (not a lot has changed though) or watch the Khan Academy video on Diffie-Hellman Key Exchange.