In this video, I’m going to explain some details about Transport Layer Security, or
TLS. It’s what you use when you use HTTPS, secure HTTP. It’s specified in RFC5246.
Transport layer security is exactly that: it provides security at the transport layer,
so between applications. It’s a session layer on top of TCP. It provides a stream
abstraction, just like TCP. So to most applications, it looks just like TCP, a bidirectional, reliable
byte stream. But TLS adds confidentiality, integrity, and authenticity to the stream.
So using TLS properly, you can protect your communication from eavesdropping, tampering,
and spoofing attacks.
As of this recording, the most recent version of TLS is version
1.2, specified in RFC5246. TLS started as the Secure Sockets Layer (SSL), developed by Netscape,
back when the web was just starting. The version order is SSL 2.0, SSL 3.0, TLS 1.0, TLS 1.1,
and now TLS 1.2. TLS is used by HTTPS.
So what ciphers and keys does TLS use? It has a wide range that are available. As part
of setting up a TLS session, the client and server negotiate four separate ciphers. The
first is the cipher used to authenticate the server. You can also optionally authenticate
a client. The second is the cipher used to exchange symmetric keys. The third is the
cipher used for symmetric confidentiality. The fourth is the cipher used for symmetric
integrity.
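To make the four roles concrete, here is a minimal Python sketch that splits a TLS 1.2 style cipher suite name into them. The function name parse_cipher_suite is hypothetical, and the sketch only reads the naming convention (everything left of WITH covers key exchange and authentication, everything right of it covers the symmetric cipher and the MAC hash). It assumes the older CBC-style suite names and does not handle every suite that exists.

```python
def parse_cipher_suite(name: str) -> dict:
    """Split a TLS 1.2 style suite name into the four negotiated roles.

    Hypothetical helper for illustration; it only interprets the name.
    """
    prefix, _, rest = name.partition("_")
    if prefix != "TLS" or "_WITH_" not in rest:
        raise ValueError(f"not a TLS 1.2 style suite name: {name}")

    handshake, bulk = rest.split("_WITH_", 1)

    # Left of WITH: key exchange, then authentication. A single token such
    # as "RSA" means the same algorithm does both jobs.
    parts = handshake.split("_")
    key_exchange = parts[0]
    authentication = parts[1] if len(parts) > 1 else parts[0]

    # Right of WITH: the symmetric cipher, then the hash used for the MAC.
    cipher, _, mac = bulk.rpartition("_")

    return {
        "authentication": authentication,
        "key_exchange": key_exchange,
        "confidentiality": cipher,
        "integrity": mac,
    }


rsa_suite = parse_cipher_suite("TLS_RSA_WITH_AES_128_CBC_SHA")
dhe_suite = parse_cipher_suite("TLS_DHE_RSA_WITH_AES_256_CBC_SHA256")
```

In the first name RSA fills both the authentication and key exchange roles, which matches the handshake described in this video, where the pre-master secret is encrypted with the server's public key.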
TLS negotiates these four ciphers in a 5-step protocol to initiate a session. This all happens
after we open a TCP connection. These messages are sent over TCP.
In the first step, which is sent as plaintext, the client sends a list of ciphers it supports,
and a random number it has generated.
The server responds with what ciphers to use, its own random number, and a certificate containing
its public key. This is also sent as plaintext.
In the third step, the client sends something called a “pre-master secret” to the server,
encrypted with the server’s public key. Using this pre-master secret and the two random
numbers exchanged in plaintext, the client and server compute the keys for the session.
I’ll explain the details of how this works in a moment. But for now, just realize that
at this point the client and server have generated the symmetric keys that their ciphers need.
Next, the client sends a finish message, encrypted and MACed with symmetric keys generated with
the server random, client random, and pre-master secret. This message includes a MAC of the
handshake messages, to ensure that both sides saw the same messages. The MAC is also computed
with a symmetric key generated from the server random, client random, and pre-master secret.
Finally, the server sends a finish message. This is secured similarly to the client finish
message, and also contains a MAC of the handshake messages. MACing the handshake messages allows
TLS to protect against an adversary trying to force the two parties to choose a different
cipher. Since the first two steps are not secured, having neither confidentiality
nor integrity, an adversary could perform a man-in-the-middle attack to change the offered
and selected ciphers. MACing the handshake messages lets them detect this.
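Here is a rough Python sketch of why MACing the handshake catches that attack. It is a simplification: real TLS 1.2 computes the finish message with its pseudo-random function over a hash of the handshake messages, while this sketch uses a plain HMAC-SHA256. The message contents, the key, and the function name finished_mac are all made up for illustration.

```python
import hashlib
import hmac
from typing import Iterable


def finished_mac(key: bytes, handshake_messages: Iterable[bytes]) -> bytes:
    """MAC over everything one side saw during the handshake (simplified)."""
    transcript_hash = hashlib.sha256(b"".join(handshake_messages)).digest()
    return hmac.new(key, transcript_hash, hashlib.sha256).digest()


# What the client actually sent, in plaintext.
client_hello = b"ClientHello ciphers=AES_256,AES_128,RC4"
# A man-in-the-middle strips the strong ciphers before the server sees it.
tampered_hello = b"ClientHello ciphers=RC4"
# The server picks from the only option it was shown.
server_hello = b"ServerHello cipher=RC4"

# Placeholder for a key derived from the randoms and the pre-master secret.
# The adversary never learns it, so it cannot forge a matching MAC.
session_key = b"k" * 32

client_view = [client_hello, server_hello]    # what the client saw
server_view = [tampered_hello, server_hello]  # what the server saw

client_finished = finished_mac(session_key, client_view)
server_expected = finished_mac(session_key, server_view)

# The two sides saw different handshakes, so the MACs disagree.
downgrade_detected = not hmac.compare_digest(client_finished, server_expected)
```

Because the two transcripts differ, the server's check of the client's finish message fails and the session is aborted rather than continuing with the weaker cipher.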
Now, at this point, we have a secure connection, protected through symmetric ciphers that both
sides have agreed on.
What does that look like? Well, to provide integrity, TLS needs to break application
data up into chunks that it can provide MACs for. So TLS takes the application stream of
data and breaks it up into records. There are also records that don’t contain data,
such as records to generate new keys. But let’s focus on data records. TLS takes the
application stream and breaks it into data records, which state their length and have
a MAC. These records are encrypted with the chosen ciphers and keys and then sent over
TCP. This appears as a stream of data to TCP, which then breaks it into segments. Records
can be much larger than TCP segments. So a single record might be broken into many segments,
and record and segment boundaries might not line up.
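The framing can be sketched in a few lines of Python. This is a simplified picture under several assumptions: the record layout here is just a 2-byte length, the data, and an HMAC-SHA256 tag; there is no encryption step; and the 1,460-byte segment size is only a typical value. Real TLS records also carry a type and version, and the names to_records and to_segments are hypothetical.

```python
import hashlib
import hmac
import struct

# TLS 1.2 caps the data carried in one record at 2^14 bytes.
MAX_RECORD_PAYLOAD = 2 ** 14


def to_records(stream: bytes, mac_key: bytes) -> list:
    """Break an application byte stream into length-prefixed, MACed records."""
    records = []
    for seq, start in enumerate(range(0, len(stream), MAX_RECORD_PAYLOAD)):
        chunk = stream[start:start + MAX_RECORD_PAYLOAD]
        # Cover a sequence number too, so records cannot be silently
        # reordered or dropped.
        header = struct.pack("!QH", seq, len(chunk))
        tag = hmac.new(mac_key, header + chunk, hashlib.sha256).digest()
        records.append(struct.pack("!H", len(chunk)) + chunk + tag)
    return records


def to_segments(wire: bytes, mss: int = 1460) -> list:
    """Stand-in for TCP: cut the byte stream with no regard for records."""
    return [wire[i:i + mss] for i in range(0, len(wire), mss)]


app_data = b"x" * 40000
records = to_records(app_data, mac_key=b"m" * 32)
segments = to_segments(b"".join(records))
```

With 40,000 bytes of application data this produces 3 records but 28 segments, and the first record ends partway through a segment, which is the misalignment described above.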
TLS includes compression as one of its features. If, for example, you configure TLS to provide
integrity but not confidentiality, then you’d be sending plaintext. English text is generally
very compressible, almost 10 to 1. So you can configure TLS to compress the data. By
default compression is off. If there’s confidentiality, then compression won’t help since the bits
should seem random.
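You can see the difference between compressible text and random-looking bits with a quick Python experiment. This uses zlib as a stand-in compressor and os.urandom as a stand-in for well-encrypted data; both are assumptions for illustration. The sample text is one sentence repeated, so its ratio is exaggerated compared with ordinary English.

```python
import os
import zlib

# Repetitive sample text; real prose compresses less dramatically.
english = (
    b"TLS adds confidentiality, integrity, and authenticity to the stream. "
    * 50
)

# Random bytes standing in for ciphertext, which should look random.
random_like = os.urandom(len(english))

english_ratio = len(english) / len(zlib.compress(english))
random_ratio = len(random_like) / len(zlib.compress(random_like))

print(f"text:   {english_ratio:.1f} to 1")
print(f"random: {random_ratio:.2f} to 1")
```

The text shrinks substantially, while the random bytes come out no smaller than they went in.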
Let’s look at how TLS establishes its session keys. Remember, both the server and the client
provide random numbers. That way, even if one of them has a bad random number generator,
you’ll still have randomness. The client also sends a “pre-master secret” encrypted
with the server’s public key. The client and server both combine these three pieces
of information to something called a “master secret,” from which they generate their
session keys. Once they compute the master secret, the client and server throw away the
pre-master secret.
They generate six keys, whose lengths are
determined by the ciphers used. They generate a key used to encrypt data from the client
to the server, a key to MAC data from the client to the server, a key to encrypt data
from the server to client, a key to MAC data from the server to client, a client initialization
vector (for ciphers that need it) and a server initialization vector (for ciphers that need
it).
Having this master secret, the client and
server can regenerate new keys by choosing new random numbers. So you can resume a session
with the same master secret but new keys.
Here’s a picture of the whole process. The client and server take the client random,
the server random, and the pre-master secret and pass them as input to something called
a pseudo-random function, or a PRF, which basically generates bits that look random.
This produces 48 bytes worth of random bits, the master secret. TLS takes the master secret
and the two random values and passes them as input to a pseudo-random function to generate
as many bits as you need for all of the keys. So, for example, if the MAC keys are 512 bits,
the write keys are 256 bits, and the initialization vectors are 128 bits, then you call the pseudo-random
function enough times to generate 1,792 bits (two of each, so 2 × (512 + 256 + 128)), which become the keys.
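Here is a sketch of that derivation in Python, following the TLS 1.2 PRF construction from RFC5246 (HMAC-SHA256 iterated until enough bytes come out). The key sizes are the ones from the example above rather than those of any particular cipher suite, the function and dictionary names are hypothetical, and the random inputs are placeholders generated locally instead of exchanged in a handshake. Treat it as an illustration, not a vetted implementation.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """Expand secret and seed into `length` pseudo-random bytes."""
    out = b""
    a = seed  # A(0)
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    return p_sha256(secret, label + seed, length)


# Example sizes from the text, in bytes: 512-bit MAC keys, 256-bit write
# keys, 128-bit initialization vectors.
MAC_LEN, KEY_LEN, IV_LEN = 64, 32, 16


def derive_keys(pre_master: bytes, client_random: bytes, server_random: bytes):
    # Step 1: 48-byte master secret from the three handshake inputs.
    master = prf(pre_master, b"master secret",
                 client_random + server_random, 48)
    # Step 2: enough bytes for all six values, then slice them off in order.
    total = 2 * (MAC_LEN + KEY_LEN + IV_LEN)
    block = prf(master, b"key expansion",
                server_random + client_random, total)
    names = ["client_mac", "server_mac", "client_key",
             "server_key", "client_iv", "server_iv"]
    sizes = [MAC_LEN, MAC_LEN, KEY_LEN, KEY_LEN, IV_LEN, IV_LEN]
    keys, offset = {}, 0
    for name, size in zip(names, sizes):
        keys[name] = block[offset:offset + size]
        offset += size
    return master, keys


# Placeholder inputs; in a real handshake these come from the two sides.
master_secret, keys = derive_keys(os.urandom(48), os.urandom(32), os.urandom(32))
total_bits = 8 * sum(len(k) for k in keys.values())
```

Each HMAC-SHA256 call yields 32 bytes, so producing the 224 bytes (1,792 bits) in this example takes seven rounds of the loop.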