Ask any security practitioner about protecting data in transit, and they will almost certainly mention Transport Layer Security (TLS) or its predecessor, Secure Sockets Layer (SSL). In fact, it is not hyperbole to say that this technology is the cornerstone of secure networked communications. There are, of course, other methods to securely communicate between devices (e.g., IPsec, SSH), but the versatility and ubiquity of TLS put it in a category unto itself. TLS drives everything from secure browsing, to securing RESTful interactions, to file transfer, to VPN connections, to cloud access, and dozens of other secure communication scenarios.

Given this, most practitioners will be quite familiar with the usage side of TLS; i.e., as a control and architectural mechanism for ensuring that communications are secured appropriately. But when it comes to the mechanics of how it's used—and more specifically, how it's configured and maintained—there are some practical challenges. Why? There are a few reasons, but chief among them is that, when everything is working appropriately, TLS usage is invisible. What I mean by "invisible" is that we don't really pay attention to it. TLS does what it does without a lot of care and feeding from us—it just works.

This "it just works" mentality can be dangerous in some situations though. It's a trap because it lulls us into a mindset where we're not paying explicit attention to TLS. And there are some truly catastrophic things that can go wrong when we're not paying attention. With this in mind, let's examine TLS (and, by extension, SSL) from an assurance point of view; specifically, what we need to do to make sure that configuration is appropriate, that diligence is being employed in keeping it up to date, that potential vulnerabilities (in implementations and the protocol itself) are tracked and addressed, and that configuration is hardened and robust.

Ciphersuites and Certificates

Though many practitioners know what TLS is conceptually, there are a few key points that are important to level-set prior to any serious discussion about configuration and maintenance. Specifically, these are X.509v3 certificates, keys, and ciphersuite negotiation.

It bears mentioning that this is not intended to be an exhaustive or engineering-level discussion of how TLS works. Entire books have been written on the topic. And, in truth, an engineering or developer-level understanding isn't necessary to validate that usage and maintenance (at a high level) are appropriate from a typical assurance and assessment viewpoint. That said, there are a few key concepts to tee up before going through the specific steps of validation.

First, ciphersuite negotiation. Not every TLS implementation will necessarily support every set of cryptographic algorithms that is possible under the standard; likewise, specific servers may have specific needs that influence what algorithms and modes of operation are acceptable. The protocol needs to be able to handle the situation where two parties need to communicate but one party supports a different set of algorithms than the other. For example, say a server at a US federal agency (which has a regulatory mandate requiring only approved algorithms to be employed) wishes to communicate with another host that has a less restrictive set of parameters.

What happens in this case? The TLS protocol itself has a mechanism built into it to allow ciphers to be negotiated as part of the handshake process. The two participants in the communication negotiate a ciphersuite (a defined set of cryptographic primitives or algorithms) that they both can agree on at the beginning. Part of configuring TLS—both on the client and the server side of the equation—means establishing which ciphersuites are acceptable and which are not.
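To make the negotiation concrete, here is a minimal sketch of the selection step only. It assumes the server picks the first suite in its own preference order that the client also offered; real implementations can be configured differently, and the function name and the three configurations are hypothetical.

```python
from typing import Optional, Sequence


def negotiate(server_prefs: Sequence[str], client_offer: Sequence[str]) -> Optional[str]:
    """Return the first suite in the server's preference order that the client offered."""
    offered = set(client_offer)
    for suite in server_prefs:
        if suite in offered:
            return suite
    return None  # no overlap: a real handshake would fail at this point


# Hypothetical configurations, for illustration only.
restricted_server = ["ECDHE-RSA-AES256-GCM-SHA384", "ECDHE-RSA-AES128-GCM-SHA256"]
permissive_client = ["ECDHE-RSA-CHACHA20-POLY1305", "ECDHE-RSA-AES128-GCM-SHA256", "AES128-SHA"]
legacy_client = ["RC4-MD5"]

print(negotiate(restricted_server, permissive_client))  # ECDHE-RSA-AES128-GCM-SHA256
print(negotiate(restricted_server, legacy_client))      # None
```

The restricted server and the permissive client settle on the one suite they share, while the legacy client has nothing acceptable to offer and the connection cannot be established.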

Each ciphersuite specifies a key exchange algorithm, a (bulk) data encryption algorithm, and an algorithm used to provide a Message Authentication Code (MAC). For example, a commonly negotiated ciphersuite is ECDHE-RSA-AES256-GCM-SHA384, which specifies elliptic curve Diffie-Hellman key exchange with RSA signing, 256-bit AES in Galois/Counter Mode (GCM) for bulk encryption, and SHA-384 as the hash function. (Strictly speaking, GCM is an authenticated encryption mode that provides message integrity itself, so in suites like this one the hash feeds the protocol's key derivation rather than a separate MAC.) Generally, the default set of ciphersuites will be acceptable to most organizations for most purposes, but as we outline below, care should be taken to explicitly validate this.
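One way to see these components for yourself is to ask a TLS library what it has enabled. The sketch below uses Python's standard ssl module; describe_ciphers is a hypothetical helper name, and the exact suites and fields reported depend on the local OpenSSL build, so treat the output as a starting point rather than a definitive list.

```python
import ssl


def describe_ciphers(ctx: ssl.SSLContext) -> list:
    """Break each enabled ciphersuite into the components the library reports."""
    rows = []
    for cipher in ctx.get_ciphers():
        rows.append({
            "name": cipher["name"],
            "protocol": cipher.get("protocol"),
            "key_exchange": cipher.get("kea"),      # .get(): field may be absent on some builds
            "authentication": cipher.get("auth"),
            "bulk_cipher": cipher.get("symmetric"),
            "digest": cipher.get("digest"),
            "aead": cipher.get("aead"),
        })
    return rows


if __name__ == "__main__":
    # Default client-side context; a server's configuration needs its own review.
    for row in describe_ciphers(ssl.create_default_context()):
        print(row)
```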

Secondly, certificates. Part of the design criteria of SSL/TLS is the ability to establish trust between parties in the connection by authenticating that they are who they say they are. It leverages asymmetric cryptography to do this. This needs to happen even in situations where a client has not encountered a host before. As a practical matter, it isn't reasonable to expect every client to know about (or share secrets with) every party that it may desire to communicate with. So how does the protocol account for this?

It does so by including a mechanism for allowing an implementation to maintain a trusted set of Certificate Authorities (CAs) responsible for issuing X.509v3 certificates to TLS servers (and, more rarely, to clients when mutual authentication, i.e., authentication of both client and server, is required). X.509v3 is an International Telecommunication Union (ITU) standard that defines a portable data structure (i.e., the certificate) containing the public key of the party being authenticated as well as other information such as (typically) the fully qualified domain name (FQDN) of the host, a lifetime (starting validity and expiration), usage restrictions, and other pertinent details. The certificate is signed by the CA, meaning that a TLS client that trusts the CA can rely on the public key contained therein.
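As an illustration of those fields, the following sketch summarizes a certificate in the dictionary form that Python's ssl module returns from SSLSocket.getpeercert(). The helper name summarize_cert and the sample certificate are hypothetical; no network connection is made here.

```python
import ssl
from datetime import datetime, timezone


def _to_utc(cert_time: str) -> datetime:
    """Convert a certificate time string such as 'Jan  5 09:34:43 2018 GMT'."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(cert_time), tz=timezone.utc)


def summarize_cert(cert: dict) -> dict:
    """Pull out the fields an assessor usually cares about."""
    subject = {key: value for rdn in cert.get("subject", ()) for key, value in rdn}
    issuer = {key: value for rdn in cert.get("issuer", ()) for key, value in rdn}
    dns_names = [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]
    return {
        "common_name": subject.get("commonName"),
        "issuer": issuer.get("commonName"),
        "dns_names": dns_names,
        "not_before": _to_utc(cert["notBefore"]),
        "not_after": _to_utc(cert["notAfter"]),
    }


# Hypothetical certificate, shaped like the output of getpeercert().
sample_cert = {
    "subject": ((("commonName", "api.example.com"),),),
    "issuer": ((("commonName", "Example Issuing CA"),),),
    "subjectAltName": (("DNS", "api.example.com"), ("DNS", "www.example.com")),
    "notBefore": "Jan  5 09:34:43 2018 GMT",
    "notAfter": "Jan  5 09:34:43 2019 GMT",
}

summary = summarize_cert(sample_cert)
print(summary["common_name"], summary["dns_names"], summary["not_after"].date())
```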

Leveraging this, clients need only explicitly trust the CAs. The mechanism for determining which CAs are trusted is implementation dependent: in the case of browsers, for example, the CA/Browser Forum (CAB Forum) maintains a set of baseline requirements (security and transparency requirements) for inclusion in the default trusted repositories of popular browsers. Should a certificate become compromised, a CA can revoke it, and clients can learn of the revocation either from a published Certificate Revocation List (CRL) or from responses to Online Certificate Status Protocol (OCSP) requests.

Validating Configuration

As you might imagine, packed into the above are a few things that it behooves organizations to evaluate specifically as they consider their own usage. First, are the ciphersuites the organization is supporting appropriate and in line with security requirements? Has the organization, in fact, looked? If so, has it considered both what the default set of ciphersuites is from a server perspective and what endpoints are configured to accept from a client perspective? Typically, and for most purposes, the default set might be sufficient. However, evaluating it and making a conscious decision that it is sufficient is different from just assuming so and hoping for the best.
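A check like this can be scripted. The sketch below flags enabled suites whose names contain tokens that an organization has decided are unacceptable. The denylist is a hypothetical policy chosen for illustration, not a recommendation, and the helper names are my own.

```python
import ssl

# Assumption: an example policy only. Substitute your organization's own standard.
DENYLIST_TOKENS = {"NULL", "EXP", "RC4", "DES", "3DES", "MD5", "ADH", "AECDH"}


def weak_ciphers(names) -> list:
    """Return the suite names that contain any denylisted token."""
    flagged = []
    for name in names:
        # Older suite names separate tokens with '-', TLS 1.3 names with '_'.
        tokens = set(name.upper().replace("_", "-").split("-"))
        if tokens & DENYLIST_TOKENS:
            flagged.append(name)
    return flagged


def audit_context(ctx: ssl.SSLContext) -> list:
    """Apply the policy to whatever a given context has enabled."""
    return weak_ciphers(cipher["name"] for cipher in ctx.get_ciphers())


if __name__ == "__main__":
    print(audit_context(ssl.create_default_context()))
```

Matching on name tokens is crude, but it is enough to turn "we assume the defaults are fine" into a recorded, repeatable check.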

Secondly, the specifics of acceptable CAs as well as revocation status validation. Would it surprise you to learn that many implementations have historically had revocation checking disabled by default? It might if you're not looking for this explicitly and consciously. Beyond this, a number of commercial, default-trusted CAs have had significant security incidents, some of which have led to issuance of clearly fraudulent certificates in the past. Do you know what your clients are configured to trust? Do you have confidence revocation checking is enabled? The details of how to evaluate and control this setting will vary depending on implementation, but adding it to what you look at as you evaluate applications and your overall configuration is certainly a good idea.
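As one example of what "looking" can mean, the sketch below inspects a client context built with Python's standard ssl module. It reports how many CA certificates and CRLs are loaded and whether CRL checking is switched on. The helper names are hypothetical, the flags shown cover CRL-based checking only, and on some platforms trusted certificates are loaded lazily, so a low count does not necessarily mean an empty trust store.

```python
import ssl


def revocation_checking_enabled(ctx: ssl.SSLContext) -> bool:
    """True if the context is set to check CRLs for the leaf or the whole chain."""
    crl_flags = ssl.VERIFY_CRL_CHECK_LEAF | ssl.VERIFY_CRL_CHECK_CHAIN
    return bool(ctx.verify_flags & crl_flags)


def trust_report(ctx: ssl.SSLContext) -> dict:
    """Summarize what the context trusts and whether it checks revocation."""
    stats = ctx.cert_store_stats()
    return {
        "trusted_ca_certs_loaded": stats["x509_ca"],
        "crls_loaded": stats["crl"],
        "crl_checking_enabled": revocation_checking_enabled(ctx),
    }


if __name__ == "__main__":
    ctx = ssl.create_default_context()
    print(trust_report(ctx))
    # Opting in; validation then fails unless a matching CRL has been loaded.
    ctx.verify_flags |= ssl.VERIFY_CRL_CHECK_LEAF
    print(trust_report(ctx))
```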

Next up, private keys. Any time there is a certificate, there is a corresponding private key, and most of the time it is one that needs to be accessible at runtime (for example, in the case of a webserver). If parameters for storage of that key, and for protection of that storage, have not been explicitly decided ahead of time, issues can arise, as you might imagine. Options such as hardware security modules (HSMs) may be appropriate based on the specifics of usage, but at a minimum, having standards about where the key can be stored (not in source code repositories being a good start) is, again, a consideration that deserves attention.
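Where keys do live on disk, even a simple permissions check can catch the worst mistakes. The sketch below assumes a POSIX filesystem and a hypothetical standard that only the file's owner may have any access to the key file. The function names are my own.

```python
import os
import stat


def mode_is_private(mode: int) -> bool:
    """True if neither group nor other has any permission bits set."""
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def key_file_is_private(path: str) -> bool:
    """Check the permission bits of a key file on disk (POSIX semantics assumed)."""
    return mode_is_private(stat.S_IMODE(os.stat(path).st_mode))


print(mode_is_private(0o600))  # True: owner read/write only
print(mode_is_private(0o644))  # False: group and other can read the key
```

A check like this says nothing about who the owner is or whether the key has leaked elsewhere, so it complements a storage standard rather than replacing one.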

Implementation and Artifact Maintenance

Beyond ensuring that a known-good configuration is defined, vetted, and systematically analyzed, there is also the question of ongoing maintenance and hygiene. The reason we are targeting TLS in this discussion, and not SSL, has in large part to do with the fact that SSL is at this point almost entirely deprecated for most usage. There are also a number of known issues in versions of TLS prior to version 1.2. Issues impacting widely used TLS implementations (e.g., Heartbleed) periodically arise, as do issues that impact specific versions of the protocols themselves (e.g., DROWN and POODLE). It is imperative that, just as you would track and address the vulnerability status of any product that you deploy, you likewise track and address issues both in TLS implementations and in the underlying protocols themselves.
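Protocol version floors are one of the easier items to verify in code. The sketch below, using Python's standard ssl module, checks whether a context refuses anything older than TLS 1.2 and shows one way to set that floor explicitly. The helper names are hypothetical, and the right floor for your environment is a policy decision.

```python
import ssl


def meets_minimum(ctx, floor: ssl.TLSVersion = ssl.TLSVersion.TLSv1_2) -> bool:
    """True if the context will not negotiate a protocol version below the floor."""
    return ctx.minimum_version >= floor


def hardened_client_context() -> ssl.SSLContext:
    """Default client context with the version floor pinned explicitly."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


if __name__ == "__main__":
    print(meets_minimum(ssl.create_default_context()))
    print(meets_minimum(hardened_client_context()))
```

Pinning the floor explicitly, rather than relying on whatever the library default happens to be, also leaves a visible record of the decision in the configuration.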

From an assurance point of view, explicit conversations with system administration personnel can go a long way to ensuring tracking and remediation is done. Some regulatory requirements do explicitly include these actions as both a requirement and something to explicitly validate (e.g., PCI DSS), but others do not. It is therefore important that someone leads the charge to make sure tracking and remediating vulnerabilities is happening. Beyond this, it can be a useful measure to include these issues in written policy and procedures documentation; there are a number of places this can happen: cryptographic usage policies, known and approved cryptographic module lists, the same procedures that ensure applications and hosts are patched (whatever your organization might happen to call them), etc. Requiring that they be addressed, though, is useful both to establish that they are explicitly checked and that documentation exists.

Beyond assurance, there are additional considerations that are important to stay on top of from a hygiene perspective. One example is maintenance of X.509v3 certificate information: who is responsible for the certificates, who requested them, where they are, which hosts they live on, what their properties are, and when they expire. For even a mid-sized organization with just a handful of certificates, keeping track of these things requires explicit recordkeeping. When we don't keep those records, consequences such as unanticipated expiration can arise; in the case of a RESTful API (particularly where mutual authentication is used), an expired certificate can cause application processing to fail in ways that might prove time consuming or difficult to debug. Unless we are specifically accounting for these issues as part of normative asset management, we might not stop to include certificates in the inventories we maintain and keep updated.
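The recordkeeping itself does not need to be elaborate. The sketch below models an inventory entry with the attributes listed above and flags certificates that expire within a chosen window. The record layout, the 30-day window, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CertRecord:
    common_name: str
    owner: str          # who is responsible for the certificate
    requested_by: str   # who requested it
    hosts: list         # which hosts it lives on
    not_after: datetime


def expiring_soon(records, now: datetime, window: timedelta = timedelta(days=30)) -> list:
    """Return records already expired or expiring within the window, soonest first."""
    due = [record for record in records if record.not_after - now <= window]
    return sorted(due, key=lambda record: record.not_after)


# Hypothetical inventory entries.
inventory = [
    CertRecord("www.example.com", "web team", "alice", ["web01", "web02"],
               datetime(2019, 1, 15, tzinfo=timezone.utc)),
    CertRecord("api.example.com", "platform team", "bob", ["api01"],
               datetime(2018, 4, 1, tzinfo=timezone.utc)),
]

now = datetime(2018, 3, 15, tzinfo=timezone.utc)
for record in expiring_soon(inventory, now):
    print(record.common_name, record.owner, record.not_after.date())
```

Run on a schedule, a report like this turns an unanticipated expiration into a routine renewal ticket.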

Regardless of the specific strategies that you employ to address these points, they should (like any good strategy) be something that you spend time thinking about and planning—and also something you specifically assess and validate. Falling into the trap of "it just works" is something that disciplined security pros should take care to avoid.

Ed will be leading a ransomware tabletop exercise at InfoSec World 2018 in Orlando, Florida, March 19-21, 2018. Join this session to work through a high-pressure, high-stakes scenario without the stress of a real-life security incident.