Randomness and Entropy

This is another rekindled item from the past. As before, it has been updated to address new thinking and to fix some of the language.

In cryptography, where we follow the principle that security lies in the key and the key alone, the length of the key is an important consideration. Possibly more important, however, is the randomness of the key: if keys are not random, an attacker has a much better chance of finding them.

Cryptographic security is non-trivial and requires an understanding of a number of concepts, two of which are entropy and randomness; the two are strongly linked. In communications security there is often an assumption that any transmitted message will vary from all other transmitted messages, but in some communications systems there may be very strong commonality between messages that can be exploited. A goal of cryptography is to mask the similarity between messages, commonly referred to as maximizing the entropy of a transmitted message. If a message to be encrypted has low entropy, the cryptographer has to raise the entropy prior to encryption or as part of the encryption process. There are a number of examples of messages with inherently low entropy:

  • Short text messages.
  • Telematics status messages.
  • Call setup messages.

Message entropy is discussed in a number of mathematical sources, but at the root is Shannon’s “A Mathematical Theory of Communication”. Essentially, if the attacker knows or guesses that the message can take only a small set of values, the probability of correctly guessing bit N+1 after receiving bit N tends towards 1, whereas for a random binary alphabet the probability of a correct guess should always be 0.5. In a cryptographic context, where Alice sends a message m to Bob as a binary string, the rule of thumb is that the higher the entropy of m, the more guesses an attacker needs to recover m. After encryption of message m to produce ciphertext c, the entropy of c should be as high as possible.
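To make this concrete, here is a minimal sketch (Python, mine rather than Shannon’s) that estimates the empirical entropy of a byte string; the function name and test messages are illustrative only.

```python
# A minimal sketch estimating empirical Shannon entropy in bits per byte:
# H = -sum(p_i * log2(p_i)) over the observed symbol frequencies.
import math
import os
from collections import Counter

def shannon_entropy(message: bytes) -> float:
    """Empirical entropy of a byte string, in bits per byte (max 8.0)."""
    counts = Counter(message)
    total = len(message)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A repetitive telematics-style status message scores low ...
print(shannon_entropy(b"STATUS:OK;STATUS:OK;STATUS:OK;"))
# ... while OS-sourced random bytes approach the 8-bit maximum.
print(shannon_entropy(os.urandom(4096)))
```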

The rule of thumb for randomness is that an attacker who gets access to all the historic random elements (all N values) must gain zero information towards correctly guessing the value of the (N+1)th element. If this condition is met, the element can be considered random – but only with respect to the previous elements. We also need to determine whether the randomness can be emulated: even if prior knowledge gives no greater likelihood of guessing the (N+1)th element, we have to be assured that knowledge of the context does not allow it to be guessed either.

The source of entropy that seeds a system’s random number generator has to be good, yet many are poor and, further, are unsuited to standalone devices. For example, sources of entropy may include:

  • Movement of the mouse (not available in embedded and virtual systems with no GUI).
  • The amount of free memory (not always useful in a virtualised environment where the VMs have fixed memory allocations).
  • CPU temperature readings (in server farms this will be load balanced and trimmed to a very small and optimised range, with very high load factors in many VM environments too).

If our sources of entropy are random over only a small range then their value is reduced to that range – so we should not rely on achieving 128-bit security when our randomness spans only (say) a 4-bit range, as the sketch below makes concrete.
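This hypothetical sketch seeds Python’s (non-cryptographic) random module from a 4-bit value: the resulting 128-bit key looks random, but an attacker only ever has 16 candidates to try.

```python
# Hypothetical illustration: if a PRNG is seeded from only a 4-bit entropy
# source, there are just 2**4 = 16 possible output streams. An attacker can
# enumerate every seed and reproduce any "random" key, whatever its length.
import random

def weak_key(seed_4bit: int, nbytes: int = 16) -> bytes:
    rng = random.Random(seed_4bit)   # deterministic PRNG, tiny seed space
    return bytes(rng.randrange(256) for _ in range(nbytes))

victim_key = weak_key(seed_4bit=11)  # a 128-bit key with only 4 bits of entropy

# Brute force: 16 trials instead of 2**128.
recovered = next(s for s in range(16) if weak_key(s) == victim_key)
print(f"recovered seed: {recovered}")
```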

Thus the ability of a system to generate random numbers is central to most forms of modern cryptography. Typically, cryptographic suites such as OpenSSL rely on operating system-provided methods for sourcing random numbers; examples include /dev/(u)random on *nix and rand_s() on Windows [1]. None of these methods provides truly random numbers – that would require a physical source of randomness. Instead they are seeded by sources of entropy within the system and subsequently updated periodically [3]. Typically, when an operating system shuts down, a pseudo-random seed is written to disk, and this seed is used as an additional source of entropy when the operating system starts back up again. This ensures that the machine will not boot in the same entropy state it has booted in previously. Systems providing insufficient randomness have been shown to compromise the integrity of security suites running on them; examples include the breaking of Netscape’s implementation of SSL [4] and the ability to predict Java session-ids [5].
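As a minimal sketch of doing it properly, the idiomatic route in Python is to draw key material from the OS CSPRNG layer mentioned above rather than a hand-rolled generator (the variable names are illustrative):

```python
# Sourcing key material from the operating system's CSPRNG
# (the /dev/(u)random / rand_s() layer) instead of rolling your own.
import os
import secrets

key_a = os.urandom(16)           # 128 bits straight from the OS CSPRNG
key_b = secrets.token_bytes(16)  # same source, via the secrets module

print(key_a.hex(), key_b.hex())
```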

Analysing PRNGs

Analysing PRNGs is a difficult task. The Linux PRNG, for example, is part of the kernel, so any modifications to enable introspection require that the kernel be re-compiled (such as those described in “Not-so-Random Numbers in Virtualized Linux and the Whirlwind RNG”). This re-compilation may affect kernel-provided entropy sources in such a way that the experiment no longer represents what would happen on an unmodified kernel. Gutterman, Pinkas and Reinman get around this problem by using a user-mode simulator of the Linux RNG for their experimentation, though producing such a simulator is no mean feat. They also comment that although the source code for the RNG is available, it is poorly documented, and what documentation exists is not up to date.

Summary

Good randomness that leads to high entropy, or sources of entropy that lead to true randomness, cannot be ignored. If the underlying source of randomness is weak (i.e. not really random, or random over only a very small range) then any dependent security function will be weakened. The attacker is not stupid: there is no point in battling the crypto engine and the protocols when weak randomness offers a far easier attack vector.

Debunking the crypto myths

This post comes from material I first published a number of years ago. The message it tries to get across is not new and is unlikely to change in my lifetime, so I’ve tidied it up and republished it.

Very simply, the news and entertainment media these days are full of misguided reporting on the strength of the cryptographic tools we use. I have not seen it reported that cryptography is a controlled tool with restrictions on its use and its exportability. Nor have I seen any realistic reporting of just how good modern cryptography is. So I’m going to offer some points of view and leave it to readers to apply some thought to the process.

First, let’s review what cryptographic technology is all about. Cryptography can be used in a number of applications to provide confidentiality by encryption, integrity by means of cryptographic hash functions, authentication by means of digital signatures and various forms of challenge-response protocols to prove knowledge of a secret, and access control by combining identification, authentication and proofs of entitlement. The sketch below shows two of these services in miniature.
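This illustration assumes nothing beyond Python’s standard library, and the message and key in it are obviously hypothetical: a hash gives integrity, and a keyed MAC gives message authentication.

```python
import hashlib
import hmac

message = b"transfer 100 to Bob"
secret = b"shared-secret-key"   # hypothetical pre-shared key

# Integrity: any change to the message changes the digest completely.
digest = hashlib.sha256(message).hexdigest()
print(digest)

# Authentication: only a holder of the secret can produce a valid tag.
tag = hmac.new(secret, message, hashlib.sha256).hexdigest()

# The receiver recomputes the tag and compares in constant time.
check = hmac.new(secret, message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(tag, check))   # True
```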

If we take these functions at simple face value, it means that to achieve a confidential exchange of data between Alice and Bob, with Eve (the adversary) not being able to make any sense of the content of that exchange, the encryption has to be such that Alice and Bob are the only parties who can see the content. This has to hold no matter how much power Eve is able to throw at the problem. Similarly, for Eve to pass herself off as Alice, the cryptography has to make it infeasible for her to do so. This is powerful stuff, and cryptography does it really well. There is obviously a limit to what can be encrypted or kept hidden, if only to make real systems work: passing a message from Alice to Bob across a network will not work if the network cannot see the address of Bob. So some of the package has to be sent in clear – but in a layered system this only gives the adversary knowledge of the part of the container package that is in clear. Good network design limits the value of even that knowledge.

We also need to be aware that technologies this powerful are double-edged: good for keeping data away from prying eyes, but they also keep data away from those who may need to protect us. To this end there is a set of international treaties on the use of what is termed dual-use technology (i.e. technology that has merit in civil business but can be seen as a weapon in the wrong hands). The main impact is that for certain classes of use the effective key length, and access to the cryptographic algorithm in equipment, is restricted. In practice, most commercial applications of cryptographic algorithms are incredibly strong. In mobile phones, where the radio interface is encrypted, it is quite simply infeasible for anyone to decrypt data captured off the radio link and recover the content. In our web browsing and e-commerce it is similarly infeasible that anyone intercepting our traffic will ever be able to decrypt the content. Quite simply, the cryptographic tools we see in use work where they need to.

So let’s get down to my biggest gripe: there is a lot of guff talked about in movies and written in the press about breaking encryption. I watch too much TV and too many films where the techie says “it’s 128-bit encryption, it’ll take me a few hours to break” and all they have is a wee laptop. Think of how big that number is and how many keys fit into 128 bits: it is 2 to the power of 128, roughly 340,000,000,000,000,000,000,000,000,000,000,000,000 keys. Now if you can check, say, 1000 billion keys a minute, it would take around 340,000,000,000,000,000,000,000,000 minutes to check the key space. That’s a lot of minutes that add up to a bunch of years – so many that the galaxy will be long gone before you make a sizeable dent in the pile. Quite simply, you cannot attack modern encryption using brute force. When you see somebody claiming it on TV, remember this: it is fiction, it moves the plot along nicely, but it ain’t like that in the real world.
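For anyone who wants to check the arithmetic, here is the sum as a few lines of Python (the search rate is the hypothetical one from the paragraph above):

```python
# Brute-forcing a 128-bit key space at 1000 billion (1e12) keys per minute.
keyspace = 2 ** 128   # about 3.4e38 possible keys
rate = 10 ** 12       # hypothetical: keys checked per minute

minutes = keyspace // rate
years = minutes / (60 * 24 * 365)

print(f"{minutes:.3e} minutes, about {years:.1e} years")
# ~3.4e26 minutes, ~6.5e20 years: the universe is only ~1.4e10 years old.
```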

There has been press speculation about “back doors” in crypto-algorithms. This is nonsense – the majority of the algorithms we rely on are quite simply too tested, too open and too critical to be purposefully weakened. If I have the key I can decrypt the content; if I don’t, I can’t. That’s it. It is an old principle – leave the security to the key and the key alone. If the algorithm is weakened to let “good people” in, you can be assured that “bad people” will get in too. It is far safer to allow no back doors at all – if they exist they will be used, and you won’t know who is using them.

Of course somebody can get access to the content – that’s what keys are all about. You lock it up with a key and you unlock it with a key. If you want to let someone into your house give them the key. If you want someone to access your encrypted content do likewise. Just in case you’re wondering – there are no “skeleton” keys in good modern cryptography.

Some of our cryptography has a finite lifetime, though, as it depends on “hard” problems remaining hard. The work I’m looking at for ETSI TC CYBER covers the impact of quantum computing on the viability of cryptography and how to keep (at least) one step ahead of the attackers.

I’d like readers to go away with the knowledge that TV and movies are doing cryptography a huge disservice – it works, and it works well.