Technorati Tags: cryptography, public-key, encryption, RSA, asymmetric encryption

The most successful public key cryptosystem in use today is RSA - named for its inventors Rivest, Shamir, and Adleman. I first learned about RSA in grad school from one of my professors, Errol Lloyd, who was one of Ron Rivest's students. Errol is without a doubt the best *teacher* I've ever had (and also a thoroughly nice guy). If you want to go to grad school to study algorithms, you frankly couldn't do better than heading to Delaware to work with Errol. I have very fond memories of Errol's class where we talked about this. He's got a way of teaching where he doesn't come out and *tell* you anything; what he does is ask questions that lead you through the process of figuring it out yourself. That's an incredibly effective way to teach *if* you can carry it off. Personally, I *can't*. And I've never known anyone except Errol who could do it for a topic like RSA encryption!

Anyway, moving on... In general, public key cryptosystems are based on problems that are easy to solve computationally in one direction, but really hard to solve computationally in the other. In the case of RSA, the basic underlying problem involves prime numbers: if you've got two really large prime numbers, then multiplying them together is easy; but if you've got a number that's the product of two really large primes, factoring it is *very* hard.

One of the things that makes RSA difficult to explain is that it's hard to find a starting point. The actual encryption and decryption processes are so simple that

they seem like they can't possibly work; there's just no way to make sense out

of *why* they work without understanding where the keys come from; but it's strange to describe an encryption system by starting with how to generate keys, when you don't yet know how they're going to get used.

The reason for this tangling comes back to the fundamental goal of a public key

system: you've got to have two keys which share a complex mathematical relationship.

The encryption system itself is really just a simple expression of the relationship

between the keys. So the algorithm is (conceptually) very simple; the whole thing is

based on the nature of the keys, their relationship, and the way that they're

generate. It's not like symmetric cryptography, where you can simply chose a key; in

public key systems, the keys have to be generated to satisfy a set of mathematical

relationships.

With that it mind, we'll start working through the way that RSA works by showing

how you can create a public/private key pair. The key generation process is actually

pretty simple, but most descriptions of it get tangled up because it uses a lot of

terminology from number theory. I'll try to present the standard terms as clearly as I

can.

The basic structure underlying an RSA key pair is a pair of two large prime

numbers. What's large? That depends on how hard you want it to be to crack. The

tradeoff is that the larger the original pair of primes are, the more complex it is to

compute ciphertext using a key. So you need to choose a key size which makes your keys

hard to crack, without making the cost of encryption and decryption unacceptably

high. Once you've decided on a key size, that dictates the size of your two prime numbers, and you're ready to compute a key pair, as follows.

- Compute two large primes, which are typically named P and Q.
- Using P and Q, you compute a number N=P×Q, which is called the
*modulus*of the keys being generated. - Compute the
*totient*of the modulus. For any integer, I,

the totient of I (written φ(I)) is the number of integers*smaller*

than I that are relatively prime to I. Because P and Q are prime, φ(P×Q)=(P-1)×(Q-1). (The totient of P is P-1, because there are P-1 numbers relatively prime to P; the totient of Q is Q-1 for the same reason; and since P and Q are (trivially) relatively prime to each other, the totient of P×Q is (P-1)×(Q-1)). - Choose an integer, E, smaller than and relatively prime to φ(N). E is

called the*public key exponent*. - Compute an integer D such that D×E=1 mod φ(N). D is called the
*private key exponent*.

The public key is the pair (N, E) of the modulus and the public key exponent; and the private key is the pain (N, D) of the modulus and the private key exponent. So you've got your key pair.

Encryption and decryption are amazingly simple.

Suppose that the ubiquitous Alice and Bob want to communicate. Alice gives Bob her

public key, (N, E). Now, when Alice wants to send a message to Bob, she encodes the

plaintext of the message as an integer, M. (I'll leave the exact encoding of plaintext open for now.) To encrypt with her private key, (N, D),

she takes that integer and computes:

Ciphertext = M

^{D}mod N

Then to decrypt the message, Bob uses his key pair, and computes:

M = Ciphertext

^{E}mod N

For Bob to encrypt a message for Alice, he does *exactly* the same thing that he did to the ciphertext - except he does it to the encoded message, M. For Alice to decrypt that, she does exactly what she did to *encrypt* the original M, except that she uses the ciphertext she recieved from Bob instead of the encoded plaintext M. In other words, if you've got a ciphertext message encrypted by the private key, decrypting it is *exactly* the same process as *encrypting* a plaintext with the public key, and vice versa. (This point is what used to cause me lots of confusion remembering what was symmetric and what was assymetric - RSA style asymmetric encryption is really very symmetric in how the algorithm works.)

How can this possibly work? On the face of it, it looks ridiculous! You encode by exponentiating once; you decode not by taking a root, but by exponentiating again!

It all comes back to the way the keys were generated. If we look at the process

in terms of modulo arithmetic, it's pretty easy to see why it works:

- Take an original message, M. Encrypted, it's

C = M^{D}mod N. - Now, take the ciphertext, C, and decrypt it.

M' = C^{E}mod N. - Now, expand C: M' = (M
^{D}mod N)^{E}mod N. - Now, we can combine the exponents: M' = M
^{D×E}mod N. - D×E = 1 mod N.
- By some trickiness, related to the fact that D and E are relative primes related to both each other and N by their relation to the

totient of N, we can show that the fact that D×E = 1 mod N means

that M' = M^{D×E}= M^{1}= M.

Watch how we can walk through a ridiculously simplified example. Let's start with

a pair of primes - 29 and 61.

- Generate keys:
- First, we compute the modulus: 29×61=1769.
- Then we compute the totient: 1680.
- Then we choose an E which is relatively prime with 1680. To make things easy,

I'll just pick a prime number: E=13. - Now, I need to compute a D, such that D×E=1 mod 1680;

or to pull out a bit of standard terminology, D is the modular multiplicative

inverse of E mod 1680. Doing that is an exercise in modulo arithmetic

which gets beyond the scope of what I want to talk about today, so I'll

cheat: whipping out Mathematica, I get D=517. - So, the public key is (1769,13), and the private key is (1769,517).

- Now, let's encode a message. Say the message is 236. So, encrypted by

the public key, that's 236^{13}mod 1769 = 7044595136617487310722334457856 mod 1769 = 573. - And now, let's decode it. That's 573
^{517}mod 1769 = 9245694881849602770197401240655288154608138663291308421093411147578949\

4462244595981994549450775979702694154377539110666284415651476199963925\

1321729713042853618725578035043275339491371655153997356248940804495270\

8984801258907252216498797520780679511957335315751292762455167434140192\

4399386435156204841871858091160737587648566008200581107378219553124074\

9374669246678685272914050993442585237015640426625522901746517901417734\

7954757584818934596889432709457570226928654243870833959974236930834811\

3204280174093386319717080086558480154647619900966739378086145377566860\

9195850562761848872414474511076491179637551417498920992360433000710853\

3935087767676514264669956171228256961906386882369681633698604708335082\

4532038380922358459546076540454839797340473363177061704334483341984626\

0992808331110751927026090969022077767175228102822099312339565763871188\

4006940217478961345132678704142880421227822352750998264777937728911493\

6283819708613556665220171457755862562705567302041244568283475209225680\

3828781673802145987568796646578059853165517716374603716155560308698631\

2650416428650673915642280074852668462543146649056536263032788685526436\

7863995916605455190338263161582118236712167656565774640875461698797822\

0025671340803745351386743506255677806855116853206286798808454276083160\

2812779603110688541701764925723493557773173917037390830611160570524069\

7471206139919064156642428626093922418127017110711925314998235480530680\

56675958655700624098981453 mod 1769 = 236.

So it works - but computing those results is not exactly trivial. It happens that there is a fast algorithm for doing those sorts of exponentiations - but "fast" is a relative term. RSA encryption is a *lot* slower than most symmetric encryptions, by a significant factor.

In real-world use, RSA is rarely used for full-message encryption. It's used for signatures (where you encrypt a small summary of a message), and for protocol negotiations to allow the two sides to settle on a symmetric encryption key to

use for the rest of their communication.

You'll notice that I still haven't gotten back to how you take a message

and convert it to an integer. That's *not* trivial. The problem is,

for certain message sizes, or messages with certain numeric properties,

RSA can be easily broken. So you need to protect against those cases. That's

done by *padding* the message - adding bits to it in a way that removes

or obscures the properties of the message that would otherwise make it vulnerable.

Discussing the vulnerabilities of RSA for certain types of messages, and how

to build a padding system that defeats them is a topic for another post.

In future posts, I'll also describe the two most common uses of RSA in more detail - both using RSA for signing a message to prove that you really wrote it; and using

RSA as part of a secure cryptographic protocol where most of the traffic is really

encrypted using a symmetric algorithm.

- Log in to post comments

Ah! Brilliant, thank you for that post! I always wanted to understand how RSA works. Can I ask for posts here? (Or make a couple of suggestions anyhow). I'd love to see an explanation on the Off-The-Record Messaging protocol, in particular what makes it so different than others and how this was done... and also, drifting a bit from your original intention with this series of posts, I'd like to know how my privacy is assured by third parties... banks, web applications, mobile phone companies. What are they doing to protect my privacy, and should I trust them?

"He's got a way of teaching where he doesn't come out and tell you anything; what he does is ask questions that lead you through the process of figuring it out yourself."

I think that's colloquially known as the Socratic Method. The same style is used in book form in the

The Little Schemer(formerlyThe Little LISPer, I believe) to teach Scheme. On the darker side of things, someone who is very skilled at arguing can use this technique to great effect to trap you with your own reasoning.It's indeed a nice post, but I have a question: in the examples you give, encryption is done with the

privatekey. Isn't this the way digital signatures are done? (Actually on real systems it's not, as they are typically hybrids that use both asymmetric and symmetric algorithms, to overcome the speed limitations you mentioned, but I digress).My point is, by saying things like

encrypt with her private key, without explaining that's for digital signatures (instead of encryption proper), aren't you creating a source of confusion? Even if you reserve the details of public key system for a later post, I think you should have at least mentioned that here. But as I said in the beginning, it's a nice post nonetheless :-)It's indeed a nice post, but I have a question: in the examples you give, encryption is done with the

privatekey. Isn't this the way digital signatures are done? (Actually on real systems it's not, as they are typically hybrids that use both asymmetric and symmetric algorithms, to overcome the speed limitations you mentioned, but I digress).My point is, by saying things like

encrypt with her private key, without explaining that's for digital signatures (instead of encryption proper), aren't you creating a source of confusion? Even if you reserve the details of public key system for a later post, I think you should have at least mentioned that here. But as I said in the beginning, it's a nice post nonetheless :-)Some nitpicking about academic credit...

RS and A were RE-inventors of the algorithm. It had first been discovered in the UK by James Ellis and Clifford Cocks of the GCHQ. They were soon joined by Malcolm Williamson, who went on and invented what became known as the Diffie-Hellman(-Merkle) key exchange. Unfortunately their role remained hidden, because they were not allowed to publish their discoveries at the time.

You can read the full story in the Code Book by Simon Singh.

Oscar:

Public key encryption

is used fordigital signatures, but that's not the only use of it. And even when it is used for signature, the way that it's used is toencrypta section of the message which contains an authentication code for the rest of the message.I'll talk about this in a later post, but a quick outline of how secure digital signatures work is:

(1) Take a message, M, that you want to send. You don't

care if other people can read M, but you want the

intended recipients to be absolutely sure that it was

you who sent it.

(2) Using a special kind of hash function, compute a value

D, called a

digestof M, where a D is a short codethat is produced solely from the content of the message.

(A good digest has a set of interesting properties, but

I'm not going to list them here.)

(3) Encrypt D, using your

privatekey, and appendthe result to the end of your message. The encrypted

digest is called a

signature.Now, anyone who sees the message M can compute D. They don't

need any kind of key - the value of D is solely dependent

on the content of the message.

Anyone can verify that

yousent the message, becauseonly youcould have encrypted the digest in a way that could be decrypted with your private key.The big advantage of this technique is that there's no secret that needs to be shared with your intended recipients. Unlike a MAC in a symmetric encryption system,

the digest is not encrypted, and anyone can generate it. But only you could produce an encrypted copy of the digest.

So digital signature is really just a particular use-case of public key encryption that encrypts the message digest instead of the entire message.

You can see this in a nice, visible form if you use the FireGPG plugin for Firefox. Given a message, you can

encrypt it without signing it; sign it without encrypting it; or encrypt it and sign it. In all three cases, the encryption is RSA; in two of the three, RSA is used to encrypt the entire message.

Xzy:

I know that it's called the socratic method; but in my experience, a lot of people

don'tknow what the socratic method was. I figured this post already had enoughjargon in it without needing to introduce the name for that teaching method, when I would still need to include the description.

And actually, I've never seen it used in a serious way to tangle people up. In various texts, particularly Plato, I've seen it used to tangle up

extremely stupidinterviewees. But I've never seen it used in real life to tangle up someone who wasn't already being an idiot to begin with. That sort of rhetorical use of it is generally a dressed up form of proof-by-contradiction, and the questioning process is really just for show.Nice article, just a small nitpick:

Perhaps you should clarify that you can only effectively compute D because you know the primes p and q. After all that is what makes the system secure.

If you could just plug E and M into Mathematica and find the modular inverse, anyone could do this, since (E,M) is the public key. But luckily the Extended Euclidian Algorithm require knowledge of the prime factors of M to work.

Mark:

You're right, but the use you make of the verb "encrypt" is somewhat loose. I mean, usually when one talks about "encrypting", one means "to write in a form that cannot be easily read", and you absolutely cannot do that using ("encrypting") with the private key (because everyone with the corresponding public key will be able to "decrypt" that message). But after reading this paragraph:

Suppose that the ubiquitous Alice and Bob want to communicate. Alice gives Bob her public key, (N, E). Now, when Alice wants to send a message to Bob, she encodes the plaintext of the message [...]. To encrypt with her private key, (N, D), she takes that integer and computes [...]

This might seem to suggest that the private key should be used to do encryption (as in "only the intended recipient gets to decrypt") which is far from being the case.

Final remark: in your reply, you said:

Anyone can verify that you sent the message, because

only you could have encrypted the digest in a way that could be decrypted with your private key.

You meant

public key, right? :)Lassi: GCHQ did not invent the RSA algorithm. They invented the algorithm also known as DH. While both are public key methods, they are quite different.

Mark: Oscar is right. You used the private key parameter, D, as part of the encryption step ("C = M^D mod N"). Like nearly all mistakes with cryptography, the impact is that your described system is totally insecure. I've already written articles (parts 1 and 2) about how to calculate E and N given only two messages encrypted with D. In your system where D, N is the public key, the attacker can calculate E, P, and Q without seeing any encrypted messages.

Take a moment to think about this. Inverting the sense of the E and D exponents in relation to encryption and decryption results in a system where your "private" key is no longer private! This is because of the asymmetry in selecting them, something which you used in your example for choosing the parameters.

While I appreciate your enthusiasm for cryptography, the lack of attention to detail in your posts is troubling. In crypto, being exact with your terms and paranoid in your implementation really matters. Thanks.

You mentioned that for too small M, RSA is easy to break, and that your typical encoding function to turn text into M will add padding for this reason. However, since everything relating to E and D doesn't depend in any way on the way M is encoded, why can't you just use a modified encoding function that leaves out the padding, pick a plaintext to make an M that is small enough, and then do whatever you can do when M is small to break the key?

Your detailed list had a small error: it's DÃE=1 mod Ï(N), not DxE=1 mod N.

In fact, I'm a little surprised you didn't mention the core relation:

aÏ(N) = 1 mod N

(from Euler, as I recall).

if you use the basic definition of D and E,

D * E = 1 mod Ï(N) â D * E = n Ï(N) + 1

then

MD*E = Mn Ï(N) + 1 = M*Mn Ï(N) = M*(1n) = M.

Also, on the totient of a product... you were a little too casual imho. It's not hard to explain.

For the number from 1 to pq, you need to eliminate all multiples of 'p' (p, 2p, 3p, ..., qp) and all multiples of 'q' (q, 2q, 3q, ..., pq), and then restore the duplicate of 'pq'.

That gives you pq - q - p + 1, or (p-1)(q-1) if you factor it.

A really cool way of looking at it geometrically is to draw a grid of p by q cells. Color one edge -- that's the multiples of 'p'. Color an adjacent edge -- that's the multiples of 'q'. What's left is the smaller rectangle (p-1)(q-1).

Very nice article. RSA really is quite astonishing in both its simplicity and its effectiveness. It always amazes me that although cryptography has been around for thousands of years, it was only 33 years ago that public-key crypto was conceptually invented and (in the classic Diffie-Hellman paper) and 32 years ago that RSA was designed.

One minor point to clarify: You calaculated 573517 mod 1769 by calculating 573517 first (giving a 1426-digit number) and then reducing the result mod 1769. That's a big number before reduction mod 1769, and it can get a lot bigger in practice (where the p and q are not 2-digit primes but more like 200-digit primes). So we wouldn't actually calculate it that way, but rather use a repeated squaring technique where we start by computing 5732 mod 1769 (=1064), then 5734 mod 1769 (=10642 mod 1769 =1705), then 5738 mod 1769 (=17052 mod 1769 =558), and so on up to 573512 mod 1769, then use the results of these calculations to find 573517 mod 1769. This keeps the digit size of the numbers involved down and therefore is a lot faster and more memory-efficient.

Of course you knew that, and were just keeping it simple for us, right? :)

Ditto on Lloyd.

I've said it before, but I'll say it again: my math is sorely lacking -- though I recognize number theory being used, mostly -- but I'm really enjoying these posts.