So why is any of this useful?
click me if you’re sick of this silly site layout
If you checked out my first article, you should understand by now how the math of elliptic curves works and how we can use that math to generate a private key x, and a corresponding public key represented by a point X on our curve. We also talked about how so far as anyone can tell, the discrete log problem is intractable. For review, that means that given a public key X, and the base point P, it doesn’t seem computationally feasible to reverse this into the private key x on a discrete ellliptic curve. So going from private key x to public key X, trivial, going the other way, impossible. But what we haven’t talked about is, “why should anyone care about this? What can I do with this?”
That’s a very salient question, as it might not be immediately apparent. What we’re really looking for is a way to prove that we have knowledge of the private key x while somehow not actually divulging any information about x (because we want to keep our private key private.) Ideally what we really want to do is to be able to securely sign a message such that it’s tamperproof and easily publicly verifiable that we did in fact sign it, all while not divulging any information about our private key, but one step at a time.
Proving we know x
So how can we prove to someone that we know x (our private key) while not giving up any information about x? If you recall from our first look at elliptic curves the following property for point addition was true:
n•P + r•P = (n+r)•P
Where n and r are just some numbers and P is the base point of our elliptic curve. Alright, hang on because unfortunately we’re about to do a good bit of algebra.
Let’s start by introducing some new function which we call hash() which accepts an arbitrary input and deterministically produces a corresponding strong cryptographic, well, hash of that input. In practice, this might be something like SHA-256 or keccack256. Then let’s feed that hash function some input, some arbitrary string m, and the point addition of r and P.
hash(m, r•P)
Now let’s tack a bit more on.
hash(m, r•P)•n•P+r•P
We can represent this in a slightly different way. The above is equivalent to the following:
(hash(m, r•P)*n+r)•P
Now let’s set them to be equal to one another since they are equivalent. Since they are equivalent, they will be equal to one another for all values of m, r, and n.
hash(m, r•P)•n•P+r•P = (hash(m, r•P) * n + r)•P
Next, let’s set n•P equal to X.
hash(m, r•P)•X+r•P = (hash(m, r•P)*n+r)•P
If you recall back to our first chat, you’ll remember that in order to calculate X from x and P we simply did x•P = X. Well, this means that if X = n•P then n must in fact be x:
hash(m, r•P)•X+r•P = (hash(m, r•P)*x+r)•P
Phew. Okay, so now I’d like to posit that if you can produce an m, an R, and a valid (hash(m, r•P)*x+r) that satisifes the above equation, you must know what the private key x is where again, X=x•P. How would we do that? Well, we can just pick a random string for m, and since R is just r•P we can also choose a random r. For the final term, well, if we have x we just simply calculate (hash(m, r•P)*x+r) If we do that what do we get?
# we start with
hash(m, r•P)•X+r•P = (hash(m, r•P)*x+r)•P
# we replace X with x•P
hash(m, r•P)•x•P+r•P = (hash(m, r•P)*x+r)•P
This equation should look somewhat familiar. It’s the equivalent function from above which we said would hold true for all values of m, r, and n, except the value for n is x. This proves that someone with the knowledge of x can provide valid values for m, R, and (hash(m, r•P)*x+r). But what if I don’t know x? Can I still produce valid values for those things? I dunno? Let’s try!
# we start with
hash(m, r•P)•n•P+r•P = (hash(m, r•P)*x+r)•P
What do we need? Well, essentially what we would need is to calculate some values for m, r, and n such that the hash of (m, r•P)•n•P+r•P is equal to the quantity of the hash of (m, r•P) * x + r, • P.
There are in fact values for m and r where n != x where this equation will hold, but how we can find them? Well, it turns out that so long as we chose a “good” hash function, this should be computationally infeasible due to the preimage resistance of a strong cryptographic hashing function. An “ideal hash function” by definition is one where the best way to attempt to compute a preimage is through brute force. For an n-bit hash this attack would have a time complexity of 2n so provided there are no glaring weaknesses in say keccack256, no perceivable amount of computing power should ever be able to find such values where the equation holds when n != x.
Does our validation function leak any information about x?
Alright, so we’ve proven that someone with knowledge of x can prove they know it, and that provided a strong cryptographic hash function exists that someone without its knowledge can’t subvert our check. These things on their own aren’t particularly useful. Heck, if I wanted to prove I knew x, I could just tell you what x is. We still need to prove that the proof we have chosen does not leak any information about x. But first, let’s make a couple more substitutions for the sake of readability and convention.
First, let’s take our final validation equation and set R = r•P, and s = hash(m, R)*x+r. Here’s what we’re left with:
# starting with
hash(m, r•P)•n•P+r•P = (hash(m, r•P)*x+r)•P
# this then becomes
hash(m, R)•X+R = s•P
So, the question is, if I provide good values for m, R, and s, is it possible to learn anything useful about our private key x? Well, we can rule out m and R straight away, as they have literally nothing to do with x, remember m was just some random string and R = r•P where r was just some random number. But what about s? Does it leak any information about x? Let’s try and get our algebra on again.
s = hash(m, R)*x+r
s - r = hash(m, R) * x
x = (s - r) / hash(m, R)
Well, this is at least a better candidate for attack than m or R were. In order to calculate x from s it would seem that really all we need is r. But r was chosen randomly… Hmmm, but wait a minute! R is just equal to r•P! Could we take advantage of that since we know R and P? Well, no, because calculating r from R = r•P is no different than calculating x from X = x•P. Again, we would have to solve the discrete log problem which as far as anyone knows is intractable.
The final boss: digital signatures using elliptic curves
If you’ve stuck around till now, congratulations. We’re about to wrap this whole thing up. We learned about what an elliptic curve is. We learned how to do point addition with elliptic curves and why that’s useful for producing a public key and a Blockchain address given some private key. We learned that (hopefully) the discrete log problem is hard and so that although it’s easy to calculate a public key X from a private key x and a base point P, doing the inverse is seemingly impossible. Lastly, we’ve come up with what would appear to be a decent validation equation that one could use in order to prove that they know their private key x while leaking no information about that key. That validation equation was:
hash(m, R)•X+R = s•P
It turns out that this is the exact equation we use for providing a digital signature, which we can use, among other things, to sign say Ethereum transactions. In order to do this, we need simply to choose a random value for r, use it to calculate R = r•P, set m to be the message we wish to sign, and then calculate s which was just (hash(m, r•P)*x+r). Then we simply pass the unsigned message, R, and s to someone. In order for our validation equation to pass, the message must not have been tampered with since if it were the left hand side of the equation will fail to equal the right hand side since the right hand side used the real m in order to calculate s.
Practical examples in Ethereum
So in the case of Ethereum, in order to broadcast a valid signed transaction, we begin by simply generating an unsigned transaction, m, which contains all of the relevant details of the transaction such as the amount we wish to send and who we want to send it to. Next, we take our private key x, generate some random value r, use it to find some point R=r•P on the curve, and calculate s via keccack256(m, R)*x+r. Now we have everything we need in order to broadcast. One final little got’cha though–remember back on our first chat how we mentioned that we could actually represent any point on the curve using only its x coordinate and a single bit to denote whether or not we were talking about the positive y coordinate or the negative? We do exactly that when we sign an Ethereum transaction in order to increase efficiency, so instead of sending R and s along with our unsigned transaction, where R is the point = r•P, we actually send three things along with the unsigned transaction.
- v - this is that single bit denoting which of the two possible y points we are referring to
- r - this is not the random r we chose when generating our signature but is instead the the x coordinate of the point R
- s - this is what we calculated above and essentially the proof that we know the private key x
Final Thoughts and what's next
Well, this was a fun introduction to elliptic curve cryptography. It is important to note that although the principles that we’ve gone over are true, we have over-simplified things at times and made the math a bit simpler by only talking about continuous elliptic key curves. In practice, we actually use the discrete versions of those curves. Additionally, we didn’t talk much about side channel attacks or other more in depth topics. As an example, although our point addition algorithm works, it actually has the potential to end up leaking information about our private key x since each one that is present in the binary expansion of x will take more CPU time than each zero. Broadly, the whole point of this exercise was an attempt to provide some level of intuitive understanding around how elliptic curves are useful for cryptography, how they allow us to create digital signatures, and how they are used with regards to cryptocurrency. At times we may have sacrificed attention to detail for the sake of brevity. That said, the next topic I’d like to discuss is Threshold Signature Scheme or TSS wallets. This conversation will necessarily have to get very down in the weeds, but from a high level view, the goal of TSS wallets is to retain all the security properties that exist for traditional multi-signature wallets while moving all of the expensive cryptography off chain, thus saving tons in fees. In practice, we have observed fee reductions of upwards of 80%. There are some tradeoffs to using TSS wallets though, but you’ll have to check out the article if you want the full scoop!
If you found any of this helpful, feel free to send me some dust at lutzy.eth!