Squeezing a key through a carry bit

No bug is small enough

If you suspend your transcription on, please add a timestamp below to indicate how far you progressed! This will help others to resume your work!

Please do not press “publish” on to save your progress, use “save draft” instead. Only press “publish” when you're done with quality control.

Video duration
The Go implementation of the P-256 elliptic curve had a small bug due to a misplaced carry bit affecting less than 0.00000003% of field subtraction operations. We show how to build a full practical key recovery attack on top of it, capable of targeting JSON Web Encryption.

<p>Carry bugs are fairly common, and usually too small to have big impact, or so they are considered. This one was no exception.</p>

<p><a href="">Go issue #20040</a> affected the optimized x86_64 assembly implementation of scalar multiplication on the NIST P-256 elliptic curve in the standard library.</p>

<p><code>p256SubInternal</code> computes <code>x - y mod p</code>. In order to be constant time it has to do both the math for <code>x &gt;= y</code> and for <code>x &lt; y</code>, it then chooses the result based on the carry bit of <code>x - y</code>. The old code chose wrong (<code>CMOVQNE</code> vs <code>CMOVQEQ</code>), but most of the times compensated by adding a carry bit that didn't belong in there (<code>ADCQ</code> vs <code>ANDQ</code>). Except when it didn't, once in a billion times (when <code>x - y &lt; 2^256 - p</code>). <a href="">The whole patch is 5 lines.</a></p>

<p>The bug was found by a Cloudflare engineer because it caused ECDSA verifications to fail erroneously but the security impact was initially unclear. We devised an adaptive bug attack that can recover a scalar input to <code>ScalarMult</code> by submitting attacker-controlled points and checking if the result is correct. Elliptic Curve Diffie-Hellman involves a secret scalar, a peer-provided point, and fails to establish a key if the result is incorrect.</p>

<p>We reported this to the Go team, Go 1.7.6 and 1.8.2 were issued and the vulnerability was assigned <a href="">CVE-2017-8932</a>.</p>

<p>At a high level, this P-256 ScalarMult implementation processes the scalar in blocks of 5 bits. We can precompute points that trigger the bug for each specific 5 bit value, and submit them. When the protocol fails, we learned 5 key bits, and we move on to the next 5, Hollywood style. In about 500 submissions on average we recover the whole key.</p>

<p>The precomputation involves a lot of unusable points and edge cases, but by modifying the optimized assembly implementation and generating points intelligently, we can produce a full round of points in seconds on 1000 machines (or spot instances). Each round depends on the previous ones, so must be computed live during each attack.</p>

<p>Normal ECDH does not offer an attacker multiple attempts against the same scalar, making the attack impossible. However, a variant of ECDH with a static scalar is used as a public key encryption scheme, for example in JSON Web Encryption. The attack can fully recover the private key in that scenario.</p>

<p>No bug is small enough.</p>

Talk ID
Saal Dijkstra
2 p.m.
Type of
Filippo Valsorda
Talk Slug & media link

Talk & Speaker speed statistics

Very rough underestimation:
136.5 wpm
717.9 spm
138.7 wpm
726.1 spm
100.0% Checking done100.0%
0.0% Syncing done0.0%
0.0% Transcribing done0.0%
0.0% Nothing done yet0.0%

Work on this video on Amara!

Talk & Speaker speed statistics with word clouds

Whole talk:
136.5 wpm
717.9 spm
Filippo Valsorda:
138.7 wpm
726.1 spm