back

Inside the AMD Microcode ROM

(Ab)Using AMD Microcode for fun and security

If you suspend your transcription on amara.org, please add a timestamp below to indicate how far you progressed! This will help others to resume your work!

Please do not press “publish” on amara.org to save your progress, use “save draft” instead. Only press “publish” when you're done with quality control.

Video duration
00:37:20
Language
English
Abstract
Microcode runs in most modern CPUs and translates the outer instruction set (e.g. x86) into a simpler form (usually a RISC architecture). It is updatable to fix bugs in the silicon (see Meltdown/Spectre), but these updates are encrypted and signed, so no one knows how microcode works on conventional CPUs. We successfully reverse engineered part of the microde semantics of AMD CPUs and are able to write our own programs. We also recovered the mapping between the physical readout (electron microscope) and the "virtual" addresses used by microcode itself. In this talk we present background on microcode, our findings, our open source framework to write custom microcode and our custom defensive measures implemented in microcode.

We build on our results presented on 34C3 to provide more insight into how microcode works and more details of the microcode ROM itself.

tl;dr diff to last talk:
- Mapped physical readout to virtual addresses, we can now read the microcode implementation of specfic instructions
- More microcode semantics known, more stable programs
- Opensource framework for creating, diassembling and testing microcode on AMD CPUs
- Simple hardware setup to develop microcode programs
- More practical examples of what you can do with microcode, focused on defense instead of offense this time

Since 34C3 we worked on recovering the microcode ROM completely and used that knowledge to implement constructive microcode programs that add to or enhance functionality of the CPU. We also worked on our now open source framework to create and diassemble microcode for AMD CPUs up to 2013. We will give a short intro into how to use it to create custom microcode programs and test them on real hardware. We also provide guidelines on how to construct the test setup we used, which is essentially any old AMD mainboard (native serial port required), a RaspberryPi with a serial adapter and some wiring including a few basic electronic components. Using this you can remotely and automatically test any number of microcode updates and it is integrated in our framework.

On the microcode program side we will show how to hijack microcoded instructions to replace them with new semantics, for example reviving the good-old BOUND x86 instruction. We also show how to roll your own microcode update verification scheme, so only trusted and signed updates can be loaded on vulnerable CPUs.

Additionally we will provide some implementation details found in the microcode ROM and show how it is used to implement complex functions like the instruction WRMSR, which among other functions is used to update the microcode.

We will start with a crash-course covering fundamentals related to instruction decoding, CPU architecture and microcode principles. We will then present our new insights and finish with a demo of how our framework works.

Talk ID
9614
Event:
35c3
Day
1
Room
Borg
Start
11:50 p.m.
Duration
00:40:00
Track
Security
Type of
lecture
Speaker
Benjamin Kollenda
Philipp Koppe
Talk Slug & media link
35c3-9614-inside_the_amd_microcode_rom

Talk & Speaker speed statistics

Very rough underestimation:
139.2 wpm
839.6 spm
100.0% Checking done100.0%
0.0% Syncing done0.0%
0.0% Transcribing done0.0%
0.0% Nothing done yet0.0%
  

Work on this video on Amara!

Talk & Speaker speed statistics with word clouds

Whole talk:
139.2 wpm
839.6 spm
microcodeinstructionx86bitrominstructionscodeupdateregisteraddressessentiallybranchupdatestalkcpuexampleimplementedregistersprettyangelwritememoryoperationsbasedinternalframeworkinsidetriadraspberryhardwareaccesslearnedbitssimplesoftwareamdpienginemapmicrophonestatedatasanitizeradresssetexecutedtimepowerramhandler