This is a summary of the Spectre (Kocher et al. 2017) attack.
Spectre is a fairly complex (some what hard to exploit, but also hard to prevent) attack on any CPU that implements branch prediction and speculative execution, and can be used to retrieve memory contents from a victim process.
- Transient instruction A transient instruction is an instruction that is executed speculatively by the CPU but never committed (retired).
- Direct threading A technique for faster execution of code popular in the Forth community, where one uses the instruction pointer to indicate which portions of code to jump to, and arranging these instructions such that code execution jumps to required labels on completion of each procedure without actually requiring procedure calls.
- ROP: Return oriented programming A technique popular in security community for identifying embedded sequences of instructions in any given moderately complex program (any thing that links to even libc) such that a set of these sequences (called gadgets) can perform arbitrary computation.
The attack starts by searching the victim binary for exploitable sequences of operations. These kinds of exploitable sequences is fairly common in any process that imports even libc, and are called gadgets. The general idea of ROP is that one can search any moderately complex program (any that links to libc) for a set of gadgets that when stringed together, can be used to write arbitrary complex code to do pretty much anything that the linked libraries are capable of.
So, the idea here is to search the victim process for a set of gadgets that when stringed together, acts as a transmitter of data by modifying cache lines or any other non-rolled back effects of transient instructions (or even timing differences in resource usage). Once such a sequence of gadgets is found, the next thing to do is to mistrain the branch predictor in CPUs so that a chosen branch will actually jump to the found gadget address speculatively, and perform the computation described by the gadget sequence. The result of such a sequence is then transmitted to the external process one bit at a time using loaded cache lines or other resource usage patterns.