New Intel CPU Flaws Expose VMs and Clouds to Full Takeover – Introducing RIDL, ZombieLoad and Fallout

Posted on May 15, 2019 by Derek Zimmer

I wrote a piece last year about the serious problems with shared hardware. It covers the additional attack surface that virtualization technology introduces on your servers, and the risks that come with it. At the time, Meltdown and Spectre were both in the headlines, and the risks to businesses where security is a priority were serious.

Over time, operating systems, firmware, and CPU microcode were adjusted to help mitigate these problems, and everyone again got comfortable with virtualization and The Cloud, despite all of the fundamental layers of risk still lying in wait for the next major exploit.

Yesterday, the next major exploits came to light. All three of these new exploits impact Intel hardware and have serious end-consequences such as remote code execution (RCE), full takeover of machines regardless of operating system, theft of confidential data, extraction of passwords, and egress across a secure network.

RIDL – Rogue In-Flight Data Load

This exploit attacks the buffers that a CPU uses when reading from or writing to memory (called the Line Fill Buffers or LFBs). The exploit allows unprivileged code to access critical data that lies inside of these buffers, and it can cross security boundaries, including those between different VMs and different cloud instances. Security keys and credentials can be pulled to gain full control of the primary OS, and the exploit appears to impact all hypervisors equally (Xen, VMware, etc.).

This problem is particularly troubling because unlike Spectre, Meltdown, and Foreshadow, you can’t mitigate RIDL by suppressing specific types of errors when they occur. To add even more fuel to the fire, RIDL can be executed remotely by a malicious web page through JavaScript alone.

From the paper on RIDL:

Since the original Meltdown and Spectre disclosure, the family of memory disclosure attacks abusing speculative execution has grown steadily. While these attacks can leak sensitive information across security boundaries, they are all subject to strict addressing restrictions. In particular, Spectre variants allow attacker-controlled code to only leak within the loaded virtual address space. Meltdown and Foreshadow require the target physical address to at least appear in the loaded address translation data structures. Such restrictions have exposed convenient anchor points to deploy practical “spot” mitigations against existing attacks. This shaped the common perception that—until in-silicon mitigations are available on the next generation of hardware—per-variant, software-only mitigations are a relatively pain-free strategy to contain ever-emerging memory disclosure attacks based on speculative execution.

In this paper, we challenge the common perception by introducing Rogue In-flight Data Load (RIDL), a new class of speculative execution attacks that lifts all such addressing restrictions entirely. While existing attacks target information at specific addresses, RIDL operates akin to a passive sniffer that eavesdrops on in-flight data (e.g., data in the line fill buffers) flowing through CPU components. RIDL is powerful: it can leak information across address space and privilege boundaries by solely abusing micro-optimizations implemented in commodity Intel processors. Unlike existing attacks, RIDL is non-trivial to stop with practical mitigations in software.

More information about this exploit will be released on May 20th at the IEEE Symposium on Security and Privacy.

The full RIDL paper is available here.

RIDL Mitigation

The current recommended method of partially mitigating RIDL is to disable SMT (also known as Hyper-Threading) on all affected CPUs. This does not fully close the flaws in the line fill buffers. Closing them will require additional microcode updates that flush those buffers more frequently, likely causing a noticeable performance hit.
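As a rough way to confirm whether SMT is actually off on a Linux host, the kernel's sysfs interface can be read directly. The sketch below assumes a reasonably recent Linux kernel that exposes the `smt` directory under `/sys/devices/system/cpu`; the function names are my own and purely illustrative, and on other operating systems or older kernels the values will simply come back as None.

```python
from pathlib import Path

# Assumption: a Linux kernel recent enough to expose SMT state in sysfs.
SMT_DIR = Path("/sys/devices/system/cpu/smt")


def read_smt_state(smt_dir: Path = SMT_DIR) -> dict:
    """Return the contents of the kernel's SMT 'control' and 'active' files.

    A value of None means the file was missing or unreadable, for example
    on a non-Linux system or an older kernel.
    """
    state = {}
    for name in ("control", "active"):
        try:
            state[name] = (smt_dir / name).read_text().strip()
        except OSError:
            state[name] = None
    return state


def smt_is_off(state: dict) -> bool:
    """True only when the kernel reports that no sibling threads are active."""
    return state.get("active") == "0"


if __name__ == "__main__":
    state = read_smt_state()
    print(state)
    print("SMT off:", smt_is_off(state))
```

This only reports what the kernel sees. Disabling SMT in the BIOS/UEFI settings or through the kernel is a separate step, and as noted above it is only a partial mitigation.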

Fallout – Stealing Confidential Data from Store Buffers

This exploit attacks the buffers that a CPU uses every time it needs to store data for any purpose. Even worse, once the exploit is successfully implemented, it can be tasked to steal specific data instead of random data in the buffer. It also bypasses Kernel Address Space Layout Randomization (KASLR), a countermeasure designed to make this type of exploit harder to use in practice. This means that credentials, keys, and any information that you would need to escalate access to the machine are vulnerable.

The results of the researchers' experiment speak for themselves.

Experimental Setup.

We evaluate Fallout on two Intel machines, a Kaby Lake i7-7600U and a Coffee Lake R i9-9900K. Both machines run a fully updated Ubuntu 16.04 system, with all countermeasures in their default configuration. On both systems, we empirically test the possible locations on the kernel in its address space obtaining about 490 locations, implying about 9 bits of entropy.

Experimental Results.

We run the attack 1000 times each, on both the Kaby Lake and the Coffee Lake machines. Our attack can recover the kernel location with 100% accuracy on both machines, within about 0.27 seconds.
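For readers wondering where the "about 9 bits" in the quoted setup comes from: if the kernel can sit at any one of N equally likely locations, the entropy is log2(N) bits. A minimal check of that arithmetic, using only the 490 figure quoted above (the function name is mine):

```python
import math


def entropy_bits(candidate_locations: int) -> float:
    """Entropy, in bits, of a uniform choice among the given number of options."""
    return math.log2(candidate_locations)


# Figure quoted from the Fallout paper: about 490 possible kernel locations.
bits = entropy_bits(490)
print(f"{bits:.2f} bits")  # prints "8.94 bits", i.e. about 9
```

In other words, KASLR on these machines leaves an attacker fewer than 512 possibilities to distinguish between, which helps explain how the location can be recovered so quickly.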

The full Fallout paper is available here.

Mitigation of Fallout:

Additional patches will have to come in the form of OS and microcode updates, but for now, the best mitigation is to disable SMT (aka Hyper-Threading) on all affected Intel CPUs.

ZombieLoad – Pulling Data That Belongs to Other Logical CPUs from Fill Buffers

This side-channel similarly depends on hitting the buffer logic in CPUs. It causes an information leak by intentionally creating errors that make the processor hand over data from areas of the buffer that are supposed to be isolated from one another.

This means that data can be retrieved from outside security boundaries such as outside of a VM or cloud instance, protected areas of memory (the kernel), or even across the Intel SGX framework specifically designed to protect from these types of problems.

In the proof-of-concept build used to demonstrate the vulnerability, ZombieLoad was used to pull AES-128 encryption keys from a server in less than 10 seconds. It pulled the debug Intel SGX sealing key (that is supposed to isolate processes from one another), and demonstrated that pulling live SGX keys is possible. It was also used to pull data from across security boundaries: the researchers achieved 20 KB/sec of raw data leakage across different VMs, which will work regardless of operating system.

The full ZombieLoad paper is available here.

Mitigating ZombieLoad

ZombieLoad is mitigated by disabling Hyper-Threading. Because of the nature of the attacks, there will be large performance hits from software and firmware updates that attempt to mitigate this problem. This is because buffers will need to be flushed and performance optimizations will need to be avoided in order to prevent the exploit from working.
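Once those software and firmware updates are installed, patched Linux kernels report their view of this class of flaws (which Intel groups under the name Microarchitectural Data Sampling, or MDS) in a sysfs file. The sketch below reads that file and boils it down to a short verdict. It assumes a patched Linux kernel, and the string matching reflects the status text I would expect from such a kernel rather than a guaranteed format; the function names are illustrative.

```python
from pathlib import Path

# Assumption: a Linux kernel that includes the MDS patches and exposes this file.
MDS_FILE = Path("/sys/devices/system/cpu/vulnerabilities/mds")


def read_mds_status(path: Path = MDS_FILE):
    """Return the kernel's MDS status line, or None if it is not available."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def summarize_mds(status) -> str:
    """Reduce the status line to a short verdict.

    Assumption: the status text begins with 'Not affected', 'Mitigation' or
    'Vulnerable', and mentions 'SMT vulnerable' when sibling threads are
    still exposed.
    """
    if status is None:
        return "unknown"
    if status.startswith("Not affected"):
        return "not affected"
    if status.startswith("Mitigation"):
        if "SMT vulnerable" in status:
            return "partially mitigated, SMT still exposed"
        return "mitigated"
    return "vulnerable"


if __name__ == "__main__":
    status = read_mds_status()
    print(status)
    print(summarize_mds(status))
```

A result of "unknown" most likely means the kernel predates the patches, which should be treated as vulnerable until proven otherwise.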