RIDL

Rogue In-flight Data Load

Stephan van Schaik - Alyssa Milburn

Sebastian Österlund - Pietro Frigo - Giorgi Maisuradze*

Kaveh Razavi - Herbert Bos - Cristiano Giuffrida

MDS Attacks

Let’s first talk about cache attacks

Background

FLUSH + RELOAD

Previous Attacks

Meltdown
Spectre
Foreshadow or L1TF

Mitigations

Kernel Page Table Isolation
Array index masking
XOR masking

KPTI

Problem: leak kernel data from virtual addresses

KPTI

Solution: unmap kernel addresses

So we have a system with all mitigations in place

What can we still do as an attacker?

Meet Rogue In-flight Data Load or RIDL

A new class of speculative execution attacks

that knows no boundaries

Privilege levels are just a social construct

Security Domains

We can leak between hardware threads!

But can we leak across other security domains?

Yes, we can!

We leak from the kernel …

... across VMs …

... from the hypervisor …

... and from SGX enclaves!

We leak across all security domains!

Can we leak in the web browser?

Yes, we can!

We reproduced RIDL in Mozilla Firefox
⇒ No need for special instructions

We leak across security domains, and in the browser!

Memory addresses are a social construct too

Previous Attacks

Previous attacks show we can speculatively leak from addresses

Spectre: access out-of-bounds addresses
Meltdown: leak kernel data from virtual addresses
Foreshadow: leak from physical addresses

Our mitigation efforts focus on isolating/masking addresses

Spectre: mask array index to limit address range
Meltdown: unmap kernel addresses from userspace
Foreshadow: invalidate physical address
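
Array index masking can be sketched in C roughly as follows. This is a minimal illustration and not the code of any particular kernel: the names index_mask and load_masked are hypothetical, and the branch-free formula assumes size is at most SIZE_MAX / 2.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: all-ones when index < size, zero otherwise.
   Computed arithmetically, so there is no branch to mispredict. */
static inline size_t index_mask(size_t index, size_t size)
{
    /* The top bit of (index | (size - 1 - index)) is set exactly
       when index >= size (given size <= SIZE_MAX / 2). */
    size_t in_range = ~(index | (size - 1 - index))
                      >> (sizeof(size_t) * CHAR_BIT - 1);
    return (size_t)0 - in_range;
}

/* Hypothetical bounds-checked load with a masked index. */
static uint8_t load_masked(const uint8_t *array, size_t size, size_t index)
{
    assert(size > 0 && size <= SIZE_MAX / 2);
    if (index >= size)
        return 0;
    /* Even if the check above is speculatively bypassed, an
       out-of-range index is forced to 0 before it forms an address. */
    index &= index_mask(index, size);
    return array[index];
}
```

The point of the sketch is that the mitigation constrains the address. It has nothing to say about data that is leaked without a usable address.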

Previous attacks exploit addressing
Mitigation by isolating/masking addresses

RIDL

RIDL does not depend on addressing:

⇒ Bypass all address-based security checks
⇒ Makes RIDL hard to mitigate

What CPUs does RIDL affect?

We bought Intel and AMD CPUs from almost every generation since 2008

... and sent the invoices to our professor Herbert Bos

RIDL works on all mainstream Intel CPUs since 2008

Intel announces Coffee Lake Refresh

In-silicon mitigations against Meltdown and Foreshadow

Let’s buy the Intel Core i9-9900K!

... and send another invoice to our professor Herbert Bos

We got it the day after we submitted the paper

RIDL works regardless of these in-silicon mitigations

AMD

We also tried to reproduce it on AMD

RIDL does not affect AMD

But where are we actually leaking from?

Leaky Sources

Previous attacks had it easy: they leak from caches

Caches are well documented and well understood.

But RIDL does not leak from caches!

But what else is there to leak from?

There are other internal CPU buffers

Line Fill Buffers, Store Buffers and Load Ports

But there is more!

Uncached Memory

We can leak from various internal CPU buffers!

RIDL is a class of speculative execution attacks

also known as Microarchitectural Data Sampling (MDS)

Let’s focus on one particular instance:

Line Fill Buffers

Manuals

We first read the manuals
Some references to internal CPU buffers
But no further explanation
Where would you even start?

That’s why we started reading patents instead!

We read a lot of patents, and survived!

So today I can tell you a bit more about them

But wait, what are these Line Fill Buffers?

Central buffer between execution units, L1d and L2 to improve memory throughput

Multiple roles:

Asynchronous memory requests
Load squashing
Write combining
Uncached memory

CPU design: what to do on a cache miss?

Send out memory request
Wait for completion
Blocks other loads/stores

Solution: keep track of address in LFB

Send out memory request
Allocate LFB entry
Store address in LFB
Serve other loads/stores
Pending request eventually completes

A newly allocated LFB entry may still contain data from a previous load

RIDL exploits this
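
The allocation behaviour described above can be pictured with a toy software model. This is only a sketch for intuition, not a description of the real hardware: the struct layout, the entry count, the round-robin policy and the names lfb_allocate and lfb_complete are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LFB_ENTRIES 10   /* assumption: arbitrary count for the model */
#define LINE_SIZE   64

struct lfb_entry {
    uint64_t address;          /* address of the pending request */
    int      pending;          /* 1 while the request is outstanding */
    uint8_t  data[LINE_SIZE];  /* line data, NOT cleared on reuse */
};

struct lfb {
    struct lfb_entry entries[LFB_ENTRIES];
    unsigned next;             /* assumed round-robin cursor */
};

/* Cache miss: allocate an entry and store the address in it.
   The data field is left untouched, so it still holds whatever
   the previous user of this entry loaded. */
static struct lfb_entry *lfb_allocate(struct lfb *lfb, uint64_t address)
{
    struct lfb_entry *e = &lfb->entries[lfb->next];
    lfb->next = (lfb->next + 1) % LFB_ENTRIES;
    e->address = address;
    e->pending = 1;
    return e;
}

/* The pending request eventually completes and fills in the line. */
static void lfb_complete(struct lfb_entry *e, const uint8_t *line)
{
    assert(e->pending);
    memcpy(e->data, line, LINE_SIZE);
    e->pending = 0;
}
```

In this model, reading e->data between lfb_allocate and lfb_complete returns the previous load's bytes, which is the window the slide refers to.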

Experiments

Experiments in the paper

Conclusion: our primary RIDL instance leaks from Line Fill Buffers

Cool… so how do we actually mount a RIDL attack?

Ideas

We can leak in-flight data
Let’s get some sensitive data in-flight!

Confused deputy

Observation: invoking the passwd utility reads the contents of /etc/shadow
We can control the affinity of the process with taskset
Try to leak from the other hyperthread while /etc/shadow is in flight
Not so easy…

Challenges

We need to synchronize or do some post-processing

Synchronize: not possible, we cannot change passwd binary

Post-processing: we can repeat measurements, stitch them together

What does this program look like?

RIDL is like drinking from a fire hose

You just get whatever data is in flight!

Filtering Data

How can we filter data?

We want to leak from /etc/shadow
The first line of /etc/shadow is for root
Starts with "root:"
Use prefix matching:
- Match ⇒ we learn a new byte
- No Match ⇒ discard
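
The prefix-matching idea can be sketched in plain C. This is a simplified software model, not the attack code itself: filter_sample is a hypothetical name, and the sketch assumes each sample is a short window of in-flight bytes whose last byte is the only unknown one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* known:  NUL-terminated bytes recovered so far (*known_len of them),
           stored in a buffer of capacity cap.
   sample: a window of sample_len sampled bytes.
   Returns 1 and appends one byte when the first sample_len - 1 bytes
   of the sample equal the tail of known; returns 0 otherwise. */
static int filter_sample(char *known, size_t *known_len, size_t cap,
                         const char *sample, size_t sample_len)
{
    assert(sample_len >= 2);
    size_t overlap = sample_len - 1;

    if (*known_len < overlap || *known_len + 1 >= cap)
        return 0;
    if (memcmp(known + *known_len - overlap, sample, overlap) != 0)
        return 0;                              /* no match: discard */

    known[(*known_len)++] = sample[overlap];   /* match: learn a byte */
    known[*known_len] = '\0';
    return 1;
}
```

Starting from known = "root:", a sample such as "oot:$" extends the string by one byte, while unrelated in-flight data is dropped.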

Result

We can leak the root password hash from an unprivileged user

Let’s extend this a bit…

to the cloud!

Threat Model

Victim VM in the cloud

We get a VM on the same server

We make sure it is co-located

Victim VM runs an SSH server

In-Flight Data

How do we get data in flight?

We run an SSH client…

... that keeps connecting to the SSH server

The SSH server loads /etc/shadow through the LFB

The contents of /etc/shadow are in flight

Leaking

Now that the data is in flight, we want to leak it

We run our RIDL program on our server…

...which leaks the data from the LFB

More Examples

More examples in the paper:

Leaking internal CPU data (e.g. page tables)
Arbitrary kernel read
Leaking in the browser

Arbitrary kernel leak

We can use Spectre in combination with RIDL
Use gadgets to pull data into LFB
Train branch predictor to allow arbitrary OOB read

RIDL + Spectre

copy_from_user() can access an arbitrary user-supplied pointer
Repeatedly call setrlimit() with a valid user pointer to train the branch predictor
After training, we supply the kernel pointer we want to leak
The copy executes speculatively and pulls the data into the LFB
At the same time, we leak it using RIDL

What next?

We attacked the cloud and have an arbitrary kernel read.

We still need a local account on the target…

From the browser

Portability

Some environments do not have TSX
clflush might also not be available

No clflush
- Use EVICT + RELOAD
No TSX
- Use demand paging to generate a valid page fault (error suppression)

/* evict(), cycles(), buffer, new_page and results[] are defined elsewhere. */

/* Evict buffer from cache. */
evict(buffer);

/* Speculatively load the secret. */
unsigned char value = *(volatile unsigned char *)new_page;

/* Calculate the corresponding entry. */
volatile char *entry_ptr = buffer + (1024 * value);

/* Load that entry into the cache. */
(void)*entry_ptr;

/* Time the reload of each buffer entry to
   see which entry is now cached. */
for (int k = 0; k < 256; ++k) {
  unsigned long long t0 = cycles();
  (void)*(volatile char *)(buffer + 1024 * k);
  if (cycles() - t0 < 100) ++results[k];
}
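
After many rounds of the timing loop, results[] holds a hit count per byte value. One plausible way to turn that histogram into a leaked byte is sketched below. The name best_candidate and the min_hits threshold are assumptions for illustration, not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the byte value with the highest hit count, or -1 when
   even the best count is below min_hits (too noisy to trust). */
static int best_candidate(const unsigned results[256], unsigned min_hits)
{
    assert(results != NULL);
    int best = 0;
    for (int k = 1; k < 256; ++k)
        if (results[k] > results[best])
            best = k;
    return results[best] >= min_hits ? best : -1;
}
```

A caller would typically repeat the measurement until best_candidate stops returning -1, then move on to the next byte.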

From the browser

We can generate this code from WebAssembly!

Existing Mitigations

Three mechanisms:

Inhibit Trigger (stop speculation, fences, retpoline)
Hide Secret (KPTI, array index masking, L1d flush)
Disrupt channel of leakage (disable timers)

Why they fail

Existing mitigations fail because they assume addressing

RIDL Mitigations

Same-thread:
- verw overwrites affected buffers
- Special Assembly snippets

MD_CLEAR workaround

	__asm__ __volatile__ (
		"lfence\n\t"
		"orpd (%1), %%xmm0\n\t"
		"orpd (%1), %%xmm0\n\t"
		"xorl	%%eax, %%eax\n\t"
		"1:clflushopt 5376(%0,%%rax,8)\n\t"
		"addl	$8, %%eax\n\t"
		"cmpl $8*12, %%eax\n\t"
		"jb 1b\n\t"
		"sfence\n\t"
		"movl	$6144, %%ecx\n\t"
		"xorl	%%eax, %%eax\n\t"
		"rep stosb\n\t"
		"mfence\n\t"
		: "+D" (dst)
		: "r" (zero_ptr)
		: "eax", "ecx", "cc", "memory"
	);

RIDL Mitigations

Same-thread:
- verw overwrites affected buffers
- Special Assembly snippets
Cross-thread:
- Complex scheduling and synchronization
- Disable Intel Hyper-Threading®