True IOMMU Protection from DMA Attacks: When Copy is Faster than Zero Copy

Some papers on DMA and the IOMMU:

ISCA’10 IOMMU: Strategies for Mitigating the IOTLB Bottleneck

ATC’11 vIOMMU: Efficient IOMMU Emulation

ATC’15 Utilizing the IOMMU Scalably

ASPLOS’15 rIOMMU: Efficient IOMMU for I/O Devices that Employ Ring Buffers

ASPLOS’16 True IOMMU Protection from DMA Attacks: When Copy is Faster than Zero Copy

ASPLOS’18 DAMN: Overhead-free IOMMU Protection for Networking

ATC’20 coIOMMU: A Virtual IOMMU with Cooperative DMA Buffer Tracking for Efficient Memory Management in Direct I/O

Security’21 Static Detection of Unsafe DMA Accesses in Device Drivers

EuroSys’21 Characterizing, exploiting, and detecting DMA code injection vulnerabilities in the presence of an IOMMU

https://zhuanlan.zhihu.com/p/20393380

https://zhuanlan.zhihu.com/p/20383904

https://hardenedlinux.github.io/system-security/2020/01/18/peripheral-based_attack_memory.html

ASPLOS ’16: True IOMMU Protection from DMA Attacks: When Copy is Faster than Zero Copy

Problems with existing IOMMU-based protection:

  • (1) It provides protection at page granularity only, whereas DMA buffers can reside on the same page as other data. Mapping the buffer's page in the IOMMU therefore also exposes the unrelated data on that page to the device.
  • (2) It delays DMA buffer unmaps for performance reasons, creating a vulnerability window in which devices can still access in-use memory. Unmapping is expensive because the IOTLB must be invalidated, so OSes implement deferred protection.
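
Problem (1) can be made concrete with a little arithmetic. The sketch below assumes a 4 KiB IOMMU page size; the helper names `exposed_range` and `collateral_bytes` are hypothetical and only illustrate how many bytes beyond the DMA buffer itself a page-granular mapping makes visible to the device.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB IOMMU pages


def exposed_range(buf_addr, size, page_size=PAGE_SIZE):
    """Return the [start, end) byte range that a page-granular mapping exposes."""
    start = buf_addr - (buf_addr % page_size)              # round down to a page boundary
    end = -(-(buf_addr + size) // page_size) * page_size   # round up to a page boundary
    return start, end


def collateral_bytes(buf_addr, size, page_size=PAGE_SIZE):
    """Bytes exposed to the device that are not part of the DMA buffer."""
    start, end = exposed_range(buf_addr, size, page_size)
    return (end - start) - size
```

For a 1500-byte buffer placed 256 bytes into a page, the whole page is mapped, so well over half of the exposed bytes belong to other data.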

contribution:

  • (1) observing that copying DMA buffers is preferable to IOTLB invalidation;
  • (2) providing a truly secure, fast, and scalable intra-OS protection scheme with strict sub-page safety;
  • (3) implementing the new scheme in Linux and evaluating it with networking workloads at 40 Gb/s.

Assumptions:

1. The goal is to protect the OS from unauthorized DMAs to targets not mapped with the DMA API.

2. The IOMMU is trustworthy and the boot stage is secure.

3. The focus is only on DMA accesses to memory that has no mapping.

Attacker Model:

The attacker controls a set of DMA-capable hardware devices but cannot otherwise access the OS.
However, the shadow-copy design can also support untrusted drivers.

Intra-OS Protection via DMA Shadowing (key design)

The basic idea is simple: we restrict a device’s DMAs to a set of shadow DMA buffers that are permanently mapped in the IOMMU, and copy data to (or from) these buffers from (or to) the OS-allocated DMA buffers.

Goals

1. Transparency: it integrates into any OS without modifying the DMA API. Described in §5.2, extended in §5.4.

2. Scalability: to reduce overhead, it must minimize synchronization (a lock shared by multiple cores is global). Described in §5.3.

3. Generality: it supports all workloads, including huge DMA buffers. Described in §5.5.

§5.2 DMA Shadowing Implementation of the DMA API

primitive DMA API:

  • dma_map:
    • input: buffer address, size, device access rights.
    • functionality: allocates an IOVA region from the device’s IOVA space and creates the corresponding IOMMU page table entries.
    • return value: start of the IOVA region.
    • misc: after this call, the device can access the region while the OS/driver must not.
  • dma_unmap:
    • input: IOVA.
    • functionality: removes the mapping from the IOMMU by deleting its entries from the device’s IOMMU page table.
    • return value:
    • misc: after this call, the device can no longer access the region while the OS/driver can again.
  • dma_alloc_coherent (shared buffer):
    • input:
    • functionality:
    • return value:
    • misc: allocates a region that the driver and the device can access simultaneously, at page granularity. Use dma_free_coherent to free it.
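
To see why deferring the IOTLB invalidation in dma_unmap opens a vulnerability window, here is a toy model, not the real hardware or kernel interface. `ToyIOMMU`, its methods, and the batch size are all hypothetical; the model only assumes that the device's translations are cached in an IOTLB and that a cached entry keeps working until it is invalidated.

```python
class ToyIOMMU:
    """Toy model: a page table plus an IOTLB that caches translations."""

    def __init__(self, deferred, batch=4):
        self.page_table = {}      # IOVA page -> physical page
        self.iotlb = {}           # cached translations used by the device
        self.deferred = deferred  # True = batch invalidations (deferred protection)
        self.pending = []         # unmapped IOVAs whose IOTLB entries are still live
        self.batch = batch        # hypothetical batch size

    def map(self, iova, phys):
        self.page_table[iova] = phys

    def device_access(self, iova):
        """Return the physical page the device reaches, or None if the DMA is blocked."""
        if iova in self.iotlb:
            return self.iotlb[iova]
        phys = self.page_table.get(iova)
        if phys is not None:
            self.iotlb[iova] = phys   # cache the translation on first use
        return phys

    def unmap(self, iova):
        del self.page_table[iova]
        if self.deferred:
            self.pending.append(iova)          # invalidate later, in a batch
            if len(self.pending) >= self.batch:
                self.flush()
        else:
            self.iotlb.pop(iova, None)         # strict: invalidate immediately

    def flush(self):
        for iova in self.pending:
            self.iotlb.pop(iova, None)
        self.pending.clear()
```

In strict mode the device loses access as soon as `unmap` returns. In deferred mode a previously cached translation still resolves until `flush` runs, which is the window described under the problems above.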

Shadowing implementation of the DMA API:

  • dma_map: acquires a shadow buffer of the appropriate size
    and access rights from the pool, then associates it with the mapped OS buffer. If the device may read the buffer, the OS buffer's contents are copied into the shadow buffer. Returns the shadow buffer’s IOVA.
  • dma_unmap: finds the shadow buffer associated with the OS
    buffer. If the device may write the buffer, copies the contents of the shadow buffer into the OS buffer. Then releases the shadow buffer and returns.
  • dma_alloc_coherent and dma_free_coherent: infrequent operations, implemented the same way as in the primitive DMA API.

security:

Although devices can always access all the shadow buffers, the OS reads from the OS buffer only when dma_map is invoked and writes to it only at dma_unmap time.
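
The copy-in/copy-out behaviour can be sketched as a small userspace simulation. This is not the paper's Linux implementation: `ShadowDMA`, the `READ`/`WRITE` flag values, and the IOVA numbering are hypothetical, and for brevity a fresh shadow buffer is created per mapping instead of being taken from a pool.

```python
READ, WRITE = 1, 2  # hypothetical flags for the device's access rights


class ShadowDMA:
    """Toy model: the device only ever sees shadow buffers, never OS buffers."""

    def __init__(self):
        self.shadow = {}          # IOVA -> shadow bytearray (stays device-visible)
        self.assoc = {}           # IOVA -> (OS buffer, rights) while mapped
        self.next_iova = 0x1000   # hypothetical IOVA allocator

    def dma_map(self, os_buf, rights):
        iova = self.next_iova
        self.next_iova += 0x1000
        sh = bytearray(len(os_buf))
        if rights & READ:
            sh[:] = os_buf        # copy in: the device will read this data
        self.shadow[iova] = sh
        self.assoc[iova] = (os_buf, rights)
        return iova

    def dma_unmap(self, iova):
        os_buf, rights = self.assoc.pop(iova)
        if rights & WRITE:
            os_buf[:] = self.shadow[iova]   # copy out: take the device's data
        # The shadow buffer is deliberately left in self.shadow: it stays
        # mapped, so no IOMMU page table change or IOTLB invalidation occurs.
```

A device write to the shadow buffer after `dma_unmap` changes only the shadow copy. The OS buffer keeps the value that was copied out at unmap time.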

§5.3 Shadow Buffer Pool

  • Each device is associated with a unique shadow buffer pool.
  • API of shadow buffer pool:
    • iova_t acquire_shadow(buf, size, rights): Acquires a shadow buffer and associates it with the OS buffer buf.
    • void* find_shadow(iova): Looks up the shadow buffer whose IOVA is iova and returns the OS buffer associated with it.
    • void release_shadow(shbuf): Releases the shadow buffer shbuf back to the pool, disassociating it from its OS buffer.
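
The three pool operations above could look roughly like this. It is a minimal single-threaded sketch, not the paper's data structure: the `ShadowPool` class and its dictionaries are hypothetical, and an IOVA is used as an opaque handle that stands in for the shadow buffer (so `release_shadow` takes the IOVA rather than a separate shbuf object).

```python
import itertools


class ShadowPool:
    """Toy per-device pool; the IOVA doubles as the shadow buffer handle."""

    def __init__(self):
        self.free = {}    # (size, rights) -> list of free IOVAs
        self.meta = {}    # IOVA -> (size, rights), fixed when the buffer is created
        self.owner = {}   # IOVA -> OS buffer currently associated with it
        self._iova = itertools.count(0x100000, 0x1000)  # hypothetical IOVA source

    def acquire_shadow(self, buf, size, rights):
        free_list = self.free.setdefault((size, rights), [])
        if free_list:
            iova = free_list.pop()              # reuse an already mapped buffer
        else:
            iova = next(self._iova)             # first use: create and map once
            self.meta[iova] = (size, rights)
        self.owner[iova] = buf
        return iova

    def find_shadow(self, iova):
        return self.owner[iova]                 # OS buffer associated with iova

    def release_shadow(self, iova):
        del self.owner[iova]                    # disassociate from the OS buffer
        self.free[self.meta[iova]].append(iova)
```

Because a released buffer goes back to the list for its own size and rights, a later `acquire_shadow` with the same parameters gets the same IOVA back without touching the IOMMU.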

pool design

A pool maintains a unique set of free lists. Each list holds free shadow buffers of a particular size and device access rights.

Each size class has 3 free lists: read, write, and both.

Each core has its own free lists, so cores can operate concurrently.

NUMA awareness: avoiding inter-NUMA-node accesses keeps accesses fast.

A shadow buffer is always freed to the list it was allocated from, so the access rights of its mapping never change and no IOTLB flush is needed.
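
The free-list layout described here can be sketched as follows. This is a toy model with hypothetical names (`PerCorePool`, `home`), and it ignores the synchronization a real implementation needs when one core returns a buffer to another core's list. It shows only the "sticky" rule: a buffer always goes back to the list it came from, so its size class and rights never change.

```python
from collections import defaultdict

RIGHTS = ("read", "write", "both")  # the three lists kept per size class


class PerCorePool:
    """Toy model: free[core][(size, rights)] is that core's private free list."""

    def __init__(self, n_cores):
        self.free = [defaultdict(list) for _ in range(n_cores)]
        self.home = {}    # buffer id -> (core, size, rights) it was created with
        self._next_id = 0

    def acquire(self, core, size, rights):
        assert rights in RIGHTS
        free_list = self.free[core][(size, rights)]
        if free_list:
            return free_list.pop()              # no IOMMU work: already mapped
        buf = self._next_id                     # new buffer, mapped exactly once
        self._next_id += 1
        self.home[buf] = (core, size, rights)
        return buf

    def release(self, buf, releasing_core):
        # releasing_core is ignored on purpose: the buffer is sticky and
        # returns to the list it was allocated from, keeping its rights.
        core, size, rights = self.home[buf]
        self.free[core][(size, rights)].append(buf)
```

Since the mapping of a buffer is never changed after creation, neither `acquire` nor `release` has anything to invalidate in the IOTLB.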

shadow buffer metadata:

Each NUMA domain has an array of shadow buffer metadata structures for each size class.