True IOMMU Protection from DMA Attacks: When Copy is Faster than Zero Copy
Some related papers on DMA/IOMMU protection:
ISCA’10 IOMMU: Strategies for Mitigating the IOTLB Bottleneck
ATC’11 vIOMMU: Efficient IOMMU Emulation
ATC’15 Utilizing the IOMMU Scalably
ASPLOS’15 rIOMMU: Efficient IOMMU for I/O Devices that Employ Ring Buffers
ASPLOS’16 True IOMMU Protection from DMA Attacks: When Copy is Faster than Zero Copy
ASPLOS’18 DAMN: Overhead-free IOMMU Protection for Networking
ATC’20 coIOMMU: A Virtual IOMMU with Cooperative DMA Buffer Tracking for Efficient Memory Management in Direct I/O
Security’21 Static Detection of Unsafe DMA Accesses in Device Drivers
EuroSys’21 Characterizing, exploiting, and detecting DMA code injection vulnerabilities in the presence of an IOMMU
https://zhuanlan.zhihu.com/p/20393380
https://zhuanlan.zhihu.com/p/20383904
https://hardenedlinux.github.io/system-security/2020/01/18/peripheral-based_attack_memory.html
ASPLOS ’16: True IOMMU Protection from DMA Attacks: When Copy is Faster than Zero Copy
Problems (with existing OS IOMMU-based protection):
- (1) It provides protection at page granularity only, whereas DMA buffers can reside on the same page as other data, so unrelated data ends up inside an IOMMU-mapped page and is exposed to the device.
- (2) It delays DMA buffer unmaps for performance reasons, creating a vulnerability window in which devices can access in-use memory: because of the IOTLB, unmapping (with its invalidation) is expensive, so for performance OSes implement deferred protection.
Contributions:
- (1) observing that copying DMA buffers is preferable to IOTLB invalidation;
- (2) providing a truly secure, fast, and scalable intra-OS protection scheme with strict sub-page safety;
- (3) implementing the new scheme in Linux and evaluating it with networking workloads at 40 Gb/s.
Assumptions:
1. The goal is protecting the OS from unauthorized DMAs to targets not mapped with the DMA API.
2. The IOMMU hardware is trustworthy, and the boot stage is secure.
3. The focus is only on DMA accesses to unmapped memory.
Attacker Model:
The attacker controls a set of DMA-capable hardware devices but cannot otherwise access the OS.
However, the shadow-copy design can also support untrusted drivers.
Intra-OS Protection via DMA Shadowing (key design)
The basic idea is simple: we restrict a device’s DMAs to a set of shadow DMA buffers that are permanently mapped in the IOMMU, and copy data to (or from) these buffers from (or to) the OS-allocated DMA buffers.
Goals
1. Transparency: it integrates into any OS without modifying the DMA API. Described in §5.2, extended in §5.4.
2. Scalability: to keep overhead low, it must minimize cross-core synchronization (a lock shared by all cores is effectively global and does not scale). Described in §5.3.
3. Generality: it supports all workloads, including huge DMA buffers. Described in §5.5.
§5.2 DMA Shadowing Implementation of the DMA API
Primitive DMA API:
- dma_map:
  - input: buffer address, size, device access rights.
  - functionality: allocates an IOVA region from the device's IOVA space and creates the corresponding IOMMU page-table entries.
  - return value: the start address of the IOVA region.
  - misc: after this call, the device can access the region while the OS/driver must not.
- dma_unmap:
  - input: IOVA.
  - functionality: removes the mapping from the IOMMU and deletes the device's IOMMU page-table entries.
  - return value: none.
  - misc: after this call, the device can no longer access the region and the OS/driver can use it again.
- dma_alloc_coherent (shared buffer):
  - input: size (and the device).
  - functionality: allocates a page-granularity region that the driver and the device can access simultaneously.
  - return value: the CPU address of the region together with its IOVA.
  - misc: freed with dma_free_coherent.
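For orientation, a minimal sketch (not from the paper) of how a Linux driver uses these primitives on a transmit path; dma_map_single/dma_unmap_single/dma_alloc_coherent are the real Linux DMA API, while the surrounding driver code and names (example_tx, example_alloc_ring) are illustrative only:

```c
#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/errno.h>

/* Streaming mapping of a driver-owned buffer for a device read (TX). */
static int example_tx(struct device *dev, void *buf, size_t len)
{
	dma_addr_t iova;

	/* dma_map: creates an IOMMU mapping and returns the IOVA the
	 * device must use; the CPU should not touch buf until unmap. */
	iova = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, iova))
		return -ENOMEM;

	/* ... hand iova to the device and wait for completion ... */

	/* dma_unmap: tears down the IOMMU mapping; the device may no
	 * longer access the buffer, and the CPU may use it again. */
	dma_unmap_single(dev, iova, len, DMA_TO_DEVICE);
	return 0;
}

/* Coherent (shared) buffer: CPU and device may access it concurrently,
 * e.g. for a descriptor ring; paired with dma_free_coherent(). */
static void *example_alloc_ring(struct device *dev, size_t size,
				dma_addr_t *iova)
{
	return dma_alloc_coherent(dev, size, iova, GFP_KERNEL);
}
```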
Shadowing implementation of the DMA API:
- dma_map: acquires a shadow buffer of the appropriate size and access rights from the pool and associates it with the mapped OS buffer (copying the OS buffer's contents into the shadow buffer when the device is allowed to read it); returns the shadow buffer's IOVA.
- dma_unmap: finds the shadow buffer associated with the OS buffer, copies the shadow buffer's contents back into the OS buffer, then releases the shadow buffer and returns.
- dma_alloc_coherent and dma_free_coherent: infrequent operations, implemented identically to the primitive DMA API.
Security:
Although the device can always access all of the shadow buffers, the OS reads the OS buffer only when dma_map is invoked and writes it only at dma_unmap time; in between, the device interacts solely with the shadow copy, never with the OS buffer itself.
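A rough sketch, assuming the pool API of §5.3, of how the shadowed dma_map/dma_unmap could be layered on it; the rights flags and the helpers shadow_cpu_addr()/shadow_desc() are hypothetical names, not the paper's or Linux's:

```c
#include <stddef.h>
#include <string.h>

typedef unsigned long iova_t;

/* Hypothetical rights flags (not from the paper). */
#define DEVICE_MAY_READ   0x1
#define DEVICE_MAY_WRITE  0x2

/* Pool API from Section 5.3, as listed in these notes. */
iova_t acquire_shadow(void *buf, size_t size, int rights);
void  *find_shadow(iova_t iova);
void   release_shadow(void *shbuf);

/* Hypothetical helpers: CPU address / descriptor of the shadow
 * buffer that backs a given IOVA. */
void *shadow_cpu_addr(iova_t iova);
void *shadow_desc(iova_t iova);

iova_t shadow_dma_map(void *os_buf, size_t size, int rights)
{
	/* Grab a permanently IOMMU-mapped shadow buffer of the matching
	 * size class and rights, and associate it with os_buf. */
	iova_t iova = acquire_shadow(os_buf, size, rights);

	/* If the device may read this mapping, copy the payload into the
	 * shadow buffer now; this is the only time the OS reads os_buf. */
	if (rights & DEVICE_MAY_READ)
		memcpy(shadow_cpu_addr(iova), os_buf, size);

	return iova;   /* the device only ever sees shadow IOVAs */
}

void shadow_dma_unmap(iova_t iova, size_t size, int rights)
{
	void *os_buf = find_shadow(iova);

	/* If the device may have written, copy the result back into the
	 * OS buffer; this is the only time the OS writes os_buf. */
	if (rights & DEVICE_MAY_WRITE)
		memcpy(os_buf, shadow_cpu_addr(iova), size);

	/* No IOMMU unmap and no IOTLB invalidation: the shadow mapping
	 * is permanent; the buffer simply returns to its free list. */
	release_shadow(shadow_desc(iova));
}
```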
§5.3 Shadow Buffer Pool
- Each device is associated with a unique shadow buffer pool.
- API of shadow buffer pool:
- iova_t acquire_shadow(buf, size, rights): Acquires a shadow buffer and associates it with the OS buffer buf.
- void* find_shadow(iova): Looks up the shadow buffer whose IOVA is iova and returns the OS buffer associated with it.
- void release_shadow(shbuf): Releases the shadow buffer shbuf back to the pool, disassociating it from its OS buffer.
Pool design:
- A pool maintains a unique set of free lists; each list holds free shadow buffers of a particular size and set of device access rights.
- Each size class has 3 lists: device-read, device-write, and read/write.
- Each core has its own free lists -> concurrent pool operations without cross-core synchronization.
- Free lists and their buffers are NUMA-local -> fast access from the cores of that node.
- A freed shadow buffer goes back to the list it was allocated from -> its mapping and access rights never change -> no IOTLB flush is ever needed.
Shadow buffer metadata:
Each NUMA domain keeps an array of shadow-buffer metadata structures for each size class.
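A possible shape, purely illustrative, for the per-device pool state described above (per-core free lists split by size class and access rights, per-NUMA-node metadata arrays); all constants and field names here are assumptions, not the paper's structures:

```c
#include <stddef.h>

/* Illustrative pool layout; constants and field names are assumptions. */
#define NR_SIZE_CLASSES  8      /* e.g. power-of-two classes, 64 B .. 8 KB */
#define NR_RIGHTS        3      /* device-read, device-write, read+write  */

struct shadow_buf {
	unsigned long      iova;       /* permanent IOMMU mapping, never remapped */
	void              *cpu_addr;   /* kernel address of the shadow memory     */
	void              *os_buf;     /* OS buffer currently associated, if any  */
	int                home_core;  /* free list the buffer is returned to     */
	struct shadow_buf *next_free;
};

/* Per-core free lists: one list per (size class, rights) pair, so a buffer
 * keeps its mapping and rights forever and no IOTLB flush is ever needed. */
struct percore_pool {
	struct shadow_buf *free[NR_SIZE_CLASSES][NR_RIGHTS];
};

/* Per-NUMA-node metadata: an array of shadow_buf descriptors per size
 * class, backed by node-local memory for fast access. */
struct pernode_pool {
	struct shadow_buf *meta[NR_SIZE_CLASSES];
	size_t             nr_bufs[NR_SIZE_CLASSES];
};

/* One pool per device. */
struct shadow_pool {
	struct percore_pool *percore;  /* indexed by CPU id       */
	struct pernode_pool *pernode;  /* indexed by NUMA node id */
};
```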