On Tue, Nov 19, 2024 at 09:55:27PM +0100, Brian Johannesmeyer wrote:
> We discovered a security-related issue in the DMA pool allocator.
>
> V1 of our RFC was submitted to the Linux kernel security team. They
> recommended submitting it to the relevant subsystem maintainers and the
> hardening mailing list instead, as they did not consider this an explicit
> security issue. Their rationale was that Linux implicitly assumes hardware
> can be trusted.
>
> **Threat Model**: While Linux drivers typically trust their hardware, there
> may be specific drivers that do not operate under this assumption. Hence,
> this threat model assumes a malicious peripheral device capable of
> corrupting DMA data to exploit the kernel. In this scenario, the device
> manipulates kernel-initialized data (similar to the attack described in the
> Thunderclap paper [0]) to achieve arbitrary kernel memory corruption.
>
> **DMA pool background**. A DMA pool aims to reduce the overhead of DMA
> allocations by creating one large DMA buffer --- the "pool" --- from which
> smaller buffers are allocated as needed. Fundamentally, a DMA pool
> functions like a heap: it is a structure composed of linked memory
> "blocks", which, in this context, are DMA buffers. When a driver employs a
> DMA pool, it grants the device access not only to these blocks but also to
> the pointers linking them.
>
> **Vulnerability**. Similar to traditional heap corruption vulnerabilities
> --- where a malicious program corrupts heap metadata to, e.g., hijack
> control flow --- a malicious device may corrupt DMA pool metadata. This
> corruption can trivially lead to arbitrary kernel memory corruption from
> any driver that uses one. Because the DMA pool API is extensively used,
> the vulnerability is not confined to a single instance: every usage of
> the DMA pool API is potentially vulnerable. An exploit proceeds with the
> following steps:
>
> 1. The DMA pool initializes its list of blocks, then points to the first
> block.
> 2. The malicious device overwrites the first 8 bytes of the first block
> --- which contain its `next_block` pointer --- with an arbitrary kernel
> address, `kernel_addr`.
> 3. The driver makes its first call to `dma_pool_alloc()`, after which the
> pool should point to the second block. However, it instead points to
> `kernel_addr`.
> 4. The driver calls `dma_pool_alloc()` again, which incorrectly returns
> `kernel_addr`. Therefore, any time the driver writes to this "block", it
> may corrupt sensitive kernel data.
>
> I have a PDF document that illustrates how these steps work. Please let
> me know if you would like me to share it with you.

I know I said it privately, but I'll say it here in public: very cool
finding, this is nice work!

> **Proposed mitigation**. To mitigate the corruption of DMA pool metadata
> (i.e., the pointers linking the blocks), the metadata should be moved
> into non-DMA memory, ensuring it cannot be altered by a device. I have
> included a patch series that implements this change. Since I am not
> deeply familiar with the DMA pool internals, I would appreciate any
> feedback on the patches. I have tested the patches with the
> `DMAPOOL_TEST` test and my own basic unit tests, which check that the
> DMA pool allocator is no longer vulnerable.
>
> **Performance**. I evaluated the patch set's performance by running the
> `DMAPOOL_TEST` test with `DMAPOOL_DEBUG` enabled, with and without the
> patches applied.
> Here is its output *without* the patches applied:
>
> ```
> dmapool test: size:16 align:16 blocks:8192 time:3194110
> dmapool test: size:64 align:64 blocks:8192 time:4730440
> dmapool test: size:256 align:256 blocks:8192 time:5489630
> dmapool test: size:1024 align:1024 blocks:2048 time:517150
> dmapool test: size:4096 align:4096 blocks:1024 time:399616
> dmapool test: size:68 align:32 blocks:8192 time:6156527
> ```
>
> And here is its output *with* the patches applied:
>
> ```
> dmapool test: size:16 align:16 blocks:8192 time:3541031
> dmapool test: size:64 align:64 blocks:8192 time:4227262
> dmapool test: size:256 align:256 blocks:8192 time:4890273
> dmapool test: size:1024 align:1024 blocks:2048 time:515775
> dmapool test: size:4096 align:4096 blocks:1024 time:523096
> dmapool test: size:68 align:32 blocks:8192 time:3450830
> ```

You had mentioned that the size:68 numbers were going to be re-run; has
that happened, and is it really that much of a boost for that size? Or
are these the original numbers?

thanks,

greg k-h