Btw, there is another option: most real systems already need swiotlb for bounce buffering in some cases anyway. We could simply force bounce buffering in the DMA mapping code for transfers that are too small or not properly aligned, and just decrease the DMA alignment.
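
To make the idea concrete, here is a rough userspace sketch of the check the mapping code would have to do before deciding to bounce. This is not actual kernel code: the name sketch_needs_bounce() and the 128-byte value are made up for illustration, and a real implementation would take the required alignment from the architecture and device rather than from a constant.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration only; the real requirement depends on
 * the architecture's cache line size and whether the device is coherent. */
#define SKETCH_DMA_MIN_ALIGN 128u

/*
 * Hypothetical helper: return true if a transfer would have to go through
 * a bounce buffer because it does not cover whole aligned units.
 * min_align must be a power of two.
 *
 * A length below min_align can never be a multiple of it, so the
 * "too small" case is caught by the same mask test as the misaligned one.
 */
static bool sketch_needs_bounce(uintptr_t addr, size_t len, size_t min_align)
{
	size_t mask = min_align - 1;

	/* Bounce if either the start address or the length is unaligned. */
	return ((addr | len) & mask) != 0;
}
```

With a check like this in the mapping path, an aligned 128-byte buffer at 0x1000 would be mapped directly, while a 64-byte buffer at the same address, or a 128-byte buffer at 0x1008, would be bounced.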