Re: [PATCH v2 00/16] block atomic writes

On 21/12/2023 13:22, Christoph Hellwig wrote:
> On Thu, Dec 21, 2023 at 01:18:33PM +0000, John Garry wrote:
>>> For SGL-capable devices that would be
>>> BIO_MAX_VECS, otherwise 1.
>> ok, but we would need to advertise that or whatever segment limit. A statx
>> field just for that seems a bit inefficient in terms of space.
> I'd rather not hard code BIO_MAX_VECS in the ABI, which suggest we
> want to export is as a field.  Network file systems also might have
> their own limits for one reason or another.

Hi Christoph,

I have been looking at this issue again and I am not sure that telling the user the max number of segments allowed is the best option. I'm worried that the resultant atomic write unit max will be too small.

The background again is that we want to tell the user what the maximum atomic write unit size is, such that we can always guarantee to fit the write in a single bio. And there would be no iovec length or alignment rules.

The max segments value advertised would be min(queue max segments, BIO_MAX_VECS), so it would be 256 when the request queue is not limiting.

The worst-case (most inefficient) iovec layout which the user could provide would be something like .iov_base = 0x...0E00 and .iov_len = 0x400, which straddles a page boundary, so each 1024B-length iovec would require 2x pages and 2x DMA sg elems. I am assuming that we will still use the direct IO rule of LBS length and alignment.
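To illustrate the page-straddling case, here is a minimal sketch. It assumes a 4K page size, and pages_spanned() is a hypothetical helper for illustration only, not an existing kernel function:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pages touched by the byte range
 * [base, base + len). page_size is an assumption (e.g. 4096).
 */
static size_t pages_spanned(uintptr_t base, size_t len, size_t page_size)
{
	uintptr_t first, last;

	if (len == 0)
		return 0;

	first = base / page_size;             /* page index of first byte */
	last = (base + len - 1) / page_size;  /* page index of last byte */

	return (size_t)(last - first + 1);
}
```

With a base ending in 0xE00 and a length of 0x400, the range ends at offset 0x11FF, so it touches two 4K pages; the same length starting on a page boundary touches only one.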

As such, we would then need to set atomic write unit max = min(queue max segments, BIO_MAX_VECS) * LBS. That would mean an atomic write unit max of 256 * 512 = 128K (for 512B LBS). For a DMA controller with max segments of 64, for example, we would have 32K. These seem too low.
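As a rough sketch of that arithmetic (EXAMPLE_BIO_MAX_VECS and awu_max_multi_iovec() are made-up names for illustration, and 256 is simply the value used above):

```c
#include <stddef.h>

/* Assumption: mirrors the value of 256 used in the discussion. */
#define EXAMPLE_BIO_MAX_VECS 256u

static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: worst-case atomic write unit max when every
 * segment may carry only one logical block (LBS bytes) of data.
 */
static size_t awu_max_multi_iovec(size_t queue_max_segments, size_t lbs)
{
	return min_size(queue_max_segments, EXAMPLE_BIO_MAX_VECS) * lbs;
}
```

With a 512B LBS this gives 128K when the queue is not the limiting factor, and 32K for a queue limited to 64 segments.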

Alternatively, I'm thinking that we should just limit to 1x iovec always, and then atomic write unit max = (min(queue max segments, BIO_MAX_VECS) - 1) * PAGE_SIZE [ignoring first/last iovec contents]. It also makes support for non-enterprise NVMe drives more straightforward. If someone wants, they can introduce support for multi-iovec later, but it would probably require some more iovec length/alignment rules.
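For comparison, a minimal sketch of the single-iovec calculation. The names are again hypothetical, and the 4K PAGE_SIZE in the figures below is an assumption:

```c
#include <stddef.h>

/* Assumption: mirrors the value of 256 used in the discussion. */
#define EXAMPLE_BIO_MAX_VECS 256u

/*
 * Hypothetical helper: atomic write unit max for a single iovec.
 * One segment is reserved because an unaligned buffer may start and
 * end part-way through a page.
 */
static size_t awu_max_single_iovec(size_t queue_max_segments, size_t page_size)
{
	size_t segs = queue_max_segments < EXAMPLE_BIO_MAX_VECS ?
		      queue_max_segments : EXAMPLE_BIO_MAX_VECS;

	if (segs == 0)
		return 0;

	return (segs - 1) * page_size;
}
```

With 4K pages this gives 255 * 4K = 1020K when the queue is not the limiting factor, and 63 * 4K = 252K for a queue limited to 64 segments, compared with 128K and 32K in the multi-iovec worst case.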

Please let me know your thoughts.

Thanks,
John




