On Fri, Sep 01, 2023 at 08:10:14AM +0200, Christoph Hellwig wrote:
> On Thu, Aug 31, 2023 at 12:32:38PM -0300, Jason Gunthorpe wrote:
> > The entry is variable sized, so it depends on what is stuffed in
> > it. For a lot of common use cases, especially RDMA page lists, it will
> > be able to use an 8 byte entry. This is pretty much the most space
> > efficient it could be.
>
> How do you get away with a 8 byte entry for addr+len?

It's a compression. The basic base/length/flags has a lot of zero bits
in common cases.

I was drafting:

  2 bits for 'encoding == 8 bytes'
  2 bits for flags
  28 bits for length
  32 bits for address >> 12

So if the range has zero bits in the right places then it fits in 8
bytes.

Otherwise the compressor will choose a 16 byte entry:

  2 bits for 'encoding == 16 bytes'
  2 bits for flags
  36 bits for length
  64 bits for address
  24 bits for offset

And a 24 byte entry with 36 bits of flags and no limitations.

So we can store anything, but common cases of page lists will use only
8 bytes/entry.

This is a classical compression trade-off: better space efficiency for
long term storage, worse iteration efficiency.

> > With your direction I felt we could safely keep bio as it is and
> > cheaply make a fast DMA mapper for it. Provide something like this as
> > the 'kitchen sink' version for dmabuf/rdma/etc that are a little
> > different.
>
> So for the first version I see no need to change the bio_vec
> representation as part of this project,

Right

> but at the same time the bio_vec representation causes problems for
> other reasons. So I want to change it anyway.

I don't feel competent in this area, so I'm not sure what this will
be. I was hoping to come with some data and benchmarks so we can
consider options. The appeal of a smaller long term memory footprint
for the RDMA use case is interesting enough to look at.

Jason
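
For illustration only, the 8 byte case drafted above could be packed
roughly as below. This is a sketch under stated assumptions, not the
proposed implementation: only the field widths (2/2/28/32) come from
the draft. The rl_* names, the tag value 0, the position of each field
inside the 64-bit word, and treating length as a byte count are all
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants; only the widths are taken from the draft. */
#define RL_ENC_8B     0u   /* assumed tag value for 'encoding == 8 bytes' */
#define RL_ADDR_SHIFT 12   /* address is stored as address >> 12 */
#define RL_LEN_BITS   28
#define RL_FLAG_BITS  2

/*
 * Assumed bit layout of the 64-bit entry:
 *   [63:62] encoding tag
 *   [61:60] flags
 *   [59:32] length (assumed to be in bytes)
 *   [31:0]  address >> 12
 */

/* True if the range has zero bits in the right places for 8 bytes. */
static inline bool rl_fits_8b(uint64_t addr, uint64_t len, unsigned int flags)
{
	return (addr & ((1ull << RL_ADDR_SHIFT) - 1)) == 0 &&
	       (addr >> RL_ADDR_SHIFT) <= UINT32_MAX &&
	       len < (1ull << RL_LEN_BITS) &&
	       flags < (1u << RL_FLAG_BITS);
}

/* Pack a range that is already known to fit the 8 byte encoding. */
static inline uint64_t rl_pack_8b(uint64_t addr, uint64_t len,
				  unsigned int flags)
{
	assert(rl_fits_8b(addr, len, flags));
	return ((uint64_t)RL_ENC_8B << 62) |
	       ((uint64_t)flags << 60) |
	       (len << 32) |
	       (addr >> RL_ADDR_SHIFT);
}

/* Recover address, length and flags from an 8 byte entry. */
static inline void rl_unpack_8b(uint64_t entry, uint64_t *addr,
				uint64_t *len, unsigned int *flags)
{
	*flags = (unsigned int)((entry >> 60) & ((1u << RL_FLAG_BITS) - 1));
	*len = (entry >> 32) & ((1ull << RL_LEN_BITS) - 1);
	*addr = (entry & 0xffffffffull) << RL_ADDR_SHIFT;
}
```

Under these assumptions a compressor would call rl_fits_8b() first and
fall back to one of the larger encodings when it returns false, for
example when the address is not 4K aligned or the length needs more
than 28 bits.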