On 1/30/2025 7:58 PM, Theodore Ts'o wrote:
> On Thu, Jan 30, 2025 at 02:45:45PM +0530, Kanchan Joshi wrote:
>> I would like to propose a discussion on employing checksum offload in
>> filesystems.
>> It would be good to co-locate this with the storage track, as the
>> finer details lie in the block layer and NVMe driver.
>
> I wouldn't call this "file system offload". Enabling the data
> integrity feature or whatever you want to call it is really a block
> layer issue. The file system doesn't need to get involved at all.
> Indeed, looking at the patch, the only reason why the file system is
> getting involved is because (a) you've added a mount option, and (b)
> the mount option flips a bit in the bio that gets sent to the block
> layer.

The mount option was only for the RFC. If everything else gets sorted
out, it becomes a matter of choosing whatever interface Btrfs prefers.

> But this could also be done by adding a queue specific flag, at which
> point the file system doesn't need to be involved at all. Why would
> you want to enable the data integrity feature on a per block I/O
> basis, if the device supports it?

Because I thought users (filesystems) would prefer the flexibility.
Per-I/O control lets them choose different policies for, say, data and
metadata.

Let me outline the differences.

Block-layer auto integrity:
- always attaches an integrity payload to each I/O.
- computes the checksum/reftag for each I/O. This part does not do
  justice to the label 'offload'.

The patches make auto-integrity:
- attach the integrity buffer only if the device configuration demands it.
- never compute the checksum/reftag at the block layer.
- keep the offload choice at the per-I/O level.

The Btrfs checksum tree is created only for data blocks, so the patches
apply the flag (REQ_INTEGRITY_OFFLOAD) only to those. Metadata blocks,
which may be more important, continue to get checksummed at two levels
(block and device).
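
To make the data/metadata split concrete, here is a minimal sketch of
how a filesystem could set the per-I/O flag only on data bios. This is
not lifted from the RFC patches: the helper name and the is_data
parameter are illustrative; only REQ_INTEGRITY_OFFLOAD comes from the
proposal.

/*
 * Minimal sketch, assuming the RFC's per-I/O offload flag.
 * The helper and its is_data parameter are illustrative only.
 */
static void btrfs_submit_bio_sketch(struct bio *bio, bool is_data)
{
        if (is_data)
                /* Data blocks: ask for checksum offload on this I/O. */
                bio->bi_opf |= REQ_INTEGRITY_OFFLOAD;

        /* Metadata bios go down without the flag and keep both levels. */
        submit_bio(bio);
}

That per-bio distinction is exactly what a purely queue-level flag
would not be able to express.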