RE: [PATCHv10 0/9] write hints with nvme fdp, scsi streams

> -----Original Message-----
> From: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Sent: Friday, November 8, 2024 5:55 PM
> To: Keith Busch <kbusch@xxxxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>; Keith Busch <kbusch@xxxxxxxx>; linux-
> block@xxxxxxxxxxxxxxx; linux-nvme@xxxxxxxxxxxxxxxxxxx; linux-scsi@xxxxxxxxxxxxxxx;
> io-uring@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx; joshi.k@xxxxxxxxxxx;
> Javier Gonzalez <javier.gonz@xxxxxxxxxxx>; bvanassche@xxxxxxx
> Subject: Re: [PATCHv10 0/9] write hints with nvme fdp, scsi streams
> 
> On Fri, Nov 08, 2024 at 08:51:31AM -0700, Keith Busch wrote:
> > On Fri, Nov 08, 2024 at 03:18:52PM +0100, Christoph Hellwig wrote:
> > > We're not really duplicating much.  Writing sequential is pretty easy,
> > > and tracking reclaim units separately means you need another tracking
> > > data structure, and either that or the LBA one is always going to be
> > > badly fragmented if they aren't the same.
> >
> > You're getting fragmentation anyway, which is why you had to implement
> > gc. You're just shifting who gets to deal with it from the controller to
> > the host. The host is further from the media, so you're starting from a
> > disadvantage. The host gc implementation would have to be quite a bit
> > better to justify the link and memory usage necessary for the copies
> > (...queue a copy-offload discussion? oom?).
> 
> But the filesystem knows which blocks are actually in use.  Sending
> TRIM/DISCARD information to the drive at block-level granularity hasn't
> worked out so well in the past.  So the drive is the one at a disadvantage
> because it has to copy blocks which aren't actually in use.

It is true that trim has not been great. I would say that enterprise SSDs,
at least, have largely fixed this. For FDP, DSM Deallocate is respected, which
provides a good "erase" interface to the host.

It is true, though, that this is not properly described in the spec, and we
should fix it.

> 
> I like the idea of using copy-offload though.

We have been iterating on the patches for years, but it is unfortunately
one of those series that goes in circles forever. I don't think it is due
to any specific problem, but mostly to unaligned requests from the
different folks reviewing. Last time I talked to Damien he asked me to
send the patches again; we have not followed through due to bandwidth.

If there is interest, we can re-spin the series...




