> -----Original Message-----
> From: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Sent: Friday, November 8, 2024 5:55 PM
> To: Keith Busch <kbusch@xxxxxxxxxx>
> Cc: Christoph Hellwig <hch@xxxxxx>; Keith Busch <kbusch@xxxxxxxx>;
> linux-block@xxxxxxxxxxxxxxx; linux-nvme@xxxxxxxxxxxxxxxxxxx; linux-scsi@xxxxxxxxxxxxxxx;
> io-uring@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx; joshi.k@xxxxxxxxxxx;
> Javier Gonzalez <javier.gonz@xxxxxxxxxxx>; bvanassche@xxxxxxx
> Subject: Re: [PATCHv10 0/9] write hints with nvme fdp, scsi streams
>
> On Fri, Nov 08, 2024 at 08:51:31AM -0700, Keith Busch wrote:
> > On Fri, Nov 08, 2024 at 03:18:52PM +0100, Christoph Hellwig wrote:
> > > We're not really duplicating much. Writing sequential is pretty easy,
> > > and tracking reclaim units separately means you need another tracking
> > > data structure, and either that or the LBA one is always going to be
> > > badly fragmented if they aren't the same.
> >
> > You're getting fragmentation anyway, which is why you had to implement
> > gc. You're just shifting who gets to deal with it from the controller to
> > the host. The host is further from the media, so you're starting from a
> > disadvantage. The host gc implementation would have to be quite a bit
> > better to justify the link and memory usage necessary for the copies
> > (...queue a copy-offload discussion? oom?).
>
> But the filesystem knows which blocks are actually in use. Sending
> TRIM/DISCARD information to the drive at block-level granularity hasn't
> worked out so well in the past. So the drive is the one at a disadvantage
> because it has to copy blocks which aren't actually in use.

It is true that trim has not been great. I would say that at least enterprise SSDs have fixed this in general. For FDP, DSM Deallocate is respected, which provides a good "erase" interface to the host. It is true though that this is not properly described in the spec and we should fix it.

>
> I like the idea of using copy-offload though.
We have been iterating on the patches for years, but it is unfortunately one of those series that goes in circles forever. I don't think it is due to any specific problem, but mostly due to unaligned requests from the different folks reviewing. Last time I talked to Damien he asked me to send the patches again; we have not followed through due to lack of bandwidth. If there is interest, we can re-spin it...