Re: [PATCH v2 5/7] block: implement async discard as io_uring cmd

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Aug 22, 2024 at 02:07:16PM +0100, Pavel Begunkov wrote:
> > > Note, unlike ioctl(BLKDISCARD) with stronger guarantees against races,
> > > we only do a best effort attempt to invalidate page cache, and it can
> > > race with any writes and reads and leave page cache stale. It's the
> > > same kind of races we allow to direct writes.
> > 
> > Can you please write up a man page for this that clear documents the
> > expecvted semantics?
> 
> Do we have it documented anywhere how O_DIRECT writes interact
> with page cache, so I can refer to it?

I can't find a good writeup.  Adding Dave as he tends to do long
emails on topic like this so he might have one hiding somewhere.

> > GFP_KERNEL can often will block.  You'll probably want a GFP_NOWAIT
> > allocation here for the nowait case.
> 
> I can change it for clarity, but I don't think it's much of a concern
> since the read/write path and pretty sure a bunch of other places never
> cared about it. It does the main thing, propagating it down e.g. for
> tag allocation.

True, we're only doing the nowait allocation for larger data
structures.  Which is a bit odd indeed.

> I'd rather avoid calling bio_discard_limit() an extra time, it does
> too much stuff inside, when the expected case is a single bio and
> for multi-bio that overhead would really matter.

Compared to a memory allocation it's not really doing all the much.
In the long run we really should move splitting discard bios down
the stack like we do for normal I/O anyway.

> Maybe I should uniline blk_alloc_discard_bio() and dedup it with

uniline?  I read that as unіnline, but as it's not inline I don't
understand what you mean either.

> > > +#define BLOCK_URING_CMD_DISCARD			0
> > 
> > Is fs.h the reight place for this?
> 
> Arguable, but I can move it to io_uring, makes things simpler
> for me.

I would have expected a uapi/linux/blkdev.h for it (and I'm kinda
surprised we don't have that yet).

> 
> > Curious:  how to we deal with conflicting uring cmds on different
> > device and how do we probe for them?  The NVMe uring_cmds
> > use the ioctl-style _IO* encoding which at least helps a bit with
> > that and which seem like a good idea.  Maybe someone needs to write
> > up a few lose rules on uring commands?
> 
> My concern is that we're sacrificing compiler optimisations
> (well, jump tables are disabled IIRC) for something that doesn't even
> guarantee uniqueness. I'd like to see some degree of reflection,
> like user querying a file class in terms of what operations it
> supports, but that's beyond the scope of the series.

We can't guaranteed uniqueness, but between the class, the direction,
and the argument size we get a pretty good one.  There is a reason
pretty much all ioctls added in the last 25 years are using this scheme.





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux