Re: fcntl(fd, F_SETFL, O_DIRECT) succeeds followed by EINVAL in write

On Wed, Jan 26, 2022 at 2:02 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Wed, Jan 26, 2022 at 09:05:48AM +1100, Daniel Black wrote:
>
> O_RDONLY is defined to be 0, so don't worry about it.

Thanks.

> > The kernel code in setfl seems to want to return EINVAL for
> > filesystems without a direct_IO structure member assigned,
> >
> > A noop_direct_IO seems to be used frequently to just return EINVAL
> > (like cifs_direct_io).
>
> Sorry for the confusion.  You've caught us mid-transition.  Eventually,
> ->direct_IO will be deleted, but for now it signifies whether or not the
> filesystem supports O_DIRECT, even though it's not used (except in some
> scenarios you don't care about).

Is it going to be reasonable to expect fcntl(fd, F_SETFL, O_DIRECT) to
return EINVAL if O_DIRECT isn't supported?
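
To make the question concrete, here is a minimal sketch of the check I have in mind. The helper name try_enable_o_direct is hypothetical, and as the subject of this thread shows, a successful fcntl() is not proof that a later write() will succeed, so treat it as a first-pass probe only.

```c
#define _GNU_SOURCE   /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Hypothetical helper: try to turn on O_DIRECT for an already-open fd.
 * Returns 0 on success, or a negative errno value (e.g. -EINVAL) if
 * either fcntl() call fails. */
int try_enable_o_direct(int fd)
{
    assert(fd >= 0);

    /* Read the current status flags so the others are preserved. */
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -errno;

    if (fcntl(fd, F_SETFL, flags | O_DIRECT) == -1)
        return -errno;

    return 0;
}
```

A caller would then branch on the result, e.g. treat -EINVAL as "stay in buffered mode" and anything else as a real error.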

> > Lastly on the list of peculiar behaviors here, is tmpfs will return
> > EINVAL from the fcntl call however it works fine with O_DIRECT
> > (https://bugs.mysql.com/bug.php?id=26662). MySQL (and MariaDB still
> > has the same code) that currently ignores EINVAL, but I'm willing to
> > make that code better.
>
> Out of interest, what behaviour do you _want_ from doing O_DIRECT
> to tmpfs?  O_DIRECT is defined to bypass the page cache, but tmpfs
> only stores data in the page cache.  So what do you intend to happen?

It occurs to me that, because EINVAL is returned, it's just operating in
non-O_DIRECT mode.

Someone probably added this (ignoring the EINVAL) because too much
MySQL/MariaDB testing is done on tmpfs and nobody wanted to adjust the
test suite to handle O_DIRECT failures everywhere. I don't think there
was any kernel expectation there.

That's my problem, it seems. I'll see what I can do to get back to using
real filesystems more.

> > Does a userspace have to fully try to write to an O_DIRECT file, note
> > the failure, reopen or clear O_DIRECT, and resubmit to use O_DIRECT?
> >
> > While I see that the success/failure of a O_DIRECT read/write can be
> > related to the capabilities of the underlying block device depending
> > on offset/length of the read/write, are there other traps?
>
> It also must be aligned in memory,

yep, knew this one.
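
For reference, a minimal sketch of the memory-alignment side, assuming the caller already knows the alignment it needs. The helper names are hypothetical and the 4096 in the usage note is only an illustrative value; the real requirement depends on the filesystem and the underlying block device.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate a zeroed buffer whose address is a
 * multiple of `align`. posix_memalign() requires `align` to be a power
 * of two and a multiple of sizeof(void *). Returns NULL on failure;
 * release the buffer with free(). */
void *alloc_dio_buffer(size_t align, size_t len)
{
    void *buf = NULL;

    if (posix_memalign(&buf, align, len) != 0)
        return NULL;

    memset(buf, 0, len);
    return buf;
}

/* Hypothetical helper: returns 1 if `buf` is aligned to `align` bytes. */
int is_dio_aligned(const void *buf, size_t align)
{
    assert(align != 0);
    return ((uintptr_t)buf % align) == 0;
}
```

Usage would be along the lines of `void *buf = alloc_dio_buffer(4096, 8192);`, with the file offset and length of each I/O aligned separately by the caller.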

> but I'm not quite sure what
> limitations cifs imposes.
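
For what it's worth, the fallback described in my question above could look roughly like this. It is a sketch of one possible userspace workaround, not a statement of what the kernel requires; pwrite_with_dio_fallback is a hypothetical name, and short writes are not handled.

```c
#define _GNU_SOURCE   /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write using whatever flags fd currently has. If
 * the write fails with EINVAL and O_DIRECT is set, clear O_DIRECT and
 * retry once as a buffered write.
 * Returns the number of bytes written, or -1 with errno set. */
ssize_t pwrite_with_dio_fallback(int fd, const void *buf, size_t len, off_t off)
{
    assert(fd >= 0);

    ssize_t n = pwrite(fd, buf, len, off);
    if (n >= 0 || errno != EINVAL)
        return n;

    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;              /* errno set by fcntl() */
    if (!(flags & O_DIRECT)) {
        errno = EINVAL;         /* EINVAL was not caused by O_DIRECT */
        return -1;
    }

    /* Drop O_DIRECT and resubmit the same request through the page cache. */
    if (fcntl(fd, F_SETFL, flags & ~O_DIRECT) == -1)
        return -1;

    return pwrite(fd, buf, len, off);
}
```

The obvious downside is that the descriptor silently stays in buffered mode afterwards, so a real caller would probably want to log the downgrade.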


