On 9/24/21 8:43 PM, Vivek Goyal wrote: > On Fri, Sep 24, 2021 at 11:06:07AM +0800, JeffleXu wrote: >> >> >> On 9/24/21 9:02 AM, Vivek Goyal wrote: >>> On Fri, Sep 24, 2021 at 08:26:18AM +1000, Dave Chinner wrote: >>>> On Thu, Sep 23, 2021 at 03:02:41PM -0400, Vivek Goyal wrote: >>>>> On Thu, Sep 23, 2021 at 05:25:23PM +0800, Jeffle Xu wrote: >>>>>> We add 'always', 'never', and 'inode' (default). '-o dax' continues to >>>>>> operate the same which is equivalent to 'always'. To be consistemt with >>>>>> ext4/xfs's tri-state mount option, when neither '-o dax' nor '-o dax=' >>>>>> option is specified, the default behaviour is equal to 'inode'. >>>>> >>>>> So will "-o dax=inode" be used for per file DAX where dax mode comes >>>>> from server? >>>>> >>>>> I think we discussed this. It will be better to leave "-o dax=inode" >>>>> alone. It should be used when we are reading dax status from file >>>>> attr (like ext4 and xfs). >>>>> >>>>> And probably create separate option say "-o dax=server" where server >>>>> specifies which inode should use dax. >>>> >>>> That seems like a poor idea to me. >>>> >>>> The server side already controls what the client side does by >>>> controlling the inode attributes that the client side sees. That >>>> is, if the server is going to specify whether the client side data >>>> access is going to use dax, then the server presents the client with >>>> an inode that has the DAX attribute flag set on it. >>> >>> Hi Dave, >>> >>> Currently in fuse/virtiofs, DAX is compltely controlled by client. Server >>> has no say in it. If client is mounted with "-o dax", dax is enabled on >>> all files otherwise dax is disabled on all files. One could think of >>> implementing an option on server so that server could deny mmap() >>> requests that come from client, but so far nobody asked for such >>> an option on server side. >>> >>> When you say "inode that has DAX attribute flag set on it", are you >>> referring to "S_DAX (in inode->i_flags)" or persistent attr >>> "FS_XFLAG_DAX"? >>> >>> As of now S_DAX on client side inode is set by fuse client whenever >>> client mounted filesystem with "-o dax". And I think you are assuming >>> that DAX attribute of inode is already coming from server. That's not >>> the case. In fact that seems to be the proposal. Provide capability >>> so that server can specify which inode should be using DAX and which >>> inode should not be. >>> >>>> >>>> In that case, turning off dax on the guest side should be >>>> communicated to the fuse server so the server turns off the DAX flag >>>> on the server side iff server side policy allows it. >>> >>> Not sure what do you mean by server turns of DAX flag based on client >>> turning off DAX. Server does not have to do anything. If client decides >>> to not use DAX (in guest), it will not send FUSE_SETUPMAPPING requests >>> to server and that's it. >>> >>>> When the guest >>>> then purges it's local inode and reloads it from the server then it >>>> will get an inode with the DAX flag set according to server side >>>> policy. >>> >>> So right now we don't have a mechanism for server to specify DAX flag. >>> And that's what this patch series is trying to do. >>> >>>> >>>> Hence if the server side is configured with "dax=always" or >>>> dax="never" semantics it means the client side inode flag state >>>> cannot control DAX mode. That means, regardless of the client side >>>> mount options, DAX is operating under always or never policy, >>> >>> Hmm..., When you say "server side is configured with "dax=always", >>> do you mean shared directory on host is mounted with "-o dax=always", >>> or you mean some virtiofs server specific option which enables >>> dax on all inodes from server side. >>> >>> In general, DAX on host and DAX inside guest are completely independent. >>> Host filesystem could very well be mounted with dax or without dax and >>> that has no affect on guests's capability to be able to enable DAX or >>> not. >> >> Hi Dave, I think you are referring to "shared directory on host is >> mounted with "-o dax=always"" when you are saying "server side is >> configured with "dax=always". And just as Vivek said, there's no >> necessary relationship between the DAX mode in host and that in guest, >> technically. >> >> >>> >>>> enforced by the server side by direct control of the client inode >>>> DAX attribute flag. If dax=inode is in use on both sides, the the >>>> server honours the requests of the client to set/clear the inode >>>> flags and presents the inode flag according to the state the client >>>> side has requested. >>>> >>>> This policy state probably should be communicated to >>>> the fuse client from the server at mount time so policy conflicts >>>> can be be resolved at mount time (e.g. reject mount if policy >>>> conflict occurs, default to guest overrides server or vice versa, >>>> etc). This then means that that the client side mount policies will >>>> default to server side policy when they set "dax=inode" but also >>>> provide a local override for always or never local behaviour. >>>> >>>> Hence, AFAICT, there is no need for a 'dax=server' option - this >>>> seems to be exactly what 'dax=inode' behaviour means on the client >>>> side - it behaves according to how the server side propagates the >>>> DAX attribute to the client for each inode. >>> >>> Ok. So "-o dax=inode" in fuse will have a different meaning as opposed >>> to ext4/xfs. This will mean that server will pass DAX state of inode >>> when inode is instantiated and client should honor that. >>> >>> But what about FS_XFLAG_DAX flag then. Say host file system >>> does support this att and fuse/virtiofs allows querying and >>> setting this attribute (I don't think it is allowed now). So >>> will we not create a future conflict where in case of fuse/virtiofs >>> "-o dax=inode" means something different and it does look at >>> FS_XFLAG_DAX file attr. >>> >>>> >>>>> Otherwise it will be very confusing. People familiar with "-o dax=inode" >>>>> on ext4/xfs will expect file attr to work and that's not what we >>>>> are implementing, IIUC. >>>> >>>> The dax mount option behaviour is already confusing enough without >>>> adding yet another weird, poorly documented, easily misunderstood >>>> mode that behaves subtly different to existing modes. >>>> >>>> Please try to make the virtiofs behaviour compatible with existing >>>> modes - it's not that hard to make the client dax=inode behaviour be >>>> controlled by the server side without any special client side mount >>>> modes. >>> >>> Given I want to keep the option of similar behavior for "dax=inode" >>> across ext4/xfs and virtiofs, I suggested "dax=server". Because I >>> assumed that "dax=inode" means that dax is per inode property AND >>> this per inode property is specified by persistent file attr >>> FS_XFLAG_DAX. >>> >>> But fuse/virtiofs will not be specifying dax property of inode using >>> FS_XFLAG_DAX (atleast as of now). And server will set DAX property >>> using some bit in protocol. >>> >>> So these seem little different. If we use "dax=inode" for server >>> specifying DAX property of inode, then in future if client can >>> query/set FS_XFLAG_DAX on inode, it will be a problem. There will >>> be a conflict. >>> >>> Use case I was imagining was, say on host, user might set FS_XFLAG_DAX >>> attr on relevant files and then mount virtiofs in guest with >>> "-o dax=inode". ... >> >> Yes this will be a real using case, where users could specify which >> files should be DAX enabled by setting FS_XFLAG_DAX attr **on host**. In >> this case, the FS_XFLAG_DAX attr could still be conveyed by >> FUSE_ATTR_DAX (introduced in this patch set) in FUSE_LOOKUP reply. Then >> extra option may be added to fuse daemon, e.g., '-o policy=server' for >> getting DAX attr according to fuse daemon's own policy, and '-o >> policy=flag' for querying persistent FS_XFLAG_DAX attr of the host inode. >> >> And thus 'dax=server' mount option inside guest could be omitted and >> replaced by 'policy=server' option on the fuse daemon side. In this >> case, the fuse kernel module could be as simple as possible, while all >> other strategies could be implemented on the fuse daemon side. >> >> >>> ... Guest will query state of FS_XFLAG_DAX on inode >>> and enable DAX accordingly. (And server is not necessarily playing >>> an active role in determining which files should use DAX). >> >> It will be less efficient for guest to initiate another FUSE_IOCTL >> request to query if FS_XFLAG_DAX attr is on the inode. I think >> FUSE_ATTR_DAX flag in FUSE_LOOKUP reply shall be adequate for conveying >> DAX related attr. > > This is a good point. In case of fuse, for querying FS_XFLAG_DAX, there > will be extra round trip for FUSE_IOCTL message. It will be better if > server could have an option which enables checking FS_XFLAG_DAX and > respond with FUSE_ATTR_DAX flag in FUSE_LOOKUP message. > > Ok, so for the case of fuse, "-o dax=inode" can just mean that enablement > of DAX is per inode. But which files should use DAX is determined by > server. And client will not have a say in this. > > And server could have dax options (as you mentioned) to determine whether > it should query FS_XFLAG_DAX to determine dax status of file or it > could implement its own policy (say size based) to determine which inodes > should use DAX. We probably will have document this little deviation > in behavior in Documentation/filesystem/dax.rst Agreed. > > I guess this sounds reasonable. There is little deviation from ext4/xfs > but they are local filesystems to exact match might not make lot of > sense in fuse context. >> >> >> However it is indeed an issue when it comes to the consistency of >> FS_XFLAG_DAX flag with ext4/xfs. There are following two semantics when >> using per-file DAX for ext4/xfs: >> >> 1. user sets FS_XFLAG_DAX attribute on inode, for which per-file DAX >> shall be enabled, and the attribute will be stored persistently on the >> inode; > > So this one we can still support as long as fuse supports ioctl and > query and setting FS_XFLAG_DAX, right? Yes, if it is really needed later. > >> 2. user can query the FS_XFLAG_DAX attribute to see if DAX is enabled >> for specific file, when it's in 'dax=inode' mode. > > Documentation/filesystems/dax.rst says. > > 1. There exists an in-kernel file access mode flag `S_DAX` that corresponds to > the statx flag `STATX_ATTR_DAX`. See the manpage for statx(2) for details > about this access mode. > > So my understanding is that if user wants to know if DAX is currently > being used on a file, they should call statx() and check for > STATX_ATTR_DAX flag instead? And this will be set based on S_DAX status > of inode. > > If this is correct, then it is not an issue. We should be able to > return STATX_ATTR_DAX based on S_DAX and user should be able to > figure out which files are using DAX. Just that it will not necessarily > match FS_XFLAG_DAX because server might be configured to use its own > custom policy to determine DAX status of a file/inode. Good catch! You are right. Then semantic 2) can be matched without any modification. > >> >> >> For semantic 1), I'm quite doubted if there's necessity and possibility >> of this using case for fuse so far, given admin could already specify >> which files should be DAX enabled on the host side. If this semantic >> indeed shall be implemented later, then I'm afraid 'policy=server' >> option shall always be specified with fuse daemon, that is, fuse daemon >> shall always query the FS_XFLAG_DAX attr of the inode on the host. >> >> For semantic 2), could we return a fake FS_XFLAG_DAX once the server >> shows that this file should be DAX enabled? > > I guess we don't have to return fake FS_XFLAG_DAX at all. If user space > is calling statx(), we should just set STATX_ATTR_DAX if file/inode is > using DAX. > Yes. Thanks! I will send a new version later based on these discussions. -- Thanks, Jeffle