On Fri, Apr 03, 2020 at 11:18:43AM -0700, Ira Weiny wrote: > On Fri, Apr 03, 2020 at 07:03:38PM +0200, Jan Kara wrote: > > On Fri 03-04-20 08:48:29, Ira Weiny wrote: > > > On Fri, Apr 03, 2020 at 09:27:31AM +0200, Christoph Hellwig wrote: > > > > On Thu, Apr 02, 2020 at 01:55:19PM -0700, Ira Weiny wrote: > > > > > > I'd just return an error for that case, don't play silly games like > > > > > > evicting the inode. > > > > > > > > > > I think I agree with Christoph here. But I want to clarify. I was heading in > > > > > a direction of failing the ioctl completely. But we could have the flag change > > > > > with an appropriate error which could let the user know the change has been > > > > > delayed. > > > > > > > > > > But I don't immediately see what error code is appropriate for such an > > > > > indication. Candidates I can envision: > > > > > > > > > > EAGAIN > > > > > ERESTART > > > > > EUSERS > > > > > EINPROGRESS > > > > > > > > > > None are perfect but I'm leaning toward EINPROGRESS. > > > > > > > > I really, really dislike that idea. The whole point of not forcing > > > > evictions is to make it clear - no this inode is "busy" you can't > > > > do that. A reasonably smart application can try to evict itself. > > > > > > I don't understand. What Darrick proposed would never need any > > > evictions. If the file has blocks allocated the FS_XFLAG_DAX flag can > > > not be changed. So I don't see what good eviction would do at all. > > > > I guess there's some confusion here (may well be than on my side). Darrick > > propose that we can switch FS_XFLAG_DAX only when file has no blocks > > allocated - fine by me. But that still does not mean than we can switch > > S_DAX immediately, does it? Because that would still mean we need to switch > > aops on living inode and that's ... difficult and Christoph didn't want to > > clutter the code with it. > > > > So I've understood Darrick's proposal as: Just switch FS_XFLAG_DAX flag, > > S_DAX flag will magically switch when inode gets evicted and the inode gets > > reloaded from the disk again. Did I misunderstand anything? > > > > And my thinking was that this is surprising behavior for the user and so it > > will likely generate lots of bug reports along the lines of "DAX inode flag > > does not work!". So I was pondering how to make the behavior less > > confusing... The ioctl I've suggested was just a poor attempt at that. > > Ok but then I don't understand Christophs comment to "just return an error for > that case"? Which case? > > > > > > > But returning an error and doing a lazy change anyway is straight from > > > > the playbook for arcane and confusing API designs. > > > > > > Jan countered with a proposal that the FS_XFLAG_DAX does change with > > > blocks allocated. But that S_DAX would change on eviction. Adding that > > > some eviction ioctl could be added. > > > > No, I didn't mean that we can change FS_XFLAG_DAX with blocks allocated. I > > was still speaking about the case without blocks allocated. > > Ah ok good point. But again what 'error' do we return when FS_XFLAG_DAX > changed but S_DAX did not? > > > > > > You then proposed just returning an error for that case. (This lead me to > > > believe that you were ok with an eviction based change of S_DAX.) > > > > > > So I agreed that changing S_DAX could be delayed until an explicit eviction. > > > But, to aid the 'smart application', a different error code could be used to > > > indicate that the FS_XFLAG_DAX had been changed but that until that explicit > > > eviction occurs S_DAX would remain. > > > > > > So I don't fully follow what you mean by 'lazy change'? > > > > > > Do you still really, really dislike an explicit eviction method for changing > > > the S_DAX flag? > > > > > > If FS_XFLAG_DAX can never be changed on a file with blocks allocated and the > > > user wants to change the mode of operations on their 'data'; they would have to > > > create a new file with the proper setting and move the data there. For example > > > copy the file into a directory marked FS_XFLAG_DAX==true? > > > > > > I'm ok with either interface as I think both could be clear if documented. > > > > I agree that what Darrick suggested is technically easily doable and can be > > documented. But it is not natural behavior (i.e., different than all inode > > flags we have) and we know how careful people are when reading > > documentation... > > > > Ok For 5.8 why don't we not allow FS_XFLAG_DAX to be changed on files _at_ > _all_... > > In summary: > > - Applications must call statx to discover the current S_DAX state. Ok. > - There exists an advisory file inode flag FS_XFLAG_DAX that is set based on > the parent directory FS_XFLAG_DAX inode flag. (There is no way to change > this flag after file creation.) > > If FS_XFLAG_DAX is set and the fs is on pmem then it will enable S_DAX at > inode load time; if FS_XFLAG_DAX is not set, it will not enable S_DAX. > Unless overridden... Ok, fine with me. :) > - There exists a dax= mount option. > > "-o dax=off" means "never set S_DAX, ignore FS_XFLAG_DAX" > "-o nodax" means "dax=off" I surveyed the three fses that support dax and found that none of the filesystems actually have a 'nodax' flag. Now would be the time not to add such a thing, and make people specify dax=off instead. It would be handy if we could have a single fsparam_enum for figuring out the dax mount options. > "-o dax=always" means "always set S_DAX (at least on pmem), ignore FS_XFLAG_DAX" > "-o dax" by itself means "dax=always" > "-o dax=iflag" means "follow FS_XFLAG_DAX" and is the default > > - There exists an advisory directory inode flag FS_XFLAG_DAX that can be > changed at any time. The flag state is copied into any files or > subdirectories when they are created within that directory. If programs > require file access runs in S_DAX mode, they'll have to create those files "...they must create..." > inside a directory with FS_XFLAG_DAX set, or mount the fs with an > appropriate dax mount option. Otherwise seems ok to me. --D > > > ??? > > Ira >