On Mon, Feb 18, 2019 at 08:23:20PM -0800, Hugh Dickins wrote: > On Fri, 15 Feb 2019, Hugh Dickins wrote: > > On Thu, 14 Feb 2019, Darrick J. Wong wrote: > > > > On Mon, 11 Feb 2019 15:18:11 +0100 Matej Kupljen <matej.kupljen@xxxxxxxxx> wrote: > > > > > > > > > > it seems that when opening file on file system that is mounted on > > > > > tmpfs with the O_TMPFILE flag and using linkat call after that, it > > > > > uses 2 inodes instead of 1. > ... > > > > > > Heh, tmpfs and its weird behavior where each new link counts as a new > > > inode because "each new link needs a new dentry, pinning lowmem, and > > > tmpfs dentries cannot be pruned until they are unlinked." > > > > That's very much a peculiarity of tmpfs, so agreed: it's what I expect > > to be the cause, but I've not actually tracked it through and fixed yet. > ... > > > > > I /think/ the proper fix is to change shmem_link to decrement ifree only > > > if the inode has nonzero nlink, e.g. > > > > > > /* > > > * No ordinary (disk based) filesystem counts links as inodes; > > > * but each new link needs a new dentry, pinning lowmem, and > > > * tmpfs dentries cannot be pruned until they are unlinked. If > > > * we're linking an O_TMPFILE file into the tmpfs we can skip > > > * this because there's still only one link to the inode. > > > */ > > > if (inode->i_nlink > 0) { > > > ret = shmem_reserve_inode(inode->i_sb); > > > if (ret) > > > goto out; > > > } > > > > > > Says me who was crawling around poking at O_TMPFILE behavior all morning. > > > Not sure if that's right; what happens to the old dentry? > > Not sure what you mean by "what happens to the old dentry?" > But certainly the accounting feels a bit like a shell game, > and my attempts to explain it have not satisfied even me. > > The way I'm finding it helpful to think, is to imagine tmpfs' > count of inodes actually to be implemented as a count of dentries. > And the 1 for the last remaining goes away in the shmem_free_inode() > at the end of shmem_evict_inode(). Does that answer "what happens"? > > Since applying the patch, I have verified (watching "dentry" and > "shmem_inode_cache" in /proc/slabinfo) that doing Matej's sequence > repeatedly does not leak any "df -i" nor dentries nor inodes. > > > > > I'm relieved to see your "/think/" above and "Not sure" there :) > > Me too. It is so easy to get these counting things wrong, especially > > when distributed between the generic and the specific file system. > > > > I'm not going to attempt a pronouncement until I've had time to > > sink properly into it at the weekend, when I'll follow your guide > > and work it through - thanks a lot for getting this far, Darrick. > > I have now sunk into it, and sure that I agree with your patch, > filled out below (I happen to have changed "inode->i_nlink > 0" to > "inode->i_nlink" just out of some personal preference at the time). > One can argue that it's not technically quite the right place, but > it is the place where we can detect the condition without getting > into unnecessary further complications, and does the job well enough. > > May I change "Suggested-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>" > to your "Signed-off-by" before sending on to Andrew "From" you? That's fine with me! > Thanks! > Hugh > > [PATCH] tmpfs: fix link accounting when a tmpfile is linked in > > tmpfs has a peculiarity of accounting hard links as if they were separate > inodes: so that when the number of inodes is limited, as it is by default, > a user cannot soak up an unlimited amount of unreclaimable dcache memory > just by repeatedly linking a file. > > But when v3.11 added O_TMPFILE, and the ability to use linkat() on the fd, > we missed accommodating this new case in tmpfs: "df -i" shows that an > extra "inode" remains accounted after the file is unlinked and the fd > closed and the actual inode evicted. If a user repeatedly links tmpfiles > into a tmpfs, the limit will be hit (ENOSPC) even after they are deleted. > > Just skip the extra reservation from shmem_link() in this case: there's > a sense in which this first link of a tmpfile is then cheaper than a > hard link of another file, but the accounting works out, and there's > still good limiting, so no need to do anything more complicated. > > Fixes: f4e0c30c191 ("allow the temp files created by open() to be linked to") > Reported-by: Matej Kupljen <matej.kupljen@xxxxxxxxx> > Suggested-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> Or if you prefer: Reviewed-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx> --D > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx> > --- > > mm/shmem.c | 10 +++++++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > --- 5.0-rc7/mm/shmem.c 2019-01-06 19:15:45.764805103 -0800 > +++ linux/mm/shmem.c 2019-02-18 13:56:48.388032606 -0800 > @@ -2854,10 +2854,14 @@ static int shmem_link(struct dentry *old > * No ordinary (disk based) filesystem counts links as inodes; > * but each new link needs a new dentry, pinning lowmem, and > * tmpfs dentries cannot be pruned until they are unlinked. > + * But if an O_TMPFILE file is linked into the tmpfs, the > + * first link must skip that, to get the accounting right. > */ > - ret = shmem_reserve_inode(inode->i_sb); > - if (ret) > - goto out; > + if (inode->i_nlink) { > + ret = shmem_reserve_inode(inode->i_sb); > + if (ret) > + goto out; > + } > > dir->i_size += BOGO_DIRENT_SIZE; > inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);