Re: [PATCH] fix bitmap corruption on close_range() with CLOSE_RANGE_UNSHARE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 05, 2024 at 05:04:05PM -0700, Linus Torvalds wrote:
> On Mon, 5 Aug 2024 at 16:45, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > So... do we really need that indirect?  The problem would be
> > seeing ->max_fds update before that of the array pointers.
> 
> The reason is simply so that we can have one single allocation.
> 
> In fact, quite often, it's zero allocations when you can use the
> 'struct fdtable fdtab' that is embedded in 'struct files_struct'.

More to the point, we use arrays embedded into files_struct.

>But
> the 'struct fdtable' was a convenient way to allocate all those
> bitmaps _and_ the 'struct file *' array all together.

I don't think so - IIRC, it was introduced when we added RCU'd
file lookup.  Let me check...  Yep; badf16621c1f "[PATCH] files:
break up files struct", with RCU support as rationale.  Followed
by ab2af1f50050 "[PATCH] files: files struct with RCU".

Before those commits ->max_fds and ->fd used to live in
files_struct and fget() (OK, fcheck_files()) had been taking
->files_lock, so that concurrent expand_files() would not
screw us over.

The problem was not just the need to delay freeing old ->fd
array; that could be dealt with easily enough.  Think what
would've happened if fcheck_files() ended up fetching
new value of ->max_fds and old value of ->fd, which pointed
to pre-expansion array.  Indirection allowed to update
both in one store.

The thing is, ->max_fds for successive ->fdt is monotonously
increasing.  So a lockless reader seeing the old size is
fine with the new table - we just need to prevent the opposite.

Would rcu_assign_pointer of pointers + smp_store_release of max_fds on expand
(all under ->files_lock, etc.) paired with
smp_load_acquire of max_fds + rcu_dereference of ->fd on file lookup side
be enough, or do we need an explicit smp_wmb/smp_rmb in there?

> And yes, I think it's entirely a historical artifact of how that thing
> grew to be. Long long long ago there was no secondary allocation at
> all, and MAX_OPEN was fixed at 20.

Yes - but that had changed way before 2005 when those patches went in.
Separate allocation of bitmaps is 2.3.12pre7 and separate allocation of
->fd is even older - 2.1.90pre1.  NR_OPEN had been 1024 at that point;
earlier history is
	0.97: 20 -> 32
	0.98.4: 32 -> 256
	2.1.26: 256 -> 1024




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux