Re: [PATCH] fs: clear close-on-exec flag as part of put_unused_fd()

On Wed, Dec 11, 2013 at 11:36:35PM +0100, Mateusz Guzik wrote:

> From my reading this will break at least the following:
> fd = open(..., .. | O_CLOEXEC);
> dup2(whatever, fd);
> 
> now fd has O_CLOEXEC even though it should not

Moreover, consider fork() done by a thread that shares its descriptor
table with somebody else.  Suppose it happens in the middle of an
open() with O_CLOEXEC being done by another thread.  We copy the descriptor
table after the descriptor has been reserved (and marked close-on-exec),
but before a reference to struct file has actually been inserted there.
This code
        for (i = open_files; i != 0; i--) {
                struct file *f = *old_fds++;
                if (f) {
                        get_file(f);
                } else {
                        /*
                         * The fd may be claimed in the fd bitmap but not yet
                         * instantiated in the files array if a sibling thread
                         * is partway through open().  So make sure that this
                         * fd is available to the new process.
                         */
                        __clear_open_fd(open_files - i, new_fdt);
                }
                rcu_assign_pointer(*new_fds++, f);
        }
        spin_unlock(&oldf->file_lock);
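in dup_fd() will clear the corresponding bit in open_fds, leaving close_on_exec
alone.  Currently that's fine (we will overwrite whatever had been in
close_on_exec when we reserve that descriptor again), but AFAICS with this
patch it will break: reservation would no longer rewrite that bit, so the
stale close-on-exec flag would stick to whatever the child opens there next.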
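
The scenario can be walked through with a toy userspace model.  Everything
below is a sketch under stated assumptions, not kernel code: the toy_*
names and the fixed-size table are invented for illustration, and the
"patched" mode encodes one reading of the proposal (reservation only ever
sets the close-on-exec bit and relies on the free path to have cleared it).

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NFDS 8

/* Toy stand-in for a descriptor table: two bitmaps plus the file array. */
struct toy_fdtable {
        bool open_fds[TOY_NFDS];
        bool close_on_exec[TOY_NFDS];
        const void *fd[TOY_NFDS];
};

/*
 * Reserve the lowest free slot.  In the current scheme the close-on-exec
 * bit is always written; in the assumed patched scheme it is only set,
 * never cleared, at reservation time.
 */
static int toy_alloc_fd(struct toy_fdtable *t, bool cloexec, bool patched)
{
        for (int i = 0; i < TOY_NFDS; i++) {
                if (t->open_fds[i])
                        continue;
                t->open_fds[i] = true;
                if (cloexec)
                        t->close_on_exec[i] = true;
                else if (!patched)
                        t->close_on_exec[i] = false;
                return i;
        }
        return -1;
}

/*
 * Copy the table as on fork(): the bitmaps are copied as they are, then
 * slots reserved without an installed file are released in open_fds
 * only.  close_on_exec is left alone, as in the loop quoted above.
 */
static void toy_dup_fd(struct toy_fdtable *newt, const struct toy_fdtable *oldt)
{
        *newt = *oldt;
        for (int i = 0; i < TOY_NFDS; i++)
                if (newt->fd[i] == NULL)
                        newt->open_fds[i] = false;
}

/* Returns true if the child ends up with a close-on-exec bit it never asked for. */
static bool toy_stale_cloexec_after_fork(bool patched)
{
        struct toy_fdtable parent = {0};
        struct toy_fdtable child;
        int reserved, fd;

        /* Sibling thread is midway through open(O_CLOEXEC): slot reserved, no file yet. */
        reserved = toy_alloc_fd(&parent, true, patched);

        /* fork() happens right here. */
        toy_dup_fd(&child, &parent);

        /* The child opens something without O_CLOEXEC and gets the same slot. */
        fd = toy_alloc_fd(&child, false, patched);

        return fd == reserved && child.close_on_exec[fd];
}
```

In this model toy_stale_cloexec_after_fork(false) is false and
toy_stale_cloexec_after_fork(true) is true, which is the difference being
pointed at: the copy path never goes through the free path, so nothing
clears the bit.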

Sure, it can be fixed up (ditto with dup2(), etc.), but what's the point?
The result will require more subtle reasoning to prove correctness and will
be more prone to breakage.  Does that really yield visible performance
improvements that would be worth the extra complexity?  After all, you
trade some writes to close_on_exec on descriptor reservation for an
unconditional write on descriptor freeing; if anything, I would expect that
you'll get a minor _loss_ from that change, assuming the difference is
measurable in the first place...