Re: [PATCH v2 04/16] overlayfs: Document critical override_creds() operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 24, 2024 at 06:13:39PM GMT, Vinicius Costa Gomes wrote:
> Vinicius Costa Gomes <vinicius.gomes@xxxxxxxxx> writes:
> 
> > Miklos Szeredi <miklos@xxxxxxxxxx> writes:
> >
> >> On Thu, 22 Aug 2024 at 03:25, Vinicius Costa Gomes
> >> <vinicius.gomes@xxxxxxxxx> wrote:
> >>>
> >>> Add a comment to these operations that cannot use the _light version
> >>> of override_creds()/revert_creds(), because during the critical
> >>> section the struct cred .usage counter might be modified.
> >>
> >> Why is it a problem if the usage counter is modified?  Why is the
> >> counter modified in each of these cases?
> >>
> >
> > Working on getting some logs from the crash that I get when I convert
> > the remaining cases to use the _light() functions.
> >
> 
> See the log below.
> 
> > Perhaps I was wrong on my interpretation of the crash.
> >
> 
> What I am seeing is that ovl_setup_cred_for_create() has a "side
> effect", it creates another set of credentials, runs the security hooks
> with this new credentials, and the side effect is that when it returns,
> by design, 'current->cred' is this new credentials (a third set of
> credentials).

Well yes, during ovl_setup_cred_for_create() the fs{g,u}id needs to be
overwritten. But I'm stil confused what the exact problem is as it was
always clear that ovl_setup_cred_for_create() wouldn't be ported to
light variants.

/me looks...

> 
> And this implies that refcounting for this is somewhat tricky, as said
> in commit d0e13f5bbe4b ("ovl: fix uid/gid when creating over whiteout").
> 
> I see two ways forward:
> 
> 1. Keep using the non _light() versions in functions that call
>    ovl_setup_cred_for_create().
> 2. Change ovl_setup_cred_for_create() so it doesn't drop the "extra"
>    refcount.
> 
> I went with (1), and it still sounds to me like the best way, but I
> agree that my explanation was not good enough, will add the information
> I just learned to the commit message and to the code.
> 
> Do you see another way forward? Or do you think that I should go with
> (2)?

... ok, I understand. Say we have:

ovl_create_tmpfile()
/* current->cred == ovl->creator_cred without refcount bump /*
old_cred = ovl_override_creds_light()
-> ovl_setup_cred_for_create()
   /* Copy current->cred == ovl->creator_cred */
   modifiable_cred = prepare_creds()

   /* Override current->cred == modifiable_cred */
   mounter_creds = override_creds(modifiable_cred)

   /*
    * And here's the BUG BUG BUG where we decrement the refcount on the
    * constant mounter_creds.
    */
   put_cred(mounter_creds) // BUG BUG BUG

   put_cred(modifiable_creds)

So (1) is definitely the wrong option given that we can get rid of
refcount decs and incs in the creation path.

Imo, you should do (2) and add a WARN_ON_ONC(). Something like the
__completely untested__:

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index ab65e98a1def..e246e0172bb6 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -571,7 +571,12 @@ static int ovl_setup_cred_for_create(struct dentry *dentry, struct inode *inode,
                put_cred(override_cred);
                return err;
        }
-       put_cred(override_creds(override_cred));
+
+       /*
+        * We must be called with creator creds already, otherwise we risk
+        * leaking creds.
+        */
+       WARN_ON_ONCE(override_creds(override_cred) != ovl_creds(dentry->d_sb));
        put_cred(override_cred);

        return 0;

> 
> > Thanks for raising this, I should have added more information about this.
> >
> >
> > Cheers,
> > -- 
> > Vinicius
> 
> [    4.646955] [touch 1512] commit_creds(0000000009e62474{1})
> [    4.648637] [touch 1512] __put_cred(00000000200a9944{0})
> [    4.648844] [virtm 1502] prepare_creds() alloc 0000000050563530
> [    4.651631] [virtm 1513] prepare_creds() alloc 00000000da716e80
> [    4.652515] [mktem 1513] commit_creds(00000000da716e80{1})
> [    4.654056] ovl_create_or_link: [override] cred 0000000007112f42
> [    4.654108] ovl_override_creds_light: new cred 0000000007112f42{1}
> [    4.654155] ovl_override_creds_light: old cred 00000000da716e80{3}
> [    4.654199] [mktem 1513] prepare_creds() alloc 000000003c8d17b7
> [    4.654246] [mktem 1513] override_creds(000000003c8d17b7{1})
> [    4.654292] [mktem 1513] override_creds() = 0000000007112f42{1}
> [    4.654337] [mktem 1513] __put_cred(0000000007112f42{0})
> [    4.654388] [mktem 1513] __put_cred(0000000007112f42{0})
> [    4.654431] ------------[ cut here ]------------
> [    4.654470] ODEBUG: activate active (active state 1) object: 00000000ad88840d object type: rcu_head hint: 0x0
> [    4.654484] [swapp    0] exit_creds(1507,00000000efafcffd,00000000efafcffd,{2})
> [    4.654575] WARNING: CPU: 23 PID: 1513 at lib/debugobjects.c:515 debug_print_object+0x7d/0xb0
> [    4.654596] [swapp    0] __put_cred(00000000efafcffd{0})
> [    4.654674] Modules linked in: sha512_ssse3(E) isst_if_common(E-) crct10dif_pclmul(E) sha256_ssse3(E) skx_edac_common(E) nfit(E) virtio_net(E) net_failover(E) i2c_piix4(E) input_leds(E) psmouse(E) serio_raw(E) failover(E) i2c_smbus(E) pata_acpi(E) floppy(E) qemu_fw_cfg(E) mac_hid(E) overlay(E) 9pnet_virtio(E) virtiofs(E) 9p(E) 9pnet(E) netfs(E)
> [    4.654686] CPU: 23 UID: 0 PID: 1513 Comm: mktemp Tainted: G            E      6.11.0-rc5+ #4
> [    4.654689] Tainted: [E]=UNSIGNED_MODULE
> [    4.654689] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> [    4.654690] RIP: 0010:debug_print_object+0x7d/0xb0
> [    4.654692] Code: 01 8b 4b 14 48 c7 c7 d8 ce 3a 99 89 15 ec 53 77 02 8b 53 10 50 4c 8b 4d 00 48 8b 14 d5 80 32 e8 98 4c 8b 43 18 e8 73 fb a0 ff <0f> 0b 58 83 05 dd 35 c6 01 01 48 83 c4 08 5b 5d c3 cc cc cc cc 83
> [    4.654693] RSP: 0018:ff5fa086c391bd28 EFLAGS: 00010282
> [    4.654695] RAX: 0000000000000000 RBX: ff5fa086c391bd60 RCX: 0000000000140017
> [    4.654696] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000001
> [    4.654697] RBP: ffffffff98e28c40 R08: 0000000000000000 R09: ff4deb8a8531a0a8
> [    4.654697] R10: 0000000000000000 R11: 0000000000000001 R12: ff4deb8a87d29de8
> [    4.654698] R13: ffffffff98e28c40 R14: 0000000000000202 R15: ffffffff9aa64e58
> [    4.654699] FS:  00007ff8543af740(0000) GS:ff4deb8ab8580000(0000) knlGS:0000000000000000
> [    4.654700] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    4.654701] CR2: 00007ff8540ec040 CR3: 0000000005784002 CR4: 0000000000771ef0
> [    4.654704] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    4.654705] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
> [    4.654706] PKRU: 55555554
> [    4.654707] Call Trace:
> [    4.654709]  <TASK>
> [    4.654710]  ? __warn+0x83/0x130
> [    4.654725]  ? debug_print_object+0x7d/0xb0
> [    4.654726]  ? report_bug+0x18e/0x1a0
> [    4.654773] [swapp    0] exit_creds(1508,00000000b957e777,00000000b957e777,{2})
> 
> 
> Cheers,
> -- 
> Vinicius




[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux