On Tue, 2012-07-31 at 15:42 +0100, David Howells wrote: > Fix a number of bugs in the NFS idmapper code: > > (1) Only registered key types can be passed to the core keys code, so > register the legacy idmapper key type. > > This is a requirement because the unregister function cleans up keys > belonging to that key type so that there aren't dangling pointers to the > module left behind - including the key->type pointer. > > (2) Rename the legacy key type. You can't have two key types with the same > name, and (1) would otherwise require that. > > (3) complete_request_key() must be called in the error path of > nfs_idmap_legacy_upcall(). > > (4) There is one idmap struct for each nfs_client struct. This means that > idmap->idmap_key_cons is shared without the use of a lock. This is a > problem because key_instantiate_and_link() - as called indirectly by > idmap_pipe_downcall() - releases anyone waiting for the key to be > instantiated. > > What happens is that idmap_pipe_downcall() running in the rpc.idmapd > thread, releases the NFS filesystem in whatever thread that is running in > to continue. This may then make another idmapper call, overwriting > idmap_key_cons before idmap_pipe_downcall() gets the chance to call > complete_request_key(). > > I *think* that reading idmap_key_cons only once, before > key_instantiate_and_link() is called, and then caching the result in a > variable is sufficient. > > Bug (4) is the cause of: > > BUG: unable to handle kernel NULL pointer dereference at (null) > IP: [< (null)>] (null) > PGD 0 > Oops: 0010 [#1] SMP > CPU 1 > Modules linked in: ppdev parport_pc lp parport ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack nfs fscache xt_CHECKSUM auth_rpcgss iptable_mangle nfs_acl bridge stp llc lockd be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi snd_hda_codec_realtek snd_usb_audio snd_hda_intel snd_hda_codec snd_seq snd_pcm snd_hwdep snd_usbmidi_lib snd_rawmidi snd_timer uvcvideo videobuf2_core videodev media videobuf2_vmalloc snd_seq_device videobuf2_memops e1000e vhost_net iTCO_wdt joydev coretemp snd soundcore macvtap macvlan i2c_i801 snd_page_alloc tun iTCO_vendor_support microcode kvm_intel kvm sunrpc hid_logitech_dj usb_storage i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan] > Pid: 1229, comm: rpc.idmapd Not tainted 3.4.2-1.fc16.x86_64 #1 Gateway DX4710-UB801A/G33M05G1 > RIP: 0010:[<0000000000000000>] [< (null)>] (null) > RSP: 0018:ffff8801a3645d40 EFLAGS: 00010246 > RAX: ffff880077707e30 RBX: ffff880077707f50 RCX: ffff8801a18ccd80 > RDX: 0000000000000006 RSI: ffff8801a3645e75 RDI: ffff880077707f50 > RBP: ffff8801a3645d88 R08: ffff8801a430f9c0 R09: ffff8801a3645db0 > R10: 000000000000000a R11: 0000000000000246 R12: ffff8801a18ccd80 > R13: ffff8801a3645e75 R14: ffff8801a430f9c0 R15: 0000000000000006 > FS: 00007fb6fb51a700(0000) GS:ffff8801afc80000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000000000000 CR3: 00000001a49b0000 CR4: 00000000000027e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > Process rpc.idmapd (pid: 1229, threadinfo ffff8801a3644000, task ffff8801a3bf9710) > Stack: > ffffffff81260878 ffff8801a3645db0 ffff8801a3645db0 ffff880077707a90 > ffff880077707f50 ffff8801a18ccd80 0000000000000006 ffff8801a3645e75 > ffff8801a430f9c0 ffff8801a3645dd8 ffffffff81260983 ffff8801a3645de8 > Call Trace: > [<ffffffff81260878>] ? __key_instantiate_and_link+0x58/0x100 > [<ffffffff81260983>] key_instantiate_and_link+0x63/0xa0 > [<ffffffffa057062b>] idmap_pipe_downcall+0x1cb/0x1e0 [nfs] > [<ffffffffa0107f57>] rpc_pipe_write+0x67/0x90 [sunrpc] > [<ffffffff8117f833>] vfs_write+0xb3/0x180 > [<ffffffff8117fb5a>] sys_write+0x4a/0x90 > [<ffffffff81600329>] system_call_fastpath+0x16/0x1b > Code: Bad RIP value. > RIP [< (null)>] (null) > RSP <ffff8801a3645d40> > CR2: 0000000000000000 > > Signed-off-by: David Howells <dhowells@xxxxxxxxxx> > Reviewed-by: Steve Dickson <steved@xxxxxxxxxx> > --- This patch should already be present in nfs-for-next as of yesterday. (I've been on a 2 week vacation). -- Trond Myklebust Linux NFS client maintainer NetApp Trond.Myklebust@xxxxxxxxxx www.netapp.com ��.n��������+%������w��{.n�����{��w���jg��������ݢj����G�������j:+v���w�m������w�������h�����٥