On Fri, 2010-02-05 at 06:13 -0800, Andrew Morton wrote:
> On Fri, 05 Feb 2010 10:31:42 +0200 Maxim Levitsky <maximlevitsky@xxxxxxxxx> wrote:
>
> > On Thu, 2010-02-04 at 16:09 -0800, Andrew Morton wrote:
> > > On Fri, 5 Feb 2010 01:18:15 +0200 Maxim Levitsky <maximlevitsky@xxxxxxxxx> wrote:
> > >
> > > > Currently removal of the card leads to del_disk called indirectly by mmc core.
> > > > This function expects userspace to be running, which isn't when .resume is called
> > > >
> > > > Fix that by removing the code that did that in mmc_resume_host. It is possible
> > > > because card detection logic will kick it later and remove the card.
> > >
> > > I don't really understand. The above implies that to trigger this bug,
> > > one needs to physically remove the card during a resume operation. ie:
> > > a human-vs-computer race. Sounds unlikely?
> > >
> > > So... exactly what steps does the user need to take to trigger this
> >
> > Sorry for describing this poorly.
> > The steps are:
> >
> > -> Have a kernel with CONFIG_MMC_UNSAFE_RESUME
> > -> Insert MMC/SD card
> > -> Suspend/hibernate the system
> > -> While system is hibernated/suspended pull the card off
> > -> Resume the system
> > -> Hang
> >
> >
> > if CONFIG_MMC_UNSAFE_RESUME is set, mmc core allows the user to
> > suspend/resume the card normally assuming he won't change the card or
> > modify it in another system. The former case is actually handled quite
> > well.
> >
> > if CONFIG_MMC_UNSAFE_RESUME isn't set, it removes the card during
> > suspend, and I now think (and will test) that this will still hang the
> > system this time on suspend.
> >
> > Maybe we can make del_disk behave well if called with userspace frozen?
> > After all if user calls it, very likely that hardware is absent thus
> > there is no point in syncing (which I think triggers the hang)....
> >
>
> There is no del_disk in the kernel. Let's be more specific (and
> accurate!) about the hang.
> I assume it's
> mmc_remove_card->device_del->kobject_uevent?

Sorry! I was referring to del_gendisk.

<4>[15241.042047]  [<ffffffff8106620a>] ? prepare_to_wait+0x2a/0x90
<4>[15241.042159]  [<ffffffff810790bd>] ? trace_hardirqs_on+0xd/0x10
<4>[15241.042271]  [<ffffffff8140db12>] ? _raw_spin_unlock_irqrestore+0x42/0x80
<4>[15241.042386]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
<4>[15241.042496]  [<ffffffff8112a39e>] bdi_sched_wait+0xe/0x20
<4>[15241.042606]  [<ffffffff8140af6f>] __wait_on_bit+0x5f/0x90
<4>[15241.042714]  [<ffffffff8112a390>] ? bdi_sched_wait+0x0/0x20
<4>[15241.042824]  [<ffffffff8140b018>] out_of_line_wait_on_bit+0x78/0x90
<4>[15241.042935]  [<ffffffff81065fd0>] ? wake_bit_function+0x0/0x40
<4>[15241.043045]  [<ffffffff8112a2d3>] ? bdi_queue_work+0xa3/0xe0
<4>[15241.043155]  [<ffffffff8112a37f>] bdi_sync_writeback+0x6f/0x80
<4>[15241.043265]  [<ffffffff8112a3d2>] sync_inodes_sb+0x22/0x120
<4>[15241.043375]  [<ffffffff8112f1d2>] __sync_filesystem+0x82/0x90
<4>[15241.043485]  [<ffffffff8112f3db>] sync_filesystem+0x4b/0x70
<4>[15241.043594]  [<ffffffff811391de>] fsync_bdev+0x2e/0x60
<4>[15241.043704]  [<ffffffff812226be>] invalidate_partition+0x2e/0x50
<4>[15241.043816]  [<ffffffff8116b92f>] del_gendisk+0x3f/0x140
<4>[15241.043926]  [<ffffffffa00c0233>] mmc_blk_remove+0x33/0x60 [mmc_block]
<4>[15241.044043]  [<ffffffff81338977>] mmc_bus_remove+0x17/0x20
<4>[15241.044152]  [<ffffffff812ce746>] __device_release_driver+0x66/0xc0
<4>[15241.044264]  [<ffffffff812ce89d>] device_release_driver+0x2d/0x40
<4>[15241.044375]  [<ffffffff812cd9b5>] bus_remove_device+0xb5/0x120
<4>[15241.044486]  [<ffffffff812cb46f>] device_del+0x12f/0x1a0
<4>[15241.044593]  [<ffffffff81338a5b>] mmc_remove_card+0x5b/0x90
<4>[15241.044702]  [<ffffffff8133ac27>] mmc_sd_remove+0x27/0x50
<4>[15241.044811]  [<ffffffff81337d8c>] mmc_resume_host+0x10c/0x140
<4>[15241.044929]  [<ffffffffa00850e9>] sdhci_resume_host+0x69/0xa0 [sdhci]
<4>[15241.045044]  [<ffffffffa0bdc39e>] sdhci_pci_resume+0x8e/0xb0 [sdhci_pci]

> >
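
The difference between removing the card from the resume path and leaving it to card detection can be illustrated with a toy user-space model in C. This is only a sketch under stated assumptions: every toy_* name is hypothetical, none of it is kernel code or a kernel API, and treating "removal while userspace is frozen" as a hang simply follows the guess made earlier in this thread that the sync under del_gendisk is what blocks.

```c
/* Toy user-space model only; all toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_host {
    bool card_registered;  /* block device still registered */
    bool card_present;     /* card physically in the slot */
    bool userspace_frozen; /* true while suspend/resume is in progress */
    bool detect_pending;   /* a rescan has been queued for later */
    bool hung;             /* set if removal ran while frozen */
};

/* Stands in for the removal path that ends in del_gendisk().
 * Assumption: it cannot complete while tasks are frozen. */
static void toy_remove_card(struct toy_host *h)
{
    if (h->userspace_frozen)
        h->hung = true;
    h->card_registered = false;
}

/* Old behaviour: resume notices the missing card and removes it at once. */
static void toy_resume_old(struct toy_host *h)
{
    if (h->card_registered && !h->card_present)
        toy_remove_card(h);
}

/* Proposed behaviour: resume only queues a rescan. */
static void toy_resume_new(struct toy_host *h)
{
    h->detect_pending = true;
}

/* Deferred detection; in this model it only runs after the thaw. */
static void toy_detect_work(struct toy_host *h)
{
    assert(!h->userspace_frozen);
    if (!h->detect_pending)
        return;
    h->detect_pending = false;
    if (h->card_registered && !h->card_present)
        toy_remove_card(h);
}

/* Card pulled while suspended, then resume, thaw, and the detect pass. */
static struct toy_host toy_run(bool deferred)
{
    struct toy_host h = {
        .card_registered = true,
        .card_present = false,
        .userspace_frozen = true,
    };

    if (deferred)
        toy_resume_new(&h);
    else
        toy_resume_old(&h);

    h.userspace_frozen = false; /* tasks thawed */
    toy_detect_work(&h);
    return h;
}
```

In this model toy_run(false) ends with hung set, while toy_run(true) still unregisters the card but only after the thaw, which is the behaviour the patch description relies on.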
> Yes, I'd have thought that it would be a good idea for the
> kobject_uevent code (or lower, in call_usermodehelper) to take avoiding
> action if userspace is frozen. However such action would probably
> involve doing a WARN_ON() too, so we'd still need MMC changes to avoid
> that.
>

Best regards,
Maxim Levitsky