Re: [Bug 12945] New: SCSI Generic (sg): BUG: sleeping function called from invalid context

On Thu, 26 Mar 2009 19:43:02 +0100
Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:

> On Thu, Mar 26 2009, Andrew Morton wrote:
> > 
> > (switched to email.  Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> > 
> > On Thu, 26 Mar 2009 12:27:53 GMT bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
> > 
> > > http://bugzilla.kernel.org/show_bug.cgi?id=12945
> > > 
> > >            Summary: SCSI Generic (sg): BUG: sleeping function called from
> > >                     invalid context
> > >            Product: SCSI Drivers
> > >            Version: 2.5
> > >     Kernel Version: 2.6.28.9
> > >           Platform: All
> > >         OS/Version: Linux
> > >               Tree: Mainline
> > >             Status: NEW
> > >           Severity: normal
> > >           Priority: P1
> > >          Component: Other
> > >         AssignedTo: scsi_drivers-other@xxxxxxxxxxxxxxxxxxxx
> > >         ReportedBy: txtoxtox285@xxxxxxxxxxxxxx
> > >         Regression: No
> > > 
> > > 
> > > Created an attachment (id=20685)
> > >  --> (http://bugzilla.kernel.org/attachment.cgi?id=20685)
> > > Stack trace on program kill (2.6.28.9)
> > > 
> > > I am experimenting with CD audio extraction. I use the SCSI Generic driver for
> > > this.
> > > 
> > > My test program uses read() and write() (instead of ioctl) to send requests to
> > > the driver and receive responses. I use SG_FLAG_DIRECT_IO.
> > > 
> > > When I kill my program (because I don't want to wait until it has ripped the
> > > entire CD), I am often rewarded with messages like "BUG: sleeping function
> > > called from invalid context at linux-2.6.28.9/include/linux/pagemap.h:347". I
> > > have attached a typical stack trace.
> > > 
> > > Another case when I hit this BUG is when I set a time out and the CD drive
> > > doesn't respond fast enough. A stack trace is attached.
> > 
> > > [34215.786870] BUG: sleeping function called from invalid context at /mnt/var-pub/src/linux-2.6.28.9/include/linux/pagemap.h:347
> > > [34215.786880] in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper
> > > [34215.786886] Pid: 0, comm: swapper Not tainted 2.6.28.9 #1
> > > [34215.786890] Call Trace:
> > > [34215.786894]  <IRQ>  [<ffffffff8026c4cc>] set_page_dirty_lock+0x1a/0x45
> > > [34215.786911]  [<ffffffff802ae17d>] bio_unmap_user+0x1e/0x4a
> > > [34215.786920]  [<ffffffff802e876b>] __blk_rq_unmap_user+0x14/0x20
> > > [34215.786928]  [<ffffffff80210852>] pit_next_event+0x2e/0x49
> > > [34215.786934]  [<ffffffff802e8795>] blk_rq_unmap_user+0x1e/0x4b
> > > [34215.786965]  [<ffffffffa0163475>] sg_finish_rem_req+0x6d/0x88 [sg]
> > > [34215.786979]  [<ffffffffa0164ef3>] sg_rq_end_io+0x131/0x205 [sg]
> > > [34215.786986]  [<ffffffff802e5c1f>] end_that_request_last+0x58/0x194
> > > [34215.786992]  [<ffffffff802e5e00>] blk_end_io+0x48/0x7d
> > > [34215.787019]  [<ffffffffa0026bef>] scsi_next_command+0x219/0x283 [scsi_mod]
> > > [34215.787039]  [<ffffffffa00279b1>] scsi_io_completion+0x181/0x53b [scsi_mod]
> > > [34215.787047]  [<ffffffff802e9737>] blk_done_softirq+0x5f/0x6d
> > > [34215.787054]  [<ffffffff80230787>] __do_softirq+0x5e/0xf8
> > > [34215.787061]  [<ffffffff8020ca8c>] call_softirq+0x1c/0x28
> > > [34215.787067]  [<ffffffff8020d6bc>] do_softirq+0x2c/0x68
> > > [34215.787073]  [<ffffffff80230696>] irq_exit+0x36/0x82
> > > [34215.787079]  [<ffffffff8020d79e>] do_IRQ+0xa6/0xb8
> > > [34215.787085]  [<ffffffff8020c256>] ret_from_intr+0x0/0xa
> > > [34215.787088]  <EOI>  [<ffffffff8034f648>] menu_reflect+0x0/0x6d
> > > [34215.787112]  [<ffffffffa0147d51>] acpi_idle_enter_simple+0x170/0x1d6 [processor]
> > > [34215.787127]  [<ffffffffa0147d47>] acpi_idle_enter_simple+0x166/0x1d6 [processor]
> > > [34215.787134]  [<ffffffff8034eb32>] cpuidle_idle_call+0x73/0xb1
> > > [34215.787140]  [<ffffffff8020ac2a>] cpu_idle+0x3c/0x73
> > 
> > Argh.  sg_finish_rem_req() is called from interrupt context.  But
> > blk_rq_unmap_user() can run
> > __bio_unmap_user()->set_page_dirty_lock()->lock_page(), which can call
> > schedule().  If it does call schedule(), the machine will crash.
> > 
> > afaict, blk_rq_unmap_user() has always been a can-sleep function, and
> > this is a regression caused by
> > 
> > commit 6e5a30cba5e7c03b2cd564e968f1dd667a0f7c42
> 
> Yep, it is. The problem is the usage of:
> 
>         blk_execute_rq_nowait(sdp->device->request_queue, sdp->disk,
>                               srp->rq, 1, sg_rq_end_io);
> 
> and then doing the sg_finish_rem_req() -> blk_rq_unmap_user() from the
> end_io path, where other users do a sync request and then unmap from the
> same context.

Right. And only sg does that. I've already converted st and osst to
use the block layer, but they work synchronously.
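
To make the difference concrete, here is a small user-space model of the
two patterns. It is only a sketch: every model_* name is hypothetical,
the "atomic" flag stands in for the kernel's real interrupt-context
state, and model_unmap_user() merely plays the role of
blk_rq_unmap_user(). It is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's atomic/interrupt-context state. */
static bool model_in_atomic;
/* Number of calls that would have hit "sleeping function called from
 * invalid context". */
static int model_sleep_violations;

/* Stand-in for might_sleep(): record a violation in atomic context. */
static void model_might_sleep(void)
{
    if (model_in_atomic)
        model_sleep_violations++;
}

/* Plays the role of blk_rq_unmap_user(), which may end up in lock_page(). */
static void model_unmap_user(void)
{
    model_might_sleep();
}

typedef void (*model_end_io_fn)(void);

/* Model of request completion: end_io runs with the atomic flag set. */
static void model_complete_request(model_end_io_fn end_io)
{
    model_in_atomic = true;
    if (end_io)
        end_io();
    model_in_atomic = false;
}

/* sg-style callback: unmaps directly from the end_io path. */
static void model_sg_end_io(void)
{
    model_unmap_user();
}

/* st/osst-style: wait for completion, then unmap in the caller's context. */
static void model_sync_execute(void)
{
    model_complete_request(NULL); /* completion does no unmapping */
    model_unmap_user();           /* back in process context */
}

/* sg-style: asynchronous execution with unmapping done in end_io. */
static void model_async_execute(void)
{
    model_complete_request(model_sg_end_io);
}
```

Calling model_sync_execute() leaves model_sleep_violations untouched,
while model_async_execute() increments it, which mirrors the trace in
the bug report.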


> Hmm. Perhaps we can add some request flag to specify doing
> the completion from user context, then other users could be converted to
> the _nowait() approach as well and get some unification/cleanup there
> too.

Since only sg needs this, I simply fixed sg instead of changing the
block layer. But it might be nice if the block layer could handle this.
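
One general shape for a driver-side fix is to let the completion
callback only flag the request as done and to do the unmapping later
from process context, for example when the reader picks up the
response. The following user-space sketch illustrates that idea under
stated assumptions: all sketch_* names are hypothetical,
sketch_unmap_user() stands in for blk_rq_unmap_user(), and this is not
the actual sg change.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_req {
    bool done;               /* set by the completion callback */
    bool mapped;             /* user pages still mapped for direct I/O */
    bool unmapped_in_atomic; /* set if unmap ran in atomic context */
};

/* Hypothetical stand-in for the kernel's atomic-context state. */
static bool sketch_in_atomic;

/* Plays the role of blk_rq_unmap_user(); notes where it was called. */
static void sketch_unmap_user(struct sketch_req *rq)
{
    if (sketch_in_atomic)
        rq->unmapped_in_atomic = true;
    rq->mapped = false;
}

/* Completion callback: only flags completion, never unmaps. */
static void sketch_end_io(struct sketch_req *rq)
{
    rq->done = true;
    /* A real driver would wake up the waiting reader here. */
}

/* Model of the interrupt path that invokes the callback. */
static void sketch_irq_complete(struct sketch_req *rq)
{
    sketch_in_atomic = true;
    sketch_end_io(rq);
    sketch_in_atomic = false;
}

/* Process-context side: finish a completed request.
 * Returns true if the request was unmapped by this call. */
static bool sketch_reap(struct sketch_req *rq)
{
    if (!rq->done || !rq->mapped)
        return false;
    sketch_unmap_user(rq);
    return true;
}
```

The cost of this shape is that the pages stay mapped until the process
comes back for the response, so a killed process needs its outstanding
requests cleaned up on release as well.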

It seems there are several patches for the block layer (including
mapping) from Tejun and Boaz. I'll read them to see what we could do.
I'm always too busy in March with company matters.
