Really awesome! This was a big bug. I have rewritten the code that processes requests from the request queue. The new code is copied from drivers/mtd/mtd_blkdevs.c with some necessary modifications. Now it works well. Many thanks to you :)

BTW, I noticed that the MTD driver (drivers/mtd/mtd_blkdevs.c) and the MMC driver (drivers/mmc/card/block.c and queue.c) also register a block device, and they create a kernel thread to process the request queue instead of processing it directly. Why do they do it like that? Is there any special reason for that?

Thanks a lot.

Rgds,
Yunpeng Gao

-----Original Message-----
From: Jens Axboe [mailto:jens.axboe@xxxxxxxxxx]
Sent: February 19, 2009 21:13
To: Gao, Yunpeng
Cc: linux-ide@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
Subject: Re: help! locks problem in block layer request queue?

On Thu, Feb 19 2009, Gao, Yunpeng wrote:
>
> Hi all,
>
> Sorry for the overly long email. But I encountered a kernel oops
> when testing my standalone NAND block driver (it's almost a normal
> block device driver) and I am not sure why this happens.
>
> In my development environment, the Linux 2.6.27 kernel boots with an
> initrd, then does a 'chroot' to an MMC card. After the chroot, I try to
> run mkfs.ext3 on the NAND device, but it causes the kernel oops message.
> If I run mkfs.ext3 on the NAND device before the chroot, then it works
> well (it can mount/umount and copy files correctly across system reboots).
>
> Below is the log message (/dev/mmcblk0 is the MMC card device node,
> and /dev/nda is the NAND flash device node) and part of the driver
> code.
>
> From the oops message, it seems there is improper usage of locks in my
> driver code, but actually there is only one spinlock used in the driver
> (spinlock_t qlock defined in struct spectra_nand_dev), and it is only
> used by the registered request queue. Also, I use a semaphore
> ('spectra_sem') to prevent the low layer function from being
> re-entered.
> As the low layer (hardware layer) now works in PIO mode
> and it's very slow, maybe it holds the spinlock or semaphore for
> too long?

You call bvec_kmap_irq() and then call a function that does a down(). This is illegal, as you cannot block/schedule with interrupts disabled.

-- 
Jens Axboe
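
To illustrate the point in the reply above, here is a minimal userspace sketch (pthreads, not kernel code) of the pattern that drivers such as drivers/mtd/mtd_blkdevs.c appear to follow: the submitting side only queues work, and a separate thread drops the queue lock before doing anything that may sleep. A likely reason those drivers use a kernel thread is exactly this: a thread runs in process context, where taking a semaphore or waiting on slow hardware is allowed. All names here (demo_queue, submit_request, hw_lock, run_demo) are hypothetical and only stand in for qlock, the request function, and spectra_sem; a pthread mutex is not a spinlock, so this is an analogy under those assumptions, not the actual driver fix.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

#define QUEUE_CAP 64

/* Hypothetical stand-in for the driver's request queue state. */
struct demo_queue {
    pthread_mutex_t lock;   /* plays the role of the queue lock (qlock) */
    pthread_cond_t  wake;   /* wakes the worker when work arrives */
    int  items[QUEUE_CAP];
    int  head, tail;
    bool stopping;
    int  processed;
};

/* Plays the role of spectra_sem: serializes access to the "hardware". */
static pthread_mutex_t hw_lock = PTHREAD_MUTEX_INITIALIZER;

/* Analogue of the request function: queue the work and return.
 * It never touches hw_lock and never waits for the hardware. */
static void submit_request(struct demo_queue *q, int req)
{
    pthread_mutex_lock(&q->lock);
    assert(q->tail < QUEUE_CAP);
    q->items[q->tail++] = req;
    pthread_cond_signal(&q->wake);
    pthread_mutex_unlock(&q->lock);
}

/* Analogue of the per-device thread: may sleep freely. */
static void *worker(void *arg)
{
    struct demo_queue *q = arg;
    struct timespec slow_io = { .tv_sec = 0, .tv_nsec = 1000000 };

    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->head == q->tail && !q->stopping)
            pthread_cond_wait(&q->wake, &q->lock);
        if (q->head == q->tail)
            break;                      /* stopping and fully drained */
        q->head++;                      /* take one request off the queue */

        /* Drop the queue lock BEFORE the part that can block. */
        pthread_mutex_unlock(&q->lock);
        pthread_mutex_lock(&hw_lock);   /* like down(&spectra_sem) */
        nanosleep(&slow_io, NULL);      /* pretend slow PIO transfer */
        pthread_mutex_unlock(&hw_lock); /* like up(&spectra_sem) */
        pthread_mutex_lock(&q->lock);

        q->processed++;
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Submits nreq requests (nreq <= QUEUE_CAP) and returns how many
 * the worker completed. */
int run_demo(int nreq)
{
    struct demo_queue q = {0};
    pthread_t tid;

    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.wake, NULL);
    pthread_create(&tid, NULL, worker, &q);

    for (int i = 0; i < nreq; i++)
        submit_request(&q, i);

    pthread_mutex_lock(&q.lock);
    q.stopping = true;
    pthread_cond_signal(&q.wake);
    pthread_mutex_unlock(&q.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&q.wake);
    pthread_mutex_destroy(&q.lock);
    return q.processed;
}
```

Calling run_demo(8) from a main() should return 8 once the worker has drained the queue. The part to notice is the unlock/lock pair around the hw_lock section: the blocking call is never made while the queue lock is held.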