Hi,
On Tue, 21 Dec 2010 00:40:38 +0100, Jan Misiak wrote:
> On 19 December 2010 18:04, Ryusuke Konishi <ryusuke@xxxxxxxx> wrote:
> > Two more questions here.
> >
> > 1) Did the panic arise during mount?
>
> Yes, the panic occurs just after issuing the 'mount' command.
>
> > 2) Did you see the following message just before this oops?
> >  "segctord starting. Construction interval = 5 seconds, CP frequency < 30 seconds"
>
> Yes, netconsole managed to capture it:
>
> sd 2:0:0:0: [sdb] Attached SCSI removable disk
> segctord starting. Construction interval = 5 seconds, CP frequency < 30 seconds
> BUG: unable to handle kernel paging request at 00001000
> IP: [<c10d572b>] page_address+0xb/0xd0
> *pde = 00000000
> Oops: 0000 [#1] PREEMPT SMP
>
> > After that, try just a read-only mount without the norecovery option.
> >
> >  # mount -t nilfs2 -o ro /dev/sdb1 /your-mount-dir
>
> It does not trigger the oops. Sorry for not mentioning it earlier. The
> kernel only panics if the file system is mounted 'rw'.
> Thank you for looking into this.
>
> Regards,
> Jan

Thank you for your cooperation.

Judging from these results, I suspect the oops is triggered by the
writeback of the two super blocks.

Could you collect the device size information in the following ways?
(In the following examples, I assume the target device is /dev/sdb1.)

1) Sizes reported by sysfs

 # cat /sys/block/sdb/size
 # cat /sys/block/sdb/sdb1/size

2) Sizes in the partition table

 # fdisk -lu /dev/sdb

3) Dump of the first super block (it has the layout information)

 # dd if=/dev/sdb1 bs=1k count=1 skip=1 2>/dev/null | hd

Thanks,
Ryusuke Konishi

> >
> > On Sun, 19 Dec 2010 13:04:23 +0100, Jan Misiak wrote:
> >> On 19 December 2010 06:13, Ryusuke Konishi <ryusuke@xxxxxxxx> wrote:
> >> > Hi,
> >> > On Sat, 18 Dec 2010 15:08:45 +0100, Jan Misiak wrote:
> >> >> Hello,
> >> >>
> >> >> I am just a simple end-user but as nobody in my distribution has had
> >> >> the same problem I was forced to turn to the upstream. Please bear
> >> >> with me.
> >> >>
> >> >> I have been using nilfs2 on a 16GB usb-stick on an x86 thin client
> >> >> running Arch Linux. The box had been running 24/7 and had an uptime of
> >> >> about two weeks with kernel 2.6.36/nilfs-utils 2.0.20 when it
> >> >> panicked. Unfortunately nothing was to be seen in the logs (system
> >> >> partition was ext3). Now it panics every time I attempt to mount the
> >> >> volume.
> >> >>
> >> >> I tried to use netconsole to capture the panic message but it gets
> >> >> truncated so I had to resort to taking pictures.
> >> >>
> >> >> box #1 kernel 2.6.36.2/nilfs-utils 2.0.20
> >> >> http://fijam.eu.org/other/netconsole.log
> >> >> http://fijam.eu.org/other/0000.jpg
> >> >>
> >> >> I tried to mount the usb-stick on a laptop with the same kernel
> >> >> (2.6.36.2) to capture more of the panic messages:
> >> >>
> >> >> box #2 kernel 2.6.36.2/nilfs-utils 2.0.20
> >> >> http://fijam.eu.org/other/0001.jpeg
> >> >> http://fijam.eu.org/other/0002.jpeg
> >> >>
> >> >> It crashes when I try to mount with kernel 2.6.32.27 as well:
> >> >>
> >> >> box #2 kernel 2.6.32.27/nilfs-utils 2.0.20
> >> >> http://fijam.eu.org/other/0003.jpeg
> >> >> http://fijam.eu.org/other/0004.jpeg
> >> >>
> >> >> I would be grateful for advice on how I can help with getting to the
> >> >> bottom of this.
> >> >>
> >> >> Regards,
> >> >> Jan
> >> >
> >> > It looks like these oopses were hit in the common block layer code
> >> > called from the usb mass storage driver.
> >> >
> >> > Could you do some tests to narrow down the issue?
> >> >
> >> > 1) Use "nogc" mount option to see whether the oops depends on the
> >> > context of garbage collection or not:
> >> >
> >> >  # mount -t nilfs2 -o nogc /dev/sdb1 /your-mount-dir
> >> >
> >> > 2) Mount the partition read-only with "norecovery" option and make
> >> > read accesses to the filesystem as below:
> >> >
> >> >  # mount -t nilfs2 -o ro,norecovery /dev/sdb1 /your-mount-dir
> >> >  # find /your-mount-dir -type f -exec cat {} > /dev/null \;
> >> >
> >> > 3) Try to read the block device directly with "dd":
> >> >
> >> >  # dd if=/dev/sdb1 bs=4k > /dev/null
> >> >
> >> > 4) Try lssu and lscp commands in the read-only mount to do quick
> >> > sanity checks of meta data files.
> >> >
> >> >  # mount -t nilfs2 -o ro,norecovery /dev/sdb1 /your-mount-dir
> >> >  # lssu -a
> >> >  # lscp
> >> >
> >> >
> >> > Regards,
> >> > Ryusuke Konishi
> >> >
> >>
> >> Thank you for your reply and suggestions. I have tried the following:
> >>
> >> # mount -t nilfs2 -o nogc /dev/sdb1 /your-mount-dir
> >> Results in exactly the same kernel panic.
> >>
> >> # mount -t nilfs2 -o ro,norecovery /dev/sdb1 /your-mount-dir
> >> # find /your-mount-dir -type f -exec cat {} > /dev/null \;
> >> Doesn't trigger the oops. I was able to retrieve my data but
> >> haven't checked them for correctness yet.
> >>
> >> # lssu -a
> >> http://fijam.eu.org/other/lssu
> >> # lscp
> >> http://fijam.eu.org/other/lscp
> >>
> >> # dd if=/dev/sdb1 bs=4k > /dev/null
> >> Likewise, it doesn't trigger the oops.
> >>
> >> Is there anything else I could do to help?
> >>
> >> Regards,
> >> Jan
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nilfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
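
P.S. The sysfs size checks requested in this message print raw sector
counts, which are easier to compare with the fdisk output once converted
to bytes. The following is only a rough sketch of how that could be
scripted. The DISK and PART defaults and the two helper function names
are assumptions made for illustration and are not part of nilfs-utils.
It relies on sysfs reporting sizes in 512-byte sectors.

```shell
#!/bin/sh
# Sketch: print the sysfs-reported sizes of a disk and one partition.
# DISK and PART are assumed names; override them for your system,
# e.g.  DISK=sdc PART=sdc1 sh sizes.sh
DISK=${DISK:-sdb}
PART=${PART:-sdb1}

# sysfs "size" files count 512-byte sectors, independent of the
# logical block size of the device.
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

# Print "path: N sectors (M bytes)" for one sysfs size file,
# or a note if the file cannot be read.
report_size() {
    if [ -r "$1" ]; then
        sectors=$(cat "$1")
        echo "$1: $sectors sectors ($(sectors_to_bytes "$sectors") bytes)"
    else
        echo "$1: not available"
    fi
}

report_size "/sys/block/$DISK/size"
report_size "/sys/block/$DISK/$PART/size"
```

The partition size printed here should match the sector count that
"fdisk -lu" shows for the same partition.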