2009/6/8 Jaswinder Singh Rajput <jaswinder@xxxxxxxxxx>:
> Hello Chris,
>
> On Mon, 2009-06-08 at 11:58 +0100, Chris Clayton wrote:
>> 2009/6/8 Chris Clayton <chris2553@xxxxxxxxxxxxxx>:
>> > Hi Neil,
>> >
>> > Thanks for the reply.
>> >
>> > 2009/6/7 NeilBrown <neilb@xxxxxxx>:
>> >> On Mon, June 8, 2009 8:31 am, Jaswinder Singh Rajput wrote:
>> >>> On Sun, 2009-06-07 at 19:38 +0100, Chris Clayton wrote:
>> >>>> 2009/6/7 Jaswinder Singh Ra
>> >>>>
>> > http://img231.imageshack.us/img231/8931/dscn0610.jpg
>> >>
>> >> This message says that it found a vfat filesystem on 8:3x (I cannot see
>> >> what digit should be 'x'). That is probably sdc1 or sdc2. Maybe even
>> >> sdc6 or sdc7.
>> >> However the vfat filesystem didn't have /sbin/init.
>> >>
>> >
>> >>>> http://img99.imageshack.us/my.php?image=dscn0617b.jpg
>> >>
>> >> This one says it couldn't find anything at 8,22, which I think
>> >> should be sdb6.
>> >> It also shows that you have an sdc6, but sdb only goes up to sdb3.
>> >>
>> >> So it seems that your disk drives have changed name - not a wholly
>> >> unexpected event these days.
>> >>
>> >> We now need answers to questions like:
>> >>  - what device do you expect the root filesystem to be on
>> >>  - how is the kernel being told this? Maybe it is hard coded
>> >>    into your initrd. Knowing which distro and what /etc/fstab
>> >>    says might help (though it wouldn't help me, I'm just about out
>> >>    of my depth at this point)
>> >> Maybe if you changed /etc/fstab to mount by uuid instead of hardcoding
>> >> e.g. /dev/sdb3, and then run "mkinitramfs" or whatever, it might work.
>> >>
>> >
>> > Yes, I've just been looking at the photographs of the panics again and
>> > I've noticed that two of my discs are being detected in the "wrong
>> > order". There are three HDDs. The first, /dev/sda, is the master on
>> > the first IDE port and contains sda1..sda7. The second, normally
>> > /dev/sdb, is the slave on that port and contains sdb1..sdb6.
>> > The third, normally /dev/sdc, is attached to the first SATA port and
>> > contains sdc1..sdc3. The second photograph I posted shows that sdb and
>> > sdc have been reversed. The first partition on the disc that is
>> > normally /dev/sdb does indeed contain a FAT32 filesystem.
>> >
>> > By the way, I should have said that in between the panics that the two
>> > photographs show, I copied the contents of /dev/sdc1, which I normally
>> > boot from, to /dev/sdb6, so that I minimised the risk to sdc1 in the
>> > reboot festival that bisecting would involve. I also, of course,
>> > changed the name of the root partition that is passed to the kernel by
>> > GRUB and amended /etc/fstab on /dev/sdb6. That's why the partitions
>> > shown in the photographs seem inconsistent. Sorry I forgot to mention
>> > that - I really shouldn't do these things late at night :-).
>> >
>> > As I indicate above, when booting the partition I have set up to do
>> > this bisecting, I expect the root filesystem to be on /dev/sdb6. As I
>> > also indicate, this information is passed to the kernel through GRUB's
>> > /boot/grub/menu.lst. The kernel is configured specifically for my
>> > system and the drivers needed to boot the system are built in to the
>> > kernel, so I don't use an initrd. IIRC, that's the way Slackware is
>> > installed today, except, of course, it's a big fat kernel with all
>> > drivers needed to boot any system built in. I could be wrong on that
>> > though, it's a while since I installed
>> >
>> > As to the distro, it used to be (the now defunct) Peanut Linux, which
>> > was derived from Slackware. However, it's years since I installed it
>> > and I have upgraded just about everything in user space and added many
>> > other things (udev, dbus...). I don't think that makes any difference
>> > here, though, because we don't get as far as user space.
>> > On a successful boot, the system is stable and runs trouble-free for
>> > several hours a day, every day.
>> >
>> > Hope this helps.
>> >
>> > I'm a good way through bisecting again and this time the system has to
>> > boot without a panic 100 times before I mark a kernel as good. I'll
>> > post the result later.
>> >
>>
>> Finally got to the end of the bisection/reboot festival. I ended up here:
>>
>> [chris:~/kernel/linux-2.6]$ git bisect good
>> d5a877e8dd409d8c702986d06485c374b705d340 is first bad commit
>> commit d5a877e8dd409d8c702986d06485c374b705d340
>> Author: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
>> Date:   Sun May 24 13:03:43 2009 -0700
>>
>>     async: make sure independent async domains can't accidentally entangle
>>
>>     The problem occurs when async_synchronize_full_domain() is called when
>>     the async_pending list is not empty. This will cause lowest_running()
>>     to return the cookie of the first entry on the async_pending list, which
>>     might be nothing at all to do with the domain being asked for and thus
>>     cause the domain synchronization to wait for an unrelated domain. This
>>     can cause a deadlock if domain synchronization is used from one domain
>>     to wait for another.
>>
>>     Fix by running over the async_pending list to see if any pending items
>>     actually belong to our domain (and return their cookies if they do).
>>
>>     Signed-off-by: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
>>     Signed-off-by: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>
>>     Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>>
>> :040000 040000 fab1e0c06572605a7015061db4a7e0a77c04fa91
>> 34252dbb7fed3942f5952c25639564bbd77357da M kernel
>>
>> I can't claim to know what the change actually means, but the change
>> seems to be a much better candidate than my previous bisection outcome
>> where I required only 20 "panicless" boots to regard the kernel as
>> good. As I said earlier today, this time I required 100 such boots.
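
As a rough illustration of what the commit message above describes, here
is a toy Python model of the cookie lookup. It is not the kernel's C
code: the names Entry, lowest_before_fix and lowest_after_fix, and the
use of strings as domain labels, are invented for this sketch. Only the
behaviour is taken from the commit text: before the change the first
pending entry wins regardless of domain, after it the pending list is
scanned for entries that belong to the domain being synchronised.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    cookie: int   # ticket number handed out when the async call was queued
    domain: str   # hypothetical label for the domain the call belongs to


def lowest_before_fix(running, pending, next_cookie):
    """Model of the behaviour the commit message calls a problem."""
    if running:
        return running[0].cookie
    if pending:
        # The first pending entry is returned even if it belongs to an
        # unrelated domain, so the caller may wait on the wrong work.
        return pending[0].cookie
    return next_cookie  # nothing in progress


def lowest_after_fix(running, pending, domain, next_cookie):
    """Model of the described fix: only pending entries of our domain count."""
    if running:
        return running[0].cookie
    for entry in pending:
        if entry.domain == domain:
            return entry.cookie
    return next_cookie  # nothing in progress for this domain
```

With a pending list holding cookie 5 for some other domain and cookie 7
for ours, the first function reports 5 and the second reports 7, which is
the difference the commit message is about. Whether that difference
explains the panics seen here is not something this sketch can show.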
>>
>> I'll revert that change, give the new kernel the reboot treatment :-)
>> and report back later.
>>
>
> Good work. Please also share this info with the other signed-off members, so
> adding CC.
>

OK. I reversed that change and built and installed the kernel. It has
withstood 100 reboots without a panic.

Additionally, I pulled the latest changes (that will be rc8-git5, I
think) from kernel.org, reversed the change in that kernel and built and
installed it. That too withstood 100 reboots without a panic.

Let me know if there's anything else I can do to help fix this.

Chris

> Thanks,
> --
> JSR
>

-- 
No, Sir; there is nothing which has yet been contrived by man, by which
so much happiness is produced as by a good tavern or inn - Doctor Samuel
Johnson
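
Why 20 clean boots could mislead a bisection while 100 is much safer can
be sketched with a little arithmetic. This assumes that each boot panics
independently with a fixed probability, which nothing in this thread
establishes, and the 5% rate used in the comments is purely hypothetical.
The function names are made up for the sketch.

```python
import math


def p_all_clean(p_panic: float, boots: int) -> float:
    """Chance that a bad kernel survives `boots` boots without a panic,
    assuming independent boots with panic probability `p_panic`."""
    return (1.0 - p_panic) ** boots


def boots_needed(p_panic: float, max_miss: float) -> int:
    """Smallest number of clean boots so that a bad kernel is wrongly
    marked good with probability at most `max_miss`."""
    return math.ceil(math.log(max_miss) / math.log(1.0 - p_panic))


# Hypothetical 5% panic rate per boot:
#   p_all_clean(0.05, 20)  is about 0.36   (a bad kernel often looks good)
#   p_all_clean(0.05, 100) is about 0.006
#   boots_needed(0.05, 0.01) == 90
```

Under that assumed rate, a 20-boot threshold would mark a bad kernel as
good roughly one time in three, and a single wrong "good" is enough to
send a bisection to the wrong commit.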