On Fri, 2007-07-13 at 19:34 +0400, Edward Shishkin wrote:
> Jake Maciejewski wrote:
>
> >On Wed, 2007-07-11 at 23:48 +0400, Edward Shishkin wrote:
> >
> >>Jake Maciejewski wrote:
> >>
> >>>I've hit the same panic looping kernel builds (while true ; do make
> >>>mrproper ; make allmodconfig ; make -j4 ; done) on 2.6.21.1 with the
> >>>Namesys patch and reiser4 debug enabled. I've seen it on my amd64
> >>>desktop and x86 laptop.
> >>>
> >>>Another one I've seen is:
> >>> reiser4 panicked cowardly: reiser4[fixdep(16043)]: sibling_list_remove (fs/reiser4/tree_walk.c:814)[zam-32245]
> >>>
> >>>In both cases the fsck didn't find anything, as you observed.
> >>>
> >>>On Wed, 2007-07-11 at 06:46 +0200, Ingo Bormuth wrote:
> >>>
> >>>>Hmm, whenever I try to build busybox (1.4.2) I get nikita-191 panics:
> >>>>
> >>>>[...]
> >>>>cc console_tools/clear.o
> >>>>reiser4 panicked cowardly: reiser4[cc1(13066)]: save_file_hint (fs/reiser4/plugin/file.c:705) [nikity-1991]:
> >>>>kernel panic - not syncing: reiser4[cc1(13066)]: save_file_hint (fs/reiser4/plugin/file.c:705) [nikity-1991]:
> >>>>
> >>Somebody missed set_file_hint(), which synchronizes the coords.
> >>
> err, sorry, its name is reiser4_set_hint
>
> >>Unfortunately I can not reproduce it. Would you please (if possible)
> >>catch the stack with the attached patch?
> >>
> >
> >[<ffffffff88186b5e>] :reiser4:save_file_hint+0xee/0x3c0
> >[<ffffffff88189c60>] :reiser4:read_unix_file+0x940/0xa10
> >[<ffffffff80276bbb>] vfs_read+0xdb/0x180
> >[<ffffffff80277083>] sys_read+0x53/0x90
> >[<ffffffff8020993e>] system_call+0x7e/0x83
> >
>
> Thanks!
> Indeed, the coords are not synchronized when reading tails. However,
> it is not a fatal bug: we are victims of brain damaged and unreadable
> hint interface.
>
> The possible fix is attached. Would you please test it?
> Also don't forget to apply this patch:
> http://lkml.org/lkml/diff/2007/7/11/396/1
> as it also can be related to the problem.
>
> Edward.

Sorry for being so late to reply. Yes, the fix works, but it took some
time to test because I'm still seeing the previously mentioned panic in
sibling_list_remove, except now it takes an hour or two to panic. I'm
reasonably sure I'm not seeing the save_file_hint panic anymore, though.

>
> >As for reproducing it, I think I should mention that:
> >
> >1. I'm using distcc to speed things up. Without offloading the compiling
> >work, my laptop has lasted ~3.5hrs before a panic. My desktop with
> >distcc configured usually only lasts a few minutes.
> >
> >2. My local storage is encrypted through dm-crypt, but I've also tried
> >over open-iscsi and got the same results.
> >
> >>>>Running fsck.reiser4 before and after the panic doesn't show any complaints.
> >>>>The partition is heavily used. I'm not aware of any other problem.
> >>>>
> >>>>Vanilla-2.6.21.6 (kernel.org) with reiser4-2.6.21-path (namesys.com).
> >>>>
> >>>>Not that I understood the code, but why is it an assertion at all?
> >>>>Couldn't one just use an empty hint if the current one is invalid?
> >>>>
> >>Sure, it is possible to not use it at all. But if the current one is valid,
> >>it would be nice to use it to avoid tree traversal with waiting for
> >>possible locks, etc..
> >>
> >>Thanks,
> >>Edward.
> >>
> >
--
Jake Maciejewski <maciejej@xxxxxxxx>