On Saturday May 25, neilb@cse.unsw.edu.au wrote:
>
> I spent lots of Friday poring over the code, and most of it looks
> right, as one would expect.
> The only path that I couldn't convince myself was right is when
> journal_unmap_buffer finds that the buffer it is unmapping is
> on the committing transaction.  It seems as though this buffer would
> stay dirty and could eventually be flushed out, but there could well
> be something that I am missing.
>
> I might put a printk in here and boot into a 2.4.19-pre plus 0.9.18
> based kernel on monday and see if it shows anything.

Well, I did that...

We went all weekend with data=ordered on the problematic server and got
zero messages (one of the "raid5: multiple 1 requests" messages on each
of the other two servers, which don't seem to have the right load).

I rebooted into 2.4.18-pre8 plus ext3 0.9.18 (plus raid and nfs stuff)
plus some printks.  It came up at 12:35 and got the first "raid5:
multiple 1 requests for sector" at 13:26, at which time there was a
burst of 13 messages (actually 1 at 13:26:42, 11 at 13:26:57 and 1 at
13:27:04).

I have been logging the address of every bh that got to the

	JBUFFER_TRACE(jh, "on committing transaction");

branch of journal_unmap_buffer.

With the "raid5: multiple..." messages, I was logging the addresses of
the two bh's - the "old" (which did not get written) and the "new"
(which did).

I tried to match these bh addresses with the ones reported with "on
committing transaction", and got a very good match.

Every "old" bh (except 2) had been reported as "on committing
transaction" at around 13:08 (precisely: 13:05:50 x1, 13:08:49 x9,
13:10:04 x1).  No "new" bh has been similarly reported.

The "except 2" is because I use net_ratelimit to avoid flooding
kern.log (just in case) and it lost a few messages at both times.

While this isn't conclusive proof that it is the same buffer_head (it
could be the same piece of memory being reused: I should printk
b_rsector as well), it is a very strong indicator.

I'm guessing that in this branch of journal_unmap_buffer we really want
to clear the BH_JBDDirty flag, but I'm not willing to do that without
the OK from one of the developers....

NeilBrown

P.S. Between writing and posting this I have had three more raid5:
messages that show the same behaviour.

NeilBrown
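
P.P.S. For anyone who wants to see what the logging amounts to, it is
roughly the sketch below.  This is a reconstruction rather than the
exact hack I booted with: the helper name and message format are made
up here, and it assumes the 2.4 buffer_head layout, where the bh
pointer identifies the in-memory buffer and b_rsector gives the on-disk
sector (logging both is what would distinguish "same buffer_head" from
"same memory reused").  It would be called from the
JBUFFER_TRACE(jh, "on committing transaction") branch of
journal_unmap_buffer.

	/* Sketch only - a reconstruction of the rate-limited
	 * diagnostic described above, not the actual patch.
	 * Assumes 2.4 headers. */
	#include <linux/kernel.h>
	#include <linux/fs.h>	/* struct buffer_head, b_rsector */
	#include <linux/net.h>	/* net_ratelimit() */

	static inline void trace_unmap_on_committing(struct buffer_head *bh)
	{
		/* rate-limit so a burst cannot flood kern.log */
		if (net_ratelimit())
			printk(KERN_DEBUG
			       "journal_unmap_buffer: bh %p on committing "
			       "transaction (rsector %lu)\n",
			       bh, bh->b_rsector);
	}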
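
And to be concrete about the BH_JBDDirty guess: what I have in mind
for that branch is roughly the following.  Again, this is only a
sketch of the suggestion, not a tested or reviewed change - the helper
name is made up, jh2bh() and BH_JBDDirty are from 2.4's
include/linux/jbd.h, and whether simply clearing the bit is safe here
is exactly the question I want one of the developers to answer.

	/* Sketch of the suggestion only - NOT a reviewed JBD change. */
	#include <linux/jbd.h>

	static inline void drop_jbddirty_on_unmap(struct journal_head *jh)
	{
		struct buffer_head *bh = jh2bh(jh);

		/* BH_JBDDirty is the "dirty but journaled" state bit;
		 * the intent is to stop the now-unmapped buffer from
		 * being re-dirtied and written out after the commit. */
		clear_bit(BH_JBDDirty, &bh->b_state);
	}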