On Wed 04-04-12 12:46:57, Josef Bacik wrote:
> On Wed, Apr 04, 2012 at 09:55:20AM +0200, Jan Kara wrote:
> > On Tue 10-01-12 13:12:55, Josef Bacik wrote:
> > > If we are journalling data (ie journal=data or big symlinks) we can discard
> > > buffers and move them to different transactions to make sure they get cleaned up
> > > properly. The problem is b_modified could still be set from the last
> > > transaction that touched it, so putting it on the currently running transaction
> > > or setting it up to be put on the next transaction will run into problems if the
> > > buffer gets reused in that transaction as the space accounting logic won't be
> > > done, which will result in panics at commit time because t_nr_buffers will end
> > > up being more than t_outstanding_credits. Thanks to Jan Kara for pointing out
> > > the other part of this problem a few months ago. Thanks,
> > >
> > > Signed-off-by: Josef Bacik <josef@xxxxxxxxxx>
> > So I think I've nailed this down. Your feeling that the problem is with
> > refiling buffer to BJ_Forget list of the running transaction was right. The
> > missing piece to the puzzle was that journal_invalidatepage() can get
> > called not only when underlying block is freed but also when someone
> > flushes page cache. The traces I have suggest that someone has flushed page
> > cache (likely of the block device), that moved buffer from the checkpoint
> > list to BJ_Forget list of the running transaction and then the same running
> > transaction tried to modify the buffer which triggered the accounting
> > problem you spotted.
> >
> > I have updated the changelog and pushed the patch to my tree (for JBD
> > only). I'll duplicate the patch for JBD2 tomorrow.
> >
>
> Ok now it's my turn to be unsure ;). I thought invalidatepage could only be
> called via truncate? You say it happens when someone flushes pagecache, do you
> mean like echo 3 > /proc/sys/vm/drop_caches?
Yup, or things like BLKFLSBUF ioctl.
But yes, you are right, they don't end up calling ext3_invalidatepage(). I
often get confused by the name of invalidate_mapping_pages()... Anyway,
ext3_invalidatepage() definitely gets called (I see that in my traces) and
now I tend to think it's from ext3_evict_inode(). The guy was using a 2.6.37
kernel, which doesn't have commit b22570d9abb3d844e65c15c8bc0d57a78129e3b4,
so truncate_inode_pages() gets called from ext3_evict_inode() before the
buffer is checkpointed, and that causes the described scenario. But the guy
claims he's seen the problem with 3.2 as well. So I guess I'll forward-port
the buffer tracking patches and ask him to reproduce with 3.2.

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR