On Wed 19-03-08 07:46:46, David Chinner wrote: > On Tue, Mar 18, 2008 at 02:43:26PM +0100, Jan Kara wrote: > > > 2.6.25-rc3, 4p ia64, ext3 root drive. > > > > > > I was running an XFS stress test on one of the XFS partitions on > > > the machine (zero load on the root ext3 drive), when the system > > > locked up in kjournald with this on the console: > > > > > > BUG: spinlock lockup on CPU#2, kjournald/2150, a000000100e022e0 > > > > > <snip traces> > > <snip other stuff> > > > > Anyone know the reason why drop_pagecache_sb() uses such a brute-force > > > mechanism to free up clean page cache pages? > > Yes, we know that drop_pagecache_sb() has locking issues but since it > > is intended to be used for debugging purposes only, nobody cared enough > > to fix it. Completely untested patch below if you dare to try ;) > > It may be intended for debuging purposes, but it does get used in > production HPC environments (a lot!). I guess I've never seen this > lockup before because SGI customers don't use ext3, but they have > complained about the system "stopping" while drop_caches is executed. > This locking ..... strategy would explain it, though. ;) But in case your customers use it in production, shouldn't you push for a better interface for such feature? Just a thought... > I'll try the patch, but I can't guarantee anything - I only saw this > lockup once in about 18 hours when dropping caches every 2 seconds. Well, if you enable lockdep, it should warn you about possible problems much earlier (at least we already got several reports of lockdep warnings when using drop_caches). Honza -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html