On Wed, 2010-08-25 at 16:53 -0400, Ted Ts'o wrote:
> On Wed, Aug 25, 2010 at 03:35:42PM +0200, Peter Zijlstra wrote:
> >
> > While I appreciate that it might be somewhat (a lot) harder for a
> > filesystem to provide that guarantee, I'd be deeply worried about your
> > claim that it's impossible.
> >
> > It would render a system without swap very prone to deadlocks. Even with
> > the very tight dirty page accounting we currently have you can fill all
> > your memory with anonymous pages, at which point there's nothing free
> > and you require writeout of dirty pages to succeed.
>
> For file systems that do delayed allocation, the situation is very
> similar to swapping over NFS.  Sometimes in order to make some free
> memory, you need to spend some free memory...

Which means you need to be able to compute a bound on the amount of
that memory.

> which implies that for
> these file systems, being more aggressive about triggering writeout,
> and being more aggressive about throttling processes which are
> creating too many dirty pages, especially dirty delayed allocation
> pages (regardless of whether this is via write(2) or accessing mmapped
> memory), is a really good idea.

That seems unrelated; the VM has a strict dirty limit and controls
writeback when needed. That part works.

> A pool of free pages which is reserved for routines that are doing
> page cleaning would probably also be a good idea.  Maybe that's just
> retrying with GFP_ATOMIC if a normal allocation fails, or maybe we
> need our own special pool, or maybe we need to dynamically resize the
> GFP_ATOMIC pool based on how many subsystems might need to use it....

We have a smallish reserve, accessible with PF_MEMALLOC, but its use is
neither regulated nor bounded; it just mostly works well enough.
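
The point about needing a bounded amount of memory can be illustrated
with a toy user-space model. This is only a sketch and not kernel code:
the ReservePool class, its method names, and the page counts are all
hypothetical, invented for illustration. It assumes each writeout path
can state a worst-case page count up front, which is exactly the
property the thread says is hard for delayed-allocation filesystems.

```python
class ReservePool:
    """Toy model of a fixed emergency reserve with up-front worst-case claims.

    Hypothetical illustration only; this does not mirror any kernel API.
    """

    def __init__(self, total_pages):
        self.total_pages = total_pages
        self.claimed = 0  # pages promised to in-flight writeout operations

    def try_claim(self, worst_case_pages):
        """Admit a writeout only if its worst-case need fits in what is left."""
        if worst_case_pages < 0:
            raise ValueError("worst_case_pages must be non-negative")
        if self.claimed + worst_case_pages > self.total_pages:
            return False  # caller must wait; admitting it could exhaust the reserve
        self.claimed += worst_case_pages
        return True

    def release(self, worst_case_pages):
        """Return a claim once the writeout has completed."""
        if worst_case_pages > self.claimed:
            raise ValueError("releasing more than was claimed")
        self.claimed -= worst_case_pages


pool = ReservePool(total_pages=8)
first = pool.try_claim(5)    # fits: 5 of 8 pages are now promised
second = pool.try_claim(4)   # refused: 5 + 4 would exceed the reserve
pool.release(5)              # first writeout finished, its claim is returned
third = pool.try_claim(4)    # fits now
```

Because every admitted operation has declared its bound, the sum of
claims never exceeds the reserve, so an admitted writeout cannot be
starved by another one. An unregulated reserve offers no such guarantee.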