Nick Piggin wrote:
So assuming there is no reasonable way to do out-of-core algorithms on the filesystem metadata (and you likely don't want to anyway, because it would mean a significant slowdown or a divergence of code paths), you still only need to reserve one set of those 30-40 pages for the entire kernel. You only ever need to reserve enough memory for a *single* page to be processed. In the worst case, where multiple pages are under writeout but memory cannot be allocated, only one will be allowed access to the reserves and the others will block until it has finished and can unpin them all.
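(For illustration, here is a minimal userspace sketch of that "single writeout gets the reserve" idea, using a pthread mutex to stand in for whatever serialization the kernel would actually use. The names reserve_pool, RESERVE_PAGES, and writeout_one_page() are hypothetical, not kernel APIs.)

/*
 * Sketch only: one pre-pinned reserve for the whole system, and at most
 * one writeout allowed to dip into it at a time.  Everyone else blocks
 * until that writeout completes and the reserve is whole again.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define RESERVE_PAGES 40          /* worst-case pages for one writeout */
#define PAGE_SIZE     4096

static pthread_mutex_t reserve_lock = PTHREAD_MUTEX_INITIALIZER;
static void *reserve_pool[RESERVE_PAGES];   /* pinned up front */

static void *alloc_page(void)
{
	/* Normal allocation path; may fail under memory pressure. */
	return malloc(PAGE_SIZE);
}

static void writeout_one_page(void)
{
	void *scratch = alloc_page();

	if (!scratch) {
		/* Allocation failed: take the single shared reserve. */
		pthread_mutex_lock(&reserve_lock);
		/* ... do the writeout out of reserve_pool[] ... */
		pthread_mutex_unlock(&reserve_lock);
		return;
	}

	/* ... do the writeout with normally allocated memory ... */
	free(scratch);
}

int main(void)
{
	int i;

	/* Pin the reserve up front so it exists when allocation fails. */
	for (i = 0; i < RESERVE_PAGES; i++)
		reserve_pool[i] = malloc(PAGE_SIZE);

	writeout_one_page();
	return 0;
}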
Sure, nobody will mind seeing lots of extra pinned memory ;) Don't forget to add the space for data transforms and raid driver operations in the write stack, and whatever else we may not have thought of. With good engineering we can arrange things so that "we can always make forward progress". But it won't matter, because once a real user drives the system off this cliff there is no practical difference between "hung" and "really slow progress": they are going to crash it and report a hang.
Well, I'm not saying it is an immediate problem, or that it would be a good use of anybody's time to rush out and try to redesign their fs code to fix it ;) But at least for any new core/generic library functionality like fsblock, it would be silly not to close the hole (not least because the problem is simpler there than in a complex fs).
Hey, I appreciate anything you do in the VM to make the ugly dance with filesystems (my area) a little less ugly. I'm sure you also appreciate that every time the VM tries to save 32 bytes, someone else tries to take 32 kilobytes. As they say... memory is cheap :)

jim