Re: [PATCH 0/3] Volatile Ranges (v7) & Lots of words

On Fri, 28 Sep 2012 23:16:30 -0400 John Stultz <john.stultz@xxxxxxxxxx> wrote:

> 
> After Kernel Summit and Plumbers, I wanted to consider all the various
> side-discussions and try to summarize my current thoughts here along
> with sending out my current implementation for review.
> 
> Also: I'm going on four weeks of paternity leave in the very near
> (but non-deterministic) future. So while I hope I still have time
> for some discussion, I may have to deal with fussier complaints
> then yours. :)  In any case, you'll have more time to chew on
> the idea and come up with amazing suggestions. :)

Hi John,

 I wonder if you are trying to please everyone and risking pleasing no-one?
 Well, maybe not quite that extreme, but you can't please all the people all
 the time.

 For example, allowing sub-page volatile regions seems to be above and beyond
 the call of duty.  You cannot mmap sub-pages, so why should they be volatile?

 Similarly the suggestion of using madvise - while tempting - is probably a
 minority interest and can probably be managed with library code.  I'm glad
 you haven't pursued it.

 I think discarding whole ranges at a time is very sensible, and so merging
 adjacent ranges is best avoided.  If you require page-aligned ranges this
 becomes trivial - is that right?
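
 A minimal sketch of what "require page-aligned ranges" could look like as
 an argument check. The helper name vrange_is_page_aligned and its
 signature are hypothetical and are not taken from the patch set. It only
 illustrates that with whole-page ranges the check is two mask tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the patches: accept a range only if it
 * starts on a page boundary and covers a whole, non-zero number of pages.
 * page_size must be a power of two.
 */
static bool vrange_is_page_aligned(uintptr_t addr, size_t len, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uintptr_t mask = (uintptr_t)page_size - 1;

    /* Both the start and the length must be multiples of the page size. */
    return len != 0 && (addr & mask) == 0 && ((uintptr_t)len & mask) == 0;
}
```

 With such a check every range is a set of whole pages, so two ranges can
 never share a page and each one can be discarded as a unit.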

 I wonder if the oldest page/oldest range issue can be defined away by
 requiring apps to touch the first page in a range when they touch the range.
 Then the age of a range is the age of the first page.  Non-initial pages
 could even be kept off the free list .... though that might confuse NUMA
 page reclaim if a range had pages from different nodes.
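
 A small userspace model of that rule, assuming each range records only the
 time its first page was last touched. The struct, its fields and the
 function name are made up for illustration. Real reclaim would work from
 the kernel's page lists, not an array like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: the age of a range is the age of its first page. */
struct vrange_model {
    uint64_t first_page_touched; /* last touch time of page 0 of the range */
    size_t npages;               /* pages in the range, discarded together */
};

/*
 * Return the index of the range whose first page was touched longest ago,
 * or -1 if there are no ranges. Non-initial pages are never consulted.
 */
static ptrdiff_t oldest_range(const struct vrange_model *r, size_t n)
{
    if (n == 0)
        return -1;

    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (r[i].first_page_touched < r[best].first_page_touched)
            best = i;
    }
    return (ptrdiff_t)best;
}
```

 The point of the model is that picking a victim needs one timestamp per
 range, regardless of how many pages the range holds.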


 Application to non-tmpfs files seems very unclear and so probably best
 avoided.
 If I understand you correctly, then you have suggested both that a volatile
 range would be a "lazy hole punch" and a "don't let this get written to disk
 yet" flag.  It cannot really be both.  The former sounds like fallocate,
 the latter like fadvise.
 I think the latter sounds more like the general purpose of volatile ranges,
 but I also suspect that some journalling filesystems might be uncomfortable
 providing a guarantee like that.  So I would suggest firmly stating that it
 is a tmpfs-only feature.  If someone wants something vaguely similar for
 other filesystems, let them implement it separately.


 The SIGBUS interface could have some merit if it really reduces overhead.  I
 worry about app bugs that could result from the non-deterministic
 behaviour.   A range could get unmapped while it is in use and testing for
 the case of "get a SIGBUS halfway through accessing something" would not
 be straightforward (SIGBUS on first step of access should be easy).
 I guess that is up to the app writer, but I have never liked anything about
 the signal interface and encouraging further use doesn't feel wise.
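
 To make the worry concrete, here is a sketch of how an app might guard an
 access with sigsetjmp()/siglongjmp(). It is one common pattern, not the
 interface from the patches. All vr_* names are hypothetical, and the
 "fault" is simulated with raise(SIGBUS) after one step of work, standing
 in for a page being purged mid-access.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

static sigjmp_buf vr_jmp;
static volatile sig_atomic_t vr_armed;

/* SIGBUS handler: unwind to vr_try_access() if a guarded access is active. */
static void vr_sigbus(int sig)
{
    (void)sig;
    if (vr_armed)
        siglongjmp(vr_jmp, 1);
    /* Not a guarded access: fall back to the default action. */
    signal(SIGBUS, SIG_DFL);
    raise(SIGBUS);
}

/* Run access(arg); return false if it was cut short by SIGBUS. */
static bool vr_try_access(void (*access)(void *), void *arg)
{
    struct sigaction sa, old;
    bool completed;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = vr_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return false;

    if (sigsetjmp(vr_jmp, 1) == 0) {
        vr_armed = 1;
        access(arg);
        completed = true;
    } else {
        completed = false;   /* arrived here via siglongjmp() */
    }

    vr_armed = 0;
    sigaction(SIGBUS, &old, NULL);
    return completed;
}

/* Stand-in for an access that finishes: performs one step of work. */
static void vr_demo_ok(void *arg)
{
    ++*(int *)arg;
}

/* Stand-in for an access that faults after one step of work. */
static void vr_demo_fault(void *arg)
{
    ++*(int *)arg;
    raise(SIGBUS);           /* simulates touching a purged page */
    ++*(int *)arg;           /* never reached */
}
```

 Note that after a failed vr_try_access() the caller is left with whatever
 partial state the access produced (one step done in vr_demo_fault), and
 that partial state is exactly the case that is hard to test for.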

 That's my 2c worth for now.  Keep up the good work,

NeilBrown
