Re: [ATTEND] many topics

On Thu 26-01-17 10:19:31, NeilBrown wrote:
> On Wed, Jan 25 2017, Vlastimil Babka wrote:
> 
> > On 01/23/2017 08:34 PM, NeilBrown wrote:
> >> On Tue, Jan 24 2017, Theodore Ts'o wrote:
> >>
> >>> On Sun, Jan 22, 2017 at 10:05:44PM -0800, Matthew Wilcox wrote:
> >>>>
> >>>> I don't have a clear picture in my mind of when Java promotes objects
> >>>> from nursery to tenure
> >>>
> >>> It's typically on the order of minutes.   :-)
> >>>
> >>>> ... which is not too different from my lack of
> >>>> understanding of what the MM layer considers "temporary" :-)  Is it
> >>>> acceptable usage to allocate a SCSI command (guaranteed to be freed
> >>>> within 30 seconds) from the temporary area?  Or should it only be used
> >>>> for allocations where the thread of control is not going to sleep between
> >>>> allocation and freeing?
> >>>
> >>> What the mm folks have said is that it's to prevent fragmentation.  If
> >>> that's the optimization, whether or not the process that is allocating
> >>> the memory sleeps for a few hundred milliseconds, or even seconds, is
> >>> really in the noise compared with the average lifetime of an inode in
> >>> the inode cache, or a page in the page cache....
> >>>
> >>> Why do you think it matters whether or not we sleep?  I've not heard
> >>> any explanation for the assumption for why this might be important.
> >>
> >> Because "TEMPORARY" implies a limit to the amount of time, and sleeping
> >> is the thing that causes a process to take a large amount of time.  It
> >> seems like an obvious connection to me.
> >
> > There's no simple connection to time, it depends on the larger picture - what's 
> > the state of the allocator and what other allocations/frees are happening 
> > around this one. Perhaps let me try to explain what the flag does and what 
> > benefits are expected.
> 
> If there is no simple connection to time, then I would discourage use of
> the word "TEMPORARY" as that has a strong connection with the concept of time.
> 
> >
> > GFP_TEMPORARY, compared to GFP_KERNEL, adds __GFP_RECLAIMABLE, which tries to 
> > place the allocation within MIGRATE_RECLAIMABLE pageblocks - GFP_KERNEL implies 
> > MIGRATE_UNMOVABLE pageblocks, and userspace allocations are typically 
> > MIGRATE_MOVABLE. The main goal of this "mobility grouping" is to prevent the 
> > unmovable pages spreading all over the memory, making it impossible to get 
> > larger blocks by defragmentation (compaction). Ideally we would have all these 
> > problematic pages fit neatly into the smallest possible number of pageblocks 
> > that can accommodate them. But we can't know in advance how many, and we don't 
> > know their lifetimes, so there are various heuristics for relabeling pageblocks 
> > between the 3 types as we exceed the existing ones.
> >
> > Now GFP_TEMPORARY means we tell the allocator about the relatively shorter 
> > lifetime, so it places the allocation within the RECLAIMABLE pageblocks, which 
> > are also used for slab caches that have shrinkers. The expected benefit of this 
> > is that we potentially prevent growing the number of UNMOVABLE pageblocks 
> > (either directly by this allocation, or a subsequent GFP_KERNEL one, that would 
> > otherwise fit within the existing pageblocks). While the RECLAIMABLE pages also 
> > cannot be defragmented (at least currently, there are some proposals for the 
> > slab caches...), we can at least shrink them, so the negative impact on 
> > compaction is considered less severe in the longer term.
> 
> Hmmm...  this seems like a fuzzy heuristic.
> I can use GFP_TEMPORARY as long as I'll free the memory eventually, or
> there is some way for you to ask me to free the memory, though I don't
> have to succeed - ever.

I guess this was the original motivation. If you look at current users
then the pattern seems to be
	object = alloc(GFP_TEMPORARY);
	do_something_that_terminates_shortly();
	free(object);

Another pattern is
	cache = kmem_cache_create(..., SLAB_RECLAIM_ACCOUNT, ...);
	[...]
	object = kmem_cache_alloc(cache, GFP_KERNEL);

so the latter one is an implicit GFP_TEMPORARY.
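
To make the two patterns above concrete, here is a small userspace sketch
in C of how either route ends up with the same placement hint. It is only
a model: every MODEL_/model_ name and bit value below is made up for
illustration and is not the kernel's definition. The only things taken
from this thread are that GFP_TEMPORARY is GFP_KERNEL plus
__GFP_RECLAIMABLE, and that a SLAB_RECLAIM_ACCOUNT cache gets the
reclaimable placement implicitly.

```c
#include <stdbool.h>

/* Illustrative model only: names and bit values are hypothetical,
 * not copied from any kernel header. */
enum model_migratetype {
	MODEL_UNMOVABLE,
	MODEL_MOVABLE,
	MODEL_RECLAIMABLE,
};

#define MODEL_GFP_MOVABLE	0x1u
#define MODEL_GFP_RECLAIMABLE	0x2u
#define MODEL_GFP_KERNEL	0x0u	/* carries no mobility hint */
#define MODEL_GFP_TEMPORARY	(MODEL_GFP_KERNEL | MODEL_GFP_RECLAIMABLE)

/* Stand-in for a slab cache; reclaim_account models a cache that was
 * created with a reclaim-accounting flag. */
struct model_cache {
	bool reclaim_account;
};

/* Pick the pageblock type from the mobility hint in the flags. */
static enum model_migratetype model_migratetype_of(unsigned int gfp)
{
	if (gfp & MODEL_GFP_RECLAIMABLE)
		return MODEL_RECLAIMABLE;
	if (gfp & MODEL_GFP_MOVABLE)
		return MODEL_MOVABLE;
	return MODEL_UNMOVABLE;
}

/* A reclaim-accounted cache adds the reclaimable hint on behalf of the
 * caller, so a plain "kernel" allocation is placed as reclaimable. */
static enum model_migratetype
model_cache_alloc_type(const struct model_cache *c, unsigned int gfp)
{
	if (c->reclaim_account)
		gfp |= MODEL_GFP_RECLAIMABLE;
	return model_migratetype_of(gfp);
}
```

In this model, model_migratetype_of(MODEL_GFP_TEMPORARY) and a
MODEL_GFP_KERNEL allocation from a cache with reclaim_account set both
yield MODEL_RECLAIMABLE, while a plain MODEL_GFP_KERNEL allocation yields
MODEL_UNMOVABLE.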

I completely agree that GFP_TEMPORARY is confusing and needs much
better documentation.

> If this heuristic actually works, and reduces fragmentation, then I
> suspect it is more luck than good management.  You have maybe added
> GFP_TEMPORARY in a few places which fit with your understanding of what
> you want and which don't ruin the outcomes in your tests.  But without a
> strong definition of when it can and cannot be used, it seems quite
> likely that someone else will start using it in a way that fits within
> your vague statement of requirements, but actually results in much more
> fragmentation.

After more thinking about this I completely agree. And it wouldn't
be the first time this has happened. I actually think that
we should simply remove GFP_TEMPORARY. I seriously doubt those few
users would change anything wrt. memory fragmentation.
SLAB_RECLAIM_ACCOUNT resp. __GFP_RECLAIMABLE make perfect sense, but
the explicit usage of GFP_TEMPORARY without any contract is just asking
for problems.
 
> i.e. I think this is a fragile heuristic and not a long term solution
> for anything.

Agreed!

> I think it would be better if we could discard the idea of "reclaimable"
> and just stick with "movable" and "unmovable".  Lots of things are not
> movable at present, but could be made movable with relatively little
> effort.  Once the interfaces are in place to allow arbitrary kernel code
> to find out when things should be moved, I suspect that a lot of
> allocations could become movable.

I believe we need both. There will be many objects which are hard to
make movable, yet they are reclaimable, which can help to reduce
fragmentation in the long term.
-- 
Michal Hocko
SUSE Labs