Re: [PATCH 6/6] mm: use atomic bit operations in set_pageblock_flags_group()

On Fri, Feb 28, 2014 at 03:15:04PM +0100, Vlastimil Babka wrote:
> set_pageblock_flags_group() is used to set either migratetype or skip bit of a
> pageblock. Setting migratetype is done under zone->lock (except from __init
> code), however changing the skip bits is not protected and the pageblock flags
> bitmap packs migratetype and skip bits together and uses non-atomic bit ops.
> Therefore, races between setting migratetype and skip bit are possible and the
> non-atomic read-modify-update of the skip bit may cause lost updates to
> migratetype bits, resulting in invalid migratetype values, which are in turn
> used to e.g. index free_list array.
> 
> The race has been observed to happen and cause panics, albeit during
> development of series that increases frequency of migratetype changes through
> {start,undo}_isolate_page_range() calls.
> 
> Two possible solutions were investigated: 1) using zone->lock for changing
> pageblock_skip bit and 2) changing the bitmap operations to be atomic. The
> problem of 1) is that zone->lock is already contended and almost never held in
> the compaction code that updates pageblock_skip bits. Solution 2) should scale
> better, but also adds atomic operations to migratetype changes, which are already
> protected by zone->lock.
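
To make the lost update concrete, here is a minimal userspace sketch using C11 atomics. It is not the kernel code: the bit layout (three migratetype bits plus one skip bit in the same word) and all function names are assumptions made for illustration. The non-atomic path is split into a load half and a store half so that another writer can be placed between them, which is the interleaving the changelog describes. The atomic path retries with compare-and-exchange, roughly what solution 2) would need for a multi-bit field.

```c
#include <stdatomic.h>

/* Hypothetical layout: bits 0-2 hold the migratetype, bit 3 is the skip bit. */
#define MT_MASK  0x7UL
#define SKIP_BIT 0x8UL

/* First half of a non-atomic read-modify-write: read the whole word. */
static unsigned long racy_load(const unsigned long *word)
{
	return *word;
}

/* Second half: write back the (possibly stale) word with the skip bit set.
 * If the migratetype changed after racy_load(), that change is overwritten. */
static void racy_store_skip(unsigned long *word, unsigned long stale)
{
	*word = stale | SKIP_BIT;
}

/* Atomic variant: retry until the word did not change underneath us. */
static void atomic_set_skip(_Atomic unsigned long *word)
{
	unsigned long old = atomic_load(word);

	/* On failure, 'old' is refreshed with the current value; just retry. */
	while (!atomic_compare_exchange_weak(word, &old, old | SKIP_BIT))
		;
}

/* Atomic update of the multi-bit migratetype field, same retry scheme. */
static void atomic_set_migratetype(_Atomic unsigned long *word, unsigned long mt)
{
	unsigned long old = atomic_load(word);

	while (!atomic_compare_exchange_weak(word, &old,
					     (old & ~MT_MASK) | (mt & MT_MASK)))
		;
}
```

If a migratetype change lands between racy_load() and racy_store_skip(), the word ends up with the old migratetype again. With the compare-and-exchange variants, neither update can overwrite the other.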

How about 3) introducing a new bitmap for pageblock_skip?
I guess the migratetype bitmap is read-intensive, so setting and clearing
pageblock_skip in the same bitmap could degrade performance.
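
A rough userspace sketch of what option 3) could look like, under stated assumptions: the struct, the sizes (128 pageblocks, 4 migratetype bits per block) and every function name below are hypothetical and not taken from the kernel. The point is only that skip updates never write to the words holding migratetype bits, so they cannot clobber them. Skip bits of neighbouring blocks still share a word, so the sketch keeps those updates atomic.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_BLOCKS     128UL	/* hypothetical pageblocks per zone */
#define MT_BITS       4UL	/* hypothetical bits per migratetype */
#define MT_MASK       ((1UL << MT_BITS) - 1)
#define WORDS(nbits)  (((nbits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct zone_sketch {
	/* Written only with the (imagined) zone lock held. */
	unsigned long mt_bitmap[WORDS(NR_BLOCKS * MT_BITS)];
	/* Written without the lock, hence atomic bit operations. */
	_Atomic unsigned long skip_bitmap[WORDS(NR_BLOCKS)];
};

/* Caller is assumed to hold the zone lock; plain read-modify-write is
 * safe here because no skip writer touches mt_bitmap. */
static void set_migratetype(struct zone_sketch *z, size_t block, unsigned long mt)
{
	size_t bit = block * MT_BITS;
	unsigned long *word = &z->mt_bitmap[bit / BITS_PER_WORD];
	size_t shift = bit % BITS_PER_WORD;

	*word = (*word & ~(MT_MASK << shift)) | ((mt & MT_MASK) << shift);
}

static unsigned long get_migratetype(const struct zone_sketch *z, size_t block)
{
	size_t bit = block * MT_BITS;

	return (z->mt_bitmap[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & MT_MASK;
}

static void set_skip(struct zone_sketch *z, size_t block)
{
	atomic_fetch_or(&z->skip_bitmap[block / BITS_PER_WORD],
			1UL << (block % BITS_PER_WORD));
}

static void clear_skip(struct zone_sketch *z, size_t block)
{
	atomic_fetch_and(&z->skip_bitmap[block / BITS_PER_WORD],
			 ~(1UL << (block % BITS_PER_WORD)));
}

static bool test_skip(struct zone_sketch *z, size_t block)
{
	return atomic_load(&z->skip_bitmap[block / BITS_PER_WORD]) &
	       (1UL << (block % BITS_PER_WORD));
}
```

The cost is a second lookup and a little extra memory per zone; whether that beats atomics on the shared bitmap would have to be measured.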

> 
> Using mmtests' stress-highalloc benchmark, little difference was found between
> the two solutions. The base is 3.13 with recent compaction series by myself and
> Joonsoo Kim applied.
> 
>                 3.13        3.13        3.13
>                 base     2)atomic     1)lock
> User         6103.92     6072.09     6178.79
> System       1039.68     1033.96     1042.92
> Elapsed      2114.27     2090.20     2110.23
> 

I really wonder why 2) comes out better than base, even if the difference is small.
Is this the average of 10 runs? Do you have any idea what is happening?

Thanks.
