On Wed, Apr 26, 2017 at 08:42:10PM +0800, Huang, Ying wrote: > Minchan Kim <minchan@xxxxxxxxxx> writes: > > > On Fri, Apr 21, 2017 at 08:29:30PM +0800, Huang, Ying wrote: > >> "Huang, Ying" <ying.huang@xxxxxxxxx> writes: > >> > >> > Minchan Kim <minchan@xxxxxxxxxx> writes: > >> > > >> >> On Wed, Apr 19, 2017 at 04:14:43PM +0800, Huang, Ying wrote: > >> >>> Minchan Kim <minchan@xxxxxxxxxx> writes: > >> >>> > >> >>> > Hi Huang, > >> >>> > > >> >>> > On Fri, Apr 07, 2017 at 02:49:01PM +0800, Huang, Ying wrote: > >> >>> >> From: Huang Ying <ying.huang@xxxxxxxxx> > >> >>> >> > >> >>> >> void swapcache_free_entries(swp_entry_t *entries, int n) > >> >>> >> { > >> >>> >> struct swap_info_struct *p, *prev; > >> >>> >> @@ -1075,6 +1083,10 @@ void swapcache_free_entries(swp_entry_t *entries, int n) > >> >>> >> > >> >>> >> prev = NULL; > >> >>> >> p = NULL; > >> >>> >> + > >> >>> >> + /* Sort swap entries by swap device, so each lock is only taken once. */ > >> >>> >> + if (nr_swapfiles > 1) > >> >>> >> + sort(entries, n, sizeof(entries[0]), swp_entry_cmp, NULL); > >> >>> > > >> >>> > Let's think on other cases. > >> >>> > > >> >>> > There are two swaps and they are configured by priority so a swap's usage > >> >>> > would be zero unless other swap used up. In case of that, this sorting > >> >>> > is pointless. > >> >>> > > >> >>> > As well, nr_swapfiles is never decreased so if we enable multiple > >> >>> > swaps and then disable until a swap is remained, this sorting is > >> >>> > pointelss, too. > >> >>> > > >> >>> > How about lazy sorting approach? IOW, if we found prev != p and, > >> >>> > then we can sort it. > >> >>> > >> >>> Yes. That should be better. I just don't know whether the added > >> >>> complexity is necessary, given the array is short and sort is fast. > >> >> > >> >> Huh? > >> >> > >> >> 1. swapon /dev/XXX1 > >> >> 2. swapon /dev/XXX2 > >> >> 3. swapoff /dev/XXX2 > >> >> 4. use only one swap > >> >> 5. then, always pointless sort. > >> > > >> > Yes. In this situation we will do unnecessary sorting. What I don't > >> > know is whether the unnecessary sorting will hurt performance in real > >> > life. I can do some measurement. > >> > >> I tested the patch with 1 swap device and 1 process to eat memory > >> (remove the "if (nr_swapfiles > 1)" for test). I think this is the > >> worse case because there is no lock contention. The memory freeing time > >> increased from 1.94s to 2.12s (increase ~9.2%). So there is some > >> overhead for some cases. I change the algorithm to something like > >> below, > >> > >> void swapcache_free_entries(swp_entry_t *entries, int n) > >> { > >> struct swap_info_struct *p, *prev; > >> int i; > >> + swp_entry_t entry; > >> + unsigned int prev_swp_type; > >> > >> if (n <= 0) > >> return; > >> > >> + prev_swp_type = swp_type(entries[0]); > >> + for (i = n - 1; i > 0; i--) { > >> + if (swp_type(entries[i]) != prev_swp_type) > >> + break; > >> + } > > > > That's really what I want to avoid. For many swap usecases, > > it adds unnecessary overhead. > > > >> + > >> + /* Sort swap entries by swap device, so each lock is only taken once. */ > >> + if (i) > >> + sort(entries, n, sizeof(entries[0]), swp_entry_cmp, NULL); > >> prev = NULL; > >> p = NULL; > >> for (i = 0; i < n; ++i) { > >> - p = swap_info_get_cont(entries[i], prev); > >> + entry = entries[i]; > >> + p = swap_info_get_cont(entry, prev); > >> if (p) > >> - swap_entry_free(p, entries[i]); > >> + swap_entry_free(p, entry); > >> prev = p; > >> } > >> if (p) > >> > >> With this patch, the memory freeing time increased from 1.94s to 1.97s. > >> I think this is good enough. Do you think so? > > > > What I mean is as follows(I didn't test it at all): > > > > With this, sort entries if we found multiple entries in current > > entries. It adds some condition checks for non-multiple swap > > usecase but it would be more cheaper than the sorting. > > And it adds a [un]lock overhead for multiple swap usecase but > > it should be a compromise for single-swap usecase which is more > > popular. > > > > How about the following solution? It can avoid [un]lock overhead and > double lock issue for multiple swap user case and has good performance > for one swap user case too. How worse with approach I suggested compared to as-is? Unless it's too bad, let's not add more complicated thing to just enhance the minor usecase in such even *slow* path. It adds code size/maintainance overead. With your suggestion, it might enhance a bit with speicific benchmark but not sure it's really worth for real practice. > > Best Regards, > Huang, Ying > > From 7bd903c42749c448ef6acbbdee8dcbc1c5b498b9 Mon Sep 17 00:00:00 2001 > From: Huang Ying <ying.huang@xxxxxxxxx> > Date: Thu, 23 Feb 2017 13:05:20 +0800 > Subject: [PATCH -v5] mm, swap: Sort swap entries before free > > To reduce the lock contention of swap_info_struct->lock when freeing > swap entry. The freed swap entries will be collected in a per-CPU > buffer firstly, and be really freed later in batch. During the batch > freeing, if the consecutive swap entries in the per-CPU buffer belongs > to same swap device, the swap_info_struct->lock needs to be > acquired/released only once, so that the lock contention could be > reduced greatly. But if there are multiple swap devices, it is > possible that the lock may be unnecessarily released/acquired because > the swap entries belong to the same swap device are non-consecutive in > the per-CPU buffer. > > To solve the issue, the per-CPU buffer is sorted according to the swap > device before freeing the swap entries. Test shows that the time > spent by swapcache_free_entries() could be reduced after the patch. > > With the patch, the memory (some swapped out) free time reduced > 13.6% (from 2.59s to 2.28s) in the vm-scalability swap-w-rand test > case with 16 processes. The test is done on a Xeon E5 v3 system. The > swap device used is a RAM simulated PMEM (persistent memory) device. > To test swapping, the test case creates 16 processes, which allocate > and write to the anonymous pages until the RAM and part of the swap > device is used up, finally the memory (some swapped out) is freed > before exit. > > Signed-off-by: Huang Ying <ying.huang@xxxxxxxxx> > Acked-by: Tim Chen <tim.c.chen@xxxxxxxxx> > Cc: Hugh Dickins <hughd@xxxxxxxxxx> > Cc: Shaohua Li <shli@xxxxxxxxxx> > Cc: Minchan Kim <minchan@xxxxxxxxxx> > Cc: Rik van Riel <riel@xxxxxxxxxx> > > v5: > > - Use a smarter way to determine whether sort is necessary. > > v4: > > - Avoid unnecessary sort if all entries are from one swap device. > > v3: > > - Add some comments in code per Rik's suggestion. > > v2: > > - Avoid sort swap entries if there is only one swap device. > --- > mm/swapfile.c | 43 ++++++++++++++++++++++++++++++++++++++----- > 1 file changed, 38 insertions(+), 5 deletions(-) > > diff --git a/mm/swapfile.c b/mm/swapfile.c > index 71890061f653..10e75f9e8ac1 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -37,6 +37,7 @@ > #include <linux/swapfile.h> > #include <linux/export.h> > #include <linux/swap_slots.h> > +#include <linux/sort.h> > > #include <asm/pgtable.h> > #include <asm/tlbflush.h> > @@ -1065,20 +1066,52 @@ void swapcache_free(swp_entry_t entry) > } > } > > +static int swp_entry_cmp(const void *ent1, const void *ent2) > +{ > + const swp_entry_t *e1 = ent1, *e2 = ent2; > + > + return (int)(swp_type(*e1) - swp_type(*e2)); > +} > + > void swapcache_free_entries(swp_entry_t *entries, int n) > { > struct swap_info_struct *p, *prev; > - int i; > + int i, m; > + swp_entry_t entry; > + unsigned int prev_swp_type; > > if (n <= 0) > return; > > prev = NULL; > p = NULL; > - for (i = 0; i < n; ++i) { > - p = swap_info_get_cont(entries[i], prev); > - if (p) > - swap_entry_free(p, entries[i]); > + m = 0; > + prev_swp_type = swp_type(entries[0]); > + for (i = 0; i < n; i++) { > + entry = entries[i]; > + if (likely(swp_type(entry) == prev_swp_type)) { > + p = swap_info_get_cont(entry, prev); > + if (likely(p)) > + swap_entry_free(p, entry); > + prev = p; > + } else if (!m) > + m = i; > + } > + if (p) > + spin_unlock(&p->lock); > + if (likely(!m)) > + return; > + > + /* Sort swap entries by swap device, so each lock is only taken once. */ > + sort(entries + m, n - m, sizeof(entries[0]), swp_entry_cmp, NULL); > + prev = NULL; > + for (i = m; i < n; i++) { > + entry = entries[i]; > + if (swp_type(entry) == prev_swp_type) > + continue; > + p = swap_info_get_cont(entry, prev); > + if (likely(p)) > + swap_entry_free(p, entry); > prev = p; > } > if (p) > -- > 2.11.0 > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>