+ sched-avoid-taking-rq-lock-in-wake_priority_sleeper.patch added to -mm tree

The patch titled
     sched: avoid taking rq lock in wake_priority_sleeper
has been added to the -mm tree.  Its filename is
     sched-avoid-taking-rq-lock-in-wake_priority_sleeper.patch

See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
out what to do about this.

------------------------------------------------------
Subject: sched: avoid taking rq lock in wake_priority_sleeper
From: Christoph Lameter <clameter@xxxxxxx>

This patchset moves potentially expensive load balancing out of the scheduler
tick (where we run with interrupts disabled) into a tasklet that is triggered
from scheduler_tick() when necessary.  Load balancing then runs with
interrupts enabled.  This eliminates interrupt holdoff times and avoids a
potential machine livelock when, for example, load balancing is performed
over a large number of processors and many of the nodes are under heavy load,
which may delay cacheline fetches.  We currently have up to 1024 processors
and may go up to 4096 soon.  Similar issues were seen on a Fujitsu system in
the past.

This issue also highlights the general problem of interrupt holdoff during
scheduler load balancing.

Moving the load balancing into a tasklet also allows some cleanup in
scheduler_tick().  It becomes easier to read, and the determination of the
state for load balancing can be moved out of scheduler_tick().

scheduler_tick() processing is optimized further because we no longer check
all the sched domains on each tick.  Each load balancing run determines the
time of the next one, and scheduler_tick() checks only against this single
value.
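
As a rough userspace illustration of that idea (not the kernel code), the
sketch below keeps one "next balance" timestamp.  The struct and every
sketch_* name are hypothetical; the wraparound-safe comparison only mirrors
the spirit of the kernel's time_after().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched domain; not the kernel's struct. */
struct sketch_domain {
	unsigned long last_balance;	/* tick at which this domain was last balanced */
	unsigned long interval;		/* ticks between balance runs */
};

/* Wraparound-safe "a is later than b", in the spirit of time_after(). */
static bool sketch_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Slow path (would run from the tasklet): balance every domain that is due
 * and return the earliest tick at which any domain is due again.
 */
static unsigned long sketch_rebalance(struct sketch_domain *sd, size_t n,
				      unsigned long now)
{
	unsigned long next = now;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long due;

		if (!sketch_after(sd[i].last_balance + sd[i].interval, now)) {
			/* the actual balancing work would go here */
			sd[i].last_balance += sd[i].interval;
		}
		due = sd[i].last_balance + sd[i].interval;
		if (i == 0 || sketch_after(next, due))
			next = due;
	}
	return next;
}

/* Fast path (per tick): one comparison instead of a walk over all domains. */
static bool sketch_tick_needs_balance(unsigned long now,
				      unsigned long next_balance)
{
	return !sketch_after(next_balance, now);
}
```

The per-tick cost is a single comparison regardless of how many domains
exist; the walk over the domains happens only in the slow path.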

Another optimization is that the staggering of the individual load balance
operations is no longer performed during load balancing but is shifted to the
setup of the sched domains.
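
A minimal sketch of setup-time staggering, under loud assumptions: the
helper name and the offset formula (cpu * interval / ncpus) are made up for
illustration and are not taken from the patchset.  The point is only that
each CPU gets a different starting timestamp once, so later balance runs
need no per-run offset calculation.

```c
#include <stddef.h>

/*
 * Hypothetical helper: give each CPU a different initial last_balance so
 * that their balance runs do not all fall on the same tick.  The formula
 * is an assumption chosen for illustration.
 */
static void sketch_stagger(unsigned long *last_balance, size_t ncpus,
			   unsigned long now, unsigned long interval)
{
	size_t cpu;

	for (cpu = 0; cpu < ncpus; cpu++)
		last_balance[cpu] = now + cpu * interval / ncpus;
}
```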

For the earlier discussion see:
http://marc.theaimsgroup.com/?t=116119187800002&r=1&w=2
V1: http://marc.theaimsgroup.com/?l=linux-kernel&m=116171494001548&w=2
V2: http://marc.theaimsgroup.com/?l=linux-kernel&m=116200355408187&w=2

V1->V2:
- Keep last_balance and calculate the next balancing from that start
  point.
- Move more code into time_slice calculation and rename time_slice()
  to task_running_tick().
- Separate out the wake_priority_sleeper optimization as a first patch.

V2->V3:
- Rediff against 2.6.19-rc4-mm2
- Remove useless check for rq->idle in rebalance_domains()



This patch:


Avoid taking the runqueue lock in wake_priority_sleeper if there are no
running processes.
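
The pattern can be sketched in userspace as follows.  Everything here is a
hypothetical stand-in: the struct, the sleeper_pending flag, and a counter
that takes the place of spin_lock(&rq->lock).  The unlocked read of
nr_running is racy by design; this sketch assumes that a stale value is
tolerable because the check is repeated on a later call.

```c
/* Hypothetical stand-in for the runqueue; not the kernel's struct rq. */
struct sketch_rq {
	unsigned long nr_running;	 /* number of runnable tasks */
	unsigned long lock_acquisitions; /* counts simulated lock takes */
	int sleeper_pending;		 /* made-up flag for work needing the lock */
};

static int sketch_wake_sleeper(struct sketch_rq *rq)
{
	int ret = 0;

	/* Cheap unlocked check: nothing is running, so skip the lock. */
	if (!rq->nr_running)
		return 0;

	rq->lock_acquisitions++;	/* stands in for spin_lock(&rq->lock) */
	if (rq->sleeper_pending) {
		rq->sleeper_pending = 0;
		ret = 1;
	}
	/* spin_unlock(&rq->lock) would go here */

	return ret;
}
```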

Signed-off-by: Christoph Lameter <clameter@xxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxx>
Cc: Nick Piggin <nickpiggin@xxxxxxxxxxxx>
Cc: "Siddha, Suresh B" <suresh.b.siddha@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxx>
---

 kernel/sched.c |    3 +++
 1 files changed, 3 insertions(+)

diff -puN kernel/sched.c~sched-avoid-taking-rq-lock-in-wake_priority_sleeper kernel/sched.c
--- a/kernel/sched.c~sched-avoid-taking-rq-lock-in-wake_priority_sleeper
+++ a/kernel/sched.c
@@ -2898,6 +2898,9 @@ static inline int wake_priority_sleeper(
 	int ret = 0;
 
 #ifdef CONFIG_SCHED_SMT
+	if (!rq->nr_running)
+		return 0;
+
 	spin_lock(&rq->lock);
 	/*
 	 * If an SMT sibling task has been put to sleep for priority
_

Patches currently in -mm which might be from clameter@xxxxxxx are

create-compat_sys_migrate_pages.patch
wire-up-sys_migrate_pages.patch
memory-page-alloc-minor-cleanups.patch
memory-page-alloc-minor-cleanups-fix.patch
get-rid-of-zone_table.patch
deal-with-cases-of-zone_dma-meaning-the-first-zone.patch
get-rid-of-zone_table-fix-3.patch
introduce-config_zone_dma.patch
optional-zone_dma-in-the-vm.patch
optional-zone_dma-in-the-vm-no-gfp_dma-check-in-the-slab-if-no-config_zone_dma-is-set.patch
optional-zone_dma-in-the-vm-no-gfp_dma-check-in-the-slab-if-no-config_zone_dma-is-set-reduce-config_zone_dma-ifdefs.patch
optional-zone_dma-for-ia64.patch
remove-zone_dma-remains-from-parisc.patch
remove-zone_dma-remains-from-sh-sh64.patch
set-config_zone_dma-for-arches-with-generic_isa_dma.patch
zoneid-fix-up-calculations-for-zoneid_pgshift.patch
radix-tree-rcu-lockless-readside.patch
sched-avoid-taking-rq-lock-in-wake_priority_sleeper.patch
sched-disable-interrupts-for-locking-in-load_balance.patch
sched-extract-load-calculation-from-rebalance_tick.patch
sched-stagger-load-balancing-in-build_sched_domains.patch
sched-move-idle-stat-calculation-into-rebalance_tick.patch
sched-use-tasklet-to-call-balancing.patch
sched-call-tasklet-less-frequently.patch
zvc-support-nr_slab_reclaimable--nr_slab_unreclaimable-swap_prefetch.patch
reduce-max_nr_zones-swap_prefetch-remove-incorrect-use-of-zone_highmem.patch
numa-add-zone_to_nid-function-swap_prefetch.patch
readahead-state-based-method-aging-accounting.patch
