Re: [RFC] mm/vmscan.c: avoid possible long latency caused by too_many_isolated()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Hillf,

On 4/30/2021 2:43 PM, Hillf Danton wrote:
On Fri, 30 Apr 2021 13:33:57 +0800 Xing Zhengjun wrote:

I use my compaction test case to test it, 1/10 ratio can reproduce 100ms
sleep.

  60) @ 103942.6 us |      shrink_node();

  60) @ 103795.8 us |      shrink_node();

Thanks for your test.

In bid to cut the number of 100ms sleepers further down, add another place
for them to nap by flushing lru cache before falling in sleep, instead of
mulling why 50ms or 10ms is more adequate.

Alternatively, and simpler IMHO, take a 5ms nap one time until !tmi.

--- y/mm/vmscan.c
+++ x/mm/vmscan.c
@@ -118,6 +118,9 @@ struct scan_control {
  	/* The file pages on the current node are dangerously low */
  	unsigned int file_is_tiny:1;
+ unsigned int file_tmi:1; /* too many isolated */
+	unsigned int anon_tmi:1;
+
  	/* Allocation order */
  	s8 order;
@@ -2092,6 +2095,22 @@ static int current_may_throttle(void)
  		bdi_write_congested(current->backing_dev_info);
  }
+static void set_sc_tmi(struct scan_control *sc, bool file, int tmi)
+{
+	if (file)
+		sc->file_tmi = tmi;
+	else
+		sc->anon_tmi = tmi;
+}
+
+static bool is_sc_tmi(struct scan_control *sc, bool file)
+{
+	if (file)
+		return sc->file_tmi != 0;
+	else
+		return sc->anon_tmi != 0;
+}
+
  /*
   * shrink_inactive_list() is a helper for shrink_node().  It returns the number
   * of reclaimed pages
@@ -2109,11 +2128,23 @@ shrink_inactive_list(unsigned long nr_to
  	enum vm_event_item item;
  	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
  	bool stalled = false;
+	bool drained = false;
while (unlikely(too_many_isolated(pgdat, file, sc))) {
  		if (stalled)
  			return 0;
+ if (!is_sc_tmi(sc, file)) {
+			set_sc_tmi(sc, file, 1);
+			return 0;
+		}
+
+		if (!drained) {
+			drained = true;
+			lru_add_drain_all();
+			continue;
+		}
+
  		/* wait a bit for the reclaimer. */
  		msleep(100);
  		stalled = true;
@@ -2123,6 +2154,9 @@ shrink_inactive_list(unsigned long nr_to
  			return SWAP_CLUSTER_MAX;
  	}
+ if (is_sc_tmi(sc, file))
+		set_sc_tmi(sc, file, 0);
+
  	lru_add_drain();
spin_lock_irq(&lruvec->lru_lock);


I tried the patch, it still can reproduce the 100ms sleep.

52) @ 103829.8 us |      shrink_lruvec();

--
Zhengjun Xing




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux