Re: RFC for multipath queue_if_no_path timeout.

Alasdair G Kergon <agk@xxxxxxxxxx> · Fri, 27 Sep 2013 09:37:42 +0100

But this still dodges the fundamental problem:

  What is the right value to use for the timeout?
  - How long should you wait for a path to (re)appear?
    - In the current model, reinstating a path is a userspace 
      responsibility.

The timeout, as proposed, is being used in two conflicting ways:
  - How long to wait for path recovery when all paths have gone down
  - How long to wait when the system locks up without enough free
    memory even to reinstate a path (because of broken userspace
    code) before having multipath fail queued I/O in a desperate
    attempt at releasing memory to assist recovery

The second case calls for a very short timeout.
The first case probably wants a longer one.

In my view the correct approach for the case Frank is discussing is to
use a different trigger to detect that the system is locking up (or is
about to).  For example, should whatever handles an out-of-memory
condition have a hook that instructs multipath to release such queued
I/O?
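
To make the distinction concrete, here is a rough userspace toy model in
Python, not kernel code and not a proposed patch.  Every name in it
(QueuedMultipath, tick, on_memory_pressure, and so on) is invented for
illustration; no such hook exists in dm-multipath as far as this thread
establishes.  It only shows the two triggers kept separate: a long timer
for path recovery, and an independent memory-pressure event that fails
queued I/O at once.

```python
from collections import deque


class QueuedMultipath:
    """Toy model of a multipath device with queue_if_no_path set.

    All names here are hypothetical; this is a simulation, not an API.
    """

    def __init__(self, path_recovery_timeout):
        # Long timeout: how long to wait for a path to (re)appear.
        self.path_recovery_timeout = path_recovery_timeout
        self.paths_up = True
        self.all_down_since = None
        self.queued = deque()
        self.completed = []
        self.failed = []

    def all_paths_down(self, now):
        self.paths_up = False
        self.all_down_since = now

    def reinstate_path(self):
        # In the current model this is driven from userspace.
        self.paths_up = True
        self.all_down_since = None
        while self.queued:
            self.completed.append(self.queued.popleft())

    def submit(self, io):
        if self.paths_up:
            self.completed.append(io)
        else:
            self.queued.append(io)

    def fail_queued(self, reason):
        """Fail everything currently queued; return how many were failed."""
        count = len(self.queued)
        while self.queued:
            self.failed.append((self.queued.popleft(), reason))
        return count

    def tick(self, now):
        """Trigger 1: the (long) path recovery timer."""
        if (not self.paths_up
                and now - self.all_down_since >= self.path_recovery_timeout):
            return self.fail_queued("path recovery timeout")
        return 0

    def on_memory_pressure(self):
        """Trigger 2: hypothetical hook called from out-of-memory handling."""
        if not self.paths_up:
            return self.fail_queued("memory pressure")
        return 0


if __name__ == "__main__":
    mp = QueuedMultipath(path_recovery_timeout=300)
    mp.all_paths_down(now=0)
    mp.submit("write-1")
    mp.submit("write-2")
    print(mp.tick(now=10))          # timer has not expired yet
    print(mp.on_memory_pressure())  # released without waiting for the timer
```

With the triggers split like this, the recovery timer can stay long
without delaying the release of queued I/O when memory runs out.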

Alasdair

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel