Hi,

If "/" is on a multipath device, a temporary failure of all paths
can stall the system, because multipathd depends on callout programs
that live on "/".
I would like to hear your comments on my idea for fixing this.

For example, the script below stalls the system in the following
environment:
  o "/" is on a multipath device
  o 'no_path_retry' is set to 'queue'
  o a priority callout is in use
    (If your storage doesn't have a priority callout, using
     "/bin/echo 1" should be fine for testing.)

-----------------------------------------------------------------
#!/bin/sh

# specify all paths for your root filesystem
paths="sdd sdg"

# repeatedly take every path offline, then bring them all back
while true; do
    for dev in $paths; do
        echo offline > "/sys/block/${dev}/device/state"
    done
    for dev in $paths; do
        echo running > "/sys/block/${dev}/device/state"
    done
done
-----------------------------------------------------------------

The stall happens because the path checker thread blocks while
executing the priority callout, so the revived paths are never
reinstated. While all paths are down, I/O to "/" is queued, and
loading the callout program from "/" has to wait for that I/O.

To fix it, my proposal is to build all priority callouts into
multipathd as library functions, as is already done for path
checkers. (The ability to use external priority callouts would be
kept as an option.)

The proposal doesn't help if the target device of a failed path is
deleted and then added again when the path comes back, because
getuid callouts are used for path addition. However, deletion of the
target device can be controlled by the "dev_loss_tmo" parameter of
the transport layer. Also, the source code of the getuid callouts is
outside of multipath-tools. So I think building in only the priority
callouts is enough for now.

Ideally, multipathd should neither perform file I/O nor allocate
memory once it has started. I think the proposal above is a first
step toward that ideal multipathd.

What do you think about it?

Thanks,
Kiyoshi Ueda

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel