[dm-devel] MPIO + offlined devices

Lars Marowsky-Bree <lmb@xxxxxxx> · Thu, 30 Jun 2005 16:42:02 +0200

Summary:

The SCSI stack might set devices to "offline" during certain error
recovery code paths; offline devices fail _all_ IO immediately. Such a
device then has to be explicitly set online again before even the path
checkers can test it.

We've investigated hacks to make the kernel not offline devices for
"FAILFAST" IO. However, this approach has proven insufficient, because
the SCSI error handling stack is somewhat complex to decipher. ;-)

As the "special handling" consists of

	echo 1 >/sys/block/<sda>/device/online

only, my proposal would be to do this every single time before we invoke
a path checker anywhere, at least for SCSI devices (i.e., where sg_id is
set).

The most elegant way might be a gateway function: int
call_checker(struct path *pp), which did whatever was necessary and then
called pp->checkerfn(); it could also take care of setting up the
context, opening the fd, figuring out the checkerfn if not yet set etc.
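
A rough sketch of how such a gateway could look, to make the idea
concrete. Everything except the name call_checker(), struct path and
checkerfn is an assumption made for illustration: the trimmed-down struct
layout, the is_scsi flag (standing in for "sg_id is set"), the PATH_UP /
PATH_DOWN codes, the overridable sysfs_root / dev_root, and the
placeholder default_checker(). It is not the existing multipath-tools
code.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical result codes; the real tool's values may differ. */
enum { PATH_DOWN = 0, PATH_UP = 1 };

/* Hypothetical, trimmed-down stand-in for the real struct path. */
struct path {
	char dev[32];                       /* kernel name, e.g. "sda" */
	int fd;                             /* -1 until opened */
	int is_scsi;                        /* stand-in for "sg_id is set" */
	int (*checkerfn)(struct path *pp);  /* NULL until selected */
};

/* Overridable roots (assumption), so the sketch can be tried elsewhere. */
const char *sysfs_root = "/sys";
const char *dev_root = "/dev";

/* Equivalent of: echo 1 > /sys/block/<dev>/device/online */
static int set_online(const char *dev)
{
	char attr[128];
	int n = snprintf(attr, sizeof(attr), "%s/block/%s/device/online",
			 sysfs_root, dev);
	if (n < 0 || (size_t)n >= sizeof(attr))
		return -1;

	int fd = open(attr, O_WRONLY);
	if (fd < 0)
		return -1;

	ssize_t written = write(fd, "1\n", 2);
	close(fd);
	return written == 2 ? 0 : -1;
}

/* Placeholder checker: reports the path up once its fd is open. */
static int default_checker(struct path *pp)
{
	return pp->fd >= 0 ? PATH_UP : PATH_DOWN;
}

int call_checker(struct path *pp)
{
	char node[128];

	/* Best effort: if this fails, the checker will report the result. */
	if (pp->is_scsi)
		(void)set_online(pp->dev);

	/* Open the device node if nobody has done so yet. */
	if (pp->fd < 0) {
		int n = snprintf(node, sizeof(node), "%s/%s", dev_root, pp->dev);
		if (n < 0 || (size_t)n >= sizeof(node))
			return PATH_DOWN;
		pp->fd = open(node, O_RDONLY);
		if (pp->fd < 0)
			return PATH_DOWN;
	}

	/* Figure out the checker if it is not yet set. */
	if (!pp->checkerfn)
		pp->checkerfn = default_checker;

	return pp->checkerfn(pp);
}
```

The online write is deliberately best effort in this sketch: if the
attribute is missing or the write fails, the checker still runs and
reports the path state, so callers only ever deal with one return value.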

What do you think?

Sincerely,
    Lars Marowsky-Brée <lmb@xxxxxxx>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge"
	 -- Charles Darwin