Re: ESXi + LIO + Ceph RBD problem

Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> · Fri, 21 Aug 2015 16:21:43 -0400

On Wed, Aug 19, 2015 at 2:12 PM, Mike Christie <mchristi@xxxxxxxxxx> wrote:
> On 08/19/2015 11:16 AM, Alex Gorbachev wrote:
>> What is the difference, and is there willingness to allow LIO to be
>> modified to work with this ESXi behavior?  Or should we ask VMware to
>> do something for ESXi to play better with LIO?  I cannot fix the code,
>> but would be happy to be the voice of the issue via any available
>> channels.
>
> I think we want to:
>
> 1. Allow lio to do more than wait for a command during aborts. For lio
> we will want to add callouts similar to how we can override
> discard/unmap behavior.
>
> 2. In the block layer add callouts/cmds so that we can abort
> requests/bios at the LLD level.
>
> 3. For rbd, we will implement support for #2. In ceph then we would need
> to add code to be able to track down commands and kill them if we can or
> at least figure out what is going on and log a message so we do not have
> these mysterious hung commands.

We just had a short network disruption, most likely leaf/spine
overload, which temporarily hung RBD<->LIO traffic while ESXi<->LIO
traffic stayed up.  RBD seems to tolerate long IO waits, i.e. you
can wait 30+ seconds for an RBD IO to complete, but ESXi goes into a
death spiral after 5 seconds.  So if there were an option on either
the LIO or the RBD side to fail any IO that has not completed within,
say, 4 seconds, this would take care of the nasty consequences on the
ESXi side.

Can RBD IO be aborted after a given number of seconds?

ESXi will then retry the IO and if the problem was transient, that IO
will complete and life goes on.
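
To make the request concrete, here is a rough user-space sketch of the
behavior I have in mind.  It is plain Python, not LIO or RBD code, and
every name in it (IOTimedOut, run_with_deadline, initiator_retry,
make_flaky_backend) is made up for illustration.  It only assumes that
the backend IO can be failed back to the initiator once a deadline
passes, and that the initiator then retries.

```python
import concurrent.futures
import threading
import time


class IOTimedOut(Exception):
    """Raised when a backend IO misses its deadline (hypothetical name)."""


def run_with_deadline(pool, io_fn, deadline_s):
    """Run io_fn on the pool and fail it if it is not done in deadline_s."""
    future = pool.submit(io_fn)
    try:
        return future.result(timeout=deadline_s)
    except concurrent.futures.TimeoutError:
        # Best effort only: an IO that is already in flight keeps running
        # in the background. We stop waiting and report failure upward.
        future.cancel()
        raise IOTimedOut(f"IO exceeded {deadline_s}s deadline")


def initiator_retry(pool, io_fn, deadline_s, max_attempts):
    """Mimic an initiator that retries an IO which was failed back to it."""
    for attempt in range(1, max_attempts + 1):
        try:
            return attempt, run_with_deadline(pool, io_fn, deadline_s)
        except IOTimedOut:
            continue
    raise IOTimedOut(f"gave up after {max_attempts} attempts")


def make_flaky_backend(stall_s):
    """Stand-in backend: the first call stalls, later calls return at once."""
    calls = {"n": 0}
    lock = threading.Lock()

    def io():
        with lock:
            calls["n"] += 1
            first = calls["n"] == 1
        if first:
            time.sleep(stall_s)  # simulated transient network stall
        return "ok"

    return io
```

With a deadline below the initiator's own limit (4 seconds against
ESXi's 5 in our case), the stalled first attempt is failed early and
the retry succeeds once the transient problem has cleared.  Note that
the sketch only stops waiting on the stalled IO; actually aborting it
at the RBD level is the part that would need the callouts Mike
describes.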

Thanks guys, this would make a huge difference for production-critical
operations.

Alex

>
> I have been meaning to get to this, but as you have seen on the list I
> have taken a couple wrong turns on the cluster support and am still
> working on that.