Re: [PATCH 04/16] scsi_dh_alua: Make stpg synchronous

Hannes Reinecke <hare@xxxxxxx> · Thu, 13 Feb 2014 10:31:34 +0100

On 02/12/2014 06:31 PM, Mike Christie wrote:
> On 2/12/14 10:26 AM, Mike Christie wrote:
>> On 2/12/14 10:11 AM, Mike Christie wrote:
>>> On 2/12/14 9:29 AM, Hannes Reinecke wrote:
>>>> On 02/07/2014 02:54 AM, Mike Christie wrote:
>>>>> On 02/06/2014 07:24 PM, Mike Christie wrote:
>>>>>> On 01/31/2014 03:29 AM, Hannes Reinecke wrote:
>>>>>>> We should be issuing STPG synchronously as we need to
>>>>>>> evaluate the return code on failure.
>>>>>>>
>>>>>>> Signed-off-by: Hannes Reinecke <hare@xxxxxxx>
>>>>>>
>>>>>> I think we need to also make dm-mpath.c use a non-ordered
>>>>>> workqueue
>>>>>> for
>>>>>> kmpath_handlerd. With this patch and the current ordered
>>>>>> workqueue in
>>>>>> dm-mpath I think we will only be able to do one STPG at a time. I
>>>>>> think
>>>>>> if we use a normal old non-ordered workqueue then we would be
>>>>>> limited to
>>>>>> have max_active STPGs executing.
>>>>>
>>>>> I goofed and commented in the order I saw the patches :) I take
>>>>> this
>>>>> comment back for this patch, because I see in 16/16 you added a
>>>>> new
>>>>> workqueue to scsi_dh_alua to do rtpgs and stpgs.
>>>>>
>>>>> For 16/16 though, do we want to make kmpath_aluad a non single
>>>>> threaded
>>>>> workqueue? It looks like max_active for single threaded is only
>>>>> one
>>>>> work
>>>>> at a time too.
>>>>>
>>>> Well, that was by intention.
>>>>
>>>> The workqueue will be triggered very infrequently (basically for
>>>> every path switch).
>>>> For implicit ALUA we just need to issue a RTPG to get the new path
>>>> status; there we might be suffering from single threaded behaviour.
>>>> But we need to issue it only once and it should be processed
>>>> reasonably fast.
>>>> For explicit ALUA we'll have to send an STPG, which has potentially
>>>> system-wide implications. So sending several to (supposedly)
>>>> different targets might actually be contraproductive, as the first
>>>> might have already set the status for the second call.
>>>> Here we most definitely _want_ serialisation to avoid
>>>> superfluous STPGs.
>>>>
>>>
>>> What target is this?
>>>
>>> For our target it adds a regression. It only affects devices on
>>> the same
>>> port group. We then have multiple groups. Before the patch, we could
>>> failover/failback multiple devices in parallel. To do 64 devices
>>> it took
>>> about 3 seconds. With your patch it takes around 3 minutes.
>>>
>>
>> Also, with your pg change patch, I think we can serialize based on
>> group
>> and it will do what you want and also allow us to do STPGs to
>> different
>> groups in parallel.
> 
> I guess that would work for me, but I think if you had the same
> target port in multiple port groups then you could hit the issue you
> described, right?
> 
Yes, we might. But it's worth a shot anyway.

So to alleviate all this, this patch:

diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c
b/drivers/scsi/device_ha
ndler/scsi_dh_alua.c
index 569af9d..46cc1ef 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -1353,7 +1353,7 @@ static int __init alua_init(void)
 {
        int r;

-       kmpath_aluad = create_singlethread_workqueue("kmpath_aluad");
+       kmpath_aluad = create_workqueue("kmpath_aluad");
        if (!kmpath_aluad) {
                printk(KERN_ERR "kmpath_aluad creation failed.\n");
                return -EINVAL;

should be sufficient, right?

Could you test and see if it makes a difference?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html