Re: reinstate path not working

Hi Ben,

This is regarding the "add map" issue I have been discussing; I am posting the issue again as a reminder.

Case 1 : remove and add map.
root@x86-generic-64:~# multipathd -k'show maps'
name    sysfs uuid
dmpath0 dm-0  1IET_00010001
root@x86-generic-64:~# multipathd -k'remove map dmpath0'
ok
root@x86-generic-64:~# multipathd -k'show maps'
root@x86-generic-64:~# multipathd -k'add map dmpath0'
ok
root@x86-generic-64:~# multipathd -k'show maps'
root@x86-generic-64:~#
Once a map is removed, we are able to add it back only with the multipath command, not through multipathd.

I have fixed the problem with two approaches and would like you to review them.
Patch 1: Make 'remove map dmpath0' remove the map from multipathd's bookkeeping only, without removing the device from the system. I have added new functions, discard_map and dm_remove_map, so as not to interfere with the existing code.
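
As a reference for the review, here is a minimal sketch of the shape of those two helpers. The bodies below are illustrative guesses pieced together from the 0.4.8 remove-map code flow quoted further down in this mail; the real change is in the attached 0.4.8_add_map_patch1:

/*
 * Illustrative sketch only -- see the attached patch for the real code.
 * A dm_flush_map()-style helper that keeps the usual sanity checks but
 * never issues DM_DEVICE_REMOVE, so the device-mapper device survives
 * 'remove map'.
 */
int
dm_remove_map (char * mapname)
{
        if (!dm_map_present(mapname))
                return 0;

        if (dm_get_opencount(mapname)) {
                condlog(2, "%s: map in use", mapname);
                return 1;
        }

        /* deliberately no dm_simplecmd(DM_DEVICE_REMOVE, mapname) here */
        return 0;
}

/*
 * ev_remove_map()-style counterpart: drop the map from multipathd's
 * bookkeeping and stop its waiter thread, leaving the device intact.
 */
void
discard_map (struct multipath * mpp, struct vectors * vecs)
{
        orphan_paths(vecs->pathvec, mpp);
        remove_map(mpp, vecs, stop_waiter_thread, 1);
}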

Patch 2: The approach you suggested: get the wwid from the map name and call coalesce_paths. I have simply moved the following code from ev_add_map into cli_add_map:

/* resolve the map name (alias, wwid, or path name) to a wwid */
r = get_refwwid(dev, DEV_DEVMAP, vecs->pathvec, &refwwid);

if (refwwid) {
        /* (re)create the map for that wwid if it does not exist */
        r = coalesce_paths(vecs, NULL, refwwid, 0);
        dm_lib_release();
}

I changed dev to param (the map name passed to the CLI handler).
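
For context, here is a rough sketch of how the moved code sits in cli_add_map() in multipathd/cli_handlers.c. The handler signature and the get_keyparam() call follow the usual cli_handlers pattern; the surrounding details (logging, cleanup) are illustrative rather than taken from the attached patch:

/*
 * Illustrative sketch only -- see the attached 0.5.0_add_map_patch2
 * for the real change.
 */
int
cli_add_map (void * v, char ** reply, int * len, void * data)
{
        struct vectors * vecs = (struct vectors *)data;
        char * param = get_keyparam(v, MAP);    /* map name from the CLI */
        char * refwwid = NULL;
        int r = 1;

        /* moved from ev_add_map(): resolve the name (alias, wwid, or
         * name of a path in the map) to a wwid, then (re)create the
         * map for that wwid */
        r = get_refwwid(param, DEV_DEVMAP, vecs->pathvec, &refwwid);

        if (refwwid) {
                r = coalesce_paths(vecs, NULL, refwwid, 0);
                dm_lib_release();
        }

        return r;
}

With this in place, multipathd -k"add map dmpath0" recreates the map after a "remove map", instead of only re-enabling monitoring of an existing device.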

I have tested both patches against all three versions: 0.4.8, 0.4.9, and 0.5.0. It would be great if you could review them to make sure they don't cause any side effects.

I believe Patch 2 is the approach you suggested in your previous mail. Please review it and share your views.

Regards,
Tejaswini


On Fri, Jun 12, 2015 at 2:21 AM, Benjamin Marzinski <bmarzins@xxxxxxxxxx> wrote:
On Wed, Jun 10, 2015 at 11:46:51AM +0530, Tejaswini Poluri wrote:
>       
>    >    We are testing multipathd tools with all the possible options and the
>    >    following fails.
>    >
>    >    Case 1 : remove and add map.
>    >    root@x86-generic-64:~# multipathd -k'show maps'
>    >    name    sysfs uuid
>    >    dmpath0 dm-0  1IET_00010001
>    >    root@x86-generic-64:~# multipathd -k'remove map dmpath0'
>    >    ok
>    >    root@x86-generic-64:~# multipathd -k'show maps'
>    >    root@x86-generic-64:~# multipathd -k'add map dmpath0'
>    >    ok
>    >    root@x86-generic-64:~# multipathd -k'show maps'
>    >    root@x86-generic-64:~#
>    >    Once a map is removed, we are able to add it back only with the
>    >    multipath command, not through multipathd.
>
>    It is working the way it was designed, but possibly it would make sense
>    to change the design.
>
>    You have mentioned that it might make sense to change the design of
>    "add map". Are there plans to change the design?
>    I am trying to understand the code flow in order to change it. Can you
>    guide me on whether we should stop removing the device in the "remove
>    map" code flow, or start adding the device and the map in the "add map"
>    code flow?
>
>    I have tried to understand the "remove map" code flow of multipathd in
>    the 0.4.8 code.

I think that we want multipath to actually remove the map (instead of
just not monitoring it) when you call "remove map <map>". We just want
"add map <map>" to try to create the map if it doesn't exist.  To do
that, you would need to first figure out what WWID is associated with
<map>. Presumably, <map> could either be an alias, wwid, or even the
name of a path in the map. Once you found the map, you would have to
call the code to create the map.

Also, to answer your IRC question: no, the 0.4.8 code is not still being
developed upstream.  All upstream patches only go against the current
head. There are no other upstream branches.

-Ben
>
>    ev_remove_map (char * devname, struct vectors * vecs)
>        flush_map(mpp, vecs);
>            dm_flush_map(mpp->alias, DEFAULT_TARGET);
>                if (!dm_map_present(mapname))
>                        return 0;
>                if (dm_type(mapname, type) <= 0)
>                        return 1;
>                if (dm_remove_partmaps(mapname))
>                        return 1;
>                if (dm_get_opencount(mapname)) {
>                        condlog(2, "%s: map in use", mapname);
>                        return 1;
>                }
>                r = dm_simplecmd(DM_DEVICE_REMOVE, mapname);
>                if (r) {
>                        condlog(4, "multipath map %s removed", mapname);
>                        return 0;
>                }
>            orphan_paths(vecs->pathvec, mpp);
>            remove_map(mpp, vecs, stop_waiter_thread, 1);
>
>    Is removing the following line the right step to stop the device from
>    being removed?
>    r = dm_simplecmd(DM_DEVICE_REMOVE, mapname);
>
>    Regards,
>
>    Tejaswini
>
>    On Mon, Jun 8, 2015 at 11:15 AM, Tejaswini Poluri
>    <tejaswinipoluri3@xxxxxxxxx> wrote:
>
>      Thanks a lot Ben for the quick and detailed reply. I have been
>      struggling to understand and pin down the issues with multipath, as I
>      am the only one from my team working on this. Your inputs help me a
>      lot. Thanks again.
>      Regards,
>      Tejaswini
>      On Sat, Jun 6, 2015 at 3:36 AM, Benjamin Marzinski
>      <bmarzins@xxxxxxxxxx> wrote:
>
>        On Fri, Jun 05, 2015 at 02:31:20PM +0530, Tejaswini Poluri wrote:
>        >    Hi Ben,
>        >
>        >    We are testing multipathd tools with all the possible options
>        >    and the following fails.
>        >
>        >    Case 1 : remove and add map.
>        >    root@x86-generic-64:~# multipathd -k'show maps'
>        >    name    sysfs uuid
>        >    dmpath0 dm-0  1IET_00010001
>        >    root@x86-generic-64:~# multipathd -k'remove map dmpath0'
>        >    ok
>        >    root@x86-generic-64:~# multipathd -k'show maps'
>        >    root@x86-generic-64:~# multipathd -k'add map dmpath0'
>        >    ok
>        >    root@x86-generic-64:~# multipathd -k'show maps'
>        >    root@x86-generic-64:~#
>        >    Once a map is removed, we are able to add it back only with the
>        >    multipath command, not through multipathd.
>
>        It is working the way it was designed, but possibly it would make
>        sense to change the design.  The "remove map" command not only stops
>        multipathd from monitoring the multipath device, but removes it from
>        the system as well.  The "add map" command makes multipathd monitor
>        an already existing multipath device that it wasn't previously
>        monitoring.  These commands do this for historical reasons:
>        multipathd wasn't originally in charge of creating multipath
>        devices; multipath was.  Once it had created the device, it ran
>
>        multipathd -k"add map <MAP>"
>
>        to make multipathd start monitoring it.  However, things haven't
>        worked this way since RHEL4, so possibly "add map" should actually
>        create the device if it doesn't currently exist.
>        >    Case 2 : Active paths test case
>        >    # while true ; do sleep 3 ; multipathd -k'remove path sdb' ; multipathd -k'add path sdb' ; multipathd -k'show maps status' ; done
>        >    ok
>        >    ok
>        >    name failback queueing paths dm-st
>        >    dmpath0 - - 1 active // It should be 2.
>
>        This is simply a timing issue.  What you are seeing is the number of
>        active paths.  These are paths that the kernel can use.  The "add
>        path" command doesn't update the kernel state.  This happens later,
>        in response to the kernel reloading the device table.  So, in a
>        second or two, this will say 2, as expected.
>
>        >    We would like to know whether the test cases are valid, and
>        >    whether these are bugs or design issues.
>        >
>        >    Case 3 : Fail path and reinstate path
>        >    root@x86-generic-64:~# multipathd -k"fail path sdc"; multipathd -k'reinstate path sdc'; multipathd -k"show paths";
>        >    >    [ 3962.708523] device-mapper: multipath: Failing path 8:32.
>        >    >    ok
>        >    >    ok
>        >    >    hcil    dev dev_t pri dm_st   chk_st   next_check
>        >    >    4:0:0:1 sdc 8:32  1   [active][faulty] .......... 1/20   <== CHECK
>        >    >    5:0:0:1 sdd 8:48  1   [active][ready]  XX........ 4/20
>        >    sdc's path becomes [active][ready] only after the polling
>        >    interval, not immediately after the reinstate path command.
>        >    You have answered that this is a design issue, but we have heard
>        >    from our test team that the same test case works in RHEL6. Did
>        >    you observe that?
>        >    I am also finding that the test cases fail because we are running
>        >    multiple commands in one shot.  Please share your thoughts, as
>        >    that will help me debug the issues further.
>        >
>
>        It's totally possible that the checker state is immediately updated in
>        RHEL6.  Like I said before, what it currently does, although correct,
>        is confusing, and perhaps we need a different checker state for paths
>        where the "fail path" command has been used.
>
>        -Ben
>        >    Regards,
>        >    Tejaswini
>        >    On Tue, May 19, 2015 at 5:37 PM, Tejaswini Poluri
>        >    <tejaswinipoluri3@xxxxxxxxx> wrote:
>        >
>        >      Thanks a lot Ben. I will look into it more. 
>        >      On Mon, May 18, 2015 at 9:57 PM, Benjamin Marzinski
>        >      <bmarzins@xxxxxxxxxx> wrote:
>        >
>        >        On Mon, May 18, 2015 at 02:09:27PM +0530, Tejaswini Poluri wrote:
>        >        >    Hi,
>        >        >    We are trying to test a multipath setup on our target
>        >        >    and have tried the various commands of the multipathd
>        >        >    daemon, and we see the following error:
>        >        >    root@x86-generic-64:~# multipathd -k"fail path sdc"; multipathd -k'reinstate path sdc'; multipathd -k"show paths";
>        >        >    [ 3962.708523] device-mapper: multipath: Failing path 8:32.
>        >        >    ok
>        >        >    ok
>        >        >    hcil    dev dev_t pri dm_st   chk_st   next_check
>        >        >    4:0:0:1 sdc 8:32  1   [active][faulty] .......... 1/20   <<<=== CHECK
>        >        >    5:0:0:1 sdd 8:48  1   [active][ready]  XX........ 4/20
>        >        >    sdc's path becomes [active][ready] only after the
>        >        >    polling interval, not immediately after the reinstate
>        >        >    path command.
>        >        >    I am observing this with the latest multipath-tools on
>        >        >    an Ubuntu machine as well.
>        >        >    Please let me know if it's a known issue or if I am
>        >        >    doing something wrong.
>        >        >    Regards,
>        >        >    Tejaswini
>        >
>        >        The reinstate command is supposed to reinstate the device
>        >        with the kernel, and it does that.  The checker state
>        >        doesn't change until the next time the path is checked.  I
>        >        agree that it's odd that the checker state switches to
>        >        faulty as soon as you fail the path, but doesn't switch back
>        >        until the next check after you reinstate it.
>        >
>        >        The issue is that multipathd needs to override the checker
>        >        output, so that a failed path won't be immediately
>        >        reinstated.  Once the path comes back, multipathd wants to
>        >        record the switch in the checker thread, so that it can
>        >        refresh the path information that wasn't automatically
>        >        refreshed when the path was reinstated.  However, it may
>        >        make more sense to have a different checker state for when
>        >        the device is in the failed state, so that it's obvious that
>        >        the checker state is being overruled.
>        >
>        >        -Ben
>        >

Attachment: 0.4.8_add_map_patch1
Description: Binary data

Attachment: 0.5.0_add_map_patch2
Description: Binary data

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
