Re: [PATCH 1/3] libmultipath: merge_mptable(): sort table by wwid

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2022-08-24 at 11:16 -0500, Benjamin Marzinski wrote:
> On Wed, Aug 24, 2022 at 09:07:56AM +0200, Martin Wilck wrote:
> > On Tue, 2022-08-23 at 15:48 -0500, Benjamin Marzinski wrote:
> > > On Thu, Aug 18, 2022 at 11:06:28PM +0200, mwilck@xxxxxxxx wrote:
> > > > From: Martin Wilck <mwilck@xxxxxxxx>
> > > > 
> > > > If the mptable is very large (for example, in a configuration
> > > > with
> > > > lots of maps assigned individual aliases), merge_mptable may
> > > > get
> > > > very slow because it needs to make O(n^2) string comparisons
> > > > (with
> > > > n being the number of mptable entries). WWID strings often
> > > > differ
> > > > only in the last few bytes, causing a lot of CPU time spent in
> > > > strcmp().
> > > > 
> > > > merge_mptable is executed whenever multipath or multipathd is
> > > > started, that
> > > > is, also for "multipath -u" and "multipath -U" invocations from
> > > > udev rules.
> > > > Optimize it by sorting the mptable before merging. This way we
> > > > don't need
> > > > to iterate towards the end of the vector searching for
> > > > duplicates.
> > > > 
> > > > Signed-off-by: Martin Wilck <mwilck@xxxxxxxx>
> > > > ---
> > > >  libmultipath/config.c | 15 +++++++++++++--
> > > >  libmultipath/vector.c |  8 ++++++++
> > > >  libmultipath/vector.h |  1 +
> > > >  3 files changed, 22 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/libmultipath/config.c b/libmultipath/config.c
> > > > index ab8b26e..a2c79a4 100644
> > > > --- a/libmultipath/config.c
> > > > +++ b/libmultipath/config.c
> > > > @@ -520,6 +520,14 @@ merge_mpe(struct mpentry *dst, struct
> > > > mpentry
> > > > *src)
> > > >         merge_num(mode);
> > > >  }
> > > >  
> > > > +static int wwid_compar(const void *p1, const void *p2)
> > > > +{
> > > > +       const char *wwid1 = (*(struct mpentry * const *)p1)-
> > > > >wwid;
> > > > +       const char *wwid2 = (*(struct mpentry * const *)p2)-
> > > > >wwid;
> > > > +
> > > > +       return strcmp(wwid1, wwid2);
> > > > +}
> > > > +
> > > >  void merge_mptable(vector mptable)
> > > >  {
> > > >         struct mpentry *mp1, *mp2;
> > > > @@ -533,10 +541,13 @@ void merge_mptable(vector mptable)
> > > >                         free_mpe(mp1);
> > > >                         continue;
> > > >                 }
> > > > +       }
> > > > +       vector_sort(mptable, wwid_compar);
> > > > +       vector_foreach_slot(mptable, mp1, i) {
> > > >                 j = i + 1;
> > > >                 vector_foreach_slot_after(mptable, mp2, j) {
> > > > -                       if (!mp2->wwid || strcmp(mp1->wwid,
> > > > mp2-
> > > > > wwid))
> > > > -                               continue;
> > > > +                       if (strcmp(mp1->wwid, mp2->wwid))
> > > > +                               break;
> > > >                         condlog(1, "%s: duplicate multipath
> > > > config
> > > > section for %s",
> > > >                                 __func__, mp1->wwid);
> > > >                         merge_mpe(mp2, mp1);
> > > 
> > > The way merge_mpe() works, values are only copied from mp1's
> > > variables
> > > if the corresponding variable is unset in mp2. This requires the
> > > original order of mp1 and mp2 to be unchanged by the sorting
> > > algorithm,
> > > but according to the qsort() man page "If two members compare as
> > > equal,
> > > their order in the sorted array is undefined."
> > > 
> > > One quick and dirty way we could fix this is to add a variable to
> > > struct
> > > mptable called config_idx, which is its index in the config
> > > file.  If
> > > the wwids are equal, you compare that.
> > 
> > Thanks for pointing this out. I believe it's easier than that: as
> > we're
> > passed the offsets into the array (aka struct mpentry **), we can
> > simply compare p1 and p2 if the strings are equal.
> > 
> > Agree?
> 
> I don't think so. Comparing the array offsets assumes that two
> mpentries
> are still ordered correctly when they are compared against each
> other.
> But the way quick sort works, array elements can get swapped around
> multiple times, based on whether they are larger or smaller than some
> pivot point. It's possible that the two mpentries already switched
> order
> before they were compared.
> 
> Essentially, all comparing the offset does is force qsort to not
> switch
> the place of two otherwise equal entries. But for speed reasons
> alone,
> there is no reason why qsort would bother to swap the location of two
> equal entries.  
> 
> Here's an example of what could go wrong:
> 
> Say you start with the array (I'm also tracking the mpentry's
> original
> config index)
> 
> array offset:   0       1       2       3       4
> config_idx:     0       1       2       3       4
> wwid:           D       B       C       D       A       
> 
> Say qsort picks the element at array offset 2 (wwid "C") as the
> pivot.
> Usually quicksort works by walking towards the center of the array
> segment from both ends, swapping the positions of elements bigger
> than
> the pivot with the positions of ones smaller than or equal to the
> pivot.
> So after the first round you would swap the element at array offset 4
> (wwid "A") with the element at array offset 0 (wwid "D"). This would
> give you.
> 
> array offset:   0       1       2       3       4
> config_idx      4       1       2       3       0
> wwid:           A       B       C       D       D
> 
> After this no further swaps will happen using the original
> wwid_compar(). Adding a comparison to the array offset won't change
> anything. But the wwid "D" mpentry that was orinally earlier in the
> config (config_idx = 0) is now after the wwid "D" mpentry that was
> originally later (config_idx = 3).
> 
> Comparing the config_idx will cause the elements at array offset 3
> and 4
> to switch places, restoring their original ordering.

Hm, too bad. 

I don't like the idea of adding another field to the array just to
stabilize the sort. But a fast, stable sort algorithm in for strings in
C seems to be hard to find. So yes, let's add the sort index for now,
perhaps we'll find a more elegant solution later.


Martin

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel





[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux