Re: [PATCH v1 07/16] vfio/ccw: remove unnecessary malloc alignment

On Fri, 2022-12-16 at 15:10 -0500, Matthew Rosato wrote:
> On 11/21/22 4:40 PM, Eric Farman wrote:
> > Everything about this allocation is harder than necessary,
> > since the memory allocation is already aligned to our needs.
> > Break them apart for readability, instead of doing the
> > funky arithmetic.
> > 
> > Of the structures that are involved, only ch_ccw needs the
> > GFP_DMA flag, so the others can be allocated without it.
> > 
> > Signed-off-by: Eric Farman <farman@xxxxxxxxxxxxx>
> > ---
> >  drivers/s390/cio/vfio_ccw_cp.c | 39 ++++++++++++++++++------------
> > ----
> >  1 file changed, 21 insertions(+), 18 deletions(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_cp.c
> > b/drivers/s390/cio/vfio_ccw_cp.c
> > index d41d94cecdf8..4b6b5f9dc92d 100644
> > --- a/drivers/s390/cio/vfio_ccw_cp.c
> > +++ b/drivers/s390/cio/vfio_ccw_cp.c
> > @@ -311,40 +311,41 @@ static inline int is_tic_within_range(struct
> > ccw1 *ccw, u32 head, int len)
> >  static struct ccwchain *ccwchain_alloc(struct channel_program *cp,
> > int len)
> >  {
> >         struct ccwchain *chain;
> > -       void *data;
> > -       size_t size;
> > -
> > -       /* Make ccw address aligned to 8. */
> > -       size = ((sizeof(*chain) + 7L) & -8L) +
> > -               sizeof(*chain->ch_ccw) * len +
> > -               sizeof(*chain->ch_pa) * len;
> > -       chain = kzalloc(size, GFP_DMA | GFP_KERNEL);
> > +
> > +       chain = kzalloc(sizeof(*chain), GFP_KERNEL);
> 
> I suppose you could consider a WARN_ONCE here if one of these
> kzalloc'd addresses has something in the low-order 3 bits; would
> probably make it more obvious if for some reason the alignment
> guarantee was broken vs some status after-the-fact in the IRB.  But
> as per our discussion off-list I think that can only happen if
> ARCH_KMALLOC_MINALIGN were to change.

Yeah, maybe, but the "status after-the-fact" is a program check that
would be generated by the channel, just as would be done if the ORB was
located in a similarly-weird location (which we don't check for
either). Since these are all mainline paths, I don't think it makes sense
to re-check all those possible permutations here.

(And, for what it's worth, it's not this allocation that matters, but
rather the one that gets stuffed into the ORB below [1])
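
For reference, the check being suggested boils down to testing the
low-order 3 bits of an address. A minimal userspace sketch follows;
is_aligned_8() is a made-up name, and in the driver the test would
presumably sit inside a WARN_ONCE() on chain->ch_ccw rather than in a
helper like this:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns true when the
 * low-order 3 bits of the address are zero, i.e. the pointer is
 * doubleword (8-byte) aligned.
 */
bool is_aligned_8(const void *p)
{
	return ((uintptr_t)p & 7) == 0;
}
```

Any address that passes this test is fine for the channel; one that
fails is what would lead to the program check mentioned above.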

> 
> >         if (!chain)
> >                 return NULL;
> >  
> > -       data = (u8 *)chain + ((sizeof(*chain) + 7L) & -8L);
> > -       chain->ch_ccw = (struct ccw1 *)data;
> > -
> > -       data = (u8 *)(chain->ch_ccw) + sizeof(*chain->ch_ccw) *
> > len;
> > -       chain->ch_pa = (struct page_array *)data;
> > +       chain->ch_ccw = kcalloc(len, sizeof(*chain->ch_ccw),
> > GFP_DMA | GFP_KERNEL);

[1]

> > +       if (!chain->ch_ccw)
> > +               goto out_err;
> >  
> > -       chain->ch_len = len;
> > +       chain->ch_pa = kcalloc(len, sizeof(*chain->ch_pa),
> > GFP_KERNEL);
> > +       if (!chain->ch_pa)
> > +               goto out_err;
> >  
> >         list_add_tail(&chain->next, &cp->ccwchain_list);
> >  
> >         return chain;
> > +
> > +out_err:
> > +       kfree(chain->ch_ccw);
> > +       kfree(chain);
> > +       return NULL;
> >  }
> >  
> >  static void ccwchain_free(struct ccwchain *chain)
> >  {
> >         list_del(&chain->next);
> > +       kfree(chain->ch_pa);
> > +       kfree(chain->ch_ccw);
> >         kfree(chain);
> >  }
> >  
> >  /* Free resource for a ccw that allocated memory for its cda. */
> >  static void ccwchain_cda_free(struct ccwchain *chain, int idx)
> >  {
> > -       struct ccw1 *ccw = chain->ch_ccw + idx;
> > +       struct ccw1 *ccw = &chain->ch_ccw[idx];
> >  
> >         if (ccw_is_tic(ccw))
> >                 return;
> > @@ -443,6 +444,8 @@ static int ccwchain_handle_ccw(u32 cda, struct
> > channel_program *cp)
> >         chain = ccwchain_alloc(cp, len);
> >         if (!chain)
> >                 return -ENOMEM;
> > +
> > +       chain->ch_len = len;
> >         chain->ch_iova = cda;
> >  
> >         /* Copy the actual CCWs into the new chain */
> > @@ -464,7 +467,7 @@ static int ccwchain_loop_tic(struct ccwchain
> > *chain, struct channel_program *cp)
> >         int i, ret;
> >  
> >         for (i = 0; i < chain->ch_len; i++) {
> > -               tic = chain->ch_ccw + i;
> > +               tic = &chain->ch_ccw[i];
> 
> These don't seem equivalent...  Before at each iteration you'd offset
> tic by i bytes, now you're treating i as an index of 8B ccw1 structs,
> so it seems like this went from tic = x + i to tic = x + (8 * i)? 
> Was the old code broken or am I missing something? 

I think the latter. :) The old code did one allocation measured in
bytes, stored it in chain, and then calculated locations within that
for ch_ccw and ch_pa, cast to the respective pointer types. (See the
reference [1] above.)
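
To make the old layout concrete, here is a userspace sketch of the
single-allocation scheme. The *_sketch structs are stand-ins I made up
for illustration (not the driver's definitions), and calloc() stands in
for kzalloc():

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in types for illustration only. */
struct ccw_sketch { uint8_t cmd_code; uint8_t flags; uint16_t count; uint32_t cda; };
struct pa_sketch { void *iova; void *pages; int nr; };
struct chain_sketch {
	struct ccw_sketch *ch_ccw;
	struct pa_sketch *ch_pa;
	size_t ch_len;
};

/* Round n up to the next multiple of 8, same idea as (n + 7L) & -8L. */
size_t round_up_8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/* One allocation measured in bytes, then carved into typed regions. */
struct chain_sketch *chain_alloc_old_style(size_t len)
{
	size_t hdr = round_up_8(sizeof(struct chain_sketch));
	size_t size = hdr + sizeof(struct ccw_sketch) * len +
		      sizeof(struct pa_sketch) * len;
	struct chain_sketch *chain = calloc(1, size);

	if (!chain)
		return NULL;

	/* Byte arithmetic happens here, once, on a u8 pointer... */
	chain->ch_ccw = (struct ccw_sketch *)((uint8_t *)chain + hdr);
	chain->ch_pa = (struct pa_sketch *)((uint8_t *)chain->ch_ccw +
					    sizeof(struct ccw_sketch) * len);
	chain->ch_len = len;

	/* ...after which ch_ccw + i and ch_pa + i step by whole elements. */
	return chain;
}
```

The only byte-granular math is in the two casts; everything that later
does "ch_ccw + i" is working on a typed pointer.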

So any use of "i" was already an index into the pointer types, and thus
already the "8 * i" addition from your example. My intention here was to
remove the pseudo-assembly above, and I changed these along the way as I
was untangling everything. Looking at the resulting assembly
before/after, these hunks don't end up changing anything, so I'll back
them out. Especially since...

> 
> >  
> >                 if (!ccw_is_tic(tic))
> >                         continue;
> > @@ -739,8 +742,8 @@ int cp_prefetch(struct channel_program *cp)
> >         list_for_each_entry(chain, &cp->ccwchain_list, next) {
> >                 len = chain->ch_len;
> >                 for (idx = 0; idx < len; idx++) {
> > -                       ccw = chain->ch_ccw + idx;
> > -                       pa = chain->ch_pa + idx;
> > +                       ccw = &chain->ch_ccw[idx];
> > +                       pa = &chain->ch_pa[idx];
> 
> Same sort of question re: ch_pa

...this prompted me to notice that I didn't change the users of
"chain->ch_pa + i" when calling page_array_unpin_free(), so now we have
both flavors, which isn't ideal.

BEFORE:
                        ccw = chain->ch_ccw + idx;
                        pa = chain->ch_pa + idx;
    1536:       eb 3b 00 01 00 0d       sllg    %r3,%r11,1
    153c:       b9 08 00 3b             agr     %r3,%r11
    1540:       eb 33 00 03 00 0d       sllg    %r3,%r3,3
    1546:       e3 30 80 28 00 08       ag      %r3,40(%r8)
                        ccw = chain->ch_ccw + idx;
    154c:       eb 2b 00 03 00 0d       sllg    %r2,%r11,3
    1552:       e3 20 80 10 00 08       ag      %r2,16(%r8)
AFTER:
                        ccw = &chain->ch_ccw[idx];
                        pa = &chain->ch_pa[idx];
    15be:       eb 3b 00 01 00 0d       sllg    %r3,%r11,1
    15c4:       b9 08 00 3b             agr     %r3,%r11
    15c8:       eb 33 00 03 00 0d       sllg    %r3,%r3,3
    15ce:       e3 30 80 28 00 08       ag      %r3,40(%r8)
                        ccw = &chain->ch_ccw[idx];
    15d4:       eb 2b 00 03 00 0d       sllg    %r2,%r11,3
    15da:       e3 20 80 10 00 08       ag      %r2,16(%r8)
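
Reading the listings: the ch_ccw index is shifted left by 3 (idx * 8),
and the ch_pa index goes through shift-by-1, add, shift-by-3
((idx * 2 + idx) * 8 = idx * 24), which presumably matches an 8-byte
ccw1 and a 24-byte page_array. The same equivalence can be checked in
plain C. This is only a sketch; pa_sketch and the helper names are
invented for the example and are not the driver's types:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in element type for illustration only. */
struct pa_sketch { uint64_t iova; uint64_t pages; int32_t nr; };

/* "Pointer plus index" flavor, as in the old code. */
struct pa_sketch *pa_by_add(struct pa_sketch *base, size_t idx)
{
	return base + idx;
}

/* "Address of array element" flavor, as in the patch. */
struct pa_sketch *pa_by_index(struct pa_sketch *base, size_t idx)
{
	return &base[idx];
}

/* Byte distance covered by stepping idx elements from base. */
ptrdiff_t pa_byte_offset(struct pa_sketch *base, size_t idx)
{
	return (char *)(base + idx) - (char *)base;
}
```

Both flavors scale the index by the element size, so the compiler is
free to emit identical code for them, which is what the listings show.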


> 
> >  
> >                         ret = ccwchain_fetch_one(ccw, pa, cp);
> >                         if (ret)
> 




