Re: [PATCH 4/4] dma-buf: nuke reservation_object seq number

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 14, 2019 at 05:42:48PM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2019-08-14 16:39:08)
> > Sorry I burried myself in some other stuff ...
> > 
> > On Sat, Aug 10, 2019 at 12:51:00PM +0200, Christian König wrote:
> > > Am 07.08.19 um 16:17 schrieb Chris Wilson:
> > > > Quoting Christian König (2019-08-07 14:53:12)
> > > > > The only remaining use for this is to protect against setting a new exclusive
> > > > > fence while we grab both exclusive and shared. That can also be archived by
> > > > > looking if the exclusive fence has changed or not after completing the
> > > > > operation.
> > > > > 
> > > > > v2: switch setting excl fence to rcu_assign_pointer
> > > > > 
> > > > > Signed-off-by: Christian König <christian.koenig@xxxxxxx>
> > > > > ---
> > > > >   drivers/dma-buf/reservation.c | 24 +++++-------------------
> > > > >   include/linux/reservation.h   |  9 ++-------
> > > > >   2 files changed, 7 insertions(+), 26 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> > > > > index 90bc6ef03598..f7f4a0858c2a 100644
> > > > > --- a/drivers/dma-buf/reservation.c
> > > > > +++ b/drivers/dma-buf/reservation.c
> > > > > @@ -49,12 +49,6 @@
> > > > >   DEFINE_WD_CLASS(reservation_ww_class);
> > > > >   EXPORT_SYMBOL(reservation_ww_class);
> > > > > -struct lock_class_key reservation_seqcount_class;
> > > > > -EXPORT_SYMBOL(reservation_seqcount_class);
> > > > > -
> > > > > -const char reservation_seqcount_string[] = "reservation_seqcount";
> > > > > -EXPORT_SYMBOL(reservation_seqcount_string);
> > > > > -
> > > > >   /**
> > > > >    * reservation_object_list_alloc - allocate fence list
> > > > >    * @shared_max: number of fences we need space for
> > > > > @@ -103,9 +97,6 @@ static void reservation_object_list_free(struct reservation_object_list *list)
> > > > >   void reservation_object_init(struct reservation_object *obj)
> > > > >   {
> > > > >          ww_mutex_init(&obj->lock, &reservation_ww_class);
> > > > > -
> > > > > -       __seqcount_init(&obj->seq, reservation_seqcount_string,
> > > > > -                       &reservation_seqcount_class);
> > > > >          RCU_INIT_POINTER(obj->fence, NULL);
> > > > >          RCU_INIT_POINTER(obj->fence_excl, NULL);
> > > > >   }
> > > > > @@ -282,12 +273,10 @@ void reservation_object_add_excl_fence(struct reservation_object *obj,
> > > > >                  dma_fence_get(fence);
> > > > >          preempt_disable();
> > > > > -       write_seqcount_begin(&obj->seq);
> > > > > -       /* write_seqcount_begin provides the necessary memory barrier */
> > > > > -       RCU_INIT_POINTER(obj->fence_excl, fence);
> > > > > +       rcu_assign_pointer(obj->fence_excl, fence);
> > > > > +       /* pointer update must be visible before we modify the shared_count */
> > 
> > Pls add a "see reservation_object_fence()" here or similar.
> > 
> > > > >          if (old)
> > > > > -               old->shared_count = 0;
> > > > > -       write_seqcount_end(&obj->seq);
> > > > > +               smp_store_mb(old->shared_count, 0);
> > 
> > So your comment and the kerneldoc don't match up. Quoting
> > Documentation/memory-barriers.txt:
> > 
> >      This assigns the value to the variable and then inserts a full memory
> >      barrier after it.  It isn't guaranteed to insert anything more than a
> >      compiler barrier in a UP compilation.
> > 
> > So order is 1. store 2. fence, but your comment suggests you want it the
> > other way round.
> 
> What's more weird is that it is a fully serialising instruction that is
> used to fence first as part of the update. If that's way PeterZ uses
> it...

I haven't looked at the implementations tbh, just going with the text. Or
do you mean in the write_seqlock that we're replacing?

> 
> > > > >          preempt_enable();
> > > > >          /* inplace update, no shared fences */
> > > > > @@ -368,11 +357,8 @@ int reservation_object_copy_fences(struct reservation_object *dst,
> > > > >          old = reservation_object_get_excl(dst);
> > > > >          preempt_disable();
> > > > > -       write_seqcount_begin(&dst->seq);
> > > > > -       /* write_seqcount_begin provides the necessary memory barrier */
> > > > > -       RCU_INIT_POINTER(dst->fence_excl, new);
> > > > > -       RCU_INIT_POINTER(dst->fence, dst_list);
> > > > > -       write_seqcount_end(&dst->seq);
> > > > > +       rcu_assign_pointer(dst->fence_excl, new);
> > > > > +       rcu_assign_pointer(dst->fence, dst_list);
> > > > >          preempt_enable();
> > > > >          reservation_object_list_free(src_list);
> > > > > diff --git a/include/linux/reservation.h b/include/linux/reservation.h
> > > > > index 044a5cd4af50..fd29baad0be3 100644
> > > > > --- a/include/linux/reservation.h
> > > > > +++ b/include/linux/reservation.h
> > > > > @@ -46,8 +46,6 @@
> > > > >   #include <linux/rcupdate.h>
> > > > >   extern struct ww_class reservation_ww_class;
> > > > > -extern struct lock_class_key reservation_seqcount_class;
> > > > > -extern const char reservation_seqcount_string[];
> > > > >   /**
> > > > >    * struct reservation_object_list - a list of shared fences
> > > > > @@ -71,7 +69,6 @@ struct reservation_object_list {
> > > > >    */
> > > > >   struct reservation_object {
> > > > >          struct ww_mutex lock;
> > > > > -       seqcount_t seq;
> > > > >          struct dma_fence __rcu *fence_excl;
> > > > >          struct reservation_object_list __rcu *fence;
> > > > > @@ -156,14 +153,12 @@ reservation_object_fences(struct reservation_object *obj,
> > > > >                            struct reservation_object_list **list,
> > > > >                            u32 *shared_count)
> > > > >   {
> > > > > -       unsigned int seq;
> > > > > -
> > > > >          do {
> > > > > -               seq = read_seqcount_begin(&obj->seq);
> > > > >                  *excl = rcu_dereference(obj->fence_excl);
> > 
> > I think you need a barrier between this and the read of shared_count
> > below. But rcu_derefence only gives you a dependent barrier, i.e. only
> > stuff that's accesses through this pointer is ordered. Which means the
> > access to ->shared_count, which goes through another pointer, isn't
> > actually ordered.
> 
> Well,
> 
> do {
> 	excl = ...
> 	smp_rmb();
> 	(list, count) = ...
> 	smp_rmb();
> } while (excl != ...)
> 
> would be the straightforward conversion of the seqlock.

Yeah I cheated by looking there, and couldn't convince myself that we
can't drop the first smp_rmb() ...

> 
> > I think the implementation is that it is an unconditional compiler barrier
> > (but that might change), but you're definitely missing the cpu barrier, so
> > a cpue might speculate the entire thing out of order.
> > 
> > I think you need another smb_rmb(); here
> > 
> > 
> > > > >                  *list = rcu_dereference(obj->fence);
> > > > >                  *shared_count = *list ? (*list)->shared_count : 0;
> > > > > -       } while (read_seqcount_retry(&obj->seq, seq));
> > > > > +               smp_rmb(); /* See reservation_object_add_excl_fence */
> > 
> > This fence here I think prevents the re-reading of ->fence_excl from
> > getting hoisted above the critical reads. So this is just the open-coded
> > seqlock retry loop.
> 
> Without the seq.
> 
> The dilemma for dropping the seq would be what if we were to add another
> state here, such as modified or even invalidate.
> 
> > > > > +       } while (rcu_access_pointer(obj->fence_excl) != *excl);
> > 
> > What if someone is real fast (like really real fast) and recycles the
> > exclusive fence so you read the same pointer twice, but everything else
> > changed? reused fence pointer is a lot more likely than seqlock wrapping
> > around.
> 
> It's an exclusive fence. If it is replaced, it must be later than all
> the shared fences (and dependent on them directly or indirectly), and
> so still a consistent snapshot.

I'm not worried about the fence, that part is fine. But we're defacto
using the fence as a fancy seqlock-of-sorts. And if the fence gets reused
and the pointers match, then our seqlock-of-sorts breaks. But I haven't
looked around whether there's more in the code that makes this an
irrelevant issue.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux