Re: [PATCH v2 8/8] IB/isert: use ib_client API to wrap ib_device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 11, 2018 at 6:51 PM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> On Mon, Jun 11, 2018 at 10:55:39AM +0200, Roman Penyaev wrote:
>
>> can be optimized by the *compiler* (again, only compiler optimization
>> matters):
>>
>>    tmp = *ptr;         <<< NOT NICE
>>    rcu_read_lock();
>>    if (tmp)
>>        do_something();
>>    rcu_read_unlock();
>
> No, this already isn't possible.
>
>>     preempt_count_inc();      <<< just increase percpu variable
>>     barrier();                <<< compiler barrier, which prevents
>>                                   other volatile accesses leak up
>
> barrier() prevents everything from moving out.
>
>>     if (*(volatile int *)ptr) <<< explicit volatile access
>>         do_something();
>
> Alpha inserts a special CPU barrier in here.
>
>> barrier();                <<< compiler barrier, prevents from
>>                              leaking down
>> preempt_count_dec();
>
> The point of rcu_dereference is to pair with rcu_assign_pointer so the
> code can make a control flow decision without re-ordering, eg:
>
>  if (rcu_dereference(ptr) != 0)
>      assert(*something != NULL);
>
> Guarentees that if the store side does
>
>  rcu_assign_pointer(ptr, 0);
>  *something = NULL;
>
> Then the assert cannot fire.
>
> It has nothing to do with accesses leaking outside the rcu critical,
> section.
>
>> Returning to my case:
>>
>>     rcu_read_lock();
>> device = ib_get_client_data(...);
>> ...
>> rcu_read_unlock();
>>
>> Previously I mentioned that ib_get_client_data() function has a
>> spin lock, but this actually also does not matter, what matters
>> is that this is a real function *call*.
>
> No, it does matter. Code should never rely on a function call to
> create an implicit compiler barrier. eg We have LTO now which washes
> that side-effect away.
>
> The spinlock was the right answer, because there is no RCU protected
> data here, and none of the special properties of RCU are required.

Indeed, I missed the alpha case completely.  I was confused with
the volatile cast on x86_64, but it is needed only to prevent compiler
from deducing the resulting pointer, not reordering.  Thanks for
pointing that out.

>
>> >> And it seems that such approach will require some new ib_client
>> >> callbacks to create and free ulp device, which I personally do
>> >> not like.
>> >
>> > Why? Seems like a fine idea to me, and simplifies all the ULPs
>> >
>> > Add two new call backs:
>> >  create_kref
>> >  relase_kref
>> >
>> > And the core code will call create the very first time the kref data
>> > is accessed, and do so safely under some kind of locking.
>>
>> The thing is that inside ib_(get|set)_client_data() spin lock is used,
>> because those two can be called from any context, thus ib_client.alloc()
>> callback can't be called under spin lock, because alloc() can sleep.
>
>> If we assume that each ULP knows that for the first time
>> ib_alloc_or_get_client_data() (or any analogue) is called on the path,
>> where sleep is allowed - then yes, that should work out.  But seems
>> that should be covered by fat comment.
>
> Seems reasonable..
>
>> +struct ib_client_data;
>> +
>>  struct ib_client {
>>         char  *name;
>>         void (*add)   (struct ib_device *);
>>         void (*remove)(struct ib_device *, void *client_data);
>>
>> +       struct ib_client_data *(*alloc_client_data)(struct ib_device *);
>> +       void (*free_client_data)(struct ib_client_data *);
>> +
>
>> +struct ib_client_data {
>> +       struct kref kref;
>> +       struct ib_device *device;
>> +       struct ib_client *client;
>> +       struct list_head device_entry;
>> +       struct list_head client_entry;
>> + };
>
> Well, the point of the alloc callback was to not alloc data until the
> client became activated,

That is true, but from the very beginning that can be used only for
a new code: nvme and iser cases.  Others ULP devs can be switched to
a new API without significant changes, but still allocate data on
add_one() (i.e. preallocate) and then step by step that can be also
changed.  I am not sure that in one go I can refactor all the ULP
devs.

> exposing this struct doesn't really seem to
> help that since you'd still need a void * in here with the data pointer.

Why?  "struct ib_client_data" can be embedded inside ulp_dev specific
structure and then ulp_dev->ib_data can be linked in all the lists
inside device.c, so exposing ib_client_data structure we can avoid
setting void *.

>
>> > Why such a mess to avoid a simple and well defined callback?
>>
>> Indeed we can, if change API and refactor internals of device.c a bit.
>> I can send an RFC in couple of days with making ib_client_data public
>> and the stuff which I described above.  What do you think?
>
> Sure, you can mess around with the core code to make a nicer cleanup.

Ok, I will try to prepare an RFC.

--
Roman
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux