Re: [RFC rdma-next 0/1] Lock hardware stats

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 19, 2018 at 10:32:40AM +0200, Leon Romanovsky wrote:
> On Mon, Mar 19, 2018 at 10:06:20AM +0200, Mark Bloch wrote:
> >
> >
> > On 19/03/2018 08:39, Leon Romanovsky wrote:
> > > On Sun, Mar 18, 2018 at 04:38:00PM +0200, Mark Bloch wrote:
> > >>
> > >>
> > >> On 18/03/2018 16:00, Leon Romanovsky wrote:
> > >>> On Sun, Mar 18, 2018 at 01:33:31PM +0000, Mark Bloch wrote:
> > >>>> Today the get_hw_stats() API looks like this:
> > >>>>
> > >>>> get_hw_stats(struct ib_device *ibdev, struct rdma_hw_stats *stats,
> > >>>>              u8 port, int index)
> > >>>>
> > >>>> The driver gets a rdma_hw_stats structure, a port and an index and is given the
> > >>>> possibility to:
> > >>>>
> > >>>> 1) Fill only the counter at location index
> > >>>> 2) Fill all the counters
> > >>>>
> > >>>> Filling the counters is done like so:
> > >>>> 	stats->value[i], 0 <= i < stats->num_counters
> > >>>> The stats structure is shared between all the counters. Today all the drivers
> > >>>> that implement get_hw_stats() always fill all the counters regardless of the
> > >>>> index given.
> > >>>>
> > >>>> This may lead to the following scenario if we have multiple concurrent reads
> > >>>> of counters, cpu 0 (tries to read counter at index 0) cpu 1 (tries to read
> > >>>> counter at index 1)
> > >>>>
> > >>>>            CPU 0                                CPU 1
> > >>>> driver:  stats->value[0] = x;           |       ....
> > >>>>              ....                       |       ....
> > >>>>              ....                       |       ....
> > >>>> ib_core: return stats->value[0] to user | driver: stats->value[0] = y;
> > >>>>
> > >>>> We end up with read/write to the same location at the same time.
> > >>>> Which can lead to providing an incorrect value to the user. This RFC introduces
> > >>>> a lock to protect against that.
> > >>>>
> > >>>
> > >>> As I said in offline discussion, it solves non-existent problem.
> > >>> SW (kernel) doesn't guarantee that counters are correct every time you
> > >>> snapshot them. All CPUs are calling to FW prior to counter update, so
> > >>> they will write correct data, just a little bit out-of-sync.
> > >>
> > >> You assume here the driver/compiler assigns only once to stats->value[i] (which you can't)
> > >> and that read/write are atomic (which you also can't).
> > >
> > > I'm not assuming any driver/compiler optimizations, but the fact that
> > > locking is needed to provide concurrent read/write access to shared
> > > variables. In case of hw_stats, every thread is providing its struct
> > > to be filled. Your RFC patch supports it very clearly, by calling
> > > with hw_stats from the stack.
> >
> > ?
> > hw_stats is allocated and attached to port or (and) to a device.
> > see: setup_hw_stats().
> > The callback that is called when trying to read a counter is show_hw_stats()
> > it takes the hw_stats from the port (or the device) and passes it
> > into the driver as an argument (see get_hw_stats() callback provided by drivers).
> >
> > I don't understand the stack reference.
>
>  794 static ssize_t show_hw_stats(struct kobject *kobj, struct attribute  *attr,
>  795                              char *buf)
>  796 {
>  797         struct ib_device *dev;
>  798         struct ib_port *port;
>  799         struct hw_stats_attribute *hsa;
>  800         struct rdma_hw_stats *stats;
>  801         int ret;
>  802
>  803         hsa = container_of(attr, struct  hw_stats_attribute, attr);
>  804         if (!hsa->port_num) {
>  805                 dev = container_of((struct device *)kobj,
>  806                                    struct ib_device,  dev);
>  807                 stats = dev->hw_stats;
>  808         } else {
>  809                 port = container_of(kobj, struct ib_port, kobj);
>  810                 dev = port->ibdev;
>  811                 stats = port->hw_stats;

Mark pointed to me that "stats" is actually pointer to shared memory.

Thanks

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux