Re: [PATCH rdma-core 2/5] kernel-boot: Perform device rename to make stable names

On Fri, Mar 08, 2019 at 02:16:31PM +0000, Jason Gunthorpe wrote:
> On Wed, Mar 06, 2019 at 12:08:47PM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> >
> > Generalize the naming scheme for RDMA devices, so users will always
> > see names based on topology/GUID information. Such naming scheme has
> > big advantage that the names are fully automatic, fully predictable
> > and they stay fixed even if hardware is added or removed (i.e. no
> > reenumeration takes place) and that broken hardware can be replaced
> > seamlessly.
> >
> > The naming policy is possible to chose from NAME_KERNEL, NAME_PCI,
> > NAME_GUID or NAME_FALLBACK, which is controlled by udev rule.
> >
> >  * NAME_KERNEL - don't change names and rely on kernel assignment. This
> >    will keep RDMA names as before. Example: "mlx5_0".
> >  * NAME_PCI - read PCI location and topology as a source for stable names,
> >    which won't change in any software event (reset, PCI probe e.t.c.).
> >    Example: "mlxp0s12f4".
>
> I don't think we should have the vendor/driver name in the stable
> names. Ethernet doesn't do this.
>
> At worst it should be the base technology:
>
> ibp0s12f4
> rocep0s12f4
> iwp0s12f4
> opap0s12f4

That was my initial thought too, but mlx4 has a dual mode in which one
port can be IB and another RoCE, so a single technology prefix doesn't
fit that device. Any idea how to name mlx4?

My second thought was to call it rdmap0s12f4, but that is too long, there
are drivers which are not RDMA (e.g. EFA), and it loses information in the
name, because cxgb3 and mlx5 are not alike at all. Such a generic scheme is
good for commodity devices, which is not the case in RDMA.
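
To make the naming options above concrete, here is a rough sketch of how a
name like "ibp0s12f4" or "mlxp0s12f4" could be built from a PCI address.
This is not the code from the patch: format_pci_name() is a hypothetical
helper, and printing bus/slot/function in decimal while dropping the domain
is an assumption borrowed from the convention used for Ethernet names.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: turn a PCI address such as
 * "0000:00:0c.4" into "<prefix>p<bus>s<slot>f<function>".
 */
static int format_pci_name(const char *prefix, const char *bdf,
			   char *out, size_t out_len)
{
	unsigned int domain, bus, slot, func;
	int n;

	/* sysfs prints every field of the address in hexadecimal */
	if (sscanf(bdf, "%x:%x:%x.%x", &domain, &bus, &slot, &func) != 4)
		return -1;

	/* Assumption: fields are printed in decimal, domain is omitted */
	n = snprintf(out, out_len, "%sp%us%uf%u", prefix, bus, slot, func);
	if (n < 0 || (size_t)n >= out_len)
		return -1;

	return 0;
}
```

With prefix "ib" and address "0000:00:0c.4" this gives "ibp0s12f4". The
prefix is the only part under debate here, the topology part stays the same
whichever prefix is chosen.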

>
> etc
>
> > +static int by_pci(struct data *d)
> > +{
> > +	char *path, *token, *pci;
> > +	char buf[256];
> > +	long p, s, f;
> > +	ssize_t len;
> > +	int ret;
> > +
> > +	ret = asprintf(&path, "/sys/class/infiniband/%s", d->curr);
> > +	if (ret == -1) {
> > +		path = NULL;
> > +		ret = -ENOMEM;
> > +		goto out;
> > +	}
> > +
> > +	len = readlink(path, buf, sizeof(buf)-1);
> > +	if (len == -1) {
> > +		ret = -EINVAL;
> > +		goto out;
> > +	}
> > +	pci = buf + strlen("../../devices/pci0000:00/");
>
> This is really sketchy.
>
> Do
>   dev_path = realpath(/sys/class/infiniband/%s/device/)
>
> Check that
>
> stat(dev_path + /subsystem).st_inode == stat("/sys/bus/pci").st_inode
>
> To confirm PCI
>
> Strip the last path off to get the domain:B:D.f:
>
> basename(dev_path)

The current implementation is needed for RXE devices running in a VM on
top of virtio devices. In that case your "basename" will return
virtio_net.

In my first implementation I followed your suggestion, and that is how I
came to this code.

Thanks

>
> Jason
