> From: Leon Romanovsky <leon@xxxxxxxxxx> > Sent: Thursday, October 1, 2020 11:10 PM > > On Thu, Oct 01, 2020 at 05:20:35PM +0000, Ertman, David M wrote: > > > -----Original Message----- > > > From: Leon Romanovsky <leon@xxxxxxxxxx> > > > Sent: Thursday, October 1, 2020 1:32 AM > > > To: Ertman, David M <david.m.ertman@xxxxxxxxx> > > > Cc: linux-rdma@xxxxxxxxxxxxxxx > > > Subject: Re: [PATCH 1/6] Add ancillary bus support > > > > > > On Wed, Sep 30, 2020 at 10:05:29PM -0700, Dave Ertman wrote: > > > > Add support for the Ancillary Bus, ancillary_device and ancillary_driver. > > > > It enables drivers to create an ancillary_device and bind an > > > > ancillary_driver to it. > > > > > > > > The bus supports probe/remove shutdown and suspend/resume > callbacks. > > > > Each ancillary_device has a unique string based id; driver binds > > > > to an ancillary_device based on this id through the bus. > > > > > > > > Co-developed-by: Kiran Patil <kiran.patil@xxxxxxxxx> > > > > Signed-off-by: Kiran Patil <kiran.patil@xxxxxxxxx> > > > > Co-developed-by: Ranjani Sridharan > > > > <ranjani.sridharan@xxxxxxxxxxxxxxx> > > > > Signed-off-by: Ranjani Sridharan > > > > <ranjani.sridharan@xxxxxxxxxxxxxxx> > > > > Co-developed-by: Fred Oh <fred.oh@xxxxxxxxxxxxxxx> > > > > Signed-off-by: Fred Oh <fred.oh@xxxxxxxxxxxxxxx> > > > > Reviewed-by: Pierre-Louis Bossart > > > > <pierre-louis.bossart@xxxxxxxxxxxxxxx> > > > > Reviewed-by: Shiraz Saleem <shiraz.saleem@xxxxxxxxx> > > > > Reviewed-by: Parav Pandit <parav@xxxxxxxxxxxx> > > > > Reviewed-by: Dan Williams <dan.j.williams@xxxxxxxxx> > > > > Signed-off-by: Dave Ertman <david.m.ertman@xxxxxxxxx> > > > > --- > > > > Documentation/driver-api/ancillary_bus.rst | 230 > > > +++++++++++++++++++++ > > > > Documentation/driver-api/index.rst | 1 + > > > > drivers/bus/Kconfig | 3 + > > > > drivers/bus/Makefile | 3 + > > > > drivers/bus/ancillary.c | 191 +++++++++++++++++ > > > > include/linux/ancillary_bus.h | 58 ++++++ > > > > include/linux/mod_devicetable.h | 8 + > > > > scripts/mod/devicetable-offsets.c | 3 + > > > > scripts/mod/file2alias.c | 8 + > > > > 9 files changed, 505 insertions(+) create mode 100644 > > > > Documentation/driver-api/ancillary_bus.rst > > > > create mode 100644 drivers/bus/ancillary.c create mode 100644 > > > > include/linux/ancillary_bus.h > > > > > > > > diff --git a/Documentation/driver-api/ancillary_bus.rst > > > b/Documentation/driver-api/ancillary_bus.rst > > > > new file mode 100644 > > > > index 000000000000..0a11979aa927 > > > > --- /dev/null > > > > +++ b/Documentation/driver-api/ancillary_bus.rst > > > > @@ -0,0 +1,230 @@ > > > > +.. SPDX-License-Identifier: GPL-2.0-only > > > > + > > > > +============= > > > > +Ancillary Bus > > > > +============= > > > > + > > > > +In some subsystems, the functionality of the core device > > > (PCI/ACPI/other) is > > > > +too complex for a single device to be managed as a monolithic > > > > +block or a > > > part of > > > > +the functionality needs to be exposed to a different subsystem. > > > > +Splitting > > > the > > > > +functionality into smaller orthogonal devices would make it > > > > +easier to > > > manage > > > > +data, power management and domain-specific interaction with the > > > hardware. A key > > > > +requirement for such a split is that there is no dependency on a > > > > +physical > > > bus, > > > > +device, register accesses or regmap support. These individual > > > > +devices split > > > from > > > > +the core cannot live on the platform bus as they are not physical > > > > +devices > > > that > > > > +are controlled by DT/ACPI. The same argument applies for not > > > > +using MFD > > > in this > > > > +scenario as MFD relies on individual function devices being > > > > +physical > > > devices > > > > +that are DT enumerated. > > > > + > > > > +An example for this kind of requirement is the audio subsystem > > > > +where a > > > single > > > > +IP is handling multiple entities such as HDMI, Soundwire, local > > > > +devices > > > such as > > > > +mics/speakers etc. The split for the core's functionality can be > > > > +arbitrary or be defined by the DSP firmware topology and include > > > > +hooks for > > > test/debug. This > > > > +allows for the audio core device to be minimal and focused on > > > > +hardware- > > > specific > > > > +control and communication. > > > > + > > > > +The ancillary bus is intended to be minimal, generic and avoid > > > > +domain- > > > specific > > > > +assumptions. Each ancillary_device represents a part of its > > > > +parent functionality. The generic behavior can be extended and > > > > +specialized as > > > needed > > > > +by encapsulating an ancillary_device within other domain-specific > > > structures and > > > > +the use of .ops callbacks. Devices on the ancillary bus do not > > > > +share any structures and the use of a communication channel with > > > > +the parent is domain-specific. > > > > + > > > > +When Should the Ancillary Bus Be Used > > > > +===================================== > > > > + > > > > +The ancillary bus is to be used when a driver and one or more > > > > +kernel > > > modules, > > > > +who share a common header file with the driver, need a mechanism > > > > +to > > > connect and > > > > +provide access to a shared object allocated by the > > > > +ancillary_device's registering driver. The registering driver > > > > +for the ancillary_device(s) and > > > the > > > > +kernel module(s) registering ancillary_drivers can be from the > > > > +same > > > subsystem, > > > > +or from multiple subsystems. > > > > + > > > > +The emphasis here is on a common generic interface that keeps > > > subsystem > > > > +customization out of the bus infrastructure. > > > > + > > > > +One example could be a multi-port PCI network device that is > > > > +rdma- > > > capable and > > > > +needs to export this functionality and attach to an rdma driver > > > > +in another subsystem. The PCI driver will allocate and register > > > > +an ancillary_device for each physical function on the NIC. The > > > > +rdma driver will register an ancillary_driver that will be > > > > +matched with and probed for each of these ancillary_devices. > > > > +This will give the rdma driver access to the shared > > > data/ops > > > > +in the PCI drivers shared object to establish a connection with > > > > +the PCI > > > driver. > > > > + > > > > +Another use case is for the a PCI device to be split out into > > > > +multiple sub functions. For each sub function an > > > > +ancillary_device will be created. A PCI sub function driver will > > > > +bind to such devices that will create its own one or more class > > > > +devices. A PCI sub function ancillary device will likely be > > > > +contained in a struct with additional attributes such as user > > > > +defined sub function number and optional attributes such as > > > > +resources and a link to > > > the > > > > +parent device. These attributes could be used by systemd/udev; > > > > +and > > > hence should > > > > +be initialized before a driver binds to an ancillary_device. > > > > + > > > > +Ancillary Device > > > > +================ > > > > + > > > > +An ancillary_device is created and registered to represent a part > > > > +of its > > > parent > > > > +device's functionality. It is given a name that, combined with > > > > +the > > > registering > > > > +drivers KBUILD_MODNAME, creates a match_name that is used for > > > > +driver > > > binding, > > > > +and an id that combined with the match_name provide a unique name > > > > +to > > > register > > > > +with the bus subsystem. > > > > + > > > > +Registering an ancillary_device is a two-step process. First you > > > > +must call ancillary_device_initialize(), which will check several > > > > +aspects of the ancillary_device struct and perform a > > > > +device_initialize(). After this step completes, any error state > > > > +must have a call to put_device() in its > > > resolution > > > > +path. The second step in registering an ancillary_device is to > > > > +perform a > > > call > > > > +to ancillary_device_add(), which will set the name of the device > > > > +and add > > > the > > > > +device to the bus. > > > > + > > > > +To unregister an ancillary_device, just a call to > > > ancillary_device_unregister() > > > > +is used. This will perform both a device_del() and a put_device(). > > > > > > Why did you chose ancillary_device_initialize() and not > > > ancillary_device_register() to be paired with > ancillary_device_unregister()? > > > > > > Thanks > > > > We originally had a single call to ancillary_device_register() that > > paired with unregister, but there was an ask to separate the register > > into an initialize and add to make the error condition unwind more > compartimentalized. > > It is correct thing to separate, but I would expect: > ancillary_device_register() > ancillary_device_add() > device_initialize(), device_add() and device_unregister() is the pattern widely followed in the core. > vs. > ancillary_device_unregister() > > It is not a big deal, just curious. > > The much more big deal is that I'm required to create 1-to-1 mapping > between device and driver, and I can't connect all my different modules to > one xxx_core.pf.y device in N-to-1 mapping. "N" represents different > protocols (IB, ETH, SCSI) and "1" is one PCI core. For one mlx5 (pf/vf/sf) device, there are three class erivers (ib, vdpa, en). So there should be one ida allocate per mlx5 device type. Static const mlx5_adev_names[MLX5_INTERFACE_PROTOCOL_MAX] = { "rdma", "eth", "vdpa" }; Something like for current mlx5_register_device(), mlx5_register_device() { id = ida_alloc(0, UINT_MAX, GFP_KERNEL); for (i = 0; I < MLX5_INTERFACE_PROTOCOL_MAX; i++) { adev = kzalloc(sizeof(*adev), GFP_KERNEL); adev.name = mlx5_adev_names[i]; adev.id = ret; adev.dev.parent = mlx5_core_dev->device; adev->coredev = mlx5_core_dev; ret = ancillary_device_initialize(&adev); ret = ancillary_device_register(adev); } This will create 3 ancillary devices for each PCI PF/VF/SF. mlx5_core.rdma.1 mlx5_core.eth.1 mlx5_core.vdpa.1 and mlx5_ib driver will do ancillary_driver_register() { For ID of mlx5_core.rdma. } mlx5_vdpa driver does, ancillary_driver_register() { For ID of mlx5_core.vdpa } This is uniform for pf/vf/sf for one or more all protocols.