On Tue, Jun 21, 2022 at 10:41:46AM -0600, Alex Williamson wrote: > On Mon, 20 Jun 2022 00:49:09 -0300 > Jason Gunthorpe <jgg@xxxxxxxxxx> wrote: > > > On Sun, Jun 19, 2022 at 12:25:50PM +0300, Yishai Hadas wrote: > > > > > Means, staying with a single device_ops but just inline a check whether > > > migration is really supported inside the migration get/set state callbacks > > > and let the core call it unconditionally. > > > > I find it much cleaner to have op == NULL means unsupported. > > > > As soon as you start linking supported/unsupported to other flags it > > can get very complicated fairly fast. I have this experiance from RDMA > > where we've spent a long time now ripping out hundreds of flag tests > > and replacing them with NULL op checks. Many bugs were fixed doing > > this as drivers never fully understood what the flags mean and ended > > up with flags set that their driver doesn't properly implement. > > > > The mistake we made in RDMA was not splitting the ops, instead the ops > > were left mutable so the driver could load the right combination based > > on HW ability. > > I don't really have an issue with splitting the ops, but what > techniques have you learned from RDMA to make drivers setting ops less > ad-hoc than proposed here? Are drivers expected to set ops before a > formally defined registration point? Yes, the flow is the same three step process as in VFIO: vfio_init_group_dev() [driver continues to prepare the vfio_device] vfio_register_group_dev() I included the 'ops' as an argument to vfio_init_group_dev() as a code reduction not a statement that the ops must be fully set before vfio_init_group_dev() returns. The entire point of the init step is to allow the core and driver to co-operate and fully initialize the object before moving to register. So I don't view it as ad-hoc that the object needs further setup after vfio_init_group_dev(). > Is there an API for setting ops or is it open coded per driver? RDMA uses a function ib_set_device_ops() but that is only because there is alot of code in that function to do the copying of ops pointers. Splitting avoids the copying so we don't really need a function. Jason