> -----Original Message-----
> From: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> Sent: Monday, August 12, 2019 11:39 AM
> To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> Cc: sashal@xxxxxxxxxx; bhelgaas@xxxxxxxxxx; linux-hyperv@xxxxxxxxxxxxxxx;
> linux-pci@xxxxxxxxxxxxxxx; KY Srinivasan <kys@xxxxxxxxxxxxx>;
> Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>; olaf@xxxxxxxxx;
> vkuznets <vkuznets@xxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2] PCI: hv: Detect and fix Hyper-V PCI domain number
> collision
>
> On Tue, Aug 06, 2019 at 11:52:11PM +0000, Haiyang Zhang wrote:
> > Currently in Azure cloud, for passthrough devices including GPU, the
> > host sets the device instance ID's bytes 8 - 15 to a value derived from
> > the host HWID, which is the same on all devices in a VM. So, the device
> > instance ID's bytes 8 and 9 provided by the host are no longer unique.
> >
> > This can cause device passthrough to VMs to fail because the bytes 8 and
> > 9 is used as PCI domain number. So, as recommended by Azure host team,
> > we now use the bytes 4 and 5 which usually contain unique numbers as PCI
> > domain. The chance of collision is greatly reduced. In the rare cases of
> > collision, we will detect and find another number that is not in use.
>
> This is not clear at all. Why "finding another number" is fine with
> this patch while it is not with current kernel code ? Also does this
> have backward compatibility issues ?

Bytes 4 and 5 have more uniqueness (information entropy) than bytes 8 and 9,
so we use bytes 4 and 5. They can also be used on older hosts, so there are
no backward compatibility issues.

> I do not understand if a collision is a problem or not from the
> log above.

A collision causes the second device with the same domain number to fail to
load. I will include this information in the patch description.

> > Thanks to Michael Kelley <mikelley@xxxxxxxxxxxxx> for proposing this
> > idea.
>
> Add it as Suggested-by: tag.

I will add this line.

Thanks,
- Haiyang
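
For illustration only, the scheme discussed in this thread (prefer bytes 4 and 5 of the device instance ID as the PCI domain, and pick another unused number on collision) could be sketched in user-space C roughly as follows. This is not the patch itself: the names `dom_from_instance_id`, `dom_claim`, `dom_release`, `DOM_MAP_SIZE` and `DOM_INVALID` are hypothetical, and both the little-endian combination of the two bytes and the "lowest free number" fallback are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define DOM_MAP_SIZE 0x10000u    /* one slot per possible 16-bit domain number */
#define DOM_INVALID  0xFFFFFFFFu /* hypothetical marker: no free domain left */

/* Tracks which domain numbers are currently taken (hypothetical table). */
static bool dom_in_use[DOM_MAP_SIZE];

/*
 * Derive the preferred domain number from bytes 4 and 5 of the 16-byte
 * instance ID. Assumption: byte 4 is the low byte, byte 5 the high byte.
 */
static uint16_t dom_from_instance_id(const uint8_t id[16])
{
    return (uint16_t)(((unsigned)id[5] << 8) | id[4]);
}

/*
 * Claim the wanted domain number if it is free. On collision, fall back
 * to the lowest number that is not in use. Returns DOM_INVALID when
 * every number is taken.
 */
static uint32_t dom_claim(uint16_t wanted)
{
    if (!dom_in_use[wanted]) {
        dom_in_use[wanted] = true;
        return wanted;
    }

    for (uint32_t i = 0; i < DOM_MAP_SIZE; i++) {
        if (!dom_in_use[i]) {
            dom_in_use[i] = true;
            return i;
        }
    }

    return DOM_INVALID;
}

/* Give a domain number back, e.g. when the device is removed. */
static void dom_release(uint32_t dom)
{
    if (dom < DOM_MAP_SIZE)
        dom_in_use[dom] = false;
}
```

With this sketch, two devices whose instance IDs share the same bytes 4 and 5 both receive a domain number: the first gets the preferred value, and the second gets a different free one instead of failing to load.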