> From: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> > Sent: Friday, April 29, 2022 3:15 AM > To: Dexuan Cui <decui@xxxxxxxxxxxxx> > Cc: wei.liu@xxxxxxxxxx; KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang > <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>; > bhelgaas@xxxxxxxxxx; linux-hyperv@xxxxxxxxxxxxxxx; > linux-pci@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Michael Kelley (LINUX) > <mikelley@xxxxxxxxxxxxx>; robh@xxxxxxxxxx; kw@xxxxxxxxx; Jake Oshins > <jakeo@xxxxxxxxxxxxx> > Subject: Re: [PATCH] PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce > VM boot time > > On Tue, Apr 19, 2022 at 03:00:07PM -0700, Dexuan Cui wrote: > > A VM on Azure can have 14 GPUs, and each GPU may have a huge MMIO > BAR, > > e.g. 128 GB. Currently the boot time of such a VM can be 4+ minutes, and > > most of the time is used by the host to unmap/map the vBAR from/to pBAR > > when the VM clears and sets the PCI_COMMAND_MEMORY bit: each > unmap/map > > operation for a 128GB BAR needs about 1.8 seconds, and the pci-hyperv > > driver and the Linux PCI subsystem flip the PCI_COMMAND_MEMORY bit > > eight times (see pci_setup_device() -> pci_read_bases() and > > pci_std_update_resource()), increasing the boot time by 1.8 * 8 = 14.4 > > seconds per GPU, i.e. 14.4 * 14 = 201.6 seconds in total. > > > > Fix the slowness by not turning on the PCI_COMMAND_MEMORY in > pci-hyperv.c, > > so the bit stays in the off state before the PCI device driver calls > > pci_enable_device(): when the bit is off, pci_read_bases() and > > pci_std_update_resource() don't cause Hyper-V to unmap/map the vBARs. > > With this change, the boot time of such a VM is reduced by > > 1.8 * (8-1) * 14 = 176.4 seconds. > > I believe you need to clarify this commit message. It took me a while > to understand what you are really doing. > > What this patch is doing is bootstrapping PCI devices with command > memory clear because there is no need to have it set (?) in the first > place and because, if it is set, the PCI core layer needs to toggle it > on and off in order to eg size BAR regions, which causes the slow down > you are reporting. > > I assume, given the above, that there is strictly no need to have > devices with command memory set at kernel startup handover and if > there was it would not be set in the PCI Hyper-V host controller > driver (because that's what you are _removing_). > > I think this should not be merged as a fix and I'd be careful > about possible regressions before sending it to stable kernels, > if you wish to do so. > > It is fine by me to go via the Hyper-V tree even though I don't > see why that's better, unless you want to send it as a fix and > I think you should not. > > You can add my tag but the commit log should be rewritten and > you should add a Link: to the discussion thread. > > Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> Thanks, Lorenzo! I'll post v2 with the commit message revised. It's ok to me to have this patch go through the hyperv-next branch rather than hyperv-fixes.