On 15/10/2024 15.20, Jason Gunthorpe wrote:
> On Sun, Oct 13, 2024 at 06:54:32PM +0000, Zhi Wang wrote:
>> On 27/09/2024 1.51, Jason Gunthorpe wrote:
>>> On Sun, Sep 22, 2024 at 05:49:26AM -0700, Zhi Wang wrote:
>>>> GSP firmware needs to know the number of max-supported vGPUs when
>>>> initialization.
>>>>
>>>> The field of VF partition count in the GSP WPR2 is required to be set
>>>> according to the number of max-supported vGPUs.
>>>>
>>>> Set the VF partition count in the GSP WPR2 when NVKM is loading the GSP
>>>> firmware and initializes the GSP WPR2, if vGPU is enabled.
>>>
>>> How/why is this different from the SRIOV num_vfs concept?
>>>
>>
>> 1) The VF is considered as an HW interface of vGPU exposed to the VMM/VM.
>>
>> 2) Number of VF is not always equal to number of max vGPU supported,
>> which depends on a) the size of metadata of video memory space allocated
>> for FW to manage the vGPUs. b) how user divide the resources. E.g. if a
>> card has 48GB video memory, and user creates two vGPUs each has 24GB
>> video memory. Only two VFs are usable even SRIOV num_vfs can be large
>> than that.
>
> But that can't be determine at driver load time, the profiling of the
> VFs must happen at run time when the orchestation determins what kind
> of VM instance type to run.
>
> Which again gets back to the question of why do you need to specify
> the number of VFs at FW boot time? Why isn't it just fully dynamic and
> driven on the SRIOV enable?
>

The FW has to pre-calculate how much video memory to reserve for its own
use, and that reservation includes the metadata for the max-supported
number of vGPUs. So the number has to be decided at FW loading time.

We can always set it to the max number; the trade-off is that we lose
some usable video memory, around (549-256)MB so far.

> Jason