On Fri, 24 Sep 2021, Alex Williamson wrote: > On Tue, 14 Sep 2021 17:11:55 +0800 > Colin Xu <colin.xu@xxxxxxxxx> wrote: > >> Due to historical reason, some legacy shipped system doesn't follow >> OpRegion 2.1 spec but still stick to OpRegion 2.0, in which the extended >> VBT is not contiguous after OpRegion in physical address, but any >> location pointed by RVDA via absolute address. Also although current >> OpRegion 2.1+ systems appears that the extended VBT follows OpRegion, >> RVDA is the relative address to OpRegion head, the extended VBT location >> may change to non-contiguous to OpRegion. In both cases, it's impossible >> to map a contiguous range to hold both OpRegion and the extended VBT and >> expose via one vfio region. >> >> The only difference between OpRegion 2.0 and 2.1 is where extended >> VBT is stored: For 2.0, RVDA is the absolute address of extended VBT >> while for 2.1, RVDA is the relative address of extended VBT to OpRegion >> baes, and there is no other difference between OpRegion 2.0 and 2.1. >> To support the non-contiguous region case as described, the updated read >> op will patch OpRegion version and RVDA on-the-fly accordingly. So that >> from vfio igd OpRegion view, only 2.1+ with contiguous extended VBT >> after OpRegion is exposed, regardless the underneath host OpRegion is >> 2.0 or 2.1+. The mechanism makes it possible to support legacy OpRegion >> 2.0 extended VBT systems with on the market, and support OpRegion 2.1+ >> where the extended VBT isn't contiguous after OpRegion. >> >> V2: >> Validate RVDA for 2.1+ before increasing total size. (Alex) >> >> V3: (Alex) >> Split read and write ops. >> On-the-fly modify OpRegion version and RVDA. >> Fix sparse error on assign value to casted pointer. >> >> V4: (Alex) >> No need support write op. >> Direct copy to user buffer with several shift instead of shadow. >> Copy helper to copy to user buffer and shift offset. >> >> V5: (Alex) >> Simplify copy help to only cover common shift case. >> Don't cache patched version and rvda. Patch on copy if necessary. >> >> V6: >> Fix comment typo and max line width. >> >> Cc: Zhenyu Wang <zhenyuw@xxxxxxxxxxxxxxx> >> Cc: Hang Yuan <hang.yuan@xxxxxxxxxxxxxxx> >> Cc: Swee Yee Fonn <swee.yee.fonn@xxxxxxxxx> >> Cc: Fred Gao <fred.gao@xxxxxxxxx> >> Signed-off-by: Colin Xu <colin.xu@xxxxxxxxx> >> --- >> drivers/vfio/pci/vfio_pci_igd.c | 231 ++++++++++++++++++++++++-------- >> 1 file changed, 173 insertions(+), 58 deletions(-) >> >> diff --git a/drivers/vfio/pci/vfio_pci_igd.c b/drivers/vfio/pci/vfio_pci_igd.c >> index 228df565e9bc..081be59c7948 100644 >> --- a/drivers/vfio/pci/vfio_pci_igd.c >> +++ b/drivers/vfio/pci/vfio_pci_igd.c >> @@ -25,19 +25,119 @@ >> #define OPREGION_RVDS 0x3c2 >> #define OPREGION_VERSION 0x16 >> >> +struct igd_opregion_vbt { >> + void *opregion; >> + void *vbt_ex; >> +}; >> + >> +/** >> + * igd_opregion_shift_copy() - Copy OpRegion to user buffer and shift position. >> + * @dst: User buffer ptr to copy to. >> + * @off: Offset to user buffer ptr. Increased by bytes on return. >> + * @src: Source buffer to copy from. >> + * @pos: Increased by bytes on return. >> + * @remaining: Decreased by bytes on return. >> + * @bytes: Bytes to copy and adjust off, pos and remaining. >> + * >> + * Copy OpRegion to offset from specific source ptr and shift the offset. >> + * >> + * Return: 0 on success, -EFAULT otherwise. >> + * >> + */ >> +static inline unsigned long igd_opregion_shift_copy(char __user *dst, >> + loff_t *off, >> + void *src, >> + loff_t *pos, >> + loff_t *remaining, >> + loff_t bytes) > > @bytes and @remaining should be size_t throughout. > Fixed in v7. At first min() reports __careful_cmp() warning so I change to loff_t. But the proper is way to do a cast. >> +{ >> + if (copy_to_user(dst + (*off), src, bytes)) >> + return -EFAULT; >> + >> + *off += bytes; >> + *pos += bytes; >> + *remaining -= bytes; >> + >> + return 0; >> +} >> + >> static size_t vfio_pci_igd_rw(struct vfio_pci_device *vdev, char __user *buf, >> size_t count, loff_t *ppos, bool iswrite) >> { >> unsigned int i = VFIO_PCI_OFFSET_TO_INDEX(*ppos) - VFIO_PCI_NUM_REGIONS; >> - void *base = vdev->region[i].data; >> - loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK; >> + struct igd_opregion_vbt *opregionvbt = vdev->region[i].data; >> + loff_t pos = *ppos & VFIO_PCI_OFFSET_MASK, off = 0, remaining; >> >> if (pos >= vdev->region[i].size || iswrite) >> return -EINVAL; >> >> count = min(count, (size_t)(vdev->region[i].size - pos)); >> + remaining = count; >> + >> + /* Copy until OpRegion version */ >> + if (remaining && pos < OPREGION_VERSION) { >> + loff_t bytes = min(remaining, OPREGION_VERSION - pos); >> + >> + if (igd_opregion_shift_copy(buf, &off, >> + opregionvbt->opregion + pos, &pos, >> + &remaining, bytes)) >> + return -EFAULT; >> + } >> + >> + /* Copy patched (if necessary) OpRegion version */ >> + if (remaining && pos < OPREGION_VERSION + sizeof(__le16)) { >> + loff_t bytes = min(remaining, >> + OPREGION_VERSION + (loff_t)sizeof(__le16) - >> + pos); >> + __le16 version = *(__le16 *)(opregionvbt->opregion + >> + OPREGION_VERSION); >> + >> + /* Patch to 2.1 if OpRegion 2.0 has extended VBT */ >> + if (le16_to_cpu(version) == 0x0200 && opregionvbt->vbt_ex) >> + version = cpu_to_le16(0x0201); >> + >> + if (igd_opregion_shift_copy(buf, &off, >> + &version, &pos, >> + &remaining, bytes)) > > This looks wrong, what if pos was (OPREGION_VERSION + 1)? We'd copy > the first byte instead of the second. We need to add (pos - > OPREGION_VERSION) to the source. > > Fixed in v7. Thanks for the catching. >> + return -EFAULT; >> + } >> + >> + /* Copy until RVDA */ >> + if (remaining && pos < OPREGION_RVDA) { >> + loff_t bytes = min((loff_t)remaining, OPREGION_RVDA - pos); >> + >> + if (igd_opregion_shift_copy(buf, &off, >> + opregionvbt->opregion + pos, &pos, >> + &remaining, bytes)) >> + return -EFAULT; >> + } >> + >> + /* Copy modified (if necessary) RVDA */ >> + if (remaining && pos < OPREGION_RVDA + sizeof(__le64)) { >> + loff_t bytes = min(remaining, OPREGION_RVDA + >> + (loff_t)sizeof(__le64) - pos); >> + __le64 rvda = cpu_to_le64(opregionvbt->vbt_ex ? >> + OPREGION_SIZE : 0); >> + >> + if (igd_opregion_shift_copy(buf, &off, >> + &rvda, &pos, >> + &remaining, bytes)) > > Same here, + (pos - OPREGION_RVDA) > >> + return -EFAULT; >> + } >> >> - if (copy_to_user(buf, base + pos, count)) >> + /* Copy the rest of OpRegion */ >> + if (remaining && pos < OPREGION_SIZE) { >> + loff_t bytes = min(remaining, OPREGION_SIZE - pos); >> + >> + if (igd_opregion_shift_copy(buf, &off, >> + opregionvbt->opregion + pos, &pos, >> + &remaining, bytes)) >> + return -EFAULT; >> + } >> + >> + /* Copy extended VBT if exists */ >> + if (remaining && >> + copy_to_user(buf + off, opregionvbt->vbt_ex, remaining)) > > And here, + (pos - OPREGION_SIZE) > > Also this doesn't apply to mainline, please rebase to linux-next or at > least the latest rc kernel. Thanks, > I noticed there comes some patch on vfio-pci. Now rebased to linux-next. > Alex > > btw, I'm not working in Intel currently so continue the revise with personal email. -- Best Regards, Colin Xu