Hi Eric, On 10/12/22 20:33, Marc Zyngier wrote: > Hi Eric, > > Before I comment on this patch, a couple of things that need > addressing: > >> "Cc: marc.zyngier@xxxxxxx, cdall@xxxxxxxxxx" > > None of these two addresses are valid anymore, and haven't been for > several years. > > Please consult the MAINTAINERS file for up-to-date addresses for > current maintainers and reviewers, all of whom should be Cc'd on this > email. I've now added them as well as Eric Auger who has written most > of the ITS migration code, and the new mailing list (the Columbia list > is about to be killed). > > On Wed, 12 Oct 2022 17:59:25 +0100, > Eric Ren <renzhengeek@xxxxxxxxx> wrote: >> >> Reproducer hints: >> 1. Create ARM virt VM with pxb-pcie bus which adds >> extra host bridges, with qemu command like: >> >> ``` >> -device pxb-pcie,bus_nr=8,id=pci.x,numa_node=0,bus=pcie.0 \ >> -device pcie-root-port,..,bus=pci.x \ >> ... >> -device pxb-pcie,bus_nr=37,id=pci.y,numa_node=1,bus=pcie.0 \ >> -device pcie-root-port,..,bus=pci.y \ >> ... >> >> ``` >> 2. Perform VM migration which calls save/restore device tables. >> >> In that setup, we get a big "offset" between 2 device_ids ( >> one is small, another is big), which makes unsigned "len" round >> up a big positive number, causing loop to continue exceptionally. > > You'll have to spell it out for me here. If you have a very sparse > device ID and you are only using a single level device table, you are > bound to have a large len. Now, is the issue that 'size' is so large > that it is negative as an 'int'? Describing the exact situation you're > in would help a lot. > >> >> Signed-off-by: Eric Ren <renzhengeek@xxxxxxxxx> >> --- >> arch/arm64/kvm/vgic/vgic-its.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c >> index 24d7778d1ce6..673554ef02f9 100644 >> --- a/arch/arm64/kvm/vgic/vgic-its.c >> +++ b/arch/arm64/kvm/vgic/vgic-its.c >> @@ -2141,7 +2141,7 @@ static int scan_its_table(struct vgic_its *its, gpa_t base, int size, u32 esz, >> int start_id, entry_fn_t fn, void *opaque) >> { >> struct kvm *kvm = its->dev->kvm; >> - unsigned long len = size; >> + ssize_t len = size; > > This feels wrong, really. If anything, all these types should be > unsigned, not signed. Signed types in this context make very little > sense... After digging into the code back again, I realized I told you something wrong. The next_offset is the delta between the current device id and the next one. The next device can perfectly be in a different L1 device table, - it is your case actually- , in which case the code is definitely broken. So I guess we should rather have a while (true) { ../.. if (byte_offset >= len) break; len -= byte_offset; } You can add a Fixes tag too: Fixes: 920a7a8fa92a ("KVM: arm64: vgic-its: Add infrastructure for table lookup") and cc stable@xxxxxxxxxxxxxxx Thanks Eric > > Thanks, > > M. >