On Wed, 8 Feb 2023 15:48:26 +0100 Nico Boehr <nrb@xxxxxxxxxxxxx> wrote: > The KVM_S390_GET_CMMA_BITS ioctl may return incorrect values when userspace > specifies a start_gfn outside of memslots. > > This can occur when a VM has multiple memslots with a hole in between: > > +-----+----------+--------+--------+ > | ... | Slot N-1 | <hole> | Slot N | > +-----+----------+--------+--------+ > ^ ^ ^ ^ > | | | | > GFN A A+B | | > A+B+C | > A+B+C+D > > When userspace specifies a GFN in [A+B, A+B+C), it would expect to get the > CMMA values of the first dirty page in Slot N. However, userspace may get a > start_gfn of A+B+C+D with a count of 0, hence completely skipping over any > dirty pages in slot N. > > The error is in kvm_s390_next_dirty_cmma(), which assumes > gfn_to_memslot_approx() will return the memslot _below_ the specified GFN > when the specified GFN lies outside a memslot. In reality it may return > either the memslot below or above the specified GFN. > > When a memslot above the specified GFN is returned this happens: > > - ofs is calculated, but since the memslot's base_gfn is larger than the > specified cur_gfn, ofs will underflow to a huge number. > - ofs is passed to find_next_bit(). Since ofs will exceed the memslot's > number of pages, the number of pages in the memslot is returned, > completely skipping over all bits in the memslot userspace would be > interested in. > > Fix this by resetting ofs to zero when a memslot _above_ cur_gfn is > returned (cur_gfn < ms->base_gfn). > > Signed-off-by: Nico Boehr <nrb@xxxxxxxxxxxxx> Reviewed-by: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx> > --- > arch/s390/kvm/kvm-s390.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c > index e4890e04b210..a171a66681b4 100644 > --- a/arch/s390/kvm/kvm-s390.c > +++ b/arch/s390/kvm/kvm-s390.c > @@ -2158,6 +2158,10 @@ static unsigned long kvm_s390_next_dirty_cmma(struct kvm_memslots *slots, > ms = container_of(mnode, struct kvm_memory_slot, gfn_node[slots->node_idx]); > ofs = 0; > } > + > + if (cur_gfn < ms->base_gfn) > + ofs = 0; > + > ofs = find_next_bit(kvm_second_dirty_bitmap(ms), ms->npages, ofs); > while (ofs >= ms->npages && (mnode = rb_next(mnode))) { > ms = container_of(mnode, struct kvm_memory_slot, gfn_node[slots->node_idx]);