Re: Question about lock_all_vcpus

On Mon, 2025-02-10 at 15:57 +0000, Marc Zyngier wrote:
> On Thu, 06 Feb 2025 20:08:10 +0000,
> Maxim Levitsky <mlevitsk@xxxxxxxxxx> wrote:
> > Hi!
> > 
> > KVM on ARM has this function (lock_all_vcpus), and it seems to be
> > used in only a couple of places, mostly for initialization.
> > 
> > We recently noticed a CI failure that looks roughly like this:
> 
> Did you only recently notice this because you only recently started
> testing with lockdep? As far as I remember, this has been there
> forever.

Hi,

I also think this is something old; I guess our CI only recently
started testing aarch64 kernels with debug flags (lockdep) enabled,
or something like that.
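
For context, lock_all_vcpus() on the arm64 side does roughly the
following (simplified sketch, not the exact upstream code):

	bool lock_all_vcpus(struct kvm *kvm)
	{
		struct kvm_vcpu *tmp_vcpu;
		unsigned long c;

		/*
		 * Grab every vcpu->mutex in the VM. trylock is used
		 * because a vCPU sitting in KVM_RUN already holds its
		 * mutex; sleeping on it here could deadlock, so we
		 * back off and let userspace retry instead.
		 */
		kvm_for_each_vcpu(c, tmp_vcpu, kvm) {
			if (!mutex_trylock(&tmp_vcpu->mutex)) {
				unlock_vcpus(kvm, c - 1);
				return false;
			}
		}
		return true;
	}

So with N vCPUs the caller ends up holding kvm->lock plus N vcpu
mutexes at once, which is exactly what fills up lockdep's per-task
held_locks table (sized by MAX_LOCK_DEPTH) in the splat below.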

> 
> > [  328.171264] BUG: MAX_LOCK_DEPTH too low!
> > [  328.175227] turning off the locking correctness validator.
> > [  328.180726] Please attach the output of /proc/lock_stat to the bug report
> > [  328.187531] depth: 48  max: 48!
> > [  328.190678] 48 locks held by qemu-kvm/11664:
> > [  328.194957]  #0: ffff800086de5ba0 (&kvm->lock){+.+.}-{3:3}, at: kvm_ioctl_create_device+0x174/0x5b0
> > [  328.204048]  #1: ffff0800e78800b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
> > [  328.212521]  #2: ffff07ffeee51e98 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
> > [  328.220991]  #3: ffff0800dc7d80b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
> > [  328.229463]  #4: ffff07ffe0c980b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
> > [  328.237934]  #5: ffff0800a3883c78 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
> > [  328.246405]  #6: ffff07fffbe480b8 (&vcpu->mutex){+.+.}-{3:3}, at: lock_all_vcpus+0x16c/0x2a0
> > 
> > ...
> > 
> > As far as I can see, MAX_LOCK_DEPTH is currently 48, while the
> > number of vCPUs can easily be in the hundreds.
> 
> 512 exactly. Both of which are pretty arbitrary limits.
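
(For reference, the two limits are MAX_LOCK_DEPTH in
include/linux/sched.h and, if I'm reading the code right,
KVM_MAX_VCPUS on arm64, which is tied to VGIC_V3_MAX_CPUS.)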
> 
> > Do you think it's possible to get rid of lock_all_vcpus to avoid
> > this problem, or do you know of any efforts to do so? If not,
> > maybe we could exclude lock_all_vcpus from the lockdep validator?
> 
> I'd be very wary of excluding any form of locking from being checked
> by lockdep, and I'd rather we bump MAX_LOCK_DEPTH up if KVM is
> enabled on arm64. It's not like anyone is going to run a lockdep
> kernel in production anyway. task_struct may not be happy about
> that, though.
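
FWIW, under CONFIG_LOCKDEP the cost of bumping it is one struct
held_lock per extra slot in every task_struct. held_lock is a few
dozen bytes depending on config, so going from 48 to, say, 512
entries would add very roughly 20-30 KB per task. Noticeable, but
probably tolerable for a debug kernel.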
> 
> The alternative is a full stop_machine(), and I don't think that will
> fly either.
> 
> > AFAIK, on x86 most of the similar cases where lock_all_vcpus could
> > be used are handled by assuming and enforcing that userspace will
> > call these functions before the first vCPU is created and/or run,
> > so the need for such locking doesn't exist.
> 
> This assertion doesn't hold on arm64, as this ordering requirement
> doesn't exist. We already have a bunch of established VMMs doing
> things in random orders (QEMU being the #1 offender), and the sad
> reality of the Linux ABI means this needs to be supported forever.

Understood.
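
One more thought, for the record: if we wanted to keep lockdep
coverage without bumping MAX_LOCK_DEPTH, lockdep's nest_lock
annotation (the mechanism mm_take_all_locks() uses) might fit here.
Locks acquired under a declared outer lock are reference-counted
into a single held_locks entry instead of each consuming a slot. A
rough, hypothetical sketch of just the lockdep side (the real code
would need a trylock-capable variant of the annotation, since
lock_all_vcpus() deliberately uses mutex_trylock()):

	/*
	 * Hypothetical: annotate each vcpu->mutex as nested under
	 * kvm->lock, so lockdep coalesces all of them into a single
	 * held_locks entry instead of one per vCPU.
	 */
	kvm_for_each_vcpu(c, tmp_vcpu, kvm)
		mutex_lock_nest_lock(&tmp_vcpu->mutex, &kvm->lock);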

Best regards,
	Maxim Levitsky

> 
> Thanks,
> 
> 	M.
> 
