On Wed, Sep 02, 2015 at 10:18:32AM +0100, Ian Campbell wrote:
> [resending to correct stable address, sorry folks]
>
> TL;DR: Any backport of 30b03d05e074 to earlier than commit 1401c00e59e
> ("xen/gntdev: convert priv->lock to a mutex", which was added in v4.0)
> needs $something doing to it, either s/mutex/spinlock/ or (more likely)
> backporting of 1401c00e59e too.
>
> Looking at LTS:
>
> 3.18.y: Backported both.
> 3.16.y: Has backported neither
> 3.14.y: * Only backported 30b03d05e074
> 3.12.y: Has backported neither
> 3.10.y: * Only backported 30b03d05e074
> 3.4.y: Has backported neither
> 3.2.y: Has backported neither
>
> So AFAICT 3.14.y and 3.10.y need fixes, probably following 3.18 and
> backporting 1401c00e59e.
>
> 3.16/12/4/2 might need to be careful if they subsequently pick up 30b03d05.
>

Thank you Ian. In fact, I had explicitly dropped 30b03d05e074
("xen/gntdevt: Fix race condition in gntdev_release()") from the 3.16
kernel and notified stable maintainers about this problem (in a reply to
a 3.12 review email). Simply replacing the mutex with a spinlock in this
commit seems to cause problems (sleep in atomic), as pointed out by Jiri
in another thread.

Since 1401c00e59ea ("xen/gntdev: convert priv->lock to a mutex") is a
clean cherry-pick for 3.16 (and probably for older kernels as well), I'm
happy to pick both commits if you can confirm they are both good for
older stable kernels (they seem to be!)

Cheers,
--
Luís

> See below for the build log error.
>
> On Tue, 2015-09-01 at 11:05 +0100, Ian Campbell wrote:
> > On Tue, 2015-09-01 at 10:57 +0100, Ian Jackson wrote:
> > > Ian Campbell writes ("Re: [linux-3.14 bisection] complete test-amd64-i386-xl-qcow2"):
> > > > On Wed, 2015-08-26 at 20:02 +0000, osstest service owner wrote:
> > > > > commit 9e6c072a69d87100808d16279d60e9f857291340
> > > > > Author: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
> > > > > Date: Fri Jun 26 03:28:24 2015 +0200
> > > > >
> > > > > xen/gntdevt: Fix race condition in gntdev_release()
> > > >
> > > > I'm not sure what to make of this.
> > > >
> > > > The qcow2 test is one of the only ones I'd expect to be exercising
> > > > gntdev (most tests use LVM+blkback), which explains why this
> > > > particular commit is apparently seeing issues due to this
> > > > particular change.
> > >
> > > (You mean `which explains why this particular _test_ is [failing]',
> > > I think.)
> >
> > Indeed.
> >
> > > The host serial log in one of the confirmation tests of 9e6c072a shows
> > > serious trouble:
> > >
> > > http://logs.test-lab.xenproject.org/osstest/logs/60893/test-amd64-i386-xl-qcow2/serial-huxelrebe1.log
> > >
> > > Aug 26 19:36:51.841068 [ 738.050547] BUG: unable to handle kernel
> > > NULL pointer dereference at 00000014
> > >
> > > Aug 26 19:36:56.753068 [ 738.050594] IP: []
> > > __mmu_notifier_invalidate_range_start+0x33/0x70
> > >
> > > And the immediately preceding confirmation flight, which got a pass on
> > > 9e6c072a~1, seems fine:
> > >
> > > http://logs.test-lab.xenproject.org/osstest/logs/60892/test-amd64-i386-xl-qcow2/serial-huxelrebe1.log
> > >
> > > But, it's difficult to see how that gntdev fix would be responsible
> > > for the bug. Perhaps it changes the order in which certain things
> > > happen so as to expose another bug.
> >
> > Or perhaps there was a fix and/or change in behaviour in the mmunotifier
> > stuff which the patch relied on but which isn't in 3.14?
>
> Looking at http://logs.test-lab.xenproject.org/osstest/logs/60949/'s build
> jobs:
> http://logs.test-lab.xenproject.org/osstest/logs/60949/build-amd64-pvops/5.ts-kernel-build.log contains:
>
> drivers/xen/gntdev.c: In function ‘gntdev_release’:
> drivers/xen/gntdev.c:532:2: warning: passing argument 1 of ‘mutex_lock’
> from incompatible pointer type [enabled by default]
> In file included from include/linux/notifier.h:13:0,
>                  from include/linux/memory_hotplug.h:6,
>                  from include/linux/mmzone.h:821,
>                  from include/linux/gfp.h:5,
>                  from include/linux/kmod.h:22,
>                  from include/linux/module.h:13,
>                  from drivers/xen/gntdev.c:24:
> include/linux/mutex.h:157:13: note: expected ‘struct mutex *’ but
> argument is of type ‘struct spinlock_t *’
> drivers/xen/gntdev.c:539:2: warning: passing argument 1 of
> ‘mutex_unlock’ from incompatible pointer type [enabled by default]
> In file included from include/linux/notifier.h:13:0,
>                  from include/linux/memory_hotplug.h:6,
>                  from include/linux/mmzone.h:821,
>                  from include/linux/gfp.h:5,
>                  from include/linux/kmod.h:22,
>                  from include/linux/module.h:13,
>                  from drivers/xen/gntdev.c:24:
> include/linux/mutex.h:174:13: note: expected ‘struct mutex *’ but
> argument is of type ‘struct spinlock_t *’
>
> Which is somehow a warning and not a build failure.
>
> Ian.
> --
> To unsubscribe from this list: send the line "unsubscribe stable" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
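
For reference, the rule from the TL;DR at the top of this thread (a
branch needs attention when it carries 30b03d05e074 without 1401c00e59e)
can be written down as a small Python sketch. The table is transcribed
from the LTS list in the quoted message. The names STATUS, needs_fix and
broken_branches are made up for illustration and are not part of any
kernel or stable tooling.

```python
# Backport status per stable branch, transcribed from the LTS list in this
# thread. Keys are illustrative labels, not real tooling:
#   "race_fix" -> 30b03d05e074 (Fix race condition in gntdev_release())
#   "mutex"    -> 1401c00e59e  (convert priv->lock to a mutex)
STATUS = {
    "3.18.y": {"race_fix": True,  "mutex": True},
    "3.16.y": {"race_fix": False, "mutex": False},
    "3.14.y": {"race_fix": True,  "mutex": False},
    "3.12.y": {"race_fix": False, "mutex": False},
    "3.10.y": {"race_fix": True,  "mutex": False},
    "3.4.y":  {"race_fix": False, "mutex": False},
    "3.2.y":  {"race_fix": False, "mutex": False},
}


def needs_fix(status):
    """A branch is inconsistent if it has the race fix (which calls
    mutex_lock() on priv->lock) but priv->lock is still a spinlock."""
    return status["race_fix"] and not status["mutex"]


def broken_branches(table):
    """Return branch names that need attention, in table order."""
    return [name for name, status in table.items() if needs_fix(status)]


if __name__ == "__main__":
    for name in broken_branches(STATUS):
        print(name, "has 30b03d05e074 without 1401c00e59e")
```

With the table above this flags 3.14.y and 3.10.y, matching the
conclusion in the quoted message. Branches with neither commit are not
flagged, but would become inconsistent if they later picked up only
30b03d05e074.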