Hi Dexuan,

On Mon, Aug 25, 2014 at 02:02:21PM +0000, Dexuan Cui wrote:
> > -----Original Message-----
> > From: Sitsofe Wheeler
> > Sent: Wednesday, August 20, 2014 17:27 PM
> >
> > While booting a Hyper-V 3.17.0-rc1 guest on a 2012 R2 host a BUG was
> > triggered while registering hyperv_fb which in turn caused a panic.
> > Various kernel debugging options (CONFIG_DEBUG_PAGEALLOC,
> > CONFIG_SLUB_DEBUG=y...) were on at the time. This only seems to happen
> > if the guest is being booted with only one CPU allocated to it.
>
> I can reproduce the exact issue with the same commit + your kconfig + UP
> guest (SMP guest seems ok.)

Thanks for getting back - I was wondering if my mails had dropped into a
black hole as I haven't heard anything on any of them for a few days
(and no one had mentioned they had been able to reproduce the issues
reported).

> > [ 7.645526] hv_vmbus: registering driver hyperv_fb
> > [ 7.657553] BUG: unable to handle kernel paging request at
> > ffff880077800004
> > [ 7.658224] IP: [<ffffffff8159a7ac>] hv_ringbuffer_write+0x7c/0x150
> > [ 7.658224] PGD 2da9067 PUD 2dac067 PMD 7fa27067 PTE
> > 8000000077800060
> > [ 7.658224] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
>
> It seems
> hv_ringbuffer_write() ->
> hv_get_ringbuffer_availbytes():
> reading rbi->ring_buffer->read_index causes a page fault.
>
> It looks rbi->ring_buffer was unmapped somehow according to the
> semantics of CONFIG_DEBUG_PAGEALLOC??? Or, was there a memory
> corruption somewhere?
>
> It looks the panic will disappear if the guest isn't configured with a
> "Network Adapter ".

This sounds very fishy, as if network setup has left things in a bad
state. What baffles me is the whole UP vs SMP thing - why would UP make
this show up consistently? Perhaps some assertions could be added to
check that rbi->ring_buffer still has sane values in it after operations
on it are finished?
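Something along these lines is the sort of check meant above. This is only a rough userspace sketch: the demo_* structs and functions are made up for illustration and are NOT the real hv_ring_buffer layout or any existing vmbus helper, and the "sane" rule (both indices below the data size) is an assumption about what a valid ring looks like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the shared ring header.
 * NOT the real struct hv_ring_buffer layout. */
struct demo_ring_buffer {
	uint32_t write_index;
	uint32_t read_index;
};

/* Hypothetical stand-in for the per-channel ring info ("rbi"). */
struct demo_ring_info {
	struct demo_ring_buffer *ring_buffer;
	uint32_t ring_datasize;	/* size of the data area in bytes */
};

/* Return true if the pointer and both indices still look plausible. */
static bool demo_ring_is_sane(const struct demo_ring_info *rbi)
{
	if (rbi == NULL || rbi->ring_buffer == NULL)
		return false;
	if (rbi->ring_datasize == 0)
		return false;
	return rbi->ring_buffer->read_index < rbi->ring_datasize &&
	       rbi->ring_buffer->write_index < rbi->ring_datasize;
}

/* Bytes free for writing, using ordinary circular-buffer arithmetic.
 * The assertion fires before a bogus index can be used. */
static uint32_t demo_ring_avail_to_write(const struct demo_ring_info *rbi)
{
	uint32_t r, w;

	assert(demo_ring_is_sane(rbi));
	r = rbi->ring_buffer->read_index;
	w = rbi->ring_buffer->write_index;
	return w >= r ? rbi->ring_datasize - (w - r) : r - w;
}
```

A check like this would only catch indices that have been overwritten with out-of-range values. If the page behind rbi->ring_buffer has actually been unmapped (as DEBUG_PAGEALLOC does on free), the read inside the check would itself fault, so it narrows things down rather than replacing a memory debugging tool.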

I guess you could try switching things around and using kmemcheck
(https://www.kernel.org/doc/Documentation/kmemcheck.txt ). If the whole
area close to rbi->ring_buffer->read_index is being stomped on it should
show up. If it's just being set to a duff value or freed that's going to
be harder to track down, although poisoning before freeing should allow
us to distinguish that case...

From your analysis this doesn't sound framebuffer related - perhaps we
could drop the linuxfb CC's on these mails going forward?

-- 
Sitsofe | http://sucs.org/~sits/