On 07/20/2017 03:14 PM, Piotr Gackiewicz wrote:
> On Thu, 20 Jul 2017, Kevin Stange wrote:
>
>> On 07/20/2017 05:31 AM, Piotr Gackiewicz wrote:
>>> On Wed, 19 Jul 2017, Johnny Hughes wrote:
>>>
>>>> On 07/19/2017 09:23 AM, Johnny Hughes wrote:
>>>>> On 07/19/2017 04:27 AM, Piotr Gackiewicz wrote:
>>>>>> On Mon, 17 Jul 2017, Johnny Hughes wrote:
>>>>>>
>>>>>>> Are the testing kernels (kernel-4.9.37-29.el7 and
>>>>>>> kernel-4.9.37-29.el6, with the one config file change)
>>>>>>> working for everyone:
>>>>>>>
>>>>>>> (turn off: CONFIG_IO_STRICT_DEVMEM)
>>>>>>
>>>>>> Hello.
>>>>>> Maybe it's not the most appropriate thread or time, but I have been
>>>>>> signalling it before:
>>>>>>
>>>>>> 4.9.* kernels do not work well for me any more (nor for other
>>>>>> people, as far as I know). Last stable kernel was 4.9.13-22.
>>>
>>> I think I have nailed down the faulty combo.
>>> My tests showed that the SLUB allocator does not work well in Xen Dom0,
>>> on top of the Xen hypervisor.
>>> It does not work on at least one of my testing servers (old AMD K8
>>> (1 proc, 1 core), only 1 paravirt guest).
>>> If the kernel with SLUB is booted as main (w/o Xen hypervisor), it works
>>> well.
>>> If booted as a Xen hypervisor module - it almost instantly gets page
>>> allocation failure.
>>>
>>> SLAB=>SLUB was changed in kernel config, starting from 4.9.25. Then
>>> problems started to explode in my production environment, and on the
>>> testing server mentioned above.
>>>
>>> After recompiling recent 4.9.34 with SLAB - everything works well on
>>> that testing machine.
>>> I will try to test 4.9.38 with the same config on my production servers.
>>
>> I was having page allocation failures on 4.9.25 with SLUB, but these
>> problems seem to be gone with 4.9.34 (still with SLUB). Have you
>> checked this build? It was moved to the stable repo on July 4th.
>
> Yes, 4.9.34 was failing too. And this was actually the worst case, with
> I/O error on guest:
>
> Jul 16 06:01:03 dom0 kernel: [452360.743312] CPU: 0 PID: 28450 Comm: 12.xvda3-0 Tainted: G O 4.9.34-29.el6.x86_64 #1
> Jul 16 06:01:03 guest kernel: end_request: I/O error, dev xvda3, sector 9200640
> Jul 16 06:01:03 dom0 kernel: [452360.758931] SLUB: Unable to allocate memory on node -1, gfp=0x2000000(GFP_NOWAIT)
> Jul 16 06:01:03 guest kernel: Buffer I/O error on device xvda3, logical block 1150080
> Jul 16 06:01:03 guest kernel: lost page write due to I/O error on xvda3
> Jul 16 06:01:03 guest kernel: Buffer I/O error on device xvda3, logical block 1150081
> Jul 16 06:01:03 guest kernel: lost page write due to I/O error on xvda3
> Jul 16 06:01:03 guest kernel: Buffer I/O error on device xvda3, logical block 1150082
> Jul 16 06:01:03 guest kernel: lost page write due to I/O error on xvda3
> Jul 16 06:01:03 guest kernel: Buffer I/O error on device xvda3, logical block 1150083
> Jul 16 06:01:03 guest kernel: lost page write due to I/O error on xvda3
> Jul 16 06:01:03 guest kernel: Buffer I/O error on device xvda3, logical block 1150084
> Jul 16 06:01:03 guest kernel: lost page write due to I/O error on xvda3
> Jul 16 06:01:03 dom0 kernel: [452361.449389] 12.xvda3-0: page allocation failure: order:0, mode:0x2200000(GFP_NOWAIT|__GFP_NOTRACK)
> Jul 16 06:01:03 dom0 kernel: [452361.449685] CPU: 1 PID: 28450 Comm: 12.xvda3-0 Tainted: G O 4.9.34-29.el6.x86_64 #1
> Jul 16 06:01:03 dom0 kernel: [452361.449934] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010
> Jul 16 06:01:03 guest kernel: end_request: I/O error, dev xvda3, sector 6102784
> Jul 16 06:01:03 dom0 kernel: [452361.462103] SLUB: Unable to allocate memory on node -1, gfp=0x2000000(GFP_NOWAIT)
> Jul 16 06:01:03 dom0 kernel: [452361.676257] 12.xvda3-0: page allocation failure: order:0, mode:0x2200000(GFP_NOWAIT|__GFP_NOTRACK)
> Jul 16 06:01:03 dom0 kernel: [452361.676531] CPU: 0 PID: 28450 Comm: 12.xvda3-0 Tainted: G O 4.9.34-29.el6.x86_64 #1
> Jul 16 06:01:03 guest kernel: end_request: I/O error, dev xvda3, sector 6127872
> Jul 16 06:01:03 dom0 kernel: [452361.692171] SLUB: Unable to allocate memory on node -1, gfp=0x2000000(GFP_NOWAIT)
> Jul 16 06:01:07 dom0 kernel: [452365.438565] 12.xvda3-0: page allocation failure: order:0, mode:0x2200000(GFP_NOWAIT|__GFP_NOTRACK)
> Jul 16 06:01:07 dom0 kernel: [452365.438870] CPU: 0 PID: 28450 Comm: 12.xvda3-0 Tainted: G O 4.9.34-29.el6.x86_64 #1
> Jul 16 06:01:07 dom0 kernel: [452365.454213] SLUB: Unable to allocate memory on node -1, gfp=0x2000000(GFP_NOWAIT)
> Jul 16 06:01:07 guest kernel: end_request: I/O error, dev xvda3, sector 6477112
> Jul 16 06:01:09 dom0 kernel: [452366.732994] 12.xvda3-0: page allocation failure: order:0, mode:0x2200000(GFP_NOWAIT|__GFP_NOTRACK)
> Jul 16 06:01:09 dom0 kernel: [452366.733306] CPU: 0 PID: 28450 Comm: 12.xvda3-0 Tainted: G O 4.9.34-29.el6.x86_64 #1
> Jul 16 06:01:09 dom0 kernel: [452366.746362] SLUB: Unable to allocate memory on node -1, gfp=0x2000000(GFP_NOWAIT)
> Jul 16 06:01:09 guest kernel: end_request: I/O error, dev xvda3, sector 6546488
> Jul 16 06:01:09 guest kernel: Buffer I/O error on device xvda3, logical block 818311
> Jul 16 06:01:09 guest kernel: lost page write due to I/O error on xvda3
> Jul 16 06:01:09 guest kernel: Buffer I/O error on device xvda3, logical block 818312
> Jul 16 06:01:09 guest kernel: lost page write due to I/O error on xvda3
> Jul 16 06:01:09 guest kernel: Buffer I/O error on device xvda3, logical block 818313
> Jul 16 06:01:09 guest kernel: lost page write due to I/O error on xvda3
> Jul 16 06:01:09 guest kernel: Buffer I/O error on device xvda3, logical block 818314
> Jul 16 06:01:09 guest kernel: lost page write due to I/O error on xvda3
> Jul 16 06:01:09 guest kernel: Buffer I/O error on device xvda3, logical block 818315
> Jul 16 06:01:09 dom0 kernel: [452366.913734] 12.xvda3-0: page allocation failure: order:0, mode:0x2200000(GFP_NOWAIT|__GFP_NOTRACK)
> Jul 16 06:01:09 dom0 kernel: [452366.914002] CPU: 1 PID: 28450 Comm: 12.xvda3-0 Tainted: G O 4.9.34-29.el6.x86_64 #1
> Jul 16 06:01:09 guest kernel: end_request: I/O error, dev xvda3, sector 6366208
> Jul 16 06:01:09 dom0 kernel: [452366.929809] SLUB: Unable to allocate memory on node -1, gfp=0x2000000(GFP_NOWAIT)
> Jul 16 06:01:09 dom0 kernel: [452367.288193] 12.xvda3-0: page allocation failure: order:0, mode:0x2200000(GFP_NOWAIT|__GFP_NOTRACK)
> Jul 16 06:01:09 dom0 kernel: [452367.288455] CPU: 1 PID: 28450 Comm: 12.xvda3-0 Tainted: G O 4.9.34-29.el6.x86_64 #1
> Jul 16 06:01:09 dom0 kernel: [452367.301690] SLUB: Unable to allocate memory on node -1, gfp=0x2000000(GFP_NOWAIT)
> Jul 16 06:01:09 guest kernel: end_request: I/O error, dev xvda3, sector 6630656
> Jul 16 06:01:10 dom0 kernel: [452368.253435] 12.xvda3-0: page allocation failure: order:0, mode:0x2200000(GFP_NOWAIT|__GFP_NOTRACK)
> Jul 16 06:01:10 dom0 kernel: [452368.253701] CPU: 0 PID: 28450 Comm: 12.xvda3-0 Tainted: G O 4.9.34-29.el6.x86_64 #1
> Jul 16 06:01:10 guest kernel: end_request: I/O error, dev xvda3, sector 6708224
>

I will happily create a test kernel with SLAB. What is your config file diff?
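
For anyone comparing builds in the meantime, here is a minimal sketch of how one might check which slab allocator a given kernel config selects, so that a SLAB build can be told apart from a SLUB one. It is not the actual config diff asked about above. The function name `selected_allocator` is made up for illustration, and it assumes the config uses the usual `CONFIG_SLAB=y` / `CONFIG_SLUB=y` / `CONFIG_SLOB=y` choice found in 4.9-era kernels.

```python
import re

def selected_allocator(config_text):
    """Return "SLAB", "SLUB" or "SLOB" if that allocator is set to =y
    in the given kernel config text, or None if none of them is enabled.

    Hypothetical helper for illustration; only exact CONFIG_<NAME>=y
    lines count, so options like CONFIG_SLUB_DEBUG=y are ignored.
    """
    for line in config_text.splitlines():
        match = re.fullmatch(r"CONFIG_(SLAB|SLUB|SLOB)=y", line.strip())
        if match:
            return match.group(1)
    return None

# Example usage (the path is an assumption about where the distribution
# installs the config of the running kernel):
#
#   with open("/boot/config-4.9.34-29.el6.x86_64") as f:
#       print(selected_allocator(f.read()))
```

Running this against the config of a failing kernel and of a recompiled one should show whether the allocator is really the only relevant difference between the two builds.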
_______________________________________________
CentOS-virt mailing list
CentOS-virt@xxxxxxxxxx
https://lists.centos.org/mailman/listinfo/centos-virt