On 09/05/2017 02:26 PM, Kevin Stange wrote: > On 09/04/2017 05:27 PM, Johnny Hughes wrote: >> On 09/04/2017 03:59 PM, Kevin Stange wrote: >>> On 09/02/2017 08:11 AM, Johnny Hughes wrote: >>>> On 09/01/2017 02:41 PM, Kevin Stange wrote: >>>>> On 08/31/2017 07:50 AM, PJ Welsh wrote: >>>>>> A recently created and fully functional CentOS 7.3 VM fails to boot >>>>>> after applying CR updates: >>>>> <snip> >>>>>> Server OS is CentOS 7.3 using Xen (no CR updates): >>>>>> rpm -qa xen\* >>>>>> xen-hypervisor-4.6.3-15.el7.x86_64 >>>>>> xen-4.6.3-15.el7.x86_64 >>>>>> xen-licenses-4.6.3-15.el7.x86_64 >>>>>> xen-libs-4.6.3-15.el7.x86_64 >>>>>> xen-runtime-4.6.3-15.el7.x86_64 >>>>>> >>>>>> uname -a >>>>>> Linux tsxen2.xx.com <http://tsxen2.xx.com> 4.9.39-29.el7.x86_64 #1 SMP >>>>>> Fri Jul 21 15:09:00 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux >>>>>> >>>>>> Sadly, the other issue is that the grub menu will not display for me to >>>>>> select another kernel to see if it is just a kernel issue. >>>>>> >>>>>> The dracut prompt does not show any /dev/disk folder either. >>>>>> >>>>> >>>>> I'm seeing this as well. My host is 4.9.44-29 and Xen 4.4.4-26 from >>>>> testing repo, my guest is 3.10.0-693.1.1. Guest boots fine with >>>>> 514.26.2. The kernel messages that appear to kick off the failure for >>>>> me start with a page allocation failure. It eventually reaches dracut >>>>> failures due to systemd/udev not setting up properly, but I think the >>>>> root is this: >>>>> >>>>> [ 1.970630] ------------[ cut here ]------------ >>>>> [ 1.970651] WARNING: CPU: 2 PID: 225 at mm/vmalloc.c:131 >>>>> vmap_page_range_noflush+0x2c1/0x350 >>>>> [ 1.970660] Modules linked in: >>>>> [ 1.970668] CPU: 2 PID: 225 Comm: systemd-udevd Not tainted >>>>> 3.10.0-693.1.1.el7.x86_64 #1 >>>>> [ 1.970677] 0000000000000000 000000008cddc75d ffff8803e8587bd8 >>>>> ffffffff816a3d91 >>>>> [ 1.970688] ffff8803e8587c18 ffffffff810879c8 00000083811c14e8 >>>>> ffff8800066eb000 >>>>> [ 1.970698] 0000000000000001 ffff8803e86d6940 ffffffffc0000000 >>>>> 0000000000000000 >>>>> [ 1.970708] Call Trace: >>>>> [ 1.970725] [<ffffffff816a3d91>] dump_stack+0x19/0x1b >>>>> [ 1.970736] [<ffffffff810879c8>] __warn+0xd8/0x100 >>>>> [ 1.970742] [<ffffffff81087b0d>] warn_slowpath_null+0x1d/0x20 >>>>> [ 1.970748] [<ffffffff811c0781>] vmap_page_range_noflush+0x2c1/0x350 >>>>> [ 1.970758] [<ffffffff811c083e>] map_vm_area+0x2e/0x40 >>>>> [ 1.970765] [<ffffffff811c1590>] __vmalloc_node_range+0x170/0x270 >>>>> [ 1.970774] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>>> [ 1.970781] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>>> [ 1.970792] [<ffffffff8105f143>] module_alloc+0x73/0xd0 >>>>> [ 1.970798] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>>> [ 1.970804] [<ffffffff810fe754>] module_alloc_update_bounds+0x14/0x70 >>>>> [ 1.970811] [<ffffffff810ff2d2>] load_module+0xb02/0x29e0 >>>>> [ 1.970817] [<ffffffff811c0717>] ? vmap_page_range_noflush+0x257/0x350 >>>>> [ 1.970823] [<ffffffff811c083e>] ? map_vm_area+0x2e/0x40 >>>>> [ 1.970829] [<ffffffff811c1590>] ? __vmalloc_node_range+0x170/0x270 >>>>> [ 1.970838] [<ffffffff81101249>] ? SyS_init_module+0x99/0x110 >>>>> [ 1.970846] [<ffffffff81101275>] SyS_init_module+0xc5/0x110 >>>>> [ 1.970856] [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b >>>>> [ 1.970862] ---[ end trace 2117480876ed90d2 ]--- >>>>> [ 1.970869] vmalloc: allocation failure, allocated 24576 of 28672 bytes >>>>> [ 1.970874] systemd-udevd: page allocation failure: order:0, mode:0xd2 >>>>> [ 1.970883] CPU: 2 PID: 225 Comm: systemd-udevd Tainted: G W >>>>> ------------ 3.10.0-693.1.1.el7.x86_64 #1 >>>>> [ 1.970894] 00000000000000d2 000000008cddc75d ffff8803e8587c48 >>>>> ffffffff816a3d91 >>>>> [ 1.970910] ffff8803e8587cd8 ffffffff81188810 ffffffff8190ea38 >>>>> ffff8803e8587c68 >>>>> [ 1.970923] ffffffff00000018 ffff8803e8587ce8 ffff8803e8587c88 >>>>> 000000008cddc75d >>>>> [ 1.970939] Call Trace: >>>>> [ 1.970946] [<ffffffff816a3d91>] dump_stack+0x19/0x1b >>>>> [ 1.970961] [<ffffffff81188810>] warn_alloc_failed+0x110/0x180 >>>>> [ 1.970971] [<ffffffff811c1654>] __vmalloc_node_range+0x234/0x270 >>>>> [ 1.970981] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>>> [ 1.970989] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>>> [ 1.970999] [<ffffffff8105f143>] module_alloc+0x73/0xd0 >>>>> [ 1.971031] [<ffffffff810fe754>] ? module_alloc_update_bounds+0x14/0x70 >>>>> [ 1.971038] [<ffffffff810fe754>] module_alloc_update_bounds+0x14/0x70 >>>>> [ 1.971046] [<ffffffff810ff2d2>] load_module+0xb02/0x29e0 >>>>> [ 1.971052] [<ffffffff811c0717>] ? vmap_page_range_noflush+0x257/0x350 >>>>> [ 1.971061] [<ffffffff811c083e>] ? map_vm_area+0x2e/0x40 >>>>> [ 1.971067] [<ffffffff811c1590>] ? __vmalloc_node_range+0x170/0x270 >>>>> [ 1.971075] [<ffffffff81101249>] ? SyS_init_module+0x99/0x110 >>>>> [ 1.971081] [<ffffffff81101275>] SyS_init_module+0xc5/0x110 >>>>> [ 1.971088] [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b >>>>> [ 1.971094] Mem-Info: >>>>> [ 1.971103] active_anon:875 inactive_anon:2049 isolated_anon:0 >>>>> [ 1.971103] active_file:791 inactive_file:8841 isolated_file:0 >>>>> [ 1.971103] unevictable:0 dirty:0 writeback:0 unstable:0 >>>>> [ 1.971103] slab_reclaimable:1732 slab_unreclaimable:1629 >>>>> [ 1.971103] mapped:1464 shmem:2053 pagetables:480 bounce:0 >>>>> [ 1.971103] free:4065966 free_pcp:763 free_cma:0 >>>>> [ 1.971131] Node 0 DMA free:15912kB min:12kB low:12kB high:16kB >>>>> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB >>>>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15996kB >>>>> managed:15912kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB >>>>> slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB >>>>> pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB >>>>> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no >>>>> [ 1.971217] lowmem_reserve[]: 0 4063 16028 16028 >>>>> [ 1.971226] Node 0 DMA32 free:4156584kB min:4104kB low:5128kB >>>>> high:6156kB active_anon:952kB inactive_anon:1924kB active_file:0kB >>>>> inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB >>>>> present:4177920kB managed:4162956kB mlocked:0kB dirty:0kB writeback:0kB >>>>> mapped:4kB shmem:1928kB slab_reclaimable:240kB slab_unreclaimable:504kB >>>>> kernel_stack:32kB pagetables:592kB unstable:0kB bounce:0kB >>>>> free_pcp:1760kB local_pcp:288kB free_cma:0kB writeback_tmp:0kB >>>>> pages_scanned:0 all_unreclaimable? no >>>>> [ 1.971264] lowmem_reserve[]: 0 0 11964 11964 >>>>> [ 1.971273] Node 0 Normal free:12091564kB min:12088kB low:15108kB >>>>> high:18132kB active_anon:2352kB inactive_anon:6272kB active_file:3164kB >>>>> inactive_file:35364kB unevictable:0kB isolated(anon):0kB >>>>> isolated(file):0kB present:12591104kB managed:12251788kB mlocked:0kB >>>>> dirty:0kB writeback:0kB mapped:5852kB shmem:6284kB >>>>> slab_reclaimable:6688kB slab_unreclaimable:6012kB kernel_stack:880kB >>>>> pagetables:1328kB unstable:0kB bounce:0kB free_pcp:1196kB >>>>> local_pcp:152kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 >>>>> all_unreclaimable? no >>>>> [ 1.971309] lowmem_reserve[]: 0 0 0 0 >>>>> [ 1.971316] Node 0 DMA: 0*4kB 1*8kB (U) 0*16kB 1*32kB (U) 2*64kB (U) >>>>> 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = >>>>> 15912kB >>>>> [ 1.971343] Node 0 DMA32: 7*4kB (M) 18*8kB (UM) 7*16kB (EM) 3*32kB >>>>> (EM) 1*64kB (E) 2*128kB (UM) 1*256kB (E) 4*512kB (UM) 4*1024kB (UEM) >>>>> 4*2048kB (EM) 1011*4096kB (M) = 4156348kB >>>>> [ 1.971377] Node 0 Normal: 64*4kB (UEM) 10*8kB (UEM) 6*16kB (EM) >>>>> 3*32kB (EM) 3*64kB (UE) 3*128kB (UEM) 1*256kB (E) 2*512kB (UE) 0*1024kB >>>>> 1*2048kB (M) 2951*4096kB (M) = 12091728kB >>>>> [ 1.971413] Node 0 hugepages_total=0 hugepages_free=0 >>>>> hugepages_surp=0 hugepages_size=2048kB >>>>> [ 1.971425] 11685 total pagecache pages >>>>> [ 1.971430] 0 pages in swap cache >>>>> [ 1.971437] Swap cache stats: add 0, delete 0, find 0/0 >>>>> [ 1.971444] Free swap = 0kB >>>>> [ 1.971451] Total swap = 0kB >>>>> [ 1.971456] 4196255 pages RAM >>>>> [ 1.971462] 0 pages HighMem/MovableOnly >>>>> [ 1.971467] 88591 pages reserved >>>>> >>>> >>>> Do any of you guys have access to RHEL to try the RHEL 7.4 Kernel? >>> >>> I think I may. I haven't tried yet, but I'll see if I can get my hands >>> on one and test it tomorrow when I'm back at the office tomorrow. >>> >>> RH closed my bug as "WONTFIX" so far, saying Red Hat Quality Engineering >>> Management declined the request. I started to look at the Red Hat >>> source browser to see the list of patches from 693 to 514, but getting >>> the full list seems impossible because the change log only goes back to >>> 644 and there doesn't seem to be a way to obtain full builds of >>> unreleased kernels. Unless I'm mistaken. >>> >>> I will also do some digging via RH support if I can. >>> >> I would think that RH would want AWS support for RHEL 7.4 and I thought >> AWS was run on Xen // Note: I could be wrong about that. >> >> In any event, at the very least, we can make a kernel that boots PV for >> 7.4 at some point. > > AWS does run on Xen, but the modifications they make to Xen are not > known to me nor which version of Xen they use. They may also run the > domains as HVM, which seems to mitigate the issue here. > > I just verified this kernel issue exists on a RHEL 7.3 system image > under the same conditions, when it's updated to RHEL 7.4 and kernel > 3.10.0-693.2.1.el7.x86_64. > One other option is to run the DomU's as PVHVM: https://wiki.xen.org/wiki/Xen_Linux_PV_on_HVM_drivers That should be much better performance than HVM and may be a workable solution for people who don't want to modify their VM kernel. Here is more info on PVHVM: https://wiki.xen.org/wiki/PV_on_HVM ================ Also heard from someone to try this Config file change to the base kernel and rebuild: CONFIG_RANDOMIZE_BASE=n
Attachment:
signature.asc
Description: OpenPGP digital signature
_______________________________________________ CentOS-virt mailing list CentOS-virt@xxxxxxxxxx https://lists.centos.org/mailman/listinfo/centos-virt