On Wed, Mar 4, 2020 at 2:49 PM Dave Anderson <anderson@xxxxxxxxxxxx> wrote:
>
> > Hello List,
> >
> > I have two ELF coredumps from two different HyperV VMs generated by this
> > tool (https://github.com/Azure/azure-linux-utils/tree/master/vm2core).
> >
> > crash works with one of these coredumps but does not work with the other.
> >
> > I've placed the output generated by the crash tool here:
> >
> > Not ok with crash:
> > ./crash/crash /usr/lib/debug/boot/vmlinux-4.15.0-88-generic
> > vm1_numa_4gb_5cpu.coredump --kaslr 600000 -m phys_base=4355784704 -d8
> > https://raw.githubusercontent.com/santoshx/temp/master/notok_with_crash.txt
> >
> > Ok with crash:
> > ./crash/crash /usr/lib/debug/boot/vmlinux-4.15.0-88-generic
> > vm1_nonuma_4gb_5cpu.coredump --kaslr 3c00000 -m phys_base=2344615936 -d8
> > https://raw.githubusercontent.com/santoshx/temp/master/ok_with_crash.txt
> >
> > The problem I see is that in the non-working case crash fails to detect
> > the correct cpu_possible_mask.
> >
> > Relevant part of $ diff ok_with_crash.txt notok_with_crash.txt:
> >
> > < cpu_active_mask: cpus: 0 1 2 3 4
> > < FREEBUF(0)
> > < <readmem: ffffffff86039f40, KVADDR, "pv_init_ops", 8, (ROE), 7ffe01722870>
> > < <read_kdump: addr: ffffffff86039f40 paddr: 91c39f40 cnt: 8>
> > < read_netdump: addr: ffffffff86039f40 paddr: 91c39f40 cnt: 8 offset: 91c3a760
> > ---
> > > <readmem: ffffffff826f2b60, KVADDR, "possible", 1024, (ROE), 5638a35a2280>
> > > <read_kdump: addr: ffffffff826f2b60 paddr: 1060f2b60 cnt: 1024>
> > > read_netdump: addr: ffffffff826f2b60 paddr: 1060f2b60 cnt: 1024 offset: fe0f3380
> > > cpu_possible_mask: cpus: 3 4 5 6 8 13 14 18 20 21 22 26 28 29 30 33 36
> > > 37 38 48 49 52 53 54 56 59 60 61 62 64 65 68 69 70 72 73 74 75 76 78 82
> > > 83 85 86 90 91 93 94 96 99 101 102 104 105 108 109 110 114 116 117 118
> > > 123 124 125 126 128 133 134 138 140 141 142 146 148 149 150 153 156 157
> > > 158 168 169 172 173 174 176 179 180 181 182 184 185 188 189 190 192 193
> > > 194 195 196 198 200 202 205 206 211 212 213 214 216 219 221 222 226 228
> > > 229 230 232 233 234 235 236 238 242 243 245 246 248 251 253 254 256 257
> > > 260 261 262 266 268 269 270 275 276 277 278 280 285 286 290 292 293 294
> > > 298 300 301 302 305 308 309 310 320 321 324 325 326 328 331 332 333 334
> > > 336 337 340 341 342 344 345 346 347 348 350 352 354 357 358 361 362 363
> > > 365 366 370 372 373 374 376 378 381 382 385 388 389 390 392 393 394 395
> > > 396 398 402 403 405 406 408 411 413 414 416 417 420 421 422 426 428 429
> > > 430 435 436 437 438 440 445 446 450 452 453 454 458 460 461 462 465 468
> > > 469 470 480 481 484 485 486 488 491 492 493 494 496 497 500 501 502 504
> > > 505 506 507 508 510 514 515 517 518 520 523 525 526 528 529 532 533 534
> > > 538 540 541 542 547 548 549
> >
> > I'm trying to find where the problem is: in the crash tool, or in the
> > tool that generated the ELF coredumps?
>
> I suspect that it's a problem with the --kaslr offset and/or
> the phys_base value that you have used.

Is there a method to find or print the kaslr offset and phys_base on a
running Linux system?

> It appears that the read of the cpu_possible mask is not using the
> correct virtual address, or perhaps the wrong physical address, and
> as a result it is trying to translate bogus data. In fact, the full
> output txt file shows that everything that it reads is garbage, e.g.,
> the cpu masks, the utsname data structure, the linux_banner string, etc.
>
> Dave
>
> --
> Crash-utility mailing list
> Crash-utility@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/crash-utility
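
For reference, the paddr values in the -d8 output above are consistent with
the usual x86_64 kernel-text translation,
paddr = vaddr - __START_KERNEL_map + phys_base. Below is a minimal sketch of
that arithmetic, assuming the standard x86_64 value of __START_KERNEL_map;
the function name is made up for illustration and is not part of crash.

```python
# Start of the x86_64 kernel text mapping (__START_KERNEL_map).
# Assumption: the standard x86_64 layout applies to this kernel.
START_KERNEL_MAP = 0xFFFFFFFF80000000


def kernel_text_vtop(vaddr: int, phys_base: int) -> int:
    """Translate a kernel-text virtual address to a physical address.

    Hypothetical helper for illustration only; it mirrors the arithmetic
    that the readmem/read_kdump debug lines appear to follow.
    """
    if vaddr < START_KERNEL_MAP:
        raise ValueError("address is below the kernel text mapping")
    return vaddr - START_KERNEL_MAP + phys_base


if __name__ == "__main__":
    # Non-working dump: "possible" read with phys_base=4355784704
    print(hex(kernel_text_vtop(0xFFFFFFFF826F2B60, 4355784704)))  # 0x1060f2b60
    # Working dump: "pv_init_ops" read with phys_base=2344615936
    print(hex(kernel_text_vtop(0xFFFFFFFF86039F40, 2344615936)))  # 0x91c39f40
```

Both results match the paddr fields in the diff, so this only confirms that
crash applied the phys_base it was given in each case. It does not show
whether the supplied phys_base (or --kaslr offset) is the right one for the
non-working dump.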