大道 憲一 wrote:
> Hi, I've found a problem in crash 4.0-2.
>
> On an x86_64 system, crash hits a segmentation fault when
> executing "bt -f" on a dumpfile created by NMI.
>
> My system is as follows.
>   CPU          : AMD Opteron(tm) Processor 252
>   arch         : x86_64
>   memory       : 16GB
>   kernel       : 2.6.9-22.EL (RHEL4-U2)
>   crash        : 4.0-2 (RHEL4-U2)
>   diskdumputils: 1.1.9-4 (RHEL4-U2)
>
> The reproduction steps are as follows.
>   1. Boot the x86_64 kernel.
>   2. Start the diskdump service.
>   3. Trigger diskdump by pushing the NMI button.
>   4. Reboot the x86_64 kernel.
>   5. Get the dumpfile by starting the diskdump service.
>   6. Start crash and execute "bt -f" on the dumpfile.
>   7. Segmentation fault after printing the exception stack.
>
> After printing the NMI exception frame, x86_64_low_budget_back_trace_cmd
> calculates the next bt->frameptr without changing RSP. This causes
> the condition
>     bt->frameptr > rsp
> at x86_64.c:1097 in x86_64_display_full_frame, so the loop that
> follows keeps running until it stops with a segmentation fault.
> The attached patch adds the sanity check (bt->frameptr < rsp) in
> x86_64_display_full_frame.
>
> The following example describes this problem when the NMI occurs
> within "default_idle".
>
> @x86_64_low_budget_back_trace_cmd (x86_64.c:1367)
>   1. Exception stack (x86_64.c:1416)
>     a. Print the exception stack.
>     b. Print the register info (RIP, RSP) from the exception stack,
>        describing the function that was running before the NMI
>        exception.
>        RIP points to text in "default_idle".
>        But the area pointed to by RSP holds the address of text in
>        "cpu_idle", because RSP doesn't change while "default_idle"
>        is running.
>     c. bt->frameptr = RSP + sizeof(ulong).
>
>   2. Process stack (x86_64.c:1655)
>     a. Try to print the stack of "cpu_idle" in
>        x86_64_display_full_frame.
>     b. bt->frameptr > RSP because of step 1.c.
>     c. This causes the segmentation fault.
>
> Ken'ichi Ohmichi
>

Hello Ken'ichi,

Excellent -- thanks for fixing this!

Dave
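
For readers following the thread, the failure described in the report can be modelled in a few lines of C. This is only a sketch under stated assumptions, not the crash source or the attached patch: `struct bt_model`, `word_t`, `words_to_display` and `unchecked_words` are hypothetical names, and the word-count formula is an assumed shape for a loop that dumps every stack word from `frameptr` up to `rsp`. With unsigned addresses, `frameptr > rsp` makes the subtraction wrap around, which would explain a loop that runs until it faults.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long word_t;    /* stand-in for the "ulong" used in the report */

/* Hypothetical model of the only bt_info field this sketch needs. */
struct bt_model {
    word_t frameptr;             /* address where the next frame dump starts */
};

/*
 * Unchecked word count (assumed formula): words from frameptr through rsp,
 * inclusive. If frameptr > rsp, the unsigned subtraction wraps around and
 * the result is an enormous count instead of a negative one.
 */
static word_t unchecked_words(const struct bt_model *bt, word_t rsp)
{
    return (rsp - bt->frameptr) / sizeof(word_t) + 1;
}

/*
 * Checked word count: the same calculation guarded by a sanity check, so
 * that nothing is displayed when frameptr is already above rsp.
 */
static size_t words_to_display(const struct bt_model *bt, word_t rsp)
{
    assert(bt != NULL);

    if (bt->frameptr > rsp)      /* the situation produced by step 1.c */
        return 0;

    return (size_t)((rsp - bt->frameptr) / sizeof(word_t)) + 1;
}
```

Setting `bt.frameptr = rsp + sizeof(word_t)`, as in step 1.c of the report, makes `unchecked_words` return a huge value while `words_to_display` returns 0. Returning early is one possible way to handle the condition; the thread does not show how the attached patch words its check.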