On 2019-01-28 09:55, lijiang wrote:
> On 2019-01-25 22:32, Lendacky, Thomas wrote:
>> On 1/24/19 9:55 PM, dyoung@xxxxxxxxxx wrote:
>>> + Tom
>>> On 01/25/19 at 11:06am, lijiang wrote:
>>>> On 2019-01-24 06:16, Kazuhito Hagio wrote:
>>>>> On 1/22/2019 3:03 AM, Lianbo Jiang wrote:
>>>>>> For AMD machine with SME feature, if SME is enabled in the first
>>>>>> kernel, the crashed kernel's page table(pgd/pud/pmd/pte) contains
>>>>>> the memory encryption mask, so makedumpfile needs to remove the
>>>>>> memory encryption mask to obtain the true physical address.
>>>>>>
>>>>>> Signed-off-by: Lianbo Jiang <lijiang@xxxxxxxxxx>
>>>>>> ---
>>>>>>  arch/x86_64.c  | 3 +++
>>>>>>  makedumpfile.c | 1 +
>>>>>>  2 files changed, 4 insertions(+)
>>>>>>
>>>>>> diff --git a/arch/x86_64.c b/arch/x86_64.c
>>>>>> index 537fb78..7651d36 100644
>>>>>> --- a/arch/x86_64.c
>>>>>> +++ b/arch/x86_64.c
>>>>>> @@ -346,6 +346,7 @@ __vtop4_x86_64(unsigned long vaddr, unsigned long pagetable)
>>>>>>              return NOT_PADDR;
>>>>>>          }
>>>>>>          pud_paddr = pgd & ENTRY_MASK;
>>>>>> +        pud_paddr = pud_paddr & ~(NUMBER(sme_mask));
>>>>>>      }
>>>>>>
>>>>>>      /*
>>>>>> @@ -371,6 +372,7 @@ __vtop4_x86_64(unsigned long vaddr, unsigned long pagetable)
>>>>>>       * Get PMD.
>>>>>>       */
>>>>>>      pmd_paddr = pud_pte & ENTRY_MASK;
>>>>>> +    pmd_paddr = pmd_paddr & ~(NUMBER(sme_mask));
>>>>>>      pmd_paddr += pmd_index(vaddr) * sizeof(unsigned long);
>>>>>>      if (!readmem(PADDR, pmd_paddr, &pmd_pte, sizeof pmd_pte)) {
>>>>>>          ERRMSG("Can't get pmd_pte (pmd_paddr:%lx).\n", pmd_paddr);
>>>>>> @@ -391,6 +393,7 @@ __vtop4_x86_64(unsigned long vaddr, unsigned long pagetable)
>>>>>>       * Get PTE.
>>>>>>       */
>>>>>>      pte_paddr = pmd_pte & ENTRY_MASK;
>>>>>> +    pte_paddr = pte_paddr & ~(NUMBER(sme_mask));
>>>>>>      pte_paddr += pte_index(vaddr) * sizeof(unsigned long);
>>>>>>      if (!readmem(PADDR, pte_paddr, &pte, sizeof pte)) {
>>>>>>          ERRMSG("Can't get pte (pte_paddr:%lx).\n", pte_paddr);
>>>>>> diff --git a/makedumpfile.c b/makedumpfile.c
>>>>>> index a03aaa1..81c7bb4 100644
>>>>>> --- a/makedumpfile.c
>>>>>> +++ b/makedumpfile.c
>>>>>> @@ -977,6 +977,7 @@ next_page:
>>>>>>      read_size = MIN(info->page_size - PAGEOFFSET(paddr), size);
>>>>>>
>>>>>>      pgaddr = PAGEBASE(paddr);
>>>>>> +    pgaddr = pgaddr & ~(NUMBER(sme_mask));
>>>>>
>>>>> Since NUMBER(sme_mask) is initialized with -1 (NOT_FOUND_NUMBER),
>>>>> if the sme_mask is not in vmcoreinfo, ~(NUMBER(sme_mask)) will be 0.
>>>>> So the four lines added above need
>>>>>
>>>>>     if (NUMBER(sme_mask) != NOT_FOUND_NUMBER)
>>>>>         ...
>>>>>
>>>>
>>>> Thank you very much for pointing out my mistake.
>>>>
>>>> I will improve it and post again.
>>
>> Might be worth creating a local variable that includes ENTRY_MASK and
>> NUMBER(sme_mask) so that you make the check just once. Then use that
>> variable in place of ENTRY_MASK in the remainder of the function so
>> that the correct value is used throughout.
>>

Ok.

>> This would also cover the 5-level path which would make this future
>> proof should AMD someday support 5-level paging.
>>
>
> Thank you, Tom. Makedumpfile will cover the 5-level path in next post,
> though AMD does not support 5-level paging yet.
>

I mean that I will improve this patch and cover the 5-level path in patch v2.

Thanks.

> Thanks.
> Lianbo
>
>>>>
>>>>> and, what I'm wondering is whether it doesn't need to take hugepages
>>>>> into account such as this
>>>>>
>>>>> 392         if (pmd_pte & _PAGE_PSE)         /* 2MB pages */
>>>>> 393                 return (pmd_pte & ENTRY_MASK & PMD_MASK) +
>>>>> 394                         (vaddr & ~PMD_MASK);
>>>>> "arch/x86_64.c"
>>>>>
>>>>
>>>> This is a good question.
>>>> Theoretically, it should be modified accordingly for
>>>> huge pages case.
>>
>> Yes, this should also have the ~(NUMBER(sme_mask)) applied to it. You
>> can probably add some debugging to see if you're hitting this case and
>> whether the encryption bit (sme_mask) is set just to help understand what
>> is occurring. This also goes for the 1GB page check above. However, if
>> you use my suggestion of a local variable then you should be covered.
>>

Thank you, Tom. I will modify this patch and cover the huge pages case in patch v2.

Thanks.
Lianbo

>> Thanks,
>> Tom
>>
>>>>
>>>> But makedumpfile still works well without this change. And I'm sure that the
>>>> huge pages are enabled in crashed kernel. This is very strange.
>>>>
>>>> Thanks.
>>>> Lianbo
>>>>
>>>>> Thanks,
>>>>> Kazu
>>>>>
>>>>>
>>>>>>      pgbuf = cache_search(pgaddr, read_size);
>>>>>>      if (!pgbuf) {
>>>>>>          ++cache_miss;
>>>>>> --
>>>>>> 2.17.1
>>>>>>
>>>>>

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec
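
For readers following the review comments in this thread, here is a minimal sketch of the two points raised: why masking with ~(NUMBER(sme_mask)) zeroes the address when sme_mask is missing from vmcoreinfo, and how a single precomputed mask (Tom's local-variable suggestion) would also cover the 2MB huge page return. This is not the v2 patch. The demo_* names, the DEMO_ENTRY_MASK value and the choice of bit 47 for the encryption mask are assumptions made for illustration; only NOT_FOUND_NUMBER (-1), ENTRY_MASK, PMD_MASK and sme_mask come from the discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions live in makedumpfile. */
#define NOT_FOUND_NUMBER   ((int64_t)-1)
#define DEMO_ENTRY_MASK    UINT64_C(0x000ffffffffff000)   /* assumed: bits 12..51 */
#define DEMO_PMD_MASK      (~((UINT64_C(1) << 21) - 1))   /* 2MB alignment */

/* The pattern from the posted patch: strip the mask unconditionally. */
static uint64_t demo_naive_strip(uint64_t paddr, int64_t sme_mask)
{
    /* With sme_mask == -1, ~sme_mask is 0 and the address is lost. */
    return paddr & ~(uint64_t)sme_mask;
}

/*
 * Build the mask once: ENTRY_MASK with the encryption bit cleared when
 * sme_mask was found, plain ENTRY_MASK otherwise.
 */
static uint64_t demo_entry_mask(int64_t sme_mask)
{
    uint64_t mask = DEMO_ENTRY_MASK;

    if (sme_mask != NOT_FOUND_NUMBER)
        mask &= ~(uint64_t)sme_mask;
    return mask;
}

/* 2MB huge page translation using the combined mask. */
static uint64_t demo_pmd_to_paddr(uint64_t pmd_pte, uint64_t vaddr,
                                  int64_t sme_mask)
{
    uint64_t mask = demo_entry_mask(sme_mask);

    return (pmd_pte & mask & DEMO_PMD_MASK) + (vaddr & ~DEMO_PMD_MASK);
}
```

With an example mask of 1 << 47, demo_pmd_to_paddr() drops the encryption bit from a huge page entry, while passing NOT_FOUND_NUMBER leaves the entry untouched instead of collapsing it to 0 as demo_naive_strip() does.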