On 7/15/21 10:47 AM, Gavin Shan wrote: > Hi Anshuman, > > On 7/14/21 3:26 PM, Anshuman Khandual wrote: >> On 7/13/21 6:50 AM, Gavin Shan wrote: >>> On 7/12/21 2:14 PM, Anshuman Khandual wrote: >>>> Though I have not jumped into the details for all individual >>>> patches here but still have some high level questions below. >>>> >>>> On 7/6/21 11:47 AM, Gavin Shan wrote: >>>>> There are couple of issues with current implementations and this series >>>>> tries to resolve the issues: >>>>> >>>>> (a) All needed information are scattered in variables, passed to various >>>>> test functions. The code is organized in pretty much relaxed fashion. >>>> All these variables are first prepared in debug_vm_pgtable(), before >>>> getting passed into respective individual test functions. Also these >>>> test functions receive only the required number of variables not all. >>>> Adding a structure that captures all test parameters at once before >>>> passing them down will be unnecessary. I am still wondering what will >>>> be the real benefit of this large code churn ? >>>> >>> >>> Thanks for your review. There are couple of reasons to have "struct vm_pgtable_debug". >>> >>> (1) With the struct, the old and new implementation can coexist. In this way, >>> the patches in this series can be stacked up easily. >> >> Makes sense. >> >>> (2) I think passing single struct to individual test functions improves the >>> code readability. Besides, it also makes the empty stubs simplified. >> >> Empty stub simplified - reduced argument set in the empty stubs ? >> > > Yes. > >>> (3) The code can be extended easily if we need in future. >> >> Agreed. >> >>> >>>>> >>>>> (b) The page isn't allocated from buddy during page table entry modifying >>>>> tests. The page can be invalid, conflicting to the implementations >>>>> of set_{pud, pmd, pte}_at() on ARM64. The target page is accessed >>>>> so that the iCache can be flushed when execution permission is given >>>>> on ARM64. Besides, the target page can be unmapped and access to >>>>> it causes kernel crash. >>>> >>>> Using 'start_kernel' based method for struct page usage, enabled this >>>> test to run on platforms which might not have enough memory required >>>> for various individual test functions. This method is not a problem for >>>> tests that just need an aligned pfn (which creates a page table entry) >>>> not a real struct page. >>>> >>>> But not allocating and owning the struct page might be problematic for >>>> tests that expect a real struct page and transform its state via set_ >>>> {pud, pmd, pte}_at() functions as reported here. >>>> >>> >>> Yeah, I totally agree. The series follows what you explained: Except the >>> test cases where set_{pud, pmd, pte}_at() is used, the allocated page >>> is used. For other test cases, 'start_kernel' based PFN is used as before. >>> >>>>> >>>>> "struct vm_pgtable_debug" is introduced to address issue (a). For issue >>>>> (b), the used page is allocated from buddy in page table entry modifying >>>>> tests. The corresponding tets will be skipped if we fail to allocate the >>>>> (huge) page. For other test cases, the original page around to kernel >>>>> symbol (@start_kernel) is still used. >>>> >>>> For all basic pfn requiring tests, existing 'start_kernel' based method >>>> should continue but allocate a struct page for other tests which change >>>> the passed struct page. Skipping the tests when allocation fails is the >>>> right thing to do. >>>> >>> >>> Yes, it's exactly what this series does. Hope you can jump into the details >>> when you get a chance :) >> >> I have already started looking into the series. But still wondering if >> the huge page memory allocation change and the arm64 specific page fix >> should be completed first, before getting into the new structure based >> arguments (in a separate series). Although the end result would still >> remain the same, the transition there would be better I guess. Do you >> see any challenges in achieving that ? >> > > Thanks for your time to review in details. As I can understand, the reason > to have the fix for easy backporting to stable-kernel and I didn't do that > because of couple of facts: (1) The changes included in this series only > affects only one source file, so backporting the whole series isn't hard. > (2) There will be more redundant code if we include the fix before switching > to "struct vm_pgtable_debug". It's unnecessary. Okay. > > So lets keep the patch layout we had if you agree. Actually, the issues are > found during the testing with RHEL downstream kernel. Once it's settled down, > I will backport the whole series to RHEL downstream kernel. Okay, then lets keep this proposed layout and address the issues here. - Anshuman