On 2020-01-10 at 14:22 Michal Hocko wrote: >On Fri 10-01-20 10:52:56, Li Xinhai wrote: >> On 2020-01-10 at 07:00 Mike Kravetz wrote: >> >On 1/9/20 2:48 PM, Li Xinhai wrote: >> >> oops, I didn't write the code correctly. I should wrote it as >> >> >> >> if (pfn >= hpage_pfn && pfn - hpage_pfn < hpage_nr_pages(hpage)) { >> >> VM_BUG_ON_PAGE(PageHuge(hpage) && pfn != hpage_pfn, hpage); >> >> return true; >> >> } >> >> >> >> return false; >> >> >> >> hpage_nr_pages(hpage) give us HPAGE_PMD_NR for THP and hugetlbfs page, >> >> but remapping PTE to a differrnt hugetlbfs page still allowed, so put the BUG code >> >> into this condition is necessary. By this way, if it was not a exact match for PageHuge, >> >> then it is a bug. >> > >> >Thank you. I think we all agree on what the proposed code is doing. >> >However, we would like to know why you believe this code should be added. >> >For example, >> >- Did you actually encounter this situation (PageHuge(hpage) && pfn != >> > hpage_pfn)? >> >- Did you discover some code path where we are likely to encounter this >> > situation? >> >- Some other reason? >> >> I didn't actually encounter this condition. >> >> There are two ways for faulty code, >> 1. one is from the 'hpage', it could be head or tail page of hugetlbfs (I see that >> current code make sure always call with head page as you mentioned). Luckly, >> we catch the tail page case as BUG at begining of this mapped_walk(the >> page_hstate(page) return NULL for tail page). >> 2. The other is from the content stored in the PTE, wihch we used as 'pfn' and >> compare with 'hpage'. >> >> Current code matches 'pfn' and 'hpage' like below: >> - normal 4k page: hpage_pfn <= pfn < hpage_pfn + 1 >> - THP, hugetlbfs page: hpage_pfn <= pfn < hpage_pfn + HPAGE_PMD_NR >> we need do exact match for normal 4K page and hugetlbfs page, and range >> match for THP. > >This still doesn't really explain why to add the BUG_ON, I am afraid. >pfn_in_hpage is called from the vma walk. check_pte is reponsible to >check the page table mapping so the input to pfn_in_hpage should be >already sanitized. If it is not then that should be fixed and {VM_}BUG_ON >is not the best way to do such a sanitization IMHO. First of all this is >all under locks so crashing would likely mean a follow up problems. >On the other hand a failure can be handled gracefully AFAICS. > >That being said I still do not see how this is going to help with >anything. Please note that adding {VM}_BUG_ON as general asserts is >strongly discouraged. Those should be added only when there is a >data corruption detected (and then it should likely be BUG_ON rather >than VM_BUG_ON) that cannot be handled gracefully or when it >considerably improves debugability of very subtle problems. > pfn_in_hpage() has purpose to check all those three types of page, not just THP as implied in its name. Change name as below would be clear static inline bool pfn_is_matched(struct page *page, unsigned long pfn) { unsigned long page_pfn = page_to_pfn(page); /* normal page and hugetlbfs page */ if (!PageCompound(page) || PageHuge(page)) return page_pfn == pfn; /* THP can be referenced by any subpage */ return pfn >= page_pfn && pfn - page_pfn < hpage_nr_pages(page); } check_pte is doing right thing and helps to collect 'pfn' from both normal PTE and migration entry. No chnages to it. >-- >Michal Hocko >SUSE Labs