Paul Mackerras <paulus@xxxxxxxxx> writes: > On Mon, Mar 04, 2013 at 05:11:53PM +0530, Aneesh Kumar K.V wrote: >> Paul Mackerras <paulus@xxxxxxxxx> writes: >> >> +static inline int hpte_actual_psize(struct hash_pte *hptep, int psize) >> >> +{ >> >> + unsigned int mask; >> >> + int i, penc, shift; >> >> + /* Look at the 8 bit LP value */ >> >> + unsigned int lp = (hptep->r >> LP_SHIFT) & ((1 << LP_BITS) - 1); >> >> + >> >> + penc = 0; >> >> + for (i = 0; i < MMU_PAGE_COUNT; i++) { >> >> + /* valid entries have a shift value */ >> >> + if (!mmu_psize_defs[i].shift) >> >> + continue; >> >> + >> >> + /* encoding bits per actual page size */ >> >> + shift = mmu_psize_defs[i].shift - 11; >> >> + if (shift > 9) >> >> + shift = 9; >> >> + mask = (1 << shift) - 1; >> >> + if ((lp & mask) == mmu_psize_defs[psize].penc[i]) >> >> + return i; >> >> + } >> >> + return -1; >> >> +} >> > >> > This doesn't look right to me. First, it's not clear what the 11 and >> > 9 refer to, and I think the 9 should be LP_BITS (i.e. 8). Secondly, >> > the mask for the comparison needs to depend on the actual page size >> > not the base page size. >> >> That 11 should be 12.That depends on the fact that we have below mapping > > And the 12 should be LP_SHIFT, shouldn't it? LP_SHIFT would indicate how many bit poisition need to be shifted to get to the LP field in HPTE. I guess what we want here is shift value for 4K page. How about shift = mmu_psize_defs[i].shift - mmu_psize_defs[MMU_PAGE_4K].shift; > >> rrrr rrrz ≥8KB >> >> Yes, that 9 should be LP_BITs. >> >> We are generating mask based on actual page size above (variable i in >> the for loop). > > OK, yes, you're right. > >> > I don't see where in this function you set the penc[] elements for >> > invalid actual page sizes to -1. >> >> We do the below >> >> --- a/arch/powerpc/mm/hash_utils_64.c >> +++ b/arch/powerpc/mm/hash_utils_64.c >> @@ -125,7 +125,7 @@ static struct mmu_psize_def mmu_psize_defaults_old[] = { >> [MMU_PAGE_4K] = { >> .shift = 12, >> .sllp = 0, >> - .penc = 0, >> + .penc = { [0 ... MMU_PAGE_COUNT - 1] = -1 }, >> .avpnm = 0, > > Yes, which sets them for the entries you initialize, but not for the > others. For example, the entry for MMU_PAGE_64K will initially be all > zeroes. Then we find an entry in the ibm,segment-page-sizes property > for 64k pages, so we set mmu_psize_defs[MMU_PAGE_64K].shift to 16, > making that entry valid, but we never set any of the .penc[] entries > to -1, leading your other code to think that it can do (say) 1M pages > in a 64k segment using an encoding of 0. > Noticed that earlier. This is what i currently have. +static void mmu_psize_set_default_penc(struct mmu_psize_def *mmu_psize) +{ + int bpsize, apsize; + for (bpsize = 0; bpsize < MMU_PAGE_COUNT; bpsize++) + for (apsize = 0; apsize < MMU_PAGE_COUNT; apsize++) + mmu_psize[bpsize].penc[apsize] = -1; +} + static void __init htab_init_page_sizes(void) { int rc; + mmu_psize_set_default_penc(mmu_psize_defaults_old); + /* Default to 4K pages only */ memcpy(mmu_psize_defs, mmu_psize_defaults_old, sizeof(mmu_psize_defaults_old)); @@ -411,6 +443,8 @@ static void __init htab_init_page_sizes(void) if (rc != 0) /* Found */ goto found; + mmu_psize_set_default_penc(mmu_psize_defaults_gp); + /* * Not in the device-tree, let's fallback on known size * list for 16M capable GP & GR Modified arch/powerpc/mm/hugetlbpage-hash64.c > Also, I noticed that the code in the if (base_idx < 0) statement is > wrong. It needs to advance prop (and decrease size) by 2 * lpnum, > not just 2. > Ok. Fixed now. -aneesh -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href