On Sun, Sep 11, 2016 at 11:29 PM, Oliver O'Halloran <oohall@xxxxxxxxx> wrote:
> On Mon, Sep 12, 2016 at 3:31 AM, Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>> As evidenced by this bug report [1], userspace libraries are interested
>> in whether a mapping is DAX mapped, i.e. no intervening page cache.
>> Rather than using the ambiguous VM_MIXEDMAP flag in smaps, provide an
>> explicit "is dax" indication as a new flag in the page vector populated
>> by mincore.
>>
>> There are also cases, particularly for testing and validating a
>> configuration to know the hardware mapping geometry of the pages in a
>> given process address range. Consider filesystem-dax where a
>> configuration needs to take care to align partitions and block
>> allocations before huge page mappings might be used, or
>> anonymous-transparent-huge-pages where a process is opportunistically
>> assigned large pages. mincore2() allows these configurations to be
>> surveyed and validated.
>>
>> The implementation takes advantage of the unused bits in the per-page
>> byte returned for each PAGE_SIZE extent of a given address range. The
>> new format of each vector byte is:
>>
>> (TLB_SHIFT - PAGE_SHIFT) << 2 | vma_is_dax() << 1 | page_present
>
> What is userspace expected to do with the information in vec? Whether
> PMD or THP mappings can be used is going to depend more on the block
> allocations done by the filesystem rather than anything an
> application can directly influence. Returning a vector for each page
> makes some sense in the mincore() case since the application can touch
> each page to fault them in, but I don't see what they can do here.

It's not a "can huge pages be used?" question; it's interrogating the
mapping that got established after the fact. If an
application/environment expects huge mappings, but pte mappings are
getting established

> Why not just get rid of vec entirely and make mincore2() a yes/no
> check over the range for whatever is supplied in flags? That would
> work for NVML's use case and it should be easier to extend if needed.

I think having a way to ask the kernel if an address range satisfies a
certain set of input attributes is a useful interface. Perhaps a
"MINCORE_CHECK" flag can indicate that the input vector contains a
single character that it wants the kernel to validate during the page
table walk, and return zero or the offset of the first mismatch.
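
To make the vector byte format and the check idea concrete, here is a rough userspace sketch, not the patch itself. The macro and function names (VEC_PRESENT, VEC_DAX, VEC_TLB_SHIFT, vec_encode, vec_decode, vec_first_mismatch) are hypothetical; the proposal only gives the bit layout. The mismatch helper is a userspace emulation of what a "MINCORE_CHECK"-style walk might report, and it returns -1 for "no mismatch" purely so that a mismatch at index 0 stays distinguishable; that return convention is my assumption, not part of the proposal.

```c
#include <stddef.h>

/* Hypothetical names for the proposed per-page vector byte layout:
 *   (TLB_SHIFT - PAGE_SHIFT) << 2 | vma_is_dax() << 1 | page_present
 */
#define VEC_PRESENT   0x01u /* bit 0: page_present */
#define VEC_DAX       0x02u /* bit 1: vma_is_dax() */
#define VEC_TLB_SHIFT 2     /* bits 2..7: TLB_SHIFT - PAGE_SHIFT */

struct vec_info {
	int present;        /* page is resident/mapped */
	int dax;            /* mapping is DAX, no page cache */
	unsigned int order; /* TLB_SHIFT - PAGE_SHIFT */
};

/* Build a vector byte from its three fields. */
static unsigned char vec_encode(unsigned int order, int dax, int present)
{
	return (unsigned char)(order << VEC_TLB_SHIFT |
			       (dax ? VEC_DAX : 0) |
			       (present ? VEC_PRESENT : 0));
}

/* Split a vector byte back into its three fields. */
static struct vec_info vec_decode(unsigned char b)
{
	struct vec_info v;

	v.present = (b & VEC_PRESENT) != 0;
	v.dax = (b & VEC_DAX) != 0;
	v.order = b >> VEC_TLB_SHIFT;
	return v;
}

/*
 * Userspace emulation of the check: compare every byte of vec against
 * the single expected byte. Returns the index of the first mismatch,
 * or -1 if the whole range matches (assumed convention, see above).
 */
static long vec_first_mismatch(const unsigned char *vec, size_t n,
			       unsigned char want)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (vec[i] != want)
			return (long)i;
	return -1;
}
```

For example, assuming 4 KiB base pages (PAGE_SHIFT of 12) and a 2 MiB PMD mapping (TLB_SHIFT of 21), a present DAX page would be reported as 9 << 2 | 1 << 1 | 1, i.e. 0x27, and an application expecting huge DAX mappings could pass 0x27 as the byte to validate against.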