Solar Designer <solar@xxxxxxxxxxxx> writes:

> On Wed, Mar 19, 2025 at 07:06:57PM +0800, Yunsheng Lin wrote:
>> On 2025/3/19 4:55, Toke Høiland-Jørgensen wrote:
>> > Yunsheng Lin <linyunsheng@xxxxxxxxxx> writes:
>> >> On 2025/3/17 23:16, Toke Høiland-Jørgensen wrote:
>> >>> Yunsheng Lin <yunshenglin0825@xxxxxxxxx> writes:
>> >>>> On 3/14/2025 6:10 PM, Toke Høiland-Jørgensen wrote:
>> >>>>
>> >>>> ...
>> >>>>
>> >>>>> To avoid having to walk the entire xarray on unmap to find the page
>> >>>>> reference, we stash the ID assigned by xa_alloc() into the page
>> >>>>> structure itself, using the upper bits of the pp_magic field. This
>> >>>>> requires a couple of defines to avoid conflicting with the
>> >>>>> POISON_POINTER_DELTA define, but this is all evaluated at compile-time,
>> >>>>> so does not affect run-time performance. The bitmap calculations in this
>> >>>>> patch give the following number of bits for different architectures:
>> >>>>>
>> >>>>> - 24 bits on 32-bit architectures
>> >>>>> - 21 bits on PPC64 (because of the definition of ILLEGAL_POINTER_VALUE)
>> >>>>> - 32 bits on other 64-bit architectures
>> >>>>
>> >>>> From commit c07aea3ef4d4 ("mm: add a signature in struct page"):
>> >>>> "The page->signature field is aliased to page->lru.next and
>> >>>> page->compound_head, but it can't be set by mistake because the
>> >>>> signature value is a bad pointer, and can't trigger a false positive
>> >>>> in PageTail() because the last bit is 0."
>> >>>>
>> >>>> And commit 8a5e5e02fc83 ("include/linux/poison.h: fix LIST_POISON{1,2}
>> >>>> offset"):
>> >>>> "Poison pointer values should be small enough to find a room in
>> >>>> non-mmap'able/hardly-mmap'able space."
>> >>>>
>> >>>> So the question seems to be:
>> >>>> 1. Does stashing the ID cause page->pp_magic to be in the mmap'able/
>> >>>>    easier-mmap'able space? If yes, how can we make sure this will not
>> >>>>    cause any security problem?
>> >>>> 2. Does masking page->pp_magic cause a valid pointer in
>> >>>>    page->lru.next or page->compound_head to be treated as a valid
>> >>>>    PP_SIGNATURE, which might cause page_pool to recycle a page not
>> >>>>    allocated via page_pool?
>> >>>
>> >>> Right, so my reasoning for why the defines in this patch work for
>> >>> this is as follows: in both cases we need to make sure that the ID
>> >>> stashed in that field never looks like a valid kernel pointer. For
>> >>> 64-bit arches (where CONFIG_ILLEGAL_POINTER_VALUE is set), we make
>> >>> sure of this by never writing to any bits that overlap with the
>> >>> illegal value (so that the PP_SIGNATURE written to the field keeps it
>> >>> as an illegal pointer value). For 32-bit arches, we make sure of this
>> >>> by keeping the top-most bit always 0 (the -1 in the define for
>> >>> _PP_DMA_INDEX_BITS in the patch), which puts it outside the range
>> >>> used for kernel pointers (AFAICT).
>> >>
>> >> Is there any reason you think only kernel pointers are relevant here?
>> >
>> > Yes. Any pointer stored in the same space as pp_magic by other users of
>> > the page will be a kernel pointer (as it comes from page->lru.next). The
>> > goal of PP_SIGNATURE is to be able to distinguish pages allocated by
>> > page_pool, so we don't accidentally recycle a page from somewhere else.
>> > That's the goal of the check in page_pool_page_is_pp():
>> >
>> > (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE
>> >
>> > To achieve this, we must ensure that the check above never returns true
>> > for any value another page user could have written into the same field
>> > (i.e., into page->lru.next). For 64-bit arches, POISON_POINTER_DELTA
>>
>> POISON_POINTER_DELTA is defined according to CONFIG_ILLEGAL_POINTER_VALUE;
>> if CONFIG_ILLEGAL_POINTER_VALUE is not defined, POISON_POINTER_DELTA
>> is defined to zero.
>>
>> Grepping suggests that only the 64-bit arches below define
>> CONFIG_ILLEGAL_POINTER_VALUE:
>> a29815a333c6 core, x86: make LIST_POISON less deadly
>> 5c178472af24 riscv: define ILLEGAL_POINTER_VALUE for 64bit
>> f6853eb561fb powerpc/64: Define ILLEGAL_POINTER_VALUE for 64-bit
>> bf0c4e047324 arm64: kconfig: Move LIST_POISON to a safe value
>>
>> The 64-bit arches below don't seem to define the above config yet:
>> MIPS64, SPARC64, System z (S390X), LoongArch
>>
>> Does ID stashing cause problems for the above arches?
>>
>> > serves this purpose. For 32-bit arches, we can leave the top-most bits
>> > out of PP_MAGIC_MASK, to make sure that any valid pointer value will
>> > fail the check above.
>>
>> The above mainly explains how to ensure page_pool_page_is_pp() will
>> not return a false positive, from the page_pool perspective.
>>
>> From the MM/security perspective, most of the commits quoted above
>> seem to suggest that poison pointers should be in non-mmap'able or
>> hardly-mmap'able space; otherwise userspace can arrange for those
>> pointers to actually be dereferenceable, potentially turning an oops
>> into an exploit. A more detailed example is in the paper below, which
>> explains how to exploit a vulnerability that was hardened by commit
>> 8a5e5e02fc83:
>> https://www.usenix.org/system/files/conference/woot15/woot15-paper-xu.pdf
>>
>> ID stashing seems to cause page->lru.next (aliased to page->pp_magic) to
>> be in the mmap'able space for some arches.
>
> ...
>
>> To be honest, I am not that familiar with the pointer poison mechanism.
>> But after some research and analysis, it seems sensible to overstate
>> it a little, as it appears to be security-related.
>> Cc'ed some security-related experts and mailing lists to see if they
>> can provide some clarification.
>
> You're correct that the pointer poison values should be in areas not
> mmap'able by userspace (at least with reasonable mmap_min_addr values).
>
> Looking at the union inside "struct page", I see pp_magic is aliased
> against multiple pointers in the union'ed anonymous structs.
>
> I'm not familiar with the uses of page->pp_magic and how likely or not
> we are to have a bug where its aliasing with pointers would be exposed
> as an attack vector, but this does look like a serious security concern.
> It looks like we would be seriously weakening the poisoning, except on
> archs where the new values with ID stashing are still not mmap'able.
>
> I just discussed the matter with my colleague at CIQ, Sultan Alsawaf,
> and he thinks the added risk is not that bad. He wrote:
>
>> Toke's response here is fair:
>>
>> > Right, okay, I see what you mean. So the risk is basically the
>> > following:
>> >
>> > If some other part of the kernel ends up dereferencing the
>> > page->lru.next pointer of a page that is owned by page_pool, and which
>> > has an ID stashed into page->pp_magic, that dereference can end up being
>> > to a valid userspace mapping, which can lead to Bad Things(tm), cf the
>> > paper above.
>> >
>> > This is mitigated by the fact that it can only happen on architectures
>> > that don't set ILLEGAL_POINTER_VALUE (which includes 32-bit arches, and
>> > the ones you listed above). In addition, this has to happen while the
>> > page is owned by page_pool, and while it is DMA-mapped - we already
>> > clear the pp_magic field when releasing the page from page_pool.
>> >
>> > I am not sure to what extent the above is a risk we should take pains to
>> > avoid, TBH. It seems to me that for this to become a real problem, lots
>> > of other things will already have gone wrong. But happy to defer to the
>> > mm/security folks here.
>>
>> For this to be a problem, there already needs to be a use-after-free on
>> a page, which arguably creates many other vectors for attack.
>>
>> The lru field of struct page is already used as a generic list pointer
>> in several places in the kernel once ownership of the page is obtained.
>> Any risk of dereferencing lru.next in a use-after-free scenario would
>> technically apply to a bunch of other places in the kernel (grep for
>> page->lru).
>
> We also tried searching for existing exploitation techniques for "struct
> page" use-after-free. We couldn't find any. The closest (non-)match I
> found is this fine research (the same project presented differently):
>
> https://i.blackhat.com/BH-US-24/Presentations/US24-Qian-PageJack-A-Powerful-Exploit-Technique-With-Page-Level-UAF-Thursday.pdf page 33+
> https://arxiv.org/html/2401.17618v2#S4
> https://phrack.org/issues/71/13
>
> The arxiv paper includes this sentence: "To create a page-level UAF, the
> key is to cause a UAF of the struct page objects." However, we do not
> see them actually do that, and this statement is not found in the slides
> nor in the Phrack article. Confused.

Thank you for weighing in! I will post an updated version of this patch
with a reference to this discussion and try to summarise the above.

> Thank you for CC'ing me and the kernel-hardening list. However, please
> do not CC the oss-security list like that, where it's against content
> guidelines. Only properly focused new postings/threads are acceptable
> there (not CC'ing from/to other lists where only part of the content is
> on-topic, and follow-ups might not be on-topic at all). See:
>
> https://oss-security.openwall.org/wiki/mailing-lists/oss-security#list-content-guidelines
>
> As a moderator for oss-security, I'm going to remove these messages from
> the queue now. Please drop Cc: oss-security from any further replies.
>
> If desired, we may bring these topics to oss-security separately, with a
> proper Subject line and clear description of what we're talking about.

Noted, thanks!
-Toke
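P.S. For anyone following the thread without the patch in front of them,
below is a small standalone C program that models the bit layout being
discussed. The poison values mirror include/linux/poison.h as configured
for x86-64 (ILLEGAL_POINTER_VALUE of 0xdead000000000000, and PP_SIGNATURE
defined as 0x40 + POISON_POINTER_DELTA); the index shift, width and mask
names below are illustrative placeholders, not the exact defines from the
patch, which computes them at compile time. It only demonstrates the idea:
an ID stashed above the signature and poison bits keeps the
page_pool_page_is_pp() check true while the stored value never becomes a
dereferenceable pointer, and the ID can be recovered again on unmap.

#include <stdio.h>

/* Values as on x86-64 (see include/linux/poison.h); other arches differ. */
#define POISON_POINTER_DELTA	0xdead000000000000UL
#define PP_SIGNATURE		(0x40UL + POISON_POINTER_DELTA)

/* Illustrative index placement: bits 16..47, overlapping neither the
 * signature's low bits nor the poison bits in the top 16 bits. */
#define PP_DMA_INDEX_SHIFT	16
#define PP_DMA_INDEX_BITS	32
#define PP_DMA_INDEX_MASK	(((1UL << PP_DMA_INDEX_BITS) - 1) << PP_DMA_INDEX_SHIFT)
#define PP_MAGIC_MASK		(~PP_DMA_INDEX_MASK)

int main(void)
{
	unsigned long id = 0x12345678UL;	/* ID as returned by xa_alloc() */
	unsigned long pp_magic = PP_SIGNATURE | (id << PP_DMA_INDEX_SHIFT);

	/* The ownership check discussed above still matches... */
	printf("is_pp:    %d\n", (pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE);
	/* ...the stored value stays in the illegal 0xdead... range... */
	printf("pp_magic: 0x%lx\n", pp_magic);
	/* ...and the ID can be recovered without walking the xarray. */
	printf("id back:  0x%lx\n", (pp_magic & PP_DMA_INDEX_MASK) >> PP_DMA_INDEX_SHIFT);
	return 0;
}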