On 7/19/24 2:35 AM, Barry Song wrote:
> On Thu, Jul 18, 2024 at 8:50 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>>
>> On Thu 18-07-24 20:43:53, Barry Song wrote:
>> > On Thu, Jul 18, 2024 at 8:32 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>> > >
>> > > On Thu 18-07-24 20:18:02, Barry Song wrote:
>> > > > So the purpose is making sure the semantics - NOFAIL means no failure
>> > > > and we don't need to check ret. If we can't really succeed, it should throw
>> > > > a BUG to stop any potential exploits.
>> > >
>> > > This would require to panic consistently on failure in all allocator
>> > > path that can bail out. E.g. page allocator on GFP_NOWAIT|GFP_NOFAIL
>> > > req. not sure how many more.
>> >
>> > Right, this GFP_NOFAIL issue seems quite messy. However, at least vmalloc
>> > will retry by itself, even if alloc_pages might have failed with
>> > GFP_NOWAIT | GFP_NOFAIL.
>> >
>> > But isn't that the definition of __GFP_NOFAIL?
>> >
>> > * %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
>> > * cannot handle allocation failures. The allocation could block
>> > * indefinitely but will never return with failure. Testing for
>> > * failure is pointless."
>> >
>> > So I believe any code that doesn't retry and ends up returning NULL should be
>> > fixed.
>>
>> Yes, those shouldn't really fail. NOWAIT|NOFAIL was something that
>> should never happen and I really hope it doesn't. Others should really
>> retry but it's been some time since I've checked the last time.
>
> I assume allocations directly using alloc_pages() might not respect GFP_NOFAIL
> and violate the semantics of GFP_NOFAIL.
>
> static inline struct page *
> __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
>                        struct alloc_context *ac)
> {
>         /*
>          * Make sure that __GFP_NOFAIL request doesn't leak out and make sure
>          * we always retry
>          */
>         if (gfp_mask & __GFP_NOFAIL) {
>                 /*
>                  * All existing users of the __GFP_NOFAIL are blockable, so warn
>                  * of any new users that actually require GFP_NOWAIT
>                  */
>                 if (WARN_ON_ONCE_GFP(!can_direct_reclaim, gfp_mask))
>                         goto fail;
>                 ...
>         }
>
> Additionally, at least drivers/vdpa/vdpa_user/iova_domain.c is
> incorrect with GFP_ATOMIC | __GFP_NOFAIL.
>
> void vduse_domain_remove_user_bounce_pages(struct vduse_iova_domain *domain)
> {
>         ...
>
>         count = domain->bounce_size >> PAGE_SHIFT;
>         for (i = 0; i < count; i++) {
>                 ...
>
>                 /* Copy user page to kernel page if it's in use */
>                 if (map->orig_phys != INVALID_PHYS_ADDR) {
>                         page = alloc_page(GFP_ATOMIC | __GFP_NOFAIL);

This should be already triggering the warning above? If it doesn't,
nobody has yet reached that particular line in the alloc slowpath,
probably thanks to the GFP_ATOMIC reserves. Maybe we should tighten the
warnings then.

>                         memcpy_from_page(page_address(page),
>                                          map->bounce_page, 0, PAGE_SIZE);
>                 }
>                 put_page(map->bounce_page);
>                 map->bounce_page = page;
>         }
>         domain->user_bounce_pages = false;
> out:
>         write_unlock(&domain->bounce_lock);
> }
>
> GFP_NOFAIL things need to be fixed. Let me investigate further.
>
>>
>> These overflow checks were added without any acks by MM people...
>> --
>> Michal Hocko
>> SUSE Labs
>
> Thanks
> Barry