On Tue, Aug 11, 2020 at 12:26:49PM +0200, Michal Hocko wrote:
> On Tue 11-08-20 11:37:13, Uladzislau Rezki wrote:
> > On Tue, Aug 11, 2020 at 10:19:17AM +0200, Michal Hocko wrote:
> > > On Mon 10-08-20 21:25:26, Michal Hocko wrote:
> > > > On Mon 10-08-20 18:07:39, Uladzislau Rezki wrote:
> > > [...]
> > > > > The problem that I see is that we cannot use the page allocator from
> > > > > atomic contexts, which is our case:
> > > > >
> > > > > <snip>
> > > > > local_irq_save(flags) or preempt_disable() or raw_spinlock();
> > > > > __get_free_page(GFP_ATOMIC);
> > > > > <snip>
> > > > >
> > > > > So if we can convert the page allocator to a raw_* lock it will be
> > > > > appreciated, at least from our side, IMHO, not from the RT one. But as
> > > > > I stated above, we need to sort the raised questions out if the
> > > > > conversion is done.
> > > > >
> > > > > What is your view?
> > > >
> > > > To me it would make more sense to support atomic allocations also for
> > > > the RT tree. Having both GFP_NOWAIT and GFP_ATOMIC which do not really
> > > > work for atomic context in RT sounds subtle and wrong.
> > >
> > > I was thinking about this some more. I still think the above would be a
> > > reasonable goal we should try to achieve, if for no other reason than
> > > future maintainability (especially after the RT patchset is merged).
> > > I have tried to search for any known problems/attempts to make
> > > zone->lock raw but couldn't find anything. Maybe somebody more involved
> > > in the RT world has something to say about that.
> >
> > I tried yesterday to convert zone->lock. See below the files I had to modify:
> >
> > <snip>
> > modified: include/linux/mmzone.h
> > modified: mm/compaction.c
> > modified: mm/memory_hotplug.c
> > modified: mm/page_alloc.c
> > modified: mm/page_isolation.c
> > modified: mm/page_reporting.c
> > modified: mm/shuffle.c
> > modified: mm/vmscan.c
> > modified: mm/vmstat.c
> > <snip>
> >
> > There is one more lock, the zone->lru_lock one. Both zone->lock and this
> > one intersect with each other. If the lru_lock can be nested under
> > zone->lock it should be converted as well, but I need to analyze it
> > further. There are two wrapper functions which are used as a common
> > interface to lock/unlock both locks; see
> > compact_lock_irqsave()/compact_unlock_should_abort_lru() in
> > mm/compaction.c.
> >
> > Any thoughts here?
>
> I am not an expert on compaction. Vlastimil would know better. My
> thinking was that zone->lock is a tail lock but compaction/page
> isolation might be doing something I am not aware of right now.
>
> > Anyway, I tried to convert only zone->lock and use the page allocator
> > passing gfp_mask=0 as the argument. It works, and
> > CONFIG_PROVE_RAW_LOCK_NESTING does not complain about any "bad" lock
> > nesting.
> >
> > > Anyway, if the zone->lock is not a good fit for raw_spin_lock then the
> > > only way I can see forward is to detect real (RT) atomic contexts and
> > > bail out early before taking the lock in the allocator for NOWAIT/ATOMIC
> > > requests.
> > >
> > For the RT kernel we can detect it for sure. preemptible() works just
> > fine there, i.e. we can identify the context we are currently in.
>
> In the previous email I didn't mention why I prefer a full NOWAIT semantic
> over RT-specific bailouts. There are users making NOWAIT allocation
> attempts as an opportunistic allocation request which is OK to fail,
> as they have a fallback to go through. This would imply they would
> prefer to know this ASAP rather than get blocked and sleep.
> A lack of reports for PREEMPT_RT would suggest that nobody has noticed
> this, though.
>
I agree here and share your view on it. To me, making *_ATOMIC and *_NOWAIT
fully workable on both kernels sounds like the correct way to go. Indeed,
there are no complaints as of now, but later on there can be, and then the
question of what to do will be raised again.

--
Vlad Rezki
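
For reference, the early bail-out Michal describes could look roughly like
the sketch below. This is only an illustration of the idea, not code from
this thread: the helper name and its exact call site in mm/page_alloc.c are
assumptions; CONFIG_PREEMPT_RT, __GFP_DIRECT_RECLAIM and preemptible() are
the real kernel symbols it relies on.

<snip>
/*
 * Sketch only: decide whether a GFP_NOWAIT/GFP_ATOMIC request must fail
 * early on PREEMPT_RT instead of taking the (sleeping) zone->lock.
 * The helper name and placement in the allocator are made up.
 */
static inline bool rt_atomic_request_must_fail(gfp_t gfp_mask)
{
	/* Only relevant when spinlock_t (and thus zone->lock) can sleep. */
	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
		return false;

	/* Requests that may sleep can take zone->lock as usual. */
	if (gfp_mask & __GFP_DIRECT_RECLAIM)
		return false;

	/*
	 * A !preemptible() caller on RT (raw spinlock held, preemption or
	 * interrupts disabled) must not acquire a sleeping lock, so fail
	 * the allocation instead of blocking.
	 */
	return !preemptible();
}
<snip>

The allocator would then return NULL before taking zone->lock whenever this
returns true, which is the "fail fast instead of sleeping" behaviour that
NOWAIT users with a fallback would prefer anyway.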