> On Wed, 25 Jul 2018 14:37:58 +0800 "zhaowuyun@xxxxxxxxxxxx" <zhaowuyun@xxxxxxxxxxxx> wrote:
>
>> From: zhaowuyun <zhaowuyun@xxxxxxxxxxxx>
>>
>> The issue is that there are two processes A and B: A is kworker/u16:8
>> with normal priority, B is AudioTrack with RT priority, and they are
>> on the same CPU 3.
>>
>> Task A is preempted by task B in the moment
>> after __delete_from_swap_cache(page) and before swapcache_free(swap).
>>
>> Task B runs __read_swap_cache_async in the do {} while loop. It
>> will never find the page in swapper_space because the page was removed
>> by task A, and it will never succeed in swapcache_prepare because
>> the entry is EEXIST.
>>
>> Task B then gets stuck in the loop forever because it is an RT task
>> and no one can preempt it.
>>
>> So we need to disable preemption until swapcache_free has executed.
>
> Yes, right, sorry, I must have merged cbab0e4eec299 in my sleep.
> cond_resched() is a no-op in the presence of realtime policy threads
> and using it to attempt to yield to a different thread in this fashion
> is broken.
>
> Disabling preemption on the other side of the race should fix things,
> but it's using a bandaid to plug the leakage from the earlier bandaid.
> The proper way to coordinate threads is to use a sleeping lock, such
> as a mutex, or some other wait/wakeup mechanism.
>
> And once that's done, we can hopefully eliminate the do loop from
> __read_swap_cache_async(). That also services ENOMEM from
> radix_tree_insert(), but __add_to_swap_cache() appears to handle that
> OK and we shouldn't just loop around retrying the insert and the
> radix_tree_preload() should ensure that radix_tree_insert() never fails
> anyway. Unless we're calling __read_swap_cache_async() with screwy
> gfp_flags from somewhere.

You are right, it is a bandaid ...

Could you give a more specific suggestion on how to use a sleeping lock
or some other wait/wakeup mechanism to fix this issue?

Thanks very much! Our project really needs a fix for this issue ...
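To check my understanding of the wait/wakeup idea, here is a minimal
userspace sketch using pthreads. It is not kernel code and not a proposed
patch. The names struct swap_slot_model, finisher(), reader_claim() and
run_demo() are made up for illustration, and the two flags only loosely
model "page is in swapper_space" and "swap entry still has its cache
reference". The model starts inside the race window, that is, after the
page has been deleted from the cache but before the slot has been freed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct swap_slot_model {
    pthread_mutex_t lock;
    pthread_cond_t  freed;   /* signalled when slot_busy becomes false */
    bool in_cache;           /* loosely: page present in swapper_space */
    bool slot_busy;          /* loosely: swap entry still marked as cached */
};

/* Models task A finishing its work: the step like swapcache_free(swap). */
static void *finisher(void *arg)
{
    struct swap_slot_model *s = arg;
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

    nanosleep(&delay, NULL); /* task A is "preempted" for a while */

    pthread_mutex_lock(&s->lock);
    s->slot_busy = false;
    pthread_cond_broadcast(&s->freed); /* wake any waiting reader */
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/*
 * Models task B: the lookup misses and the slot is busy (the EEXIST case).
 * Instead of spinning, the reader sleeps until the slot is released, so
 * the other thread can run no matter what the reader's priority is.
 */
static int reader_claim(struct swap_slot_model *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->slot_busy)
        pthread_cond_wait(&s->freed, &s->lock);
    s->slot_busy = true;     /* reader now owns the slot */
    s->in_cache = true;      /* and inserts its own page */
    pthread_mutex_unlock(&s->lock);
    return 0;
}

/* Runs the model once; returns 0 if the reader ended up owning the slot. */
int run_demo(void)
{
    struct swap_slot_model s = { .in_cache = false, .slot_busy = true };
    pthread_t a;
    int rc;

    rc = pthread_mutex_init(&s.lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&s.freed, NULL);
    assert(rc == 0);

    if (pthread_create(&a, NULL, finisher, &s) != 0)
        return -1;
    rc = reader_claim(&s);
    pthread_join(a, NULL);

    bool ok = (rc == 0 && s.slot_busy && s.in_cache);
    pthread_cond_destroy(&s.freed);
    pthread_mutex_destroy(&s.lock);
    return ok ? 0 : -1;
}
```

Calling run_demo() from a small main() should return 0. The point of the
sketch is only that the reader blocks in pthread_cond_wait() rather than
looping, so it cannot starve the thread that has to release the slot. I
do not know what the matching kernel mechanism should be here, which is
why I am asking.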
-------------- zhaowuyun@xxxxxxxxxxxx