On Thu, Feb 18, 2021 at 10:15 AM Steven Price <steven.price@xxxxxxx> wrote: > > On 18/02/2021 15:45, Alyssa Rosenzweig wrote: > >> Yeah plus Cc: stable for backporting and I think an igt or similar for > >> panfrost to check this works correctly would be pretty good too. Since > >> if it took us over 1 year to notice this bug it's pretty clear that > >> normal testing doesn't catch this. So very likely we'll break this > >> again. > > > > Unfortunately there are a lot of kernel bugs which are noticed during actual > > use (but not CI runs), some of which have never been fixed. I do know > > the shrinker impl is buggy for us, if this is the fix I'm very happy. > > I doubt this will actually "fix" anything - if I understand correctly > then the sequence which is broken is: > > * allocate BO, mmap to CPU > * madvise(DONTNEED) > * trigger purge > * try to access the BO memory > > which is an invalid sequence for user space - the attempt to access > memory should cause a SIGSEGV. However because drm_vma_node_unmap() is > unable to find the mappings there may still be page table entries > present which would provide access to memory the kernel has freed. Which > is of course a big security hole and so this fix is needed. > > In what way do you find the shrinker impl buggy? I'm aware there's some > dodgy locking (although I haven't worked out how to fix it) - but AFAICT > it's more deadlock territory rather than lacking in locks. Are there > correctness issues? What's there was largely a result of getting lockdep happy. > >> btw for testing shrinkers recommended way is to have a debugfs file > >> that just force-shrinks everything. That way you avoid all the trouble > >> that tend to happen when you drive a system close to OOM on linux, and > >> it's also much faster. > > > > 2nding this as a good idea. > > > > Sounds like a good idea to me too. But equally I'm wondering whether the > best (short term) solution is to actually disable the shrinker. I'm > somewhat surprised that nobody has got fed up with the "Purging xxx > bytes" message spam - which makes me think that most people are not > hitting memory pressure to trigger the shrinker. If the shrinker is dodgy, then it's probably good to have the messages to know if it ran. > The shrinker on kbase caused a lot of grief - and the only way I managed > to get that under control was by writing a static analysis tool for the > locking, and by upsetting people by enforcing the (rather dumb) rules of > the tool on the code base. I've been meaning to look at whether sparse > can do a similar check of locks. Lockdep doesn't cover it? Rob _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel