Am Mittwoch, den 16.05.2018, 15:10 +0200 schrieb Christian König: > Am 16.05.2018 um 15:00 schrieb Lucas Stach: > > Am Mittwoch, den 16.05.2018, 14:32 +0200 schrieb Christian König: > > > Am 16.05.2018 um 14:28 schrieb Lucas Stach: > > > > Am Mittwoch, den 16.05.2018, 14:08 +0200 schrieb Christian König: > > > > > Yes, exactly. > > > > > > > > > > For normal user space command submission we should have tons of > > > > > locks > > > > > guaranteeing that (e.g. just the VM lock should do). > > > > > > > > > > For kernel moves we have the mutex for the GTT windows which > > > > > protects > > > > > it. > > > > > > > > > > The could be problems with the UVD/VCE queues to cleanup the > > > > > handles > > > > > when an application crashes. > > > > > > > > FWIW, etnaviv is currently completely unlocked in this path, but I > > > > believe that this isn't an issue as the sched fence seq numbers are > > > > per > > > > entity. So to actually end up with reversed seqnos one context has > > > > to > > > > preempt itself to do another submit, while the current one hasn't > > > > returned from kernel space, which I believe is a fairly theoretical > > > > issue. Is my understanding correct? > > > > > > Yes. The problem is with the right timing this can be used to access > > > freed up memory. > > > > > > If you then manage to place a page table in that freed up memory > > > taking > > > over the system is just a typing exercise. > > > > Thanks. I believe we don't have this problem in etnaviv, as memory > > referencing is tied to the job and will only be unreferenced on > > free_job, but I'll re-check this when I've got some time. > > Well that depends on how you use the sequence numbers. > > If you use them for job completion check somewhere then you certainly > have a problem if job A gets the 1 and B the 2, but B completes before A. We don't do this anymore. All the etnaviv internals are driven by the fence signal callbacks. > At bare minimum that's still a bug we need to fix because it confuses > functions like dma_fence_is_later() and dma_fence_later(). Agreed. Also amending the documentation to state that drm_sched_job_init() and drm_sched_entity_push_job() need to be called under a common lock seems like a good idea. Regards, Lucas