Lucas Stach <l.stach@xxxxxxxxxxxxxx> writes:

> On Tuesday, 03.07.2018 at 10:05 -0700, Eric Anholt wrote:
>> GTF-GLES2.gtf.GL.acos.acos_float_vert_xvary submits jobs that take 4
>> seconds at maximum resolution, but we still want to reset quickly if a
>> job is really hung. Sample the CL's current address and the return
>> address (since we call into tile lists repeatedly) and if either has
>> changed then assume we've made progress.
>
> So this means you are doubling your timeout? AFAICS for the first time
> you hit the timeout handler the cached ctca and ctra values will
> probably always differ from the current values. Maybe this warrants a
> mention in the commit message, as it's changing the behavior of the
> scheduler timeout.

I suppose that doubles the minimum timeout, but I don't think there's
any principled choice behind that value.

> Also how easy is it for userspace to construct such an infinite loop in
> the CL? Thinking about a rogue client DoSing the GPU while exploiting
> this check in the timeout handler to stay under the radar...

You'd need to have a big enough CL that you don't sample the same
location twice in a row, but otherwise it's trivial and equivalent to a
V3D33 igt case I wrote. I don't think we as the kernel particularly
care to protect from that case, though -- it's mainly "does a broken
WebGL shader take down your desktop?" which we will still be protecting
from. If you wanted to protect from a general userspace attacker, you
could have a maximum 1 minute timeout or something, but I'm not sure
your life is actually much better when you let an arbitrary number of
clients submit many jobs to round-robin through, each of which has a
long timeout like that.
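
For reference, the progress check discussed above can be sketched roughly
as follows. This is a standalone illustration, not the driver code: the
struct, the enum, the function name, and the max_extensions cap are all
hypothetical, and the cap only models the "maximum timeout" idea floated
above, not something the patch is stated to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-job state; names are illustrative, not the driver's. */
struct job_progress {
	uint32_t last_ctca;   /* CL current address sampled at the last timeout */
	uint32_t last_ctra;   /* CL return address sampled at the last timeout */
	unsigned extensions;  /* number of times the timeout has been extended */
};

enum timeout_action { TIMEOUT_EXTEND, TIMEOUT_RESET };

/*
 * Called each time the scheduler timeout fires. If either sampled address
 * moved since the previous sample, assume the job made progress and extend
 * the timeout, unless the (assumed) extension cap has been reached.
 */
static enum timeout_action
check_progress(struct job_progress *p, uint32_t ctca, uint32_t ctra,
	       unsigned max_extensions)
{
	bool moved;

	assert(p);
	moved = (ctca != p->last_ctca) || (ctra != p->last_ctra);

	if (moved && p->extensions < max_extensions) {
		/* Remember the new sample for the next timeout. */
		p->last_ctca = ctca;
		p->last_ctra = ctra;
		p->extensions++;
		return TIMEOUT_EXTEND;
	}

	/* No movement, or cap reached: treat the job as hung. */
	return TIMEOUT_RESET;
}
```

With zero-initialised cached values, the first timeout on a running job
will almost always see "movement" and extend, which is the doubling of the
minimum timeout Lucas points out. A CL looping over a large enough range
keeps extending until the cap, if one is used at all.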
_______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/dri-devel