On Mon, May 06, 2013 at 11:40:06AM +0200, Daniel Vetter wrote: > On Thu, May 02, 2013 at 01:27:32PM -0700, Jesse Barnes wrote: > > On Tue, 23 Apr 2013 23:15:29 -0700 > > Ben Widawsky <ben at bwidawsk.net> wrote: > > > > > Because our context refcounting doesn't grab a ref at lookup time, it is > > > unsafe to do so without the lock. > > > > > > NOTE: We don't have an easy way to put the assertion in the lookup > > > function which is where this really belongs. Context switching is good > > > enough because it actually asserts even more correctness by protecting > > > the default_context. > > > > > > Signed-off-by: Ben Widawsky <ben at bwidawsk.net> > > > --- > > > drivers/gpu/drm/i915/i915_gem_context.c | 2 ++ > > > 1 file changed, 2 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c > > > index a1e8ecb..411ace0 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_context.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_context.c > > > @@ -444,6 +444,8 @@ int i915_switch_context(struct intel_ring_buffer *ring, > > > if (dev_priv->hw_contexts_disabled) > > > return 0; > > > > > > + BUG_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex)); > > > + > > > if (ring != &dev_priv->ring[RCS]) > > > return 0; > > > > > > > Simple enough. > > > > Reviewed-by: Jesse Barnes <jbarnes at virtuousgeek.org> > > > > We usually do WARN_ONs for this stuff though, in case a user actually > > does hit it, it may not be fatal so why crash the machine? > > > > But that's a minor distinction since we shouldn't hit this except in > > development anyway. > > Well, since this is a patch for upstream the focus should very much be on > supporting bug reporters and not developers. And for bug reporters a BUG > is much more annoying than a WARN and greatly reduces the chances that > we'll get a bug report. Some more details why a WARN massively beats a BUG for us: BUG kills the current process and ensures all locks are stuck. Usually that means X is dead and you can't vt-switch away to the console to take a quick look at dmesg. Now even when all rendering is down the toilet due to the follow-up damage after the WARN and the gpu a zombie, there's a non-zero chance that vt-switch (or sw rendering in X) will work long enough to grab log files and debugfs data. Hence the first rule to only use a BUG on if we have a guaranteed OOPS otherwise (which again will kill the process and make all locks stuck). -Daniel > > There are imo only very few cases where a BUG instead of a WARN is > justified: > - The kernel is _guaranteed_ to oops in the next few lines anyway, so a > BUG_ON will help in readability of the backtrace. Note that checking 3 > different things for non-NULL in the same BUG actually reduces OOPS > readability (with an oops you can at least reconstruct the faulting > address and so probably the pointer). Also, this means the BUG should > have a neat description of what exactly blew up. > > The "a few lines" part is just a guideline with some big exceptions. A > prime example is refcount over/underflows since those will blow up, but > only sometimes later (and usually no one will have a clue why). > > - BUGs are justified if there's a potential security hole awaiting, e.g. > when something in the userspace input validation has gone wrong. > > - I'm wary of special error handling for WARNs, but if an early return > (with an error code if possible) transforms a BUG into a WARN I'm in. > But trying to fix up e.g. modeset state is usually futile, since we'll > end up with a black/fuzzy/corrupted screen most likely anyway. > > But the most important rule is: In case of doubt, just WARN, don't BUG. > > [I know, I violate it sometimes, too.] > > Patch applied with the s/BUG/WARN bikeshed. > -Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch