On Mon, 16 May 2016, Dave Gordon <david.s.gordon@xxxxxxxxx> wrote: > On 13/05/16 16:05, Tvrtko Ursulin wrote: >> >> On 13/05/16 15:47, Jani Nikula wrote: >>> On Fri, 13 May 2016, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote: >>>> On Fri, May 13, 2016 at 03:25:05PM +0100, Tvrtko Ursulin wrote: >>>>> >>>>> On 13/05/16 15:04, Jani Nikula wrote: >>>>>> This is more robust for assignments and comparisons. >>>>>> >>>>>> Signed-off-by: Jani Nikula <jani.nikula@xxxxxxxxx> >>>>>> --- >>>>>> drivers/gpu/drm/i915/i915_drv.h | 2 +- >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h >>>>>> b/drivers/gpu/drm/i915/i915_drv.h >>>>>> index d9d07b70f05c..bb0b6f64000e 100644 >>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h >>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h >>>>>> @@ -752,7 +752,7 @@ struct intel_csr { >>>>>> func(has_ddi) sep \ >>>>>> func(has_fpga_dbg) >>>>>> >>>>>> -#define DEFINE_FLAG(name) u8 name:1 >>>>>> +#define DEFINE_FLAG(name) bool name:1 >>>>>> #define SEP_SEMICOLON ; >>>>>> >>>>>> struct intel_device_info { >>>>>> >>>>> >>>>> The churn virus spreads? :) >>>>> >>>>> I tried that but it was negatively impacting the compiler. For some >>>>> reason it increases .text by 2.5k here. Don't see anything obvious, >>>>> would have to look at the code more closely to spot exactly why. >>>> >>>> Oh, that's not fun. bool:1 holds such promise for a clear explanation of >>>> the most common form of bitfield. >>> >>> Really a bummer, especially since assigning any positive even number to >>> unsigned int foo:1 will result in 0. >> >> That is a pretty strong argument to go for this rather than make sure >> all places which set them are correct. >> >> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> >> >> Regards, >> Tvrtko > > Unfortunately, the compiler doesn't generate such good code when using > 'bool' vs. 'u8' for these bitfields. I've pushed patch 2/2 while the jury seems to be still out for 1/2. Generally we might prefer bools anyway, but this struct is const in dev_priv with the idea that this is immutable data, and we override the const briefly on driver load in intel_device_info_runtime_init(). Checking all usage is not a huge effort, although we might still screw this up later on... BR, Jani. > > Firstly, it looks like the compiler DOESN'T combine bool-bitfield tests > such as "if (IS_A() || IS_B())", whereas it does if they're u8. Example: > > With bitfields as 'u8' > > if (IS_HASWELL(dev) || IS_BROADWELL(dev)) > 48 8b 57 28 mov 0x28(%rdi),%rdx # dev->dev_priv > 0f b6 52 30 movzbl 0x30(%rdx),%edx # dev_priv->flag byte > f6 c2 60 test $0x60,%dl # (haswell|broadwell) > 0f 85 bb 00 00 00 jne 12e4 # branch to if-true > > else if (IS_VALLEYVIEW(dev) || IS_CHERRYVIEW(dev)) > 83 e2 18 and $0x18,%edx # test two more flags > 0f 84 1c 01 00 00 je 134e # skip if neither > ... code for VLV/CHV ... > (code for HSW/BDW is out-of-line, at the end of the function) > > With bitfields as bools: > > 48 8b 57 28 mov 0x28(%rdi),%rdx # dev->dev_priv > 0f b6 52 30 movzbl 0x30(%rdx),%edx # dev_priv->flag byte > f6 c2 20 test $0x20,%dl # test one flag > 75 09 jne 1241 # jump if true > f6 c2 40 test $0x40,%dl # test other flag > 0f 84 b7 00 00 00 je 12f8 # skip if not > ... code for HSW/BDW ... > (code for VLV/CHV/other is later) > > So that would suggest that the compiler generates more efficient code > for 'u8' (at least it didn't reload the flag byte even in the 'bool' > case). Here's another case: > > With bitfields as bools: > > dev_priv->psr_mmio_base = IS_HASWELL(dev_priv) ? > HSW_EDP_PSR_BASE : BDW_EDP_PSR_BASE; > 0f b6 43 30 movzbl 0x30(%rbx),%eax # flags byte > c0 e8 05 shr $0x5,%al # haswell->bit 0 > 83 e0 01 and $0x1,%eax # leaves bit 0 only > 3c 01 cmp $0x1,%al # compare to 1u > 19 c0 sbb %eax,%eax # convert to 0/-1 > 25 00 b0 00 00 and $0xb000,%eax # convert to 0/b000 > 05 00 48 06 00 add $0x64800,%eax # final result > 89 83 e8 20 00 00 mov %eax,0x20e8(%rbx)# store it > > This is ingenious code, avoiding any branching. Now with 'u8': > > 0f b6 43 30 movzbl 0x30(%rbx),%eax > 83 e0 20 and $0x20,%eax # leaves bit 5 only > 3c 01 cmp $0x1,%al # compare to 1u > 19 c0 sbb %eax,%eax # convert to 0/-1 > 25 00 b0 00 00 and $0xb000,%eax # etc > 05 00 48 06 00 add $0x64800,%eax > 89 83 e8 20 00 00 mov %eax,0x20e8(%rbx) > > Again it's shorter, because the compiler has been able to extract a > truth-value without shifting the "haswell" bit down to bit 0 first. > > .Dave. > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Jani Nikula, Intel Open Source Technology Center _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx