Re: [PATCH RESEND 1/2] drm/i915: make device info bitfield flags bools

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 16 May 2016, Dave Gordon <david.s.gordon@xxxxxxxxx> wrote:
> On 13/05/16 16:05, Tvrtko Ursulin wrote:
>>
>> On 13/05/16 15:47, Jani Nikula wrote:
>>> On Fri, 13 May 2016, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote:
>>>> On Fri, May 13, 2016 at 03:25:05PM +0100, Tvrtko Ursulin wrote:
>>>>>
>>>>> On 13/05/16 15:04, Jani Nikula wrote:
>>>>>> This is more robust for assignments and comparisons.
>>>>>>
>>>>>> Signed-off-by: Jani Nikula <jani.nikula@xxxxxxxxx>
>>>>>> ---
>>>>>>   drivers/gpu/drm/i915/i915_drv.h | 2 +-
>>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>>>>>> b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> index d9d07b70f05c..bb0b6f64000e 100644
>>>>>> --- a/drivers/gpu/drm/i915/i915_drv.h
>>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>>>>>> @@ -752,7 +752,7 @@ struct intel_csr {
>>>>>>       func(has_ddi) sep \
>>>>>>       func(has_fpga_dbg)
>>>>>>
>>>>>> -#define DEFINE_FLAG(name) u8 name:1
>>>>>> +#define DEFINE_FLAG(name) bool name:1
>>>>>>   #define SEP_SEMICOLON ;
>>>>>>
>>>>>>   struct intel_device_info {
>>>>>>
>>>>>
>>>>> The churn virus spreads? :)
>>>>>
>>>>> I tried that but it was negatively impacting the compiler. For some
>>>>> reason it increases .text by 2.5k here. Don't see anything obvious,
>>>>> would have to look at the code more closely to spot exactly why.
>>>>
>>>> Oh, that's not fun. bool:1 holds such promise for a clear explanation of
>>>> the most common form of bitfield.
>>>
>>> Really a bummer, especially since assigning any positive even number to
>>> unsigned int foo:1 will result in 0.
>>
>> That is a pretty strong argument to go for this rather than make sure
>> all places which set them are correct.
>>
>> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
>>
>> Regards,
>> Tvrtko
>
> Unfortunately, the compiler doesn't generate such good code when using 
> 'bool' vs. 'u8' for these bitfields.


I've pushed patch 2/2 while the jury seems to be still out for 1/2.

Generally we might prefer bools anyway, but this struct is const in
dev_priv with the idea that this is immutable data, and we override the
const briefly on driver load in intel_device_info_runtime_init().

Checking all usage is not a huge effort, although we might still screw
this up later on...

BR,
Jani.


>
> Firstly, it looks like the compiler DOESN'T combine bool-bitfield tests 
> such as "if (IS_A() || IS_B())", whereas it does if they're u8. Example:
>
> With bitfields as 'u8'
>
> 	if (IS_HASWELL(dev) || IS_BROADWELL(dev))
> 48 8b 57 28          	mov    0x28(%rdi),%rdx	# dev->dev_priv
> 0f b6 52 30          	movzbl 0x30(%rdx),%edx	# dev_priv->flag byte
> f6 c2 60             	test   $0x60,%dl	# (haswell|broadwell)
> 0f 85 bb 00 00 00    	jne    12e4		# branch to if-true
>
> 	else if (IS_VALLEYVIEW(dev) || IS_CHERRYVIEW(dev))
> 83 e2 18             	and    $0x18,%edx	# test two more flags
> 0f 84 1c 01 00 00    	je     134e		# skip if neither
>                          ... code for VLV/CHV ...
> (code for HSW/BDW is out-of-line, at the end of the function)
>
> With bitfields as bools:
>
> 48 8b 57 28          	mov    0x28(%rdi),%rdx	# dev->dev_priv
> 0f b6 52 30          	movzbl 0x30(%rdx),%edx	# dev_priv->flag byte
> f6 c2 20             	test   $0x20,%dl	# test one flag
> 75 09                	jne    1241		# jump if true
> f6 c2 40             	test   $0x40,%dl	# test other flag
> 0f 84 b7 00 00 00    	je     12f8		# skip if not
>                          ... code for HSW/BDW ...
> (code for VLV/CHV/other is later)
>
> So that would suggest that the compiler generates more efficient code 
> for 'u8' (at least it didn't reload the flag byte even in the 'bool' 
> case). Here's another case:
>
> With bitfields as bools:
>
> 	dev_priv->psr_mmio_base = IS_HASWELL(dev_priv) ?
> 		HSW_EDP_PSR_BASE : BDW_EDP_PSR_BASE;
> 0f b6 43 30          	movzbl 0x30(%rbx),%eax	# flags byte
> c0 e8 05             	shr    $0x5,%al		# haswell->bit 0
> 83 e0 01             	and    $0x1,%eax	# leaves bit 0 only
> 3c 01                	cmp    $0x1,%al		# compare to 1u
> 19 c0                	sbb    %eax,%eax	# convert to 0/-1
> 25 00 b0 00 00       	and    $0xb000,%eax	# convert to 0/b000
> 05 00 48 06 00       	add    $0x64800,%eax	# final result
> 89 83 e8 20 00 00    	mov    %eax,0x20e8(%rbx)# store it
>
> This is ingenious code, avoiding any branching. Now with 'u8':
>
> 0f b6 43 30          	movzbl 0x30(%rbx),%eax
> 83 e0 20             	and    $0x20,%eax	# leaves bit 5 only
> 3c 01                	cmp    $0x1,%al		# compare to 1u
> 19 c0                	sbb    %eax,%eax	# convert to 0/-1
> 25 00 b0 00 00       	and    $0xb000,%eax	# etc
> 05 00 48 06 00       	add    $0x64800,%eax
> 89 83 e8 20 00 00    	mov    %eax,0x20e8(%rbx)
>
> Again it's shorter, because the compiler has been able to extract a 
> truth-value without shifting the "haswell" bit down to bit 0 first.
>
> .Dave.
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Jani Nikula, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/intel-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux