Using a u32 is enough to store the result, but considering that the
result will be casted to u64 soon after. We use a u64 type directly.
So there no need to cast it to signed type and cast back then.
....
- int frame_size = mode->crtc_htotal * mode->crtc_vtotal;
+ u64 frame_size = mode->crtc_htotal * mode->crtc_vtotal;
...
- framedur_ns = div_u64((u64) frame_size * 1000000, dotclock);
+ framedur_ns = div_u64(frame_size * 1000000, dotclock);
The (u64) cast is there to extend the value to 64bits, not
because the original type is signed.
Sorry about my expression, I think my sentence did not mention anything
about 'because the original type is signed'.
In the contrary, my patch eliminated the concerns to the reviewer. It
say that the results of the multiply can't be negative.
My intent is to tell the compiler we want a unsigned return type, but
GCC emit 'imul' instruction for the multiply......
I'm using u64 as the return type, because div_u64() function accept a
u64 type value as its first argument.
The compiler will detect that the old code is a 32x32 multiply
where a 64bit result is needed, that may not be true for the
changed code (it would need to track back as far as the u16s).
I don't believe my code could be wrong.
when you use the word 'may', you are saying that it could be wrong after
apply my patch.
Then you have to find at least one test example to prove you point, in
which case my codes generate wrong results.
Again I don't believe you could find one.
It is not uncommon to force a 64bit result from a multiply
by making the constant 64bit. As in:
div_u64(frame_size * 1000000ULL, dotclock);
In fact, After apply this patch, the ASM code generated is same with before.
This may because the GCC is smart enough to generate optimized code in
either case,
I think It could be different with a different optimization-level.
I have tested this patch on three different architecture, I can not
find error still.
Below is the assembly extract on x86-64: because GCC generate the same
code in either case,
so I pasted only one copy here.
0000000000000530 <drm_calc_timestamping_constants>:
530: f3 0f 1e fa endbr64
534: e8 00 00 00 00 callq 539
<drm_calc_timestamping_constants+0x9>
539: 55 push %rbp
53a: 48 89 e5 mov %rsp,%rbp
53d: 41 57 push %r15
53f: 41 56 push %r14
541: 41 55 push %r13
543: 41 54 push %r12
545: 53 push %rbx
546: 48 83 ec 18 sub $0x18,%rsp
54a: 4c 8b 3f mov (%rdi),%r15
54d: 41 8b 87 6c 01 00 00 mov 0x16c(%r15),%eax
554: 85 c0 test %eax,%eax
556: 0f 84 ec 00 00 00 je 648
<drm_calc_timestamping_constants+0x118>
55c: 44 8b 87 90 00 00 00 mov 0x90(%rdi),%r8d
563: 49 89 fc mov %rdi,%r12
566: 44 39 c0 cmp %r8d,%eax
569: 0f 86 40 01 00 00 jbe 6af
<drm_calc_timestamping_constants+0x17f>
56f: 44 8b 76 1c mov 0x1c(%rsi),%r14d
573: 49 8b 8f 40 01 00 00 mov 0x140(%r15),%rcx
57a: 48 89 f3 mov %rsi,%rbx
57d: 45 85 f6 test %r14d,%r14d
580: 0f 8e d5 00 00 00 jle 65b
<drm_calc_timestamping_constants+0x12b>
586: 0f b7 43 2a movzwl 0x2a(%rbx),%eax
58a: 49 63 f6 movslq %r14d,%rsi
58d: 31 d2 xor %edx,%edx
58f: 48 89 c7 mov %rax,%rdi
592: 48 69 c0 40 42 0f 00 imul $0xf4240,%rax,%rax
599: 48 f7 f6 div %rsi
59c: 31 d2 xor %edx,%edx
59e: 48 89 45 d0 mov %rax,-0x30(%rbp)
5a2: 0f b7 43 38 movzwl 0x38(%rbx),%eax
5a6: 0f af c7 imul %edi,%eax
5a9: 48 98 cltq
5ab: 48 69 c0 40 42 0f 00 imul $0xf4240,%rax,%rax
5b2: 48 f7 f6 div %rsi
5b5: 41 89 c5 mov %eax,%r13d
5b8: f6 43 18 10 testb $0x10,0x18(%rbx)
5bc: 74 0a je 5c8
<drm_calc_timestamping_constants+0x98>
5be: 41 c1 ed 1f shr $0x1f,%r13d
5c2: 41 01 c5 add %eax,%r13d
5c5: 41 d1 fd sar %r13d
5c8: 4b 8d 04 c0 lea (%r8,%r8,8),%rax
5cc: 48 89 de mov %rbx,%rsi
5cf: 49 8d 3c 40 lea (%r8,%rax,2),%rdi
5d3: 8b 45 d0 mov -0x30(%rbp),%eax
5d6: 48 c1 e7 04 shl $0x4,%rdi
5da: 48 01 cf add %rcx,%rdi
5dd: 89 47 78 mov %eax,0x78(%rdi)
5e0: 48 83 ef 80 sub $0xffffffffffffff80,%rdi
5e4: 44 89 6f f4 mov %r13d,-0xc(%rdi)
5e8: e8 00 00 00 00 callq 5ed
<drm_calc_timestamping_constants+0xbd>
5ed: 0f b7 53 2e movzwl 0x2e(%rbx),%edx
5f1: 0f b7 43 38 movzwl 0x38(%rbx),%eax
5f5: 44 0f b7 4b 2a movzwl 0x2a(%rbx),%r9d
5fa: 45 8b 44 24 60 mov 0x60(%r12),%r8d
5ff: 4d 85 ff test %r15,%r15
602: 0f 84 87 00 00 00 je 68f
<drm_calc_timestamping_constants+0x15f>
608: 49 8b 77 08 mov 0x8(%r15),%rsi
60c: 52 push %rdx
60d: 31 ff xor %edi,%edi
60f: 48 c7 c1 00 00 00 00 mov $0x0,%rcx
616: 50 push %rax
617: 31 d2 xor %edx,%edx
619: e8 00 00 00 00 callq 61e
<drm_calc_timestamping_constants+0xee>
61e: 45 8b 44 24 60 mov 0x60(%r12),%r8d
623: 4d 8b 7f 08 mov 0x8(%r15),%r15
627: 5f pop %rdi
628: 41 59 pop %r9
62a: 8b 45 d0 mov -0x30(%rbp),%eax
62d: 48 c7 c1 00 00 00 00 mov $0x0,%rcx
634: 4c 89 fe mov %r15,%rsi
637: 45 89 f1 mov %r14d,%r9d
63a: 31 d2 xor %edx,%edx
63c: 31 ff xor %edi,%edi
63e: 50 push %rax
63f: 41 55 push %r13
641: e8 00 00 00 00 callq 646
<drm_calc_timestamping_constants+0x116>
646: 59 pop %rcx
647: 5e pop %rsi
648: 48 8d 65 d8 lea -0x28(%rbp),%rsp
64c: 5b pop %rbx
64d: 41 5c pop %r12
64f: 41 5d pop %r13
651: 41 5e pop %r14
653: 41 5f pop %r15
655: 5d pop %rbp
656: e9 00 00 00 00 jmpq 65b
<drm_calc_timestamping_constants+0x12b>
65b: 41 8b 54 24 60 mov 0x60(%r12),%edx
660: 49 8b 7f 08 mov 0x8(%r15),%rdi
664: 44 89 45 c4 mov %r8d,-0x3c(%rbp)
668: 45 31 ed xor %r13d,%r13d
66b: 48 c7 c6 00 00 00 00 mov $0x0,%rsi
672: 48 89 4d c8 mov %rcx,-0x38(%rbp)
676: e8 00 00 00 00 callq 67b
<drm_calc_timestamping_constants+0x14b>
67b: c7 45 d0 00 00 00 00 movl $0x0,-0x30(%rbp)
682: 44 8b 45 c4 mov -0x3c(%rbp),%r8d
686: 48 8b 4d c8 mov -0x38(%rbp),%rcx
68a: e9 39 ff ff ff jmpq 5c8
<drm_calc_timestamping_constants+0x98>
68f: 52 push %rdx
690: 48 c7 c1 00 00 00 00 mov $0x0,%rcx
697: 31 d2 xor %edx,%edx
699: 31 f6 xor %esi,%esi
69b: 50 push %rax
69c: 31 ff xor %edi,%edi
69e: e8 00 00 00 00 callq 6a3
<drm_calc_timestamping_constants+0x173>
6a3: 45 8b 44 24 60 mov 0x60(%r12),%r8d
6a8: 58 pop %rax
6a9: 5a pop %rdx
6aa: e9 7b ff ff ff jmpq 62a
<drm_calc_timestamping_constants+0xfa>
6af: 49 8b 7f 08 mov 0x8(%r15),%rdi
6b3: 4c 8b 67 50 mov 0x50(%rdi),%r12
6b7: 4d 85 e4 test %r12,%r12
6ba: 74 25 je 6e1
<drm_calc_timestamping_constants+0x1b1>
6bc: e8 00 00 00 00 callq 6c1
<drm_calc_timestamping_constants+0x191>
6c1: 48 c7 c1 00 00 00 00 mov $0x0,%rcx
6c8: 4c 89 e2 mov %r12,%rdx
6cb: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
6d2: 48 89 c6 mov %rax,%rsi
6d5: e8 00 00 00 00 callq 6da
<drm_calc_timestamping_constants+0x1aa>
6da: 0f 0b ud2
6dc: e9 67 ff ff ff jmpq 648
<drm_calc_timestamping_constants+0x118>
6e1: 4c 8b 27 mov (%rdi),%r12
6e4: eb d6 jmp 6bc
<drm_calc_timestamping_constants+0x18c>
6e6: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
6ed: 00 00 00
6f0: 90 nop
6f1: 90 nop
6f2: 90 nop
6f3: 90 nop
6f4: 90 nop
6f5: 90 nop
6f6: 90 nop
6f7: 90 nop
6f8: 90 nop
6f9: 90 nop
6fa: 90 nop
6fb: 90 nop
6fc: 90 nop
6fd: 90 nop
6fe: 90 nop
6ff: 90 nop
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)