Re: [PATCH] gettime: minimize integer division

On 2012-12-20 20:23, Sam Bradshaw wrote:
> 
>>> diff --git a/gettime.c b/gettime.c
>>> index 035d275..89f3e27 100644
>>> --- a/gettime.c
>>> +++ b/gettime.c
>>> @@ -168,17 +168,23 @@ void fio_gettime(struct timeval *tp, void fio_unused *caller)
>>>  		}
>>>  #ifdef ARCH_HAVE_CPU_CLOCK
>>>  	case CS_CPUCLOCK: {
>>> -		unsigned long long usecs, t;
>>> +		unsigned long long usecs, t, delta = 0;
>>>
>>>  		t = get_cpu_clock();
>>>  		if (tv && t < tv->last_cycles) {
>>>  			dprint(FD_TIME, "CPU clock going back in time\n");
>>>  			t = tv->last_cycles;
>>> -		} else if (tv)
>>> +		} else if (tv) {
>>> +			if (tv->last_tv_valid)
>>> +				delta = t - tv->last_cycles;
>>>  			tv->last_cycles = t;
>>> +		}
>>>
>>>  		usecs = t / cycles_per_usec;
>>> -		tp->tv_sec = usecs / 1000000;
>>> +		if (delta && delta < 1000000)
>>> +			tp->tv_sec = tv->last_tv.tv_sec;
>>> +		else
>>> +			tp->tv_sec = usecs / 1000000;
>>>  		tp->tv_usec = usecs % 1000000;
>>>  		break;
>>>  		}
>>
>> I was thinking about this... Is it actually guaranteed to work? If
>> tv->last_tv.tv_usec is e.g. 900,000, you'd only need a 100k usec diff to
>> need to wrap, not 1000k. And since this is about avoiding costly divs,
>> since we know the number of cycles last time, it might make more sense
>> to just do the single div to go from cycles to usecs, then add that to
>> the tv->last_tv.
>>
> 
> 
> 
> Something like this might work, though that amount of logic may
> be equivalent in terms of cycles to the divide.

So I took a look at it. The costly bit is the division by
cycles_per_usec, which the compiler has no option but to turn into a
divq, since the divisor is only known at run time. The modulo and
divide by 1M have a constant divisor, so they can be turned into
something more clever, basically shifts and imull.
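
To illustrate the distinction, here is a minimal sketch. The function
names are made up for this example and are not part of fio. With
optimization enabled, GCC and Clang on x86-64 typically emit a real
divide instruction for the first function and a multiply-and-shift
sequence for the second.

```c
#include <stdint.h>

/* Divisor is a run-time value: the compiler has to emit a hardware
 * divide here. */
uint64_t cycles_to_usecs(uint64_t cycles, uint64_t cycles_per_usec)
{
	return cycles / cycles_per_usec;
}

/* Divisor is a compile-time constant: the compiler can typically
 * replace the divide and modulo with a multiply by a precomputed
 * constant plus shifts. */
void split_usecs(uint64_t usecs, uint64_t *sec, uint64_t *usec)
{
	*sec = usecs / 1000000;
	*usec = usecs % 1000000;
}
```

Comparing the generated assembly (for example with gcc -O2 -S) is the
easiest way to check what a given compiler actually does.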

So how about the below? It replaces the divq with a multiplication by a
precomputed inverse and a division by the constant 10M, which should be
considerably less expensive. Can you test and see how that works for you?

diff --git a/gettime.c b/gettime.c
index 035d275..56703e1 100644
--- a/gettime.c
+++ b/gettime.c
@@ -15,6 +15,7 @@
 
 #ifdef ARCH_HAVE_CPU_CLOCK
 static unsigned long cycles_per_usec;
+static unsigned long inv_cycles_per_usec;
 int tsc_reliable = 0;
 #endif
 
@@ -177,7 +178,7 @@ void fio_gettime(struct timeval *tp, void fio_unused *caller)
 		} else if (tv)
 			tv->last_cycles = t;
 
-		usecs = t / cycles_per_usec;
+		usecs = (t * inv_cycles_per_usec) / 10000000UL;
 		tp->tv_sec = usecs / 1000000;
 		tp->tv_usec = usecs % 1000000;
 		break;
@@ -277,6 +278,8 @@ static void calibrate_cpu_clock(void)
 	dprint(FD_TIME, "mean=%f, S=%f\n", mean, S);
 
 	cycles_per_usec = avg;
+	inv_cycles_per_usec = 10000000UL / cycles_per_usec;
+	dprint(FD_TIME, "inv_cycles_per_usec=%lu\n", inv_cycles_per_usec);
 }
 #else
 static void calibrate_cpu_clock(void)
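
For anyone testing this, a standalone sketch of the same arithmetic
may help when comparing the scaled-inverse result against the plain
division. The helper names and the value 3000 for cycles_per_usec are
assumptions picked for illustration; only the 10M scale and the
formulas come from the patch above.

```c
#include <stdint.h>

#define INV_SCALE 10000000ULL	/* 10M, same scale as in the patch */

/* Done once at calibration time: one real division. */
uint64_t calc_inv(uint64_t cycles_per_usec)
{
	return INV_SCALE / cycles_per_usec;
}

/* Hot path: one multiply, then a divide by a compile-time constant.
 * Note that t * inv wraps once t exceeds UINT64_MAX / inv. */
uint64_t usecs_approx(uint64_t t, uint64_t inv)
{
	return (t * inv) / INV_SCALE;
}

/* Reference result using the original run-time division. */
uint64_t usecs_exact(uint64_t t, uint64_t cycles_per_usec)
{
	return t / cycles_per_usec;
}
```

With an assumed 3000 cycles per usec, calc_inv() truncates to 3333,
so one second's worth of cycles (3,000,000,000) converts to 999,900
usecs rather than 1,000,000. With 2000 cycles per usec the inverse is
exactly 5000 and the two results agree. How much the truncation
matters depends on the calibrated value, so it is worth checking
against real numbers.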

-- 
Jens Axboe
