VDR-1.3.41: speedup for cVideoRepacker

jburgess at uklinux.net (Jon Burgess) · Wed Feb 1 01:32:40 2006

Reinhard Nissl wrote:
> I don't think that it is worth a try as it tests every byte while the 
> above code tests most of the time only every third byte.

I agree that your algorithm is clever and greatly cuts down the 
number of comparisons compared to the old code.

The glibc memchr() implementation does the comparisons 4 bytes at a time 
using a clever algorithm. It also has assembler-optimised variants for 
some CPUs. I don't think that only doing a comparison of every 3rd byte 
wins you anything over memchr().
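
As a rough illustration of the two approaches being compared, here is a minimal sketch. It assumes the pattern being searched for is the 3-byte MPEG start code prefix 00 00 01. The function names are hypothetical, and this is neither the code in VDR nor the attached search.c. The first function lets memchr() find each 0x01 byte and then checks the two bytes before it. The second looks at the byte where a match would end and skips ahead 3 bytes whenever that byte is neither 0x00 nor 0x01.

```c
#include <stddef.h>
#include <string.h>

/* Count occurrences of 00 00 01 by letting memchr() find each 0x01
 * and then checking the two bytes in front of it. */
size_t count_startcodes_memchr(const unsigned char *buf, size_t len)
{
    size_t count = 0;
    if (len < 3)
        return 0;
    const unsigned char *end = buf + len;
    const unsigned char *p = buf + 2;   /* earliest position a match can end */
    while (p < end) {
        p = memchr(p, 0x01, (size_t)(end - p));
        if (p == NULL)
            break;
        if (p[-1] == 0x00 && p[-2] == 0x00)
            count++;
        p++;
    }
    return count;
}

/* Count occurrences of 00 00 01 by inspecting the byte where a match
 * would end. If that byte is not 0x00, no match can end at the next
 * two positions either, so the scan jumps ahead by 3. */
size_t count_startcodes_skip(const unsigned char *buf, size_t len)
{
    size_t count = 0;
    size_t i = 2;                       /* index of the candidate third byte */
    while (i < len) {
        unsigned char c = buf[i];
        if (c > 0x01) {
            i += 3;                     /* cannot be part of 00 00 01 */
        } else if (c == 0x01) {
            if (buf[i - 1] == 0x00 && buf[i - 2] == 0x00)
                count++;
            i += 3;                     /* 0x01 cannot start a new prefix */
        } else {
            i += 1;                     /* 0x00: a match may end at i + 1 */
        }
    }
    return count;
}
```

Both functions should return the same count for any buffer. They differ only in how many bytes they inspect and in who does the inner loop.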

I believe the bulk of the time taken by the routine is transferring all 
the data from memory into the CPU. Every byte of the data will have to 
be read into the CPU caches due to cacheline effects. I believe that the 
asm optimisations will take into account the possibilities of 
speculative readahead etc. I've not looked into the assembler to see 
whether it actually exploits this.
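
The cacheline point can be made concrete with a small sketch. It assumes a 64-byte cache line purely for illustration (the real line size depends on the CPU), and the helper name is hypothetical. It counts how many distinct cache lines a scan touches when it reads one byte every `stride` bytes.

```c
#include <stddef.h>

/* Count the distinct cache lines touched when reading one byte at
 * offsets 0, stride, 2*stride, ... below len. Offsets only increase,
 * so comparing with the previous line index is enough. */
size_t cache_lines_touched(size_t len, size_t stride, size_t line_size)
{
    if (stride == 0 || line_size == 0)
        return 0;
    size_t count = 0;
    size_t last_line = (size_t)-1;      /* no line seen yet */
    for (size_t off = 0; off < len; off += stride) {
        size_t line = off / line_size;
        if (line != last_line) {
            count++;
            last_line = line;
        }
    }
    return count;
}
```

Under that assumption, a 4096-byte buffer spans 64 lines, and a scan with a stride of 3 still touches all 64 of them, the same as a scan of every byte. Only a stride larger than the line size would let whole lines be skipped.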

I've attached the quickly hacked-up test program that I wrote. The output 
is the time taken for many iterations of the two different algorithms. 
For me the difference is within the measurement noise. It certainly 
isn't any slower. I'd be interested to know whether it makes any 
difference on your EPIA, both in the test program and in VDR.

$ ./search /video0/%Click_Online/2005-04-10.04\:28.99.99.rec/001.vdr
Found 10585344 matches in 12.5873 seconds
Found 10585344 matches in 12.6235 seconds
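
For readers without the attachment, a timing harness of this general shape could look like the sketch below. It is not the attached search.c. The names are hypothetical, the scanner shown is a plain byte-by-byte reference rather than either algorithm under discussion, and clock() measures CPU time rather than wall-clock time.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

typedef size_t (*search_fn)(const unsigned char *buf, size_t len);

/* Reference scanner: tests every byte for the end of 00 00 01. */
size_t count_naive(const unsigned char *buf, size_t len)
{
    size_t count = 0;
    for (size_t i = 2; i < len; i++)
        if (buf[i] == 0x01 && buf[i - 1] == 0x00 && buf[i - 2] == 0x00)
            count++;
    return count;
}

/* Run fn over the buffer `iterations` times. Stores the total number
 * of matches in *matches and returns the CPU time used, in seconds. */
double time_search(search_fn fn, const unsigned char *buf, size_t len,
                   int iterations, size_t *matches)
{
    size_t total = 0;
    clock_t start = clock();
    for (int i = 0; i < iterations; i++)
        total += fn(buf, len);
    clock_t stop = clock();
    *matches = total;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}

/* Print one result line in the same style as the output above. */
void report_search(search_fn fn, const unsigned char *buf, size_t len,
                   int iterations)
{
    size_t matches = 0;
    double seconds = time_search(fn, buf, len, iterations, &matches);
    printf("Found %zu matches in %g seconds\n", matches, seconds);
}
```

Calling report_search() once per algorithm on the same buffer gives one line per algorithm. Equal match counts confirm that both scanners agree before the times are compared.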

	Jon

-------------- next part --------------
A non-text attachment was scrubbed...
Name: search.c
Type: text/x-csrc
Size: 2507 bytes
Desc: not available
Url : http://www.linuxtv.org/pipermail/vdr/attachments/20060201/22a6a9bb/search.c