[PATCH] UnbufferedFile improvements v7

Klaus.Schmidinger at cadsoft.de (Klaus Schmidinger) · Sat Feb 4 18:06:43 2006

Artur Skawina wrote:
> Klaus Schmidinger wrote:
> 
>> Artur Skawina wrote:
>>
>>> this time with a new approach to read caching. Should make watching 
>>> and editing recordings on a non-idle (and/or slow) machine more 
>>> comfortable.
>>>
>>> The difference to previous versions (and stock fadvise-enabled vdr) 
>>> is that previously, read data was almost immediately forgotten after 
>>> it was used; now a certain amount of recently accessed data (at most 
>>> ~16M) is kept around. This means that short seeks (jumps) when 
>>> replaying do not cause disk accesses. Things like switching play 
>>> mode, FF, setting and moving editing marks shouldn't usually block 
>>> waiting for disk IO. The changes are most noticeable when eg. several 
>>> recordings are happening in the background.
>>>
>>> I did very little testing, treat this as beta quality at best. Seems 
>>> to work ok, but i won't probably have time to test it further for a 
>>> few days; maybe somebody wants to play w/ this, or even better take a 
>>> look at the Read() path...
> 
> 
> after running w/ it for a week, i'd now say it works; haven't seen any 
> problems that could be related to the Read() code.
> 
>> Attached is the patch as I would add it for the next version.
> 
> 
> Why not the inlined Seeks()? The IO is very inefficient anyway (tiny 
> read/write sizes), having an extra syscall (seek) for every read/write 
> call doesn't help...
> (yes, it may not be pretty, but it lets the compiler optimize the calls 
> away, w/o touching any callers. Unless you fixed the callers, then 
> ignore this)

Do you really think that making Seek() inline actually makes things
faster? Can you show some hard numbers?

>> Please take a look at the lines marked CLARIFY and let me know
>> if you have any comment on these.
> 
> 
> i'll have to rediff this and compare w/ my current version; will post 
> merged version later. Comments below.
> 
>> +++ cutter.c    2006/02/04 13:40:20
>> @@ -66,6 +66,8 @@
>>       toFile = toFileName->Open();
>>       if (!fromFile || !toFile)
>>          return;
>> +     fromFile->SetReadAhead(MEGABYTE(20));
>> +     toFile->SetReadAhead(MEGABYTE(20));
>>       int Index = Mark->position;
>>       Mark = fromMarks.Next(Mark);
>>       int FileSize = 0;
>> @@ -122,6 +125,7 @@
>>                      error = "toFile 1";
>>                      break;
>>                      }
>> +                 toFile->SetReadAhead(MEGABYTE(20));
>>                   FileSize = 0;
>>                   }
>>                LastIFrame = 0;
>> @@ -162,6 +166,7 @@
>>                         error = "toFile 2";
>>                         break;
>>                         }
>> +                    toFile->SetReadAhead(MEGABYTE(20));
>>                      FileSize = 0;
>>                      }
>>                   }
> 
> 
> i've dropped the toFile->SetReadAhead() hack -- it doesn't change that 
> much when the datasyncs aren't there anyway.
> 
>> +++ tools.c    2006/02/04 13:55:24
>> +        // Read ahead:
>> +        // no jump? (allow small forward jump still inside readahead 
>> window).
>> +        if (jumped >= 0 && jumped <= (off_t)readahead) {
>> +           // Trigger the readahead IO, but only if we've used at least
>> +           // 1/2 of the previously requested area. This avoids calling
>> +           // fadvise() after every read() call.
>> +           if (ahead - curpos < (off_t)(readahead - readahead / 2)) 
>> {//XXX CLARIFY: isn't this just 'readahead / 2'???
> 
> 
> historical reasons -- it used to be 1/4, not 1/2. I went back to 1/2 
> after switching to the new read caching strategy; previously i was 
> seeing stalls when replaying and eg cutting at the same time. With the 
> new code hopefully 1/2 is enough and reduces the number of fadvise calls.

So we could make this

   if (ahead - curpos < (off_t)(readahead / 2)) {

then?

>> +           // The above fadvise() works when writing slowly 
>> (recording), but could
>> +           // leave cached data around when writing at a high rate 
>> (cutting).
>> +           // Also, it seems in some setups, the above does not 
>> trigger any I/O and
>> +           // the fdatasync() call below has to do all the work 
>> (reiserfs with some
>> +           // kind of write gathering enabled).
>> +           // We add 'readahead' to the threshold in an attempt to 
>> increase cutting
>> +           // speed; it's a tradeoff -- speed vs RAM-used.
>> +           if (totwritten > MEGABYTE(32) + readahead) {
>> +              //fdatasync(fd);//XXX CLARIFY: so what, is this 
>> necessary or not???
> 
> 
> it is not, at least not here, however i left it commented out in case 
> there are setups where it actually helps -- it's easier to uncomment 
> this one line for testing than to find the right place to insert the 
> call. It's not on by default because fsyncs are relatively expensive; 
> i'd rather avoid them unless absolutely required. (i've updated the 
> comments in this area since, they will be in the merged version)

Ok. Please make that a diff against version 1.3.41 with vdr-1.3.41-fadvise.diff
applied.

> [for faster responses please cc me; esp. if it's a reply to an older 
> thread -- i almost missed this]

Don't worry, your response was fast enough ;-)

Klaus