Re: [PATCH 87/88] Add configurable prefetch size for layoutget

On 2011-06-10 10:09, tao.peng@xxxxxxx wrote:
> Hi, Benny,
> 
> -----Original Message-----
> From: Benny Halevy [mailto:benny@xxxxxxxxxx] 
> Sent: Friday, June 10, 2011 8:33 PM
> To: Peng, Tao
> Cc: bergwolf@xxxxxxxxx; rees@xxxxxxxxx; linux-nfs@xxxxxxxxxxxxxxx; honey@xxxxxxxxxxxxxx
> Subject: Re: [PATCH 87/88] Add configurable prefetch size for layoutget
> 
> On 2011-06-10 02:00, tao.peng@xxxxxxx wrote:
>> Hi, Benny,
>>
>> Cheers,
>> -Bergwolf
>>
>>
>> -----Original Message-----
>> From: linux-nfs-owner@xxxxxxxxxxxxxxx [mailto:linux-nfs-owner@xxxxxxxxxxxxxxx] On Behalf Of Benny Halevy
>> Sent: Friday, June 10, 2011 5:23 AM
>> To: Peng Tao
>> Cc: Jim Rees; linux-nfs@xxxxxxxxxxxxxxx; peter honeyman
>> Subject: Re: [PATCH 87/88] Add configurable prefetch size for layoutget
>>
>> On 2011-06-09 08:07, Peng Tao wrote:
>>> Hi, Jim and Benny,
>>>
>>> On Thu, Jun 9, 2011 at 9:58 PM, Jim Rees <rees@xxxxxxxxx> wrote:
>>>> Benny Halevy wrote:
>>>>
>>>>  > My understanding is that layoutget specifies a min and max, and the server
>>>>
>>>>  There's a min.  What do you consider the max?
>>>>  Whatever gets into csa_fore_chan_attrs.ca_maxresponsesize?
>>>>
>>>> The spec doesn't say max, it says "desired."  I guess I assumed the server
>>>> wouldn't normally return more than desired.
>>> In fact server is returning "desired" length. The problem is that we
>>> call pnfs_update_layout in nfs_write_begin, and it will end up setting
>>> both minlength and length to page size. There is no space for client
>>> to collapse layoutget range in nfs_write_begin.
>>>
>>
>> That's a different issue.  Waiting with pnfs_update_layout to flush
>> time rather than write_begin if the whole page is written would help
>> sending a more meaningful desired range as well as avoiding needless
>> read-modify-writes in case the application also wrote the whole
>> preallocated block.
>> [PT] It is also the reason why we want to introduce layout prefetching, to get more segment than the page passed in nfs_write_begin.
>>
> 
> Peng, I understand what you want to achieve, but the proposed way
> just doesn't fly. The server knows its allocation policies better than the
> client does, and it has a better view of the combined workload of the different
> clients and possible conflicts between them. Therefore it should be making the
> ultimate decision about the actual segment sizes.
> [PT] Yes, you are right. Server should know combined workload of all clients and make its decision based on that.
> And it always has the right to return more than (or less than) specified in loga_length.
>  
> That said, the client should indeed do its best to ask for the most appropriate
> segment size for its use, and we should be doing a better job at that.
> It's just that blindly asking for more is not a good strategy, and requiring
> manual admin help to tune the clients is not acceptable.
> [PT] Yeah, determining the most appropriate size is always the hard part. Do you have any suggestions for that?

A simple algorithm I can suggest is:
- on initialization, calculate and save, per layout driver:
  - the maximum layout size
    - take into account csr_fore_chan_attrs.ca_maxresponsesize and possibly other parameters
    - keep both the calculated value and a working copy of it
  - the alignment value
- on a miss, see if there's an adjacent layout segment in the cache
- if found, ask for twice the found segment's size, up to the maximum value,
  aligned on the alignment value
- if the server returns less than the requested layoutget range, make a note of the
  returned length (but do not adjust the maximum yet, as the server may return a
  short segment for various reasons)
- if the server consistently returns less than was asked for, adjust the
  working copy of the maximum length
- if the maximum was adjusted, try bumping it up after X (TBD) layoutgets or T seconds
  to see if that was just due to high load or conflicts on the server
- on any error returned for LAYOUTGET, reset the algorithm parameters
- on session reestablishment, recalculate the maximums.
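
The steps above could be sketched roughly as follows. This is only a minimal
user-space illustration, not kernel code: struct lg_prefetch, the lg_*()
functions, SHORT_RUNS_LIMIT (what counts as "consistent") and PROBE_AFTER (the
"X" above) are all hypothetical names and values. Doubling the working maximum
when probing is likewise an assumption, and the "T seconds" timer is left out.

```c
#include <assert.h>
#include <stdint.h>

#define SHORT_RUNS_LIMIT 3   /* assumed: short replies in a row = "consistent" */
#define PROBE_AFTER      16  /* assumed: the "X" layoutgets before bumping up */

/* Hypothetical per-layout-driver state; not an existing kernel structure. */
struct lg_prefetch {
    uint64_t calc_max;      /* maximum calculated at init / session setup */
    uint64_t work_max;      /* working copy, may shrink on short replies */
    uint64_t align;         /* alignment value, must be > 0 */
    uint64_t last_short;    /* last short length the server returned */
    unsigned short_runs;    /* consecutive short replies seen */
    unsigned since_adjust;  /* layoutgets since work_max was lowered */
};

static uint64_t round_down(uint64_t v, uint64_t a) { return v / a * a; }
/* Note: does not guard against overflow for lengths near UINT64_MAX. */
static uint64_t round_up(uint64_t v, uint64_t a) { return (v + a - 1) / a * a; }

/* Initialization, and again on session reestablishment. */
void lg_init(struct lg_prefetch *p, uint64_t calc_max, uint64_t align)
{
    assert(align > 0 && calc_max >= align);
    p->calc_max = round_down(calc_max, align);
    p->work_max = p->calc_max;
    p->align = align;
    p->last_short = 0;
    p->short_runs = 0;
    p->since_adjust = 0;
}

/* On a miss: adjacent_len is the adjacent cached segment's length, 0 if none. */
uint64_t lg_desired(const struct lg_prefetch *p, uint64_t min_len,
                    uint64_t adjacent_len)
{
    uint64_t want = min_len;

    if (adjacent_len > p->work_max / 2)
        want = p->work_max;              /* doubling would exceed the maximum */
    else if (adjacent_len)
        want = 2 * adjacent_len;
    want = round_up(want, p->align);
    if (want > p->work_max)
        want = p->work_max;
    return want < min_len ? min_len : want;  /* never ask for less than needed */
}

/* After a successful LAYOUTGET: asked is the desired length, got the reply. */
void lg_reply(struct lg_prefetch *p, uint64_t asked, uint64_t got)
{
    if (p->work_max < p->calc_max)
        p->since_adjust++;

    if (got < asked) {
        p->last_short = got;             /* note it, do not adjust yet */
        if (++p->short_runs >= SHORT_RUNS_LIMIT) {
            uint64_t m = round_down(got, p->align);
            if (m < p->align)
                m = p->align;
            if (m < p->work_max) {
                p->work_max = m;
                p->since_adjust = 0;
            }
            p->short_runs = 0;
        }
    } else {
        p->short_runs = 0;
    }

    /* Probe upward in case the short replies were due to load or conflicts. */
    if (p->work_max < p->calc_max && p->since_adjust >= PROBE_AFTER) {
        p->work_max = p->work_max > p->calc_max / 2 ? p->calc_max
                                                    : 2 * p->work_max;
        p->since_adjust = 0;
    }
}

/* On any LAYOUTGET error: reset the algorithm parameters. */
void lg_error(struct lg_prefetch *p)
{
    lg_init(p, p->calc_max, p->align);
}
```

With a 1 MiB maximum and 4 KiB alignment, a miss next to an 8 KiB cached segment
would ask for 16 KiB, and three short 16 KiB replies in a row would lower the
working maximum to 16 KiB until a later probe or an error raises it again.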

Benny

> 
> Thanks,
> Tao