Re: [RFC] Alternative design for fast register physical memory

Hey Bob

Sorry to disturb you. Are you going to revert to a single map, as Jason suggested?


Thanks
Zhijian


On 27/05/2022 20:42, Jason Gunthorpe wrote:
> On Tue, May 24, 2022 at 05:28:00PM -0500, Bob Pearson wrote:
>
>> We have a work around by fencing all the local operations which more
>> or less works but will have bad performance.  The maps used in FMRs
>> have fairly short lifetimes but definitely longer than we can
>> support today. I am trying to work out the semantics of everything.
> IBTA specifies the fence requirements, I thought we decided RXE or
> maybe even lustre wasn't following the spec?
>
>> To make this all recoverable in the face of errors let there be more
>> than one map present for an FMR indexed by the key portion of the
>> l/rkeys.
> Real HW doesn't have more than one map, this seems like the wrong
> direction.
>
> As we discussed, there is something wrong with how rxe is processing
> its queues, it isn't following IBTA-defined behaviors in the
> exceptional cases.
>
>> Alternative view of FMRs:
>>
>> verb: ib_alloc_mr(pd, max_num_sg)			- create an empty MR object with no maps
>> 							  with l/rkey = [index, key] with index
>> 							  fixed and key some initial value.
>>
>> verb: ib_update_fast_reg_key(mr, newkey)		- update key portion of l/rkey
>>
>> verb: ib_map_mr_sg(mr, sg, sg_nents, sg_offset)		- create a new map from allocated memory
>> 							  or by re-using an INVALID map. Maps are
>> 							  all the same size (max_num_sg). The
>> 							  key (index) of this map is the current
>> 							  key from l/rkey. The initial state of
>> 							  the map is FREE. (and thus not usable
>> 							  until a REG_MR work request is used.)
> More than one map is nonsense, real HW has a single map, a MR object is that
> single map.
>
>> This is an improvement over the current state. At the moment we have
>> only two maps one for making new ones and one for doing IO. There is
>> no room to back up but at the moment the retry logic assumes that
>> you can which is false. This can be fixed easily by forcing all
>> local operations to be fenced which is what we are doing at the
>> moment at HPE. This can insert long delays between every new FMR
>> instance.  By allowing three maps and then fencing we can back up
>> one broken IO operation without too much of a delay.
> IMHO you need to go back to one map and fix the queue processing
> logic to be spec compliant.
>
> Jason
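
For reference, the "l/rkey = [index, key]" layout mentioned in the quoted text can be sketched as below. This is only an illustration: the helper names (mr_index, mr_key, mr_update_key) are hypothetical and not kernel APIs. The 24-bit index / 8-bit key split is assumed from the masking that ib_update_fast_reg_key() applies in include/rdma/ib_verbs.h.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only (not kernel APIs).
 * Assumption: the low 8 bits of an l/rkey are the consumer-owned "key"
 * and the upper 24 bits are the fixed "index" of the MR.
 */

/* Extract the fixed index portion (upper 24 bits). */
static inline uint32_t mr_index(uint32_t lrkey)
{
	return lrkey >> 8;
}

/* Extract the variable key portion (low 8 bits). */
static inline uint8_t mr_key(uint32_t lrkey)
{
	return (uint8_t)(lrkey & 0xffu);
}

/* Replace the key portion and leave the index untouched. */
static inline uint32_t mr_update_key(uint32_t lrkey, uint8_t newkey)
{
	return (lrkey & 0xffffff00u) | newkey;
}
```

With a single map per MR, as Jason suggests, changing the key portion only changes which l/rkey value is accepted for that one map; the index, and therefore the MR object it selects, stays the same.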



