RE: [RFC] libibverbs IB Device Memory support

> -----Original Message-----
> From: Christoph Lameter [mailto:cl@xxxxxxxxx]
> Sent: Tuesday, June 06, 2017 1:11 AM
> To: Leon Romanovsky <leonro@xxxxxxxxxxxx>
> Cc: Jason Gunthorpe <jgunthorpe@xxxxxxxxxxxxxxxxxxxx>; ahmad omary
> <ahmad151084@xxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Ahmad Omary
> <ahmad@xxxxxxxxxxxx>; Yishai Hadas <yishaih@xxxxxxxxxxxx>; Tzahi
> Oved <tzahio@xxxxxxxxxxxx>; Alex Rosenbaum <alexr@xxxxxxxxxxxx>;
> Ariel Levkovich <lariel@xxxxxxxxxxxx>; Liran Liss <liranl@xxxxxxxxxxxx>
> Subject: Re: [RFC] libibverbs IB Device Memory support
> 
> On Mon, 5 Jun 2017, Leon Romanovsky wrote:
> 
> > It is rough calculation for 1MB, when I asked Ahmad about this
> > limitation (4K) he explained to me that exposed device memory is less
> > than 1MB.

Some of the devices support less than 1MB of internal device memory (e.g., 256KB).

> Still doesnt that mean more than 256 MPI instances or so per node?
>

In the above use case, with a 256KB device and 4KB allocation granularity, we
can have only 256KB / 4KB = 64 processes per node, which is critical for HPC.
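
For illustration, a minimal sketch of how an application could size this
budget. It assumes the max_dm_size attribute and ibv_query_device_ex() shape
that eventually landed in rdma-core, which may differ from this RFC:

/* Sketch only: query how much device memory the HCA exposes.
 * Assumes the max_dm_size attribute from the API that was later
 * merged into rdma-core; the RFC under discussion may differ. */
#include <stdio.h>
#include <infiniband/verbs.h>

static void print_dm_budget(struct ibv_context *ctx)
{
	struct ibv_device_attr_ex attr = {};

	if (ibv_query_device_ex(ctx, NULL, &attr))
		return;

	/* At 4KB allocation granularity, 256KB of device memory
	 * allows only 256KB / 4KB = 64 allocations per node. */
	printf("device memory: %llu bytes => %llu 4KB allocations\n",
	       (unsigned long long)attr.max_dm_size,
	       (unsigned long long)(attr.max_dm_size / 4096));
}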
 
> The use case for a semaphore indicates that a 4k page would be shared
> between multiple processes? Therefore there is even less of a need of
> multiple pages.
> 
> You may not be able to avoid the 4k page since page protection works only
> on a 4k level. The kernel futexes rely on 4k page protection tricks.
> 

The vendor driver still allocates and maps at 4KB page granularity. But if the
HW device supports allocation units smaller than 4KB, then the HW must provide
the required protection.

> Please come up with a reasonable use case here.... We do not run MPI but
> our use cases work fine with mmapped 4k pages. There are some who
> actually would like 2M pages for that use case since some of the adapters
> have quite a bit of memory available.
> 
> A small object allocator with the need to go through an intermediate layer
> seems to be not very productive.
> 

Note that the device memory does not necessarily have to be mapped to the CPU,
i.e., it is not necessarily accessible over PCI, and may be accessible only by
RDMA. This is why we can't use mmap() for all cases, and dedicated allocation
and copy functions are needed.
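
As a rough sketch of that flow, here is a hypothetical setup_dm_semaphore()
helper. It assumes the ibv_alloc_dm()/ibv_memcpy_to_dm()/ibv_reg_dm_mr() names
and signatures that were eventually merged into rdma-core; treat those as an
assumption, not the final RFC API:

/* Sketch only: allocate device memory, stage an initial value with
 * the dedicated copy verb (no CPU mapping assumed), and register it
 * as a zero-based MR so peers can reach it via RDMA atomics. */
#include <stdint.h>
#include <infiniband/verbs.h>

static struct ibv_mr *setup_dm_semaphore(struct ibv_context *ctx,
					 struct ibv_pd *pd, size_t len)
{
	struct ibv_alloc_dm_attr dm_attr = { .length = len };
	struct ibv_dm *dm;
	struct ibv_mr *mr;
	uint64_t init = 0;

	dm = ibv_alloc_dm(ctx, &dm_attr);
	if (!dm)
		return NULL;

	/* The memory may not be PCI-accessible, so no memcpy()/mmap():
	 * only the dedicated copy verbs (or RDMA) can touch it. */
	if (ibv_memcpy_to_dm(dm, 0, &init, sizeof(init))) {
		ibv_free_dm(dm);
		return NULL;
	}

	/* Device memory MRs are zero-based; remote peers target
	 * offsets within the registration. */
	mr = ibv_reg_dm_mr(pd, dm, 0, len,
			   IBV_ACCESS_ZERO_BASED |
			   IBV_ACCESS_LOCAL_WRITE |
			   IBV_ACCESS_REMOTE_READ |
			   IBV_ACCESS_REMOTE_WRITE |
			   IBV_ACCESS_REMOTE_ATOMIC);
	if (!mr)
		ibv_free_dm(dm);
	return mr;
}

Teardown would go in reverse order: ibv_dereg_mr() on the MR, then
ibv_free_dm() on the device memory handle.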

Ahmad Omary

 







