RE: [PATCH RFC 0/9] A rendezvous module

> I can only recommende everone to buy from a less f***** up GPU or
> accelerator vendor.  
I would certainly love that.  This is not just a recent problem; it has been going on for at least 3-5 years with no end in sight.  And the NVIDIA driver itself is closed-source in the kernel :-(  That makes tuning and debugging even harder, and it continues to add costs for NIC vendors other than NVIDIA themselves to support this.

Back to the topic at hand, yes, there are a few misalignments in the ABI.  Most of the structures are carefully aligned.  Below I summarize the major structures and their alignment characteristics. In a few places we chose readability for the application programmer by ordering fields in a logical order, such as for statistics.   

In one place a superfluous resv field was used (rv_query_params_out), and when I alluded that it might be possible to take advantage of it in the future to enable a common ABI for GPUs, we went down this deep rat hole.

In studying all the related fields, we found that in most cases, even if we shuffled everything for maximum packing, the structures would still end up about the same size, and all of this would be for non-performance-path ABIs.

Here is a summary:
13 structures are all perfectly aligned with no gaps, plus the 7 structures below.

rv_query_params_out - has one 4 byte reserved field to guarantee alignment for a u64 which follows.
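To illustrate why a reserved field like this can be useful, here is a minimal sketch.  The struct and its field names are hypothetical and are not the real rv_query_params_out layout.  The explicit 4 byte resv puts the u64 at the same offset on ABIs that align a u64 to 8 bytes (such as x86-64) and on ABIs that only align it to 4 bytes inside a struct (such as 32-bit x86), so 32-bit and 64-bit callers see one layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only, not the real rv ABI. */
struct example_query_out {
	uint32_t field_a;
	uint32_t field_b;
	uint32_t field_c;
	uint32_t resv1;      /* explicit padding, keeps wide_value 8 byte aligned */
	uint64_t wide_value;
};

/* With the reserved field, the layout contains no compiler-inserted padding. */
static_assert(offsetof(struct example_query_out, wide_value) == 16,
	      "u64 must start on an 8 byte boundary");
static_assert(sizeof(struct example_query_out) == 24,
	      "no hidden padding expected");
```

Without resv1, a compiler for x86-64 would insert the same 4 bytes implicitly, while a 32-bit x86 compiler would typically place wide_value at offset 12, which is the kind of mismatch the reserved field avoids.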

rv_attach_params_in - organized logically as early fields influence how later fields are interpreted.  Fair number of fields, two 1 byte gaps and one 7 byte gap.  Shuffling this might save about 4-8 bytes tops

rv_cache_stats_params_out - ordered logically by statistics meanings.  Two 4 byte gaps could be solved by having a less logical order.  Of course, applications reporting these statistics will tend to do output in the original order, so packing this results in a harder to use ABI and more difficult code review for application writers wanting to make sure they report all stats but do so in a logical order.

rv_conn_get_stats_params_out - one 2 byte gap (so the 1 byte field mimicking the input request can be first), three 4 byte gaps.  Same explanation as rv_cache_stats_params_out.

rv_conn_create_params_in - one 4 byte gap, easy enough to swap

rv_post_write_params_out - one 3 byte gap.  Presented in logical order; shuffling would still yield the same size, as the compiler will round up the size.
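The point about the compiler rounding up the size can be sketched as follows.  Both structs are hypothetical and do not reproduce the real rv_post_write_params_out fields.  The sketch assumes a typical ABI where uint32_t has 4 byte alignment: moving the 1 byte field to the end only turns the 3 byte interior gap into 3 bytes of tail padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "logical order" layout: 3 byte gap after flag. */
struct example_logical {
	uint32_t first;
	uint8_t  flag;
	uint32_t second;
};

/* Hypothetical "shuffled" layout: no interior gap, 3 bytes of tail padding. */
struct example_shuffled {
	uint32_t first;
	uint32_t second;
	uint8_t  flag;
};

static_assert(sizeof(struct example_logical) == 12, "3 byte interior gap");
static_assert(sizeof(struct example_shuffled) == 12, "3 byte tail padding");
static_assert(sizeof(struct example_logical) == sizeof(struct example_shuffled),
	      "reordering does not shrink the struct");
```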

rv_event - carefully packed and aligned.  Had to make this work on a wide range of compilers with a 1 byte common field defining which part of union was relevant.  Could put the same field in all unions to get rid of packed attribute if that is preferred.  We found other similar examples like this in an older 4.18 kernel, cited one below.
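For reference, the alternative mentioned above (putting the same field in every union member so that no packed attribute is needed) might look roughly like the sketch below.  All names and fields here are hypothetical and this is not the actual rv_event definition.  Each member starts with the same 1 byte type field, and the padding is spelled out explicitly so the layout does not depend on the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event variants, each beginning with the common type byte. */
struct example_event_a {
	uint8_t  event_type;
	uint8_t  resv[3];    /* explicit padding up to the u32 */
	uint32_t status;
	uint64_t context;
};

struct example_event_b {
	uint8_t  event_type;
	uint8_t  resv[7];    /* explicit padding up to the u64 */
	uint64_t handle;
};

union example_event {
	uint8_t event_type;  /* common first byte, valid for every variant */
	struct example_event_a a;
	struct example_event_b b;
};

/* Read the discriminator without knowing which variant was written. */
static inline uint8_t example_event_type(const union example_event *ev)
{
	return ev->event_type;
}

static_assert(offsetof(struct example_event_a, event_type) == 0, "type is first");
static_assert(offsetof(struct example_event_b, event_type) == 0, "type is first");
static_assert(sizeof(union example_event) == 16, "no implicit padding");
```

The trade-off in this sketch is a repeated type field and reserved bytes in every variant, in exchange for not relying on a compiler-specific packed attribute.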

It should be noted that there are existing examples with small gaps or reserved fields in the existing kernel and RDMA stack.  A few examples, from ib_user_verbs.h and elsewhere, include:

ib_uverbs_send_wr - 4 byte gap after ex for the rdma field in union.

ib_uverbs_flow_attr - 2 byte reserved field declared

ib_flow_spec - 2 byte gap after size field

rdma_netdev - 7 byte gap after port_num

./hw/mthca/mthca_eq.c - very similar use of packed for mthca_eqe to the one complained about in rv_event

Todd



