From: Boqun Feng <boqun.feng@xxxxxxxxx> Sent: Wednesday, July 22, 2020 4:43 PM
>
> On Wed, Jul 22, 2020 at 11:25:18PM +0000, Michael Kelley wrote:
> > From: Boqun Feng <boqun.feng@xxxxxxxxx> Sent: Monday, July 20, 2020 6:41 PM
> > >
> > > This patch introduces two types of GPADL: HV_GPADL_{BUFFER, RING}. The
> > > types of GPADL are purely a guest-side concept, IOW the hypervisor
> > > treats them the same.
> > >
> > > The reason for introducing the types of GPADL is to support guests
> > > whose page size is not 4k (the page size of the Hyper-V hypervisor).
> > > In these guests, both the headers and the data parts of the
> > > ringbuffers need to be aligned to PAGE_SIZE, because 1) some of the
> > > ringbuffers will be mapped into userspace and 2) we use a "double
> > > mapping" mechanism to support fast wrap-around, and "double mapping"
> > > relies on the ringbuffers being page-aligned. However, the Hyper-V
> > > hypervisor only uses 4k (HV_HYP_PAGE_SIZE) headers. Our solution to
> > > this is to always make the headers of the ringbuffers take one guest
> > > page, and when a GPADL is established between the guest and the
> > > hypervisor, only the first 4k of the header is used. To handle this
> > > special case, we need the types of GPADL to distinguish the different
> > > guest memory layouts used for a GPADL.
> > >
> > > A type enum is introduced along with several general interfaces to
> > > describe the differences between a normal buffer GPADL and a
> > > ringbuffer GPADL.
> > >
> > > Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> > > ---
> > >  drivers/hv/channel.c   | 140 +++++++++++++++++++++++++++++++++++------
> > >  include/linux/hyperv.h |  44 ++++++++++++-
> > >  2 files changed, 164 insertions(+), 20 deletions(-)
> >
> > [snip]
> >
> > > @@ -437,7 +528,17 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
> > >  	open_msg->openid = newchannel->offermsg.child_relid;
> > >  	open_msg->child_relid = newchannel->offermsg.child_relid;
> > >  	open_msg->ringbuffer_gpadlhandle = newchannel->ringbuffer_gpadlhandle;
> > > -	open_msg->downstream_ringbuffer_pageoffset = newchannel->ringbuffer_send_offset;
> > > +	/*
> > > +	 * The unit of ->downstream_ringbuffer_pageoffset is HV_HYP_PAGE and
> > > +	 * the unit of ->ringbuffer_send_offset is PAGE, so here we first
> > > +	 * calculate it in bytes and then convert it into HV_HYP_PAGEs. Also,
> > > +	 * ->ringbuffer_send_offset is the offset in the guest, while
> > > +	 * ->downstream_ringbuffer_pageoffset is the offset in the gpadl (i.e.
> > > +	 * in the hypervisor), so a (PAGE_SIZE - HV_HYP_PAGE_SIZE) gap needs
> > > +	 * to be skipped.
> > > +	 */
> > > +	open_msg->downstream_ringbuffer_pageoffset =
> > > +		((newchannel->ringbuffer_send_offset << PAGE_SHIFT) - (PAGE_SIZE - HV_HYP_PAGE_SIZE)) >> HV_HYP_PAGE_SHIFT;
> >
> > I couldn't find that the "downstream_ringbuffer_pageoffset" field
> > is used anywhere. Can it just be deleted entirely instead of having
> > this really messy calculation?
> >
>
> This field is part of struct vmbus_channel_open_channel, which means the
> guest-hypervisor communication protocol requires us to set the field,
> IIUC. So I don't think we can delete it.

Indeed, you are right. I mis-read it as a field in struct vmbus_channel,
but that's not the case. Thanks.
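FWIW, working the arithmetic through a concrete case (my numbers, not from
the patch) convinced me it lines up with the RING layout documented in the
hyperv.h comment further down: with 64k guest pages, a send ring of one
header page plus one data page gives ringbuffer_send_offset = 2, so

	(2 << 16) - (65536 - 4096) = 69632
	69632 >> 12 = 17

and gpadl index 17 is indeed the first HV_HYP_PAGE of the receive ring
header: index 0 is the single 4k of the send header that gets mapped, and
indexes 1 through 16 are the sixteen 4k chunks of the send data page.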
>
> To deal with the messy calculation, I do realize there is a similar
> calculation in hv_gpadl_hvpfn() too, so in the next version, I will
> add a new helper to do this "send offset in guest virtual address to
> send offset in GPADL" calculation, and use it here and in
> hv_gpadl_hvpfn(). Thoughts?

Yes, that helps.

> > >  	open_msg->target_vp = newchannel->target_vp;
> > >
> > >  	if (userdatalen)
> > > @@ -497,6 +598,7 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
> > >  	return err;
> > >  }
> > >
> > > +
> >
> > Spurious add of a blank line?
> >
>
> Yeah, I will fix this, thanks!
>
> Regards,
> Boqun
>
> > >  /*
> > >   * vmbus_connect_ring - Open the channel but reuse ring buffer
> > >   */
> > > diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> > > index 692c89ccf5df..663f0a016237 100644
> > > --- a/include/linux/hyperv.h
> > > +++ b/include/linux/hyperv.h
> > > @@ -29,6 +29,48 @@
> > >
> > >  #pragma pack(push, 1)
> > >
> > > +/*
> > > + * Types of GPADL, which decide how the GPADL header is created.
> > > + *
> > > + * There isn't much difference between BUFFER and RING if PAGE_SIZE is the
> > > + * same as HV_HYP_PAGE_SIZE.
> > > + *
> > > + * If PAGE_SIZE is bigger than HV_HYP_PAGE_SIZE, the headers of ring buffers
> > > + * will be PAGE_SIZE, however, only the first HV_HYP_PAGE will be put
> > > + * into the gpadl, therefore the number of HV_HYP_PAGEs and the index of
> > > + * each HV_HYP_PAGE will differ between the types of GPADL, for example
> > > + * if PAGE_SIZE is 64k:
> > > + *
> > > + * BUFFER:
> > > + *
> > > + * gva:    |--       64k      --|--       64k      --| ... |
> > > + * gpa:    | 4k | 4k | ... | 4k | 4k | 4k | ... | 4k |
> > > + * index:  0    1    2     15   16   17   18 .. 31   32 ...
> > > + *         |    |    ...   |    |    |    ...   |   ...
> > > + *         v    V          V    V    V          V
> > > + * gpadl:  | 4k | 4k | ... | 4k | 4k | 4k | ... | 4k | ... |
> > > + * index:  0    1    2 ... 15   16   17   18 .. 31   32 ...
> > > + *
> > > + * RING:
> > > + *
> > > + *         | header  |           data           | header  |     data      |
> > > + * gva:    |-- 64k --|--       64k      --| ... |-- 64k --|-- 64k --| ... |
> > > + * gpa:    | 4k | .. | 4k | 4k | ... | 4k | ... | 4k | .. | 4k | .. | ... |
> > > + * index:  0    1    16   17   18    31   ...   n   n+1  n+16 ...   2n
> > > + *         |         /    /          /          |         /         /
> > > + *         |        /    /          /           |        /         /
> > > + *         |       /    /   ...    /    ...     |       /    ...  /
> > > + *         |      /    /          /             |      /         /
> > > + *         |     /    /          /              |     /         /
> > > + *         V    V    V          V               V    V         v
> > > + * gpadl:  | 4k | 4k |    ...   |    ...        | 4k | 4k | ... |
> > > + * index:  0    1    2   ...    16   ...       n-15 n-14 n-13 ... 2n-30
> > > + */
> > > +enum hv_gpadl_type {
> > > +	HV_GPADL_BUFFER,
> > > +	HV_GPADL_RING
> > > +};
> > > +
> > >  /* Single-page buffer */
> > >  struct hv_page_buffer {
> > >  	u32 len;
> > > @@ -111,7 +153,7 @@ struct hv_ring_buffer {
> > >  	} feature_bits;
> > >
> > >  	/* Pad it to PAGE_SIZE so that data starts on page boundary */
> > > -	u8 reserved2[4028];
> > > +	u8 reserved2[PAGE_SIZE - 68];
> > >
> > >  	/*
> > >  	 * Ring data starts here + RingDataStartOffset
> > > --
> > > 2.27.0
> >
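Coming back to the helper you mention above: something like the below is
roughly what I'd expect (untested sketch; the name and the pages-based
parameter are my placeholders, not from your patch):

	/*
	 * Sketch only: convert a ring buffer offset expressed in guest
	 * pages (e.g. ->ringbuffer_send_offset) into the corresponding
	 * offset in the GPADL, in HV_HYP_PAGE units.  The ring header
	 * takes a full guest page in the guest, but only its first
	 * HV_HYP_PAGE goes into the gpadl, hence the
	 * (PAGE_SIZE - HV_HYP_PAGE_SIZE) gap that must be skipped.
	 */
	static inline u32 hv_ring_gpadl_hvpgoffset(u32 offset_in_pages)
	{
		/* offset in bytes within the guest mapping */
		u32 bytes = offset_in_pages << PAGE_SHIFT;

		/* skip the unmapped tail of the header page; count in 4k units */
		return (bytes - (PAGE_SIZE - HV_HYP_PAGE_SIZE)) >> HV_HYP_PAGE_SHIFT;
	}

Then the assignment in __vmbus_open() would collapse to:

	open_msg->downstream_ringbuffer_pageoffset =
		hv_ring_gpadl_hvpgoffset(newchannel->ringbuffer_send_offset);

and presumably hv_gpadl_hvpfn() could reuse the same helper for its RING
case, as you suggest.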