On Tue, Dec 14, 2021 at 03:06:58AM +0100, Andrea Parri wrote: > > Truncated packet: > > module("hv_vmbus").statement("hv_pkt_iter_first@drivers/hv/ring_buffer.c:457"): > > desc->offset8 = 2, desc->len8 = 514, rbi->pkt_buffer_size = 4096 > > module("hv_vmbus").statement("hv_ringbuffer_read@drivers/hv/ring_buffer.c:382"): > > desc->offset8 = 2, desc->len8 = 512 > > balloon_onchannelcallback: recvlen = 4080, dm_hdr->type = 8 > > > > First garbage packet: > > module("hv_vmbus").statement("hv_pkt_iter_first@drivers/hv/ring_buffer.c:457"): > > desc->offset8 = 21, desc->len8 = 16640, rbi->pkt_buffer_size = 4096 > > module("hv_vmbus").statement("hv_ringbuffer_read@drivers/hv/ring_buffer.c:382"): > > desc->offset8 = 21, desc->len8 = 512 > > balloon_onchannelcallback: recvlen = 3928, dm_hdr->type = 63886 > > > > The trace proved my hypothesis above. > > Thanks for the confirmation. > > (Back to "how to fix this" now.) I think that the patch properly addresses > the "mismatch" between the (maximum) size of the receive buffer (bufferlen > in vmbus_recvpacket()) and max_pkt_size: > > We've discussed hv_ballon in some: > > 1) drivers/hv/hv_balloon.c:balloon_onchannelcallback() > (bufferlen = HV_HYP_PAGE_SIZE, max_pkt_size = VMBUS_DEFAULT_MAX_PKT_SIZE) > > But I understand that the same mismatch is present *and addressed by your > patch in the following cases: > > 2) drivers/hv/hv_util.c:{shutdown,timesync,heartbeat}_onchannelcallback() > (bufferlen = HV_HYP_PAGE_SIZE, max_pkt_size = VMBUS_DEFAULT_MAX_PKT_SIZE) > > 3) drivers/hv/hv_fcopy.c:hv_fcopy_onchannelcallback() > (bufferlen = 2 * HV_HYP_PAGE_SIZE, max_pkt_size = 2 * HV_HYP_PAGE_SIZE) > > 4) drivers/hv/hv_snapshot.c:hv_vss_onchannelcallback() > (bufferlen= 2 * HV_HYP_PAGE_SIZE, max_pkt_size= 2 * HV_HYP_PAGE_SIZE) > > 5) drivers/hv/hv_kvp.c:hv_kvp_onchannelcallback() > (bufferlen= 4 * HV_HYP_PAGE_SIZE, max_pkt_size= 4 * HV_HYP_PAGE_SIZE) > > In fact, IIUC, similar considerations would apply to hv_fb and hv_drm *if > it were not for the fact that those drivers seem to have bogus values for > max_pkt_size: > > 6) drivers/video/fbdev/hyperv_fb.c > (bufferlen = MAX_VMBUS_PKT_SIZE, max_pkt_size=VMBUS_DEFAULT_MAX_PKT_SIZE) > > 7) drivers/gpu/drm/hyperv/hyperv_drm_proto.c > (bufferlen = MAX_VMBUS_PKT_SIZE, max_pkt_size=VMBUS_DEFAULT_MAX_PKT_SIZE) > > So, IIUC, some separate patch is needed in order to "adjust" those values > (say, by appropriately setting max_pkt_size in synthvid_connect_vsp() and > in hyperv_connect_vsp()), but I digress. > > Other comments on your patch: > > a) You mentioned the problem that "pkt_offset may not match the packet > descriptor size". I think this is a real problem. To address this > problem, we can *presumably consider HV_HYP_PAGE_SIZE to be a valid > upper bound for pkt_offset (and sizeof(struct vmpacket_descriptor)) > *and increase the value of max_pkt_size by HV_HYP_PAGE_SIZE (rather > than sizeof(struct vmpacket_descriptor)), like in: > > @@ -256,6 +256,7 @@ int hv_ringbuffer_init(struct hv_ring_buffer_info *ring_info, > > /* Initialize buffer that holds copies of incoming packets */ > if (max_pkt_size) { > + max_pkt_size += HV_HYP_PAGE_SIZE; > ring_info->pkt_buffer = kzalloc(max_pkt_size, GFP_KERNEL); > if (!ring_info->pkt_buffer) > return -ENOMEM; > > b) We may then want to "enforce"/check that that bound on pkt_offset, > > pkt_offset <= HV_HYP_PAGE_SIZE, > > is met by adding a corresponding check to the (previously discussed) > validation of pkt_offset included in hv_pkt_iter_first(), say: > > @@ -498,7 +498,8 @@ struct vmpacket_descriptor *hv_pkt_iter_first(struct vmbus_channel *channel) > * If pkt_offset is invalid, arbitrarily set it to > * the size of vmpacket_descriptor > */ > - if (pkt_offset < sizeof(struct vmpacket_descriptor) || pkt_offset > pkt_len) > + if (pkt_offset < sizeof(struct vmpacket_descriptor) || pkt_offset > pkt_len || > + pkt_offset > HV_HYP_PAGE_SIZE) > pkt_offset = sizeof(struct vmpacket_descriptor); > > /* Copy the Hyper-V packet out of the ring buffer */ > > While there (since I have noticed that such validation as well the > validation on pkt_len in hv_pkt_iter_first() tend to be the object > of a somehow recurring discussion ;/), I wouldn't mind to add some > "explicit" debug, pr_err_ratelimited() say, there. > > c) Last but not least, I'd recommend to pair the above changes or any > other change with some inline explanation/comments; these comments > could be snippets from an (updated) patch description for example. > > Andrea One more thought. The additional HV_HYP_PAGE_SIZE seems unnecessary for drivers such as hv_netvsc and hv_storvsc, which explicitly account for pkt_offset in their setting of max_pkt_size, as well as for drivers such as hv_pci, which uses vmbus_recvpacket_raw(). This suggests an alternative approach: do not increase max_pkt_size in the generic vmbus code, instead set/adjust max_pkt_size (only) for the drivers, in (1-7) above, which require the additional HV_HYP_PAGE_SIZE /pkt_offset. While putting on the driver(s) some additional "burden", this approach has the advantage of saving some memory (and keeping the generic code relatively simpler). Thoughts? Andrea