From: Wei Liu <wei.liu@xxxxxxxxxx> Sent: Friday, January 14, 2022 11:13 AM > > On Mon, Jan 10, 2022 at 01:44:19AM +0100, Andrea Parri wrote: > > (Extending Cc: list,) > > > > On Sun, Jan 09, 2022 at 05:55:16PM +0800, Yanming Liu wrote: > > > Commit adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V > > > out of the ring buffer") introduced a notion of maximum packet size in > > > vmbus channel and used that size to initialize a buffer holding all > > > incoming packet along with their vmbus packet header. Currently, some > > > vmbus drivers set max_pkt_size to the size of their receive buffer > > > passed to vmbus_recvpacket, however vmbus_open expects this size to also > > > include vmbus packet header. This leads to corruption of the ring buffer > > > state when receiving a maximum sized packet. > > > > > > Specifically, in hv_balloon I have observed of a dm_unballoon_request > > > message of 4096 bytes being truncated to 4080 bytes. When the driver > > > tries to read next packet it starts from a wrong read_index, receives > > > garbage and prints a lot of "Unhandled message: type: <garbage>" in > > > dmesg. > > > > > > The same mismatch also happens in hv_fcopy, hv_kvp, hv_snapshot, > > > hv_util, hyperv_drm and hyperv_fb, though bad cases are not observed > > > yet. > > > > > > Allocate the buffer with HV_HYP_PAGE_SIZE more bytes to make room for > > > the descriptor, assuming the vmbus packet header will never be larger > > > than HV_HYP_PAGE_SIZE. This is essentially free compared to just adding > > > 'sizeof(struct vmpacket_descriptor)' because these buffers are all more > > > than HV_HYP_PAGE_SIZE bytes so kmalloc rounds them up anyway. > > > > > > Fixes: adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer") > > > Suggested-by: Andrea Parri (Microsoft) <parri.andrea@xxxxxxxxx> > > > Signed-off-by: Yanming Liu <yanminglr@xxxxxxxxx> > > > > Thanks for sorting this out; the patch looks good to me: > > > > Reviewed-by: Andrea Parri (Microsoft) <parri.andrea@xxxxxxxxx> > > > > Thanks. I will pick this up after 5.17-rc1 is out. > > Wei. I'm NACK'ing this set of changes. I've spent some further time investigating, so let me explain. I'm good with the overall approach of fixing individual drivers to set the max_pkt_size to account for the VMbus packet header, as this is an important aspect that was missed in the original coding. But interestingly, all but one of the miscellaneous VMbus drivers allocate significantly more receive buffer space than is actually needed, and the max_pkt_size matching that receive buffer size is already bigger than needed. In all these cases, there is already plenty of space for the VMbus packet header. These hv-util.c drivers allocate a receive buffer 4 Kbytes in size, and all receive only small fixed-size packets: heartbeat, shutdown, timesync. I don't think any changes are needed for these drivers because the default max_pkt_size value of 4 Kbytes bytes is plenty of space even when accounting for the VMbus packet header. The VSS driver in hv_snapshot.c allocates a receive buffer of 8 Kbytes and sets max_pkt_size to 8 Kbytes. But the received messages are all fixed size and small. I don't know why the driver uses an 8 Kbyte receive buffer instead of 4 Kbytes, but the current settings are more than sufficient. The FCOPY driver in hv_fcopy.c allocates a receive buffer of 8 Kbytes and sets max_pkt_size to 8 Kbytes. The received messages have some header overhead plus up to 6 Kbytes of data, so the 8 Kbyte receive buffer is definitely needed. And while this one is a little closer to filling up the available receive space than the previous ones, there's still plenty of room for the VMbus packet header. I don't think any changes are needed. The KVP driver in hv_kvp.c allocates a receive buffer of 16 Kbytes and sets max_pkt_size to 16 Kbytes. From what I can tell, the received messages max out at close to 4 Kbytes. Key exchange messages have 512 bytes of key name and 2048 bytes of key value, plus some header overhead. ipaddr_value messages are the largest, with 3 IP addresses @ 1024 bytes each, plus a gateway with 512 bytes, and an adapter ID with 128 bytes. But altogether, that is still less than 4096. I don't know why the receive buffer is 16 Kbytes, but it is plenty big and no changes are needed. The two frame buffer drivers also use 16 Kbyte receive buffers and set max_pkt_size to 16 Kbytes. Again, this looks to be overkill as the messages received are mostly fixed size. One message returns a variable size list of supported screen resolutions, but each entry in the list is only 4 bytes, and we're looking at a few tens of resolutions at the very most. Again, no changes are needed. After all this analysis, the balloon driver is the only one that needs changing. It uses a 4 Kbyte receive buffer, and indeed Hyper-V may fill that receive buffer in the case of unballoon messages. And that where the original problem was observed. Two other aspects for completeness. First, all these drivers do protocol version negotiation with the Hyper-V host. The negotiation includes a variable-size list of supported versions. Each version in the list takes 4 bytes, but there would never be enough different versions to come close to filling a 4 Kbyte buffer. So there's no problem here. The other lurking issue is the 'offset8' field in the VMbus packet header, which says where the payload starts relative to the VMbus packet header. In practice, this value is always small, so there's no significant additional space to account for. While it's theoretically possible that Hyper-V could use a much larger value, and cause max_pkt_size to be exceeded, there's no real way to fix this problem. Adding an extra page to max_pkt_size, as it done in this patch, certainly provides some extra room, but doesn't guarantee the problem can't happen. But since there's no indication Hyper-V would ever put a big value into offset8, I don't think we need to worry about the possibility. My bottom-line: Let's fix the balloon driver. But now that we know the other drivers are safe "as is", let's leave them unchanged and not waste the additional memory. Michael