Zvi Effron <zeffron@xxxxxxxxxxxxx> writes: > On Tue, Sep 21, 2021 at 11:23 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote: >> >> Zvi Effron <zeffron@xxxxxxxxxxxxx> writes: >> >> > On Tue, Sep 21, 2021 at 9:06 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote: >> >> >> >> Hi Lorenz (Cc. the other people who participated in today's discussion) >> >> >> >> Following our discussion at the LPC session today, I dug up my previous >> >> summary of the issue and some possible solutions[0]. Seems no on >> >> actually replied last time, which is why we went with the "do nothing" >> >> approach, I suppose. I'm including the full text of the original email >> >> below; please take a look, and let's see if we can converge on a >> >> consensus here. >> >> >> >> First off, a problem description: If an existing XDP program is exposed >> >> to an xdp_buff that is really a multi-buffer, while it will continue to >> >> run, it may end up with subtle and hard-to-debug bugs: If it's parsing >> >> the packet it'll only see part of the payload and not be aware of that >> >> fact, and if it's calculating the packet length, that will also only be >> >> wrong (only counting the first fragment). >> >> >> >> So what to do about this? First of all, to do anything about it, XDP >> >> programs need to be able to declare themselves "multi-buffer aware" (but >> >> see point 1 below). We could try to auto-detect it in the verifier by >> >> which helpers the program is using, but since existing programs could be >> >> perfectly happy to just keep running, it probably needs to be something >> >> the program communicates explicitly. One option is to use the >> >> expected_attach_type to encode this; programs can then declare it in the >> >> source by section name, or the userspace loader can set the type for >> >> existing programs if needed. >> >> >> >> With this, the kernel will know if a given XDP program is multi-buff >> >> aware and can decide what to do with that information. For this we came >> >> up with basically three options: >> >> >> >> 1. Do nothing. This would make it up to users / sysadmins to avoid >> >> anything breaking by manually making sure to not enable multi-buffer >> >> support while loading any XDP programs that will malfunction if >> >> presented with an mb frame. This will probably break in interesting >> >> ways, but it's nice and simple from an implementation PoV. With this >> >> we don't need the declaration discussed above either. >> >> >> >> 2. Add a check at runtime and drop the frames if they are mb-enabled and >> >> the program doesn't understand it. This is relatively simple to >> >> implement, but it also makes for difficult-to-understand issues (why >> >> are my packets suddenly being dropped?), and it will incur runtime >> >> overhead. >> >> >> >> 3. Reject loading of programs that are not MB-aware when running in an >> >> MB-enabled mode. This would make things break in more obvious ways, >> >> and still allow a userspace loader to declare a program "MB-aware" to >> >> force it to run if necessary. The problem then becomes at what level >> >> to block this? >> >> >> > >> > I think there's another potential problem with this as well: what happens to >> > already loaded programs that are not MB-aware? Are they forcibly unloaded? >> >> I'd say probably the opposite: You can't toggle whatever switch we end >> up with if there are any non-MB-aware programs (you'd have to unload >> them first)... >> > > How would we communicate that issue? dmesg? I'm not very familiar with > how sysctl change failure causes are communicated to users, so this > might be a solved problem, but if I run `sysctl -w net.xdp.multibuffer > 1` (or whatever ends up actually being the toggle) to active > multi-buffer, and it fails because there's a loaded non-aware program, > that seems like a potential for a lot of administrator pain. Hmm, good question. Document that this only fails if there's a non-mb-aware XDP program loaded? Or use some other mechanism with better feedback? >> >> Doing this at the driver level is not enough: while a particular >> >> driver knows if it's running in multi-buff mode, we can't know for >> >> sure if a particular XDP program is multi-buff aware at attach time: >> >> it could be tail-calling other programs, or redirecting packets to >> >> another interface where it will be processed by a non-MB aware >> >> program. >> >> >> >> So another option is to make it a global toggle: e.g., create a new >> >> sysctl to enable multi-buffer. If this is set, reject loading any XDP >> >> program that doesn't support multi-buffer mode, and if it's unset, >> >> disable multi-buffer mode in all drivers. This will make it explicit >> >> when the multi-buffer mode is used, and prevent any accidental subtle >> >> malfunction of existing XDP programs. The drawback is that it's a >> >> mode switch, so more configuration complexity. >> >> >> > >> > Could we combine the last two bits here into a global toggle that doesn't >> > require a sysctl? If any driver is put into multi-buffer mode, then the system >> > switches to requiring all programs be multi-buffer? When the last multi-buffer >> > enabled driver switches out of multi-buffer, remove the system-wide >> > restriction? >> >> Well, the trouble here is that we don't necessarily have an explicit >> "multi-buf mode" for devices. For instance, you could raise the MTU of a >> device without it necessarily involving any XDP multi-buffer stuff (if >> you're not running XDP on that device). So if we did turn "raising the >> MTU" into such a mode switch, we would end up blocking any MTU changes >> if any XDP programs are loaded. Or having an MTU change cause a >> force-unload of all XDP programs. > > Maybe I missed something then, but you had stated that "while a > particular driver knows if it's running in multi-buff mode" so I > assumed that the driver would be able to tell when to toggle the mode > on. Well, a driver knows when it is attaching an XDP program whether it (the driver) is configured in a way such that this XDP program could encounter a multi-buf. > I had been thinking that when a driver turned multi-buffer off, it > could trigger a check of all drivers, but that also seems like it > could just be a global refcount of all the drivers that have requested > multi-buffer mode. When a driver enables multi-buffer for itself, it > increments the refcount, and when it disables, it decrements. A > non-zero count means the system is in multi-buffer mode. I guess we could do a refcount-type thing when an multi-buf XDP program is first attached (as per above). But I think it may be easier to just do it at load-time, then, so it doesn't have to be in the driver, but the BPF core could just enforce it. This would basically amount to a rule saying "you can't mix mb-aware and non-mb-aware programs", and the first type to be loaded determines which mode the system is in. This would be fairly simple to implement and enforce, I suppose. The drawback is that it's potentially racy in the order programs are loaded... -Toke