On 10/03, John Fastabend wrote:
> Andrii Nakryiko wrote:
> > On Thu, Oct 3, 2019 at 9:01 AM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
> > >
> > > On 10/02, Andrii Nakryiko wrote:
> > > > On Wed, Oct 2, 2019 at 6:43 PM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
> > > > >
> > > > > On 10/02, Andrii Nakryiko wrote:
> > > > > > On Wed, Oct 2, 2019 at 10:35 AM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > Always use init_net flow dissector BPF program if it's attached and fall
> > > > > > > back to the per-net namespace one. Also, deny installing new programs if
> > > > > > > there is already one attached to the root namespace.
> > > > > > > Users can still detach their BPF programs, but can't attach any
> > > > > > > new ones (-EPERM).
> > > > > > >
> > > > I find this quite confusing for users, honestly. If there is no root
> > > > namespace dissector we'll successfully attach per-net ones and they
> > > > will be working fine. Then some process will attach a root one and all
> > > > the previously successfully working ones will suddenly "break", with
> > > > users potentially not realizing why. I bet this will be a hair-pulling
> > > > investigation for someone. Furthermore, if a root net dissector is
> > > > already attached, all subsequent attachments will now start failing.
> > > The idea is that if sysadmin decides to use system-wide dissector it would
> > > be attached from the init scripts/systemd early in the boot process.
> > > So the users in your example would always get EPERM/EBUSY/EEXIST.
> > > I don't really see a realistic use-case where root and non-root
> > > namespaces attach/detach flow dissector programs at non-boot
> > > time (or why non-root containers could have BPF dissector and root
> > > could have C dissector; multi-nic machine?).
> > >
> > > But I totally see your point about confusion. See below.
> > >
> > > > I'm not sure what the better behavior here is, but maybe at least
> > > > forcibly detach already attached ones, so when someone goes and tries
> > > > to investigate, they will see that their BPF program is not attached
> > > > anymore. Printing dmesg warning would be hugely useful here as well.
> > > We can do for_each_net and detach non-root ones; that sounds
> > > feasible and may avoid the confusion (at least when you query
> > > non-root ns to see if the prog is still there, you get a valid
> > > indication that it's not).
> > >
> > > > Alternatively, if there is any per-net dissector attached, we might
> > > > disallow root net dissector to be installed. Sort of "too late to the
> > > > party" way, but at least not surprising to successfully installed
> > > > dissectors.
> > > We can do this as well.
> > >
> > > > Thoughts?
> > > Let me try to implement both of your suggestions and see which one makes
> > > more sense. I'm leaning towards the latter (simple check to see if
> > > any non-root ns has the prog attached).
> > >
> > > I'll follow up with a v2 if all goes well.
> >
> > Thanks! I don't have strong opinion on either, see what makes most
> > sense from an actual user perspective.
> >
> From my point of view the second option is better. The root namespace flow
> dissector attach should always happen first before any other namespaces are
> created. If any namespaces have already attached then just fail the root
> namespace.
>
> Otherwise if you detach existing dissectors from a container these were
> probably attached by the init container which might not be running anymore
> and I have no easy way to learn/find out about this without creating another
> container specifically to watch for this. If I'm relying on the dissector
> for something now I can get seemingly random errors. So it's a bit ugly and
> I'll probably just tell users to always attach the root namespace first to
> avoid this headache.
> On the other hand, if the root namespace already has a
> flow dissector attached and my init container fails its attach cmd I
> can handle the error gracefully or even fail to launch the container with
> a nice error message and the administrator can figure something out.
> I'm always in favor of hard errors vs trying to guess what the right
> choice is for any particular setup.
>
> Also it seems to me just checking if anything is attached is going to make
> the code simpler vs trying to detach things in all namespaces.
Agreed, I was also leaning towards this option. Thanks!
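
For reference, the option agreed on above (fail the root attach if any
non-root namespace already has a program, fail non-root attaches once the
root has one, and prefer the root program at dissect time) can be sketched
as a small userspace model. This is only an illustration of the policy as
discussed in this thread, not the kernel patch: ns_table, model_attach(),
model_detach() and model_lookup() are hypothetical names, the fixed-size
table stands in for the real list of namespaces, and the -EEXIST return
value is an assumption, since the thread did not settle on an error code.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_NS  4   /* hypothetical fixed number of namespaces in the model */
#define ROOT_NS 0   /* slot 0 stands in for init_net */

/* One slot per namespace; prog is NULL when nothing is attached. */
struct netns_model {
	const char *prog;
};

static struct netns_model ns_table[MAX_NS];

/* Attach with hard errors: never silently detach someone else's program. */
static int model_attach(int ns, const char *prog)
{
	int i;

	if (ns < 0 || ns >= MAX_NS || !prog)
		return -EINVAL;
	if (ns_table[ns].prog)
		return -EEXIST;		/* this namespace already has one */

	if (ns == ROOT_NS) {
		/* "Too late to the party": refuse the root attach if any
		 * non-root namespace already has a program. */
		for (i = 1; i < MAX_NS; i++)
			if (ns_table[i].prog)
				return -EEXIST;	/* assumed error code */
	} else if (ns_table[ROOT_NS].prog) {
		/* Root program is system-wide; deny per-net attaches. */
		return -EEXIST;			/* assumed error code */
	}

	ns_table[ns].prog = prog;
	return 0;
}

/* Detach is always allowed for the namespace's own program. */
static int model_detach(int ns)
{
	if (ns < 0 || ns >= MAX_NS)
		return -EINVAL;
	if (!ns_table[ns].prog)
		return -ENOENT;
	ns_table[ns].prog = NULL;
	return 0;
}

/* Dissect-time selection: root program first, else the per-net one. */
static const char *model_lookup(int ns)
{
	if (ns < 0 || ns >= MAX_NS)
		return NULL;
	if (ns_table[ROOT_NS].prog)
		return ns_table[ROOT_NS].prog;
	return ns_table[ns].prog;
}
```

With this model, attaching in namespace 2 and then in ROOT_NS makes the
second call fail, and the program in namespace 2 keeps working; attaching
in ROOT_NS first makes every later per-net attach fail with a hard error
that an init container can report.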