On 1/23/23, Paul Moore <paul@xxxxxxxxxxxxxx> wrote: > On Fri, Jan 20, 2023 at 7:50 PM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: >> On 1/20/23, Paul Moore <paul@xxxxxxxxxxxxxx> wrote: >> > On Mon, Jan 16, 2023 at 4:21 PM Mateusz Guzik <mjguzik@xxxxxxxxx> >> > wrote: >> >> >> >> access(2) remains commonly used, for example on exec: >> >> access("/etc/ld.so.preload", R_OK) >> >> >> >> or when running gcc: strace -c gcc empty.c >> >> % time seconds usecs/call calls errors syscall >> >> ------ ----------- ----------- --------- --------- ---------------- >> >> 0.00 0.000000 0 42 26 access >> >> >> >> It falls down to do_faccessat without the AT_EACCESS flag, which in >> >> turn >> >> results in allocation of new creds in order to modify fsuid/fsgid and >> >> caps. This is a very expensive process single-threaded and most >> >> notably >> >> multi-threaded, with numerous structures getting refed and unrefed on >> >> imminent new cred destruction. >> >> >> >> Turns out for typical consumers the resulting creds would be identical >> >> and this can be checked upfront, avoiding the hard work. >> >> >> >> An access benchmark plugged into will-it-scale running on Cascade Lake >> >> shows: >> >> test proc before after >> >> access1 1 1310582 2908735 (+121%) # distinct files >> >> access1 24 4716491 63822173 (+1353%) # distinct files >> >> access2 24 2378041 5370335 (+125%) # same file >> > >> > Out of curiosity, do you have any measurements of the impact this >> > patch has on the AT_EACCESS case when the creds do need to be >> > modified? >> >> I could not be arsed to bench that. I'm not saying there is literally 0 >> impact, but it should not be high and the massive win in the case I >> patched imho justifies it. > > That's one way to respond to an honest question asking if you've done > any tests on the other side of the change. I agree the impact should > be less than the advantage you've shown, but sometimes it's nice to > see these things. > So reading this now I do think it was worded in quite a poor manner, so apologies for that. Wording aside, I don't know whether this is just a passing remark or are you genuinely concerned about the other case. If you are, I noted there is an immediately achievable speed up by eliminating the get/put ref cycle on creds coming from override_creds + put_cred to backpedal from it. This should be enough to cover it, but there are cosmetic problems around it I don't want to flame over. Say override_creds_noref gets added doing the usual work, except for get_new_cred. Then override_creds would be: validate_creds(new); get_new_cred((struct cred *)new); override_creds_noref(new); But override_creds_noref would retain validate_creds new/old and the above would repeat it which would preferably be avoided. Not a problem if it is deemed ok to get_new_cred without validate_creds. >> These funcs are literally next to each other, I don't think that is easy >> to miss. I concede a comment in access_override_creds to take a look at >> access_need_override_creds would not hurt, but I don't know if a resend >> to add it is justified. > > Perhaps it's because I have to deal with a fair amount of code getting > changed in one place and not another, but I would think that a comment > would be the least one could do here and would justify a respin. > I'm not going to *insist* on not adding that comment. Would this work for you? diff --git a/fs/open.c b/fs/open.c index 3c068a38044c..756177b94b04 100644 --- a/fs/open.c +++ b/fs/open.c @@ -407,6 +407,11 @@ static const struct cred *access_override_creds(void) if (!override_cred) return NULL; + /* + * XXX access_need_override_creds performs checks in hopes of + * skipping this work. Make sure it stays in sync if making any + * changes here. + */ override_cred->fsuid = override_cred->uid; override_cred->fsgid = override_cred->gid; if not, can you phrase it however you see fit for me to copy-paste? > In my opinion a generalized shallow copy approach has more value than > a one-off solution that has the potential to fall out of sync and > cause a problem in the future (I recognize that you disagree on the > likelihood of this happening). > To reiterate my stance, the posted patch is trivial to reason about and it provides a marked improvement for the most commonly seen case. It comes with some extra branches for the less common case, which I don't consider to be a big deal. >From the quick toor I took around kernel/cred.c I think the cred code is messy and it would be best to sort it out before doing anything fancy. I have no interest in doing the clean up. The shallow copy idea I outlined above looks very simple, but I can't help the feeling there are surprises there, so I'm reluctant to roll with it as is. More importantly I can't show any workload which runs into the other case, thus if someone asks me to justify the complexity I will have nothing, which is mostly why I did not go for it. That said, if powers to be declare this is the way forward, I can spend some time getting it done. > Ultimately it's a call for the VFS folks as they are responsible for > the access() code. > Well let's wait and see. :) -- Mateusz Guzik <mjguzik gmail.com>