On 10/16/20 10:22 AM, Sasha Levin wrote:
On Fri, Oct 16, 2020 at 09:55:25AM -0400, Paul Moore wrote:
On Fri, Oct 16, 2020 at 9:05 AM Daniel Burgener
<dburgener@xxxxxxxxxxxxxxxxxxx> wrote:
Yes, thank you. I will fix up the series with the third commit
included, and add commit ids. Thanks.
Greg and I have different opinions on what is classified as a good
candidate for the -stable trees, but in my opinion this patch series
doesn't qualify. There are a lot of dependencies, it is intertwined
with a lot of code, and the issue that this patchset fixes has been
around for a *long* time. I personally feel the potential wins of
backporting this to -stable do not outweigh the risk.
My understanding is that while the issue Daniel is fixing here has been
around for a while, it's also very real - the reports suggest a failure
rate of 1-2% on boot.
As a point of clarity, I think that the issue occurs much less
frequently on boot than it does with a policy load during ordinary
operation, since there is a much higher volume of userspace policy
manager lookups on a policy_load once the system is up. I think 1-2% is
roughly accurate for what we're seeing in the environment I'm working on
for a policy load during normal steady state operation. I don't have
hard numbers on policy load during boot, but I would expect it to be
quite a bit lower. We have seen it, but it's not the common case we're
seeing.
I do understand your concerns around this series, but given it was just
fixed upstream we don't have a better story than "sit tight for the
next LTS" to tell to users affected by this issue.
Is there a scenario where you'd feel safer with the series? I suspect
that if it doesn't go into upstream stable Daniel will end up carrying
it out of tree anyway, so maybe we can ask Daniel to do targeted
testing for the next week or two and report back?
I believe my team intends to carry this out of tree, yes. If
additional data from that would be helpful, I'd be happy to provide it.
-Daniel