On Thu, May 26, 2022 at 09:59:19PM +0300, Amir Goldstein wrote:
> On Thu, May 26, 2022 at 9:44 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> >
> > On Thu, May 26, 2022 at 10:27:41AM -0700, Darrick J. Wong wrote:
> > > /me looks and sees a large collection of expunge lists, along with
> > > comments about how often failures occur and/or reasons.  Neat!
> > >
> > > Leah mentioned on the ext4 call this morning that she would have found
> > > it helpful to know (before she started working on 5.15 backports) which
> > > tests were of the flaky variety so that she could better prioritize the
> > > time she had to look into fstests failures.  (IOW: saw a test fail a
> > > small percentage of the time and then burned a lot of machine time only
> > > to figure out that 5.15.0 also failed a percentage of the time).
> >
> > See my proposal to try to make this easier to parse:
> >
> > https://lore.kernel.org/all/YoW0ZC+zM27Pi0Us@xxxxxxxxxxxxxxxxxxxxxx/
> >
> > > We talked about where it would be most useful for maintainers and QA
> > > people to store their historical pass/fail data, before settling on
> > > "somewhere public where everyone can review their colleagues' notes" and
> > > "somewhere minimizing commit friction".  At the time, we were thinking
> > > about having people contribute their notes directly to the fstests
> > > source code, but I guess Luis has been doing that in the kdevops repo
> > > for a few years now.
> > >
> > > So, maybe there?
> >
> > For now sure, I'm happy to add others to the linux-kdevops org on github
> > and they get immediate write access to the repo. This is working well
> > so far. Long term we need to decide if we want to spin off the
> > expunge list as a separate effort and make it a git subtree (note
> > it is different than a git submodule).
> > Another example of a use case
> > for a git subtree, to use it as an example, is the way I forked
> > kconfig from Linux into a standalone git tree so to allow any project
> > to bump kconfig code with just one command. So different projects
> > don't need to fork kconfig as they do today.
> >
> > The value in doing the git subtree for expunges is any runner can use
> > it. I did design kdevops though to run on *any* cloud, and support
> > local virtualization tech like libvirt and virtualbox.
> >
> > The linux-kdevops git org also has other projects which both fstests
> > and blktests depend on, so for example dbench which I had forked and
> > cleaned up a while ago. It may make sense to share keeping oddball
> > efforts like these which are no longer maintained in this repo.
> >
> > There is other tech I'm evaluating for this sort of collaborative test
> > efforts such as ledgers, but that is in its infancy at this point in
> > time. I have a sense though it may be a good outlet for collection of
> > test artifacts in a decentralized way and also allow *any* entity to
> > participate in bringing confidence to stable kernel branches or dev
> > branches prior to release.
> >
>
> There are a few problems I noticed with the current workflow.
>
> 1. It will not scale to maintain this in git as more and more testers
> start using kdevops and adding more and more fs and configs and distros.

You say that but do not explain why you think this is the case. Quite
the contrary, I don't think so, and I'll explain why.

Let's just stick to the expunge list, as that is what matters in this
context. The expunge list is already divided by directory per target
kernel when using upstream kernels. So this applies to any stable
kernel, vanilla kernels, or linux-next. Folks working on these kernels
would very likely be collaborating just as you and I have. Distro
kernels also have their own directory, and so they'd very likely
collaborate.

> How many more developers you want to give push access to linux-kdevops?

Only those really collaborating; the idea is not to give access to the
world here. The challenge I'm thinking about for the future though is
how to scale this beyond just those few in a meaningful way, so that
the scope of evaluation is not limited to whatever resources these
folks have. That is a research question and beyond the scope of just
using git in a shared linux repo.

> I don't know how test labs report to KernelCI, but we need to look at their
> model and not invent the wheel.

I looked at it and, to say the least, I was not in any way, shape or
form drawn to it or what it was using. You are free to look at it too.
The distributed aspect is what I don't agree with, and it is why I am
evaluating alternative decentralized technologies for the future.

It relies on LAVA, the Linaro Automated Validation Architecture. The
LAVA project home page [0] mentions "LAVA is an automated validation
architecture primarily aimed at testing deployments of systems based
around the Linux kernel on ARM devices, specifically ARMv7 and later".
The SOC [1] page however now lists x86, but it is not the main focus of
the project. You can add a new test lab and add new tests; these tests
are intended to be public. If running tests for private consumption
you’d have to set up your own backend and front end. All this and the
experience with the results page were enough for me to decide this
wasn't an immediate good fit for automation for fstests and blktests
when I started considering this for enterprise Linux.

[0] https://git.lavasoftware.org
[1] https://linux.kernelci.org/soc/

It does not mean one cannot use a centralized methodology to share
an expunge list / artifacts, etc. for fstests or blktests.
A shared expunge set on the linux-kdevops organization is a *simple*
centralized way to do that to start off with, and if you limit access
to folks who collaborate on one directory (as you kind of do in Linux
development with maintainers) you avoid merge conflicts. We're not at a
point yet where we're going to have 100 folks who want access to, say,
the v5.10.y directory for expunges for XFS... it's just you and me
right now. Likewise for other filesystems it would be similar.

But from a research perspective it does invite one to consider how to
scale this in a sensible way beyond those. When I looked at kernelci, I
didn't personally think that was an optimal way to scale, but that is
beyond the scope of the simple ramp up we're still discussing.

> 2. kdevops is very focused on stabilizing the baseline fast,

Although it does help with this, I still think there are small efforts
that could help automate this further in the future. A runner should be
able to spin this off without intervention if possible. Today, a
failure requires manual verification, adding the new failure to an
expunge list, etc. We can do better, and the goal is to slowly automate
each of those menial tasks which today we do manually. Building a
completely new baseline without manual intervention is, I think,
possible, and we should strive towards it slowly and carefully.

> which is
> very good, but there is no good process of getting a test out of expunge list.

Yes, *part* of this involves a nice atomic task which can be dedicated
to a runner. So this goal alone needs to be broken up into parts:

a) Is this test still failing? --> easily automated today
b) How can we keep this from failing? --> not easily automated today

As for a), a simple dedicated guest could for example evaluate a target
kernel on a filesystem configuration and run through each expunge and
*verify* it is indeed still failing.
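
To make a) a bit more concrete, here is a rough Python sketch of what
such a re-check task might look like. It is only an illustration and
not how kdevops does it: the expunge file format (one test name per
line, optional '#' comment), the helper names, and the sample entries
are all assumptions, and run_test() is a stand-in for whatever actually
runs one fstests test on the guest.

```python
from typing import Callable, Dict, List


def parse_expunges(text: str) -> List[str]:
    """Return test names from an expunge list.

    Assumed format: one test per line, '#' starts a comment.
    """
    tests = []
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            tests.append(name)
    return tests


def recheck(tests: List[str],
            run_test: Callable[[str], bool],
            runs: int = 100) -> Dict[str, int]:
    """Run each expunged test `runs` times and count its failures.

    run_test(name) is hypothetical: it returns True when the test passes.
    """
    return {t: sum(1 for _ in range(runs) if not run_test(t)) for t in tests}


def removal_candidates(failures: Dict[str, int]) -> List[str]:
    """Tests that never failed during the re-check, for a human to review."""
    return sorted(t for t, n in failures.items() if n == 0)


def make_fake_runner(fail_every: Dict[str, int]) -> Callable[[str], bool]:
    """Deterministic stand-in runner: test `t` fails on every Nth call."""
    calls: Dict[str, int] = {}

    def run_test(name: str) -> bool:
        calls[name] = calls.get(name, 0) + 1
        period = fail_every.get(name)
        return not (period and calls[name] % period == 0)

    return run_test


if __name__ == "__main__":
    # Made-up sample entries, only to exercise the helpers.
    sample = "generic/475 # flaky\n\n# comment only\nxfs/033\n"
    runner = make_fake_runner({"generic/475": 10})
    failures = recheck(parse_expunges(sample), runner, runs=100)
    print(failures)
    print("candidates to drop:", removal_candidates(failures))
```

The number of runs is the knob that matters here: a test that fails
rarely needs many clean runs before anyone should believe it is fixed,
so the output is a list of candidates to look at, not an automatic
removal.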
If a test is no longer failing, and there is high confidence that this
is the case (say it was verified many times that it does not fail),
then clearly the issue may have been fixed (say by a stable kernel
update) and the task can inform us of that.

Task b), by contrast, requires years and years of work.

> We have a very strong suspicion that some of the tests that we put in
> expunge lists failed due to some setup issue in the host OS that caused
> NVME IO errors in the guests.

We already know that qemu with qcow2 nvme files does incur some delays
when doing full swing drive discards, and this can cause some of these
nvme IO errors (timeouts). We are now also aware that the odds of this
timeout happening twice are low, but it *is* possible. We *also* now
know that when two consecutive nvme timeouts happen due to this, it can
*somehow* trigger an RCU false positive for blktests in some corner
cases when testing ZNS [0]. This was *what* made us realize that this
was a qemu issue, and the qemu nvme maintainer has noted that this
needs to be fixed in qemu.

[0] https://lkml.kernel.org/r/YliZ9M6QWISXvhAJ@xxxxxxxxxxxxxxxxxxxxxx

But these sorts of qemu bugs should not cause filesystem issues. We
also already know that this is a qemu bug and that it will be fixed in
the long term. Upon review with the qemu nvme maintainer, the way
kdevops uses nvme is not incorrect. Yes, we can switch to raw format to
skip the suboptimal way to do discards, but we *want* to find more
bugs, not fewer. We could simply make a new Kconfig entry on kdevops to
enable users to use raw files for the nvme drives, for those that want
to opt out of these timeouts for now.

> I tried to put that into comments when
> I noticed that, but I am afraid there may have been other tests that are
> falsely accused of failing.

There are two things we should consider in light of this:

c) We do need semantics for common exceptions to failures
d) We need an appreciation for why some of these exceptions may be
   real odd issues and it may take time to either fix them or to
   acknowledge they are a non-issue somehow.

As for c), I had proposed a way to annotate failure rate; perhaps we
need a way to annotate these odd issues as well.

In my talk at LSFMM I mentioned how 25% of the time *alone* on the test
automation effort consists of dealing with low hanging fruit. Since
companies are now trying to dedicate some resources towards stable
filesystem efforts, it may be worthwhile for them to consider this so
that they are aware that some of these oddball issues may end up with
them lurking in odd corners. At LSFMM I gave one example, on the
blktests front alone, which took 8 months to root cause.

> All developers make those mistakes in their
> own expunge lists, but if we start propagating those mistakes to the world,
> it becomes an issue.

Agreed, but note that the conversation is shifting from not sharing
expunges to possibly sharing some notion of expunges *somehow*. That is
a small step forward. I agree we need to address these semantics issues
and they are important, but without the will to share expunges there
would have been no point in addressing some of these common pain points.

> For those two reasons I think that the model to aspire to should be
> composed of a database where absolutely everyone can post data
> point to in the form of facts (i.e. the test failed after N runs on this kernel
> and this hardware...) and another process, partly AI, partly human to
> digest all those facts into a knowledge base that is valuable and
> maintained by experts. Much easier said than done...

Anything is possible, sure. A centralized database is one way to go
about some of these things. I'm however suspicious that perhaps there
is a better way, and am still evaluating a ledger as a way to scale
test results.

Both paths can be taken, in fact. One does not negate the other.

*For now*... I do think those who *are* collaborating on expunges can
share a simple repo, as we have been doing for a few months. The need
for scaling has to be addressed, but for the long-term growth of the
endeavour.

  Luis