On Thu, Jan 11, 2024 at 8:11 PM Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
>
> On Thu, Jan 11, 2024 at 09:47:26PM +0000, Mark Brown wrote:
> > On Thu, Jan 11, 2024 at 12:38:57PM -0500, Kent Overstreet wrote:
> > > On Thu, Jan 11, 2024 at 03:35:40PM +0000, Mark Brown wrote:
> > > > IME the actually running the tests bit isn't usually *so* much the
> > > > issue, someone making a new test runner and/or output format does mean a
> > > > bit of work integrating it into infrastructure but that's more usually
> > > > annoying than a blocker.
> > >
> > > No, the proliferation of test runners, test output formats, CI systems,
> > > etc. really is an issue; it means we can't have one common driver that
> > > anyone can run from the command line, and instead there's a bunch of
> > > disparate systems with patchwork integration and all the feedback is nag
> > > emails - after you've finished what you were working on instead of
> > > moving on to the next thing - with no way to get immediate feedback.
> >
> > It's certainly an issue and it's much better if people do manage to fit
> > their tests into some existing thing but I'm not convinced that's the
> > big reason why you have a bunch of different systems running separately
> > and doing different things. For example the enterprise vendors will
> > naturally tend to have a bunch of server systems in their labs and focus
> > on their testing needs, while I know the Intel audio CI setup has a bunch
> > of laptops, laptop-like dev boards and things in there with loopback
> > audio cables and I think test equipment plugged in, and focuses rather
> > more on audio. My own lab is built around systems I can be in the
> > same room as without getting too annoyed and does things I find useful,
> > plus using spare bandwidth for KernelCI because they can take donated
> > lab time.
>
> No, you're overthinking.
>
> The vast majority of kernel testing requires no special hardware, just a
> virtual machine.
>
> There is _no fucking reason_ we shouldn't be able to run tests on our
> own local machines - _local_ machines, not waiting for the Intel CI
> setup and asking for a git branch to be tested, not waiting for who
> knows how long for the CI farm to get to it - just run the damn tests
> immediately and get immediate feedback.
>
> You guys are overthinking and overengineering and ignoring the basics,
> the way enterprise people always do.

As one of those former enterprise people who actually did do this stuff,
I can say that even when I was "in the enterprise", I tried to avoid
overthinking and overengineering stuff like this. :)

Nobody can maintain anything that's so complicated nobody can run the
tests on their own machine. That is the root of all sadness.

> > > And it's because building something shiny and new is the fun part, no
> > > one wants to do the grungy integration work.
> >
> > I think you may be overestimating people's enthusiasm for writing test
> > stuff there! There is NIH stuff going on for sure, but a lot of the time
> > when you look at something where people have gone off and done their own
> > thing it's either much older than you initially thought and predates
> > anything they might've integrated with, or there's some reason why none
> > of the existing systems fit well. Anecdotally it seems much more common
> > to see people looking for things to reuse in order to save time than it
> > is to see people going off and reinventing the world.
>
> It's a basic lack of leadership. Yes, the younger engineers are always
> going to be doing the new and shiny, and always going to want to build
> something new instead of finishing off the tests or integrating with
> something existing. Which is why we're supposed to have managers saying
> "ok, what do I need to prioritize for my team to be able to develop
> effectively".
> > > > > example tests, example output:
> > > > > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest
> > > > > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing
> > > >
> > > > For example looking at the sample test there it looks like it needs,
> > > > among other things, mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm,
> > > > rsync
> > >
> > > Getting all that set up by the end user is one command:
> > >     ktest/root_image create
> > > and running a test is one more command:
> > >     build-test-kernel run ~/ktest/tests/bcachefs/single_device.ktest
> >
> > That does assume that you're building and running everything directly on
> > the system under test and are happy to have the test in a VM, which isn't
> > an assumption that holds universally, and also that whoever's doing the
> > testing doesn't want to do something like use their own distro or
> > something - like I say, none of it looks too unreasonable for
> > filesystems.
>
> No, I'm doing it that way because technically that's the simplest way to
> do it.
>
> All you guys building crazy contraptions for running tests on Google
> Cloud or Amazon or whatever - you're building technical workarounds for
> broken procurement.
>
> Just requisition the damn machines.

Running in the cloud does not mean it has to be complicated. It can be a
simple Buildbot or whatever that knows how to spawn spot instances for
tests and destroy them when they're done *if the test passed*. If a test
failed on an instance, it could hold onto the instance for a day or two
for someone to debug if needed.

(I mention Buildbot because in a previous life, I used it to run tests
for the dattobd out-of-tree kernel module; that was the strategy I used
there.)

> > Some will be, some will have more demanding requirements especially when
> > you want to test on actual hardware rather than in a VM.
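For what it's worth, that pass/fail teardown policy fits in a few lines of shell. This is only a sketch of the idea, not anything from ktest or Buildbot: `provision_instance`, `destroy_instance`, and `schedule_teardown` are hypothetical stubs standing in for whatever your cloud CLI or Buildbot latent-worker hooks actually provide.

```shell
#!/bin/sh
# Sketch of the spot-instance lifecycle described above: provision,
# run the test, destroy the instance immediately on pass, but keep a
# failed instance around for a couple of days so someone can debug.
# All three helpers below are stubs for illustration only.

provision_instance() { echo "i-$(date +%s)"; }       # stub: would call the cloud API
destroy_instance()   { echo "destroyed $1"; }        # stub: immediate teardown
schedule_teardown()  { echo "keeping $1 for $2h"; }  # stub: delayed teardown for debugging

run_test_on_instance() {
    instance=$1; shift
    if "$@"; then                        # run the actual test command
        destroy_instance "$instance"     # passed: reclaim the instance now
        return 0
    else
        schedule_teardown "$instance" 48 # failed: hold for ~2 days to debug
        return 1
    fi
}

instance=$(provision_instance)
run_test_on_instance "$instance" true && echo "test passed, instance reclaimed"
```

The only real logic is the branch in `run_test_on_instance`; everything else is plumbing your CI system already has.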
> > For example with my own test setup, which is more focused on hardware,
> > the operating costs aren't such a big deal but I've got boards that are
> > for various reasons irreplaceable, often single instances of boards
> > (which makes scheduling a thing), and for some of the tests I'd like to
> > get around to setting up I need special physical setup. Some of the
> > hardware I'd like to cover is only available in machines which are in
> > various respects annoying to automate; I've got a couple of unused
> > systems waiting for me to have sufficient bandwidth to work out how to
> > automate them. Either way I don't think the costs are trivial enough to
> > be completely handwaved away.
>
> That does complicate things.
>
> I'd also really like to get automated performance testing going too,
> which would have similar requirements in that jobs would need to be
> scheduled on specific dedicated machines. I think what you're doing
> could still build off of some common infrastructure.
>
> > I'd also note that the 9-hour turnaround time for that test set you're
> > pointing at isn't exactly what I'd associate with immediate feedback.
>
> My CI shards at the subtest level, and like I mentioned I run 10 VMs per
> physical machine, so with just 2 of the 80-core Ampere boxes I get full
> test runs done in ~20 minutes.

This design, ironically, is way more cloud-friendly than a lot of testing
system designs I've seen in the past. :)

-- 
真実はいつも一つ!/ Always, there's only one truth!