On Fri, May 26, 2023 at 10:42:55AM -0700, Dave Hansen wrote:
> > If Intel feels that it's useful to run their own instance, maybe
> > there's some way you can work with Google syzkaller team so you don't
> > have to do that?
>
> I actually don't know why or when Intel started doing this. 0day in
> general runs on a pretty diverse set of systems and I suspect this was
> an attempt to leverage that. Philip, do you know the history here?

Yeah, I think that's at least part of the issue. Looking at some of
the reports, the reported architectures were Tiger Lake and Alder
Lake. According to Pengfei, part of this was to test features that
require newer CPU features, such as CET / Shadow Stack.

Now, I could be wrong, because Intel's CPU naming scheme is too
complex for my tiny brain and makes my head spin. It's really hard to
map the names used for mobile processors to those used by Xeon server
class platforms, but I *think*, if Intel's Product Managers haven't
confused me hopelessly, Google Cloud's C3 VMs, which use Sapphire
Rapids, should have those hardware features which are in Tiger Lake
and Alder Lake, while Google Cloud's N2 VMs, which use Ice Lake
processors, are too old. Can someone confirm if I got that right?

So this might be an issue of Intel submitting the relevant syzkaller
commits that add support for testing Shadow Stack, CET, IOMMUFD, etc.,
where needed, to the upstream syzkaller git repo, and then convincing
the Google syzkaller team to turn up some of the test VMs on the much
more expensive (per CPU/hour) C3 VMs. The former is probably just a
matter of standard open source upstreaming. The latter might be more
complicated, and might require some private negotiations between
companies to address the cost differential and availability of C3 VMs.

The other thing that's probably worth considering here is that
hopefully many of these reports are ones that aren't *actually*
architecture dependent, but for some reason are just results that one
syzkaller instance has found and another syzkaller instance has not
yet found. So perhaps there can be some kind of syzkaller state
export/import scheme so that a report can be transferred from one
syzkaller instance to another. That way, upstream developers would
have a single syzkaller dashboard to pay attention to, and would get
regular information about how often a particular report is getting
triggered. And if the information behind the report can get fed into
the receiving syzkaller instance's fuzzing seed library, it might
improve the test coverage for other kernels that Intel doesn't have
the business case to test (e.g., Android kernels, kernels compiled for
arm64 and RISC-V, etc.).

After all, looking at the report which kicked off this thread ("soft
lockup in __cleanup_mnt"), I don't think this is something that should
be hardware specific; and yet, this report appears not to exist in
Google's syzkaller instance. If we could import the fuzzing seed for
this and similar reports into Google's syzkaller instance, it seems to
me that this would be a Good Thing.

Cheers,

					- Ted