On Sunday, 13 November 2022 08:58:17 PST Borislav Petkov wrote: > * if f/m/s matches, you execute > > if still within the timeout, you return -EAGAIN from > current_batch_store() to tell userspace, take a nap and try again. > > * if the f/m/s doesn't match you return -EINVAL to say, wrong filename, > try the next one. But if it matches but is corrupt or the HW fails to accept it, you also get an error. So you now need to differentiate a failure to load a candidate file and an attempt to load a non-candidate. I'm assuming that the kernel would provide different error conditions for those. But handling those in shell scripting is very difficult: you'd need to start a subshell and parse stderr. It's MUCH easier in C, of course, but incrementing a number is magnitudes easier in C than performing a globbing. Either way, incrementing a number in shell is pretty easy too. The simplest script with just the numbers would be: i=0 while echo $i > /sys/devices/virtual/misc/intel_ifs_0/current_batch; do test_all_cpus done It's four lines and does not need to know about where the scan files are located, how they're named and if some files it may find are not to be used. But I've hidden a lot of complexity in the test_all_cpus shell function, which would be common to either solution of how we specify the batch to be loaded. And this is part of my argument of why it's unlikely people will use their shells to do this. That shell function is easily another 10 lines of scripting, if it's meant to do its job properly. To make that easier, we've developed two tools, one of them the OpenDCDiag tool I linked to, but both just happen to be written in C and C++ instead of shell. > For all Intel employees here on the thread, there's a world outside > Intel and people do not talk (family model stepping) tuples like we do. > All they wanna do is run their damn tests. Indeed they do. I have personally been in contact with the few that will represent over 90% of the deployment of this feature for the next few years. They want this functionality to integrate with their existing health-check scanning methodology. This is where OpenDCDiag comes in, because it does integrate with their workflows, like logging. For another example, it obeys the cpuset that the parent process may have set with sched_setaffinity() or alike tools (taskset(1) or schedtool(8)). I have zero clue how to do that with shell scripting. Which actually means I am the maintainer of the tool that is going to be driving 99% or more of all scans (that's why I was cc'ed in the submission). I am your user. I'm not saying I am the only user. I definitely want to see the best interface so that others could write tools too if they want to. And I don't want there to be a kludge that we need to keep compatibility with for a decade, or to set a bad precedent. But I am giving you the constraints that I need to work under and the kernel interface to support me. I am telling you this is very good right now and your proposal makes it worse for me, not better, for little apparent gain. -- Thiago Macieira - thiago.macieira (AT) intel.com Cloud Software Architect - Intel DCAI Cloud Engineering