On Thu, Mar 30, 2023, at 20:30, Evan Green wrote:
> On Thu, Feb 23, 2023 at 2:06 AM Arnd Bergmann <arnd@xxxxxxxx> wrote:
>> > +    long sys_riscv_hwprobe(struct riscv_hwprobe *pairs, size_t pair_count,
>> > +                           size_t cpu_count, cpu_set_t *cpus,
>> > +                           unsigned long flags);
>>
>> The cpu set argument worries me more: there should never be a
>> need to optimize for broken hardware that has an asymmetric set
>> of features. Just let the kernel figure out the minimum set
>> of features that works across all CPUs and report that like we
>> do with HWCAP. If there is a SoC that is so broken that it has
>> important features on a subset of cores that some user might
>> actually want to rely on, then have them go through the slow
>> sysfs interface for probing the CPUs individually, but don't make
>> the broken case easier at the expense of normal users that
>> run on working hardware.
>
> I'm not so sure. While I agree with you for major classes of features
> (eg one CPU has floating point support but another does not), I expect
> these bits to contain more subtle details as well, which might vary
> across asymmetric implementations without breaking ABI compatibility
> per-se. Maybe some vendor has implemented exotic video decoding
> acceleration instructions that only work on the big core. Or maybe the
> big cores support v3.1 of some extension (where certain things run
> faster), but the little cores only have v3.0, where it's a little
> slower. Certain apps would likely want to know these things so they
> can allocate their work optimally across cores.

Do you have a specific feature in mind where hardware would be
intentionally designed this way? I still can't come up with a
scenario where this would actually work in practice, as having
asymmetric features is incompatible with so many other things
we normally do.

- In a virtual machine, the VCPU tends to get scheduled arbitrarily
  to physical CPUs, so setting affinity in a guest won't actually
  guarantee that the feature is still there.

- Using a CPU feature from library code is practically impossible
  if it requires special CPU affinity, as the application may
  already be started on specific CPUs for another reason, and
  having a library call sched_setaffinity will conflict with those.

- Even in the simplest case of a standalone application without
  any shared libraries, trying to pick a sensible CPU to run on
  is hard to do in a generic way, as it would need to weigh
  availability of features on certain cores against the number
  of cores with or without the feature and their current and
  expected system load.

As long as there isn't a specific requirement, I think it's better
to not actually encourage hardware vendors to implement designs
like that, or at least to not design an interface that makes getting
this information a few microseconds faster than what already exists.

      Arnd