On 11/02/2022 11:55, Jani Nikula wrote:
On Thu, 10 Feb 2022, Casey Bowman <casey.g.bowman@xxxxxxxxx> wrote:
In this RFC I would like to ask the community their thoughts
on how we can best handle splitting architecture-specific
calls.
I would like to address the following:
1. How do we want to split architecture calls? Different object files
per platform? Separate function calls within the same object file?
2. How do we address dummy functions? If we have a function call that is
used for one or more platforms, but is not used in another, what should
we do for this case?
I've given an example of splitting an architecture call
in my patch with run_as_guest() being split into different
implementations for x86 and arm64 in separate object files, sharing
a single header.
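For illustration, the separate-object-file split described above might look roughly like the sketch below. It is a userspace approximation, not the code from the patch: the file names in the comments, the query_hypervisor() helper, and the arm64 behaviour of always returning false are assumptions made so that the example builds on its own. In a real tree the two definitions would live in different .c files selected by the build system rather than by a preprocessor check.

```c
#include <stdbool.h>

/* Shared header (hypothetical name: i915_arch.h): one prototype for all architectures. */
bool run_as_guest(void);

#if defined(__x86_64__) || defined(__i386__)
/*
 * x86 object file (hypothetical location).
 * query_hypervisor() is a made-up stand-in for the kernel's real
 * hypervisor detection, so that this sketch builds in userspace.
 */
static bool query_hypervisor(void)
{
	return false; /* pretend we are running on bare metal */
}

bool run_as_guest(void)
{
	return query_hypervisor();
}
#else
/* arm64 (or other) object file: assumed here to never report a guest. */
bool run_as_guest(void)
{
	return false;
}
#endif
```

Callers include only the shared header and never see which implementation was linked in.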
Another suggestion from Michael (michael.cheng@xxxxxxxxx) involved
using a single object file, a single header, and splitting various
function calls via ifdefs in the header file.
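A minimal sketch of this single-header alternative, under the same assumptions as before (query_hypervisor() is a hypothetical stand-in, and the non-x86 variant is assumed to be a dummy that returns false):

```c
#include <stdbool.h>

/* Hypothetical single header: each architecture picks its variant at compile time. */
#if defined(__x86_64__) || defined(__i386__)
/* Made-up stand-in for the kernel's real hypervisor detection. */
static inline bool query_hypervisor(void)
{
	return false; /* pretend we are running on bare metal */
}

static inline bool run_as_guest(void)
{
	return query_hypervisor();
}
#else
/* Dummy for architectures where the question does not apply. */
static inline bool run_as_guest(void)
{
	return false;
}
#endif
```

This keeps everything in one place, at the cost of one ifdef block per function per architecture as the list grows.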
I would appreciate any input on how we can avoid scaling issues when
including multiple architectures and multiple functions (as the number
of function calls will inevitably increase with more architectures).
v2: Revised to use kernel's platform-splitting scheme.
I think this is overengineering.
Just add different implementations of the functions per architecture
next to where they are now, like I suggested before.
If we need to split them better later, it'll be a trivial undertaking,
and we'll be in a better position to do it because we'll know how many
functions there'll be and where they are and what they do.
Adding a bunch of overhead from the start seems like the wrong thing to
do.
I don't see that it adds real complexity, which would normally be associated
with over-engineering. As a benefit I see it helping with driving the
clean re-design (during the porting effort) in a way that it will be
easy to spot if something is overly hacky, split on the wrong level, or
incorrectly placed.
And it moves run_as_guest outside of intel_vtd.[hc] which IMO shows
immediate benefit, since it has nothing to do with intel_vtd.
I suggested adding clflush as well, since I think going for
drm_clflush_virt_range everywhere is a bit lazy given how it is a clear
regression for older platforms.
But after that I indeed don't have a crystal ball to show me how many
more low-level primitives would be appropriate to use the pattern.
So my vote would be to go with it, although the main thing is probably
to solve the conflicting asks and let guys focus on the port. Put it to
voting then? :)
Regards,
Tvrtko