On 2/7/22 07:36, Tvrtko Ursulin wrote:
On 20/01/2022 22:16, Casey Bowman wrote:
In this RFC I would like to ask the community their thoughts
on how we can best handle splitting architecture-specific
calls.
I would like to address the following:
1. How do we want to split architecture calls? Different object files
per platform? Separate function calls within the same object file?
If we are talking about per-platform divergence of significant
functions (significant not necessarily in size, but in how high they
sit in the i915 stack) I agree with Jani that top-level per platform
organisation is not the best choice.
On the other hand I doubt that there should be many, if any, such
functions. In practice I think it should be only low level stuff which
diverges.
I agree. As I said in my reply to Jani, I think that going forward,
#if IS_ENABLED checks for arch-specific code should be used sparingly:
only in the cases where we do have an arch-specific implementation (like
low level calls), rather than for just a 'return null', as in my case.
On a concrete example..
2. How do we address dummy functions? If we have a function call that is
used for one or more platforms, but is not used in another, what should
we do for this case?
... depends on the situation. Sometimes a flavour of "warn on once"
can be okay I guess, but also why not build bug on? Because..
True. Given Jani's and your comments, I'm thinking that when we find a
potential arch-specific function that would just return null or
something similar, we should re-evaluate the function and see if a
different change is warranted there.
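
To make the two options above concrete, here is a minimal userspace
sketch, not code from the patch. Every sketch_ name and the
SKETCH_HAVE_GUEST_QUERY switch are hypothetical; static_assert and a
static flag only loosely stand in for the kernel's BUILD_BUG_ON() and
WARN_ON_ONCE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption: set to 0 to model an architecture with no real implementation. */
#define SKETCH_HAVE_GUEST_QUERY 1

/*
 * Option A: refuse to build. A port that lacks an implementation is caught
 * at compile time rather than at run time.
 */
static_assert(SKETCH_HAVE_GUEST_QUERY,
	      "no run_as_guest-style implementation for this architecture");

/* Counts emitted warnings so the once-only behaviour can be observed. */
static unsigned int sketch_warnings_emitted;

/*
 * Option B: a runtime stub. The first call reports the missing
 * implementation, later calls stay quiet and keep returning false.
 */
static bool sketch_run_as_guest_stub(void)
{
	static bool warned;

	if (!warned) {
		warned = true;
		sketch_warnings_emitted++;
		fprintf(stderr, "sketch: run_as_guest is not implemented here\n");
	}
	return false;
}
```

With option A a missing implementation cannot ship unnoticed. With
option B the driver still builds, but callers get an answer that may
mean nothing on that architecture.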
I've given an example of splitting an architecture call
in my patch with run_as_guest() being split into different
implementations for x86 and arm64 in separate object files, sharing
a single header.
... run_as_guest may be a very tricky example, given it is called from
intel_vtd_active which has a number of callers.
What is the correct behaviour on Arm in this example? Will none of these
call sites run on Arm? Or do you expect the warn on added in this patch
to trigger as a demonstration? If so, what is the plan going forward? We
can take one example and talk about it hypothetically:
./i915_driver.c: drm_printf(p, "iommu: %s\n",
enableddisabled(intel_vtd_active(i915)));
What is the "fix" (refactor) for Arm here? Looks like a new top-level
function is needed which does not carry the intel_vtd_ prefix but
something more generic. That one could then legitimately "warn on
once", while for intel_vtd_active it would be wrong to do so.
Good point, run_as_guest might be a bit more challenging of an example.
In my mind, I was thinking of just simply returning null for arm64 here
as I don't believe arm64 would be making use of iommu for our cases (at
least, in the short term).
I think this example function specifically needs to get reworked, as you
mentioned, in some way, possibly refactoring intel_vtd_active or
something along those lines.
And when I say it is needed.. well perhaps it is not strictly needed
in this case, but in some other cases I think we go back to the
problem I stated some months ago and that is that I suspect use of
intel_vtd_active is overloaded. I think it is currently used to answer
all these questions:
1. Is the IOMMU active, just for information.
2. Is the IOMMU active and we want to counteract the performance hit by
say using huge pages, adjusting the display bandwidth calculations or
whatever. (In which case we also may want to distinguish between
pass-through and translation modes.)
3. Is a potentially buggy IOMMU active and we need to work around it.
Having all of these under one function kind of worked with one IOMMU
implementation, but does it with a different IOMMU?
Which, I mean, leads to the conclusion that this particular function is
a tricky example to answer the questions asked. :)
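
One way to picture un-overloading the three questions is a separate
predicate for each. This is only a sketch: the struct, the enum and all
sketch_ names are hypothetical and do not exist in i915, and it assumes
pass-through mode has no overhead worth compensating for.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of the IOMMU state; not an i915 structure. */
enum sketch_iommu_mode {
	SKETCH_IOMMU_OFF,
	SKETCH_IOMMU_PASSTHROUGH,
	SKETCH_IOMMU_TRANSLATED,
};

struct sketch_iommu_info {
	enum sketch_iommu_mode mode;
	bool has_known_erratum;	/* this IOMMU needs driver workarounds */
};

/* Question 1: is an IOMMU active at all? Purely informational. */
static inline bool sketch_iommu_active(const struct sketch_iommu_info *info)
{
	return info->mode != SKETCH_IOMMU_OFF;
}

/* Question 2: should the driver compensate for translation overhead? */
static inline bool
sketch_iommu_needs_perf_tuning(const struct sketch_iommu_info *info)
{
	return info->mode == SKETCH_IOMMU_TRANSLATED;
}

/* Question 3: is a buggy IOMMU active, so that workarounds apply? */
static inline bool
sketch_iommu_needs_workaround(const struct sketch_iommu_info *info)
{
	return sketch_iommu_active(info) && info->has_known_erratum;
}

/* Usage: a healthy IOMMU in pass-through mode gives three different answers. */
static inline void sketch_iommu_demo(void)
{
	const struct sketch_iommu_info info = {
		.mode = SKETCH_IOMMU_PASSTHROUGH,
		.has_known_erratum = false,
	};

	assert(sketch_iommu_active(&info));
	assert(!sketch_iommu_needs_perf_tuning(&info));
	assert(!sketch_iommu_needs_workaround(&info));
}
```

Each caller then picks the question it actually means, so a different
IOMMU implementation only has to answer each one correctly.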
Another suggestion from Michael (michael.cheng@xxxxxxxxx) involved
using a single object file, a single header, and splitting various
function calls via ifdefs in the header file.
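
For comparison, the single-header variant could look roughly like the
sketch below. It is a userspace approximation under stated
assumptions: the compiler's predefined macros stand in for Kconfig
symbols, sketch_run_as_guest is a hypothetical name, and the CPUID
"hypervisor present" bit (leaf 1, ECX bit 31) stands in for whatever
check the driver really performs.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang helper for the CPUID instruction */

static_assert(sizeof(unsigned int) == 4, "CPUID registers are 32 bits wide");

/* x86 variant: report whether CPUID advertises a hypervisor. */
static inline bool sketch_run_as_guest(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return false;	/* leaf unsupported: assume bare metal */
	return (ecx >> 31) & 1u;
}
#else
/*
 * Any other architecture. This is the kind of bare stub the thread is
 * wary of; a real port would need an actual answer here.
 */
static inline bool sketch_run_as_guest(void)
{
	return false;
}
#endif
```

Callers see one name and one header, at the cost of every new
architecture adding another branch to the same file.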
In principle, mostly what you have outlined sounds acceptable to me,
with the difference that I would not use i915_platform, but for this
particular example something like i915_hypervisor prefix.
Then I would prepare i915 with the same scheme the kernel uses, not just
for source file divergence, but header file as well. That is:
some_source.c:
#include "i915_hypervisor.h"
i915_hypervisor.h:
#include "platform/i915_hypervisor.h"
Then in i915 root you could have:
platforms/x86/include/platform/i915_hypervisor.h
platforms/arm/include/platform/i915_hypervisor.h
And some kbuild stuff to make that work. Is this doable and does it
make sense? Per-platform source files could live in there as well.
Same scheme for i915_clflush would work as well.
I like the idea of getting more specific for the calls here, but I'm
somewhat afraid of obscuring these functions by moving them into their
own files, and of scaling issues if we end up with more and more
arch-specific calls along with more architectures in the future.
This could blow up the driver with what may ultimately be unnecessary
organization for a small number of calls, if we only add a few
platforms and a few different calls.
What do you think?
Regards,
Casey
Regards,
Tvrtko
I would appreciate any input on how we can avoid scaling issues when
including multiple architectures and multiple functions (as the number
of function calls will inevitably increase with more architectures).
Casey Bowman (1):
i915/drm: Split out x86 and arm64 functionality
drivers/gpu/drm/i915/Makefile | 4 +++
drivers/gpu/drm/i915/i915_drv.h | 6 +---
drivers/gpu/drm/i915/i915_platform.h | 16 +++++++++++
drivers/gpu/drm/i915/i915_platform_arm64.c | 33 ++++++++++++++++++++++
drivers/gpu/drm/i915/i915_platform_x86.c | 33 ++++++++++++++++++++++
5 files changed, 87 insertions(+), 5 deletions(-)
create mode 100644 drivers/gpu/drm/i915/i915_platform.h
create mode 100644 drivers/gpu/drm/i915/i915_platform_arm64.c
create mode 100644 drivers/gpu/drm/i915/i915_platform_x86.c