On 11/19/2021 9:36 AM, Sai Prakash Ranjan wrote:
Hi Arnd,
On 11/18/2021 8:54 PM, Arnd Bergmann wrote:
On Mon, Nov 15, 2021 at 12:33 PM Sai Prakash Ranjan
<quic_saipraka@xxxxxxxxxxx> wrote:
/*
* Generic IO read/write. These perform native-endian accesses.
*/
-#define __raw_writeb __raw_writeb
-static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
+static inline void arch_raw_writeb(u8 val, volatile void __iomem *addr)
{
asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
}
Wouldn't removing the #define here break the logic in
include/asm-generic/io.h, making it fall back to the
pointer-dereference version for the actual access?
The #defines for these are added in the mmio-instrumented.h header,
which is included in arm64/asm/io.h, so the logic won't break and
fall back to the pointer-dereference version.
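To illustrate the point above, here is a minimal user-space sketch of
the guard pattern used by include/asm-generic/io.h. It is not kernel
code: raw_writeb stands in for __raw_writeb, the u8/__iomem annotations
are dropped, and the two call counters and demo_fallback_skipped() are
hypothetical additions used only to observe which version runs. It
assumes the instrumented header defines the macro before the generic
header is seen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counters, only here to observe which accessor ran. */
static int arch_calls;
static int generic_calls;

/* Stand-in for the arch accessor (the real one issues a strb instruction). */
static inline void arch_raw_writeb(uint8_t val, volatile void *addr)
{
	arch_calls++;
	*(volatile uint8_t *)addr = val;
}

/* Stand-in for the instrumented header: the accessor name becomes a macro. */
#define raw_writeb(v, a) arch_raw_writeb((v), (a))

/* Stand-in for the generic header: fallback only if the name is undefined. */
#ifndef raw_writeb
#define raw_writeb raw_writeb
static inline void raw_writeb(uint8_t val, volatile void *addr)
{
	generic_calls++;
	*(volatile uint8_t *)addr = val;
}
#endif

/* Writes one byte and checks that the arch path, not the fallback, ran. */
static uint8_t demo_fallback_skipped(void)
{
	uint8_t reg = 0;
	int before = arch_calls;

	raw_writeb(0x5a, &reg);
	assert(arch_calls == before + 1 && generic_calls == 0);
	return reg;
}
```

Because the name is already a macro when the #ifndef is evaluated, the
pointer-dereference fallback is never compiled in.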
+#if IS_ENABLED(CONFIG_TRACE_MMIO_ACCESS) && !(defined(__DISABLE_TRACE_MMIO__))
+DECLARE_TRACEPOINT(rwmmio_write);
+DECLARE_TRACEPOINT(rwmmio_read);
+
+void log_write_mmio(const char *width, volatile void __iomem *addr);
+void log_read_mmio(const char *width, const volatile void __iomem *addr);
+
+#define __raw_write(v, a, _l) ({ \
+ volatile void __iomem *_a = (a); \
+ if (tracepoint_enabled(rwmmio_write)) \
+ log_write_mmio(__stringify(write##_l), _a); \
+ arch_raw_write##_l((v), _a); \
+ })
This feels like it's getting too big to be inlined. Have you considered
integrating this with the lib/logic_iomem.c infrastructure instead?
That already provides a way to override MMIO areas, and it lets you do
the logging from a single place rather than having it duplicated in
every single caller. It also provides a way of filtering it based on
the ioremap() call.
Thanks for the suggestion, I will look at logic_iomem.c and see if it
fits our use case.
So I looked at logic_iomem.c, which seems to be useful for emulated IO
for virtio drivers, but our use case just needs to log the MMIO
operations and nothing more, similar to the logging of x86 MSR register
accesses via tracepoints (arch/x86/include/asm/msr-trace.h).
Also, the raw read/write macros in logic_iomem.c go through callbacks,
which seem to be considerably more costly than inlining or a direct
function call, given that they have to be called for every register
read and write, and there are going to be thousands of those in our
case. In their use case, the read and write callbacks are just PCI
config space reads and writes, which may not be called that frequently,
so the latency might not be visible. In our case, I think it would be
visible with such a callback. I know this is a debug feature and not
much performance is expected of it, but that doesn't mean we shouldn't
prefer a debug feature that performs better, right?
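As a rough illustration of the two shapes being compared, here is a
minimal user-space sketch. It does not measure anything and it is not
the logic_iomem API: struct region_ops, the trace_on flag (standing in
for tracepoint_enabled()), log_write() and demo_paths() are all
hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins: a plain flag instead of tracepoint_enabled(). */
static bool trace_on;
static int log_count;

static void log_write(const char *width, volatile void *addr)
{
	(void)width;
	(void)addr;
	log_count++;
}

/* Shape 1: inline check, the logging call happens only when tracing is on. */
static inline void traced_writeb(uint8_t val, volatile void *addr)
{
	if (trace_on)
		log_write("writeb", addr);
	*(volatile uint8_t *)addr = val;
}

/* Shape 2: every access is an indirect call through a per-region ops table. */
struct region_ops {
	void (*write)(volatile void *addr, uint8_t val);
};

static void region_write(volatile void *addr, uint8_t val)
{
	log_write("writeb", addr);
	*(volatile uint8_t *)addr = val;
}

static const struct region_ops demo_ops = { .write = region_write };

static inline void ops_writeb(const struct region_ops *ops, uint8_t val,
			      volatile void *addr)
{
	ops->write(addr, val);
}

/* Exercises both shapes and returns how many log calls were made. */
static int demo_paths(void)
{
	uint8_t reg = 0;
	int before = log_count;

	trace_on = false;
	traced_writeb(1, &reg);		/* tracing off: no call is made */
	assert(reg == 1 && log_count == before);

	trace_on = true;
	traced_writeb(2, &reg);		/* tracing on: one logging call */
	assert(reg == 2 && log_count == before + 1);

	ops_writeb(&demo_ops, 3, &reg);	/* always an indirect call */
	assert(reg == 3 && log_count == before + 2);

	return log_count - before;
}
```

With the first shape, the disabled case costs only the check before the
store, whereas the second shape pays for an indirect call on every
access regardless of whether anything is being logged.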
On the second point, filtering by ioremap isn't very useful for our
use case, since an ioremapped region can have hundreds of registers and
we are interested in the exact register read/write that would cause any
of the issues mentioned in the description of this patchset.
So I feel that the current approach, where we consolidate the
instrumentation in mmio-instrumented.h, is better than adding tracing
to an emulated iomem library.
Thanks,
Sai