On 8/25/2023 2:06 AM, Greg KH wrote:
> Better! But please, look into using the tracing infrastructure and
> functionality first. Only if that is somehow not workable at all should
> we regress into using stuff like debugfs for this.
>
> thanks,
>
> greg k-h

I took some timing measurements to compare the overhead of the different
tracing techniques. The measurements were taken by executing the rdtsc
instruction at the entry and exit of the serial_in() and serial_out()
functions in 'struct uart_port'. My test environment is a Celeron M
clocked at 1 GHz, so each clock cycle is 1 nanosecond. Also note that the
rdtsc instruction itself takes 43 clock cycles, so that should be
subtracted from each measurement to get the actual time. Each number
below is the average of 100 measurements.

1183 cycles using io_serial_in (no tracing)
1192 cycles using io_serial_out (no tracing)
1382 cycles using serial_in_wrapper and rdtsc (this patch)
1564 cycles using serial_out_wrapper and rdtsc (this patch)
1484 cycles using serial_in_wrapper and ktime_get_boottime
1980 cycles using serial_out_wrapper and ktime_get_boottime
2484 cycles using serial_in_wrapper and trace_portio_read
4411 cycles using serial_out_wrapper and trace_portio_write

The last two measurements used the existing kernel tracing
infrastructure: TRACE_EVENT() macros were created, and tracepoints were
used to generate trace events that could be collected from
/sys/kernel/tracing/trace.

My first observation is that the I/O instructions themselves (inb/outb)
are fairly slow on this architecture, so even with no tracing each
operation takes around 1200 nanoseconds. The tracing technique used in
this patch adds between 200 and 370 nanoseconds of overhead to that.
Swapping ktime_get_boottime() in for rdtsc() adds roughly another 100 to
400 nanoseconds. Using the existing kernel trace infrastructure incurs
1300 to 3200 nanoseconds of overhead.

Memory usage was difficult to determine exactly.
The technique in this patch uses fixed-length records of 8 bytes each (6 bytes for the timestamp, 1 byte for the register offset, and 1 byte for the register data). The memory used by the kernel trace infrastructure, by contrast, varied with how many records were collected. When I set the buffer size to 64 kilobytes (by writing 64 to /sys/kernel/tracing/buffer_size_kb) I was able to collect 4062 records, which averages to a little over 16 bytes per record.