Re: [PATCH v13 0/8] Coresight for Kernel panic and watchdog reset

On 04/02/2025 12:02 pm, Linu Cherian wrote:
Hi James,


On 2025-01-24 at 17:38:58, James Clark (james.clark@xxxxxxxxxx) wrote:


On 16/12/2024 5:30 am, Linu Cherian wrote:
This patch series is rebased on coresight-next-v6.12.rc4

* Patches 1 & 2 add support for allocation of trace buffer pages from
    reserved RAM
* Patches 3 & 4 add support for saving metadata at the time of kernel panic
* Patch 5 adds support for reading trace data captured at the time of panic
* Patches 6 & 7 add support for disabling coresight blocks at the time of panic
* Patch 8 gives the full description of this feature as part of documentation

v12 is posted here,
https://lore.kernel.org/linux-arm-kernel/20241129084714.3057080-1-lcherian@xxxxxxxxxxx/

Changelog from v12:
* Fixed wrong buffer pointer passed to coresight_insert_barrier_packet
* tmc_read_prepare/unprepare_crashdata need to be called only once and
    are hence removed from the read path and added to tmc_probe
* tmc_read_prepare_crashdata is renamed to tmc_prepare_crashdata and
    no longer takes locks, as it has moved to the probe function.
* Introduced a read status flag, "reading", specific to the reserved buffer to keep
    reserved buffer reading independent of the regular buffer.
* open/release ops for the reserved buffer only need to set/unset the
    "reading" status flag, as the reserved buffer is prepared
    at probe time itself.
* Few other trivial changes
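
The open/release behaviour described in the changelog above can be pictured with a minimal userspace sketch. This is not the driver code: struct demo_tmc, its field names, and the demo_* functions are hypothetical stand-ins, the error codes are assumptions, and the locking a real driver would need around the flag is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-device state; not the real driver struct. */
struct demo_tmc {
	bool reading;       /* regular trace buffer is being read */
	bool resrv_reading; /* reserved (crash) buffer is being read; name assumed */
	bool crash_valid;   /* assumed to be set once, at probe time */
};

/* open: the buffer was prepared at probe time, so only the flag is set. */
static int demo_crashdata_open(struct demo_tmc *t)
{
	if (!t->crash_valid)
		return -ENOENT; /* nothing was captured at panic time */
	if (t->resrv_reading)
		return -EBUSY;  /* another reader already holds the buffer */
	t->resrv_reading = true;
	return 0;
}

/* release: only the flag is cleared; the buffer itself is left alone. */
static void demo_crashdata_release(struct demo_tmc *t)
{
	t->resrv_reading = false;
}

/* Usage: the reserved-buffer flag is independent of the regular one. */
static void demo_usage(void)
{
	struct demo_tmc t = { .reading = true, .crash_valid = true };

	assert(demo_crashdata_open(&t) == 0);      /* regular read in progress, still OK */
	assert(demo_crashdata_open(&t) == -EBUSY); /* second reader is refused */
	demo_crashdata_release(&t);
	assert(t.reading);                         /* regular flag untouched */
}
```

Keeping a separate flag means a reader of the crash data never blocks, or is blocked by, a reader of the regular buffer.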


Hi Linu,

I tested that decoding a crash dump of ETM1 (trace ID 17) from panic kernel
works:

   $ ./ptm2human -i cstrace.bin

   ...
   There is no valid data in the stream of ID 16
   Decode trace stream of ID 17
   Syncing the trace stream...
   Decoding the trace stream...
   instruction addr at 0x140c9afc, ARM state, secure state,
   ...


Thanks for trying this out.


I noticed that once in the panic kernel, Coresight becomes unusable and the
Perf Coresight tests fail, with no obvious way to reset it other than a cold
boot:

  $ perf record -e cs_etm//u -- true
  $ perf report -D | grep AUX
  ...
  AUX data lost 27 times out of 27!
  ...

I haven't debugged it yet. I thought it might be something to do with the RESRV
buffer mode, but it doesn't look like that should be the case from the code.
Perhaps it's the claim tags and coresight_is_claimed_any() lingering, so it's
not really an issue that's introduced by this change?


Is that problem reproducible without this series applied?

Thanks.
Linu Cherian.




Yes, it looks like it's unrelated. I sent patches to fix the claim tag issue, and there is some other state that needs to be cleared too. But we can do that later.






