On 2021-06-30 19:19, Souradeep Chowdhury wrote:
DCC (Data Capture and Compare) is a DMA engine designed for debugging.
On a system crash or a manual software trigger from the user, the DCC
hardware stores the values at the configured register addresses, which
can then be used for debugging. The DCC driver provides a sysfs
interface for configuring the register addresses. The hardware supports
reading from registers, writing to registers, reading and then writing
to registers, and looping through the values of the same register.
In certain cases a register write has to be executed before the rest of
the registers can be accessed. The user might also want to record how
the value of a register changes over time, which is what the loop
feature is for.
The options mentioned above are exposed to the user through sysfs files
once the driver is probed. The details and usage of these sysfs files
are documented in Documentation/ABI/testing/sysfs-driver-dcc.
As an example, let us consider a couple of debug scenarios where DCC
has proved to be effective:
i) Timestamp related issue
On SC7180, there was a CoreSight timestamp issue where the timestamp
would occasionally be all 0 instead of a proper timestamp value.
Proper timestamp:
Idx:3373; ID:10; I_TIMESTAMP : Timestamp.; Updated val = 0x13004d8f5b7aa; CC=0x9e
Zero timestamp:
Idx:3387; ID:10; I_TIMESTAMP : Timestamp.; Updated val = 0x0; CC=0xa2
This is a non-fatal issue and doesn't need a system reset, but it still
needs to be root-caused and fixed for those who care about CoreSight
ETM traces. Since this is a timestamp issue, we look at the timestamp
related clocks and the like.
We get all the clk register details from the IP documentation and
configure them via the DCC config sysfs node. Before that, we set the
current linked list.
/* Set the current linked list */
echo 3 > /sys/bus/platform/devices/10a2000.dcc/curr_list
/* Program the linked list with the addresses */
echo 0x10c004 > /sys/bus/platform/devices/10a2000.dcc/config
echo 0x10c008 > /sys/bus/platform/devices/10a2000.dcc/config
echo 0x10c00c > /sys/bus/platform/devices/10a2000.dcc/config
echo 0x10c010 > /sys/bus/platform/devices/10a2000.dcc/config
..... and so on for other timestamp related clk registers
/* Alternatively, specify an "addr len" pair; the command below
 * captures 4 words starting at 0x10C004 */
echo 0x10C004 4 > /sys/bus/platform/devices/10a2000.dcc/config
/* Enable DCC */
echo 1 > /sys/bus/platform/devices/10a2000.dcc/enable
/* Run the timestamp test for working case */
/* Send SW trigger */
echo 1 > /sys/bus/platform/devices/10a2000.dcc/trigger
/* Read SRAM */
cat /dev/dcc_sram > dcc_sram1.bin
/* Run the timestamp test for non-working case */
/* Send SW trigger */
echo 1 > /sys/bus/platform/devices/10a2000.dcc/trigger
/* Read SRAM */
cat /dev/dcc_sram > dcc_sram2.bin
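The "addr len" form used above can be pictured as an expansion into consecutive word addresses. The following is only an illustrative sketch, not driver code: the helper name expand_config_entry is hypothetical, and it assumes that one DCC word is 4 bytes, which matches the four addresses programmed individually above (0x10c004 through 0x10c010).

```python
WORD_SIZE = 4  # assumption: one DCC "word" is a 32-bit (4-byte) register


def expand_config_entry(entry, word_size=WORD_SIZE):
    """Expand an "addr [len]" string into the word addresses it covers.

    Hypothetical helper for illustration; the real driver parses the
    sysfs write itself and may treat the fields differently.
    """
    fields = entry.split()
    if not 1 <= len(fields) <= 2:
        raise ValueError(f"expected 'addr [len]', got {entry!r}")
    base = int(fields[0], 16)
    # Assumption: a missing length means a single word.
    count = int(fields[1], 0) if len(fields) == 2 else 1
    if count < 1:
        raise ValueError("len must be at least 1")
    return [base + i * word_size for i in range(count)]


if __name__ == "__main__":
    for addr in expand_config_entry("0x10C004 4"):
        print(hex(addr))
```

Under these assumptions, "0x10C004 4" covers the same four registers as the four separate writes to the config node.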
Get the parser from [1] and check out the latest branch.
/* Parse the SRAM bin */
python dcc_parser.py -s dcc_sram1.bin --v2 -o output/
python dcc_parser.py -s dcc_sram2.bin --v2 -o output/
Sample parsed output of dcc_sram1.bin:
<hwioDump version="1">
<timestamp>03/14/21</timestamp>
<generator>Linux DCC Parser</generator>
<chip name="None" version="None">
<register address="0x0010c004" value="0x80000000" />
<register address="0x0010c008" value="0x00000008" />
<register address="0x0010c00c" value="0x80004220" />
<register address="0x0010c010" value="0x80000000" />
</chip>
<next_ll_offset>next_ll_offset : 0x1c </next_ll_offset>
</hwioDump>
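Once both captures are parsed, the working and non-working dumps can be compared register by register. The sketch below is one possible way to do that with the Python standard library; it is not part of the DCC parser. The function names are hypothetical, and the XML layout is assumed to be the one shown in the sample output above. The file names the parser writes under output/ are not given here, so the sketch takes XML text rather than paths.

```python
import xml.etree.ElementTree as ET

# Sample taken from the parsed output of dcc_sram1.bin shown above.
SAMPLE_GOOD = """<hwioDump version="1">
<timestamp>03/14/21</timestamp>
<generator>Linux DCC Parser</generator>
<chip name="None" version="None">
<register address="0x0010c004" value="0x80000000" />
<register address="0x0010c008" value="0x00000008" />
<register address="0x0010c00c" value="0x80004220" />
<register address="0x0010c010" value="0x80000000" />
</chip>
<next_ll_offset>next_ll_offset : 0x1c </next_ll_offset>
</hwioDump>"""


def load_registers(xml_text):
    """Return {address: value} for every <register> element in a dump."""
    root = ET.fromstring(xml_text)
    return {
        int(reg.get("address"), 16): int(reg.get("value"), 16)
        for reg in root.iter("register")
    }


def diff_registers(good, bad):
    """Return {address: (good_value, bad_value)} for registers that differ.

    A register missing from one capture is reported with None on that side.
    """
    changed = {}
    for addr in sorted(good.keys() | bad.keys()):
        if good.get(addr) != bad.get(addr):
            changed[addr] = (good.get(addr), bad.get(addr))
    return changed


if __name__ == "__main__":
    good = load_registers(SAMPLE_GOOD)
    # Hypothetical non-working capture, made up purely for illustration.
    bad = dict(good)
    bad[0x10C00C] = 0x0
    for addr, (g, b) in diff_registers(good, bad).items():
        print(f"{addr:#010x}: working={g:#010x} failing={b:#010x}")
```

Registers that differ between the two captures are the natural starting point when looking for the clock or control bit behind the zero timestamp.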
ii) NOC register errors
A particular class of functional registers called NOC was reporting
errors while logging values. DCC was used effectively to trace these
errors. The steps followed were similar to the ones mentioned above.
In addition to the NOC registers, a few other dependent registers were
configured in DCC to monitor their values during a crash. A look at the
dependent register values revealed that the crash was caused by a
secured access to one of these dependent registers. All of this
debugging and root-cause analysis was done using DCC.
DCC parser is available at the following open source location:
[1] https://source.codeaurora.org/quic/la/platform/vendor/qcom-opensource/tools/tree/dcc_parser
Changes in v5:
* Fixed the timeout seen while polling the dcc_status register on
  software triggers. Increased the timeout from 100 us to 5000 us so
  that DCC can process larger register sets on software triggers.
Souradeep Chowdhury (4):
  dt-bindings: Added the yaml bindings for DCC
  soc: qcom: dcc:Add driver support for Data Capture and Compare unit(DCC)
  MAINTAINERS: Add the entry for DCC(Data Capture and Compare) driver support
  arm64: dts: qcom: sm8150: Add Data Capture and Compare(DCC) support node
 Documentation/ABI/testing/sysfs-driver-dcc         |  114 ++
 .../devicetree/bindings/arm/msm/qcom,dcc.yaml      |   40 +
 MAINTAINERS                                        |    8 +
 arch/arm64/boot/dts/qcom/sm8150.dtsi               |    6 +
 drivers/soc/qcom/Kconfig                           |    8 +
 drivers/soc/qcom/Makefile                          |    1 +
 drivers/soc/qcom/dcc.c                             | 1534 ++++++++++++++++++++
 7 files changed, 1711 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-dcc
 create mode 100644 Documentation/devicetree/bindings/arm/msm/qcom,dcc.yaml
 create mode 100644 drivers/soc/qcom/dcc.c
Gentle Ping