Hi James,

> -----Original Message-----
> From: Linu Cherian <lcherian@xxxxxxxxxxx>
> Sent: Tuesday, October 10, 2023 6:53 PM
> To: James Clark <james.clark@xxxxxxx>; suzuki.poulose@xxxxxxx;
> mike.leach@xxxxxxxxxx; leo.yan@xxxxxxxxxx
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; coresight@xxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; robh+dt@xxxxxxxxxx;
> krzysztof.kozlowski+dt@xxxxxxxxxx; conor+dt@xxxxxxxxxx;
> devicetree@xxxxxxxxxxxxxxx; Sunil Kovvuri Goutham
> <sgoutham@xxxxxxxxxxx>; George Cherian <gcherian@xxxxxxxxxxx>; Anil
> Kumar Reddy H <areddy3@xxxxxxxxxxx>
> Subject: RE: [EXT] Re: [PATCH 5/7] coresight: tmc: Add support for reading
> tracedata from previous boot
>
> Hi James,
>
> > -----Original Message-----
> > From: James Clark <james.clark@xxxxxxx>
> > Sent: Wednesday, October 4, 2023 7:18 PM
> > To: Linu Cherian <lcherian@xxxxxxxxxxx>; suzuki.poulose@xxxxxxx;
> > mike.leach@xxxxxxxxxx; leo.yan@xxxxxxxxxx
> > Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; coresight@xxxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; robh+dt@xxxxxxxxxx;
> > krzysztof.kozlowski+dt@xxxxxxxxxx; conor+dt@xxxxxxxxxx;
> > devicetree@xxxxxxxxxxxxxxx; Sunil Kovvuri Goutham
> > <sgoutham@xxxxxxxxxxx>; George Cherian <gcherian@xxxxxxxxxxx>; Anil
> > Kumar Reddy H <areddy3@xxxxxxxxxxx>; Tanmay Jagdale
> > <tanmay@xxxxxxxxxxx>
> > Subject: [EXT] Re: [PATCH 5/7] coresight: tmc: Add support for reading
> > tracedata from previous boot
> >
> > External Email
> >
> > ----------------------------------------------------------------------
> >
> >
> > On 03/10/2023 17:43, James Clark wrote:
> > >
> > >
> > > On 29/09/2023 14:37, Linu Cherian wrote:
> > >> * Introduce a new mode CS_MODE_READ_PREVBOOT for reading tracedata
> > >>   captured in previous boot.
> > >>
> > >> * Add special handlers for preparing ETR/ETF for this special mode
> > >>
> > >> * User can read the trace data as below
> > >>
> > >>   For example, for reading trace data from tmc_etf sink
> > >>
> > >>   1. cd /sys/bus/coresight/devices/tmc_etfXX/
> > >>
> > >>   2. Change mode to READ_PREVBOOT
> > >>
> > >>      #echo 1 > read_prevboot
> > >>
> > >>   3. Dump trace buffer data to a file,
> > >>
> > >>      #dd if=/dev/tmc_etrXX of=~/cstrace.bin
> > >>
> > >>   4. Reset back to normal mode
> > >>
> > >>      #echo 0 > read_prevboot
> > >>
> > >> Signed-off-by: Anil Kumar Reddy <areddy3@xxxxxxxxxxx>
> > >> Signed-off-by: Tanmay Jagdale <tanmay@xxxxxxxxxxx>
> > >> Signed-off-by: Linu Cherian <lcherian@xxxxxxxxxxx>
> > >> ---
> > >>  .../coresight/coresight-etm4x-core.c          |   1 +
> > >>  .../hwtracing/coresight/coresight-tmc-core.c  |  81 +++++++++-
> > >>  .../hwtracing/coresight/coresight-tmc-etf.c   |  62 ++++++++
> > >>  .../hwtracing/coresight/coresight-tmc-etr.c   | 145 +++++++++++++++++-
> > >>  drivers/hwtracing/coresight/coresight-tmc.h   |   6 +
> > >>  include/linux/coresight.h                     |  13 ++
> > >>  6 files changed, 306 insertions(+), 2 deletions(-)
> > >>
> > >> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > >> index 77b0271ce6eb..513baf681280 100644
> > >> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > >> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> > >> @@ -1010,6 +1010,7 @@ static void etm4_disable(struct coresight_device *csdev,
> > >>
> > >>  	switch (mode) {
> > >>  	case CS_MODE_DISABLED:
> > >> +	case CS_MODE_READ_PREVBOOT:
> > >>  		break;
> > >>  	case CS_MODE_SYSFS:
> > >>  		etm4_disable_sysfs(csdev);
> > >> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c b/drivers/hwtracing/coresight/coresight-tmc-core.c
> > >> index 6658ce76777b..65c15c9f821b 100644
> > >> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
> > >> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
> > >> @@ -103,6 +103,45 @@ u32 tmc_get_memwidth_mask(struct tmc_drvdata *drvdata)
> > >>  	return mask;
> > >>  }
> > >>
> > >> +int tmc_read_prepare_prevboot(struct tmc_drvdata *drvdata) {
> > >> +	int ret = 0;
> > >> +	struct tmc_register_snapshot *reg_ptr;
> > >> +	struct coresight_device *csdev = drvdata->csdev;
> > >> +
> > >> +	if (!drvdata->metadata.vaddr) {
> > >> +		ret = -ENOMEM;
> > >> +		goto out;
> > >> +	}
> > >> +
> > >> +	reg_ptr = drvdata->metadata.vaddr;
> > >> +	if (!reg_ptr->valid) {
> > >> +		dev_err(&drvdata->csdev->dev,
> > >> +			"Invalid metadata captured from previous boot\n");
> > >> +		ret = -EINVAL;
> > >> +		goto out;
> > >> +	}
> > >
> > > I'm wondering if a more robust check is needed than the valid flag,
> > > like a checksum or something. I didn't debug it yet but I ended up
> > > with an invalid set of metadata after a panic reboot, see below. I'm
> > > not sure if it's just a logic bug or something got lost during the
> > > reboot, I didn't debug it yet. But I suppose unless you assume the
> > > panic didn't affect writing the metadata, then it could be partially
> > > written and shouldn't be trusted?
> > >
> > > [...]
> > >> +
> > >> +static int tmc_etr_sync_prevboot_buf(struct tmc_drvdata *drvdata) {
> > >> +	u32 status;
> > >> +	u64 rrp, rwp, dba;
> > >> +	struct tmc_register_snapshot *reg_ptr;
> > >> +	struct etr_buf *etr_buf = drvdata->prevboot_buf;
> > >> +
> > >> +	reg_ptr = drvdata->metadata.vaddr;
> > >> +
> > >> +	rrp = reg_ptr->rrp;
> > >> +	rwp = reg_ptr->rwp;
> > >> +	dba = reg_ptr->dba;
> > >> +	status = reg_ptr->sts;
> > >> +
> > >> +	etr_buf->full = !!(status & TMC_STS_FULL);
> > >> +
> > >> +	/* Sync the buffer pointers */
> > >> +	etr_buf->offset = rrp - dba;
> > >> +	if (etr_buf->full)
> > >> +		etr_buf->len = etr_buf->size;
> > >> +	else
> > >> +		etr_buf->len = rwp - rrp;
> > >> +
> > >> +	/* Sanity checks for validating metadata */
> > >> +	if ((etr_buf->offset > etr_buf->size) ||
> > >> +	    (etr_buf->len > etr_buf->size))
> > >> +		return -EINVAL;
> > >
> > > The values I got here are 0x781b67182aa346f9 0x8000000 0x8000000 for
> > > offset, size and len respectively. This fails the first check.
> > > It would also be nice to have a dev_dbg here as well, it's basically
> > > the same as the valid check above which does have one.
> > >
> >
> > So I debugged it and the issue is that after the panic I was doing a
> > cold boot rather than a warm boot and the memory was being randomised.
> >
> > The reason that 0x8000000 seemed to be initialised is because they are
> > based on the reserved region size, rather than anything from the
> > metadata. When I examined the metadata it was all randomised.
> >
> > That leads me to think that the single bit for 'valid' is insufficient.
> > There is a simple hashing function in include/linux/stringhash.h that
> > we could use on the whole metadata struct, but that specifically says:
> >
> >  * These hash functions are NOT GUARANTEED STABLE between kernel
> >  * versions, architectures, or even repeated boots of the same kernel.
> >  * (E.g. they may depend on boot-time hardware detection or be
> >  * deliberately randomized.)
> >
> > Although I'm not sure how true the repeated boots of the same kernel
> > part is.
> >
> > Maybe something in include/crypto/hash.h could be used instead, or
> > make our own simple hash.
>
> Thanks for the pointers. Will take a look at it.

Since the purpose is to identify any data corruption, crc32 (using the
crc32_le API) looks okay to me. Any thoughts on this?

Maybe we could add crc32 checks for the trace data as well?

Thanks.
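
To make the crc32 idea concrete, here is a rough userspace sketch of how the metadata could be sealed when written and verified on the next boot. Everything in it is an assumption for illustration: struct snapshot_sketch is a hypothetical stand-in for struct tmc_register_snapshot (only rrp, rwp, dba, sts and valid appear in the quoted patch), the crc field is a hypothetical addition, and crc32_le_sketch() is a local bitwise CRC-32 written to follow the calling convention of the kernel's crc32_le() from <linux/crc32.h>, where the caller supplies the seed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the struct from the patch. The fields are
 * ordered so that there is no padding before crc (offset 32), which keeps
 * uninitialised padding bytes out of the checksum.
 */
struct snapshot_sketch {
	uint64_t rrp;
	uint64_t rwp;
	uint64_t dba;
	uint32_t sts;
	uint32_t valid;
	uint32_t crc;	/* assumed new field, kept last */
};

/*
 * Bitwise CRC-32 using the reflected polynomial 0xEDB88320. Like the
 * kernel's crc32_le(), it applies no implicit seed or final inversion;
 * the caller passes the seed in crc.
 */
static uint32_t crc32_le_sketch(uint32_t crc, const unsigned char *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^
			      (UINT32_C(0xEDB88320) & (UINT32_C(0) - (crc & 1u)));
	}
	return crc;
}

/* Checksum of every byte that precedes the crc field. */
static uint32_t snapshot_crc(const struct snapshot_sketch *s)
{
	assert(s != NULL);
	return crc32_le_sketch(UINT32_C(0xFFFFFFFF), (const unsigned char *)s,
			       offsetof(struct snapshot_sketch, crc));
}

/* Writer side: call once all other fields have been filled in. */
static void snapshot_seal(struct snapshot_sketch *s)
{
	s->crc = snapshot_crc(s);
}

/* Reader side: 1 only if the valid flag is set and the checksum matches. */
static int snapshot_is_intact(const struct snapshot_sketch *s)
{
	return s->valid != 0 && s->crc == snapshot_crc(s);
}
```

With something like this, the read-prepare path could reject the metadata when the check fails instead of relying on the valid bit alone. The same helper could in principle be run over the trace buffer, though the cost of checksumming a large reserved region would need to be weighed separately.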