Re: [EXTERNAL] Re: [PATCH v9 5/7] coresight: tmc: Add support for reading crash data

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 10/06/2024 07:15, Linu Cherian wrote:
> Hi James,
> 
>> -----Original Message-----
>> From: James Clark <james.clark@xxxxxxx>
>> Sent: Friday, June 7, 2024 7:48 PM
>> To: Linu Cherian <lcherian@xxxxxxxxxxx>
>> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; coresight@xxxxxxxxxxxxxxxx; linux-
>> kernel@xxxxxxxxxxxxxxx; robh+dt@xxxxxxxxxx;
>> krzysztof.kozlowski+dt@xxxxxxxxxx; conor+dt@xxxxxxxxxx;
>> devicetree@xxxxxxxxxxxxxxx; Sunil Kovvuri Goutham
>> <sgoutham@xxxxxxxxxxx>; George Cherian <gcherian@xxxxxxxxxxx>; Anil
>> Kumar Reddy H <areddy3@xxxxxxxxxxx>; Tanmay Jagdale
>> <tanmay@xxxxxxxxxxx>; suzuki.poulose@xxxxxxx; mike.leach@xxxxxxxxxx
>> Subject: [EXTERNAL] Re: [PATCH v9 5/7] coresight: tmc: Add support for
>> reading crash data
>>
>> Prioritize security for external emails: Confirm sender and content safety
>> before clicking links or opening attachments
>>
>> ----------------------------------------------------------------------
>>
>>
>> On 05/06/2024 17:09, James Clark wrote:
>>>
>>>
>>> On 05/06/2024 09:17, Linu Cherian wrote:
>>>> * Introduce a new mode CS_MODE_READ_CRASHDATA for reading trace
>>>>   captured in previous crash/watchdog reset.
>>>>
>>>> * Add special device files for reading ETR/ETF crash data.
>>>>
>>>> * User can read the crash data as below
>>>>
>>>>   For example, for reading crash data from tmc_etf sink
>>>>
>>>>   #dd if=/dev/crash_tmc_etfXX of=~/cstrace.bin
>>>>
>>>> Signed-off-by: Anil Kumar Reddy <areddy3@xxxxxxxxxxx>
>>>> Signed-off-by: Tanmay Jagdale <tanmay@xxxxxxxxxxx>
>>>> Signed-off-by: Linu Cherian <lcherian@xxxxxxxxxxx>
>>>> ---
>>>> Changelog from v8:
>>>> * Added missing exit path in __tmc_probe
>>>> * Few whitespace fixes and a checkpatch fix.
>>>>
>>>>  .../coresight/coresight-etm4x-core.c          |   1 +
>>>>  .../hwtracing/coresight/coresight-tmc-core.c  | 150 ++++++++++++++++-
>>>>  .../hwtracing/coresight/coresight-tmc-etf.c   |  72 +++++++++
>>>>  .../hwtracing/coresight/coresight-tmc-etr.c   | 151 +++++++++++++++++-
>>>>  drivers/hwtracing/coresight/coresight-tmc.h   |  11 +-
>>>>  include/linux/coresight.h                     |  13 ++
>>>>  6 files changed, 390 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>> b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>> index a0bdfabddbc6..7924883476c6 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
>>>> @@ -1011,6 +1011,7 @@ static void etm4_disable(struct
>>>> coresight_device *csdev,
>>>>
>>>>  	switch (mode) {
>>>>  	case CS_MODE_DISABLED:
>>>> +	case CS_MODE_READ_CRASHDATA:
>>>>  		break;
>>>>  	case CS_MODE_SYSFS:
>>>>  		etm4_disable_sysfs(csdev);
>>>> diff --git a/drivers/hwtracing/coresight/coresight-tmc-core.c
>>>> b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>>> index daad08bc693d..0c145477ba66 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-tmc-core.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-tmc-core.c
>>>> @@ -106,6 +106,60 @@ u32 tmc_get_memwidth_mask(struct
>> tmc_drvdata *drvdata)
>>>>  	return mask;
>>>>  }
>>>>
>>>> +int tmc_read_prepare_crashdata(struct tmc_drvdata *drvdata) {
>>>> +	int ret = 0;
>>>> +	struct tmc_crash_metadata *mdata;
>>>> +	struct coresight_device *csdev = drvdata->csdev;
>>>> +
>>>> +	if (!drvdata->crash_mdata.vaddr) {
>>>> +		ret = -ENOMEM;
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	mdata = drvdata->crash_mdata.vaddr;
>>>> +	/* Check data integrity of metadata */
>>>> +	if (mdata->crc32_mdata != find_crash_metadata_crc(mdata)) {
>>>> +		dev_dbg(&drvdata->csdev->dev,
>>>> +			"CRC mismatch in tmc crash metadata\n");
>>>> +		ret = -EINVAL;
>>>> +		goto out;
>>>> +	}
>>>> +	/* Check data integrity of tracedata */
>>>> +	if (mdata->crc32_tdata != find_crash_tracedata_crc(drvdata, mdata))
>> {
>>>> +		dev_dbg(&drvdata->csdev->dev,
>>>> +			"CRC mismatch in tmc crash tracedata\n");
>>>> +		ret = -EINVAL;
>>>> +		goto out;
>>>> +	}
>>>> +	/* Check for valid metadata */
>>>> +	if (!mdata->valid) {
>>>> +		dev_dbg(&drvdata->csdev->dev,
>>>> +			"Data invalid in tmc crash metadata\n");
>>>> +		ret = -EINVAL;
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	/* Sink specific crashdata mode preparation */
>>>> +	ret = crashdata_ops(csdev)->prepare(csdev);
>>>> +	if (ret)
>>>> +		goto out;
>>>> +
>>>> +	if (mdata->sts & 0x1)
>>>> +		coresight_insert_barrier_packet(drvdata->buf);
>>>> +
>>>> +out:
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +int tmc_read_unprepare_crashdata(struct tmc_drvdata *drvdata) {
>>>> +	struct coresight_device *csdev = drvdata->csdev;
>>>> +
>>>> +	/* Sink specific crashdata mode preparation */
>>>> +	return crashdata_ops(csdev)->unprepare(csdev);
>>>> +}
>>>> +
>>>>  static int tmc_read_prepare(struct tmc_drvdata *drvdata)  {
>>>>  	int ret = 0;
>>>> @@ -156,6 +210,9 @@ static int tmc_open(struct inode *inode, struct file
>> *file)
>>>>  	struct tmc_drvdata *drvdata = container_of(file->private_data,
>>>>  						   struct tmc_drvdata,
>> miscdev);
>>>>
>>>> +	if (coresight_get_mode(drvdata->csdev) ==
>> CS_MODE_READ_CRASHDATA)
>>>> +		return -EBUSY;
>>>> +
>>>>  	ret = tmc_read_prepare(drvdata);
>>>>  	if (ret)
>>>>  		return ret;
>>>> @@ -180,13 +237,12 @@ static inline ssize_t tmc_get_sysfs_trace(struct
>> tmc_drvdata *drvdata,
>>>>  	return -EINVAL;
>>>>  }
>>>>
>>>> -static ssize_t tmc_read(struct file *file, char __user *data, size_t len,
>>>> -			loff_t *ppos)
>>>> +static ssize_t tmc_read_common(struct tmc_drvdata *drvdata, char
>> __user *data,
>>>> +			       size_t len, loff_t *ppos)
>>>>  {
>>>>  	char *bufp;
>>>>  	ssize_t actual;
>>>> -	struct tmc_drvdata *drvdata = container_of(file->private_data,
>>>> -						   struct tmc_drvdata,
>> miscdev);
>>>> +
>>>>  	actual = tmc_get_sysfs_trace(drvdata, *ppos, len, &bufp);
>>>>  	if (actual <= 0)
>>>>  		return 0;
>>>> @@ -203,6 +259,15 @@ static ssize_t tmc_read(struct file *file, char
>> __user *data, size_t len,
>>>>  	return actual;
>>>>  }
>>>>
>>>> +static ssize_t tmc_read(struct file *file, char __user *data, size_t len,
>>>> +			loff_t *ppos)
>>>> +{
>>>> +	struct tmc_drvdata *drvdata = container_of(file->private_data,
>>>> +						   struct tmc_drvdata,
>> miscdev);
>>>> +
>>>> +	return tmc_read_common(drvdata, data, len, ppos); }
>>>> +
>>>>  static int tmc_release(struct inode *inode, struct file *file)  {
>>>>  	int ret;
>>>> @@ -225,6 +290,61 @@ static const struct file_operations tmc_fops = {
>>>>  	.llseek		= no_llseek,
>>>>  };
>>>>
>>>> +static int tmc_crashdata_open(struct inode *inode, struct file
>>>> +*file) {
>>>> +	int ret;
>>>> +	struct tmc_drvdata *drvdata = container_of(file->private_data,
>>>> +						   struct tmc_drvdata,
>>>> +						   crashdev);
>>>> +
>>>> +	if (!coresight_take_mode(drvdata->csdev,
>> CS_MODE_READ_CRASHDATA))
>>>> +		return -EBUSY;
>>>> +
>>>> +	ret = tmc_read_prepare(drvdata);
>>>> +	if (ret) {
>>>> +		coresight_set_mode(drvdata->csdev, CS_MODE_DISABLED);
>>>> +		return ret;
>>>> +	}
>>>> +
>>>> +	nonseekable_open(inode, file);
>>>> +
>>>> +	dev_dbg(&drvdata->csdev->dev, "%s: successfully opened\n",
>> __func__);
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static ssize_t tmc_crashdata_read(struct file *file, char __user *data,
>>>> +				  size_t len, loff_t *ppos)
>>>> +{
>>>> +	struct tmc_drvdata *drvdata = container_of(file->private_data,
>>>> +						   struct tmc_drvdata,
>>>> +						   crashdev);
>>>> +
>>>> +	return tmc_read_common(drvdata, data, len, ppos); }
>>>> +
>>>> +static int tmc_crashdata_release(struct inode *inode, struct file
>>>> +*file) {
>>>> +	int ret = 0;
>>>> +	struct tmc_drvdata *drvdata = container_of(file->private_data,
>>>> +						   struct tmc_drvdata,
>>>> +						   crashdev);
>>>> +
>>>> +	ret = tmc_read_unprepare(drvdata);
>>>> +
>>>> +	coresight_set_mode(drvdata->csdev, CS_MODE_DISABLED);
>>>> +
>>>> +	dev_dbg(&drvdata->csdev->dev, "%s: released\n", __func__);
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static const struct file_operations tmc_crashdata_fops = {
>>>> +	.owner		= THIS_MODULE,
>>>> +	.open		= tmc_crashdata_open,
>>>> +	.read		= tmc_crashdata_read,
>>>> +	.release	= tmc_crashdata_release,
>>>> +	.llseek		= no_llseek,
>>>> +};
>>>> +
>>>>  static enum tmc_mem_intf_width tmc_get_memwidth(u32 devid)  {
>>>>  	enum tmc_mem_intf_width memwidth;
>>>> @@ -542,6 +662,18 @@ static u32 tmc_etr_get_max_burst_size(struct
>> device *dev)
>>>>  	return burst_size;
>>>>  }
>>>>
>>>> +static void register_crash_dev_interface(struct tmc_drvdata *drvdata,
>>>> +					 const char *name)
>>>> +{
>>>> +	drvdata->crashdev.name =
>>>> +		devm_kasprintf(&drvdata->csdev->dev, GFP_KERNEL,
>> "%s_%s", "crash", name);
>>>> +	drvdata->crashdev.minor = MISC_DYNAMIC_MINOR;
>>>> +	drvdata->crashdev.fops = &tmc_crashdata_fops;
>>>> +	if (misc_register(&drvdata->crashdev))
>>>> +		dev_dbg(&drvdata->csdev->dev,
>>>> +			"Failed to setup user interface for crashdata\n"); }
>>>> +
>>>>  static int __tmc_probe(struct device *dev, struct resource *res)  {
>>>>  	int ret = 0;
>>>> @@ -642,8 +774,13 @@ static int __tmc_probe(struct device *dev, struct
>> resource *res)
>>>>  	drvdata->miscdev.minor = MISC_DYNAMIC_MINOR;
>>>>  	drvdata->miscdev.fops = &tmc_fops;
>>>>  	ret = misc_register(&drvdata->miscdev);
>>>> -	if (ret)
>>>> +	if (ret) {
>>>>  		coresight_unregister(drvdata->csdev);
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	if (is_tmc_reserved_region_valid(dev))
>>>> +		register_crash_dev_interface(drvdata, desc.name);
>>>
>>> I think this would be better if it checked the CRC of the metadata in
>>> the same way it does before reading the file.
>>>
>>> Now we have two forms of "region valid", one that's any non-zero
>>> value, and the other "really valid" one. And because we don't check
>>> the CRC here we register a device that can't be used.
>>>
>>> I found it a bit confusing because without enabling debug prints I
>>> didn't know why the file couldn't be read. So I wasn't sure if it was
>>> because it wasn't valid or some other reason.
>>>
>>> I also wasn't able to get a valid region after booting the crash kernel.
>>> But maybe the memory isn't preserved across the reboot on my Juno, so
>>> I don't think that's necessarily an issue?
>>
>> Ok so I double checked by writing 0x123456 into the reserved region and
>> confirmed that it _is_ preserved when booting the panic kernel on my Juno.
>> So I'm not sure why I wasn't able to read out the crash dump.
>>
>> I did see the "success" message from tmc_panic_sync_etr() at least some of
>> the times, although I do remember it not printing out every time. I don't
>> know if this is just an issue with outputting to serial after a panic or
>> something else was going on?
>>
>> Did you ever see the success message not print out? Or not able to read back
>> the data when you were testing it?
> 
> During my testing, success messages were not print out only in two cases, 
> a. resrv buf mode is not enabled for ETR
> #echo "resrv" > /sys/bus/coresight/devices/tmc_etr0/buf_mode_preferred
> b. ETR is not in enabled state ie. Neither perf session nor syfs trace session is not active at the time of kernel panic. 
> 
> If success message is not printed, then definitely a valid snapshot is not taken at the time of panic. 
> 
> On iterations where success is printed, do you see any debug messages from tmc_read_prepare_crashdata at the time of reading /dev/crash_tmc_etrXX indicating invalid snapshot ?
> 
> Thanks.
> 

Hi Linu,

It was because I was missing the crash_kexec_post_notifiers option. With
that added I was able to successfully read back the crash dump.

I did notice two things while testing it, but I think they're probably
unrelated:

 * A hang after the panic when the coresight_cpu_debug module is
   loaded. I think this might even be by design, but I didn't look into
   it, I just unloaded it.
 * One hard lockup when booting the crash kernel, but nothing related to
   this in the stack and only happened once.

Other than the other minor review comments it all looks ok to me now.

James




[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux