Re: [PATCH v4 2/2] platform/x86/amd/pmc: Add dump_custom_stb module parameter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Hans,

Apologies for the long delay.

On 9/18/2023 5:57 PM, Hans de Goede wrote:
> Hi Shyam,
> 
> On 9/10/23 16:20, Shyam Sundar S K wrote:
>> There have been instances when the default size (1M) of the STB is not
>> sufficient to get the complete traces of the failure. In such scenarios
>> we can use a module_param to enable full trace that shall contain more
>> debugging data. This is not a regular case and hence not enabling this
>> capability by default.
>>
>> With this change, there will be two cases on how the driver fetches the
>> stb data:
>> 1) A special case (proposed now) - which is required only for certain
>> platforms. Here, a new module param will be supplied to the driver that
>> will have a special PMFW supporting enhanced dram sizes for getting
>> the stb data. Without the special PMFW support, just setting the module
>> param will not help to get the enhanced stb data.
>>
>> 2) Current code branch which fetches the stb data based on the parameters
>> like the num_samples, fsize and the r/w pointer.
>>
>> Co-developed-by: Harsh Jain <Harsh.Jain@xxxxxxx>
>> Signed-off-by: Harsh Jain <Harsh.Jain@xxxxxxx>
>> Signed-off-by: Sanket Goswami <Sanket.Goswami@xxxxxxx>
>> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@xxxxxxx>
>> ---
>> v3->v4:
>>  - Update code branches and commit-msg as per Ilpo's remark.
>>
>> v2->v3:
>>  - no change
>>
>> v1->v2:
>>  - rebase to 'review-hans' branch
>>  - drop 2/4 of v1
>>    (https://patchwork.kernel.org/project/platform-driver-x86/list/?series=775324&state=%2A&archive=both)
>>
>>  drivers/platform/x86/amd/pmc/pmc.c | 43 +++++++++++++++++++++---------
>>  drivers/platform/x86/amd/pmc/pmc.h |  1 +
>>  2 files changed, 32 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/platform/x86/amd/pmc/pmc.c b/drivers/platform/x86/amd/pmc/pmc.c
>> index 443bb78ea5f4..7e907cb50787 100644
>> --- a/drivers/platform/x86/amd/pmc/pmc.c
>> +++ b/drivers/platform/x86/amd/pmc/pmc.c
>> @@ -53,6 +53,7 @@
>>  
>>  /* STB Spill to DRAM Parameters */
>>  #define S2D_TELEMETRY_BYTES_MAX		0x100000
>> +#define S2D_TELEMETRY_FSIZE_MAX		0x200000
>>  #define S2D_TELEMETRY_DRAMBYTES_MAX	0x1000000
>>  
>>  /* STB Spill to DRAM Message Definition */
>> @@ -160,6 +161,10 @@ static bool disable_workarounds;
>>  module_param(disable_workarounds, bool, 0644);
>>  MODULE_PARM_DESC(disable_workarounds, "Disable workarounds for platform bugs");
>>  
>> +static bool dump_custom_stb;
>> +module_param(dump_custom_stb, bool, 0644);
>> +MODULE_PARM_DESC(dump_custom_stb, "Enable to dump full STB buffer");
>> +
>>  static struct amd_pmc_dev pmc;
>>  static int amd_pmc_send_cmd(struct amd_pmc_dev *dev, u32 arg, u32 *data, u8 msg, bool ret);
>>  static int amd_pmc_read_stb(struct amd_pmc_dev *dev, u32 *buf);
>> @@ -239,7 +244,7 @@ static const struct file_operations amd_pmc_stb_debugfs_fops = {
>>  static int amd_pmc_stb_debugfs_open_v2(struct inode *inode, struct file *filp)
>>  {
>>  	struct amd_pmc_dev *dev = filp->f_inode->i_private;
>> -	u32 *buf, fsize, num_samples, val, stb_rdptr_offset = 0;
>> +	u32 *buf, num_samples, val, stb_rdptr_offset = 0;
>>  	int ret;
>>  
>>  	/* Write dummy postcode while reading the STB buffer */
>> @@ -247,10 +252,6 @@ static int amd_pmc_stb_debugfs_open_v2(struct inode *inode, struct file *filp)
>>  	if (ret)
>>  		dev_err(dev->dev, "error writing to STB: %d\n", ret);
>>  
>> -	buf = kzalloc(S2D_TELEMETRY_BYTES_MAX, GFP_KERNEL);
>> -	if (!buf)
>> -		return -ENOMEM;
>> -
>>  	/* Spill to DRAM num_samples uses separate SMU message port */
>>  	dev->msg_port = 1;
>>  
>> @@ -264,20 +265,36 @@ static int amd_pmc_stb_debugfs_open_v2(struct inode *inode, struct file *filp)
>>  	dev->msg_port = 0;
>>  	if (ret) {
>>  		dev_err(dev->dev, "error: S2D_NUM_SAMPLES not supported : %d\n", ret);
>> -		kfree(buf);
>>  		return ret;
>>  	}
>>  
>>  	/* Start capturing data from the last push location */
>> -	if (num_samples > S2D_TELEMETRY_BYTES_MAX) {
>> -		fsize  = S2D_TELEMETRY_BYTES_MAX;
>> -		stb_rdptr_offset = num_samples - fsize;
>> +	if (dump_custom_stb &&
>> +	    (dev->dram_size - S2D_TELEMETRY_BYTES_MAX <= S2D_TELEMETRY_FSIZE_MAX)) {
>> +		/*
>> +		 * we have a custom stb size and the PMFW is supposed to give
>> +		 * the enhanced dram size. Note that we land here only for the
>> +		 * platforms that support enhanced dram size reporting.
>> +		 */
>> +		dev->fsize = dev->dram_size - S2D_TELEMETRY_BYTES_MAX;
>> +		stb_rdptr_offset = 0;
> 
> I don't understand this part. Why is num_samples not taken into account
> anymore ?  and why substract S2D_TELEMETRY_BYTES_MAX from dram_size ?

Like I tried to capture this information in the commit-msg; this is a
special case where the standard STB size reported by the SMU FW will
not be useful to debug the failures. In those identified platforms,
there be a custom SMU FW running and it will have the enhanced FW
reporting capability.

So, in those "identified" platforms, we don't need to look at the
num_samples. That's an agreed protocol between the driver and FW for
this case.

not the entire DRAM size reported by the FW is usable, it has a
reserved space of 1M. Hence we have to subtract that while accouting
the dev->fsize here.

> 
> 
> 
>> +	} else if (num_samples > S2D_TELEMETRY_BYTES_MAX) {
>> +		/*
>> +		 * This is for general cases, where the stb limits are meant for
>> +		 * standard usage
>> +		 */
>> +		dev->fsize  = S2D_TELEMETRY_BYTES_MAX;
>> +		stb_rdptr_offset = num_samples - dev->fsize;
> 
> This assumes that num_samples is in the S2D_TELEMETRY_BYTES_MAX+1 .. 2*S2D_TELEMETRY_BYTES_MAX
> rang, what if it is more ?
> 
> I think that what you want here is:
> 
> 		stb_rdptr_offset = num_samples % S2D_TELEMETRY_BYTES_MAX;

No, this does not work.

Let's take an example:

dev->fsize = S2D_TELEMETRY_BYTES_MAX (i.e. 0x100000)
num_samples = 0x8218b8

existing:
stb_rdptr_offset = num_samples - dev->fsize;
we will get => 0x7218b8

but if we do:
stb_rdptr_offset = num_samples % S2D_TELEMETRY_BYTES_MAX;
we will get => 0x21ea8

You can see that r/w pointer would get corrupted if we do %

This is only one example and the same holds true for any other
examples too.

I spoke to our FW team too to confirm if the driver interpretation is
as per expectation and their answer was YES.

So I feel the current code is actually doing the right thing to
calculate the r/w offset.

Thanks,
Shyam

> 
> Note this is a pre-existing problem but I just noticed this now.
> 
> Regards,
> 
> Hans
> 
>>  	} else {
>> -		fsize = num_samples;
>> +		dev->fsize = num_samples;
>>  		stb_rdptr_offset = 0;
>>  	}
>>  
>> -	memcpy_fromio(buf, dev->stb_virt_addr + stb_rdptr_offset, fsize);
>> +	buf = kzalloc(dev->fsize, GFP_KERNEL);
>> +	if (!buf)
>> +		return -ENOMEM;
>> +
>> +	memcpy_fromio(buf, dev->stb_virt_addr + stb_rdptr_offset, dev->fsize);
>>  	filp->private_data = buf;
>>  
>>  	return 0;
>> @@ -286,11 +303,13 @@ static int amd_pmc_stb_debugfs_open_v2(struct inode *inode, struct file *filp)
>>  static ssize_t amd_pmc_stb_debugfs_read_v2(struct file *filp, char __user *buf, size_t size,
>>  					   loff_t *pos)
>>  {
>> +	struct amd_pmc_dev *dev = filp->f_inode->i_private;
>> +
>>  	if (!filp->private_data)
>>  		return -EINVAL;
>>  
>>  	return simple_read_from_buffer(buf, size, pos, filp->private_data,
>> -					S2D_TELEMETRY_BYTES_MAX);
>> +					dev->fsize);
>>  }
>>  
>>  static int amd_pmc_stb_debugfs_release_v2(struct inode *inode, struct file *filp)
>> diff --git a/drivers/platform/x86/amd/pmc/pmc.h b/drivers/platform/x86/amd/pmc/pmc.h
>> index c27bd6a5642f..f73d265430b8 100644
>> --- a/drivers/platform/x86/amd/pmc/pmc.h
>> +++ b/drivers/platform/x86/amd/pmc/pmc.h
>> @@ -26,6 +26,7 @@ struct amd_pmc_dev {
>>  	u32 dram_size;
>>  	u32 num_ips;
>>  	u32 s2d_msg_id;
>> +	u32 fsize;
>>  /* SMU version information */
>>  	u8 smu_program;
>>  	u8 major;
> 



[Index of Archives]     [Linux Kernel Development]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux