On Tue, 29 Aug 2023, Shyam Sundar S K wrote: > On 8/29/2023 4:49 PM, Ilpo Järvinen wrote: > > On Tue, 29 Aug 2023, Shyam Sundar S K wrote: > >> On 8/29/2023 3:09 PM, Ilpo Järvinen wrote: > >>> On Tue, 29 Aug 2023, Shyam Sundar S K wrote: > >>> > >>>> There have been instances when the default size (1M) of the STB is not > >>>> sufficient to get the complete traces of the failure. In such scenarios > >>>> we can use a module_param to enable full trace that shall contain more > >>>> debugging data. This is not a regular case and hence not enabling this > >>>> capability by default. > >>>> > >>>> Co-developed-by: Harsh Jain <Harsh.Jain@xxxxxxx> > >>>> Signed-off-by: Harsh Jain <Harsh.Jain@xxxxxxx> > >>>> Signed-off-by: Sanket Goswami <Sanket.Goswami@xxxxxxx> > >>>> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@xxxxxxx> > >>>> --- > >>>> v2->v3: > >>>> - no change > >>>> > >>>> v1->v2: > >>>> - rebase to 'review-hans' branch > >>>> - drop 2/4 of v1 > >>>> (https://patchwork.kernel.org/project/platform-driver-x86/list/?series=775324&state=%2A&archive=both) > >>>> > >>>> drivers/platform/x86/amd/pmc/pmc.c | 34 +++++++++++++++++++----------- > >>>> drivers/platform/x86/amd/pmc/pmc.h | 1 + > >>>> 2 files changed, 23 insertions(+), 12 deletions(-) > >>>> > >>>> diff --git a/drivers/platform/x86/amd/pmc/pmc.c b/drivers/platform/x86/amd/pmc/pmc.c > >>>> index 443bb78ea5f4..0e61ae74f1a9 100644 > >>>> --- a/drivers/platform/x86/amd/pmc/pmc.c > >>>> +++ b/drivers/platform/x86/amd/pmc/pmc.c > >>>> @@ -53,6 +53,7 @@ > >>>> > >>>> /* STB Spill to DRAM Parameters */ > >>>> #define S2D_TELEMETRY_BYTES_MAX 0x100000 > >>>> +#define S2D_TELEMETRY_FSIZE_MAX 0x200000 > >>>> #define S2D_TELEMETRY_DRAMBYTES_MAX 0x1000000 > >>>> > >>>> /* STB Spill to DRAM Message Definition */ > >>>> @@ -160,6 +161,10 @@ static bool disable_workarounds; > >>>> module_param(disable_workarounds, bool, 0644); > >>>> MODULE_PARM_DESC(disable_workarounds, "Disable workarounds for platform bugs"); > >>>> > >>>> +static bool dump_custom_stb; > >>>> +module_param(dump_custom_stb, bool, 0644); > >>>> +MODULE_PARM_DESC(dump_custom_stb, "Enable to dump full STB buffer"); > >>>> + > >>>> static struct amd_pmc_dev pmc; > >>>> static int amd_pmc_send_cmd(struct amd_pmc_dev *dev, u32 arg, u32 *data, u8 msg, bool ret); > >>>> static int amd_pmc_read_stb(struct amd_pmc_dev *dev, u32 *buf); > >>>> @@ -239,7 +244,7 @@ static const struct file_operations amd_pmc_stb_debugfs_fops = { > >>>> static int amd_pmc_stb_debugfs_open_v2(struct inode *inode, struct file *filp) > >>>> { > >>>> struct amd_pmc_dev *dev = filp->f_inode->i_private; > >>>> - u32 *buf, fsize, num_samples, val, stb_rdptr_offset = 0; > >>>> + u32 *buf, num_samples, val, stb_rdptr_offset = 0; > >>>> int ret; > >>>> > >>>> /* Write dummy postcode while reading the STB buffer */ > >>>> @@ -247,10 +252,6 @@ static int amd_pmc_stb_debugfs_open_v2(struct inode *inode, struct file *filp) > >>>> if (ret) > >>>> dev_err(dev->dev, "error writing to STB: %d\n", ret); > >>>> > >>>> - buf = kzalloc(S2D_TELEMETRY_BYTES_MAX, GFP_KERNEL); > >>>> - if (!buf) > >>>> - return -ENOMEM; > >>>> - > >>>> /* Spill to DRAM num_samples uses separate SMU message port */ > >>>> dev->msg_port = 1; > >>>> > >>>> @@ -264,20 +265,27 @@ static int amd_pmc_stb_debugfs_open_v2(struct inode *inode, struct file *filp) > >>>> dev->msg_port = 0; > >>>> if (ret) { > >>>> dev_err(dev->dev, "error: S2D_NUM_SAMPLES not supported : %d\n", ret); > >>>> - kfree(buf); > >>>> return ret; > >>>> } > >>>> > >>>> /* Start capturing data from the last push location */ > >>>> - if (num_samples > S2D_TELEMETRY_BYTES_MAX) { > >>>> - fsize = S2D_TELEMETRY_BYTES_MAX; > >>>> - stb_rdptr_offset = num_samples - fsize; > >>>> + if (dump_custom_stb && > >>>> + (dev->dram_size - S2D_TELEMETRY_BYTES_MAX <= S2D_TELEMETRY_FSIZE_MAX)) { > >>>> + dev->fsize = dev->dram_size - S2D_TELEMETRY_BYTES_MAX; > >>>> + stb_rdptr_offset = 0; > >>>> + } else if (num_samples > S2D_TELEMETRY_BYTES_MAX) { > >>> > >>> I find the logic here somewhat suspicious in the first if check (I fail > >>> to follow the reasoning here despite spending considerable amount of time > >>> staring at it). > >>> > >>> Lets assume dump_custom_stb is set. If dev->dram_size is large, the first > >>> if condition will be false because LHS of <= is large. Thus, we go to the > >>> second branch and dump exactly as many samples as before this patch. > >>> ...And that is opposite of what this patch claims to do? > >>> > >> > >> Let me explain: > >> > >> We have cases where the S2D (Spill to DRAM) STB (Smart Trace Buffer - > >> which is meant to have debug data that gives information about the > >> system conditions, be it success or failure.) may not fit in the current > >> stb buffer size of 1M (i.e. S2D_TELEMETRY_BYTES_MAX). The stb buffer > >> size of 1M is sufficient for most of the platforms, but there could be > >> exceptions. > >> > >> In those cases, the dram_size would be updated by the PMFW (Power > >> Management Firmware) coupled with "dump_custom_stb" to fit into a custom > >> requirement. > >> > >> As an example (with this proposed change): > >> > >> when dump_custom_stb is SET and dev->dram_size is 3M (received from > >> PMFW) the first if() will be successful and it takes the corresponding > >> code path. > > > > Heh, I can see the numbers work in that case, however, the above still > > doesn't really explain why only S2D_TELEMETRY_BYTES_MAX is copied if > > dram_size - S2D_TELEMETRY_BYTES_MAX > S2D_TELEMETRY_FSIZE_MAX and not > > S2D_TELEMETRY_FSIZE_MAX as I would expect? > > > > I guess if your question is on: > > -->the if() case: > > why: > > dev->fsize = dev->dram_size - S2D_TELEMETRY_BYTES_MAX; > > and not > > dev->fsize = S2D_TELEMETRY_FSIZE_MAX; > > if yes, I think the intention of the code change is correct. I agree with you here, the statement itself inside the first if block is not wrong. > The entire > buffer has multiple segments out of which the driver can only access the > STB buffer region. In general cases its S2D_TELEMETRY_BYTES_MAX(1M) and > with the proposed patch a custom STB buffer range that cannot exceed > S2D_TELEMETRY_FSIZE_MAX. > > taking the above example, assume > > case 1: > dev->dram_size is 3M, S2D_TELEMETRY_BYTES_MAX is 1M, so the dev->fsize > is 2M. > > case 2: > dev->dram_size is 2M, S2D_TELEMETRY_BYTES_MAX is 1M, so the dev->fsize > is 1M. > > so, it depends on the incoming dev->dram_size from the PMFW based on > which the dev->fsize is calculated and cannot be directly > S2D_TELEMETRY_FSIZE_MAX (always). Again your examples leave the most interesting case out... So what if dev->dram_size above 3M, do you want dev->fsize to be S2D_TELEMETRY_FSIZE_MAX or up to S2D_TELEMETRY_BYTES_MAX (the latter depends on the num_samples)? Or are saying that configuration never happens? If that's the case, then the entire <= condition looks unnecessary (which is why it looked suspicious) and you could simply do: if (dump_custom_stb) { dev->fsize = dev->dram_size - S2D_TELEMETRY_BYTES_MAX; ... ...But then, dev->dram_size is assigned S2D_TELEMETRY_DRAMBYTES_MAX if DRAM size wasn't provided so it doesn't look convincing to me dev->dram_size would be limited to 3M. > -->else if() case: > > why: > dev->fsize = S2D_TELEMETRY_BYTES_MAX; > > and not: > dev->fsize = S2D_TELEMETRY_FSIZE_MAX; > > the already exising code is put under the new else if() block. In this > case the dev->fsize is fixed to S2D_TELEMETRY_BYTES_MAX which is 1M as > the internal alignment with the PMFW team. Like I mentioned above, the > telemetry buffer is huge and we are only interested with the actual STB > buffer region here. Is it intentional to end up into this branch if dump_custom_stb is set or not? Because that's what happens when dev->dram_size > 3M, no? > Hope that's clear. Unfortunately my question is still not answered adequately. -- i.