On 11/09/2023 17:39, Mukesh Ojha wrote:
>
>
> On 9/11/2023 2:22 PM, Bagas Sanjaya wrote:
>> On Sun, Sep 10, 2023 at 01:46:01AM +0530, Mukesh Ojha wrote:
>>> Hi All,
>>>
>>> This is a continuation of the conversation that happened on v4
>>>
>>> https://lore.kernel.org/lkml/632c5b97-4a91-c3e8-1e6c-33d6c4f6454f@xxxxxxxxxxx/
>>>
>>> https://lore.kernel.org/lkml/695133e6-105f-de2a-5559-555cea0a0462@xxxxxxxxxxx/
>>>
>>> We have submitted an abstract to LPC on this topic and initiated a mail thread
>>> with other SoC vendors, but did not get much traction on it.
>>>
>>> https://lore.kernel.org/lkml/0199db00-1b1d-0c63-58ff-03efae02cb21@xxxxxxxxxxx/
>>>
>>> We explored most of the possibilities present in the kernel to address this
>>> issue[1], but solutions like kdump/fadump do not seem safe/secure/performant
>>> from our perspective.
>>>
>>> Hence, with this series we tried to make the minidump kernel driver simple
>>> and tied to the pstore frontends, so that it collects the data of the currently
>>> available frontends like dmesg, ftrace, pmsg. Also, we will be working
>>> towards enhancing generic pstore to capture more debug data which will be
>>> helpful for first-level debugging and can benefit both other pstore users
>>> and us as minidump users.
>>>
>>> One of the proposals was made here:
>>> https://lore.kernel.org/lkml/1683561060-2197-1-git-send-email-quic_mojha@xxxxxxxxxxx/
>>>
>>> Looking forward to your comments.
>>>
>>> Thanks,
>>> Mukesh
>>>
>>> [1]
>>> Minidump is a best-effort mechanism to collect useful and predefined data
>>> for first-level debugging on end user devices running on Qualcomm SoCs.
>>> It is built on the premise that a System on Chip (SoC) or a subsystem of the
>>> SoC crashes due to a range of hardware and software bugs. Hence, the
>>> ability to collect accurate data is only best-effort. The data collected
>>> could be invalid or corrupted, data collection itself could fail, and so on.
>>>
>>> Qualcomm devices in engineering mode provide a mechanism for generating
>>> full system ramdumps for post mortem debugging. However, in some cases it
>>> is not feasible to capture the entire content of RAM. The minidump
>>> mechanism provides the means for selecting which snippets should be
>>> included in the ramdump.
>>>
>>> The core of the SMEM based minidump feature is part of Qualcomm's boot
>>> firmware code. It initializes shared memory (SMEM), which is a part of
>>> DDR, and allocates a small section of SMEM to the minidump table, also
>>> called the global table of contents (G-ToC). Each subsystem (APSS, ADSP, ...)
>>> has its own table of segments to be included in the minidump, and all of
>>> them get their reference from the G-ToC. Each segment/region has some details
>>> like name, physical address and its size etc., and it could be scattered
>>> anywhere in DDR.
>>>
>>> The existing upstream Qualcomm remoteproc driver[1] already supports the SMEM
>>> based minidump feature for remoteproc instances like ADSP, MODEM, ...
>>> where predefined selective segments of the subsystem region can be dumped
>>> as part of coredump collection, which generates smaller artifacts
>>> compared to a complete coredump of the subsystem on crash.
>>>
>>> [1]
>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/remoteproc/qcom_common.c#n142
>>>
>>> In addition to managing and querying the APSS minidump description,
>>> the Linux driver maintains an ELF header in a segment. This segment
>>> gets updated with a section/program header whenever a new entry gets
>>> registered.
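
For readers following along, the layout described above (a global table of contents referencing per-subsystem tables of named regions) could be modelled roughly as below. This is only a sketch under my own assumptions: the struct and function names, the fixed array sizes, and the plain-integer fields are all hypothetical and are not taken from this series or from the firmware's actual SMEM layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names and limits below are hypothetical, for illustration only. */
#define MD_MAX_NAME	16
#define MD_MAX_REGIONS	8
#define MD_MAX_SS	4

/* One segment/region: name, physical address and size. */
struct md_region {
	char     name[MD_MAX_NAME];
	uint64_t phys_addr;
	uint64_t size;
};

/* Per-subsystem table of segments (one each for APSS, ADSP, ...). */
struct md_subsystem {
	uint32_t         region_count;
	struct md_region regions[MD_MAX_REGIONS];
};

/* Global table of contents (G-ToC) covering every subsystem table. */
struct md_global_toc {
	struct md_subsystem ss[MD_MAX_SS];
};

/* Register a region; returns 0 on success, -1 if the table is full
 * or the name does not fit. */
static int md_add_region(struct md_subsystem *ss, const char *name,
			 uint64_t phys_addr, uint64_t size)
{
	struct md_region *r;

	assert(ss && name);
	if (ss->region_count >= MD_MAX_REGIONS || strlen(name) >= MD_MAX_NAME)
		return -1;

	r = &ss->regions[ss->region_count++];
	memset(r, 0, sizeof(*r));
	strcpy(r->name, name);	/* length was checked above */
	r->phys_addr = phys_addr;
	r->size = size;
	return 0;
}

/* Total number of bytes a subsystem asks to have dumped. */
static uint64_t md_total_size(const struct md_subsystem *ss)
{
	uint64_t total = 0;

	for (uint32_t i = 0; i < ss->region_count; i++)
		total += ss->regions[i].size;
	return total;
}
```

The point of the sketch is only that regions are independent (name, address, size) entries that may live anywhere in DDR, so the dump size is the sum of the registered regions rather than the size of RAM.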
>>>
>>> Changes in v5:
>>> - On a suggestion from Pavan.k to have a single function call for minidump collection
>>>   from the remoteproc driver, separated the logic into a separate minidump file called
>>>   qcom_rproc_minidump.c and also renamed the function from qcom_minidump() to
>>>   qcom_rproc_minidump(); however, dropped his suggestion about reworking lazy deletion
>>>   during region unregister in this series, will pursue it in the next series.
>>>
>>> - To simplify the minidump driver, removed the complication of a frontend and different
>>>   backends, per Greg's suggestion; will pursue this once the main driver gets mainlined.
>>>
>>> - Moved the dynamic ramoops region allocation from the Device tree approach to a command
>>>   line approach, with the introduction of command line parsing and memblock reservation
>>>   during early boot; documentation is not added yet, will add it if this gets a positive
>>>   response.
>>>
>>> - Exported the linux banner from the kernel so that minidump can also be built as a
>>>   module; however, minidump is a debug module and should be built into the kernel to get
>>>   the most debug information from the kernel.
>>>
>>> - Tried to address comments given on the dload patch series.
>>>
>>> Changes in v4: https://lore.kernel.org/lkml/1687955688-20809-1-git-send-email-quic_mojha@xxxxxxxxxxx/
>>> - Redesigned the driver and divided it into a front end and a backend (smem) so
>>>   that any new backend can be attached easily to avoid code duplication.
>>> - Reordered the patches by driver and subsystem for easier review of the code.
>>> - Moved minidump specific code from remoteproc to the minidump smem based driver.
>>> - Enabled all the drivers as modules.
>>> - Addressed comments made on documentation, yaml and Device tree files [Krzysztof/Konrad]
>>> - Addressed comments made on the qcom_pstore_minidump driver and gave its Device tree
>>>   the same set of properties as ramoops. [Luca/Kees]
>>> - Added a patch for the MAINTAINERS file.
>>> - Included the defconfig change as one patch as per [Krzysztof]'s suggestion.
>>> - Tried to remove the redundant file scope variables from the module as per [Krzysztof]'s suggestion.
>>> - Addressed comments made on the dload mode patch series v6
>>>   https://lore.kernel.org/lkml/1680076012-10785-1-git-send-email-quic_mojha@xxxxxxxxxxx/
>>>
>>> Changes in v3: https://lore.kernel.org/lkml/1683133352-10046-1-git-send-email-quic_mojha@xxxxxxxxxxx/
>>> - Addressed most of the comments by Srini on v2 and refactored the minidump driver.
>>> - Added platform device support.
>>> - Added unregister region support.
>>> - Added update region for clients.
>>> - Added pending region support.
>>> - Modified the documentation guide accordingly.
>>> - Added the qcom_pstore_ramdump client driver, which adds a ramoops platform
>>>   device and also registers the ramoops region with minidump.
>>> - Added the download mode patch series to this minidump series.
>>>   https://lore.kernel.org/lkml/1680076012-10785-1-git-send-email-quic_mojha@xxxxxxxxxxx/
>>>
>>> Changes in v2: https://lore.kernel.org/lkml/1679491817-2498-1-git-send-email-quic_mojha@xxxxxxxxxxx/
>>> - Addressed the review comment made by [quic_tsoni/bmasney] to add documentation.
>>> - Addressed comments made by [srinivas.kandagatla]
>>> - Dropped pstore 6/6 from the last series, until I reach a conclusion on getting
>>>   the pstore region into minidump.
>>> - Fixed the issue reported by the kernel test robot.
>>>
>>> Changes in v1: https://lore.kernel.org/lkml/1676978713-7394-1-git-send-email-quic_mojha@xxxxxxxxxxx/
>>>
>>> The patches were tested on an sm8450 target after enabling configs like
>>> CONFIG_PSTORE_RAM and CONFIG_PSTORE_CONSOLE. Once the device boots up:
>>>
>>> echo mini > /sys/module/qcom_scm/parameters/download_mode
>>>
>>> Then try crashing it via devmem2 0xf11c000 (this is known to create an xpu
>>> violation and put the device in download mode) on the command prompt.
>>>
>>> The default storage type is set to USB, so the minidump is downloaded with the
>>> help of an x86_64 machine (running the PCAT tool) attached to a Qualcomm device
>>> whose boot firmware has minidump support.
>>>
>>> This will make the device go to download mode and collect the minidump on to the
>>> attached x86 machine running the Qualcomm PCAT tool (this comes as part of the
>>> Qualcomm package manager kit).
>>>
>>> After that, we will see a bunch of predefined registered regions as binary blob files,
>>> with names starting with md_*, downloaded from the target device to the given location
>>> in the PCAT tool on the x86 machine. More about this can be found in the Qualcomm
>>> minidump guide patch.
>>>
>>
>> I tried to apply this series on top of 535a265d7f0dd50 (as suggested by
>> `b4 am -l -g`), but it conflicts on patch [04/17]. Please specify the
>> exact base commit or the other series on which this series is based.
>
> Apologies!
> I just realized it was 6.5-rc7, but let me send a rebased version of the series.
>
> Sorry; for all the reviews done so far, I will definitely take care of them or reply.
>

OK, see you in v6!

-- 
An old man doll... just what I always wanted! - Clara