Re: [PATCH net-next v4 5/5] ice: add documentation for FW logging


On 10/6/2023 4:46 PM, Jakub Kicinski wrote:
On Thu,  5 Oct 2023 10:01:10 -0700 Tony Nguyen wrote:
From: Paul M Stillwell Jr <paul.m.stillwell.jr@xxxxxxxxx>

Add documentation for FW logging in
Documentation/networking/device-drivers/ethernet/intel/ice.rst

Wrong spelling, I think, because no such file.


Sorry, hyphen vs underscore issue, will fix.

Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@xxxxxxxxx>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@xxxxxxxxx>

+Firmware (FW) logging
+---------------------

I think you need empty lines after the headers.
Did you try to build this documentation and check the warnings?


I believe this is correct. It is formatted the same way as the GNSS section above it, and it looks correct once built. I ran 'make htmldocs' on this and I don't get any errors or warnings, and the page looks correct.

+The driver supports FW logging via the debugfs interface on PF 0 only. In order
+for FW logging to work, the NVM must support it. The 'fwlog' file will only get
+created in the ice debugfs directory if the NVM supports FW logging.

Odd phrasing - "in order to work it needs to be supported"

also NVM == non-volatile memory, you mean the logging goes into NVM
or NVM as in FW in the NVM needs to support it?


Yeah, I can see that it's oddly phrased. What I'm trying to say is that the NVM image on the NIC has to support FW logging, and if it doesn't then the 'fwlog' directory will not be created. I'll take another run at it to try to make it less confusing.

+Module configuration
+~~~~~~~~~~~~~~~~~~~~
+To see the status of FW logging, read the 'fwlog/modules' file like this::
+
+  # cat /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules
+
+To configure FW logging, write to the 'fwlog/modules' file like this::
+
+  # echo <fwlog_event> <fwlog_level> > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules
+
+where
+
+* fwlog_level is a name as described below. Each level includes the
+  messages from the previous/lower level
+
+      *	NONE
+      *	ERROR
+      *	WARNING
+      *	NORMAL
+      *	VERBOSE

Is this going to give us a nice list when we render the docs?
White space looks odd.


Yes, it does give a nice list.

+* fwlog_event is a name that represents the module to receive events for. The
+  module names are
+
+      *	GENERAL
+      *	CTRL
+      *	LINK
+      *	LINK_TOPO
+      *	DNL
+      *	I2C
+      *	SDP
+      *	MDIO
+      *	ADMINQ
+      *	HDMA
+      *	LLDP
+      *	DCBX
+      *	DCB
+      *	XLR
+      *	NVM
+      *	AUTH
+      *	VPD
+      *	IOSF
+      *	PARSER
+      *	SW
+      *	SCHEDULER
+      *	TXQ
+      *	RSVD
+      *	POST
+      *	WATCHDOG
+      *	TASK_DISPATCH
+      *	MNG
+      *	SYNCE
+      *	HEALTH
+      *	TSDRV
+      *	PFREG
+      *	MDLVER
+      *	ALL
+
+The name ALL is special and specifies setting all of the modules to the
+specified fwlog_level.
+
+Example usage to configure the modules::
+
+  # echo LINK VERBOSE > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/modules
+
+Enabling FW log
+~~~~~~~~~~~~~~~
+Once the desired modules are configured the user enables logging. To do
+this the user can write a 1 (enable) or 0 (disable) to 'fwlog/enable'. An
+example is::
+
+  # echo 1 > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/enable

Hm, so we "select" the module and then enable / disable?

It'd feel more natural to steal the +/- thing from dynamic printing.
To enable:

  # echo '+LINK VERBOSE' > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/active

To disable:

  # echo '-LINK VERBOSE' > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/active

No?


I like this idea, but I'm not sure whether it will work for us. What I'm trying to do is reduce the number of AQ commands we send to the FW when configuring/enabling logging.

What normally happens is that the user sets up multiple modules with different log levels, so my initial thought was to let the user do all the configuration first and then 'enable' that configuration. This way there is only 1 AQ write to the FW instead of a bunch of them, and we know that once logging is 'enabled' the data we get from the FW is the data that we expect to see.

If we enable each module individually then we are going to get data coming from the FW as each module gets enabled. That can be confusing to the FW team as they look at the log data, because they may not see all the events they expect to see at a given time if the module producing them wasn't enabled yet.
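
The configure-first, enable-once flow described above could be sketched in shell roughly as follows. This is only an illustration against the interface as documented in this version of the patch ('fwlog/modules' and 'fwlog/enable'), which may still change. The helper name fwlog_configure_and_enable is made up for the example, and the fwlog directory is passed in as an argument rather than assumed.

```shell
# Hypothetical helper, not part of the patch.
# $1 is the fwlog debugfs directory; the remaining arguments are
# "<fwlog_event> <fwlog_level>" pairs, one per argument.
fwlog_configure_and_enable() {
    fwlog_dir=$1
    shift
    # Stage every module/level pair first. Per the discussion above, the
    # intent is that nothing is sent to the FW until logging is enabled.
    for pair in "$@"; do
        echo "$pair" > "$fwlog_dir/modules" || return 1
    done
    # A single write to 'enable' then applies the staged configuration.
    echo 1 > "$fwlog_dir/enable"
}
```

With the device path used in the examples in this thread, a call might look like:

  # fwlog_configure_and_enable /sys/kernel/debug/ice/0000\:18\:00.0/fwlog "LINK VERBOSE" "NVM ERROR"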

+Retrieving FW log data
+~~~~~~~~~~~~~~~~~~~~~~
+The FW log data can be retrieved by reading from 'fwlog/data'. The user can
+write to 'fwlog/data' to clear the data. The data can only be cleared when FW
+logging is disabled.

Oh, now it sounds like only one thing can be enabled at a time.
Can you clarify?


What I'm trying to describe here is a mechanism to read all the data (whatever modules have been enabled) as it's coming in and to also be able to clear the data in case the user wants to start fresh (by writing 0 to the file). Does that make sense? I probably wasn't clear in the previous section that the user can enable many modules at the same time.

+The FW log data is a binary file that is sent to Intel and
+used to help debug user issues.
+
+An example to read the data is::
+
+  # cat /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/data > fwlog.bin
+
+An example to clear the data is::
+
+  # echo 0 > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/data
+
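
The read and clear steps above, together with the rule that the data can only be cleared while FW logging is disabled, could be wrapped in a small helper like the one below. This is a sketch under the interface described in this patch version; fwlog_capture is a hypothetical name, and both the fwlog directory and the output file are arguments.

```shell
# Hypothetical helper, not part of the patch.
# $1 is the fwlog debugfs directory, $2 is the file to save the log to.
fwlog_capture() {
    fwlog_dir=$1
    out_file=$2
    # Stop logging first: the data can only be cleared while FW logging
    # is disabled.
    echo 0 > "$fwlog_dir/enable" || return 1
    # Save the binary log before clearing it.
    cat "$fwlog_dir/data" > "$out_file" || return 1
    # Writing to 'data' clears the stored log.
    echo 0 > "$fwlog_dir/data"
}
```

The saved file is opaque binary data, so the helper copies it as-is and does not try to interpret it.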
+Changing how often the log events are sent to the driver
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The driver receives FW log data from the Admin Receive Queue (ARQ). The
+frequency that the FW sends the ARQ events can be configured by writing to
+'fwlog/resolution'. The range is 1-128 (1 means push every log message, 128
+means push only when the max AQ command buffer is full). The suggested value is
+10. The user can see what the value is configured to by reading
+'fwlog/resolution'. An example to set the value is::
+
+  # echo 50 > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/resolution

Resolution doesn't sound quite right, batch_size maybe?


I agree. Resolution is the term the FW team uses, but I'll change this to some other name.
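
Whatever the file ends up being called, the documented range of 1-128 is easy to check in userspace before writing. The sketch below is only an illustration: fwlog_set_resolution is a made-up helper name, and the target file is passed as an argument so the example does not depend on the final file name.

```shell
# Hypothetical helper, not part of the patch.
# $1 is the file to write to, $2 is the requested value.
fwlog_set_resolution() {
    res_file=$1
    value=$2
    # Reject anything that is not a plain decimal number.
    case $value in
        ''|*[!0-9]*) echo "not a number: $value" >&2; return 1 ;;
    esac
    # The documented range is 1-128 (1 pushes every log message, 128
    # pushes only when the max AQ command buffer is full).
    if [ "$value" -lt 1 ] || [ "$value" -gt 128 ]; then
        echo "out of range (1-128): $value" >&2
        return 1
    fi
    echo "$value" > "$res_file"
}
```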

+Configuring the number of buffers used to store FW log data
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The driver stores FW log data in a ring within the driver. The default size of
+the ring is 256 4K buffers. Some use cases may require more or less data so
+the user can change the number of buffers that are allocated for FW log data.
+To change the number of buffers write to 'fwlog/nr_buffs'. The value must be one
+of: 64, 128, 256, or 512. FW logging must be disabled to change the value. An
+example of changing the value is::
+
+  # echo 128 > /sys/kernel/debug/ice/0000\:18\:00.0/fwlog/nr_buffs

Why 4K? The number of buffers is irrelevant to the user, why not let
the user configure the size in bytes (which is how much DRAM the
driver will hold hostage)?

I'm trying to keep the numbers small for the user :). I could say 1048576 bytes (256 x 4096), but those kinds of numbers get unwieldy to a user (IMO).

The FW logs generate a LOT of data depending on what modules are enabled so we typically need a lot of buffers to handle them.

In the past we have tried to use the syslog mechanism, but we generate SO much data that we overwhelm that and lose data. That's why the idea of using static buffers is appealing to us. We could still overrun the buffers, but at least we will have contiguous data. The problem then becomes one of allocating enough space for what the user is trying to catch instead of trying to start/stop logging and hoping you get all the events in the log.

I can drop the mention of 4K buffers in the documentation. Or we could use terms like 1M, 2M, 512K, etc. That would require string parsing in the driver, though, and I'm trying to avoid that if possible. What do you think?
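
For reference, the mapping between the allowed nr_buffs values and the memory they represent can be worked out in userspace without any parsing in the driver. The sketch below assumes each buffer is 4096 bytes, as in the 256 x 4096 figure above; the function name nr_buffs_to_size is made up for the example.

```shell
# Assumed buffer size: "4K" taken to mean 4096 bytes.
BUF_SIZE=4096

# Hypothetical helper: print the total ring size for a buffer count,
# in bytes and as a K/M shorthand.
nr_buffs_to_size() {
    bytes=$(( $1 * BUF_SIZE ))
    if [ $(( bytes % 1048576 )) -eq 0 ]; then
        echo "$bytes bytes ($(( bytes / 1048576 ))M)"
    else
        echo "$bytes bytes ($(( bytes / 1024 ))K)"
    fi
}

# The four values the documentation allows.
for n in 64 128 256 512; do
    echo "nr_buffs=$n -> $(nr_buffs_to_size "$n")"
done
# nr_buffs=64 -> 262144 bytes (256K)
# nr_buffs=128 -> 524288 bytes (512K)
# nr_buffs=256 -> 1048576 bytes (1M)
# nr_buffs=512 -> 2097152 bytes (2M)
```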



