Hi,
On 9/14/20 8:07 PM, Alexander Duyck wrote:
On Mon, Sep 14, 2020 at 6:42 AM Hans de Goede <hdegoede@xxxxxxxxxx> wrote:
Hi,
On 9/11/20 9:45 PM, David E. Box wrote:
From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
Add support for the Intel Platform Monitoring Technology crashlog
interface. This interface provides a few sysfs values to allow for
controlling the crashlog telemetry interface as well as a character driver
to allow for mapping the crashlog memory region so that it can be accessed
after a crashlog has been recorded.
This driver is meant to only support the server version of the crashlog
which is identified as crash_type 1 with a version of zero. Currently no
other types are supported.
Signed-off-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
Signed-off-by: David E. Box <david.e.box@xxxxxxxxxxxxxxx>
---
.../ABI/testing/sysfs-class-pmt_crashlog | 66 ++
drivers/platform/x86/Kconfig | 10 +
drivers/platform/x86/Makefile | 1 +
drivers/platform/x86/intel_pmt_crashlog.c | 588 ++++++++++++++++++
4 files changed, 665 insertions(+)
create mode 100644 Documentation/ABI/testing/sysfs-class-pmt_crashlog
create mode 100644 drivers/platform/x86/intel_pmt_crashlog.c
diff --git a/Documentation/ABI/testing/sysfs-class-pmt_crashlog b/Documentation/ABI/testing/sysfs-class-pmt_crashlog
new file mode 100644
index 000000000000..40fb4ff437a6
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-pmt_crashlog
@@ -0,0 +1,66 @@
+What: /sys/class/pmt_crashlog/
+Date: September 2020
+KernelVersion: 5.10
+Contact: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
+Description:
+ The pmt_crashlog/ class directory contains information
+ for devices that expose crashlog capabilities using the Intel
+ Platform Monitoring Technology (PTM).
+
+What: /sys/class/pmt_crashlog/crashlogX
+Date: September 2020
+KernelVersion: 5.10
+Contact: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
+Description:
+ The crashlogX directory contains files for configuring an
+ instance of a PMT crashlog device that can perform crash data
+ recoring. Each crashlogX device has an associated
+ /dev/crashlogX device node. This node can be opened and mapped
+ to access the resulting crashlog data. The register layout for
+ the log can be determined from an XML file of specified guid
+ for the parent device.
+
+What: /sys/class/pmt_crashlog/crashlogX/guid
+Date: September 2020
+KernelVersion: 5.10
+Contact: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
+Description:
+ (RO) The guid for this crashlog device. The guid identifies the
+ version of the XML file for the parent device that should be
+ used to determine the register layout.
+
+What: /sys/class/pmt_crashlog/crashlogX/size
+Date: September 2020
+KernelVersion: 5.10
+Contact: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
+Description:
+ (RO) The length of the result buffer in bytes that corresponds
+ to the mapping size for the /dev/crashlogX device node.
+
+What: /sys/class/pmt_crashlog/crashlogX/offset
+Date: September 2020
+KernelVersion: 5.10
+Contact: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
+Description:
+ (RO) The offset of the buffer in bytes that corresponds
+ to the mapping for the /dev/crashlogX device node.
+
+What: /sys/class/pmt_crashlog/crashlogX/enable
+Date: September 2020
+KernelVersion: 5.10
+Contact: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
+Description:
+ (RW) Boolean value controlling if the crashlog functionality
+ is enabled for the /dev/crashlogX device node.
+
+What: /sys/class/pmt_crashlog/crashlogX/trigger
+Date: September 2020
+KernelVersion: 5.10
+Contact: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
+Description:
+ (RW) Boolean value controlling the triggering of the
+ /dev/crashlogX device node. When read it provides data on if
+ the crashlog has been triggered. When written to it can be
+ used to either clear the current trigger by writing false, or
+ to trigger a new event if the trigger is not currently set.
+
Both the pmt_crashlog and the attributes suggest that this is highly
Intel PMT specific. /sys/class/foo interfaces are generally speaking
meant to be generic interfaces.
If this was defining a generic, vendor and implementation agnostic interface for
configuring / accessing crashlogs, then using a class would be fine, but that
is not the case, so I believe that this should not implement / register a class.
Since the devices are instantiated through MFD there already is a
static sysfs-path which can be used to find the device in sysfs:
/sys/bus/platform/device/pmt_crashlog
So you can register the sysfs attributes directly under the platform_device
and then userspace can easily find them, so there really is no need to
use a class here.
I see. So we change the root directory from "/sys/class/pmt_crashlog/"
to "/sys/bus/platform/device/pmt_crashlog" while retaining the same
functionality. That should be workable.
Ack.
<snip>
+static const struct file_operations pmt_crashlog_fops = {
+ .owner = THIS_MODULE,
+ .open = pmt_crashlog_open,
+ .mmap = pmt_crashlog_mmap,
mmap but no read, I guess read may be emulated through mmap,
is that the case ?
I can see sysadmins wanting to be able to do a simple cat
on this file to get the logs (including headers), so if
the kernel-core does not emulate read in this case, you
should really add read support I guess.
So first the contents of the crashlog are not really human readable,
so it is not likely that they would "cat" the contents.
Sorry, I was not really clear there, what I meant is a sysadmin doing
something like this:
cat /sys/.../crashlog-file > /mnt/external-usb-disk/server-foo-bar-crashlog20200917
So that they can easily save the crashlog for later reference without
needing to install special tools.
Also I don't
believe it is a very common thing to provide read access if we don't
know the memory layout of the region. If you take a look at the
handling for resourceN in
pci_create_attr(https://elixir.bootlin.com/linux/v5.9-rc4/source/drivers/pci/pci-sysfs.c#L1127)
it looks like it does something similar where it only provides mmap
for MMIO access.
That was meant as a way to allow doing hardware-drivers in userspace
(think old userspace modesetting Xorg/xfree86) without needing to
call iopl and on non-x86 platforms which don't have iopl.
Also how big are these files ? sysfs also supports binary
files, so unless these files are huge / this is really
performance critical it may make more sense to just add
a binary sysfs attr for this and get rid of the whole chardev
all together.
So for the file we are looking at the minimum of a page up to multiple
pages of data. It largely depends on how much information is collected
by the crashlog agent. I can take a look and see if we can do it. Odds
are it shouldn't be too different from how resourceN is done for the
PCI devices.
Ok, a few pages of data should not be an issue for a binary sysfs
file at all.
<snip>
+ entry->devid = ida_simple_get(entry->ida, 0, 0, GFP_KERNEL);
+ if (entry->devid < 0)
+ return entry->devid;
+
+ ret = pmt_crashlog_make_dev(priv, entry);
+ if (ret) {
+ ida_simple_remove(entry->ida, entry->devid);
+ return ret;
+ }
Hmm wait, you are making one chardev per log entry ? Then just using
binary sysfs attributes seems to make even more sense to me.
Yes we are required to create one per log entry as each one can be
accessed independently.
That is fine, but then at least to me, using sysfs binary files, seems to make
a lot more sense then creating multiple char devices for this.
Regards,
Hans