Dan Williams wrote:
> Ira Weiny wrote:
> > BIOS can configure memory devices as firmware first. This will send CXL
> > events to the firmware instead of the OS. The firmware can then send
> > these events to the OS via UEFI. Currently a configuration such as this
> > will trace a non-standard event in the log. Using the specific CXL
> > trace points with the additional information CXL can provide is much
                          ^^^^^^^^^^^^^^^^^^^^^^^^
See below.

> > more useful to users.
> > Specifically, future support can be added to CXL to
> > provide the DPA to HPA mapping configured at the time of the event.
>
> One could argue that support should have happened first and taken the
> event all the way to EDAC, so this needs to be merged on faith that
> those patches are in flight.

Sure, but then you also don't get the additional information already
provided by the CXL trace points.

Reading this back I see that my use of 'Specifically' masked the fact
that the CXL code already has better support to decode these records
and we want to leverage that.

>
> > UEFI v2.10 section N.2.14 defines a Common Platform Error Record (CPER)
> > format for CXL Component Events. The format is mostly the same as the
> > CXL Common Event Record Format. The difference is the use of a GUID in
> > the Section Type rather than a UUID as part of the event itself.
> >
> > Add GHES support to detect CXL CPER records and call into the CXL code
> > to process the event.
> >
> > Multiple methods were considered for the call into the CXL code. A
> > notifier chain was considered for the callback but the complexity did
> > not justify the use case.
>
> Not sure what this adds. If you want to talk about alternatives and
> tradeoffs, great, but that should be a comparative analysis in support
> of the chosen direction.

I'll remove it. This was carried over from the previous versions.

>
> > Furthermore, the CXL code is required to be called from process
> > context as it needs to take a device lock so a simple callback
> > register proved difficult. Dan Williams suggested using 2 work items
> > as an atomic way of switching between a callback being registered and
> > not. This allows the callback to run without any locking.[1]
> >
> > Note that a local work item is required to dump any messages seen during
> > a race between any check done in cxl_cper_post_event() and the
> > scheduling of work. That said, no attempt is made to stop the addition
> > of messages into the kfifo because this local work item provides a hook
> > to add a local CXL CPER trace point in a future patch.
> >
> > This new combined patch addresses the report by Dan Carpenter[2]. Thus
> > the Reported-by tag.
> >
> > [1] https://lore.kernel.org/all/65d111eb87115_6c745294ac@xxxxxxxxxxxxxxxxxxxxxxxxx.notmuch/
> > [2] https://lore.kernel.org/all/b963c490-2c13-4b79-bbe7-34c6568423c7@moroto.mountain/
> >
> > Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > Cc: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
> > Cc: Tony Luck <tony.luck@xxxxxxxxx>
> > Cc: Borislav Petkov <bp@xxxxxxxxx>
> > Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
>
> checkpatch will whine about a missing Closes: tag.

I know. How about Suggested-by? I want to give Dan credit because your
revert already closed the actual bug.
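
For reference, the "2 work items" switch described in the commit message
boils down to swapping a single work_struct pointer on register /
unregister. Roughly like this, building on the declarations in the hunk
below; this is only a sketch and the register/unregister helper names
are illustrative, not what the patch posts:

	/* event path only ever schedules whichever item the pointer names */
	static struct work_struct *cxl_cper_work = &cxl_local_work;

	int cxl_cper_register_callback(cxl_cper_callback callback)
	{
		if (cper_callback)
			return -EINVAL;
		cper_callback = callback;
		/* single pointer store flips which work item future events run */
		WRITE_ONCE(cxl_cper_work, &cxl_cb_work);
		return 0;
	}

	void cxl_cper_unregister_callback(void)
	{
		WRITE_ONCE(cxl_cper_work, &cxl_local_work);
		/* wait out a callback work item that may already be running */
		cancel_work_sync(&cxl_cb_work);
		cper_callback = NULL;
	}

The intent, per the commit message, is that the callback itself then
runs without taking any lock of its own.
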
[snip]

> > +
> > +DEFINE_KFIFO(cxl_cper_fifo, struct cxl_cper_work_data, 32);
> > +static DEFINE_SPINLOCK(cxl_cper_read_lock);
> > +static DEFINE_SPINLOCK(cxl_cper_write_lock);
> > +
> > +static cxl_cper_callback cper_callback;
> > +/* cb function dumps the records */
> > +static void cxl_cper_cb_fn(struct work_struct *work)
> > +{
> > +	struct cxl_cper_work_data wd;
> > +
> > +	while (kfifo_out_spinlocked(&cxl_cper_fifo, &wd, 1,
> > +				    &cxl_cper_read_lock)) {
>
> Why is this taking the lock on retrieval? The work item is
> single-threaded. The only potential contention is between
> cxl_cper_local_fn() and cxl_cper_cb_fn() collision, but that can be
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Because of this collision, yes.

> handled by a cancel_work_sync(&cxl_local_work) on registration to pair
> with the cancel_work_sync(&cxl_cb_work) on unregistration.

Ok, yes, I could use cancel_work_sync().

>
> ...but I am not sure 2 work items are needed unless some default
> processing is going to happen in the "local" case.

I thought about merging patch 3 into this one and that would make the
use of this work function more clear.

>
> > +		cper_callback(wd.event_type, &wd.rec);
> > +	}
> > +}
> > +static DECLARE_WORK(cxl_cb_work, cxl_cper_cb_fn);
> > +
> > +static void cxl_cper_local_fn(struct work_struct *work)
> > +{
> > +	struct cxl_cper_work_data wd;
> > +
> > +	while (kfifo_out_spinlocked(&cxl_cper_fifo, &wd, 1,
> > +				    &cxl_cper_read_lock)) {
>
> This just looks like open coded / less efficient kfifo_reset_out().
>
> > +		/* drop msg */
>
> If the proposal is to do nothing when no callback is registered then no
> need to have cxl_local_work at all.

Patch 4 makes use of this.

>
> > +	}
> > +}
> > +static DECLARE_WORK(cxl_local_work, cxl_cper_local_fn);
> > +
> > +/* Pointer for atomic switch of record processing */
> > +struct work_struct *cxl_cper_work = &cxl_local_work;
> > +
> > +static void cxl_cper_post_event(enum cxl_event_type event_type,
> > +				struct cxl_cper_event_rec *rec)
> > +{
> > +	struct cxl_cper_work_data wd;
> > +
> > +	if (rec->hdr.length <= sizeof(rec->hdr) ||
> > +	    rec->hdr.length > sizeof(*rec)) {
> > +		pr_err(FW_WARN "CXL CPER Invalid section length (%u)\n",
> > +		       rec->hdr.length);
> > +		return;
> > +	}
> > +
> > +	if (!(rec->hdr.validation_bits & CPER_CXL_COMP_EVENT_LOG_VALID)) {
> > +		pr_err(FW_WARN "CXL CPER invalid event\n");
> > +		return;
> > +	}
> > +
> > +	wd.event_type = event_type;
> > +	memcpy(&wd.rec, rec, sizeof(wd.rec));
>
> Unfortunate to have a double copy of the record into the stack variable
> and then again into the kfifo, but I can not immediately think of a way
> around that.

Yep. I could not either.

>
> > +
> > +	kfifo_in_spinlocked(&cxl_cper_fifo, &wd, 1, &cxl_cper_write_lock);
> > +	schedule_work(cxl_cper_work);
>
> I think you don't need 2 work items if you write it this way:
>
> 	spin_lock_irqsave(&cxl_cper_write_lock, flags);
> 	if (cxl_cper_work) {
> 		if (kfifo_put(&cxl_cper_fifo, wd))
> 			schedule_work(cxl_cper_work);
> 		else
> 			pr_err_ratelimited(
> 				"buffer overflow when queuing CXL CPER record\n");
> 	}
> 	spin_unlock_irqrestore(&cxl_cper_write_lock, flags);
>
> ...and then take the write lock when modifying that cxl_cper_work
> pointer between NULL and non-NULL.

I'll merge patch 4 here. Then this will be used.

Ira
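
P.S. With your single work item suggestion, I think the register /
unregister side would look roughly like this, with cxl_cper_work now
starting out NULL (sketch only, assuming the declarations from the hunk
above; the helper names are illustrative):

	static DECLARE_WORK(cxl_cb_work, cxl_cper_cb_fn);
	/* NULL means no callback is registered */
	static struct work_struct *cxl_cper_work;

	int cxl_cper_register_callback(cxl_cper_callback callback)
	{
		unsigned long flags;

		if (cper_callback)
			return -EINVAL;
		cper_callback = callback;

		/* flip the pointer non-NULL under the lock the event path takes */
		spin_lock_irqsave(&cxl_cper_write_lock, flags);
		cxl_cper_work = &cxl_cb_work;
		spin_unlock_irqrestore(&cxl_cper_write_lock, flags);

		return 0;
	}

	void cxl_cper_unregister_callback(void)
	{
		unsigned long flags;

		/* stop new events from scheduling the work item */
		spin_lock_irqsave(&cxl_cper_write_lock, flags);
		cxl_cper_work = NULL;
		spin_unlock_irqrestore(&cxl_cper_write_lock, flags);

		/* flush a work item that may already have been scheduled */
		cancel_work_sync(&cxl_cb_work);
		cper_callback = NULL;
	}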