Re: [RFC PATCH] cxl/pci: Set default timeout for background operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Trying to revive this discussion about background commands. I'm sorry if more on the topic has been discussed elsewhere, 
but this is a patch that I could find which related to the same topic.

The requirement for default timeouts (or atleast a mechanism to provide timeouts to the command structure) seems to be needed
since the CXL spec doesn't seem to explicitly deny userspace access to background commands. This seems to constrain a lot of
functionality with respect to Vendor Defined commands as well, if a command needs to be implemented as background.

>> sthanneeru.opensrc@ wrote:
>> > From: Srinivasulu Thanneeru <sthanneeru.opensrc@xxxxxxxxxx>
>> > 
>> > The CXL 3.0 specification outlines background operations,
>> > and support for handling these operations was added in following patch.
>> > 
>> > Link: https://lore.kernel.org/all/20230523170927.20685-5-dave@xxxxxxxxxxxx/
>> > 
>> > Mailbox commands like ‘Log Populate’ use background operations
>> > to complete the execution of the command.
>> > This can lead to a timeout, since there is currently no option
>> > in the ioctl cxl_send_command structure to specify
>> > a timeout value. The default values being zero can lead
>> > to the driver reporting false negatives to the application.
>> > 
>> > This patch aims to establish default values, enabling mailbox commands
>> > that operate in the background to continue functioning even
>> > if a timeout is not set in the userspace application.  
>> 
>> The reason there are no defaults is because userspace is not allowed to
>> issue background commands. The CXL background command definition is
>> awkward in that it allows a single command to monopolize the mailbox for
>> an indefinite amount of time.

I understand this is a design adopted by the kernel, and not actually by the CXL spec (at least I don't see it
specifying explicitly that background commands are not allowed in band).

>> 
>> Instead, the approach taken with Firmware Update and Sanitize is that a
>> kernel sysfs ABI mediates access to the mailbox and facilitates bounded 
>> timeslices between command submissions. It effectively allows the kernel
>> to manage fairness and more importantly preempt userspace if it needs to
>> issue its own commands.
>> 
>> I assume you are only seeing this lack of a default due to building with
>> CONFIG_CXL_MEM_RAW_COMMANDS=y? If yes, "raw" means "raw" and the kernel
>> is mostly taken out of the loop of saving userspace from itself.

This seems to imply that no vendor defined command can be background commands. Please correct me if
I'm wrong, since Vendor defined commands have to be sent as raw commands, and the absence of default timeouts
indicate that the command will be immediately timed out.

Also, I understand the point that background commands (since there can only be one) monopolizes the
mailbox, and hence cannot be exposed to userspace and as I see the background command handling goes synchronously
with the usual command handling in the kernel (ie; foreground command has to wait for the background command to complete or timeout)
but doesn't this defeat the purpose of the spec defined ability to run a foreground command while a background command is
underway?

>> 
>> All that said, ugh, "Log Populate" has no facility to time bound the
>> population of the log. I do not think it is tenable for Linux to
>> surrender mailbox access for an indefinite uninterruptible amount of
>> time... unless you want to handle "Log Populate" like Sanitize where the
>> unbounded background operation is tolerated because the device is taken
>> offline?
>
>It may be pointless to do a component state dump only on an offline device.
>My assumption is this one is hardware debug only.  Patches out of
>tree or behind a really scary sounding config variable perhaps?
>Other than vendor log I don't think populate log applies to the other
>log types yet (they don't mention it anyway!)
>
>Jonathan

Am I wrong to understand that this design decision is still in the open right now?

Another concern was that right now, a WARN_ONCE is issued when a raw command is ensued. Maybe this belongs
on a different thread, but are raw commands to be depreciated (if then how can Vendor Defined commands be sent?)






[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux