On Thu, Oct 10, 2024 at 12:48:09PM -0700, Tomasz Jeznach wrote:
> Introduce device command submission and fault reporting queues, as
> described in Chapters 3.1 and 3.2 of the RISC-V IOMMU Architecture
> Specification.
>
> Command and fault queues are instantiated in contiguous system memory
> local to the IOMMU device domain, or mapped from the fixed I/O space
> provided by the hardware implementation. Detection of the location and
> maximum allowed size of the queue uses the WARL properties of the queue
> base control register. The driver will try to allocate up to 128KB of
> system memory, while respecting the hardware-supported maximum queue
> size.
>
> Interrupt allocation is based on interrupt vector availability and is
> distributed to all queues in a simple round-robin fashion. For hardware
> implementations with a fixed event-type-to-interrupt-vector assignment,
> the IVEC WARL property is used to discover such mappings.
>
> Address translation, command, and queue fault handling in this change
> are limited to simple fault reporting without taking any action.
>
> Reviewed-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> Reviewed-by: Zong Li <zong.li@xxxxxxxxxx>
> Signed-off-by: Tomasz Jeznach <tjeznach@xxxxxxxxxxxx>
> ---
>  drivers/iommu/riscv/iommu-bits.h |  75 +++++
>  drivers/iommu/riscv/iommu.c      | 507 ++++++++++++++++++++++++++++++-
>  drivers/iommu/riscv/iommu.h      |  21 ++
>  3 files changed, 601 insertions(+), 2 deletions(-)

[...]

> +/* Enqueue an entry and wait to be processed if timeout_us > 0
> + *
> + * Error handling for IOMMU hardware not responding in reasonable time
> + * will be added as separate patch series along with other RAS features.
> + * For now, only report hardware failure and continue.
> + */
> +static unsigned int riscv_iommu_queue_send(struct riscv_iommu_queue *queue,
> +					   void *entry, size_t entry_size)
> +{
> +	unsigned int prod;
> +	unsigned int head;
> +	unsigned int tail;
> +	unsigned long flags;
> +
> +	/* Do not preempt submission flow. */
> +	local_irq_save(flags);
> +
> +	/* 1. Allocate some space in the queue */
> +	prod = atomic_inc_return(&queue->prod) - 1;
> +	head = atomic_read(&queue->head);
> +
> +	/* 2. Wait for space availability. */
> +	if ((prod - head) > queue->mask) {
> +		if (readx_poll_timeout(atomic_read, &queue->head,
> +				       head, (prod - head) < queue->mask,
> +				       0, RISCV_IOMMU_QUEUE_TIMEOUT))
> +			goto err_busy;
> +	} else if ((prod - head) == queue->mask) {
> +		const unsigned int last = Q_ITEM(queue, head);
> +
> +		if (riscv_iommu_readl_timeout(queue->iommu, Q_HEAD(queue), head,
> +					      !(head & ~queue->mask) && head != last,
> +					      0, RISCV_IOMMU_QUEUE_TIMEOUT))
> +			goto err_busy;
> +		atomic_add((head - last) & queue->mask, &queue->head);
> +	}
> +
> +	/* 3. Store entry in the ring buffer. */
> +	memcpy(queue->base + Q_ITEM(queue, prod) * entry_size, entry, entry_size);
> +
> +	/* 4. Wait for all previous entries to be ready */
> +	if (readx_poll_timeout(atomic_read, &queue->tail, tail, prod == tail,
> +			       0, RISCV_IOMMU_QUEUE_TIMEOUT))
> +		goto err_busy;
> +
> +	/* 5. Complete submission and restore local interrupts */
> +	dma_wmb();
> +	riscv_iommu_writel(queue->iommu, Q_TAIL(queue), Q_ITEM(queue, prod + 1));

Please explain why a dma_wmb() is sufficient to order the memcpy() stores
before the tail update.

> +	atomic_inc(&queue->tail);

I think this can be reordered before the relaxed MMIO write to tail,
causing other CPUs to exit their polling early.

Will
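
One possible shape for step 5 that makes both orderings explicit is
sketched below. It reuses the names from the quoted patch and assumes
riscv_iommu_writel() is a relaxed accessor along the lines of
writel_relaxed(); the local_irq_restore() is only implied by the step 5
comment, and the choice of mandatory wmb()/mb() barriers here is a
suggestion to be checked against the RISC-V I/O ordering rules, not a
tested fix:

	/* 5. Complete submission and restore local interrupts */

	/* Order the step 3 ring-buffer stores before the MMIO doorbell.
	 * dma_wmb() only orders normal memory writes against each other,
	 * so use a mandatory write barrier ahead of the relaxed accessor.
	 */
	wmb();
	riscv_iommu_writel(queue->iommu, Q_TAIL(queue), Q_ITEM(queue, prod + 1));

	/* Keep the shadow tail update after the doorbell write, so that
	 * CPUs polling queue->tail in step 4 cannot ring their own
	 * doorbell before this one has been issued to the device.
	 */
	mb();
	atomic_inc(&queue->tail);

	local_irq_restore(flags);

Switching to a non-relaxed writel()-style accessor would make the first
barrier implicit; the full barrier before atomic_inc() is deliberately
heavy, and whether something cheaper is sufficient on RISC-V is a
separate question.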