On Fri, 2023-08-18 at 00:42 +0200, Jann Horn wrote:
> *** CAUTION: This email originates from a source not known to Imagination Technologies. Think before you click a link or open an attachment ***
>
> Hi!
>
> Thanks, I think it's great that Imagination is writing an upstream
> driver for PowerVR. :)
>
> On Wed, Aug 16, 2023 at 10:25 AM Sarah Walker <sarah.walker@xxxxxxxxxx> wrote:
> > +#define PVR_BO_CPU_CACHED BIT_ULL(63)
> > +
> > +#define PVR_BO_FW_NO_CLEAR_ON_RESET BIT_ULL(62)
> > +
> > +/* Bits 62..3 are undefined. */
> > +/* Bits 2..0 are defined in the UAPI. */
> > +
> > +/* Other utilities. */
> > +#define PVR_BO_UNDEFINED_MASK GENMASK_ULL(61, 3)
> > +#define PVR_BO_RESERVED_MASK (PVR_BO_UNDEFINED_MASK | GENMASK_ULL(63, 63))
>
> In commit 1a9c568fb559 ("drm/imagination: Rework firmware object
> initialisation") in powervr-next, PVR_BO_FW_NO_CLEAR_ON_RESET (bit 62)
> was added in the kernel-only flags group, but the mask
> PVR_BO_RESERVED_MASK (which is used in pvr_ioctl_create_bo to detect
> kernel-only and reserved flags) looks like it wasn't changed to
> include bit 62. I think that means it becomes possible for userspace
> to pass this bit in via pvr_ioctl_create_bo()?

Yes, this is a bug. Will fix (and refactor a bit).

> > +/**
> > + * pvr_page_table_l2_entry_raw_set() - Write a valid entry into a raw level 2
> > + *                                     page table.
> > + * @entry: Target raw level 2 page table entry.
> > + * @child_table_dma_addr: DMA address of the level 1 page table to be
> > + *                        associated with @entry.
> > + *
> > + * When calling this function, @child_table_dma_addr must be a valid DMA
> > + * address and a multiple of %ROGUE_MMUCTRL_PC_DATA_PD_BASE_ALIGNSIZE.
> > + */
> > +static void
> > +pvr_page_table_l2_entry_raw_set(struct pvr_page_table_l2_entry_raw *entry,
> > +                                dma_addr_t child_table_dma_addr)
> > +{
> > +        child_table_dma_addr >>= ROGUE_MMUCTRL_PC_DATA_PD_BASE_ALIGNSHIFT;
> > +
> > +        entry->val =
> > +                PVR_PAGE_TABLE_FIELD_PREP(2, PC, VALID, true) |
> > +                PVR_PAGE_TABLE_FIELD_PREP(2, PC, ENTRY_PENDING, false) |
> > +                PVR_PAGE_TABLE_FIELD_PREP(2, PC, PD_BASE, child_table_dma_addr);
> > +}
>
> For this function and others that manipulate page table entries,
> please use some kernel helper that ensures that the store can't tear
> (at least WRITE_ONCE() - that can still tear on 32-bit, but I see the
> driver depends on ARM64, so that's not a problem).

Will do.
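For the raw set helpers it will probably end up looking something like the
below (untested sketch, same field layout as the code quoted above):

static void
pvr_page_table_l2_entry_raw_set(struct pvr_page_table_l2_entry_raw *entry,
                                dma_addr_t child_table_dma_addr)
{
        child_table_dma_addr >>= ROGUE_MMUCTRL_PC_DATA_PD_BASE_ALIGNSHIFT;

        /* WRITE_ONCE() so the compiler can't tear the 64-bit store. */
        WRITE_ONCE(entry->val,
                   PVR_PAGE_TABLE_FIELD_PREP(2, PC, VALID, true) |
                   PVR_PAGE_TABLE_FIELD_PREP(2, PC, ENTRY_PENDING, false) |
                   PVR_PAGE_TABLE_FIELD_PREP(2, PC, PD_BASE,
                                             child_table_dma_addr));
}

with the same treatment for the other helpers that write page table entries.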

> > +/**
> > + * pvr_page_table_l2_insert() - Insert an entry referring to a level 1 page
> > + *                              table into a level 2 page table.
> > + * @op_ctx: Target MMU op context pointing at the entry to insert the L1 page
> > + *          table into.
> > + * @child_table: Target level 1 page table to be referenced by the new entry.
> > + *
> > + * It is the caller's responsibility to ensure @op_ctx.curr_page points to a
> > + * valid L2 entry.
> > + */
> > +static void
> > +pvr_page_table_l2_insert(struct pvr_mmu_op_context *op_ctx,
> > +                         struct pvr_page_table_l1 *child_table)
> > +{
> > +        struct pvr_page_table_l2 *l2_table =
> > +                &op_ctx->mmu_ctx->page_table_l2;
> > +        struct pvr_page_table_l2_entry_raw *entry_raw =
> > +                pvr_page_table_l2_get_entry_raw(l2_table,
> > +                                                op_ctx->curr_page.l2_idx);
> > +
> > +        pvr_page_table_l2_entry_raw_set(entry_raw,
> > +                                        child_table->backing_page.dma_addr);
>
> Can you maybe add comments in functions that set page table entries to
> document who is responsible for using a memory barrier (like wmb()) to
> ensure that the creation of a page table entry is ordered after the
> thing it points to is fully initialized, so that the GPU can't end up
> concurrently walking into a page table and observe its old contents
> from before it was zero-initialized?

Will do.

> > +static int
> > +pvr_page_table_l1_get_or_insert(struct pvr_mmu_op_context *op_ctx,
> > +                                bool should_insert)
> > +{
> > +        struct pvr_page_table_l2 *l2_table =
> > +                &op_ctx->mmu_ctx->page_table_l2;
> > +        struct pvr_page_table_l1 *table;
> > +        int err;
> > +
> > +        if (pvr_page_table_l2_entry_is_valid(l2_table,
> > +                                             op_ctx->curr_page.l2_idx)) {
> > +                op_ctx->curr_page.l1_table =
> > +                        l2_table->entries[op_ctx->curr_page.l2_idx];
> > +                return 0;
> > +        }
> > +
> > +        if (!should_insert)
> > +                return -ENXIO;
> > +
> > +        /* Take a prealloced table. */
> > +        table = op_ctx->l1_free_tables;
> > +        if (!table)
> > +                return -ENOMEM;
> > +
> > +        err = pvr_page_table_l1_init(table, op_ctx->mmu_ctx->pvr_dev);
>
> I think when we have a preallocated table here, it was allocated in
> pvr_page_table_l1_alloc(), which already called
> pvr_page_table_l1_init()? So it looks to me like this second
> pvr_page_table_l1_init() call will allocate another page and leak the
> old allocation.

Yes, this is also a bug. Will address.
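The fix will most likely be to drop the second pvr_page_table_l1_init() call
and just pop the already-initialised table off the prealloc list, roughly
like this (sketch only; the "next_free" linkage below is illustrative, not
necessarily the final field name):

        /*
         * Take a prealloced table. It was already initialised by
         * pvr_page_table_l1_alloc(), so initialising it again here would
         * allocate a fresh backing page and leak the original one.
         */
        table = op_ctx->l1_free_tables;
        if (!table)
                return -ENOMEM;

        /* Unlink it from the prealloc list ("next_free" is illustrative). */
        op_ctx->l1_free_tables = table->next_free;
        table->next_free = NULL;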

> > +/**
> > + * pvr_mmu_op_context_create() - Create an MMU op context.
> > + * @ctx: MMU context associated with owning VM context.
> > + * @sgt: Scatter gather table containing pages pinned for use by this context.
> > + * @sgt_offset: Start offset of the requested device-virtual memory mapping.
> > + * @size: Size in bytes of the requested device-virtual memory mapping. For an
> > + *        unmapping, this should be zero so that no page tables are allocated.
> > + *
> > + * Returns:
> > + *  * Newly created MMU op context object on success, or
> > + *  * -%ENOMEM if no memory is available,
> > + *  * Any error code returned by pvr_page_table_l2_init().
> > + */
> > +struct pvr_mmu_op_context *
> > +pvr_mmu_op_context_create(struct pvr_mmu_context *ctx, struct sg_table *sgt,
> > +                          u64 sgt_offset, u64 size)
> > +{
> > +        int err;
> > +
> > +        struct pvr_mmu_op_context *op_ctx =
> > +                kzalloc(sizeof(*op_ctx), GFP_KERNEL);
> > +
> > +        if (!op_ctx)
> > +                return ERR_PTR(-ENOMEM);
> > +
> > +        op_ctx->mmu_ctx = ctx;
> > +        op_ctx->map.sgt = sgt;
> > +        op_ctx->map.sgt_offset = sgt_offset;
> > +        op_ctx->sync_level_required = PVR_MMU_SYNC_LEVEL_NONE;
> > +
> > +        if (size) {
> > +                const u32 l1_start_idx = pvr_page_table_l2_idx(sgt_offset);
> > +                const u32 l1_end_idx = pvr_page_table_l2_idx(sgt_offset + size);
> > +                const u32 l1_count = l1_end_idx - l1_start_idx + 1;
> > +                const u32 l0_start_idx = pvr_page_table_l1_idx(sgt_offset);
> > +                const u32 l0_end_idx = pvr_page_table_l1_idx(sgt_offset + size);
> > +                const u32 l0_count = l0_end_idx - l0_start_idx + 1;
>
> Shouldn't the page table indices be calculated from the device_addr
> (which is not currently passed in by pvr_vm_map())? As far as I can
> tell, sgt_offset doesn't have anything to do with the device address
> at which this mapping will be inserted in the page tables?

This code is correct, but badly documented; this function only cares
about the number of l0/l1 page tables required, not the address they
will be mapped at. Will improve it to make it less confusing.

Thanks,
Sarah