On 9.02.2024 19:06, Elliot Berman wrote:
> On Fri, Feb 09, 2024 at 04:55:36PM +0530, Dibakar Singh wrote:
>> Expose an API qcom_scm_assign_table to allow client drivers to batch
>> multiple memory regions in a single hyp assign call.
>>
>> In the existing situation, if our goal is to process an sg_table and
>> transfer its ownership from the current VM to a different VM, we have
>> a couple of strategies. The first strategy involves processing the
>> entire sg_table at once and then transferring the ownership. However,
>> this method may have an adverse impact on the system because NS
>> interrupts are disabled for the duration of the SMC call, and that
>> duration can be significant when dealing with large sg_tables. To
>> address this issue, we can adopt a second strategy, which processes
>> each sg_list in the sg_table individually and reassigns memory
>> ownership one entry at a time. Although this method is slower and
>> potentially impacts performance, it does not keep NS interrupts
>> disabled for an extended period.
>>
>> A more efficient strategy is to process the sg_table in batches. This
>> approach addresses both concerns: because memory is processed in
>> batches, NS interrupts are not kept disabled for a prolonged duration
>> even for large sg_tables, and because several regions are handled per
>> call, it is faster than processing each entry individually. The
>> observations from testing both approaches for performance are as
>> follows:
>>
>> Allocation Size/               256MB       512MB       1024MB
>> Algorithm Used              =========== =========== ============
>>
>> Processing each sg_list      73708(us)   149289(us)  266964(us)
>> in sg_table one by one
>>
>> Processing sg_table in       46925(us)    92691(us)  176893(us)
>> batches
>>
>> This implementation serves as a wrapper around the helper function
>> __qcom_scm_assign_mem, which takes an sg_list and processes it in
>> batches. We've set the limit to a minimum of 32 sg_list entries in a
>> batch or a total batch size of 512 pages. The selection of these
>> numbers is heuristic, based on the test runs conducted. Opting for a
>> smaller number would compromise performance, while a larger number
>> would result in non-secure interrupts being disabled for an extended
>> duration.
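
For illustration, the batching policy described above boils down to
roughly the loop below. This is only a sketch, not code from this
patch: MAX_BATCH_ENTRIES, MAX_BATCH_PAGES and do_assign_batch() are
made-up names standing in for whatever the series actually uses.

#define MAX_BATCH_ENTRIES	32	/* illustrative: "32 sg_list entries" */
#define MAX_BATCH_PAGES		512	/* illustrative: "512 pages" */

static int assign_table_in_batches(struct sg_table *table)
{
	struct scatterlist *sgl, *batch_start = table->sgl;
	unsigned int batch_entries = 0;
	size_t batch_pages = 0;
	int ret;

	for (sgl = table->sgl; sgl; sgl = sg_next(sgl)) {
		batch_pages += PAGE_ALIGN(sgl->length) >> PAGE_SHIFT;
		batch_entries++;

		/* Flush the batch once either limit is hit or the table ends */
		if (batch_entries == MAX_BATCH_ENTRIES ||
		    batch_pages >= MAX_BATCH_PAGES || !sg_next(sgl)) {
			/* hypothetical helper: one SMC call covers the batch */
			ret = do_assign_batch(batch_start, batch_entries);
			if (ret)
				return ret;

			batch_start = sg_next(sgl);
			batch_entries = 0;
			batch_pages = 0;
		}
	}

	return 0;
}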
>>
>> Co-developed-by: Prakruthi Deepak Heragu <quic_pheragu@xxxxxxxxxxx>
>> Signed-off-by: Prakruthi Deepak Heragu <quic_pheragu@xxxxxxxxxxx>
>> Signed-off-by: Dibakar Singh <quic_dibasing@xxxxxxxxxxx>
>> ---
>>  drivers/firmware/qcom/qcom_scm.c       | 211 +++++++++++++++++++++++++
>>  include/linux/firmware/qcom/qcom_scm.h |   7 +
>>  2 files changed, 218 insertions(+)
>>
>> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
>> index 520de9b5633a..038b96503d65 100644
>> --- a/drivers/firmware/qcom/qcom_scm.c
>> +++ b/drivers/firmware/qcom/qcom_scm.c
>> @@ -21,6 +21,8 @@
>>  #include <linux/platform_device.h>
>>  #include <linux/reset-controller.h>
>>  #include <linux/types.h>
>> +#include <linux/scatterlist.h>
>> +#include <linux/slab.h>
>>  
>>  #include "qcom_scm.h"
>>  
>> @@ -1048,6 +1050,215 @@ int qcom_scm_assign_mem(phys_addr_t mem_addr, size_t mem_sz,
>>  }
>>  EXPORT_SYMBOL_GPL(qcom_scm_assign_mem);
>>  
>> +/**
>> + * qcom_scm_assign_mem_batch() - Make a secure call to reassign memory
>> + *				 ownership of several memory regions
>> + * @mem_regions: A buffer describing the set of memory regions that need to
>> + *		 be reassigned
>> + * @nr_mem_regions: The number of memory regions that need to be reassigned
>> + * @srcvms: A buffer populated with the vmid(s) of the current set of
>> + *	    owners
>> + * @src_sz: The size of the srcvms buffer (in bytes)
>> + * @destvms: A buffer populated with the new owners and corresponding
>> + *	     permission flags
>> + * @dest_sz: The size of the destvms buffer (in bytes)
>> + *
>> + * Return negative errno on failure, 0 on success.
>> + */
>> +static int qcom_scm_assign_mem_batch(struct qcom_scm_mem_map_info *mem_regions,
>> +				     size_t nr_mem_regions, u32 *srcvms,
>> +				     size_t src_sz,
>> +				     struct qcom_scm_current_perm_info *destvms,
>> +				     size_t dest_sz)
>> +{
>> +	dma_addr_t mem_dma_addr;
>> +	size_t mem_regions_sz;
>> +	int ret = 0, i;
>> +
>> +	for (i = 0; i < nr_mem_regions; i++) {
>> +		mem_regions[i].mem_addr = cpu_to_le64(mem_regions[i].mem_addr);
>> +		mem_regions[i].mem_size = cpu_to_le64(mem_regions[i].mem_size);
>> +	}
>> +
>> +	mem_regions_sz = nr_mem_regions * sizeof(*mem_regions);
>> +	mem_dma_addr = dma_map_single(__scm->dev, mem_regions, mem_regions_sz,
>> +				      DMA_TO_DEVICE);
>> +	if (dma_mapping_error(__scm->dev, mem_dma_addr)) {
>> +		dev_err(__scm->dev, "mem_dma_addr mapping failed\n");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	ret = __qcom_scm_assign_mem(__scm->dev, virt_to_phys(mem_regions),
>> +				    mem_regions_sz, virt_to_phys(srcvms), src_sz,
>> +				    virt_to_phys(destvms), dest_sz);
>> +
>> +	dma_unmap_single(__scm->dev, mem_dma_addr, mem_regions_sz, DMA_TO_DEVICE);
>> +	return ret;
>> +}
>> +
>> +/**
>> + * qcom_scm_prepare_mem_batch() - Prepare batches of memory regions
>> + * @table: A scatter list whose memory needs to be reassigned
>> + * @srcvms: A buffer populated with the vmid(s) of the current set of
>> + *	    owners
>> + * @nr_src: The number of entries in the srcvms buffer
>> + * @destvms: A buffer populated with the vmid(s) of the new owners
>> + * @destvms_perms: A buffer populated with the permission flags of the new owners
>> + * @nr_dest: The number of entries in the destvms buffer
>> + * @last_sgl: Denotes the last scatterlist element. Used in case of rollback
>> + * @roll_back: Identifies whether we are executing a rollback in case of failure
>> + *
>> + * Return negative errno on failure, 0 on success.
>> + */
>> +static int qcom_scm_prepare_mem_batch(struct sg_table *table,
>> +				      u32 *srcvms, int nr_src,
>> +				      int *destvms, int *destvms_perms,
>> +				      int nr_dest,
>> +				      struct scatterlist *last_sgl, bool roll_back)
>> +{
>> +	struct qcom_scm_current_perm_info *destvms_cp;
>> +	struct qcom_scm_mem_map_info *mem_regions_buf;
>> +	struct scatterlist *curr_sgl = table->sgl;
>> +	dma_addr_t source_dma_addr, dest_dma_addr;
>> +	size_t batch_iterator;
>> +	size_t batch_start = 0;
>> +	size_t destvms_cp_sz;
>> +	size_t srcvms_cp_sz;
>> +	size_t batch_size;
>> +	u32 *srcvms_cp;
>> +	int ret = 0;
>> +	int i;
>> +
>> +	if (!table || !table->sgl || !srcvms || !nr_src ||
>> +	    !destvms || !destvms_perms || !nr_dest || !table->nents)
>> +		return -EINVAL;
>> +
>> +	srcvms_cp_sz = sizeof(*srcvms_cp) * nr_src;
>> +	srcvms_cp = kmemdup(srcvms, srcvms_cp_sz, GFP_KERNEL);
>> +	if (!srcvms_cp)
>> +		return -ENOMEM;
>> +
>> +	for (i = 0; i < nr_src; i++)
>> +		srcvms_cp[i] = cpu_to_le32(srcvms_cp[i]);
>> +
>> +	source_dma_addr = dma_map_single(__scm->dev, srcvms_cp,
>> +					 srcvms_cp_sz, DMA_TO_DEVICE);
>
> Please use the new tzmem allocator:
>
> https://lore.kernel.org/all/20240205182810.58382-1-brgl@xxxxxxxx/

And the new __free annotations, please

Konrad
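
For reference, a minimal sketch of the __free()-based cleanup pattern
being asked for, applied to the srcvms_cp copy above. This is
illustrative only and not tested against this series; it assumes
<linux/cleanup.h> plus the kfree cleanup hook from <linux/slab.h>, and
example_dup_and_convert() is a made-up name, not a function from the
patch.

static int example_dup_and_convert(const u32 *srcvms, int nr_src)
{
	/* Freed automatically when the variable goes out of scope */
	u32 *srcvms_cp __free(kfree) = kmemdup(srcvms,
					       nr_src * sizeof(*srcvms),
					       GFP_KERNEL);
	int i;

	if (!srcvms_cp)
		return -ENOMEM;

	for (i = 0; i < nr_src; i++)
		srcvms_cp[i] = cpu_to_le32(srcvms_cp[i]);

	/* ... hand srcvms_cp to the SCM/tzmem code here ... */

	return 0;	/* kfree(srcvms_cp) happens on every return path */
}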