On Tue, Jan 14, 2025 at 7:58 PM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote: > > * Suren Baghdasaryan <surenb@xxxxxxxxxx> [250114 22:15]: > > On Tue, Jan 14, 2025 at 6:27 PM Wei Yang <richard.weiyang@xxxxxxxxx> wrote: > > > > > > On Fri, Jan 10, 2025 at 08:26:03PM -0800, Suren Baghdasaryan wrote: > > > > > > >diff --git a/kernel/fork.c b/kernel/fork.c > > > >index 9d9275783cf8..151b40627c14 100644 > > > >--- a/kernel/fork.c > > > >+++ b/kernel/fork.c > > > >@@ -449,6 +449,42 @@ struct vm_area_struct *vm_area_alloc(struct mm_struct *mm) > > > > return vma; > > > > } > > > > > > > >+static void vm_area_init_from(const struct vm_area_struct *src, > > > >+ struct vm_area_struct *dest) > > > >+{ > > > >+ dest->vm_mm = src->vm_mm; > > > >+ dest->vm_ops = src->vm_ops; > > > >+ dest->vm_start = src->vm_start; > > > >+ dest->vm_end = src->vm_end; > > > >+ dest->anon_vma = src->anon_vma; > > > >+ dest->vm_pgoff = src->vm_pgoff; > > > >+ dest->vm_file = src->vm_file; > > > >+ dest->vm_private_data = src->vm_private_data; > > > >+ vm_flags_init(dest, src->vm_flags); > > > >+ memcpy(&dest->vm_page_prot, &src->vm_page_prot, > > > >+ sizeof(dest->vm_page_prot)); > > > >+ /* > > > >+ * src->shared.rb may be modified concurrently when called from > > > >+ * dup_mmap(), but the clone will reinitialize it. > > > >+ */ > > > >+ data_race(memcpy(&dest->shared, &src->shared, sizeof(dest->shared))); > > > >+ memcpy(&dest->vm_userfaultfd_ctx, &src->vm_userfaultfd_ctx, > > > >+ sizeof(dest->vm_userfaultfd_ctx)); > > > >+#ifdef CONFIG_ANON_VMA_NAME > > > >+ dest->anon_name = src->anon_name; > > > >+#endif > > > >+#ifdef CONFIG_SWAP > > > >+ memcpy(&dest->swap_readahead_info, &src->swap_readahead_info, > > > >+ sizeof(dest->swap_readahead_info)); > > > >+#endif > > > >+#ifndef CONFIG_MMU > > > >+ dest->vm_region = src->vm_region; > > > >+#endif > > > >+#ifdef CONFIG_NUMA > > > >+ dest->vm_policy = src->vm_policy; > > > >+#endif > > > >+} > > > > > > Would this be difficult to maintain? We should make sure not miss or overwrite > > > anything. > > > > Yeah, it is less maintainable than a simple memcpy() but I did not > > find a better alternative. I added a warning above the struct > > vm_area_struct definition to update this function every time we change > > that structure. Not sure if there is anything else I can do to help > > with this. > > Here's a horrible idea.. if we put the ref count at the end or start of > the struct, we could set the ref count to zero and copy the other area > in one mmecpy(). > > Even worse idea, we could use a pointer_of like macro to get the position > of the ref count in the vma struct, set the ref count to zero and > carefully copy the other two parts in two memcpy() operations. I implemented this approach in v3 of this patchset here: https://lore.kernel.org/all/20241117080931.600731-5-surenb@xxxxxxxxxx/ like this: #define VMA_BEFORE_LOCK offsetof(struct vm_area_struct, vm_lock) #define VMA_LOCK_END(vma) \ (((void *)(vma)) + offsetofend(struct vm_area_struct, vm_lock)) #define VMA_AFTER_LOCK \ (sizeof(struct vm_area_struct) - offsetofend(struct vm_area_struct, vm_lock)) static inline void vma_copy(struct vm_area_struct *new, struct vm_area_struct *orig) { /* Preserve vma->vm_lock */ data_race(memcpy(new, orig, VMA_BEFORE_LOCK)); data_race(memcpy(VMA_LOCK_END(new), VMA_LOCK_END(orig), VMA_AFTER_LOCK)); } If this looks more maintainable I can revive it. Maybe introduce a more generic function to copy any structure excluding a specific field and use it like this: copy_struct_except(new, orig, struct vm_area_struct, vm_refcnt); Would that be better? > > Feel free to disregard these ideas as it is late here and I'm having > fun thinking up bad ways to make this "more" maintainable. > > Either of these would make updating the struct easier, but very painful > to debug when it goes wrong (or reading the function). > > Thanks, > Liam