Re: [PATCH v5 2/3] mm/mempolicy: add set_mempolicy_home_node syscall

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Michal Hocko <mhocko@xxxxxxxx> writes:

> On Mon 29-11-21 16:16:05, Aneesh Kumar K.V wrote:
>> Michal Hocko <mhocko@xxxxxxxx> writes:
>> 
>> > On Tue 16-11-21 12:12:37, Aneesh Kumar K.V wrote:
> [...]
>> >> +SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, len,
>> >> +		unsigned long, home_node, unsigned long, flags)
>> >> +{
>> >> +	struct mm_struct *mm = current->mm;
>> >> +	struct vm_area_struct *vma;
>> >> +	struct mempolicy *new;
>> >> +	unsigned long vmstart;
>> >> +	unsigned long vmend;
>> >> +	unsigned long end;
>> >> +	int err = -ENOENT;
>> >> +
>> >> +	if (start & ~PAGE_MASK)
>> >> +		return -EINVAL;
>> >> +	/*
>> >> +	 * flags is used for future extension if any.
>> >> +	 */
>> >> +	if (flags != 0)
>> >> +		return -EINVAL;
>> >> +
>> >> +	if (!node_online(home_node))
>> >> +		return -EINVAL;
>> >
>> > You really want to check the home_node before dereferencing the mask.
>> >
>> 
>> Any reason why we want to check for home node first?
>
> Because the given node is an index to node_states[N_ONLINE] bitmap. I do
> not think we do range checking there.

Will add this

	if (home_node >= MAX_NUMNODES || !node_online(home_node))
		return -EINVAL;



>
>> >> +	len = (len + PAGE_SIZE - 1) & PAGE_MASK;
>> >> +	end = start + len;
>> >> +
>> >> +	if (end < start)
>> >> +		return -EINVAL;
>> >> +	if (end == start)
>> >> +		return 0;
>> >> +	mmap_write_lock(mm);
>> >> +	vma = find_vma(mm, start);
>> >> +	for (; vma && vma->vm_start < end;  vma = vma->vm_next) {
>> >> +
>> >> +		vmstart = max(start, vma->vm_start);
>> >> +		vmend   = min(end, vma->vm_end);
>> >> +		new = mpol_dup(vma_policy(vma));
>> >> +		if (IS_ERR(new)) {
>> >> +			err = PTR_ERR(new);
>> >> +			break;
>> >> +		}
>> >> +		/*
>> >> +		 * Only update home node if there is an existing vma policy
>> >> +		 */
>> >> +		if (!new)
>> >> +			continue;
>> >
>> > Your changelog only mentions MPOL_BIND and MPOL_PREFERRED_MANY as
>> > supported but you seem to be applying the home node to all existing
>> > policieso
>> 
>> 
>> The restriction is done in policy_node. 
>> 
>> @@ -1801,6 +1856,11 @@ static int policy_node(gfp_t gfp, struct mempolicy *policy, int nd)
>> 		WARN_ON_ONCE(policy->mode == MPOL_BIND && (gfp & __GFP_THISNODE));
>> 	}
>> 
>> +	if ((policy->mode == MPOL_BIND ||
>> +	     policy->mode == MPOL_PREFERRED_MANY) &&
>> +	    policy->home_node != NUMA_NO_NODE)
>> +		return policy->home_node;
>> +
>> 	return nd;
>>  }
>
> But you do allow to set the home node also for other policies and that
> means that a default MPOL_INTERLEAVE would be different from the one
> with home_node set up even though they behave exactly the same.

I agree that there is no error returned if we try to set the home_node
for other memory policies. But there should not be any behaviour
differences. We ignore home_node for policies other than BIND and
PREFERRED_MANY.

The reason I avoided erroring out for other policies was to simplify the
error handling. For example, for a range of addr with a mix of memory
policy MPOLD_BIND and MPOL_INTERLEAVE what should be the state after the
above system call? We could say, we ignore setting home_node for vma
with policy MPOL_INTERLEAVE and leave the home node set for vma with
policy MPOL_BIND. Or should we make the system call return error also
undoing the changes done for vmas for which we have set the home_node?

-aneesh



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux