Re: node-hotplug: is memset 0 safe in try_offline_node()?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2015/3/4 11:53, Gu Zheng wrote:

> Hi Xishi,
> 
> On 03/04/2015 10:22 AM, Xishi Qiu wrote:
> 
>> On 2015/3/3 18:20, Gu Zheng wrote:
>>
>>> Hi Xishi,
>>> On 03/03/2015 11:30 AM, Xishi Qiu wrote:
>>>
>>>> When hot-remove a numa node, we will clear pgdat,
>>>> but is memset 0 safe in try_offline_node()?
>>>
>>> It is not safe here. In fact, this is a temporary solution here.
>>> As you know, pgdat is accessed lock-less now, so protection
>>> mechanism (RCU?) is needed to make it completely safe here,
>>> but it seems a bit over-kill.
>>>
>>>>
>>>> process A:			offline node XX:
>>>> for_each_populated_zone()
>>>> find online node XX
>>>> cond_resched()
>>>> 				offline cpu and memory, then try_offline_node()
>>>> 				node_set_offline(nid), and memset(pgdat, 0, sizeof(*pgdat))
>>>> access node XX's pgdat
>>>> NULL pointer access error
>>>
>>> It's possible, but I did not meet this condition, did you?
>>>
>>
>> Yes, we test hot-add/hot-remove node with stress, and meet the following
>> call trace several times.
> 
> Thanks.
> 
>>
>> 	next_online_pgdat()
>> 		int nid = next_online_node(pgdat->node_id);  // it's here, pgdat is NULL
> 
> 	memset(pgdat, 0, sizeof(*pgdat));
> This memset just sets the context of pgdat to 0, but it will not free pgdat, so the *pgdat is
> NULL* is strange here.
> But anyway, the bug is real, we must fix it.

next_zone()
	pg_data_t *pgdat = zone->zone_pgdat;  // I think this pgdat is NULL, and NODE_DATA() is not NULL.
	...
	pgdat = next_online_pgdat(pgdat);
		int nid = next_online_node(pgdat->node_id);  // so here is the null pointer access

Thanks for your new patch, I'll test it.

Thanks,
Xishi Qiu

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]