On Thu 20-04-17 10:25:27, Vlastimil Babka wrote: > On 04/10/2017 06:25 PM, Michal Hocko wrote: [...] > > Let's simulate memory hot online manually > > Normal Movable > > > > /sys/devices/system/memory/memory32/valid_zones:Normal > > /sys/devices/system/memory/memory33/valid_zones:Normal Movable > > > > /sys/devices/system/memory/memory32/valid_zones:Normal > > /sys/devices/system/memory/memory33/valid_zones:Normal > > /sys/devices/system/memory/memory34/valid_zones:Normal Movable > > > > /sys/devices/system/memory/memory32/valid_zones:Normal > > /sys/devices/system/memory/memory33/valid_zones:Normal Movable > > /sys/devices/system/memory/memory34/valid_zones:Movable Normal > > Commands seem to be missing above? Yes. git commit just dropped everything starting with # which happened to be the bash prompt for my commands. I have changed that to $ and it looks as follows $ echo 0x100000000 > /sys/devices/system/memory/probe $ grep . /sys/devices/system/memory/memory32/valid_zones Normal Movable $ echo $((0x100000000+(128<<20))) > /sys/devices/system/memory/probe $ grep . /sys/devices/system/memory/memory3?/valid_zones /sys/devices/system/memory/memory32/valid_zones:Normal /sys/devices/system/memory/memory33/valid_zones:Normal Movable $ echo $((0x100000000+2*(128<<20))) > /sys/devices/system/memory/probe $ grep . /sys/devices/system/memory/memory3?/valid_zones /sys/devices/system/memory/memory32/valid_zones:Normal /sys/devices/system/memory/memory33/valid_zones:Normal /sys/devices/system/memory/memory34/valid_zones:Normal Movable $ echo online_movable > /sys/devices/system/memory/memory34/state $ grep . /sys/devices/system/memory/memory3?/valid_zones /sys/devices/system/memory/memory32/valid_zones:Normal /sys/devices/system/memory/memory33/valid_zones:Normal Movable /sys/devices/system/memory/memory34/valid_zones:Movable Normal [...] > > This means that the same physical online steps as above will lead to the > > following state: > > Normal Movable > > > > /sys/devices/system/memory/memory32/valid_zones:Normal Movable > > /sys/devices/system/memory/memory33/valid_zones:Normal Movable > > > > /sys/devices/system/memory/memory32/valid_zones:Normal Movable > > /sys/devices/system/memory/memory33/valid_zones:Normal Movable > > /sys/devices/system/memory/memory34/valid_zones:Normal Movable > > > > /sys/devices/system/memory/memory32/valid_zones:Normal Movable > > /sys/devices/system/memory/memory33/valid_zones:Normal Movable > > /sys/devices/system/memory/memory34/valid_zones:Movable > > Ditto. This just copies the above so I didn't add those commands. I can if that is preferable. > > --- a/include/linux/mmzone.h > > +++ b/include/linux/mmzone.h > > @@ -533,6 +533,20 @@ static inline bool zone_is_empty(struct zone *zone) > > } > > > > /* > > + * Return true if [start_pfn, start_pfn + nr_pages) range has a non-mpty > > > non-empty fixed > > + * intersection with the given zone > > + */ > > +static inline bool zone_intersects(struct zone *zone, > > + unsigned long start_pfn, unsigned long nr_pages) > > +{ > > I'm looking at your current mmotm tree branch, which looks like this: > > + * Return true if [start_pfn, start_pfn + nr_pages) range has a non-mpty > + * intersection with the given zone > + */ > +static inline bool zone_intersects(struct zone *zone, > + unsigned long start_pfn, unsigned long nr_pages) > +{ > + if (zone_is_empty(zone)) > + return false; > + if (zone->zone_start_pfn <= start_pfn && start_pfn < zone_end_pfn(zone)) > + return true; > + if (start_pfn + nr_pages > zone->zone_start_pfn) > + return true; > > A false positive is possible here, when start_pfn >= zone_end_pfn(zone)? Ohh, right. Looks better? diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index eae6da28646e..611ff869fa4d 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -541,10 +541,14 @@ static inline bool zone_intersects(struct zone *zone, { if (zone_is_empty(zone)) return false; - if (zone->zone_start_pfn <= start_pfn && start_pfn < zone_end_pfn(zone)) + if (start_pfn >= zone_end_pfn(zone)) + return false; + + if (zone->zone_start_pfn <= start_pfn) return true; if (start_pfn + nr_pages > zone->zone_start_pfn) return true; + return false; } > > @@ -1029,39 +1018,114 @@ static void node_states_set_node(int node, struct memory_notify *arg) > > node_set_state(node, N_MEMORY); > > } > > > > -bool zone_can_shift(unsigned long pfn, unsigned long nr_pages, > > - enum zone_type target, int *zone_shift) > > +bool allow_online_pfn_range(int nid, unsigned long pfn, unsigned long nr_pages, int online_type) > > { > > - struct zone *zone = page_zone(pfn_to_page(pfn)); > > - enum zone_type idx = zone_idx(zone); > > - int i; > > + struct pglist_data *pgdat = NODE_DATA(nid); > > + struct zone *movable_zone = &pgdat->node_zones[ZONE_MOVABLE]; > > + struct zone *normal_zone = &pgdat->node_zones[ZONE_NORMAL]; > > > > - *zone_shift = 0; > > + /* > > + * TODO there shouldn't be any inherent reason to have ZONE_NORMAL > > + * physically before ZONE_MOVABLE. All we need is they do not > > + * overlap. Historically we didn't allow ZONE_NORMAL after ZONE_MOVABLE > > + * though so let's stick with it for simplicity for now. > > + * TODO make sure we do not overlap with ZONE_DEVICE > > Is this last TODO a blocker, unlike the others? I think it is not but my knowledge of the zone device is very limited. I was hoping for Dan's feedback here. From what I understand Zone device occupies the high end of the address space so we shouldn't overlap here. Is this correct Dan? [...] > > + if (online_type == MMOP_ONLINE_MOVABLE && !can_online_high_movable(nid)) > > + return -EINVAL; > > > > - zone = move_pfn_range(zone_shift, pfn, pfn + nr_pages); > > + /* associate pfn range with the zone */ > > + zone = move_pfn_range(online_type, nid, pfn, nr_pages); > > if (!zone) > > return -EINVAL; > > Nit: This !zone currently cannot happen. fixed Thanks! -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>