On Thu, Aug 22, 2019 at 9:19 AM Nitesh Narayan Lal <nitesh@xxxxxxxxxx> wrote: > > > On 8/21/19 10:59 AM, Alexander Duyck wrote: > > From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx> > > > > In order to pave the way for free page reporting in virtualized > > environments we will need a way to get pages out of the free lists and > > identify those pages after they have been returned. To accomplish this, > > this patch adds the concept of a Reported Buddy, which is essentially > > meant to just be the Uptodate flag used in conjunction with the Buddy > > page type. > > > > It adds a set of pointers we shall call "boundary" which represents the > > upper boundary between the unreported and reported pages. The general idea > > is that in order for a page to cross from one side of the boundary to the > > other it will need to go through the reporting process. Ultimately a > > free_list has been fully processed when the boundary has been moved from > > the tail all they way up to occupying the first entry in the list. > > > > Doing this we should be able to make certain that we keep the reported > > pages as one contiguous block in each free list. This will allow us to > > efficiently manipulate the free lists whenever we need to go in and start > > sending reports to the hypervisor that there are new pages that have been > > freed and are no longer in use. > > > > An added advantage to this approach is that we should be reducing the > > overall memory footprint of the guest as it will be more likely to recycle > > warm pages versus trying to allocate the reported pages that were likely > > evicted from the guest memory. > > > > Since we will only be reporting one zone at a time we keep the boundary > > limited to being defined for just the zone we are currently reporting pages > > from. Doing this we can keep the number of additional pointers needed quite > > small. To flag that the boundaries are in place we use a single bit > > in the zone to indicate that reporting and the boundaries are active. > > > > The determination of when to start reporting is based on the tracking of > > the number of free pages in a given area versus the number of reported > > pages in that area. We keep track of the number of reported pages per > > free_area in a separate zone specific area. We do this to avoid modifying > > the free_area structure as this can lead to false sharing for the highest > > order with the zone lock which leads to a noticeable performance > > degradation. > [...] > > + > > +/* request page reporting on this zone */ > > +void __page_reporting_request(struct zone *zone) > > +{ > > + struct page_reporting_dev_info *phdev; > > + > > + rcu_read_lock(); > > + > > + /* > > + * We use RCU to protect the ph_dev_info pointer. In almost all > > + * cases this should be present, however in the unlikely case of > > + * a shutdown this will be NULL and we should exit. > > + */ > > + phdev = rcu_dereference(ph_dev_info); > > + if (unlikely(!phdev)) > > + return; > > + > > Just a minor comment here. > Although this is unlikely to trigger still I think you should release the > rcu_read_lock before returning. Thanks for catching that. I will have that fixed for next version. - Alex