On Mon, 2019-12-16 at 06:44 -0500, Nitesh Narayan Lal wrote:
> On 12/5/19 11:22 AM, Alexander Duyck wrote:
> > From: Alexander Duyck <alexander.h.duyck@xxxxxxxxxxxxxxx>
> >
> > In order to pave the way for free page reporting in virtualized
> > environments we will need a way to get pages out of the free lists and
> > identify those pages after they have been returned. To accomplish this,
> > this patch adds the concept of a Reported Buddy, which is essentially
> > meant to just be the Uptodate flag used in conjunction with the Buddy
> > page type.
> >
> > To prevent the reported pages from leaking outside of the buddy lists I
> > added a check to clear the PageReported bit in the del_page_from_free_list
> > function. As a result any reported page that is split, merged, or
> > allocated will have the flag cleared prior to the PageBuddy value being
> > cleared.
> >
> > The process for reporting pages is fairly simple. Once we free a page that
> > meets the minimum order for page reporting we will schedule a worker thread
> > to start 2s or more in the future. That worker thread will begin working
> > from the lowest supported page reporting order up to MAX_ORDER - 1 pulling
> > unreported pages from the free list and storing them in the scatterlist.
> >
> > When processing each individual free list it is necessary for the worker
> > thread to release the zone lock when it needs to stop and report the full
> > scatterlist of pages. To reduce the work of the next iteration the worker
> > thread will rotate the free list so that the first unreported page in the
> > free list becomes the first entry in the list.
> >
> > [...]
> > > k);
> > +
> > +	return err;
> > +}
> > +
> > +static int
> > +page_reporting_process_zone(struct page_reporting_dev_info *prdev,
> > +			    struct scatterlist *sgl, struct zone *zone)
> > +{
> > +	unsigned int order, mt, leftover, offset = PAGE_REPORTING_CAPACITY;
> > +	unsigned long watermark;
> > +	int err = 0;
> > +
> > +	/* Generate minimum watermark to be able to guarantee progress */
> > +	watermark = low_wmark_pages(zone) +
> > +		    (PAGE_REPORTING_CAPACITY << PAGE_REPORTING_MIN_ORDER);
> > +
> > +	/*
> > +	 * Cancel request if insufficient free memory or if we failed
> > +	 * to allocate page reporting statistics for the zone.
> > +	 */
> > +	if (!zone_watermark_ok(zone, 0, watermark, 0, ALLOC_CMA))
> > +		return err;
> > +
>
> Will it not make more sense to check the low watermark condition before every
> reporting request generated for a bunch of 32 isolated pages?
> or will that be too costly?

My thought is to wait until we are actually processing the request. That
way we are only performing this check once every 2 seconds instead of
every time we are thinking about requesting page reporting.

Keep in mind I removed the reported_pages tracking statistics, so we are
now requesting as soon as we free any page. So if we moved the check to
the request itself, it would mean that a low memory condition would result
in us repeatedly checking the low watermark and failing the test.
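
To put rough numbers on that trade-off, here is a minimal userspace
sketch. It is not kernel code and not part of the patch: the names
check_policy, sim_stats and simulate_low_memory are hypothetical, and the
model assumes the zone stays below the reporting watermark for the whole
run, that qualifying frees arrive at a fixed interval, and that a pending
worker is not rescheduled. It only counts how often the watermark test
would run under each placement of the check.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum check_policy {
	CHECK_AT_REQUEST,	/* test the watermark on every qualifying free */
	CHECK_AT_PROCESS,	/* test the watermark only when the worker runs */
};

struct sim_stats {
	unsigned long watermark_checks;
	unsigned long worker_runs;
};

/*
 * Simulate a zone that stays below the reporting watermark for
 * duration_ms, with one qualifying free every free_interval_ms.
 * A request schedules the worker delay_ms in the future unless it is
 * already pending. Every watermark check fails in this model.
 */
static struct sim_stats simulate_low_memory(enum check_policy policy,
					    unsigned long duration_ms,
					    unsigned long free_interval_ms,
					    unsigned long delay_ms)
{
	struct sim_stats s = { 0, 0 };
	unsigned long run_at = 0;
	unsigned long now;
	int pending = 0;

	assert(free_interval_ms > 0 && delay_ms > 0);

	for (now = 0; now < duration_ms; now++) {
		/* Worker fires, checks the watermark once, and gives up. */
		if (pending && now >= run_at) {
			pending = 0;
			s.worker_runs++;
			if (policy == CHECK_AT_PROCESS)
				s.watermark_checks++;
		}

		if (now % free_interval_ms)
			continue;

		/* A qualifying page was freed: request reporting. */
		if (policy == CHECK_AT_REQUEST) {
			s.watermark_checks++;	/* fails under low memory */
			continue;		/* worker is never scheduled */
		}

		if (!pending) {
			pending = 1;
			run_at = now + delay_ms;
		}
	}

	return s;
}

#ifdef SIM_DEMO_MAIN
int main(void)
{
	struct sim_stats req = simulate_low_memory(CHECK_AT_REQUEST, 10000, 1, 2000);
	struct sim_stats proc = simulate_low_memory(CHECK_AT_PROCESS, 10000, 1, 2000);

	printf("check at request: %lu checks, %lu worker runs\n",
	       req.watermark_checks, req.worker_runs);
	printf("check at process: %lu checks, %lu worker runs\n",
	       proc.watermark_checks, proc.worker_runs);
	return 0;
}
#endif
```

Built with -DSIM_DEMO_MAIN, a 10 second low-memory window with one free
per millisecond and a 2000 ms delay gives 10000 watermark checks when the
check sits in the request path, versus 4 checks (one per worker run) when
it sits in the processing path. The absolute numbers depend entirely on
the assumed free rate; the point is only that the second count is bounded
by the worker delay rather than by the rate of frees.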