On Fri, Dec 15, 2023 at 1:04 AM Ronald Monthero
<debug.penguin32@xxxxxxxxx> wrote:
>
> On Thu, Dec 14, 2023 at 10:28 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
> > ..
< snipped >
> >
> > I should have been clearer. I'm not against the change per se - I
> > agree that we should replace create_workqueue() with
> > alloc_workqueue(). What I meant was, IIUC, there are two behavioral
> > changes with this new workqueue creation:
> >
> > a) We're replacing a bounded workqueue (which, as you noted, is fixed
> > by create_workqueue()) with an unbounded one (WQ_UNBOUND). This seems
> > fine to me - I doubt locality buys us much here.
>
> Yes, the workqueue attributes change per se, but the existing
> functionality remains seamless, and the change is more of a
> forward-looking one. IMO, under a memory-pressure scenario an unbound
> workqueue might handle things better, as the number of backing pools
> is dynamic. And with the WQ_UNBOUND attribute the scheduler has more
> freedom to improvise in demanding scenarios when handing out CPU time
> slices to workers, i.e. if another worker on the same primary CPU has
> to be served first due to WQ_HIGHPRI or WQ_CPU_INTENSIVE workers.
> Although normal and highpri worker-pools don't interact with each
> other, the target CPU at least need not be the same if our zswap
> worker is WQ_UNBOUND.

I don't object to the change per se. I just meant that these
changes/discussions should be spelled out in the patch's changelog :)
IMHO, we should document behavioral changes if there are any. For
instance, when we switched to kmap_local_page() for zswap, there was a
discussion in the changelog regarding how it differs from the old (and
deprecated) kmap_atomic():

https://lore.kernel.org/linux-mm/20231127160058.586446-1-fabio.maria.de.francesco@xxxxxxxxxxxxxxx/

and how that difference doesn't matter in the case of zswap.

> Also noting that the existing wq of the zswap worker has the
> WQ_MEM_RECLAIM attribute, so is there a rescue worker for zswap during
> a memory-pressure scenario?
> Quoting: "All work items which might be used on code paths that handle
> memory reclaim are required to be queued on wq's that have a
> rescue-worker reserved for execution under memory pressure. Else it is
> possible that the worker-pool deadlocks waiting for execution contexts
> to free up."

Seems like it, but this behavior is not changed by your patch :) So I'm
not too concerned by it one way or another.

> Also, an additional thought: would adding the WQ_FREEZABLE attribute
> while creating the zswap worker make sense for scenarios that handle
> freeze and unfreeze of specific cgroups, or filesystem-wide freeze and
> unfreeze? Does the zswap worker participate in freeze/unfreeze code
> paths?

I don't think so, no? This zswap worker is scheduled upon the zswap
pool limit being hit (which happens on the zswap store/swapping/memory
reclaim path).
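For readers skimming the thread, here is a minimal sketch of the
before/after being discussed (the "zswap-shrink" name and the
module-init wrapper are illustrative; the real call site in mm/zswap.c
may differ):

#include <linux/init.h>
#include <linux/errno.h>
#include <linux/workqueue.h>

static struct workqueue_struct *shrink_wq;

static int __init zswap_wq_sketch_init(void)
{
	/*
	 * Before: create_workqueue() is a legacy macro, roughly
	 * alloc_workqueue("%s", __WQ_LEGACY | WQ_MEM_RECLAIM, 1, name),
	 * i.e. a bounded per-cpu workqueue with max_active fixed at 1
	 * and a rescuer thread reserved (WQ_MEM_RECLAIM).
	 */
	shrink_wq = create_workqueue("zswap-shrink");
	if (!shrink_wq)
		return -ENOMEM;
	destroy_workqueue(shrink_wq);

	/*
	 * After: an explicitly unbound workqueue (no CPU locality, the
	 * backing worker pools are dynamic), still WQ_MEM_RECLAIM so a
	 * rescuer is kept; max_active of 0 selects the default cap.
	 */
	shrink_wq = alloc_workqueue("zswap-shrink",
				    WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
	return shrink_wq ? 0 : -ENOMEM;
}

Note that both variants keep WQ_MEM_RECLAIM, which is what guarantees
the rescuer discussed above.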
> > b) create_workqueue() limits the number of concurrent per-cpu
> > execution contexts at 1 (i.e. a single global reclaimer), whereas
> > after this patch this is set to the default value. This seems fine
> > to me too - I don't remember us taking advantage of the previous
> > concurrency limitation. Also, in practice, the work_struct is
> > one-to-one with the zswap_pool's anyway, and most of the time, there
> > is just a single pool being used. (But it begs the question - what's
> > the point of using 0 instead of 1 here?)
>
> Nothing in particular, but I left it at the default 0, which can go up
> to 256 (@max_active per CPU).
> But if the zswap worker is always intended to have only 1 active
> worker per CPU, then that's fine with 1; otherwise a default setting
> might be more flexible for scaling.
> Just a thought: does having a higher value help on larger memory
> systems?

I don't think having a higher value helps here, tbh. We only have one
work_struct per pool, so it shouldn't make a difference either way :)

> > Both seem fine (to me anyway - other reviewers feel free to take a
> > look and fact-check everything). I just feel like this should be
> > explicitly noted in the changelog, IMHO, in case we are mistaken and
> > need to revisit this :) Either way, not a NACK from me.
>
> Thanks, Nhat, for checking. The above are my thoughts; I could be
> missing some info or be incorrect on certain fronts, so I would
> welcome clarifications.
> Also, thanks in advance to the other experts/maintainers - please
> share feedback and suggestions.
>
> BR,
> ronald

Also +Chris Li, who is also working on improving zswap :)
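P.S. To make the max_active question above concrete, here is a sketch
of the two variants side by side (per include/linux/workqueue.h, a
max_active of 0 means WQ_DFL_ACTIVE, currently 256; the wq name is
illustrative as before):

#include <linux/init.h>
#include <linux/errno.h>
#include <linux/workqueue.h>

static int __init max_active_sketch_init(void)
{
	struct workqueue_struct *wq;

	/*
	 * 0 selects the default cap (WQ_DFL_ACTIVE, i.e. 256) - this is
	 * what the posted patch uses.
	 */
	wq = alloc_workqueue("zswap-shrink", WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
	if (!wq)
		return -ENOMEM;
	destroy_workqueue(wq);

	/*
	 * An explicit 1 would keep create_workqueue()'s old limit of one
	 * concurrent execution context.
	 */
	wq = alloc_workqueue("zswap-shrink", WQ_UNBOUND | WQ_MEM_RECLAIM, 1);
	if (!wq)
		return -ENOMEM;
	destroy_workqueue(wq);
	return 0;
}

Since there is only one work_struct per zswap_pool, the two should
behave identically in practice.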