> From: Dave Hansen [mailto:dave@xxxxxxxxxxxxxxxxxx]
> Sent: Monday, October 03, 2011 12:23 PM
> To: Nitin Gupta
> Cc: Dan Magenheimer; Seth Jennings; Greg KH; gregkh@xxxxxxx; devel@xxxxxxxxxxxxxxxxxxxx;
> cascardo@xxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; brking@xxxxxxxxxxxxxxxxxx;
> rcj@xxxxxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 0/3] staging: zcache: xcfmalloc support
>
> On Mon, 2011-10-03 at 13:54 -0400, Nitin Gupta wrote:
> > I think disabling preemption on the local CPU is the cheapest we can get
> > to protect PCPU buffers. We may experiment with, say, multiple buffers
> > per CPU, so we end up disabling preemption only in highly improbable
> > case of getting preempted just too many times exactly within critical
> > section.
>
> I guess the problem is two-fold: preempt_disable() and
> local_irq_save().
>
> > static int zcache_put_page(int cli_id, int pool_id, struct tmem_oid *oidp,
> >                                 uint32_t index, struct page *page)
> > {
> >         struct tmem_pool *pool;
> >         int ret = -1;
> >
> >         BUG_ON(!irqs_disabled());
>
> That tells me "zcache" doesn't work with interrupts on. It seems like
> awfully high-level code to have interrupts disabled. The core page
> allocator has some irq-disabling spinlock calls, but that's only really
> because it has to be able to service page allocations from interrupts.
> What's the high-level reason for zcache?
>
> I'll save the discussion about preempt for when Seth posts his patch.

I completely agree that the irq/softirq/preempt states should be
re-examined and, where possible, improved before zcache moves
out of staging.

Actually, I think cleancache_put is called from a point in the kernel
where irqs are disabled. I believe it is unsafe to call a routine
sometimes with irqs disabled and sometimes with irqs enabled?
I think some points of call to cleancache_flush may also have
irqs disabled.

IIRC, much of the zcache code has preemption disabled because
it is unsafe for a page fault to occur when zcache is running,
since the page fault may cause a (recursive) call into zcache
and possibly recursively take a lock.

Anyway, some of the atomicity constraints in the code are
definitely required, but there are very likely some constraints
that are overzealous and can be removed. For now, I'd rather
have the longer interrupt latency with code that works than
have developers experimenting with zcache and see lockups. :-}

Dan
_______________________________________________
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxx
http://driverdev.linuxdriverproject.org/mailman/listinfo/devel