Re: [PATCH 1 of 8] Introduce a place holder page for the pagecache

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Placeholders can span a range bigger than one page. The placeholder is inserted into the radix slot for the end of the range, and the flags field in
the page struct is used to record the start of the range.

Not surprisingly, I like how this makes the interaction between buffered and O_DIRECT a lot less magical. Nice work.

Some comments:

+void wait_on_placeholder_pages_range(struct address_space *mapping,
+			       pgoff_t start, pgoff_t end)

...

+	while(start <= end) {

...

+			start = pages[i]->index + 1;
+			if (pages[i]->index > end)
+				goto done;

It looks like this might be confused by a end and final page->index of ~0? Maybe it'd be safer to explicitly deal with this here instead of relying on safe arguments.

+		}
+		if (need_resched()) {
+			read_unlock_irq(&mapping->tree_lock);
+			cond_resched();
+			read_lock_irq(&mapping->tree_lock);
+		}

We should introduce a cond_sched_lock_irq() to match cond_resched_lock ().

+void remove_placeholder_pages(struct address_space *mapping,
+			     unsigned long start,
+			     unsigned long end)
+{
+	struct page *page;
+	int ret;
+	int i;
+	unsigned long num;
+	struct page *pages[8];
+
+	write_lock_irq(&mapping->tree_lock);
+	while (start < end) {

...

+	write_unlock_irq(&mapping->tree_lock);

And call that cond_resched_lock_irq() in here too? Since we grabbing the lock in here, presumably the caller won't mind if we grab and release it a few times?

+static int insert_placeholder(struct address_space *mapping,
+					 struct page *insert)
+{

spin_lock_assert(&mapping->tree_lock)?

+int find_or_insert_placeholders(struct address_space *mapping,
+				  unsigned long start, unsigned long end,
+				  gfp_t gfp_mask, int iowait)
+{

...

+	/*
+	 * this gets complicated.

It sure does! Sprinkling a few comments above the main blocks in the loop might be nice.

+			err = filemap_write_and_wait_range(mapping,
+						 cur << PAGE_CACHE_SHIFT,
+						 end << PAGE_CACHE_SHIFT);

This end is exclusive (>= end breaks) but filemap_write_and_wait_range () -> wait_on_page_writeback_range() wants an argument that is inclusive (> end breaks)?

+		if (PagePlaceHolder(page)) {
+			DEFINE_WAIT(wait);
+			wait_queue_head_t *wqh = page_waitqueue(page);
+			prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
+			read_unlock_irq(&mapping->tree_lock);
+			io_schedule();
+			finish_wait(wqh, &wait);
+			read_lock_irq(&mapping->tree_lock);

This pattern seems to be repeated quite a few times. Maybe a helper would.. help? Maybe it's not worth it if it has to do silly little pre-schedule tasks differently at each site.

-	for (i = 0; i < ret; i++)
-		page_cache_get(pages[i]);
+	while(i < ret) {
+		if (PagePlaceHolder(pages[i])) {
+			/* we can't return a place holder, shift it away */
+			if (i + 1 < ret) {
+				memcpy(&pages[i], &pages[i+1],
+		                       (ret - i - 1) * sizeof(struct page *));
+			}
+			ret--;
+			continue;
+		} else
+			page_cache_get(pages[i]);
+		i++;

It looks like this loop is implemented twice? How about a factoring it out? Especially with the risk of off-by-ones in there. (Though I don't see problems, it looks correct.)

More on the rest of the series after lunch :).

- z
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux