On Thu, Feb 21, 2008 at 01:07:17PM -0800, Mingming Cao wrote:
> Hi Aneesh,
>
> It's a good start, a few comments below..
> .....
> > +		page = __grab_cache_page(mapping, index);
> > +		if (!page)
> > +			return -ENOMEM;
> > +	}
> > +
> The page is already locked before calling get_block() via writepage(),
> isn't it? And the journal transaction already started...
>
It would be via write_begin or writepage, but both callbacks lock the
page before they call get_block() for all the blocks corresponding to
the page.

> > +	if (!page_has_buffers(page))
> > +		create_empty_buffers(page, blocksize, 0);
> > +
> > +	head = page_buffers(page);
> > +	/* Look for the buffer_head which map the block */
> > +	bh = head;
> > +	while (offset > 0) {
> > +		bh = bh->b_this_page;
> > +		offset -= blocksize;
> > +	}
> > +	offset = (pos & (PAGE_CACHE_SIZE - 1));
> > +
> > +	/* Now write all the buffer_heads in the page */
> > +	do {
> > +		set_buffer_uptodate(bh);
> > +		if (ext4_should_journal_data(inode)) {
> > +			err = ext4_journal_get_write_access(handle, bh);
> > +			/* do we have that many credits ??*/
> > +			if (err)
> > +				goto err_out;
> > +		}
> > +		zero_user(page, offset, blocksize);
> Ah oh, you are trying to zero out the pages in the page cache, that
> seems wrong to me. By the time get_block() is called from writepages(),
> the pages should have meaningful content that needs to be flushed to
> disk; zeroing the pages out will lose the data.
>
It is write_begin. In the write_begin case the pages don't have the
content yet. By the time we reach writepage, the page is supposed to
have buffer_heads that are already mapped, so we won't end up calling
get_block(). Even in the mmap case, with the page_mkwrite change we
would have called the write_begin equivalent before writepage.
> > +		offset += blocksize;
> > +		if (ext4_should_journal_data(inode)) {
> > +			err = ext4_journal_dirty_metadata(handle, bh);
> > +			if (err)
> > +				goto err_out;
> > +		} else {
> > +			if (ext4_should_order_data(inode)) {
> > +				err = ext4_journal_dirty_data(handle,
> > +								bh);
> > +				if (err)
> > +					goto err_out;
> > +			}
> > +			mark_buffer_dirty(bh);
> > +		}
> > +
> > +		bh = bh->b_this_page;
> > +		blkcount--;
> > +	} while ((bh != head) && (blkcount > 0));
> > +	/* only unlock if we have locked */
> > +	if (index != skip_index)
> > +		unlock_page(page);
> > +	page_cache_release(page);
> > +	}
> > +
> > +	return 0;
> > +err_out:
> > +	unlock_page(page);
> > +	page_cache_release(page);
> > +	return err;
> > +}
> > +
> I was thinking to simply create a new bh, zero out the bh, map the bh
> with the block number to zero out, and lastly submit an IO via
> ll_rw_block. It may be more efficient to do this via bio (perhaps
> cooking a bio with zeroed-out pages and submit_bio), but I have not
> looked very closely at it. Just throwing out my thoughts.
>
But we would still need pages: a buffer_head needs to have a mapped
page in b_page. Also, if we don't take the page from the page cache,
then we would have to wait for the I/O to complete using wait_on_buffer
before we can update the extent information. Using the page cache also
plugs in nicely with the different journalling modes.

-aneesh