On 9/4/06, Phillip Lougher <phillip@xxxxxxxxxxxxxxxxxxx> wrote:
Pete Zaitcev wrote: > > At some situations, this can get called twice on the same page. > When this happens first, the page is filled normally, and SetPageUptodate > clears dirty bit. Notice though, SetPageUptodate is different on s390 > from all other architectures, because the dirty bit belongs to a page, > and not to a page table entry there. So, on second pass, SetPageUptodate > sees the page already up to date and does not clear the dirty bit. > This creates a dirty page which eventually oopses us. Makes sense. I hit the "same page read more than once at the same time" race condition a couple of years ago, and solved it using grab_cache_page_nowait. Unfortunately I didn't know about the s390, however, I probably should always have had a PageUptodate check anyway. > > Here's a patch which I made out of Heiko's patch by transforming it: >
Hi Heiko, Pete, I've had a second look at the patch. The patch is fixing the one branch of the if, but it isn't fixing the other branch. That is when process A enters read_page to read page n, and process B simultaneously enters read_page to read page n +1, the patch fixes the case where process A pushes page n +1 a second time (second branch of the if, i != page->index). The other case can happen, process A pushes pages n and n+1 first, which means process B pushes page n + 1 a second time (first branch of the if, i == page->index). This case isn't fixed. In fact when fragment pages are not being read, this is the case that will always occur because process A grabs the read_page_mutex lock. This is the case I experienced a couple of years ago. Process A enters read_page, and grabs the lock. Process B enters and waits on the lock. Process A then reads page n, and pushes page n +1 (and others) into the page cache. Process B only proceeds once the lock is released, by which time the page which it has entered read_page to read (n + 1), has already been read. This is a updated patch which deals with both cases. I've also attached it because I don't know if Gmail mangles inline patches. BTW Zisofs uses grab_cache_page_nowait and will suffer from this problem. NTFS which also uses grab_cache_page_nowait does have a PageUptodate check. Regards Phillip diff --new-file -urp linux-2.6.17/fs/squashfs/inode.c linux-2.6.17-patched/fs/squashfs/inode.c --- linux-2.6.17/fs/squashfs/inode.c 2006-09-07 00:54:33.000000000 +0100 +++ linux-2.6.17-patched/fs/squashfs/inode.c 2006-09-07 01:08:49.000000000 +0100 @@ -1547,36 +1547,31 @@ static int squashfs_readpage(struct file for (i = start_index; i <= end_index && byte_offset < bytes; i++, byte_offset += PAGE_CACHE_SIZE) { struct page *push_page; - int available_bytes = (bytes - byte_offset) > PAGE_CACHE_SIZE ? + int avail = (bytes - byte_offset) > PAGE_CACHE_SIZE ? PAGE_CACHE_SIZE : bytes - byte_offset; TRACE("bytes %d, i %d, byte_offset %d, available_bytes %d\n", - bytes, i, byte_offset, available_bytes); + bytes, i, byte_offset, avail); - if (i == page->index) { - pageaddr = kmap_atomic(page, KM_USER0); - memcpy(pageaddr, data_ptr + byte_offset, - available_bytes); - memset(pageaddr + available_bytes, 0, - PAGE_CACHE_SIZE - available_bytes); - kunmap_atomic(pageaddr, KM_USER0); - flush_dcache_page(page); - SetPageUptodate(page); - unlock_page(page); - } else if ((push_page = - grab_cache_page_nowait(page->mapping, i))) { - pageaddr = kmap_atomic(push_page, KM_USER0); - - memcpy(pageaddr, data_ptr + byte_offset, - available_bytes); - memset(pageaddr + available_bytes, 0, - PAGE_CACHE_SIZE - available_bytes); - kunmap_atomic(pageaddr, KM_USER0); - flush_dcache_page(push_page); - SetPageUptodate(push_page); - unlock_page(push_page); + push_page = (i == page->index) ? page : + grab_cache_page_nowait(page->mapping, i); + + if (!push_page) + continue; + + if (PageUptodate(push_page)) + goto skip_page; + + pageaddr = kmap_atomic(push_page, KM_USER0); + memcpy(pageaddr, data_ptr + byte_offset, avail); + memset(pageaddr + avail, 0, PAGE_CACHE_SIZE - avail); + kunmap_atomic(pageaddr, KM_USER0); + flush_dcache_page(push_page); + SetPageUptodate(push_page); +skip_page: + unlock_page(push_page); + if(i != page->index) page_cache_release(push_page); - } } if (SQUASHFS_I(inode)->u.s1.fragment_start_block == SQUASHFS_INVALID_BLK
Attachment:
s390_patch
Description: Binary data