On Tue, Dec 22, 2015 at 08:37:08AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> Now that we try to write dirty buffers before we release them, we
> can get a buildup of unwritable dirty buffers on the LRU lists. This
> results in the cache shaker repeatedly trying to write out these
> buffers every time the cache fills up. This results in more
> corruption warnings, and takes up a lot of time while reclaiming
> nothing. This can effectively livelock the processing parts of phase
> 4.
>
> Fix this by not trying to write buffers with corruption errors on
> them. These errors will get cleared when the buffer is re-read,
> fixed and then marked dirty again, at which point we'll be able to
> write them and so the cache can reclaim them successfully.
>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> ---

Reviewed-by: Brian Foster <bfoster@xxxxxxxxxx>

>  libxfs/rdwr.c | 27 ++++++++++++++++-----------
>  1 file changed, 16 insertions(+), 11 deletions(-)
>
> diff --git a/libxfs/rdwr.c b/libxfs/rdwr.c
> index 0337a21..a1f0029 100644
> --- a/libxfs/rdwr.c
> +++ b/libxfs/rdwr.c
> @@ -1103,7 +1103,6 @@ int
>  libxfs_writebufr(xfs_buf_t *bp)
>  {
>  	int	fd = libxfs_device_to_fd(bp->b_target->dev);
> -	int	error = 0;
>  
>  	/*
>  	 * we never write buffers that are marked stale.  This indicates they
> @@ -1134,7 +1133,7 @@ libxfs_writebufr(xfs_buf_t *bp)
>  	}
>  
>  	if (!(bp->b_flags & LIBXFS_B_DISCONTIG)) {
> -		error = __write_buf(fd, bp->b_addr, bp->b_bcount,
> +		bp->b_error = __write_buf(fd, bp->b_addr, bp->b_bcount,
>  				    LIBXFS_BBTOOFF64(bp->b_bn), bp->b_flags);
>  	} else {
>  		int	i;
> @@ -1144,11 +1143,10 @@ libxfs_writebufr(xfs_buf_t *bp)
>  			off64_t	offset = LIBXFS_BBTOOFF64(bp->b_map[i].bm_bn);
>  			int len = BBTOB(bp->b_map[i].bm_len);
>  
> -			error = __write_buf(fd, buf, len, offset, bp->b_flags);
> -			if (error) {
> -				bp->b_error = error;
> +			bp->b_error = __write_buf(fd, buf, len, offset,
> +						  bp->b_flags);
> +			if (bp->b_error)
>  				break;
> -			}
>  			buf += len;
>  		}
>  	}
> @@ -1157,14 +1155,14 @@ libxfs_writebufr(xfs_buf_t *bp)
>  	printf("%lx: %s: wrote %u bytes, blkno=%llu(%llu), %p, error %d\n",
>  			pthread_self(), __FUNCTION__, bp->b_bcount,
>  			(long long)LIBXFS_BBTOOFF64(bp->b_bn),
> -			(long long)bp->b_bn, bp, error);
> +			(long long)bp->b_bn, bp, bp->b_error);
>  #endif
> -	if (!error) {
> +	if (!bp->b_error) {
>  		bp->b_flags |= LIBXFS_B_UPTODATE;
>  		bp->b_flags &= ~(LIBXFS_B_DIRTY | LIBXFS_B_EXIT |
>  				 LIBXFS_B_UNCHECKED);
>  	}
> -	return error;
> +	return bp->b_error;
>  }
>  
>  int
> @@ -1266,15 +1264,22 @@ libxfs_bulkrelse(
>  	return count;
>  }
>  
> +/*
> + * When a buffer is marked dirty, the error is cleared. Hence if we are trying
> + * to flush a buffer prior to cache reclaim that has an error on it it means
> + * we've already tried to flush it and it failed. Prevent repeated corruption
> + * errors from being reported by skipping such buffers - when the corruption is
> + * fixed the buffer will be marked dirty again and we can write it again.
> + */
>  static int
>  libxfs_bflush(
>  	struct cache_node	*node)
>  {
>  	struct xfs_buf		*bp = (struct xfs_buf *)node;
>  
> -	if (bp->b_flags & LIBXFS_B_DIRTY)
> +	if (!bp->b_error && bp->b_flags & LIBXFS_B_DIRTY)
>  		return libxfs_writebufr(bp);
> -	return 0;
> +	return bp->b_error;
>  }
>  
>  void
> -- 
> 2.5.0
>
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs
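
For anyone reading the patch without the tree to hand, the skip-on-error behaviour described in the commit message can be sketched in isolation. This is a minimal standalone model under stated assumptions, not libxfs code: `struct buf`, `B_DIRTY`, `MODEL_ECORRUPT`, `mark_dirty()`, `write_buf()` and `flush()` are all hypothetical stand-ins, and the `corrupt` and `write_attempts` fields exist only so the model can be observed. The one behaviour taken from the patch is that an error left on the buffer suppresses further write attempts until the buffer is marked dirty again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define B_DIRTY		0x1	/* hypothetical, stands in for LIBXFS_B_DIRTY */
#define MODEL_ECORRUPT	1	/* hypothetical error code for this model */

struct buf {
	int	error;		/* last write error, 0 if none */
	int	flags;
	bool	corrupt;	/* model only: contents fail verification */
	int	write_attempts;	/* model only: counts calls to write_buf() */
};

/* Marking a buffer dirty clears any error left by an earlier write. */
void mark_dirty(struct buf *bp)
{
	bp->error = 0;
	bp->flags |= B_DIRTY;
}

/* Model write: corrupt contents are rejected and the error stays on the buffer. */
int write_buf(struct buf *bp)
{
	bp->write_attempts++;
	bp->error = bp->corrupt ? MODEL_ECORRUPT : 0;
	if (!bp->error)
		bp->flags &= ~B_DIRTY;
	return bp->error;
}

/* Reclaim-time flush: do not retry a buffer that already failed a write. */
int flush(struct buf *bp)
{
	assert(bp != NULL);
	if (!bp->error && (bp->flags & B_DIRTY))
		return write_buf(bp);
	return bp->error;
}
```

In this model, calling `flush()` repeatedly on a dirty, corrupt buffer reaches `write_buf()` only once; later calls return the stored error without another attempt. Clearing `corrupt` and calling `mark_dirty()` resets the error, so the next `flush()` writes the buffer and drops the dirty flag.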