xfssyncd and disk spin down

Petre Rodan <petre.rodan@xxxxxxxxxx> · Thu, 23 Dec 2010 18:55:32 +0200

Hello,

I have a problem with a hard drive that never manages to spin down. this drive is storage space, not a system disk; the only thing that generates writes is the nfs server that exports its contents. it has only one large xfs partition on it.

upon closer inspection it turns out that after the first Write action to that partition, an xfssyncd process keeps writing to that partition every 36 seconds or so and never stops, even if there are no more Writes from the exterior. this keeps the drive busy with varying consequences. more about that later.

I found that the only easy way to stop the xfssyncd process poking the drive is to run a `mount -o remount /mnt/space`. this stops any internal xfs process from accessing the drive, thus allowing it to spin down and only be woken up by an NFS access.

here are some simple steps to replicate the problem:

# echo 3 > /proc/sys/vm/drop_caches # free cached fs entities 
# ( blktrace -d /dev/sdb -o - | blkparse -i - ) &
# mount -o remount /mnt/space
# find /mnt/space/ -type f > /dev/null  # generate some non-cached Read requests
# # absolutely no writes have been performed to the drive,
# # it could spin down now if enough time passed
# touch /mnt/space/foo
# # process 1352 will start writing to the drive at a 35-36s interval,
# # even if there has been no other write request.

  8,16   1    36591  6306.873151576  1352  A WBS 976985862 + 2 <- (8,17) 976985799
  8,16   1    36592  6306.873152998  1352  Q WBS 976985862 + 2 [xfssyncd/sdb1]
[..]
  8,16   1    36600  6342.875151286  1352  A WBS 976985864 + 2 <- (8,17) 976985801
  8,16   1    36601  6342.875152938  1352  Q WBS 976985864 + 2 [xfssyncd/sdb1]
[..]
  8,16   1    36609  6378.877225211  1352  A WBS 976985866 + 2 <- (8,17) 976985803
  8,16   1    36610  6378.877226935  1352  Q WBS 976985866 + 2 [xfssyncd/sdb1]
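The interval can be read straight off the trace timestamps. A minimal sketch, assuming the default blkparse column layout shown above (device, cpu, sequence, time, pid, action, rwbs, sector, +, count, [process]); the function names are hypothetical and the sample lines are copied from the trace:

```python
# Sketch: measure the gap between successive xfssyncd queue ("Q") events
# in blkparse output. SAMPLE holds the three Q lines from the trace above.
SAMPLE = """\
  8,16   1    36592  6306.873152998  1352  Q WBS 976985862 + 2 [xfssyncd/sdb1]
  8,16   1    36601  6342.875152938  1352  Q WBS 976985864 + 2 [xfssyncd/sdb1]
  8,16   1    36610  6378.877226935  1352  Q WBS 976985866 + 2 [xfssyncd/sdb1]
"""


def queue_times(text, process="xfssyncd"):
    """Return timestamps (seconds) of Q events whose process tag contains `process`."""
    times = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: dev cpu seq time pid action rwbs sector + count [proc]
        if len(fields) >= 11 and fields[5] == "Q" and process in fields[-1]:
            times.append(float(fields[3]))
    return times


def intervals(times):
    """Return the differences between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in intervals(queue_times(SAMPLE)):
        print(f"{gap:.3f} s")
```

On the three sampled events both gaps come out at roughly 36 seconds, and each write is 2 sectors long at a start sector that advances by 2 every time.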

there was no file at or near the 976985799 inode (I presume that's an inode?)
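For what it's worth, the numbers on blkparse "A" (remap) lines are sector offsets rather than inode numbers: 976985799 is the sector relative to the partition (8,17) and 976985862 is the same location on the whole disk (8,16), 63 sectors further in. A rough sketch of where that sector falls in the filesystem, assuming 512-byte sectors and taking bsize and agsize from the xfs_info output further down; the function name is hypothetical:

```python
# Sketch: map a partition-relative sector from the trace to an XFS
# allocation group (AG). Geometry values are taken from xfs_info:
# bsize=4096, agsize=61047500 blks, sectsz=512.
SECTOR_SIZE = 512
BLOCK_SIZE = 4096
AG_SIZE_BLOCKS = 61047500


def locate(sector):
    """Return (filesystem block, AG index, block offset inside that AG)."""
    sectors_per_block = BLOCK_SIZE // SECTOR_SIZE
    fs_block = sector // sectors_per_block
    ag_index, ag_offset = divmod(fs_block, AG_SIZE_BLOCKS)
    return fs_block, ag_index, ag_offset


if __name__ == "__main__":
    # Partition start implied by the remap: whole-disk sector minus partition sector.
    print("partition offset:", 976985862 - 976985799, "sectors")
    print("block, AG, offset in AG:", locate(976985799))
```

With these figures the write lands in AG 2, about 28,000 blocks past the start of that group. Whether that region holds the internal log is not something the trace alone confirms.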

I found that the only way to stop it is to remount the partition. I also tried sync(1), but to no avail. 

so is there an XFS option somewhere that would make the filesystem more forgiving with the hardware beneath it? without losing the journal of course.

I'm using a vanilla 2.6.36.2 kernel patched with grsecurity, default mkfs.xfs options, rw,nosuid,nodev,noexec,noatime,attr2,noquota mount options, and xfs_info looks like this:

meta-data=/dev/sdb1              isize=256    agcount=4, agsize=61047500 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=244190000, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

a probably related issue is with modern Western Digital/Hitachi hard drives that use a Ramp load/unload technology, which automatically parks the heads after stupidly small inactivity intervals (some as small as 8 seconds), so look what happens when using such a drive with xfs:

# smartctl -a /dev/sda
[..]
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WCAZA0101731
Firmware Version: 50.0AB50
[..]
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       17
[..]
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       3351
[..]
193 Load_Cycle_Count        0x0032   066   066   000    Old_age   Always       -       403405

this hard drive has exceeded its 300k load/unload maximum from the specs in only 140 days, which means it was woken up every 30s or so. not willingly.
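The 140 days and 30s figures follow from the SMART attributes above. A back-of-the-envelope sketch, assuming the load cycles were spread evenly over the power-on time; the variable names are hypothetical, and the 300k rating is the one quoted from the specs:

```python
# Sketch: derive the average load/unload interval from the smartctl output.
POWER_ON_HOURS = 3351        # attribute 9, Power_On_Hours
LOAD_CYCLE_COUNT = 403405    # attribute 193, Load_Cycle_Count
RATED_CYCLES = 300_000       # load/unload maximum quoted from the drive specs

power_on_days = POWER_ON_HOURS / 24
seconds_per_cycle = POWER_ON_HOURS * 3600 / LOAD_CYCLE_COUNT
rating_used = LOAD_CYCLE_COUNT / RATED_CYCLES

if __name__ == "__main__":
    print(f"power-on time: {power_on_days:.1f} days")
    print(f"average gap between load cycles: {seconds_per_cycle:.1f} s")
    print(f"share of rated cycles used: {rating_used:.0%}")
```

That works out to one head load roughly every 30 seconds, the same order as the 35-36s xfssyncd interval, with the drive already about a third past its rated cycle count.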

cheers,
peter
