Re: Is partition alignment needed for RAID partitions ?

Hi Stan,
Size is incorrect in what way?  If your RAID0 chunk is 512KiB, then
3407028224 sectors is 3327176 chunks, evenly divisible, so this
partition is fully aligned.  Whether the capacity is correct is
something only you can determine.  Partition 2 is 1.587 TiB.
Would you mind showing me the calculation you used to get there?  I can see that 3407028224 / 3327176 = 1024, but I don't understand how the 512 KiB chunk size comes into play.
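For reference, the 512 KiB figures in once the chunk size is converted to sectors: with 512-byte logical sectors, one 512 KiB chunk is 1024 sectors.  A minimal shell sketch of the check, using the sector counts from the parted output below:

$ echo $(( 512 * 1024 / 512 ))    # sectors per 512 KiB chunk -> 1024
$ echo $(( 3407028224 % 1024 ))   # partition 2 size modulo chunk -> 0, evenly divisible
$ echo $(( 500000768 % 1024 ))    # partition 2 start sector modulo chunk -> 0, chunk-aligned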
I'm not intending to be a jerk, but this is a technical mailing list.
Understood - here is the complete layout:

/dev/sda - 250 gig disk
/dev/sdb - 2TB disk
/dev/sdc - 2TB disk
/dev/sdd - 256 gig iSCSI target on QNAP NAS (block allocated, not thin provisioned)
/dev/sde - 2TB iSCSI target on QNAP NAS (block allocated, not thin provisioned)
Show your partition table for sdc.  Even if the partitions on it are not
aligned, reads shouldn't be adversely affected by it.  Show

$ mdadm --detail
# parted /dev/sdb unit s print
Model: ATA WDC WD20EARX-008 (scsi)
Disk /dev/sdb: 3907029168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start       End          Size         File system  Name Flags
 1      2048s       500000767s   499998720s raid
 2      500000768s  3907028991s  3407028224s raid

# parted /dev/sdc unit s print
Model: ATA WDC WD20EARX-008 (scsi)
Disk /dev/sdc: 3907029168s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start       End          Size         File system  Name Flags
 1      2048s       500000767s   499998720s raid
 2      500000768s  3907028991s  3407028224s raid


# mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Mon Dec 30 12:33:43 2013
     Raid Level : raid1
     Array Size : 249868096 (238.29 GiB 255.86 GB)
  Used Dev Size : 249868096 (238.29 GiB 255.86 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Tue Dec 31 01:01:42 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : srv01:0  (local to host srv01)
           UUID : 45d71ef8:9a1115cb:8ed0c4d9:95d56df4
         Events : 25

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1

# mdadm --detail /dev/md1
/dev/md1:
        Version : 1.2
  Creation Time : Mon Dec 30 12:33:56 2013
     Raid Level : raid0
     Array Size : 3407027200 (3249.19 GiB 3488.80 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Mon Dec 30 12:33:56 2013
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 512K

           Name : srv01:1  (local to host srv01)
           UUID : abfdcb5e:804fa119:9c4a8d88:fa2f08a7
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       18        0      active sync   /dev/sdb2
       1       8       34        1      active sync   /dev/sdc2


for the RAID0 array.  md itself, especially in the RAID0 personality, is
simply not going to be the -cause- of low performance.  The problem lies
somewhere else.  Given the track record of Western Digital's Green
series of drives, I'm leaning toward that as the cause.  Post the output from

$ smartctl -A /dev/sdb
$ smartctl -A /dev/sdc
# smartctl -A /dev/sdb
smartctl 6.2 2013-04-20 r3812 [i686-linux-3.11.0-14-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   217   186   021    Pre-fail  Always       -       4141
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       102
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       8263
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       102
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       88
193 Load_Cycle_Count        0x0032   155   155   000    Old_age   Always       -       135985
194 Temperature_Celsius     0x0022   121   108   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

# smartctl -A /dev/sdc
smartctl 6.2 2013-04-20 r3812 [i686-linux-3.11.0-14-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   217   186   021    Pre-fail  Always       -       4141
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       100
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       8263
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       100
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       86
193 Load_Cycle_Count        0x0032   156   156   000    Old_age   Always       -       134976
194 Temperature_Celsius     0x0022   122   109   000    Old_age   Always       -       28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

I would have expected the RAID0 device to easily get
up to the 60 MB/s mark?
As the source disk of a bulk file copy over NFS/CIFS?  As a point of
reference, I have a workstation that maxes out at 50 MB/s over FTP and only
24 MB/s over CIFS to/from a server.  Both hosts have far in excess of
100 MB/s of disk throughput.  The 50 MB/s limitation is due to the cheap
Realtek mobo NIC, and the 24 MB/s is a Samba limit.  I've spent dozens of
hours attempting to tweak Samba for greater throughput, but it simply isn't
capable of it on that machine.

Your throughput issues are with your network, not your RAID.  Learn and
use FIO to see what your RAID/disks can do.  For now, a really simple
test is to time a cat of a large file piped to /dev/null, then divide the
file size by the elapsed time.  Or simply do a large read with dd.  This
will be much more informative than "moving data to a NAS", where your
throughput is network-limited, not disk-limited.
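For example, something along these lines (a sketch only; the file path is
illustrative, not taken from this thread):

# sync && echo 3 > /proc/sys/vm/drop_caches       # drop the page cache so the read really hits the disks
# time cat /mnt/array/somelargefile > /dev/null   # file size / elapsed seconds = MB/s
# dd if=/dev/md1 of=/dev/null bs=1M count=4096 iflag=direct   # or a direct sequential read off the RAID0 array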

The system is using a server-grade NIC.  I will run a dd/network test
shortly, after the copy is done.  (I am shifting all the data back to the
NAS, in case I mucked up the partitions :) )  I do recall that this
system was able to fill a gigabit pipe...
Now that you've made it clear the first scenario was over iSCSI, the same as
the second scenario, and not NFS/CIFS, I doubt the TCP stack is the
problem.  Assume the network is fine for now and concentrate on the disk
drives in the host.  That seems the most likely cause of the problem
at this point.
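One way to do that (a sketch; the block count is arbitrary) is to read each
member disk directly and compare the two, since a single slow or erroring
drive will drag the whole RAID0 down to its speed:

# dd if=/dev/sdb of=/dev/null bs=1M count=2048 iflag=direct
# dd if=/dev/sdc of=/dev/null bs=1M count=2048 iflag=direct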

BTW, you didn't state the throughput of the RAID1 device on sdb/sdc.
The RAID0 device is on the same disks, yes?  RAID0 was 15 MB/s.  What
was the RAID1?

ATM, the data is still moving back to the NAS (from the RAID1 device).  According to iostat, this is reading at just over 30000 kB/s (all of my numbers are from iostat -x).

Also, there is no other disk usage on the system.  All the data is currently on the NAS (except the system "stuff" for a quiet firewall).

I just spotted another thing: the two drives are on the same SATA controller, per rescan-scsi-bus:

Scanning for device 3 0 0 0 ...
OLD: Host: scsi3 Channel: 00 Id: 00 Lun: 00
      Vendor: ATA      Model: WDC WD20EARX-008 Rev: 51.0
      Type:   Direct-Access                    ANSI SCSI revision: 05
Scanning for device 3 0 1 0 ...
OLD: Host: scsi3 Channel: 00 Id: 01 Lun: 00
      Vendor: ATA      Model: WDC WD20EARX-008 Rev: 51.0
      Type:   Direct-Access                    ANSI SCSI revision: 05

Would it be better to move these apart?  I remember IDE used to have this issue (two drives sharing one channel), but I also recall that SATA "fixed" that.
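If it helps, which controller each disk hangs off of can be checked without
opening the case; a sketch:

# ls -l /sys/block/sdb /sys/block/sdc   # the symlink targets show the PCI device and ATA host for each disk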

Thanks again,

Pieter
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



