What is the best way to combine 4 HDDs and 2 SSDs into a single filesystem?

I have 4 old hard drives that are slow, loud, big and power-hungry, and 2 SSDs. I'm trying to come up with a way to combine them into a system that has the following characteristics:

A) The hard drives stop spinning 5 minutes after they were last used.
B) The SSDs are used for read and write caching. Writes to the system are absorbed by the SSDs. The hard drives are woken up only when the SSDs are full of dirty data. (This means the SSDs may hold dirty data for a long time.)
C) When data is requested that is not present on the SSDs (a read cache miss), the hard drive that holds that data is woken up.
D) When a hard drive is woken up as a result of a read cache miss, the SSDs write out their dirty data for that drive.
E) If one drive fails, or starts to produce random data, the system must still return the correct data to the user.
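
To make characteristics A to D concrete, here is a minimal Python sketch of the caching policy I have in mind. It is only a toy model under my own assumptions, not how bcache or dm-cache work: the names Hdd, WriteBackCache, capacity and idle_timeout are made up for illustration, blocks are plain dictionary entries, and time is passed in by hand. Characteristic E (redundancy) is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Hdd:
    """A backing hard drive that spins down after an idle timeout (hypothetical model)."""
    name: str
    blocks: dict = field(default_factory=dict)  # block number -> data
    spinning: bool = False
    last_used: float = 0.0

    def wake(self, now: float) -> None:
        self.spinning = True
        self.last_used = now

    def tick(self, now: float, idle_timeout: float) -> None:
        # Characteristic A: spin down once the drive has been idle long enough.
        if self.spinning and now - self.last_used >= idle_timeout:
            self.spinning = False


class WriteBackCache:
    """Toy SSD cache in front of several HDDs (illustration only, not bcache)."""

    def __init__(self, hdds, capacity, idle_timeout=300.0):
        self.hdds = {h.name: h for h in hdds}
        self.capacity = capacity          # max number of dirty blocks held on the SSD
        self.idle_timeout = idle_timeout  # seconds; 300 s = 5 minutes
        self.dirty = {}                   # (hdd name, block) -> data not yet on the HDD
        self.clean = {}                   # (hdd name, block) -> data already on the HDD

    def _flush(self, name, now):
        # Wake one drive and write back every dirty block that belongs to it.
        hdd = self.hdds[name]
        hdd.wake(now)
        for key in [k for k in self.dirty if k[0] == name]:
            data = self.dirty.pop(key)
            hdd.blocks[key[1]] = data
            self.clean[key] = data

    def write(self, name, block, data, now):
        # Characteristic B: writes land on the SSD; HDDs wake only when it is full.
        key = (name, block)
        self.clean.pop(key, None)
        if key not in self.dirty and len(self.dirty) >= self.capacity:
            for n in sorted({k[0] for k in self.dirty}):
                self._flush(n, now)
        self.dirty[key] = data

    def read(self, name, block, now):
        key = (name, block)
        if key in self.dirty:
            return self.dirty[key]
        if key in self.clean:
            return self.clean[key]
        # Characteristics C and D: a miss wakes only the drive that holds the
        # block, and the wake-up is used to flush that drive's dirty data.
        self._flush(name, now)
        data = self.hdds[name].blocks.get(block)
        self.clean[key] = data
        return data

    def tick(self, now):
        for hdd in self.hdds.values():
            hdd.tick(now, self.idle_timeout)


# Small demo: two drives, one read miss, then the idle timeout expires.
drives = [Hdd("hdd0"), Hdd("hdd1")]
drives[0].blocks[7] = b"old data"
demo = WriteBackCache(drives, capacity=4)
demo.write("hdd0", 1, b"new data", now=0.0)  # absorbed by the SSD, no drive wakes
demo.read("hdd0", 7, now=10.0)               # miss: hdd0 wakes and receives block 1
demo.tick(now=310.0)                         # 5 minutes later hdd0 spins down again
```

In this model a read miss on hdd0 never touches hdd1, which is the behaviour I would like from the real stack.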

The first idea is to use this stack of bcache and btrfs:
+-----------------------------------------------------------+
|               btrfs raid 1 (2 copies) /mnt                |
+--------------+--------------+--------------+--------------+
| /dev/bcache0 | /dev/bcache1 | /dev/bcache2 | /dev/bcache3 |
+--------------+--------------+--------------+--------------+
|                       Cache (SSD)                         |
|                       /dev/sda4                           |
+--------------+--------------+--------------+--------------+
| Data HDD     | Data HDD     | Data HDD     | Data HDD     |
| /dev/sda8    | /dev/sda9    | /dev/sda10   | /dev/sda11   |
+--------------+--------------+--------------+--------------+
The good:
Btrfs in raid 1 can handle a failing hard drive, both when the drive fails completely and when it returns corrupt data.
Bcache can use an SSD to cache both the read and the write requests from btrfs.
The not-so-good:
Bcache can only use one SSD, so any dirty data in a write cache would exist on that single SSD only. To still achieve characteristic E, bcache can only be used as a read cache, which prevents characteristic B from being achieved.
I can't get bcache to read ahead the data adjacent to the data that has just been accessed.

The second idea is to use an SSD in front of each hard drive:
+-----------------------------------------------------------+
|                btrfs raid 1 (2 copies) /mnt               |
+--------------+--------------+--------------+--------------+
| /dev/bcache0 | /dev/bcache1 | /dev/bcache2 | /dev/bcache3 |
+--------------+--------------+--------------+--------------+
| Cache SSD    | Cache SSD    | Cache SSD    | Cache SSD    |
| /dev/sda5    | /dev/sda6    | /dev/sda7    | /dev/sda8    |
+--------------+--------------+--------------+--------------+
| Data         | Data         | Data         | Data         |
| /dev/sda9    | /dev/sda10   | /dev/sda11   | /dev/sda12   |
+--------------+--------------+--------------+--------------+
The good:
This setup achieves all the characteristics I'm after.
The not-so-good:
This requires more SSDs and more (SATA) ports than I have.
I can't get bcache to read ahead the data adjacent to the data that has just been accessed.

The third idea is to use mdadm to create a raid 1 array out of the 2 SSDs, giving a fault-tolerant write cache:
+-----------------------------------------------------------+
|                 btrfs raid 1 (2 copies) /mnt              |
+--------------+--------------+--------------+--------------+
| /dev/bcache0 | /dev/bcache1 | /dev/bcache2 | /dev/bcache3 |
+--------------+--------------+--------------+--------------+
|                      bcache Cache                         |
|                         /dev/md0                          |
+-----------------------------------------------------------+
|               mdadm raid 1 array /dev/md0                 |
|             SSD /dev/sda4 and SSD /dev/sda5               |
+--------------+--------------+--------------+--------------+
| Data         | Data         | Data         | Data         |
| /dev/sda9    | /dev/sda10   | /dev/sda11   | /dev/sda12   |
+--------------+--------------+--------------+--------------+
The good:
This setup is capable of achieving all the characteristics I'm after. It can handle the abrupt failure of a single drive.
The not-so-good:
When one of the SSDs starts to produce random data, mdadm cannot know which SSD returns the correct data, and data is lost (both copies of the data that btrfs is trying to write to the underlying storage are on the 2 SSDs).

The fourth idea is to use dm-cache. Dm-cache can only cache one backing device, and it has no way to use 2 cache devices, so every hard drive needs its own SSD:
+-----------------------------------------------------------+
|                btrfs raid 1 (2 copies) /mnt               |
+--------------+--------------+--------------+--------------+
| /dev/bcache0 | /dev/bcache1 | /dev/bcache2 | /dev/bcache3 |
+--------------+--------------+--------------+--------------+
| Cache SSD    | Cache SSD    | Cache SSD    | Cache SSD    |
| /dev/sda5    | /dev/sda6    | /dev/sda7    | /dev/sda8    |
+--------------+--------------+--------------+--------------+
| Data         | Data         | Data         | Data         |
| /dev/sda9    | /dev/sda10   | /dev/sda11   | /dev/sda12   |
+--------------+--------------+--------------+--------------+
The good:
This setup is capable of achieving all the characteristics I'm after.
The not-so-good:
This requires more SSDs and more (SATA) ports than I have.

What options do I have to create the desired setup?
Is it feasible to add a checksum to mdadm, much like btrfs has, so that it can tell which drive (if any) has returned the correct data?
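
To illustrate what I mean by the checksum question, here is a small Python sketch of a two-way mirror read with and without a stored checksum. It is a thought experiment only, not how mdadm or btrfs are implemented: the function names are made up, and CRC32 from Python's zlib module stands in for whatever checksum a real implementation would use.

```python
import zlib


def read_plain_mirror(copy_a: bytes, copy_b: bytes) -> bytes:
    """Two-way mirror without checksums: a mismatch can be noticed, not resolved."""
    if copy_a != copy_b:
        raise ValueError("copies differ and nothing says which one is correct")
    return copy_a


def read_checksummed_mirror(copy_a: bytes, copy_b: bytes, stored_crc: int) -> bytes:
    """Two-way mirror with a checksum saved at write time: return the matching copy."""
    for copy in (copy_a, copy_b):
        if zlib.crc32(copy) == stored_crc:
            return copy
    raise ValueError("both copies fail the checksum")


# One SSD returns the block as written, the other returns corrupted data.
good = b"block written by btrfs"
bad = b"block written by btrfT"
stored_crc = zlib.crc32(good)  # saved alongside the data when it was written

recovered = read_checksummed_mirror(bad, good, stored_crc)
print(recovered == good)
```

With only the two copies, the best a mirror can do is report that they differ. With the stored checksum, the reader can pick the copy that still matches, which is the behaviour btrfs raid 1 gives me on the hard drives and what I am missing on the SSD layer.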

Is this the correct mailing list to ask these questions?
