On 19 July 2014 02:11, Ian Pilcher <arequipeno@xxxxxxxxx> wrote: > I just finished moving my existing Fedora 20 root filesystem onto a > bcache device (actually LVM on top of a bcache physical volume). > > The bcache cache device is /dev/sda2, a partition on my SSD; the backing > device is /dev/md126p5, a partition on an Intel RAID (imsm) volume. > > This configuration only boots successfully about 50% of the time. The > other 50% of the time, the bcache device is not created, and dracut > times out and dumps me into an emergency shell. > > After changing the bcache-register script to use /sys/fs/bcache/register > (instead of register_quiet), I see a "device busy" error when udev > attempts to register the backing device: > > [ 2.105581] bcache: register_bcache() error opening /dev/md126p5: > device busy > > This is kernel 3.5.15, so this doesn't mean that the device is already > registered; something else has it (temporarily) opened. I say that it's > opened temporarily, because I am able to register the backing device > manually from the dracut shell -- which starts the the bcache device. > > Looking at /usr/lib/udev/bcache-register and the bcache_register source > in drivers/md/bcache/super.c, I notice 2 things. > > (1) bcache-register gives up immediately when an error occurs because of > a (possibly temporary) conflict. > > (2) Although the driver logs a different message in the already > registered case ("device already registered" instead of "device > busy"), it doesn't provide userspace with any way to distinguish the > two cases; it always returns -EINVAL. > > Suggested fix: > > (1) Change bcache_register to return -EBUSY in the device busy case > (while still returning -EINVAL in the already registered case). > > (2) Change bcache-register to check the exit code of the registration > attempt and retry in the EBUSY case. > > Does this make sense? > > -- > ======================================================================== > Ian Pilcher arequipeno@xxxxxxxxx > -------- "I grew up before Mark Zuckerberg invented friendship" -------- > ======================================================================== > > -- > To unsubscribe from this list: send the line "unsubscribe linux-bcache" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html Hello, I am using bcache for a boot device and have found a "rare" behaviour, too. In my case, I think bcache device is tried to be registered twice, first incorrectly (perhaps first readonly system mount?), or tries to register whole devices (in my case bcache0 is made of sda1 and sdb7) instead of partitions. It could be related to initramfs scripts. My lsblk: .# lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 238,5G 0 disk ├─sda1 8:1 0 201,8G 0 part │ └─bcache0 254:0 0 898G 0 disk / ├─sda2 8:2 0 226M 0 part └─sda3 8:3 0 36,5G 0 part sdb 8:16 0 931,5G 0 disk ├─sdb1 8:17 0 1,9G 0 part ├─sdb2 8:18 0 1M 0 part ├─sdb5 8:21 0 953M 0 part /boot ├─sdb6 8:22 0 30,8G 0 part [SWAP] └─sdb7 8:23 0 898G 0 part └─bcache0 254:0 0 898G 0 disk / I tracked down the code by adding some variables log and found that flash_dev_run is sometimes called with values higher than 0x9000000000000000 and makes bcache lock system boot: Feb 26 16:40:02 minijep kernel: [ 3.062304] sd 1:0:0:0: [sda] 500118192 512-byte logical blocks: (256 GB/238 GiB) Feb 26 16:40:02 minijep kernel: [ 3.062342] sd 1:0:0:0: [sda] Write Protect is off Feb 26 16:40:02 minijep kernel: [ 3.062357] sd 1:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Feb 26 16:40:02 minijep kernel: [ 3.063053] sda: sda1 sda2 sda3 Feb 26 16:40:02 minijep kernel: [ 3.063305] sd 1:0:0:0: [sda] Attached SCSI disk Feb 26 16:40:02 minijep kernel: [ 3.065135] sr0: scsi3-mmc drive: 24x/24x writer dvd-ram cd/rw xa/form2 cdda tray Feb 26 16:40:02 minijep kernel: [ 3.065136] cdrom: Uniform CD-ROM driver Revision: 3.20 Feb 26 16:40:02 minijep kernel: [ 3.065351] sd 4:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB) Feb 26 16:40:02 minijep kernel: [ 3.065355] sd 4:0:0:0: [sdb] 4096-byte physical blocks Feb 26 16:40:02 minijep kernel: [ 3.065394] sd 4:0:0:0: [sdb] Write Protect is off Feb 26 16:40:02 minijep kernel: [ 3.065409] sd 4:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Feb 26 16:40:02 minijep kernel: [ 3.066631] sd 1:0:0:0: Attached scsi generic sg0 type 0 Feb 26 16:40:02 minijep kernel: [ 3.066669] sr 2:0:0:0: Attached scsi generic sg1 type 5 Feb 26 16:40:02 minijep kernel: [ 3.066710] sd 4:0:0:0: Attached scsi generic sg2 type 0 Feb 26 16:40:02 minijep kernel: [ 3.144996] bcache: bch_cache_set_alloc() P7000: bch_cache_set_alloc: c->nr_uuids 4096 Feb 26 16:40:02 minijep kernel: [ 3.145035] bio: create slab <bio-1> at 1 Feb 26 16:40:02 minijep kernel: [ 3.357056] sdb: sdb1 sdb2 sdb5 sdb6 sdb7 Feb 26 16:40:02 minijep kernel: [ 3.357972] sd 4:0:0:0: [sdb] Attached SCSI disk Feb 26 16:40:02 minijep kernel: [ 3.386779] bcache: uuid_io() read UUIDs at 0:0 len 1024 -> [0:60275712 gen 0] Feb 26 16:40:02 minijep kernel: [ 3.499528] Switched to clocksource tsc Feb 26 16:40:02 minijep kernel: [ 3.928562] bcache: bch_journal_replay() journal replay done, 354 keys in 31 entries, seq 1076068 Feb 26 16:40:02 minijep kernel: [ 3.928717] bcache: _debug_show_uuids_of_cache_set() probes17-302: flash_dev_run: u ffff880801300000 c->nr_uuids 1000 u + c->nr_uuids ffff880801380000, sizeof(struct uuid_entry) 128, c->nr_uuids*sizeof(struct uuid_entry) 524288 Feb 26 16:40:02 minijep kernel: [ 3.928720] bcache: _debug_show_uuids_of_cache_set() probes17-9999: flash_dev_run: u ffff880801300000 c->nr_uuids 1000 u + c->nr_uuids ffff880801380000, sizeof(struct uuid_entry) 128, c->nr_uuids*sizeof(struct uuid_entry) 524288 Feb 26 16:40:02 minijep kernel: [ 3.928723] bcache: flash_devs_run() P00: flash_dev_run: uuids[0] u: ffff880801300000, u->sectors: 1504f6b8 Feb 26 16:40:02 minijep kernel: [ 3.928724] bcache: flash_devs_run() P00: flash_dev_run: uuids[1] u: ffff880801300080, u->sectors: aed46001 Feb 26 16:40:02 minijep kernel: [ 3.928726] bcache: flash_devs_run() P00: flash_dev_run: uuids[2] u: ffff880801300100, u->sectors: 9000000000800000 Feb 26 16:40:02 minijep kernel: [ 3.928727] bcache: flash_devs_run() P00: flash_dev_run: uuids[3] u: ffff880801300180, u->sectors: 15050838 Feb 26 16:40:02 minijep kernel: [ 3.928729] bcache: flash_devs_run() P00: flash_dev_run: uuids[4] u: ffff880801300200, u->sectors: 13e9fe800 Feb 26 16:40:02 minijep kernel: [ 3.928730] bcache: flash_devs_run() P00: flash_dev_run: uuids[5] u: ffff880801300280, u->sectors: 9000000040000000 Feb 26 16:40:02 minijep kernel: [ 3.928732] bcache: flash_devs_run() P00: flash_dev_run: uuids[6] u: ffff880801300300, u->sectors: 15051ff0 Feb 26 16:40:02 minijep kernel: [ 3.928733] bcache: flash_devs_run() P00: flash_dev_run: uuids[7] u: ffff880801300380, u->sectors: 13ec38800 Feb 26 16:40:02 minijep kernel: [ 3.928734] bcache: flash_devs_run() P00: flash_dev_run: uuids[8] u: ffff880801300400, u->sectors: 9000000037000000 Feb 26 16:40:02 minijep kernel: [ 3.928736] bcache: flash_devs_run() P00: flash_dev_run: uuids[9] u: ffff880801300480, u->sectors: 1505a648 Feb 26 16:40:02 minijep kernel: [ 3.928738] bcache: flash_devs_run() P00: flash_dev_run: uuids[10] u: ffff880801300500, u->sectors: 948d8800 Feb 26 16:40:02 minijep kernel: [ 3.928740] bcache: flash_devs_run() P00: flash_dev_run: uuids[11] u: ffff880801300580, u->sectors: 9000000002000000 Feb 26 16:40:02 minijep kernel: [ 3.928741] bcache: flash_devs_run() P00: flash_dev_run: uuids[12] u: ffff880801300600, u->sectors: 15400798 Feb 26 16:40:02 minijep kernel: [ 3.928743] bcache: flash_devs_run() P00: flash_dev_run: uuids[13] u: ffff880801300680, u->sectors: 2da2c000 Feb 26 16:40:02 minijep kernel: [ 3.928744] bcache: flash_devs_run() P00: flash_dev_run: uuids[14] u: ffff880801300700, u->sectors: 9000000000800000 Those enormous values (incorrectly read from device?) make no sense, compared to normal values. That is why, in order to be able to boot, I add an if filter in flash_devs_run() in super.c: -------in super.c--------------------------------------------- static int flash_devs_run(struct cache_set *c) { int ret = 0; struct uuid_entry *u; for (u = c->uuids; u < c->uuids + c->nr_uuids && !ret; u++) if (u->sectors < 0x9000000000000000) //line added to be able to root boot! if (UUID_FLASH_ONLY(u)) ret = flash_dev_run(c, u); return ret; } ------------------------------------------------------------- Even with this "patch", system boots but it has a 30 second wait during boot process. See wait from time 5.7 to 35.66 log lines: Feb 27 21:05:33 minijep kernel: [ 5.758655] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off Feb 27 21:05:33 minijep kernel: [ 35.663535] Adding 32225276k swap on /dev/sdb6. Priority:-1 extents:1 across:32225276k Feb 27 21:05:33 minijep kernel: [ 35.686366] EXT4-fs (bcache0): re-mounted. Opts: (null) Feb 27 21:05:33 minijep kernel: [ 35.796078] EXT4-fs (bcache0): re-mounted. Opts: errors=remount-ro Feb 27 21:05:33 minijep kernel: [ 36.026857] fuse init (API version 7.22) Feb 27 21:05:33 minijep kernel: [ 36.035753] loop: module loaded -- -- Salutacions...Josep -- -- To unsubscribe from this list: send the line "unsubscribe linux-bcache" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html