That's what you'd expect in writethrough mode when you aren't getting
any cache hits - try flipping on writeback and see what happens.

On Fri, Dec 16, 2011 at 10:49 AM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
> Actually I think this IS user error. I ran a benchmark with FIO, and
> the results were practically identical with and without bcache. I
> applied the 3.1.4 kernel patch on top of your 3.1 tree, even though it
> applied cleanly I'm guessing that wiped something out. Here are my
> stats after running the benchmark on bcache, and also included is the
> fio config.
>
> bypassed 32.1G
> cache_bypass_hits 5482
> cache_bypass_misses 194862
> cache_hit_ratio 3
> cache_hits 786
> cache_miss_collisions 206
> cache_misses 19447
> cache_readaheads 0
>
> [global]
> ioengine=libaio
> iodepth=4
> invalidate=1 #make sure we're not cached locally
> direct=1 #don't use buffers during test (test without local caches)
> thread
> ramp_time=20
> time_based
> runtime=180
>
> [8RandomReadWriters]
> rw=randrw
> numjobs=8
> blocksize=4k
> size=1G
>
> [2SequentialReadWriters]
> rw=rw
> numjobs=2
> size=4G
> blocksize_range=64k-1M
>
>
> On Thu, Dec 15, 2011 at 9:28 PM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>> Thanks! I'll put it through some more tests. I kind of figured that
>> something more real-world would help.
>>
>> On Thu, Dec 15, 2011 at 7:17 PM, Kent Overstreet <koverstreet@xxxxxxxxxx> wrote:
>>> Sorry, I was thinking about that issue for awhile and then I got distracted...
>>>
>>> It's not user error, it's an irritating corner case. Basically, it's
>>> the result of a workaround for a particularly obscure data corruption
>>> bug.
>>>
>>> If a write bypasses the cache, it has to invalidate that region of the
>>> cache; the null key it leaves in the cache will block cache misses
>>> from adding that data to the cache until the btree node fills up (and
>>> possibly splits).
>>>
>>> It hasn't been an issue for us in normal operation, but when you're
>>> just testing - i.e. you don't have much load - that node split may not
>>> happen for a long time, and so if for some reason a bunch of data
>>> bypassed the cache... well, you see what happens.
>>>
>>> Unfortunately a better solution to the original race is not going to
>>> be simple, so it's probably not going to be done in the very near
>>> future. It's a _very_ difficult race to hit, but in the meantime I'd
>>> rather lose performance than corrupt data.
>>>
>>> But the good news is if you put normal server-ish load on it the issue
>>> should go away in steady state operation.
>>>
>>> On Thu, Dec 15, 2011 at 3:40 PM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>>>> Any ideas on this? Do you think it's a bug, or am I just holding it wrong? :-)
>>>>
>>>> On Sat, Dec 10, 2011 at 8:02 AM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>>>>> That keeps the 'bypassed' value from increasing, but it doesn't change
>>>>> write performance.
>>>>>
>>>>> BEFORE:
>>>>> [root@sansrv2-10 stats_day]# cat *
>>>>> 27.6M
>>>>> 83
>>>>> 3500
>>>>> 0
>>>>> 166
>>>>> 24380
>>>>> 40660
>>>>> 0
>>>>>
>>>>> ...benchmarking...
>>>>>
>>>>> AFTER:
>>>>>
>>>>> [root@sansrv2-10 stats_day]# for i in `ls`; do echo -n "$i "; cat $i;
>>>>>> done 2>/dev/null
>>>>> bypassed 27.6M
>>>>> cache_bypass_hits 83
>>>>> cache_bypass_misses 3500
>>>>> cache_hit_ratio 0
>>>>> cache_hits 410
>>>>> cache_miss_collisions 48879
>>>>> cache_misses 80545
>>>>> cache_readaheads 0
>>>>>
>>>>> /sys/fs/bcache/60da061c-d646-4ebe-931a-d8580add411d
>>>>>
>>>>> average_key_size 0
>>>>> block_size 2.0k
>>>>> btree_cache_size 3.2M
>>>>> bucket_size 1.0M
>>>>> cache_available_percent 100
>>>>> clear_stats congested 0
>>>>> congested_threshold_us 0
>>>>> dirty_data 0
>>>>> io_error_halflife 0
>>>>> io_error_limit 8
>>>>> root_usage_percent 0
>>>>> synchronous 1
>>>>> tree_depth 1
>>>>>
>>>>>
>>>>> On Fri, Dec 9, 2011 at 11:33 PM, Kent Overstreet
>>>>> <kent.overstreet@xxxxxxxxx> wrote:
>>>>>> On Fri, Dec 09, 2011 at 10:09:55AM -0700, Marcus Sorensen wrote:
>>>>>>> Here's some more info. I'm running kernel 3.1.4. When I do random
>>>>>>> writes, the 'bypassed' number increases in stats. Now I'm random
>>>>>>> writing direct to /dev/bcache0 and get the same result.
>>>>>>
>>>>>> Weird. From what you're describing it sounds like throttling is screwed
>>>>>> up (and it was recently), but I can't reproduce it now.
>>>>>>
>>>>>> Can you try echoing 0 to congested_threshold_us in the cache set dir,
>>>>>> and seeing if that fixes it?
>>>>>>
>>>>>>> There also seems to be some work needed with clean-up, since I'm
>>>>>>> unfamiliar with how bcache works I attempted to make-bcache twice,
>>>>>>> thinking I'd start over. That worked, but because my cache device was
>>>>>>> already registered I was unable to re-register my newly formatted
>>>>>>> cache dev, got "kobject_add_internal failed for bcache with -EEXIST,
>>>>>>> don't try to register things with the same name in the same
>>>>>>> directory." I was still able to use my cache device via the old uuid,
>>>>>>> but this will probably cause problems on reboot.
>>>>>>> Perhaps an unregister file in /sys/fs/bcache would help, I also
>>>>>>> tried rmmod'ing bcache to see if I could clear /sys/fs/bcache, but
>>>>>>> no luck. make-bcache should perhaps check for an existing
>>>>>>> superblock, ask for confirmation, and give some sort of instruction
>>>>>>> on how to unregister, or do it for you if you reformat.
>>>>>>
>>>>>> Yeah, I think for some reason bcache isn't opening the devices
>>>>>> exclusively on 3.1. I'll have a look...
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
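
For anyone comparing stats dumps like the ones quoted in this thread, here
is a minimal shell sketch. It prints each stat next to its file name
without parsing `ls`, and recomputes the hit ratio as the integer
percentage hits*100/(hits+misses). That formula is an assumption about how
cache_hit_ratio is derived; it does agree with both dumps in the thread
(786 hits and 19447 misses give 3; 410 hits and 80545 misses give 0). The
names print_stats and hit_ratio are hypothetical helpers, not part of
bcache.

```shell
# Print "name value" for every regular file in a stats directory.
# Files that cannot be read (e.g. write-only ones) print with an
# empty value instead of aborting the loop.
print_stats() {
    ps_dir=$1
    for ps_file in "$ps_dir"/*; do
        [ -f "$ps_file" ] || continue
        printf '%s %s\n' "$(basename "$ps_file")" "$(cat "$ps_file" 2>/dev/null)"
    done
}

# Integer hit percentage from hit and miss counts.
# ASSUMPTION: ratio = hits * 100 / (hits + misses), truncated.
hit_ratio() {
    hr_hits=$1
    hr_misses=$2
    hr_total=$((hr_hits + hr_misses))
    if [ "$hr_total" -eq 0 ]; then
        echo 0
    else
        echo $((hr_hits * 100 / hr_total))
    fi
}
```

From inside the stats_day directory shown in the thread, `print_stats .`
gives the labelled listing, and `hit_ratio 786 19447` recomputes the ratio
for the first dump.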