Yeah, I echoed 1 into writeback before doing the test. And why
wouldn't I get any cache hits?

On Fri, Dec 16, 2011 at 11:52 AM, Kent Overstreet <koverstreet@xxxxxxxxxx> wrote:
> That's what you'd expect in writethrough mode when you aren't getting
> any cache hits - try flipping on writeback and see what happens.
>
> On Fri, Dec 16, 2011 at 10:49 AM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>> Actually I think this IS user error. I ran a benchmark with FIO, and
>> the results were practically identical with and without bcache. I
>> applied the 3.1.4 kernel patch on top of your 3.1 tree, even though it
>> applied cleanly I'm guessing that wiped something out. Here are my
>> stats after running the benchmark on bcache, and also included is the
>> fio config.
>>
>> bypassed 32.1G
>> cache_bypass_hits 5482
>> cache_bypass_misses 194862
>> cache_hit_ratio 3
>> cache_hits 786
>> cache_miss_collisions 206
>> cache_misses 19447
>> cache_readaheads 0
>>
>> [global]
>> ioengine=libaio
>> iodepth=4
>> invalidate=1 #make sure we're not cached locally
>> direct=1 #don't use buffers during test (test without local caches)
>> thread
>> ramp_time=20
>> time_based
>> runtime=180
>>
>> [8RandomReadWriters]
>> rw=randrw
>> numjobs=8
>> blocksize=4k
>> size=1G
>>
>> [2SequentialReadWriters]
>> rw=rw
>> numjobs=2
>> size=4G
>> blocksize_range=64k-1M
>>
>>
>> On Thu, Dec 15, 2011 at 9:28 PM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>>> Thanks! I'll put it through some more tests. I kind of figured that
>>> something more real-world would help.
>>>
>>> On Thu, Dec 15, 2011 at 7:17 PM, Kent Overstreet <koverstreet@xxxxxxxxxx> wrote:
>>>> Sorry, I was thinking about that issue for awhile and then I got distracted...
>>>>
>>>> It's not user error, it's an irritating corner case. Basically, it's
>>>> the result of a workaround for a particularly obscure data corruption
>>>> bug.
>>>>
>>>> If a write bypasses the cache, it has to invalidate that region of the
>>>> cache; the null key it leaves in the cache will block cache misses
>>>> from adding that data to the cache until the btree node fills up (and
>>>> possibly splits).
>>>>
>>>> It hasn't been an issue for us in normal operation, but when you're
>>>> just testing - i.e. you don't have much load - that node split may not
>>>> happen for a long time, and so if for some reason a bunch of data
>>>> bypassed the cache... well, you see what happens.
>>>>
>>>> Unfortunately a better solution to the original race is not going to
>>>> be simple, so it's probably not going to be done in the very near
>>>> future. It's a _very_ difficult race to hit, but in the meantime I'd
>>>> rather lose performance than corrupt data.
>>>>
>>>> But the good news is if you put normal server-ish load on it the issue
>>>> should go away in steady state operation.
>>>>
>>>> On Thu, Dec 15, 2011 at 3:40 PM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>>>>> Any ideas on this? Do you think it's a bug, or am I just holding it wrong? :-)
>>>>>
>>>>> On Sat, Dec 10, 2011 at 8:02 AM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>>>>>> That keeps the 'bypassed' value from increasing, but it doesn't change
>>>>>> write performance.
>>>>>>
>>>>>> BEFORE:
>>>>>> [root@sansrv2-10 stats_day]# cat *
>>>>>> 27.6M
>>>>>> 83
>>>>>> 3500
>>>>>> 0
>>>>>> 166
>>>>>> 24380
>>>>>> 40660
>>>>>> 0
>>>>>>
>>>>>> ...benchmarking...
>>>>>>
>>>>>> AFTER:
>>>>>>
>>>>>> [root@sansrv2-10 stats_day]# for i in `ls`; do echo -n "$i "; cat $i;
>>>>>>> done 2>/dev/null
>>>>>> bypassed 27.6M
>>>>>> cache_bypass_hits 83
>>>>>> cache_bypass_misses 3500
>>>>>> cache_hit_ratio 0
>>>>>> cache_hits 410
>>>>>> cache_miss_collisions 48879
>>>>>> cache_misses 80545
>>>>>> cache_readaheads 0
>>>>>>
>>>>>> /sys/fs/bcache/60da061c-d646-4ebe-931a-d8580add411d
>>>>>>
>>>>>> average_key_size 0
>>>>>> block_size 2.0k
>>>>>> btree_cache_size 3.2M
>>>>>> bucket_size 1.0M
>>>>>> cache_available_percent 100
>>>>>> clear_stats congested 0
>>>>>> congested_threshold_us 0
>>>>>> dirty_data 0
>>>>>> io_error_halflife 0
>>>>>> io_error_limit 8
>>>>>> root_usage_percent 0
>>>>>> synchronous 1
>>>>>> tree_depth 1
>>>>>>
>>>>>>
>>>>>> On Fri, Dec 9, 2011 at 11:33 PM, Kent Overstreet
>>>>>> <kent.overstreet@xxxxxxxxx> wrote:
>>>>>>> On Fri, Dec 09, 2011 at 10:09:55AM -0700, Marcus Sorensen wrote:
>>>>>>>> Here's some more info. I'm running kernel 3.1.4. When I do random
>>>>>>>> writes, the 'bypassed' number increases in stats. Now I'm random
>>>>>>>> writing direct to /dev/bcache0 and get the same result.
>>>>>>>
>>>>>>> Weird. From what you're describing it sounds like throttling is screwed
>>>>>>> up (and it was recently), but I can't reproduce it now.
>>>>>>>
>>>>>>> Can you try echoing 0 to congested_threshold_us in the cache set dir,
>>>>>>> and seeing if that fixes it?
>>>>>>>
>>>>>>>> There also seems to be some work needed with clean-up, since I'm
>>>>>>>> unfamiliar with how bcache works I attempted to make-bcache twice,
>>>>>>>> thinking I'd start over. That worked, but because my cache device was
>>>>>>>> already registered I was unable to re-register my newly formatted
>>>>>>>> cache dev, got "kobject_add_internal failed for bcache with -EEXIST,
>>>>>>>> don't try to register things with the same name in the same
>>>>>>>> directory."
>>>>>>>> I was still able to use my cache device via the old uuid,
>>>>>>>> but this will probably cause problems on reboot. Perhaps an unregister
>>>>>>>> file in /sys/fs/bcache would help, I also tried rmmod'ing bcache to
>>>>>>>> see if I could clear /sys/fs/bcache, but no luck. make-bcache should
>>>>>>>> perhaps check for an existing superblock, ask for confirmation, and
>>>>>>>> give some sort of instruction on how to unregister, or do it for you if
>>>>>>>> you reformat.
>>>>>>>
>>>>>>> Yeah, I think for some reason bcache isn't opening the devices
>>>>>>> exclusively on 3.1. I'll have a look...
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html