Re: Quick bcache benchmark

Marcus Sorensen <shadowsor@xxxxxxxxx> · Fri, 16 Dec 2011 15:45:33 -0700

Yeah, I echoed 1 into writeback before doing the test. And why
wouldn't I get any cache hits?

On Fri, Dec 16, 2011 at 11:52 AM, Kent Overstreet
<koverstreet@xxxxxxxxxx> wrote:
> That's what you'd expect in writethrough mode when you aren't getting
> any cache hits - try flipping on writeback and see what happens.
>
> On Fri, Dec 16, 2011 at 10:49 AM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>> Actually I think this IS user error. I ran a benchmark with FIO, and
>> the results were practically identical with and without bcache. I
>> applied the 3.1.4 kernel patch on top of your 3.1 tree; even though it
>> applied cleanly, I'm guessing that wiped something out. Here are my
>> stats after running the benchmark on bcache, along with the fio
>> config.
>>
>> bypassed 32.1G
>> cache_bypass_hits 5482
>> cache_bypass_misses 194862
>> cache_hit_ratio 3
>> cache_hits 786
>> cache_miss_collisions 206
>> cache_misses 19447
>> cache_readaheads 0
>>
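As a quick sanity check on the stats above, the reported cache_hit_ratio can be reproduced from the raw counters. This is a minimal sketch that assumes the ratio is hits as an integer percentage of hits plus misses; that formula and the function name are assumptions, not something stated in this thread.

```python
def hit_ratio_percent(hits: int, misses: int) -> int:
    """Integer percentage of hits among hits + misses (assumed formula)."""
    total = hits + misses
    if total == 0:
        return 0  # avoid dividing by zero on a freshly cleared stats dir
    return hits * 100 // total


# Counters copied from the stats listing above.
cached = hit_ratio_percent(hits=786, misses=19447)
bypass = hit_ratio_percent(hits=5482, misses=194862)
print(cached, bypass)
```

With these inputs the first value comes out as 3, matching the reported cache_hit_ratio. The second applies the same formula to the bypass counters purely for comparison.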
>> [global]
>> ioengine=libaio
>> iodepth=4
>> invalidate=1 #make sure we're not cached locally
>> direct=1 #don't use buffers during test (test without local caches)
>> thread
>> ramp_time=20
>> time_based
>> runtime=180
>>
>> [8RandomReadWriters]
>> rw=randrw
>> numjobs=8
>> blocksize=4k
>> size=1G
>>
>> [2SequentialReadWriters]
>> rw=rw
>> numjobs=2
>> size=4G
>> blocksize_range=64k-1M
>>
>>
>> On Thu, Dec 15, 2011 at 9:28 PM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>>> Thanks! I'll put it through some more tests. I kind of figured that
>>> something more real-world would help.
>>>
>>> On Thu, Dec 15, 2011 at 7:17 PM, Kent Overstreet <koverstreet@xxxxxxxxxx> wrote:
>>>> Sorry, I was thinking about that issue for a while and then I got distracted...
>>>>
>>>> It's not user error, it's an irritating corner case. Basically, it's
>>>> the result of a workaround for a particularly obscure data corruption
>>>> bug.
>>>>
>>>> If a write bypasses the cache, it has to invalidate that region of the
>>>> cache; the null key it leaves in the cache will block cache misses
>>>> from adding that data to the cache until the btree node fills up (and
>>>> possibly splits).
>>>>
>>>> It hasn't been an issue for us in normal operation, but when you're
>>>> just testing - i.e. you don't have much load - that node split may not
>>>> happen for a long time, and so if for some reason a bunch of data
>>>> bypassed the cache... well, you see what happens.
>>>>
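The behaviour described above can be illustrated with a toy model. This is only a sketch of the idea as explained in this message, not bcache's actual data structures: the class, its fields, and the fixed node capacity are all hypothetical.

```python
class ToyNode:
    """Toy stand-in for one btree node; all names here are hypothetical."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.keys = {}            # region -> "data" or None (None = null key)
        self.blocked_inserts = 0

    def _make_room(self):
        # Stand-in for the node filling up: null keys are dropped at that point.
        if len(self.keys) >= self.capacity:
            self.keys = {r: v for r, v in self.keys.items() if v is not None}

    def bypass_write(self, region):
        # A write that skips the cache invalidates the region with a null key.
        self._make_room()
        self.keys[region] = None

    def read(self, region):
        # Returns True on a cache hit; on a miss, tries to cache the region.
        if self.keys.get(region) == "data":
            return True
        if region in self.keys:   # a null key is still present: insert blocked
            self.blocked_inserts += 1
        else:
            self._make_room()
            self.keys[region] = "data"
        return False


node = ToyNode(capacity=4)
node.bypass_write("A")
light = [node.read("A") for _ in range(3)]  # light load: "A" never becomes a hit
for region in ("B", "C", "D"):
    node.read(region)                       # more load fills the node
node.read("E")                              # node is full, so the null key is dropped
node.read("A")                              # still a miss, but now it gets cached
after = node.read("A")                      # hit
print(light, node.blocked_inserts, after)
```

In this toy, repeated reads of a bypassed region stay misses until enough other traffic fills the node, which mirrors why a lightly loaded test setup can see almost no hits.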
>>>> Unfortunately a better solution to the original race is not going to
>>>> be simple, so it's probably not going to be done in the very near
>>>> future. It's a _very_ difficult race to hit, but in the meantime I'd
>>>> rather lose performance than corrupt data.
>>>>
>>>> But the good news is if you put normal server-ish load on it the issue
>>>> should go away in steady state operation.
>>>>
>>>> On Thu, Dec 15, 2011 at 3:40 PM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>>>>> Any ideas on this? Do you think it's a bug, or am I just holding it wrong? :-)
>>>>>
>>>>> On Sat, Dec 10, 2011 at 8:02 AM, Marcus Sorensen <shadowsor@xxxxxxxxx> wrote:
>>>>>> That keeps the 'bypassed' value from increasing, but it doesn't change
>>>>>> write performance.
>>>>>>
>>>>>> BEFORE:
>>>>>> [root@sansrv2-10 stats_day]# cat *
>>>>>> 27.6M
>>>>>> 83
>>>>>> 3500
>>>>>> 0
>>>>>> 166
>>>>>> 24380
>>>>>> 40660
>>>>>> 0
>>>>>>
>>>>>> ...benchmarking...
>>>>>>
>>>>>> AFTER:
>>>>>>
>>>>>> [root@sansrv2-10 stats_day]# for i in `ls`; do echo -n "$i "; cat $i; done 2>/dev/null
>>>>>> bypassed 27.6M
>>>>>> cache_bypass_hits 83
>>>>>> cache_bypass_misses 3500
>>>>>> cache_hit_ratio 0
>>>>>> cache_hits 410
>>>>>> cache_miss_collisions 48879
>>>>>> cache_misses 80545
>>>>>> cache_readaheads 0
>>>>>>
>>>>>> /sys/fs/bcache/60da061c-d646-4ebe-931a-d8580add411d
>>>>>>
>>>>>> average_key_size 0
>>>>>> block_size 2.0k
>>>>>> btree_cache_size 3.2M
>>>>>> bucket_size 1.0M
>>>>>> cache_available_percent 100
>>>>>> clear_stats congested 0
>>>>>> congested_threshold_us 0
>>>>>> dirty_data 0
>>>>>> io_error_halflife 0
>>>>>> io_error_limit 8
>>>>>> root_usage_percent 0
>>>>>> synchronous 1
>>>>>> tree_depth 1
>>>>>>
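The BEFORE numbers above have no labels, but `cat *` prints files in the same alphabetical order as the labelled AFTER listing, so the two can be lined up. A minimal sketch of the per-counter change over the benchmark, assuming that ordering holds (the variable and function names are only for illustration):

```python
NAMES = [
    "bypassed", "cache_bypass_hits", "cache_bypass_misses", "cache_hit_ratio",
    "cache_hits", "cache_miss_collisions", "cache_misses", "cache_readaheads",
]
BEFORE = ["27.6M", "83", "3500", "0", "166", "24380", "40660", "0"]
AFTER = ["27.6M", "83", "3500", "0", "410", "48879", "80545", "0"]


def deltas(names, before, after):
    """Return {name: after - before} for values that are plain integers."""
    out = {}
    for name, b, a in zip(names, before, after):
        if b.isdigit() and a.isdigit():  # skips human-readable sizes like 27.6M
            out[name] = int(a) - int(b)
    return out


changes = deltas(NAMES, BEFORE, AFTER)
for name, change in changes.items():
    print(f"{name:24} {change:+d}")
```

Over the run, cache_hits rose by 244 while cache_misses rose by 39885 and cache_miss_collisions by 24499; the bypass counters did not move.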
>>>>>>
>>>>>> On Fri, Dec 9, 2011 at 11:33 PM, Kent Overstreet
>>>>>> <kent.overstreet@xxxxxxxxx> wrote:
>>>>>>> On Fri, Dec 09, 2011 at 10:09:55AM -0700, Marcus Sorensen wrote:
>>>>>>>> Here's some more info. I'm running kernel 3.1.4. When I do random
>>>>>>>> writes, the 'bypassed' number increases in stats. Now I'm random
>>>>>>>> writing direct to /dev/bcache0 and get the same result.
>>>>>>>
>>>>>>> Weird. From what you're describing it sounds like throttling is screwed
>>>>>>> up (and it was recently), but I can't reproduce it now.
>>>>>>>
>>>>>>> Can you try echoing 0 to congested_threshold_us in the cache set dir,
>>>>>>> and seeing if that fixes it?
>>>>>>>
>>>>>>>> There also seems to be some work needed on clean-up. Since I'm
>>>>>>>> unfamiliar with how bcache works, I ran make-bcache twice,
>>>>>>>> thinking I'd start over. That worked, but because my cache device was
>>>>>>>> already registered I was unable to re-register my newly formatted
>>>>>>>> cache dev; I got "kobject_add_internal failed for bcache with -EEXIST,
>>>>>>>> don't try to register things with the same name in the same
>>>>>>>> directory." I was still able to use my cache device via the old uuid,
>>>>>>>> but this will probably cause problems on reboot. Perhaps an unregister
>>>>>>>> file in /sys/fs/bcache would help. I also tried rmmod'ing bcache to
>>>>>>>> see if I could clear /sys/fs/bcache, but no luck. make-bcache should
>>>>>>>> perhaps check for an existing superblock, ask for confirmation, and
>>>>>>>> give some sort of instruction on how to unregister, or do it for you if
>>>>>>>> you reformat.
>>>>>>>
>>>>>>> Yeah, I think for some reason bcache isn't opening the devices
>>>>>>> exclusively on 3.1. I'll have a look...
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html