Re: Caching policy in machine learning context

Zdenek Kabelac <zkabelac@redhat.com> · Thu, 16 Feb 2017 11:29:53 +0100

On 15.2.2017 at 14:30, Jonas Degrave wrote:
Thanks, I tried your suggestions, and tried going back to the mq policy and
playing with those parameters. In the end, I tried:

    lvchange --cachesettings 'migration_threshold=20000000
    sequential_threshold=10000000 read_promote_adjustment=1
    write_promote_adjustment=4' VG

With little success. This is probably due to the mq policy looking only at the
hit count, rather than the hit rate. Or at least, that is what I gather from
line 595 in the code:
http://lxr.free-electrons.com/source/drivers/md/dm-cache-policy-mq.c?v=3.19#L595

I wrote a small script, so my users could empty the cache manually, if they
want to:

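    #!/bin/bash
    # Empty the cache by tearing down and recreating the cache pool.
    if [ "$(id -u)" != "0" ]; then
       echo "This script must be run as root" 1>&2
       exit 1
    fi
    # Remove the existing cache pool.
    lvremove -y VG/lv_cache
    # Recreate the cache data and metadata LVs on /dev/sda.
    lvcreate -L 445G -n lv_cache VG /dev/sda
    lvcreate -L 1G -n lv_cache_meta VG /dev/sda
    # Combine them into a cache pool, select smq, and attach the pool to VG/lv.
    lvconvert -y --type cache-pool --poolmetadata VG/lv_cache_meta VG/lv_cache
    lvchange --cachepolicy smq VG
    lvconvert --type cache --cachepool VG/lv_cache VG/lv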

So the only remaining option for me would be to write my own policy. This
should be quite simple, as you basically need to act as if the cache is not
full yet.

Can someone point me in the right direction as to how to do this? I have tried
to find the latest version of the code, but the best I could find was a Red Hat
CVS server which times out when connecting.

    cvs -d :pserver:cvs@sources.redhat.com:/cvs/dm login cvs
    CVS password:
    cvs [login aborted]: connect to sources.redhat.com(209.132.183.64):2401 failed: Connection timed out

Can someone direct me to the latest source of the smq policy?

Hi

Yep - it does look like you have a special use case where you know 'ahead
of time' what the usage pattern is going to be.

The 'smq' policy is designed to fill rather 'slowly' over time with
longer-lived data that is known to be used over and over - i.e. after a
reboot there is a good chance you will need it again.

But in your case it seems you need a policy which fills very quickly with
the current set of data - i.e. some sort of page-cache extension.
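
To make the 'fills very quickly' idea concrete, here is a toy user-space
model in bash. It is only a sketch and not kernel code: every miss is
promoted immediately and the least recently used block is evicted when the
cache is full. CACHE_SIZE, access_block and the block numbers are made-up
names for illustration and have nothing to do with the real dm-cache policy
interface.

```shell
#!/bin/bash
# Toy model of an "always promote" cache with LRU eviction.
# Assumption: CACHE_SIZE and the access trace below are illustrative only.
CACHE_SIZE="${CACHE_SIZE:-3}"   # capacity in blocks
cache=()                        # cached blocks, oldest first, newest last
hits=0
misses=0

access_block() {
    local block="$1" entry found=0
    local kept=()
    # Rebuild the list without the requested block, noting whether it was cached.
    for entry in "${cache[@]}"; do
        if [ "$entry" = "$block" ]; then
            found=1
        else
            kept+=("$entry")
        fi
    done
    if [ "$found" -eq 1 ]; then
        hits=$((hits + 1))
    else
        misses=$((misses + 1))
        # Cache full: evict the least recently used (oldest) block.
        if [ "${#kept[@]}" -ge "$CACHE_SIZE" ]; then
            kept=("${kept[@]:1}")
        fi
    fi
    # Hit or miss, the block ends up as the most recently used entry,
    # so a miss is promoted on first access.
    cache=("${kept[@]}" "$block")
}

# Example trace of block numbers.
for b in 1 2 3 1 2 4 1; do
    access_block "$b"
done
echo "hits=$hits misses=$misses cache=${cache[*]}"
```

With three slots, this trace gives 3 hits and 4 misses. A hit-count-based
policy would leave block 4 out of the cache until it had been seen several
times, which is the behaviour the original poster wants to avoid.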

So to get to the source:

https://github.com/torvalds/linux/blob/master/drivers/md/dm-cache-policy-smq.c

It's a relatively 'small' piece of code - but it may take a while to get into
it, as you need to fit within the policy rules - there is a limited amount of
data you may keep with each cached data block, among other constraints...

Once you get a new dm caching policy loaded, lvm2 should be able to use it,
as cache_policy & cache_settings are 'free-form' strings.
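
As a rough sketch of what that could look like from the lvm2 side: the
policy name 'mypolicy' and its setting 'example_knob' below are hypothetical
placeholders, VG/lv is the cached LV from the script earlier in this thread,
and the dm-cache-<name> module naming is an assumption to check against your
kernel. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/bash
# Sketch only: "mypolicy" and "example_knob" are hypothetical placeholders.
POLICY="${POLICY:-mypolicy}"
SETTINGS="${SETTINGS:-example_knob=1}"
LV="${LV:-VG/lv}"

# Print each command by default; set DRY_RUN=0 (as root) to really run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Load the policy module (assumed to be named dm-cache-<policy>).
run modprobe "dm-cache-${POLICY}"
# Pass the policy name and its settings to lvm2 as plain strings.
run lvchange --cachepolicy "$POLICY" --cachesettings "$SETTINGS" "$LV"
# Show which policy and settings the LV now reports.
run lvs -o+cache_policy,cache_settings "$LV"
```

Keeping the dry run as the default makes it easy to review the exact
commands before touching a live cache.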

For the 4.12 kernel (likely) there is going to be a new 'cache2-like' which
should be much faster at startup... but it may or may not solve your
special 100GB workload.

Regards

Zdenek

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/