On Wed, May 29, 2019 at 5:31 AM Michal Soltys <soltys@xxxxxxxx> wrote:
>
> On 5/28/19 6:31 PM, Song Liu wrote:
> > On Mon, May 27, 2019 at 2:46 AM Michal Soltys <soltys@xxxxxxxx> wrote:
> >>>>
> >>>> Question though - other than trying to add journal to existing live raid
> >>>> - is this feature overall safe to use (or are there any other known
> >>>> issues one should be aware of beforehand) ?
> >>>>
> >>> We (Facebook) have done some tests with it. However, we didn't put
> >>> it into production. The reason behind this decision was not reliability, but
> >>> performance concerns and high level directions. I think Red Hat is
> >>> evaluating it.
> >>>
> >>
> >> Well I will give it a shot probably. My case scenario is that a bunch of
> >> sync-happy VMs on top of lvm+raid seem to be crushing performance
> >> (unless there are other reasons), even with very small disk usage.
> >>
> >> Out of curiosity - is the journal in writeback mode controllable in some
> >> way (e.g. frequency of how often it flushes to raid disks, whether it's
> >> space or time (or both) based ?).
> >
> > It is a combination of both time and space:
> >
> > /*
> >  * log->max_free_space is min(1/4 disk size, 10G reclaimable space).
> >  *
> >  * In write through mode, the reclaim runs every log->max_free_space.
> >  * This can prevent the recovery scans for too long
> >  */
> > #define RECLAIM_MAX_FREE_SPACE (10 * 1024 * 1024 * 2) /* sector */
> > #define RECLAIM_MAX_FREE_SPACE_SHIFT (2)
> >
> > /* wake up reclaim thread periodically */
> > #define R5C_RECLAIM_WAKEUP_INTERVAL (30 * HZ)
> > /* start flush with these full stripes */
> > #define R5C_FULL_STRIPE_FLUSH_BATCH(conf) (conf->max_nr_stripes / 4)
> > /* reclaim stripes in groups */
> > #define R5C_RECLAIM_STRIPE_GROUP (NR_STRIPE_HASH_LOCKS * 2)
> >
> > However, we didn't expose knobs to tune these on a live system.
>
> Would (probably) be awesome one day to have those exposed somehow.

I think Shaohua spent quite some time making them work well without tuning.
But we can certainly add knobs if we see clear benefits.

>
> Few extra questions:
>
> 1) if I have journal in w-b mode, will echoing write-through to
> journal_mode block until all the data is safe on the actual raid devices ?

Yes, it will first flush all data in the cache.

>
> 2) if (for any reason) I need to remove the journal device live -
> assuming (1) is sufficient - is --fail & --remove the correct way to do so ?
>

I think the best way is to switch it to write-through first, then do --fail
and --remove. After that, the array will be read-only. You then need to
reassemble it and force it to run without the journal.

Thanks,
Song