Re: 4x lower IOPS: Linux MD vs indiv. devices - why?

Tobias Oberstein <tobias.oberstein@xxxxxxxxx> · Tue, 24 Jan 2017 10:28:05 +0100

My current preliminary conclusions on this box / workload:

- running psync is much better than sync

So you likely have a convincing case for Postgres guys to switch over to
pread/pwrite.

I will approach them, but I want to make sure I did all my homework first.

One question that bugs me:

the difference in performance between the sync and psync engines only
surfaces with MD, _not_ when running over individual devices.

---

I ran Linux perf with these results:

https://github.com/oberstet/scratchbox/blob/master/cruncher/sync-engines-perf/individual-nvmes-sync.md

https://github.com/oberstet/scratchbox/blob/master/cruncher/sync-engines-perf/individual-nvmes-psync.md

https://github.com/oberstet/scratchbox/blob/master/cruncher/sync-engines-perf/md-nvmes-sync.md

https://github.com/oberstet/scratchbox/blob/master/cruncher/sync-engines-perf/md-nvmes-psync.md

---

md-nvmes-sync shows the "issue":

Overhead  Command  Shared Object       Symbol
  73.48%  fio      [kernel.kallsyms]   [k] osq_lock

So while I think it would be good in general if PostgreSQL used 
pread/pwrite instead of lseek/read/write when available, I am afraid 
there might be a bottleneck in MD.
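
For reference, a minimal sketch of the two access patterns being compared:
positioning with lseek and then reading (what the sync engine does), versus a
single positioned read (what psync does). The helper names and the temporary
file are hypothetical and only for illustration; this is not fio or
PostgreSQL code, and it says nothing about where the lock contention in the
perf output comes from.

```python
import os
import tempfile


def read_with_seek(fd, offset, length):
    """sync-style access: two syscalls, and the fd's file offset is moved."""
    os.lseek(fd, offset, os.SEEK_SET)
    return os.read(fd, length)


def read_with_pread(fd, offset, length):
    """psync-style access: one syscall, the fd's file offset is left alone."""
    return os.pread(fd, length, offset)


def demo():
    # Hypothetical scratch file standing in for a block device or data file.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"0123456789abcdef")

        data_seek = read_with_seek(fd, 4, 4)
        pos_after_seek = os.lseek(fd, 0, os.SEEK_CUR)  # offset moved by read

        os.lseek(fd, 0, os.SEEK_SET)  # reset the offset before the pread test
        data_pread = read_with_pread(fd, 4, 4)
        pos_after_pread = os.lseek(fd, 0, os.SEEK_CUR)  # offset unchanged

        return data_seek, data_pread, pos_after_seek, pos_after_pread
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

Both helpers return the same bytes; the difference is the extra lseek call
and the fact that only the first variant changes the descriptor's file offset.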

What do you think?

And if so, where should I raise this regarding MD? I have no clue where the
MD hackers hang out.

Cheers,
/Tobias