Dave Chinner wrote:
As most users never have things go wrong, all they think is "CRCs
are unnecessary overhead". It's just like backups - how many people
don't make backups because they cost money right now and there's no
tangible benefit until something goes wrong which almost never
happens?
----
But it's not like backups. Upon discovering bad CRCs,
you can't run a utility that will fix the file system,
because the file system is no longer usable. That means you
have to restore from backup. Thus, for those keeping
backups, there is no benefit, as they'll have to restore
from backup in either case.
Exactly my point. Humans are terrible at risk assessment and
mitigation because most people are unaware of the unconscious
cognitive biases that affect this sort of decision making.
---
My risk is near 0, since my file systems are monitored
by a RAID controller with read patrols made over the data on
a periodic basis. I'll assert that the chance of data (rather
than metadata) randomly going corrupt is much higher, because
there is a lot more data than metadata. On top of that, because
I keep backups, my risk is, at worst, the same without CRCs as
with them.
It actually slows down
allocation on an empty filesystem and trades off that increase in
"empty filesystem" overhead for significantly better aged filesystem
inode allocation performance.
----
Ok, let's see, ages of my file systems:
4 from 2009 (7 years)
1 from 2013 (3 years)
9 from 2014 (2 years)
---
I don't think I have any empty or new filesystems
(FWIW, I store the creation time in the UUID).
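(As a rough illustration only -- not necessarily my exact scheme --
here's one way to bake the creation date into a UUID:)

    # Generate a UUID whose first eight hex digits are a YYYYMMDD date;
    # the date below and the xfs_admin step are just examples.
    import uuid
    from datetime import date

    created = date(2009, 4, 15)          # placeholder creation date
    stamp = created.strftime("%Y%m%d")   # "20090415" -- digits are valid hex
    fs_uuid = uuid.UUID(hex=stamp + uuid.uuid4().hex[len(stamp):])
    print(fs_uuid)                       # e.g. 20090415-...; assign with xfs_admin -U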
i.e. the finobt provides more
deterministic inode allocation overhead, not "faster" allocation.
Let me demonstrate with some numbers on empty filesystem create
rate:
                        create rate    sys CPU time    write rate
                         (files/s)      (seconds)        (MB/s)
crc = 0, finobt = 0:       238943          2629            ~200
crc = 1, finobt = 0:       231582          2711             ~40
crc = 1, finobt = 1:       232563          2766             ~40
*hacked* crc disable:      231435          2789             ~40
We can see that the system CPU time increased by 3.1% with the
"addition of CRCs". The CPU usage increases by a further 2% with
the addition of the free inode btree,
---
On an empty file system, or on older ones that are >50%
used? It's *nice* to be able to run benchmarks, but not allowing
CRCs to be disabled removes that possibility -- and that's
sorta the point. In order to prove your point, you created a
benchmark with CRCs disabled. But the thing about benchmarks
is making them so that others can reproduce your results. That's
the problem. If I could run the same benchmarks and got
similar results, I'd give up on finobt as not being worth it.
But I'm not able to run such tests on my workload
and/or filesystems. The common advice about performance numbers
and how they are affected by options is to run benchmarks
on your own systems, with your own workload, and see if the option
helps. That's what I want to do. Why deny that?
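For illustration, here's a rough sketch of the kind of create-rate
test I'd want to run against my own, aged filesystems (Python; the
target directory and file count are placeholders, pick whatever
suits your setup):

    #!/usr/bin/env python3
    # Create many small files and report files/s plus sys CPU,
    # roughly mirroring the columns in the table above.
    import os, time

    TARGET = "/mnt/scratch/create-bench"   # placeholder target dir
    NFILES = 100_000                       # placeholder file count
    PAYLOAD = b"x" * 4096                  # small file payload

    os.makedirs(TARGET, exist_ok=True)
    t0, c0 = time.monotonic(), os.times()
    for i in range(NFILES):
        with open(os.path.join(TARGET, "f%08d" % i), "wb") as f:
            f.write(PAYLOAD)
    t1, c1 = time.monotonic(), os.times()

    elapsed = t1 - t0
    print("created %d files in %.1fs (%.0f files/s), sys CPU %.1fs"
          % (NFILES, elapsed, NFILES / elapsed, c1.system - c0.system))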
which should give you an idea
of how much CPU time even a small btree consumes.
---
In a non-real-world case, on empty file systems. How
does it work in the real world, on file systems like mine?
I know the MB/s isn't close, with my max sustained I/O rates
being about 1 GB/s (all using direct I/O -- the rate drops
significantly if I use kernel buffering; see the sketch
below). Even not pre-allocating and defragmenting the test
file will noticeably affect I/O rates.
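(As a rough sketch of what I mean by direct I/O vs. kernel
buffering -- the path and sizes are placeholders; O_DIRECT needs
block-aligned buffers, lengths and offsets:)

    #!/usr/bin/env python3
    import mmap, os

    USE_DIRECT = True                   # False = go through the page cache
    PATH = "/mnt/scratch/directio.bin"  # placeholder test file
    BLOCK = 4096                        # multiple of the logical block size
    COUNT = 2560                        # 2560 * 4 KiB = 10 MiB total

    # An anonymous mmap is page-aligned, which satisfies O_DIRECT's
    # buffer alignment requirement.
    buf = mmap.mmap(-1, BLOCK)
    buf[:] = b"x" * BLOCK

    flags = os.O_WRONLY | os.O_CREAT
    if USE_DIRECT:
        flags |= os.O_DIRECT            # Linux-only flag in the os module

    fd = os.open(PATH, flags, 0o644)
    try:
        for _ in range(COUNT):
            os.write(fd, buf)           # each write is one aligned block
    finally:
        os.close(fd)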
Showing the result on an empty file
system is where finobt would have the *least* effect, since
it is when the kernel has to search for free space that things
slow down; but if the free inodes are tracked in a
dedicated b-tree, then the kernel doesn't have to search --
which would make a much bigger difference than on an empty
file system.
The allocated
inode btree is huge in comparison to the finobt in this workload,
which is why even a small change in header size (when CRCs are
enabled) makes a large difference in CPU usage.
To verify that CRCs have no significant impact on inode allocation,
let's look at the actual CPU being used by the CRC calculations in
this workload:
0.28% [kernel] [k] crc32c_pcl_intel_update
---
And how much is spent searching for free space?
On multi-gigabyte files it can reduce I/O rates by 30% or more.
That is only a small proportion of the entire increase in CPU
consumption that comes from "turning on CRCs". Indeed, the "*hacked* CRC
disable" results are from skipping CRC calculations in the code
altogether and returning "verify ok" without calculating them. The
create rate is identical to the crc=1,finobt=1 numbers and the CPU
usage is /slightly higher/ than when CRCs are enabled.
IOWs, for most workloads CRCs have no impact on filesystem
performance.
---
Too bad no one can test the effect on their
own workloads, though if not doing CRCs takes more CPU, then
it sounds like an algorithm problem: CRC calculations don't
take "negative time", and a benchmark showing that they do
indicates something else is causing the slowdown.
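Running the arithmetic on the sys-CPU column of the table above
makes that concrete (the figures are from the table; only the
labels are mine):

    # Percentage change in sys CPU time between the quoted runs.
    crc0        = 2629   # crc=0, finobt=0
    crc1        = 2711   # crc=1, finobt=0
    crc1_finobt = 2766   # crc=1, finobt=1
    hacked      = 2789   # crc=1, finobt=1, CRC calculation skipped

    def pct(new, old):
        return (new / old - 1) * 100

    print("CRCs on:           +%.1f%%" % pct(crc1, crc0))           # ~3.1%
    print("finobt on top:     +%.1f%%" % pct(crc1_finobt, crc1))    # ~2.0%
    print("CRC calc skipped:  +%.1f%%" % pct(hacked, crc1_finobt))  # ~0.8%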
Cheers,
Dave.
----
Sigh... and Cheers to you too! ;-)
Linda