Junio C Hamano <gitster@xxxxxxxxx> writes: > I am not quite sure if this is way too complex (e.g. what does it > mean that we do not care much about loose-object safety while we do > care about commit-graph files?) and at the same time it is too > limited (e.g. if it makes sense to say a class of items deserve more > protection than another class of items, don't we want to be able to > say "class X is ultra-precious so use method A on them, while class > Y is mildly precious and use method B on them, everything else are > not that important and doing the default thing is just fine"). > > If we wanted to allow the "matrix" kind of flexibility,... To continue with the thinking aloud... Sometimes configuration flexibility is truly needed, but often it is just a sign of designer being lazy and not thinking it through as an end-user facing problem. In other words, "I am giving enough knobs to you, so it is up to you to express your policy in whatever way you want with the knobs provided" is a very irresponsible thing to tell end-users. And this one smells like the case of a lazy design. It may be that it makes sense in some workflows to protect commit-graph files less than object files and pack.idx files can be corrupted as long as pack.pack files are adequately protected because the former can be recomputed from the latter, but in no workflows, the reverse would be true. Yet the design gives such needless flexibility, which makes it hard for lay end-users to choose the best combination and allows them to protect .idx files more than .pack files by mistake, for example. I am wondering if the classification itself introduced by this step actually can form a natural and linear progression of safe-ness. By default, we'd want _all_ classes of things to be equally safe, but at one level down, there is "protect things that are not recomputable, but recomputable things can be left to the system" level, and there would be even riskier "protect packs as it would hurt a _lot_ to lose them, but losing loose ones will typically lose only the most recent work, and they are less valuable" level. If we, as the Git experts, spend extra brain cycles to come up with an easy to understand spectrum of performance vs durability trade-off, end-users won't have to learn the full flexibility and easily take the advice from experts. They just need to say what level of durability they want (or how much durability they can risk in exchange for an additional throughput), and leave the rest to us. On the core.fsyncMethod side, the same suggestion applies. Once we know the desired level of performance vs durability trade-off from the user, we, as the impolementors, should know the best method, for each class of items, to achieve that durability on each platform when writing it to the storage, without exposing the low level details of the implementation that only the Git folks need to be aware of. So, from the end-user UI perspective, I'd very much prefer if we can just come up with a single scalar variable, (say "fsync.durability" that ranges from "conservative" to "performance") that lets our users express the level of durability desired. The combination of core.fsyncMethod and core.fsync are one variable too many, and the latter being a variable that takes a list of things as its value makes it even worse to sell to the end users.