On 29/01/2020 17:14, Luboš Luňák wrote:
On Wednesday 29 of January 2020, Stephan Bergmann wrote:
On 29/01/2020 15:20, Luboš Luňák wrote:
Which is the assumption. Using o3tl::make_signed would not require such
an assumption, or it would be even less likely false.
But a precondition-free o3tl::make_signed would need to map an unsigned
type to a wider signed type, which need not exist.
That wider type would need to be larger than 63 bits, which is a value so
large that it's extremely unlikely we'd need it in practice any time soon. I
realize I'm now in the territory of "640K ought to be enough", but seriously,
in which realistic scenario is a precise representation of
9223372036854775807 insufficient but 18446744073709551615 will still do?
This part is about the meaning of the highest bit, and what I'm saying here
is that in signed/unsigned comparisons you're still more likely in practice
to encounter the highest bit set in a signed type than in unsigned. People
are still more likely to mess up the >=0 assumption than compare with an
unsigned value that has the highest bit set.
The "if large enough" is the hard part.
Only theoretically.
Why resort to code that likely works, when we can write code that is
guaranteed to work?
Exactly my point. It's just that you seem to find it guaranteed that people
won't mess up range checks and only likely there won't be titanically huge
files/allocations/containers, and I see it the other way around. So far I've
definitely seen more often somebody get >=0 wrong than I've seen 8 exabytes
of anything.
My point is that, for e1 of signed type S1 (where U1 is the unsigned
counterpart) and e2 of unsigned type U2 (where S2 is the signed
counterpart),
    e1 < 0 || U1(e1) < e2 // (*)
is guaranteed to work for all types S1 and U2 and all values of e1 and
e2, while
    e1 < S2(e2)
is not. My point has nothing to do with people writing broken code, or
how to prevent them from doing so.
It is just that for the task "compare a signed e1 against an unsigned
e2", (*) is the tool I at least reach for (naturally; without much of a
second thought, actually). And it has in fact been used all over the LO
code base, and the newly introduced o3tl::make_unsigned merely helps
write it in a better way (by not having to spell out U1). This is
orthogonal to the observation that signed APIs may be better than
unsigned ones.
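
As an illustration of the two comparisons discussed above, here is a minimal C++14 sketch. The names to_unsigned, less_via_unsigned and less_via_signed are hypothetical; to_unsigned only stands in for o3tl::make_unsigned and is not the LibreOffice implementation. Before C++20 the result of the out-of-range cast in less_via_signed is implementation-defined; mainstream compilers wrap around.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical stand-in for o3tl::make_unsigned (not the LibreOffice code).
// Precondition: value >= 0.
template <typename S>
constexpr std::make_unsigned_t<S> to_unsigned(S value)
{
    return static_cast<std::make_unsigned_t<S>>(value);
}

// (*): e1 < 0 || U1(e1) < e2
template <typename S1, typename U2>
constexpr bool less_via_unsigned(S1 e1, U2 e2)
{
    static_assert(std::is_signed<S1>::value, "e1 must have a signed type");
    static_assert(std::is_unsigned<U2>::value, "e2 must have an unsigned type");
    // The cast is only reached when e1 >= 0, so it never changes the value.
    return e1 < 0 || to_unsigned(e1) < e2;
}

// e1 < S2(e2): values of e2 above the maximum of S2 do not survive the cast.
template <typename S1, typename U2>
constexpr bool less_via_signed(S1 e1, U2 e2)
{
    return e1 < static_cast<std::make_signed_t<U2>>(e2);
}
```

With e1 = 0 and e2 set to the maximum of unsigned long long, less_via_unsigned reports that e1 is smaller, while less_via_signed typically sees e2 as -1 and reports the opposite.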
For reference, both Java's and C#'s List classes use int for size and
index types, and use long for file size and position types. Apparently it
does the job.
Sure. But what we are faced with here are C/C++ APIs that use unsigned
types, and we have to interoperate with those.
Sure. 'o3tl::make_signed(l.size())' wherever needed. Done. It'll generally
need to go next to the place where we'd need to write the make_unsigned()
variant anyway.
My point was just that the highest bit in size_t is practically irrelevant.
C++20 will have ssize for containers (see
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1227r2.html>
"P1227: Signed ssize() functions, unsigned size() functions (Revision
2)"). Using it would probably help remove a large chunk of
signed/unsigned mixture in existing LO code.
(I'm fine with using it, as, sure, for containers it is clear that
restricting maximum size to no larger than size_t/2 is feasible. What I
dislike is a helper function mapping from an arbitrary unsigned type to
its signed counterpart pretending to be a total function.)
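
For readers without a C++20 toolchain, a container-only helper in the spirit of the P1227 ssize can be sketched in C++14 as follows. The names signed_size and index_in_range are hypothetical, and the sketch relies on the assumption stated above: a container's size fits into the signed result type.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical helper modeled on the P1227 ssize(): returns c.size()
// converted to a signed type at least as wide as std::ptrdiff_t.
// Assumes the container's size fits into that signed type.
template <typename C>
constexpr auto signed_size(const C& c)
    -> std::common_type_t<std::ptrdiff_t, std::make_signed_t<decltype(c.size())>>
{
    using R = std::common_type_t<std::ptrdiff_t, std::make_signed_t<decltype(c.size())>>;
    return static_cast<R>(c.size());
}

// Example use: a signed index is checked against the size without any
// signed/unsigned comparison.
template <typename C>
constexpr bool index_in_range(long long i, const C& c)
{
    return i >= 0 && i < signed_size(c);
}
```

The helper is deliberately restricted to things that have a size() member, which is the case where halving the usable range is uncontroversial.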
You mean, an o3tl::make_signed that maps from an unsigned type to the
signed type of the same rank? What would its precondition be, require
that the given value is sufficiently small? Typically not being able to
guarantee that statically, code would then need to first check for '<=
std::numeric_limits<T>::max()' before being able to call o3tl::make_signed?
It's the same whether it's make_signed() or make_unsigned().
Also, it seems to me that you make the mistake of assuming that using an
unsigned type actually guarantees you anything (the "semantically makes
sense" mistake I mentioned before). You can as easily "underflow" unsigned as
you can overflow signed.
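
A small sketch of the "underflow" mentioned above; previous_index is a hypothetical helper used only for illustration. Unsigned arithmetic wraps modulo 2^N, so no error is raised and the result is simply a huge value.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: the index just before position i.
// For i == 0 the subtraction wraps around to the largest std::size_t value
// instead of producing -1.
constexpr std::size_t previous_index(std::size_t i)
{
    return i - 1;
}
```

A caller that expected to detect "before the first element" by testing for a negative result will never see one.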
I still fail to see how converting from unsigned to signed is generally
possible, let alone safe.
I can write the same about the other direction. The difference is that
signed->unsigned cuts off values that are realistic and unsigned->signed cuts
off values that are pretty much unrealistic.
What exactly is the base for your claim that signed->unsigned is better than
unsigned->signed?
If you only use unsigned->signed, the equivalent of (*) would be
something like
    e2 > std::numeric_limits<S1>::max() || e1 < S1(e2)
which is why I think signed->unsigned is a better building block.
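
For comparison, the unsigned->signed form can be sketched in C++14 like this. The name less_via_checked_signed is hypothetical. The range check itself compares an unsigned value against a signed maximum, so this sketch converts the (non-negative) maximum to unsigned first to keep that comparison free of mixed signedness.

```cpp
#include <limits>
#include <type_traits>

// e2 > max(S1) || e1 < S1(e2), for signed e1 and unsigned e2.
template <typename S1, typename U2>
constexpr bool less_via_checked_signed(S1 e1, U2 e2)
{
    static_assert(std::is_signed<S1>::value, "e1 must have a signed type");
    static_assert(std::is_unsigned<U2>::value, "e2 must have an unsigned type");
    // If e2 exceeds every value of S1, e1 is certainly smaller.
    // Otherwise e2 fits into S1 and the cast preserves its value.
    return e2 > static_cast<std::make_unsigned_t<S1>>(std::numeric_limits<S1>::max())
        || e1 < static_cast<S1>(e2);
}
```

The result is correct for all inputs, but it needs the explicit limit check that the signed->unsigned form gets from the simpler e1 < 0 test.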