On Sat, Dec 26, 2020 at 02:16:03PM -0800, Tim Bray wrote:

> See https://twitter.com/dave_universetf/status/1342685822286360576 to which
> I heartily concur. IPV6 addresses are neither easy for humans to read, nor
> easy for software to parse.

The specification in

    https://tools.ietf.org/html/rfc4291#section-2.2

is not nearly as diverse as the exotic variations in the tweet barrage.
There are just 3 or 4 valid input forms (depending on whether you count
the compressed and uncompressed variants of case 3 as two separate forms).

> *If* someone has a better idea, there’s no good reason not to standardize
> it, the old approach would still work.

Does anyone have a better idea?  It is far from clear that a change at
this point would do more good than harm.

In practice, the IPv4 forms only occur as:

    ::1.2.3.4       -- https://tools.ietf.org/html/rfc4291#section-2.5.5.1
    ::ffff:1.2.3.4  -- https://tools.ietf.org/html/rfc4291#section-2.5.5.2

Otherwise, one only sees compressed and uncompressed hex forms, with at
most 4 hex digits per group.  So if one were willing to parse just the
expected forms, one could parse only:

    - 8 colon-separated groups
    - n1 colon-separated groups + "::" + n2 colon-separated groups,
      n1 >= 0, n2 >= 0, n1 + n2 <= 7
    - "::" + dotted quad IPv4 address
    - "::ffff:" + dotted quad IPv4 address

If you see anything else, someone is straying off the beaten path.  But
parsing a more general 96-bit hex prefix before the final IPv4 dotted
quads is both reasonable and not onerous.  It is the same parser as the
first two cases above, with 6 groups instead of 8.

-- 
    Viktor.
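
For illustration only, here is a minimal Python sketch of a parser that
accepts just the four restricted forms listed above and rejects everything
else.  The names parse_ipv6, _parse_groups and _parse_dotted_quad are
hypothetical, and the choice to return eight 16-bit integers is an
assumption made for the example; it does not handle the more general
96-bit hex prefix case, zone identifiers, or prefix lengths.

```python
import re

# At most 4 hex digits per group; decimal octets are range-checked below.
_HEX_GROUP = re.compile(r"[0-9A-Fa-f]{1,4}\Z")
_DEC_OCTET = re.compile(r"[0-9]{1,3}\Z")


def _parse_groups(text):
    """Parse zero or more colon-separated hex groups into a list of ints."""
    if text == "":
        return []
    groups = []
    for part in text.split(":"):
        if not _HEX_GROUP.match(part):
            raise ValueError(f"bad hex group: {part!r}")
        groups.append(int(part, 16))
    return groups


def _parse_dotted_quad(text):
    """Parse a dotted-quad IPv4 address into two 16-bit groups."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"bad dotted quad: {text!r}")
    octets = []
    for part in parts:
        if not _DEC_OCTET.match(part) or int(part) > 255:
            raise ValueError(f"bad IPv4 octet: {part!r}")
        octets.append(int(part))
    return [(octets[0] << 8) | octets[1], (octets[2] << 8) | octets[3]]


def parse_ipv6(text):
    """Return the address as a list of eight 16-bit integers."""
    # "::ffff:" or "::" followed by a dotted quad; try the longer prefix first.
    if "." in text:
        for prefix, sixth_group in (("::ffff:", 0xFFFF), ("::", 0)):
            if text.lower().startswith(prefix):
                quad = _parse_dotted_quad(text[len(prefix):])
                return [0, 0, 0, 0, 0, sixth_group] + quad
        raise ValueError(f"unsupported IPv4-style form: {text!r}")

    # n1 groups + "::" + n2 groups, with n1 + n2 <= 7.
    if "::" in text:
        head, _, tail = text.partition("::")
        left, right = _parse_groups(head), _parse_groups(tail)
        if len(left) + len(right) > 7:
            raise ValueError(f"too many groups around '::': {text!r}")
        return left + [0] * (8 - len(left) - len(right)) + right

    # Exactly 8 colon-separated groups.
    groups = _parse_groups(text)
    if len(groups) != 8:
        raise ValueError(f"expected 8 groups: {text!r}")
    return groups
```

A second "::" in the input leaves an empty group after the split, so it
is rejected by the hex group check rather than by a separate test.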