Search Postgresql Archives

Re: PostgreSQL Top 10 Wishlist

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



jnasby@xxxxxxxxxxxxx ("Jim C. Nasby") writes:
> On Mon, Jan 16, 2006 at 12:13:15PM -0500, Chris Browne wrote:
>> What you seem to be after, here, would confine your telno formatting
>> to telephone numbers for Canada and the United States, and would break
>> any time people have a need to express telephone numbers outside those
>> two countries.
>> 
>> It would be quite interesting to add an EB164 type, as it could
>> represent phone numbers considerably more compactly than is the case
>> for plain strings.  The 20 digits permissible across 1. and 2. could
>> be encoded in... 68 bits.
>
> And it would be trivial to provide functions to map that into the
> customary format for various countries. In fact, since there's 4 bits
> left over, it might be possible to encode the formatting used for the
> number in the storage itself.
>
> BTW, you sure about 68 bits? That doesn't seem to allow for a full 20
> digit number; or are there restrictions on the max value for one of the
> fields?

Hmm.

 [3]> (let ((ttl 1)) (loop
           for i from 1 to 68
           do (setf ttl (* ttl 2))
           do (format t "n=~D  2^n=~D~%" i ttl)))
n=1  2^n=2
n=2  2^n=4
n=3  2^n=8
n=4  2^n=16
n=5  2^n=32
n=6  2^n=64
n=7  2^n=128
n=8  2^n=256
n=9  2^n=512
... boring bits elided :-) ...
n=60  2^n=1152921504606846976
n=61  2^n=2305843009213693952
n=62  2^n=4611686018427387904
n=63  2^n=9223372036854775808
n=64  2^n=18446744073709551616
n=65  2^n=36893488147419103232
n=66  2^n=73786976294838206464
n=67  2^n=147573952589676412928
n=68  2^n=295147905179352825856
NIL

Actually, we pass 10^20 at 2^67, so I went 1 bit too far.  We could,
in principle, get 20 digits in 67 bits.  Not that this is necessarily
the wisest move.

That's a bit over 8 bytes, which is certainly more space-efficient
than 20 bytes.  It's certainly an open question whether that
efficiency is actually worth anything.

My sense is that it would be better to use something equivalent to BCD
(which is what we do with NUMERIC), packing two digits per byte, and
have two "segments" to the telno, a country code segment, and a "local
number" segment.  Using BCD, this would occupy 10 bytes, which, by
magical happenstance, is the same size as a US/Canada phone number.

The real questions are how to deal with:

a) Parsing incoming data, since formats people use vary so inanely

   Note: I spent most of yesterday dealing with this very issue,
   writing up a total of 31 eye-destroying regular expressions to
   generate a pl/tcl function to parse cases that I had handy...  I
   daresay that even with so many regexps, I was _still_ left with a
   reasonably material number that would not be parsed...

b) What sorts of extra operators would be interesting to add in.

   Extracting country code is an obvious thing.  Applying formatting
   rules based on country code is another.
-- 
let name="cbbrowne" and tld="ntlug.org" in name ^ "@" ^ tld;;
http://www.ntlug.org/~cbbrowne/linux.html
Rules of the Evil Overlord #35. "I  will not grow a goatee. In the old
days they made  you look diabolic. Now they just make  you look like a
disaffected member of Generation X." <http://www.eviloverlord.com/>


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux