Re: Re: using UID in DB

Tommy Pham <tommyhp2@xxxxxxxxx> · Wed, 31 Mar 2010 11:24:46 -0700

On Wed, Mar 31, 2010 at 11:01 AM, Nathan Rixham <nrixham@xxxxxxxxx> wrote:
> Tommy Pham wrote:
>> On Wed, Mar 31, 2010 at 4:06 AM, Nathan Rixham <nrixham@xxxxxxxxx> wrote:
>>> Tommy Pham wrote:
>>>> As for spatial data types, I've never find much use for non scientific
>>>> related.  (example) If using point as a PK, if MySQL stores it the
>>>> same way as PostgreSQL which is 16 bytes, how is that any different -
>>>> performance wise - than using UUID and storing it as binary(16) for
>>>> MySQL or uniqueidentifier (16 bytes) for PostgreSQL?
>>>>
>>> it's all about the indexing (R-Tree)
>>>
>>> http://en.wikipedia.org/wiki/R-tree
>>>
>>
>> I can see where the performance would be between B-Tree vs R-Tree for
>> the same field size but I've yet to see real life application of it.
>> Case in point, if using point for GPS data coordinates, then wouldn't
>> it still be a lot better to use Lon (float), Lat (float) which is 2
>> fields of 4 bytes each vs 1 field of 16 bytes?  The index would still
>> be faster (8 bytes less) in B-Tree right?  Not to mention smaller row
>> & DB size.
>
> wish I still had an application to point you at; and I can't assert in
> any other way just how much faster it is; especially over large datasets
> - regardless of the amount of rows the style of index doesn't slow
> (spatial) queries down.

If I see the PoC w/o having to see the actual setup, then I would
probably understand your point better.

>
> if index byte size is of importance then negate spatial indexes (which
> are rather a lot bigger).
>

Since the spatial indexes are bigger, wouldn't that work against it
due disk access having to read more?

> regardless though, it was just an idea I was throwing at you, not
> suggesting it's the best approach, but worth consideration depending on
> your priorities.
>

Thanks for your idea.  I'd appreciate it very much.  I'm told that I
over analyze things at times ... lol ... so please don't take it the
wrong way :)

>  As for calculating the distance between 2 'points', it
>> would be simpler for querying purposes to use 'point'.  ATM, I don't
>> have need to do any such calculations.  If I do use 'point' for PK,
>> I'd run into problems with scaling and migration later.
>
> the points don't matter tbh; or should i say specifying an x and a y
> doesn't matter, because you can simply get by a single id, or get a
> range of ids very quickly; the other value comes in to play when you
> want temporal queries.
>
> again though, i reiterate only useful if you want speed and don't care
> about index bytesize - and certainly not forcing it down your neck.
>
> re scaling.. unsure theoretically you have 15 digit precision which is a
> mysql DOUBLE/REAL, and that's for each value in the pair, so an almost
> incomprehensible number of rows are possible before a collision.
>
> all in all: there are many good reasons why I've only used this on
> occassion (specifically when dealing with 1 million+ rows in an RDBMS
> table); and many more good reasons why i stick to non-rdbms databases
> and use http uri's + eav model data only nowadays.
>
> The latter i would advocate, the former not so unless you are stuck w/
> mysql postgres - in which case you can't get faster. [1]
>
> [1] get some numbers to prove when i have time.
>
> Nathan
>

Using HTTP URI's + EAV model is something I'll definitely look into.

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php