On Wed, Mar 31, 2010 at 11:01 AM, Nathan Rixham <nrixham@xxxxxxxxx> wrote: > Tommy Pham wrote: >> On Wed, Mar 31, 2010 at 4:06 AM, Nathan Rixham <nrixham@xxxxxxxxx> wrote: >>> Tommy Pham wrote: >>>> As for spatial data types, I've never find much use for non scientific >>>> related. (example) If using point as a PK, if MySQL stores it the >>>> same way as PostgreSQL which is 16 bytes, how is that any different - >>>> performance wise - than using UUID and storing it as binary(16) for >>>> MySQL or uniqueidentifier (16 bytes) for PostgreSQL? >>>> >>> it's all about the indexing (R-Tree) >>> >>> http://en.wikipedia.org/wiki/R-tree >>> >> >> I can see where the performance would be between B-Tree vs R-Tree for >> the same field size but I've yet to see real life application of it. >> Case in point, if using point for GPS data coordinates, then wouldn't >> it still be a lot better to use Lon (float), Lat (float) which is 2 >> fields of 4 bytes each vs 1 field of 16 bytes? The index would still >> be faster (8 bytes less) in B-Tree right? Not to mention smaller row >> & DB size. > > wish I still had an application to point you at; and I can't assert in > any other way just how much faster it is; especially over large datasets > - regardless of the amount of rows the style of index doesn't slow > (spatial) queries down. If I see the PoC w/o having to see the actual setup, then I would probably understand your point better. > > if index byte size is of importance then negate spatial indexes (which > are rather a lot bigger). > Since the spatial indexes are bigger, wouldn't that work against it due disk access having to read more? > regardless though, it was just an idea I was throwing at you, not > suggesting it's the best approach, but worth consideration depending on > your priorities. > Thanks for your idea. I'd appreciate it very much. I'm told that I over analyze things at times ... lol ... so please don't take it the wrong way :) > As for calculating the distance between 2 'points', it >> would be simpler for querying purposes to use 'point'. ATM, I don't >> have need to do any such calculations. If I do use 'point' for PK, >> I'd run into problems with scaling and migration later. > > the points don't matter tbh; or should i say specifying an x and a y > doesn't matter, because you can simply get by a single id, or get a > range of ids very quickly; the other value comes in to play when you > want temporal queries. > > again though, i reiterate only useful if you want speed and don't care > about index bytesize - and certainly not forcing it down your neck. > > re scaling.. unsure theoretically you have 15 digit precision which is a > mysql DOUBLE/REAL, and that's for each value in the pair, so an almost > incomprehensible number of rows are possible before a collision. > > all in all: there are many good reasons why I've only used this on > occassion (specifically when dealing with 1 million+ rows in an RDBMS > table); and many more good reasons why i stick to non-rdbms databases > and use http uri's + eav model data only nowadays. > > The latter i would advocate, the former not so unless you are stuck w/ > mysql postgres - in which case you can't get faster. [1] > > [1] get some numbers to prove when i have time. > > Nathan > Using HTTP URI's + EAV model is something I'll definitely look into. -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php