Re: big data base

just a few thoughts myself

bastien

From: "Kevin" <aliaghan@xxxxxxxxxxx>
To: php-db@xxxxxxxxxxxxx
Subject: Re:  big data base
Date: Wed, 16 Mar 2005 04:32:49 +0100

"Martin Norland" <martin.norland@xxxxxxxxxx> wrote in message
news:422F6517.1010506@xxxxxxxxxxxxx
> luis medrano wrote:
> > Hi All,
> >
> > I need you help. I need to design a very big data base, around 900GB
> > or more of information but I never had design or manage this magnitude
> > of information before. I really appreciate if you can point me to
> > documentation to how can do this properly  or you can help me with
> > some experiance you have doing this.
>
> I wasn't going to chime in, but after reading some replies I feel I have
> to.
>
> Bottom line - you need to give more details.
>
> The design of the database will not depend greatly on the amount of data
> inside it, unless there are operations/calculations to be done on/with
> it.  Most people talk number of rows / tables / read|write operations /
> joins / etc. when talking database design, instead of size - as with
> proper indexes, the size can affect things very little.

Ok... I don't completely agree that the "design of the database will not
depend greatly on the amount of data", unless of course I am
misinterpreting. If the data is small and the usage rather simple, one
might choose to ignore some of the rules when developing databases. I have
seen databases pass my desk where parts of them aren't even in 3NF (third
normal form). Then again, if the database is complex one might need to go
to BCNF or 4NF. The design also covers how to organise the database
storage-wise: one might choose to separate the database into several
pieces, storing one piece here and another there. That suggests the design
is rather important if you do not wish to see the performance drop like a
brick in water.

One thing to consider here is that it IS acceptable to not have fully normalized tables in a web-based environment. Under high load, avoiding joins can mean significant performance gains. For example, I split a four-table join query into parts and moved some of the logic from the db into the ASP application. With this I was able to turn a query that took 4-5 minutes for 100 records into one that can produce 1000+ records (depending on system configuration) in about 10 seconds.
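
To illustrate the idea of moving join logic into the application, here is a minimal sketch in Python with SQLite. It is not the actual ASP code or schema from that project: the `customers` and `orders` tables and all function names are made up for the example. Whether this beats a real join depends heavily on the engine, the indexes, and the data, so measure before committing to it.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER, total REAL);
        INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
    """)
    return conn


def orders_with_join(conn):
    """Let the database do the join."""
    return conn.execute(
        "SELECT o.id, c.name, o.total FROM orders o "
        "JOIN customers c ON c.id = o.customer_id ORDER BY o.id"
    ).fetchall()


def orders_app_side(conn):
    """Run two simple queries and combine the rows in the application."""
    orders = conn.execute(
        "SELECT id, customer_id, total FROM orders ORDER BY id"
    ).fetchall()
    customer_ids = sorted({cid for _, cid, _ in orders})
    if not customer_ids:
        return []
    # One '?' placeholder per id keeps the lookup parameterized.
    placeholders = ",".join("?" for _ in customer_ids)
    names = dict(conn.execute(
        f"SELECT id, name FROM customers WHERE id IN ({placeholders})",
        customer_ids,
    ).fetchall())
    return [(oid, names[cid], total) for oid, cid, total in orders]
```

Both functions return the same rows. The second trades one join for two single-table queries plus a dictionary lookup, which is the kind of split described above.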


I agree that design is important, but Martin's point is that the amount of data is less relevant than the design. Whether you have 10,000 rows or 10,000,000 rows won't affect the db much if it's well designed, and that includes indexing.

Also, if the data in the database is small you can easily get away with design mistakes such as bad indexes and the like. When the data is extensive, one needs fast ways to get through it. A table of several tens of thousands of records or more can take a long time to process without the right indexes and schema design.
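
To make the index point concrete, here is a minimal sketch in Python with SQLite that asks the engine for its query plan before and after adding an index. The `events` table, the index name, and the helper functions are all assumptions for the example. Other databases expose the same information through their own `EXPLAIN` variants.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, "
        "user_id INTEGER, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO events (user_id, payload) VALUES (?, ?)",
        ((i % 1000, "x") for i in range(50000)),
    )
    sql = "SELECT COUNT(*) FROM events WHERE user_id = ?"

    # Without an index the engine has to scan the whole table.
    before = query_plan(conn, sql, (42,))

    # With an index on the filtered column it can search instead.
    conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
    after = query_plan(conn, sql, (42,))
    return before, after
```

Calling `demo()` returns the two plan strings. The first should report a full scan of `events`, the second a search using `idx_events_user`. The exact wording varies between SQLite versions.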

I am wondering, though, if the size of the db is in any way related to the
complexity and size of the project, whether PHP is the right choice of
language. I would most likely move towards something like Java. Well, just
a thought...

Nope, language is dictated by the environment, the goal of the project, and how comfortable you are with that environment. Java and other compiled environments may give you some speed increases, but they are likely a pig to maintain (if there are frequent changes to be made or new features to be added). For us, basically a .com involved in retail security, our systems (ASP and PHP based) run 33 queries a second, and that number is only going to grow as our customer base increases. Our biggest issue is not the language or the db, but the crappy Windows servers that are fraught with issues.

> For example - if you're "designing" a database to store 20GB raw movie
> files - then you might want to rethink storing the gigs of data inside
> the database, then suddenly you aren't working with a "large" database
> at all.
>
> So, if you can't give more details - you're going to have to search on
> your own. Otherwise, you might want to speak up with some more info so
> you can get better recommendations.
>
> My only suggestion with the information you've given - is basically what
> Bastien said - multiple big iron and big fast hard disks / SAN. Other
> than that, I can only suggest that you design it well
>
> P.S.
> - I wanted to toss in a troll reply about trying SQLLite, since it
> "comes with PHP!" - but that's mean
> P.P.S.
> - I was also tempted to link to a "great tutorial on this" - an
> arbitrary resume writing tutorial, as a mean spiteful joke about being
> in over your head - but that's just too mean. It all depends on the
> system requirements, your budget, and your time.

Just because someone may be in over his head one shouldn't try? Didn't we
all have our first big project at some point?

>
> cheers,
> --
> - Martin Norland, Sys Admin / Database / Web Developer, International
> Outreach x3257
> The opinion(s) contained within this email do not necessarily represent
> those of St. Jude Children's Research Hospital.

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

