Re: Waste of storage space?

On Tue, Oct 28, 2008 at 4:24 PM, Frank Arensmeier <frank@xxxxxxxxxxxx> wrote:

> Hi all.
>
> In short, I am working on a system that allows me to keep track of changes
> to a large number of short texts (a couple of thousand text snippets, two or
> three sentences per text). All text is stored in a database. As soon as a
> user changes some text (insert, delete, update), this action is recorded.
> Look at an article on e.g. Wikipedia and click "History". This is more or
> less what I am trying to accomplish.
>
> Right now, my "history" class, which takes care of all changes, is working
> pretty much as I want. The thing is that both the original text and the
> altered text are stored in the database every time the text is changed. My
> concern is that this will eventually become a serious problem in terms of
> storage space and performance. So, I am looking for a more efficient way
> to store all changes.
>
> Ideas I have come up with so far are:
>
> 1) Store the "delta" (= the actual change) of a text change. This could be
> done by utilizing the PEAR package TextDiff. My idea was to compare the old
> text with the new one with the help of the TextDiff class. I would then grab
> the array containing the changes from TextDiff, serialize it and store this
> data in the db. The problem is that this is anything but efficient when
> it comes to short texts (the serialized array holding the changes was
> actually larger than the two texts combined).
>
> 2) Do some kind of compression on the text to be stored. However, it seems
> that the built-in compression functions in PHP5 are more efficient when it
> comes to large texts.
>
> Any other ideas?
>
> Thank you.
> //frank
>
> ps. I notice that MediaWiki also stores complete articles in the db (every
> time an article is updated, the whole article is stored in the database). ds.
>
Save just the new version each time, in a table like:

record_id   // PK
relates_to  // FK
item_text
author_id
timestamp


Much easier to work with.
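
As a rough sketch of that layout (Python with SQLite, used here only so the example runs standalone; the same SQL idea carries over to PHP and MySQL). The table name text_versions, the index, and the helper functions are placeholders of my own, and I am assuming relates_to holds the id of the snippet being edited:

```python
import sqlite3

# Hypothetical table name; the columns follow the layout suggested above.
SCHEMA = """
CREATE TABLE text_versions (
    record_id  INTEGER PRIMARY KEY AUTOINCREMENT,  -- PK, one row per saved version
    relates_to INTEGER NOT NULL,                   -- FK (assumed: id of the snippet)
    item_text  TEXT    NOT NULL,                   -- full text of this version
    author_id  INTEGER NOT NULL,
    timestamp  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_versions_snippet ON text_versions (relates_to, record_id);
"""


def save_version(conn, snippet_id, text, author_id):
    """Insert the new text as a new row; older rows are never modified."""
    cur = conn.execute(
        "INSERT INTO text_versions (relates_to, item_text, author_id) "
        "VALUES (?, ?, ?)",
        (snippet_id, text, author_id),
    )
    conn.commit()
    return cur.lastrowid


def history(conn, snippet_id):
    """Return every stored version of one snippet, oldest first."""
    rows = conn.execute(
        "SELECT item_text FROM text_versions "
        "WHERE relates_to = ? ORDER BY record_id",
        (snippet_id,),
    )
    return [row[0] for row in rows]


def current(conn, snippet_id):
    """Return the latest version of a snippet, or None if there is none."""
    row = conn.execute(
        "SELECT item_text FROM text_versions "
        "WHERE relates_to = ? ORDER BY record_id DESC LIMIT 1",
        (snippet_id,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
save_version(conn, 1, "First draft.", 42)
save_version(conn, 1, "Second draft.", 42)
```

With this, the "History" view is a single SELECT per snippet, and each change costs one row holding only the new text rather than both the old and the new.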

-- 

Bastien

Cat, the other other white meat
