On 29 Oct 2008, at 01:08, Bastien Koert wrote:
On Tue, Oct 28, 2008 at 4:24 PM, Frank Arensmeier
<frank@xxxxxxxxxxxx> wrote:
Hi all.
In short, I am working on a system that allows me to keep track of
changes to a large number of short texts (a couple of thousand text
snippets, two or three sentences each). All text is stored in a
database. As soon as a user changes some text (insert, delete,
update), this action is recorded. Look at an article on e.g.
Wikipedia and click "History". This is more or less what I am trying
to accomplish.
Right now, my "history" class, which takes care of all changes, is
working pretty much as I want. The thing is that both the original
text and the altered text are stored in the database every time the
text is changed. My concern is that this will eventually become a
serious problem in terms of storage and performance. So, I am looking
for a more efficient way to store all changes.
Ideas I have come up with so far are:
1) Store the "delta" (= the actual change) of a text change. This
could be done with the PEAR package Text_Diff. My idea was to
compare the old text with the new one using the Text_Diff class. I
would then grab the array containing the changes from Text_Diff,
serialize it and store this data in the db. The problem is that
this is anything but efficient when it comes to short texts
(the serialized array holding the changes was actually larger than
the two texts combined).
2) Do some kind of compression on the text to be stored. However, it
seems that the built-in compression functions in PHP5 only pay off
on large texts.
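Both observations can be checked with a few lines of PHP. The sketch below is only an illustration: the two sample sentences are made up, the $delta array is a hypothetical structure (not the format Text_Diff actually produces), and gzcompress() assumes the zlib extension is loaded.

```php
<?php
// Two short revisions of a snippet (made-up sample data).
$old = 'The pump must be serviced every six months.';
$new = 'The pump must be serviced every twelve months.';

// Hypothetical delta structure, for illustration only.
// This is NOT the array layout that Text_Diff returns.
$delta = array(
    array('op' => 'copy',   'text' => 'The pump must be serviced every '),
    array('op' => 'delete', 'text' => 'six'),
    array('op' => 'add',    'text' => 'twelve'),
    array('op' => 'copy',   'text' => ' months.'),
);

// Byte counts for each storage strategy.
$sizes = array(
    'new text, raw'        => strlen($new),
    'old + new, raw'       => strlen($old) + strlen($new),
    'serialized delta'     => strlen(serialize($delta)),
    'new text, gzcompress' => strlen(gzcompress($new, 9)),
);

foreach ($sizes as $label => $bytes) {
    printf("%-22s %4d bytes\n", $label, $bytes);
}
```

On snippets this short, the array keys and type markers written by serialize() and the fixed header and checksum added by zlib tend to outweigh whatever a delta or compression saves.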
Any other ideas?
thank you.
//frank
PS. I notice that MediaWiki also stores complete articles in the db
(every time an article is updated, the whole article is stored in the
database).
Save just the new version each time, in a table like:
record_id //PK
relates_to //FK
item_text
author_id
timestamp
Much easier to work with.
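A minimal sketch of that layout, assuming PDO with an in-memory SQLite database standing in for the real one. The table name text_history and the helpers save_version() and get_history() are hypothetical, and relates_to is assumed to point at the id of the snippet a version belongs to.

```php
<?php
// In-memory SQLite is an assumption; any PDO driver would do.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Column names follow the list above; the types are guesses.
$db->exec('CREATE TABLE text_history (
    record_id  INTEGER PRIMARY KEY AUTOINCREMENT,
    relates_to INTEGER NOT NULL,
    item_text  TEXT    NOT NULL,
    author_id  INTEGER NOT NULL,
    timestamp  TEXT    NOT NULL
)');

// Every change inserts one row holding only the new version.
function save_version(PDO $db, $snippetId, $text, $authorId, $when)
{
    $stmt = $db->prepare(
        'INSERT INTO text_history (relates_to, item_text, author_id, timestamp)
         VALUES (?, ?, ?, ?)'
    );
    $stmt->execute(array($snippetId, $text, $authorId, $when));
}

// All versions of one snippet, newest first.
function get_history(PDO $db, $snippetId)
{
    $stmt = $db->prepare(
        'SELECT record_id, item_text, author_id, timestamp
           FROM text_history
          WHERE relates_to = ?
          ORDER BY record_id DESC'
    );
    $stmt->execute(array($snippetId));
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

save_version($db, 1, 'First draft.', 7, '2008-10-28 16:24:00');
save_version($db, 1, 'Second draft.', 7, '2008-10-29 01:08:00');

$history = get_history($db, 1);
echo $history[0]['item_text'], "\n"; // current version
echo $history[1]['item_text'], "\n"; // the version it replaced
```

The row with the highest record_id for a given relates_to is the current text, and each earlier row is the "before" state of the change that followed it, so the original text never has to be stored twice.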
Yes, maybe it's just as simple as that. Thanks.
//frank
--
Bastien
Cat, the other other white meat