On Tue, Feb 24, 2009 at 3:24 AM, leledumbo <leledumbo_cool@xxxxxxxxxxx> wrote: > > Just tried serializing array of 256 booleans and printing the length, it > really shocked me: 2458. This project will be used by about 500 students, so > in the worst case (all students enroll all courses) it will eat 500 * 2458 > (assuming one character eats one byte) = 1229000 Bytes ~= 1.2 MB. Not a big > deal, eh? > That is a very un-normalized way of storing associations between entities in a database. You describe a "worst case scenario" where every student is registered for every course. If you serialize an array, every bit takes the same amount of space so it doesn't matter how many courses each student has. If you convert this to a bitmask as you are describing, your worst case would happen very frequently - at least as often as students are assigned to the course whose place is the most significant bit value. What's more, if you later need to add a 257th course, you could have lots of work to do to adjust your code to deal with the extra value. Generally relationships like the one you describe are stored in three separate and related tables: Students, Courses, and Enrollment. The latter is a n:m association between the first two. The advantage this approach has with regard to storage is that it is a sparse matrix. Most students will only enroll in a handful of courses. With this approach, you only store a row in the Enrollment table if a student is enrolled in a course. Otherwise, you don't need to store anything. Granted, if you use 4-byte integers for the primary keys in both the Students and Courses tables, your data storage would be about 2048 bytes for the data "worst case scenario" that you described, and another 2048 bytes for a unique index that would insure that each student could be enrolled in any course no more than one time. However, since I highly doubt most students are going to be enrolled in every course, it is not likely that you'll approach the 1.2MB storage space that you're worried about. Each row would take 16 bytes to store both the data and the index; if each student is enrolled in an average of 4 courses, it should only take an average of 64 bytes per student. For 500 students, that works out to around 2000 rows at about 32KB overall. For comparison, I have a fairly small table in a system we recently put into production that maintains similar associations. It stores more information in each row than just a two-column association (about 17 columns) and currently has over 7400 rows. The data is 0.734MB and the total size of three indexes is 0.820MB. And no, 1.2MB is not that big of a deal. A single floppy high-density drive could hold that much data 18 years ago! Andrew -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php