bottom post
On 19/04/2014 12:46, R. Pasch wrote:
On
19-4-2014 9:38, Robin wrote:
Well, given that there are known limited attributes, this is the
type of application that really really suits a column oriented
database, such as Sybase IQ (now sold by SAP). Its a neat
product that scales. Great performance with drag'n'drop
analytics.
Unless you can charm IQ out of SAP (it has been known to
happen), you might have to look at some other techniques
So consider some binary data representation
Red - 1 (0000 0001)
Orange - 2 (0000 0010)
Yellow - 4 (0000 0100)
Green - 8 (0000 1000)
Blue - 16 (0001 0000)
Indigo - 32 (0010 0000)
Violet - 64 (0100 0000)
This way, you can encode several colours in 1 value
Red or Green or Indigo = 1 + 8 + 32 = 41 = 0010 1001
Robin
I stopped reading when I heard the word "sold by SAP" ;-) This
project is solely build with open-source and freely available
software.
I've been thinking about using a binary data representation but
didn't come to a solution to this specific problem quite yet. Per
property of a product, only one bit would be 1 and the rest would
be 0. What would a query look like to match all products that have
a bit in the correct position?
Say for instance these are a couple records (and yes, property
values can be null as well)
title, property1, property2, property3
================================
product1, 0000 0001, 0000 0010, NULL
product2, 0000 0100, 0100 0000, 0010 0000
product3, 0010 0000, 0010 0000, 0100 0000
Say that I would like to retrieve the products that either have
property1 as 0010 0000, 1000 000 or 0000 0001. Combined that would
be 0010 1001 and would have to match product1 and product3 as they
both have their individual bit matching one of the bits being
asked for. What would a where statement look like using this type
of binary representation?
If that would be fairly simple to do and fast (most important
factor) then I could do an OR construction on all property columns
and have something count the amount of properties that actually
matched. Is that something you can do with a binary operator of
some sort as well? Count the amount of overlapping bits?
Say for instance I have a binary value of 0110 0101 and another
binary value of 1100 0100, how could I found out how many bits
matched? (in this case the number of matching bits would be 2)
I understand the reluctance to pay SAP-style rates, as a longtime DB
user, I have learned some 'charm' techniques.
However, I poked around a bit for alternatives, as I do like the
column-oriented approach, and found something called - MonetDB -
it apparently has a column-store db kernel, and is open source - I
suggest you have a look, if it does what it says on the label, then
it looks like a find.
There is a discussion of bitmask-trickiness here
also dealing with colours
Robin
|