I would also like to note that each
node can store multiple elements. Current implementation creates a
node for each byte in the key. In my implementation I only create
a node if there is a prefix coincidence between 2 or more keys.
This reduces the number of nodes and the number of indirections.
Each node stores the key from the beginning to allow very fast
lookup by a single memcmp().
In normal situations, this implementation is very fast and very
efficient in space.
Greetings,
Xavi
Al 04/09/13 10:10, En/na Xavier Hernandez ha escrit:
Al 04/09/13 02:55, En/na Anand Avati
ha escrit:
Yes, but it is used only inside dict.c itself. It should not be
published in the dict.h. This is only a cosmetic change but I
think it would be better. Any other use of dict_add() by a
translator will be incorrect unless it is made with much care,
which is not desirable for future maintenance.
That would be good. I would allow that dict_replace() sets *oldval
to NULL to indicate that the key did not exist, and maintain the
0/-1 return code to indicate error (this will maintain homogeneity
with other APIs, though I still think that an error code would be
more useful in general, but this would be another topic).
I missed that. Sorry.
For the case of 4 childs I split each byte in 4 elements of 2
bits. This makes it very easy to access the next child. It only
needs some basic logic operations. This is the current
implementation for the lookup function. It shows how it is
accessed:
/* TRIE_DIMENSION values:
*
* 1: 1 bit per element, 8 elements per byte
* 2: 2 bits per element, 4 elements per byte
* 3: 4 bits per element, 2 elements per byte
* 4: 8 bits per element, 1 element per byte
*/
#define TRIE_DIMENSION 2
#define TRIE_INDEX_BITS (4 - TRIE_DIMENSION)
#define TRIE_ELEM_BITS (1 << (TRIE_DIMENSION - 1))
#define TRIE_CHILDS (1 << TRIE_ELEM_BITS)
#define TRIE_ELEMS_PER_BYTE (1 << TRIE_INDEX_BITS)
#define TRIE_INDEX_MASK (TRIE_ELEMS_PER_BYTE - 1)
#define TRIE_ELEM_MASK (TRIE_CHILDS - 1)
#define sys_trie_value(_data, _offs) \
(((_data) >> (((_offs) & TRIE_INDEX_MASK) * TRIE_ELEM_BITS)) & \
TRIE_ELEM_MASK)
#define sys_trie_value_idx(_data, _offs) \
({ \
off_t __tmp_offs = (_offs); \
sys_trie_value(((_data)[(__tmp_offs) >> TRIE_INDEX_BITS]), \
__tmp_offs); \
})
typedef struct _sys_trie_node
{
struct _sys_trie_node * childs[TRIE_CHILDS];
struct _sys_trie_node * parent;
void * data;
uint32_t count;
uint32_t length;
uint8_t key[0];
} sys_trie_node_t;
typedef struct _sys_trie
{
sys_trie_node_t root;
} sys_trie_t;
sys_trie_node_t * sys_trie_lookup(sys_trie_t * trie,
const uint8_t * key,
size_t length)
{
sys_trie_node_t * node;
size_t len;
len = length << TRIE_INDEX_BITS;
for (node = trie->root.childs[sys_trie_value(*key, 0)];
(node != NULL) && (node->length < len);
node = node->childs[sys_trie_value_idx(key, node->length)]);
if ((node != NULL) && (node->length == len) && (node->data != NULL) &&
(memcmp(node->key, key, length) == 0))
{
return node;
}
return NULL;
}
It's true that the collapsing feature is not really used for
dictionaries, however I think it would be interesting to have it
to use tries for other purposes that may require a long duration
data structure that eventually adds and removes items.
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
https://lists.nongnu.org/mailman/listinfo/gluster-devel
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
https://lists.nongnu.org/mailman/listinfo/gluster-devel