On 09/10/2013 04:35 PM, Ludwig Krispenz
wrote:
On 09/10/2013 04:29 PM, Rich Megginson wrote:
On 09/10/2013 01:47 AM, Ludwig Krispenz
wrote:
On 09/09/2013 07:19 PM, Rich Megginson wrote:
On 09/09/2013 02:27 AM, Ludwig
Krispenz wrote:
On 09/07/2013 05:02 AM, David Boreham wrote:
On 9/6/2013 8:49 PM, Nathan Kinder
wrote:
This is a good idea, and it is
something that we discussed briefly off-list. The
only downside is that we need to change the index
format to keep a count of ids for each key.
Implementing this isn't a big problem, but it does
mean that the existing indexes need to be updated to
populate the count based off of the contents (as you
mention above).
I don't think you need to do this (I certainly wasn't
advocating doing so). The "statistics" state is much the
same as that proposed in Rich's design. In fact you
could probably just use that same information. My idea
is more about where and how you use the information. All
you need is something associated with each index that
says "not much point looking here if you're after
something specific, move along, look somewhere else
instead". This is much the same information as "don't
use a high scan limit here".
In the short term, we are looking for a way to be able
to improve performance for specific search filters
that are not possible to modify on the client side
(for whatever reason) while leaving the index file
format exactly as it is. I still feel that there is
potentially great value in keeping a count of ids per
key so we can optimize things on the server side
automatically without the need for complex index
configuration on the administrator's part. I think we
should consider this for an additional future
enhancement.
I'm saying the same thing. Keeping a cardinality count
per key is way more than I'm proposing, and I'm not sure
how useful that would be anyway, unless you want to do
OLAP in the DS ;)
we have the cardinality of the key in old-idl and this
makes some searches where parts of the filter are allids
fast.
I'm late in the discussion, but I think Rich's proposal is
very promising to address all the problems related to
allids in new-idl.
We could then eventually rework filter ordering based on
these configurations. Right now we only have a filter
ordering based on index type and try to postpone "<="
or similar filter as they are known to be costly, but this
could be more elaborate.
An alternative would be to have some kind of index lookup
caching. In the example in ticket 47474 the filter is
(&(|(objectClass=organizationalPerson)(objectClass=inetOrgPerson)(objectClass=organization)(objectClass=organizationalUnit)(objectClass=groupOf
Names)(objectClass=groupOfUniqueNames)(objectClass=group))(c3sUserID=EndUser0000078458))"
and probably only the "c3sUserID=xxxxx" part will change,
if we cache the result for the (&(|(objectClass=...
part, even if it is expensive, it would be done only once.
Thanks everyone for the comments. I have added Noriko's
suggestion:
http://port389.org/wiki/Design/Fine_Grained_ID_List_Size
David, Ludwig: Does the current design address your
concerns, and/or provide the necessary first step for
further refinements?
yes, the topic of filter reordering or caching could be looked
at independently.
Just one concern abou the syntax:
nsIndexIDListScanLimit:
maxsize[:indextype][:flag[,flag...]][:value[,value...]]
since everything is optional, how do you decide if in
nsIndexIDListScanLimit: 6:eq:AND "AND" is a value or a flag ?
and as it defines limits for specific keys, could the
attributname reflect this, eg nsIndexKeyIDListScanLimit or
nsIndexKeyScanLimit or ... ?
Thanks, yes, it is ambiguous.
I think it may have to use keyword=value, so something like
this:
nsIndexIDListScanLimit: limit=NNN [type=eq[,sub]]
[flags=ADD[,OR]] [values=val[,val...]]
That should be easy to parse for both humans and machines.
For values, will have to figure out a way to have escapes (e.g.
if a value contains a comma or an escape character). Was
thinking of using LDAP escapes (e.g. \, or \032)
they should be treated as in filters and normalized, in the config
it should be the string representation according to the
attributetype
Hi,
I was wondering if this configuration attribute at the
index level, could not also be implemented at the bind-base level.
If an application use to bind with a given entry, it could use its
own limitations put for example into operational attribute in the
bound entry itself.
So that two applications, using the same filter component could
have their specific idlist size.
Anyway if it makes sense it could be added later.
best regards
thierry
--
389-devel mailing list
389-devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-devel
--
389-devel mailing list
389-devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-devel
--
389-devel mailing list
389-devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-devel
|