Re: [389-devel] RFC: New Design: Fine Grained ID List Size

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/06/2013 05:30 PM, David Boreham wrote:
On 9/6/2013 3:05 PM, Rich Megginson wrote:
Please review and comment:

http://port389.org/wiki/Design/Fine_Grained_ID_List_Size

This looks interesting. I suppose this is similar to a SQL database's concept of index statistics, and also query hints supplied by the client. Perhaps more of a "server index hint".

This may already been discussed, but in reading through the design doc, I was wondering about having the query planner (such as there is one in the DS) take note of the index hints prior to executing any lookups. This is similar to a SQL database's behavior executing such a query, when it sees index statistics that indicate low cardinality. To expand:

I seem to recall that there is already code to avoid looking up a low cardinality index, if and only if the intersection predicates are ordered suitably, by checking the id list size between index lookups. Thus, if there is a filter for uid=foo & objectclass=bar (apologies for not using the wacky LDAP string filter syntax...), then if only one hit is seen from the uid=foo lookup, the objectclass=bar lookup is skipped. If that's still the case, then the example "bad" search would become good if the client were to re-order the predicates. Of course often the client can not be modified, so:

My thought is to add that functionality to the server -- the client can then submit filters without regard to the internal workings of the server. The server checks the predicates against Rich's new index hints, and can therefore make the correct ordering itself. The benefit would be that no additional index lookup would be done, vs one that meets the id limit pertaining to the index, and the administrator only has to know that the index has low cardinality. A further "refinement" would be to make a tool that populates the hint data based on analysis of the index content, a la SQL "UPDATE STATISTICS".
Thanks for your feedback David!

This is a good idea, and it is something that we discussed briefly off-list. The only downside is that we need to change the index format to keep a count of ids for each key. Implementing this isn't a big problem, but it does mean that the existing indexes need to be updated to populate the count based off of the contents (as you mention above).

In the short term, we are looking for a way to be able to improve performance for specific search filters that are not possible to modify on the client side (for whatever reason) while leaving the index file format exactly as it is. I still feel that there is potentially great value in keeping a count of ids per key so we can optimize things on the server side automatically without the need for complex index configuration on the administrator's part. I think we should consider this for an additional future enhancement.

Thanks,
-NGK

Hopefully this makes sense. Apologies if it has already been considered.


--
389-devel mailing list
389-devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-devel



--
389-devel mailing list
389-devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/389-devel





[Index of Archives]     [Fedora Directory Announce]     [Fedora Users]     [Older Fedora Users Mail]     [Fedora Advisory Board]     [Fedora Security]     [Fedora Devel Java]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Mentors]     [Fedora Package Review]     [Fedora Art]     [Fedora Music]     [Fedora Packaging]     [CentOS]     [Fedora SELinux]     [Big List of Linux Books]     [KDE Users]     [Fedora Art]     [Fedora Docs]

  Powered by Linux