Re: Please have a look at rewriters design

thierry bordaz <tbordaz@xxxxxxxxxx> · Wed, 1 Apr 2020 10:36:44 +0200

Hello,

I agree that the term generic is not appropriate. I should change it 
(design/PR) if it still exists somewhere.

https://pagure.io/389-ds-base/pull-request/50981 is said to extend the 
usability of existing interfaces and I think that is what it does.
People needing to map/rewrite/transform (whatever the word) an 
attribute/value know what they want to obtain but usually do not care 
about the burden of writing/deploying a new plugin.

In my mind only the rewriter is complex and knows when/how it applies 
(attribute, scope, crafting values, authentication...) so I wanted to 
keep the interface very simple: just load your rewriter and let the core 
server call it. William raised that it could contain helper functions, 
for example going through a filter and calling the rewriter function for 
each filter component. I am looking at that at the moment.
I think that a rewriter may also appreciate some configuration area, for 
example if a rewriter is generic and applies some transformation rules 
specific to a rewriter instance.
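
To illustrate the helper idea, here is a minimal sketch of walking a filter and calling a rewriter on each component. It is written in Python for brevity and is not the 389-ds API: the Filter class, walk_filter, rename_uid and the sAMAccountName-to-uid mapping are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Filter:
    """Toy LDAP filter node: an equality leaf or an AND/OR/NOT of children."""
    op: str                       # "eq", "and", "or" or "not"
    attr: Optional[str] = None    # set on "eq" leaves only
    value: Optional[str] = None   # set on "eq" leaves only
    children: List["Filter"] = field(default_factory=list)


# A rewriter receives one leaf component and may modify it in place.
Rewriter = Callable[[Filter], None]


def walk_filter(node: Filter, rewriter: Rewriter) -> None:
    """Hypothetical helper: call `rewriter` on every leaf of the filter."""
    if node.op == "eq":
        rewriter(node)
        return
    for child in node.children:
        walk_filter(child, rewriter)


def rename_uid(leaf: Filter) -> None:
    """Example rewriter: map an assumed client attribute to the stored one."""
    if leaf.attr is not None and leaf.attr.lower() == "samaccountname":
        leaf.attr = "uid"


def to_string(node: Filter) -> str:
    """Render the toy filter in the usual prefix notation."""
    if node.op == "eq":
        return f"({node.attr}={node.value})"
    symbol = {"and": "&", "or": "|", "not": "!"}[node.op]
    return "(" + symbol + "".join(to_string(c) for c in node.children) + ")"
```

With such a helper the rewriter itself only has to handle a single component; the traversal of AND/OR/NOT nodes lives in one place.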

I agree that it needs to be documented and the plugin guide is a good 
place. I would like to use the design to describe the interfaces.

best regards
thierry

On 4/1/20 9:24 AM, Ludwig Krispenz wrote:
Ok, so Thierry's solution is useful to make using rewriters simpler, 
but I really want to have its use and interface documented somewhere 
outside the code, PR, or design doc on the 389ds wiki - it needs to go 
to the official doc, e.g. the plugin guide.

Regards,
Ludwig

On 04/01/2020 01:02 AM, William Brown wrote:

On 1 Apr 2020, at 01:04, Ludwig Krispenz <lkrispen@xxxxxxxxxx> wrote:

Hi,

I was away and am late in the discussion, maybe too late.

Not too late, it's not released in production yet ;). There are two 
PRs that have been discussed here:

https://pagure.io/389-ds-base/pull-request/50988

https://pagure.io/389-ds-base/pull-request/50981

In my understanding what you mean by "generic" is that for a new 
rewriter you do not need a plugin, but to provide some rewrite 
functions and specify them in a rewriters config entry. But there is 
still the need to write rewriter functions, compile and deploy 
them, and instead of plugins you now have a new interface of 
"filterRewriter" and "returnedAttrRewriter" functions - so far not 
documented anywhere.

Under generic rewriter I would understand an approach where you do 
not need to provide your own functions, but have a rewriter plugin, which 
does rewriting based on rules in rewrite config entries, e.g. in which 
subtree, for which entries (filter to select), how to map a search 
filter, how to rename attrs on return, ...
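
A rule-driven rewriter of that kind could look roughly like the sketch below. This is only an illustration in Python under my own assumptions: RewriteRule and its fields are hypothetical and do not correspond to any existing 389-ds config schema, the subtree check is a naive suffix match, map keys are assumed to be lowercase, and the "filter to select" part is left out to keep it short.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RewriteRule:
    """Hypothetical rule, as it might be read from a rewrite config entry."""
    subtree: str                 # rule applies at or below this DN
    filter_map: Dict[str, str]   # lowercase client attr -> stored attr
    return_map: Dict[str, str]   # lowercase stored attr -> client attr


def in_subtree(dn: str, subtree: str) -> bool:
    """Naive scope check: case-insensitive suffix match, no DN normalization."""
    dn, subtree = dn.lower(), subtree.lower()
    return dn == subtree or dn.endswith("," + subtree)


def rewrite_filter_attr(rules: List[RewriteRule], base_dn: str, attr: str) -> str:
    """Map an attribute name used in a search filter, if a rule is in scope."""
    for rule in rules:
        if in_subtree(base_dn, rule.subtree):
            return rule.filter_map.get(attr.lower(), attr)
    return attr


def rename_returned_attrs(rules: List[RewriteRule], dn: str,
                          attrs: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Rename attributes of a returned entry using the first rule in scope."""
    for rule in rules:
        if in_subtree(dn, rule.subtree):
            return {rule.return_map.get(name.lower(), name): values
                    for name, values in attrs.items()}
    return dict(attrs)
```

The point of the sketch is that nothing has to be compiled or deployed per use case: the behaviour is entirely determined by the rule data.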
I had the same feeling too, to have these statically in libslapd, and 
much simpler than resolving symbols and dlopen. However, it's looking 
more like it will be a plugin style, but without using the current 
slapi plugin architecture - just a symload at start up. The reason 
for this, as Thierry explained, is that freeipa plans to link to 
samba or sssd as part of one of the rewriter classes, and we don't 
want that to become a dependency of 389-ds.

I have argued in the past for a "lib-ipa" that has the needed shared 
logic between various pieces of the project, but honestly, I forgot 
if that ever happened. I think these days sssd is libipa in a lot of 
ways ...

Anyway, that's why Thierry wants to have a symload in this case :)

Best regards,
Ludwig

On 03/19/2020 01:09 AM, William Brown wrote:
On 19 Mar 2020, at 04:08, thierry bordaz <tbordaz@xxxxxxxxxx> wrote:

On 3/18/20 1:51 AM, William Brown wrote:
On 18 Mar 2020, at 04:08, thierry bordaz <tbordaz@xxxxxxxxxx> 
wrote:

Hi William,

I updated the design according to our offline exchange
Thanks Thierry, I appreciate the conversation and the updates to 
the document: it made clear there were extra details up in your 
brain but not in words yet :) it's always hard to remember all 
the details as we write things, so thanks for the discussion. 
Like you said, it's always good to have a team who is really 
invested and cares about the work we do!

Your design for the core server version looks much better! Thank 
you. I still think there are some missing points. The reason to 
have a libpath rather than building them in is to avoid a potential 
linking to sssd/samba. I think also that the problem space of the 
global catalog here needs to be looked at too. This feature is 
not in isolation, it's really a part of that.
Okay, I will work on a new PR making the core server able to 
retrieve/register rewriters.

I think the "need" to improve the usability of rewriters is not 
specific to global catalog. Global Catalog is just an opportunity 
to implement it. I think parts of slapi-nis, integration of 
vsphere, GC (and likely others) are also use cases for rewriters. 
They were implemented in different ways because rewriters were not 
easy to use or simply not known.
Yes, that's perfectly reasonable, and shouldn't stop your idea from 
being created - what's concerning me is that without a full picture 
you don't know how far to take these rewriters or what direction, 
or what might be needed.

This means we have a whole set of deployment cases to look at.

So the deployment will look like:

IPA DS --> IPA GC

So an ipaAccount from the IPA DS instance will be "copied and 
transformed" into the IPA GC. This process is as yet undefined 
(it sounds like it may be offline or something else ...). We are 
simply not dealing with one instance now, but an out-of-band 
replication and transformation process. It's unclear whether the 
data transform is during this loading process, or in the IPA GC 
somehow.

From what I understand, it sounds like a method to take an 
ipaAccount and transform it to an AD GC account stub. Then inside 
of that IPA GC there are some virtual attributes you wish to add 
like objectSid binary vs string representations, objectCategory, 
maybe others.
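
For reference, the binary vs string objectSid conversion mentioned here can be sketched as follows. It follows the usual SID layout (1 byte revision, 1 byte sub-authority count, 6 byte big-endian identifier authority, then 32-bit little-endian sub-authorities). The function names are mine, and as a simplification the sketch always prints the authority in decimal.

```python
import struct


def sid_to_string(raw: bytes) -> str:
    """Convert a binary SID into its S-R-A-S1-S2... string form."""
    if len(raw) < 8:
        raise ValueError("SID too short")
    revision = raw[0]
    count = raw[1]                               # number of sub-authorities
    authority = int.from_bytes(raw[2:8], "big")  # 48-bit, big-endian
    if len(raw) != 8 + 4 * count:
        raise ValueError("SID length does not match sub-authority count")
    subs = struct.unpack("<%dI" % count, raw[8:])  # 32-bit, little-endian
    return "-".join(["S", str(revision), str(authority)] + [str(s) for s in subs])


def string_to_sid(text: str) -> bytes:
    """Convert the string form back into the binary representation."""
    parts = text.split("-")
    if len(parts) < 3 or parts[0] != "S":
        raise ValueError("not a SID string")
    revision = int(parts[1])
    authority = int(parts[2])
    subs = [int(p) for p in parts[3:]]
    return (bytes([revision, len(subs)])
            + authority.to_bytes(6, "big")
            + struct.pack("<%dI" % len(subs), *subs))
```

A rewriter for this attribute would apply one direction to filter values coming from the client and the other to values being returned.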

So from our discussion, we have currently focused on "how do we 
transform entries within a single directory server". But that's 
not the problem here. We are saying:

"We take an entry from IPA DS, transform it to an IPA GC stub 
entry, and then apply a set of further "in memory" transformations"
One of the biggest issues with GC is schema. IPA DS and IPA GC have 
incompatible schemas. They cannot be in the same replication 
topology.
So provisioning of IPA GC requires transformation rules to 
present another "view" of IPA DS data. Those transformations will 
be on the write path (i.e. stored in DB/indexed). This 
transformation work is almost done and is completely independent 
of 389-ds.
All of this is "write" path: provisioning (online or offline) and 
transformation.

The problem for IPA GC is now on the "read" path. AD clients are 
used to smart shortcuts/controls that are supported by IPA GC.
It is the IPA GC instance that will register the rewriters to 
act as a GC does.
Yep, I'm aware :)

If that's the process, why not do all the transforms as required 
in the DS -> GC load process? You raised a critically key point - 
we have a concern about the write path as the transform point due 
to IO or time to do the transform, but it sounds like you have to 
do this anyway as an element of the DS -> GC process.
Some of the transformation rules, on the write path, are quite 
complex. Looking at slapi-nis config entries gives an idea what is 
needed. In addition to those transformations, DS to GC online 
provisioning is not simple at all. Relying on sync-repl, you then 
need to transform a received entry into an update. At the moment 
it is an offline provisioning via transformation and import (much 
simpler).

To be honest I am afraid that the transform rules will result in 
rewriting slapi-nis.
*puts finger on nose* I do not want to be near that toxic rewrite 
at all.

I think every time I have spoken to you about this, I have kept 
learning more and more about this, and the more I see, I have 
many concerns about this feature. I think we do not have the full 
picture. You have admitted that you don't know the full extent or 
ideas here. There is clearly a communication breakdown here to 
our team from the IPA project, and they aren't telling us what 
they want. It sounds like they are asking you to just do "a small 
piece" but only they know the bigger picture.

The IPA project has the following designs:

https://www.freeipa.org/page/V4/Global_Catalog_Support

https://www.freeipa.org/page/V4/Global_Catalog_HLD

https://www.freeipa.org/page/V4/Global_Catalog_Access_Control

https://www.freeipa.org/page/V4/Global_Catalog_Data_Transformation

This also links to:

https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2003/cc737410(v=ws.10)?redirectedfrom=MSDN 

The freeipa design pages are extremely shallow on details. The 
entire section on how they plan to get data into the GC is:

"""
Global Catalog provisioning

The data in Global Catalog is provisioned from the primary LDAP 
server instance running on the same FreeIPA master. A SYNCREPL 
mechanism is used to retrieve the changes and a modified 
slapi-nis module is used to transform FreeIPA original data to a 
schema compatible with Global Catalog in Active Directory. Unlike 
the original slapi-nis module, the data is stored in a proper 
LDAP backend so it is persistent across the directory server 
restarts.
"""
You are right I do not know the big picture. What I know is that 
parts of the GC needs can be solved with rewriters, which are by the 
way a supported 389-ds interface. So storing rewriters in a simple 
shared library rather than in plugins will help both IPA and 389-ds.
Without the big picture we don't know what they will ask from the 
rewriters, and what we can or cannot deliver.

Where is the example config? Proof of concept? Even a conceptual 
set of accounts and groups showing the data transformation? How 
will they synthesise stable object data points?

The section of "data transformation" even goes to a blank page. 
Is the rewrite you are being asked to do just for objectSid once 
all these other transforms are done? Or is there more?

Honestly, it's worth reading the "how global catalog works" from 
msdn. Just to put it in contrast, that document (when converted 
to a pdf) is 61 pages long. Look at the features. Group caching, 
GC replication, partialAttribute replication based on schema, 
more ...

Honestly, Thierry, I trust you as a very smart and capable 
engineer, but you do not have the full picture here - none of us 
do. This seems like a feature that will explode in complexity and 
scale, and if not done *properly* from the start, may end up with 
many many half-baked, poorly designed solutions tacked together 
to make it look like it works. And that means we'll end up 
carrying that burden, just like slapi-nis (which is everyone's 
favourite plugin ...)
Again, rewriters are not new. They have been a supported interface for 
years. The design is just to make them simpler to develop/deploy.
Looking at some plugins I think they are related to a way to give 
different "views" of the same dataset. Many times, a rewriter 
specific to LDAP client needs is a good option.
If GC can make use of it, great. But I am sure that others (like 
vsphere) will appreciate it.
That's not the problem. You are right that having improved rewriter 
support probably has some good options for other plugins, or other 
areas. The issue is without the bigger picture, we don't know what 
they need. We don't know what we are on the hook for.

Let's be clear, to me as an external person, a core team member of the 
389 project, the information in those design documents is not enough 
for me to make informed engineering decisions about this feature.

There is a pattern and history to this behaviour.

I think what's really concerning isn't the technical issues, but 
the social. I want to make clear - You, Thierry, yourself admitted 
you do not know what is fully expected of you in this feature. How 
are you, as an engineer, meant to do your best possible work 
without the full picture? You are very smart, but not psychic last 
time I checked :)

This does not make me comfortable.

I also know - having been inside Red Hat, and now external to it, 
that the FreeIPA team does a lot of discussion internally and 
privately. It needs to stop. If they want to request features of 
our project, they need to accept that 389-ds has upstream, core 
team members who are not part of Red Hat. They need to be engaging 
on 389-devel, not coming to you internally, and asking for 
features. They need to be designing their features, publicly, and 
clearly, in detail, so that we can make informed engineering 
decisions for our project, that includes our full team and community.

I have resisted talking about this publicly for a long time, but I 
think it's time that it's taken to the open - the FreeIPA team has 
communication and social challenges that they need to address. 
While these social issues continue, we will continue to see poor 
quality features being churned out that negatively impact our 
users, and our reputation as a project.

At this point, I believe that this rewriter feature can not 
progress until the FreeIPA project puts forward a complete, 
detailed and well constructed design of "what they require" for 
their global catalog feature.

Thanks

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs
_______________________________________________
389-devel mailing list -- 389-devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to 
389-devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: 
https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/389-devel@xxxxxxxxxxxxxxxxxxxxxxx
--
Red Hat GmbH, http://www.de.redhat.com/, Sitz: Grasbrunn,
Handelsregister: Amtsgericht München, HRB 153243,
Geschäftsführer: Charles Cachera, Laurie Krebs, Michael O'Neill, 
Thomas Savage
