On Wed, Apr 29, 2020 at 3:01 PM James Carter <jwcart2@xxxxxxxxx> wrote: > > On Mon, Mar 30, 2020 at 9:06 AM Chris PeBenito <pebenito@xxxxxxxx> wrote: > > > > On 3/27/20 3:21 PM, Stephen Smalley wrote: > > > On 3/27/20 11:21 AM, Ondrej Mosnacek wrote: > > >> These patches are the userspace side of the kernel change posted at [1]. > > >> > > >> The first patch changes libsepol's internal representation of filename > > >> transition rules in a way similar to kernel commit c3a276111ea2 > > >> ("selinux: optimize storage of filename transitions") [2]. > > >> > > >> The second patch then builds upon that and implements reading and > > >> writing of a new binary policy format that uses this representation also > > >> in the data layout. > > >> > > >> See individual patches for more details. > > >> > > >> NOTE: This series unfortunately breaks the build of setools. Moreover, > > >> when an existing build of setools dynamically links against the new > > >> libsepol, it segfaults. Sadly, there doesn't seem to be a nice way of > > >> handling this, since setools relies on non-public libsepol policydb > > >> API/ABI. > > > > > > I think this has happened before a few years ago when we made a > > > different change to those structures, and required updates on the > > > setools side. > > > > > > Maybe we need to figure out what setools needs to be encapsulated and > > > exported as part of the libsepol public ABI/API, and then stop having it > > > peer into libsepol internals? > > > > I'd be fine with that :) > > > > I am a little confused. I would consider peering into the libsepol > internals to be something like assuming the memory layout of > structures and pulling things out based on those assumptions, but > setools is just using public header files and functions. It seems to > me to be using the public API. Now I think that too much of the > internals are being exposed, but I am not sure we can do anything > about that now. The sepol/policydb/*.h files are considered private and are only for static users of libsepol like checkpolicy. Only the top-level sepol/*.h headers are part of the shared library API/ABI. > It doesn't seem like it would be too hard to update setools to work > with this change, unless there is a requirement that it still work > with older versions of libsepol. I am not very familiar with cython > and I don't know how to make it work differently depending on what > version of libsepol is on the system. > > Speaking of versions, since we are actually modifying structs in the > header files instead of just adding new things, don't we need change > the version of libsepol or something? > > > Ultimately SETools does 2 thing for the policy access: > > * iterate over the entire policy, reading data out of the various objects > > * use linkages between objects in the policy, e.g. get the > > type/attributes from an AV rule, get types from an attribute, etc. using > > the stuff like class_val_to_struct as needed. > > > > So if sufficient functionality to do dispol was exported, that would get > > close to all that was needed. There are some optimizations I made by > > digging at the structs a bit more than the above, but that could > > potentially be dropped if libsepol has sufficient support. See > > <https://github.com/SELinuxProject/setools/blob/master/setools/policyrep/selinuxpolicy.pxi#L673> > > for the policy loading code. > > > > I think the fact that the CIL, kernel to CIL, kernel to conf, and > module to CIL code is all in libsepol speaks to the fact of how > tightly integrated they are to the rest of libsepol. One argument that > could be made is that the policyrep stuff in setools really belongs in > libsepol. > > Thinking about how libsepol could be encapsulated leads me to a couple > of possibilities. One way would be functions that could return lists > of rules. The policy module code gives us avrule_t, role_trans_rule_t, > role_allow_t, filename_trans_rule_t, range_trans_rule_t, and others. > Those structures are probably unlikely to change and, at least in this > case, creating a function that walks the filename_trans hashtable and > returns a list of filename_trans_rule_t certainly seems like it > wouldn't be too hard. Another possible way to encapsulate would be > create a way to walk the various hashtables element by element (I > think hashtab_map() requires too much knowledge of the internal > structures), returning an opaque structure to track where you are in > the hashtable and functions that allow you to get each part of the > rule being stored. There are other ways that it could be done, but I > could rewrite kernel to and module to stuff with either of those. CIL > itself would require some functions to insert rules into the policydb > which probably wouldn't be too hard. None of this would be too hard, > but it would take some time. The real question is would it really be > valuable?