On Tue, 14 Aug 2012, Colin Walters <walters@xxxxxxxxxx> wrote:
> Really though in the big picture, while the file context regexps were
> probably an OK solution way back when SELinux was a "proof of concept"
> prototype, the current policy generating 5000 of them is just crazy...

Actually the situation is way better than it was in the early days. When
I first started working on SE Linux the software wasn't as optimised and
the hardware was way slower. A restorecon type operation would be 99%
user CPU time, and it was common to take more than 20 minutes of CPU time
to relabel a relatively small filesystem. Having 5000 regexes on a modern
system, for argument's sake (it's 1923 on my system, but that depends on
whether you load a policy with everything or just the modules you need),
is a lot easier than the situation in the early days with fewer regular
expressions.

> One other possibility - I bet one could get a huge speedup in some cases
> by splitting up the regexp set based on common prefixes. For example,
> if you're trying to match /tmp/krb5cc, there's no reason to run over all
> 2000 regexps which start with /usr. This solution is kind of an
> intermediate step between "run 5000 regexps serially" and "write custom
> code to compile 5000 regexps into a DFA that returns a context".

Yes, I wrote code to do that many years ago. Any regex which had a fixed
string for the first subdirectory from root would only be called for a
filename in that same subdirectory. The prefixes were indexed, so an
integer compare was enough to determine whether a regex would be called.
Regexes which applied to multiple prefixes (e.g. "/.*") would be applied
to all files.

But I believe that the Kerberos performance problem is not in calling the
regexes but in loading them. The current code (unless it's changed
recently) will compile all regexes, so when Kerberos loads the file
contexts for a check on /tmp it will compile all the regexes under /usr,
/var, and other common prefixes even though they won't be used.
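
To make the prefix indexing above concrete, here is a minimal Python
sketch. It is not the code I mentioned, and it is not how libselinux is
written. The class and function names are made up for illustration, the
contexts are placeholders, Python's re module stands in for the real regex
library, and "later entries win" is an assumption of the sketch. It also
defers compiling each regex until the integer compare lets it through, to
show what skipping unused compiles might look like.

```python
import re

GLOBAL = -1  # id for patterns with no fixed first directory, e.g. "/.*"
_META = set(".*+?[](){}|^$\\")  # characters that make a component a pattern


def fixed_first_dir(pattern):
    """Return the literal first directory of a pattern, or None if there is none."""
    head, sep, _ = pattern[1:].partition("/")
    if not pattern.startswith("/") or not sep or not head:
        return None
    if any(ch in _META for ch in head):
        return None  # conservative: treat as applying to all prefixes
    return head


class PrefixIndexedContexts:
    """Hypothetical container: (pattern, context) entries indexed by prefix id."""

    def __init__(self, entries):
        self.prefix_ids = {}  # directory name -> small integer
        self.entries = []     # [prefix_id, pattern, context, compiled_or_None]
        self.compiled = 0     # how many regexes have been compiled so far
        for pattern, context in entries:
            head = fixed_first_dir(pattern)
            if head is None:
                pid = GLOBAL
            else:
                pid = self.prefix_ids.setdefault(head, len(self.prefix_ids))
            self.entries.append([pid, pattern, context, None])

    def lookup(self, path):
        head = path[1:].partition("/")[0]
        pid = self.prefix_ids.get(head, GLOBAL)
        # Assumption of this sketch: later entries take precedence.
        for entry in reversed(self.entries):
            if entry[0] != GLOBAL and entry[0] != pid:
                continue  # integer compare only, the regex is never touched
            if entry[3] is None:
                entry[3] = re.compile(entry[1])  # compile on first use
                self.compiled += 1
            if entry[3].fullmatch(path):
                return entry[2]
        return None


# Placeholder entries, not taken from any real policy.
index = PrefixIndexedContexts([
    ("/.*", "default_t"),
    ("/usr/.*", "usr_t"),
    ("/usr/bin/.*", "bin_t"),
    ("/var/log/.*", "var_log_t"),
    ("/tmp/.*", "tmp_t"),
    ("/tmp/krb5cc.*", "krb5_tmp_t"),
])
label = index.lookup("/tmp/krb5cc_1000")
```

In this sketch a lookup under /tmp never compiles or runs the /usr and /var
entries, and index.compiled shows how many regexes were actually compiled.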
I don't know how much time can be saved by skipping the compilation of
regexes that won't be used.

Another thing that could be done is to have an interface for loading a
file_contexts file for a specific prefix. Then the code which generates
the file_contexts file could generate files such as file_contexts_tmp,
which has only the entries that match /tmp (10 for the policy I use, maybe
50 or so for the one you use) and the entries that match everything (e.g.
"/.*").

On my system there are 9 file_contexts entries which are not prefix
specific. One of them is required ("/.*"). Of the others, /vmlinux.* and
/initrd\.img.* are obsolete, and the remaining 6 could easily be split to
be prefix specific.

So with a minor change to the library interface (adding a new entry point,
so that the new library still works with old apps), a program which knows
that it will only label files under /tmp would check only 11 regexes on my
system, or maybe 50 on your system.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/
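
P.S. A rough Python sketch of how the per-prefix files described above
might be generated. This is only an illustration under assumptions: the
function names are hypothetical, the sample lines and contexts are made up
rather than copied from a real policy, and any pattern whose first
directory is not a plain string is conservatively kept for every prefix.

```python
_META = set(".*+?[](){}|^$\\")  # characters that make a component a pattern


def fixed_first_dir(pattern):
    """Return the literal first directory of a pattern, or None if there is none."""
    head, sep, _ = pattern[1:].partition("/")
    if not pattern.startswith("/") or not sep or not head:
        return None
    if any(ch in _META for ch in head):
        return None
    return head


def entries_for_prefix(lines, prefix):
    """Keep entries under the given first directory, plus prefix-less entries."""
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        pattern = stripped.split()[0]
        head = fixed_first_dir(pattern)
        if head is None or head == prefix:
            kept.append(stripped)
    return kept


# Illustrative input only, not a real file_contexts file.
sample = """\
# sample entries
/.*              system_u:object_r:default_t:s0
/usr/.*          system_u:object_r:usr_t:s0
/var/log/.*      system_u:object_r:var_log_t:s0
/tmp/.*          system_u:object_r:tmp_t:s0
/vmlinux.*       system_u:object_r:boot_t:s0
""".splitlines()

# These lines are what a generated file_contexts_tmp would contain.
tmp_entries = entries_for_prefix(sample, "tmp")
tmp_patterns = [entry.split()[0] for entry in tmp_entries]
```

A program that only labels files under /tmp would then load the much
shorter list instead of the full file.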