On Tue, 29 Mar 2011 10:23:37 -0500 Shirish Pargaonkar <shirishpargaonkar@xxxxxxxxx> wrote: > On Tue, Mar 29, 2011 at 8:30 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > On Mon, 28 Mar 2011 15:25:01 -0500 > > shirishpargaonkar@xxxxxxxxx wrote: > > > >> From: Shirish Pargaonkar <shirishpargaonkar@xxxxxxxxx> > >> > >> > >> rb tree search and insertion routines. > >> > >> A SID which needs to be mapped, is looked up in one of the rb trees > >> depending whether SID is either owner or group SID. > >> If found in the tree, a (mapped) id from that node is assigned to > >> uid or gid as appropriate. ÂIf unmapped, an upcall is attempted to > >> map the SID to an id. ÂIf upcall is successful, node is marked as > >> mapped. ÂIf upcall fails, node stays marked as unmapped and a mapping > >> is attempted again during next resolution. > >> > >> To map a SID, which can be either a Owner SID or a Group SID, key > >> description starts with the string "os" or "gs" followed by SID converted > >> to a string. Without "os" or "gs", cifs.upcall does not know whether > >> SID needs to be mapped to either an uid or a gid. > >> > >> Nodes in rb tree as fields to prevent multiple upcalls for > >> a SID. ÂAdding and removing nodes is done within global locks. > >> shrinker routine does not prune a node if mapping for that node > >> is either pending or or node was recently created (before timeout > >> period expires and node could be purged). The timeout period should > >> be long enough at least for an upcall to return back to cifs. > >> > >> For now, cifs.upcall is only used to map a SID to an id (uid or gid) but > >> it would be used to obtain an SID (string) for an id. > >> > >> > >> Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@xxxxxxxxx> > >> --- > >> Âfs/cifs/cifsacl.c | Â298 ++++++++++++++++++++++++++++++++++++++++++++++++----- > >> Âfs/cifs/cifsacl.h |  22 ++++ > >> Â2 files changed, 296 insertions(+), 24 deletions(-) > >> > >> diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c > >> index c60bb90..5b2ef73 100644 > >> --- a/fs/cifs/cifsacl.c > >> +++ b/fs/cifs/cifsacl.c > >> @@ -54,7 +54,30 @@ static const struct cifs_sid sid_authusers = { > >> Â/* group users */ > >> Âstatic const struct cifs_sid sid_user = {1, 2 , {0, 0, 0, 0, 0, 5}, {} }; > >> > >> -static const struct cred *root_cred; > >> +const struct cred *root_cred; > >> + > >> +static void > >> +shrink_idmap_tree(struct rb_root *root, int nr_to_scan, int *nr_rem, > >> +           int *nr_del) > >> +{ > >> +   struct rb_node *node; > >> +   struct rb_node *tmp; > >> +   struct cifs_sid_id *psidid; > >> + > >> +   node = rb_first(root); > >> +   while (node) { > >> +       tmp = node; > >> +       node = rb_next(tmp); > >> +       psidid = rb_entry(tmp, struct cifs_sid_id, rbnode); > >> +       if (time_after(psidid->time + SID_MAP_EXPIRE, jiffies) || > >> +               (nr_to_scan == 0) || (*nr_del == nr_to_scan)) > >> +           ++(*nr_rem); > >> +       else { > >> +           rb_erase(tmp, root); > >> +           ++(*nr_del); > >> +       } > >> +   } > >> +} > >> > >> Â/* > >>  * Run idmap cache shrinker. > >> @@ -62,9 +85,21 @@ static const struct cred *root_cred; > >> Âstatic int > >> Âcifs_idmap_shrinker(struct shrinker *shrink, int nr_to_scan, gfp_t gfp_mask) > >> Â{ > >> -   /* Use a pruning scheme in a subsequent patch instead */ > >> -   cifs_destroy_idmaptrees(); > >> -   return 0; > >> +   int nr_del = 0; > >> +   int nr_rem = 0; > >> +   struct rb_root *root; > >> + > >> +   root = &uidtree; > >> +   spin_lock(&siduidlock); > >> +   shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del); > >> +   spin_unlock(&siduidlock); > >> + > >> +   root = &gidtree; > >> +   spin_lock(&sidgidlock); > >> +   shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del); > >> +   spin_unlock(&sidgidlock); > >> + > >> +   return nr_rem; > >> Â} > >> > >> Âstatic struct shrinker cifs_shrinker = { > >> @@ -92,7 +127,6 @@ cifs_idmap_key_destroy(struct key *key) > >>    kfree(key->payload.data); > >> Â} > >> > >> -static > >> Âstruct key_type cifs_idmap_key_type = { > >>    .name    Â= "cifs.cifs_idmap", > >>    .instantiate = cifs_idmap_key_instantiate, > >> @@ -101,6 +135,196 @@ struct key_type cifs_idmap_key_type = { > >>    .match    = user_match, > >> Â}; > >> > >> +static void > >> +id_rb_insert(struct rb_root *root, struct cifs_sid *sidptr, > >> +       struct cifs_sid_id **psidid) > >> +{ > >> +   int rc; > >> +   struct rb_node *node = root->rb_node; > >> +   struct rb_node *parent = NULL; > >> +   struct rb_node **linkto = &(root->rb_node); > >> +   struct cifs_sid_id *lsidid; > >> + > >> +   while (node) { > >> +       lsidid = rb_entry(node, struct cifs_sid_id, rbnode); > >> +       parent = node; > >> +       rc = compare_sids(sidptr, &((lsidid)->sid)); > >> +       if (rc > 0) { > >> +           linkto = &(node->rb_left); > >> +           node = node->rb_left; > >> +       } else if (rc < 0) { > >> +           linkto = &(node->rb_right); > >> +           node = node->rb_right; > >> +       } > >> +   } > >> + > >> +   memcpy(&(*psidid)->sid, sidptr, sizeof(struct cifs_sid)); > >> +   clear_bit(SID_ID_PENDING, &(*psidid)->state); > >> +   clear_bit(SID_ID_MAPPED, &(*psidid)->state); > >> + > >> +   rb_link_node(&(*psidid)->rbnode, parent, linkto); > >> +   rb_insert_color(&(*psidid)->rbnode, root); > >> +} > >> + > >> +static struct cifs_sid_id * > >> +id_rb_search(struct rb_root *root, struct cifs_sid *sidptr) > >> +{ > >> +   int rc; > >> +   struct rb_node *node = root->rb_node; > >> +   struct rb_node *parent = NULL; > >> +   struct rb_node **linkto = &(root->rb_node); > >> +   struct cifs_sid_id *lsidid; > >> + > >> +   while (node) { > >> +       lsidid = rb_entry(node, struct cifs_sid_id, rbnode); > >> +       parent = node; > >> +       rc = compare_sids(sidptr, &((lsidid)->sid)); > >> +       if (rc > 0) { > >> +           linkto = &(node->rb_left); > >> +           node = node->rb_left; > >> +       } else if (rc < 0) { > >> +           linkto = &(node->rb_right); > >> +           node = node->rb_right; > >> +       } else /* node found */ > >> +           return lsidid; > >> +   } > >> + > >> +   return NULL; > >> +} > >> + > >> +static void > >> +sid_to_str(struct cifs_sid *sidptr, char *sidstr) > >> +{ > >> +   int i; > >> +   unsigned long saval; > >> +   char *strptr; > >> + > >> +   strptr = sidstr; > >> + > >> +   sprintf(strptr, "%s", "S"); > >> +   strptr = sidstr + strlen(sidstr); > >> + > >> +   sprintf(strptr, "-%d", sidptr->revision); > >> +   strptr = sidstr + strlen(sidstr); > >> + > >> +   for (i = 0; i < 6; ++i) { > >> +       if (sidptr->authority[i]) { > >> +           sprintf(strptr, "-%d", sidptr->authority[i]); > >> +           strptr = sidstr + strlen(sidstr); > >> +       } > >> +   } > >> + > >> +   for (i = 0; i < sidptr->num_subauth; ++i) { > >> +       saval = le32_to_cpu(sidptr->sub_auth[i]); > >> +       sprintf(strptr, "-%ld", saval); > >> +       strptr = sidstr + strlen(sidstr); > >> +   } > >> +} > >> + > >> +static int > >> +sidid_pending_wait(void *unused) > >> +{ > >> +   schedule(); > >> +   return signal_pending(current) ? -ERESTARTSYS : 0; > >> +} > >> + > >> +static int > >> +sid_to_id(struct cifs_sb_info *cifs_sb, struct cifs_sid *psid, > >> +       struct cifs_fattr *fattr, uint sidtype) > >> +{ > >> +   int rc; > >> +   unsigned long cid; > >> +   char *strptr; > >> +   struct key *idkey; > >> +   const struct cred *saved_cred; > >> +   struct cifs_sid_id *psidid, *npsidid; > >> +   struct rb_root *cidtree; > >> +   spinlock_t *cidlock; > >> + > >> +   if (sidtype == SIDOWNER) { > >> +       cid = cifs_sb->mnt_uid; /* default uid, in case upcall fails */ > >> +       cidlock = &siduidlock; > >> +       cidtree = &uidtree; > >> +   } else if (sidtype == SIDGROUP) { > >> +       cid = cifs_sb->mnt_gid; /* default gid, in case upcall fails */ > >> +       cidlock = &sidgidlock; > >> +       cidtree = &gidtree; > >> +   } else > >> +       return -ENOENT; > >> + > >> +   spin_lock(cidlock); > >> +   psidid = id_rb_search(cidtree, psid); > >> +   spin_unlock(cidlock); > >> + > >> +   if (!psidid) { /* node does not exist, allocate one & attempt adding */ > >> +       npsidid = kzalloc(sizeof(struct cifs_sid_id), GFP_KERNEL); > >> +       if (!npsidid) > >> +           return -ENOMEM; > >> + > >> +       spin_lock(cidlock); > >> +       psidid = id_rb_search(cidtree, psid); > >> +       if (psidid) { /* node happened to get inserted meanwhile */ > >> +           spin_unlock(cidlock); > >> +           kfree(npsidid); > >> +       } else { > >> +           psidid = npsidid; > >> +           id_rb_insert(cidtree, psid, &psidid); > >> +           spin_unlock(cidlock); > >> +       } > >> +   } > >> + > >> +   if (!test_bit(SID_ID_MAPPED, &psidid->state)) { /* SID not mapped */ > > > >    Â^^^ nit: maybe just do: > > > >    Âif (test_bit(SID_ID_MAPPED... > >        Âgoto out; > > > >    Â...or something. That would allow you to reduce the indentation > >    Âof the code below and make it easier to read. > > > > Will make this change. > > >> +       if (!test_and_set_bit(SID_ID_PENDING, &psidid->state)) { > >> +           memset(psidid->sidstr, '\0', SIDLEN); > >> +           if (sidtype == SIDOWNER) > >> +               sprintf(psidid->sidstr, "%s", "os:"); > >> +           else > >> +               sprintf(psidid->sidstr, "%s", "gs:"); > >> + > >> +           strptr = psidid->sidstr + strlen(psidid->sidstr); > >> +           sid_to_str(psid, strptr); > >> + > > > > Would it be better to do the string handling above before you insert > > this struct into the tree? In the case where this thing can't be mapped, > > then you're going to have to re-convert the sid on every pass through > > this function. Just doing that once would make more sense, IMO... > > > > Will address it. > > > I wonder too...once this thing is mapped, then the above string will > > never be used again, will it? If so, would it be best to release that > > memory instead of keeping it around in the cache? > > I think kmalloc the memory for string when node is allocated and free it > once SID is mapped? The string is really not needed at all once > a SID is mapped. > I think so too. This cache can possibly grow to be rather large, so you ought to be as efficient as possible with memory (that's a good goal anyway). It might even be worthwhile to make your own slabcache for these entries, but that can be done in a later patch once the need is demonstrated. > > > >> +           saved_cred = override_creds(root_cred); > >> +           idkey = request_key(&cifs_idmap_key_type, > >> +                       psidid->sidstr, ""); > >> +           if (IS_ERR(idkey)) { > >> +               psidid->id = cid; > >> +               cFYI(1, "%s: Can't map SID to an id", __func__); > >> +           } else { > >> +               psidid->id = > >> +                   *(unsigned long *)idkey->payload.value; > >> +               psidid->time = jiffies; > >> +               set_bit(SID_ID_MAPPED, &psidid->state); > >> +               key_put(idkey); > >> +           } > >> +           revert_creds(saved_cred); > >> +           clear_bit(SID_ID_PENDING, &psidid->state); > >> +           wake_up_bit(&psidid->state, SID_ID_PENDING); > >> +       } else { > >> +           /* > >> +           Â* either an existing node with SID mapped (in which > >> +           Â* case we would not wait) or a node whose SID is > >> +           Â* being mapped by somebody else (in which case, we > >> +           Â* would end up waiting). > >> +           Â*/ > >> +           rc = wait_on_bit(&psidid->state, SID_ID_PENDING, > >> +                   sidid_pending_wait, TASK_INTERRUPTIBLE); > >> +           if (rc) { > >> +               cFYI(1, "%s: sidid_pending_wait interrupted %d", > >> +                   __func__, rc); > >> +               return rc; > >> +           } > >> +       } > >> +   } > >> + > >> +   if (sidtype == SIDOWNER) > >> +       fattr->cf_uid = psidid->id; > >> +   else > >> +       fattr->cf_gid = psidid->id; > >> + > >> +   return 0; > >> +} > >> + > > > > So let's consider the case where we have a directory with a lot of > > files in it that are owned by a user that can't be mapped. Someone does > > a "ls -l" in that dir. > > > > Presumably, on every stat() call we'll call into this routine to map > > the SID to a unix uid. The MAPPED bit will never be set though, so > > we'll always call request_key(). > > > > cifs.upcall will generally negatively instantiate the key with a 1s > > timeout. I think if request-key doesn't find a match (indicating that > > no one set up request-key.conf) then it negatively instantiates the key > > with a 60s timeout. > > > > Is that reasonable behavior here? > > > > Not sure I understand the negative instantiation of key with a timeout. > I had a timeout field in some of the earlier patch versions whrere re-mapping > for an unmapped SID is attempted only after that timeout expires > I believe it works something like this... When the upcall returns an error without instantiating the key, the kernel negatively instantiates the key with a 60s timeout. The cifs.upcall program however will instead negatively instantiate the key with a 1s timeout. When there's a negative key in cache, request_key will return an error (-ENOKEY, I think) without doing the upcall. That's generally a good thing. We don't want to bog down the box by forcing it to keep forking new upcall processes that aren't going to work. The question here is -- what behavior do you want on subsequent request_key invocations when one fails? How long do you want to wait before allowing a new upcall in this situation? -- Jeff Layton <jlayton@xxxxxxxxxx> -- To unsubscribe from this list: send the line "unsubscribe linux-cifs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html