Path canonicalizer: how to integrate?

Hi all,

In order to make our FilePermission checks work properly it is
necessary to canonicalize filenames.  Classpath's canonicalizer
does not handle symbolic links at all; GCJ does, but in an all-
or-nothing way and using a function that returns different results
on different systems.
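
To make the symbolic-link problem concrete, here is a minimal sketch that is separate from the attached canonicalizer. The helper names same_file and symlink_alias_demo and the /tmp/canon-demo-XXXXXX template are invented for illustration. It creates a file and a symlink to it, then checks that the two names differ as strings yet share a device and inode, which is the case a purely textual canonicalizer cannot detect.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if both paths resolve to the same
   file (same device and inode), 0 if not, -1 on error. */
int
same_file(const char *a, const char *b)
{
  struct stat sa, sb;

  if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
    return -1;
  return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Hypothetical demo: creates DIR/real and a symlink DIR/alias -> real
   in a fresh temporary directory.  Returns 1 if the two names differ
   as strings but refer to the same file, 0 if not, -1 on error. */
int
symlink_alias_demo(void)
{
  char dir[] = "/tmp/canon-demo-XXXXXX";
  char real[64], alias[64];
  FILE *fp;
  int result = -1;

  if (mkdtemp(dir) == NULL)
    return -1;
  snprintf(real, sizeof real, "%s/real", dir);
  snprintf(alias, sizeof alias, "%s/alias", dir);

  fp = fopen(real, "w");
  if (fp != NULL)
    {
      fclose(fp);
      if (symlink("real", alias) == 0)
        {
          result = strcmp(real, alias) != 0 && same_file(real, alias) == 1;
          unlink(alias);
        }
      unlink(real);
    }
  rmdir(dir);
  return result;
}
```

A permission granted on .../real but checked against the unresolved name .../alias would not match textually, even though same_file() reports that both names reach the same file.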

I've written a POSIX path canonicalizer in C that mimics (and in
at least one case improves upon) the behaviour of a proprietary
JVM, but I don't know enough about how Classpath builds C stuff
to be able to see how to build it.  It needs to be invoked from (or
just replace) gnu.java.io.PlatformHelper.toCanonicalForm().

GCJ has separate implementations for POSIX and Windows, and I'd
like to do the same for Classpath.  This code is about as critical
as it gets from a security standpoint, so it's vital it's easy to
understand.  It's complicated enough already without cluttering it
with stuff to deal with different separators and drive letters.
For non-POSIX systems we could either fall back to the current
implementation or take GCJ's.

All that is a long-winded way of saying "how do I build this, what
file should I put it in, and how do I make it so that stuff is built
differently on POSIX and non-POSIX?"

Cheers,
Gary
-------------- next part --------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

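#include <limits.h>

/* Use the system limit where one is defined; fall back otherwise. */
#ifdef PATH_MAX
#define MAXPATHLEN PATH_MAX
#else
#define MAXPATHLEN 256 /* XXX Get this from somewhere */
#endif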

char *
getCanonicalPath(const char *path)
{
  char src[MAXPATHLEN], dst[MAXPATHLEN];
  int srci, dsti, dsti_save;
  int len, tmpi;
  int fschecks = 1;
  struct stat sb;

  /* It is the caller's responsibility to ensure the path is absolute. */
  len = strlen(path);
  if (len == 0 || path[0] != '/')
    return NULL; /* XXX throw RuntimeException */

  /* XXX Presumably the argument to this function will be a Java
     String, so this bit will be replaced by some call to extract
     its UTF-8 representation.  The length must be computed before
     this check, otherwise len is read uninitialized. */
  if (len >= MAXPATHLEN)
    return NULL; /* XXX throw IOException */
  strcpy(src, path);

  dst[0] = '/';
  dst[1] = '\0';
  dsti = 1;

  srci = 1;

  while (src[srci] != '\0')
    {
      /* Skip slashes. */
      while (src[srci] == '/')
	srci++;
      tmpi = srci;
      /* Find next slash. */
      while (src[srci] != '/' && src[srci] != '\0')
	srci++;
      if (srci == tmpi)
	/* We hit the end. */
	break;
      len = srci - tmpi;

      /* Handle "." and "..". */
      if (len == 1 && src[tmpi] == '.')
	continue;
      if (len == 2 && src[tmpi] == '.' && src[tmpi + 1] == '.')
	{
	  if (dsti == 1)
	    /* Unlike other JVMs we do not rewind past the root
	       directory.  I can't see any legitimate reason why you
	       would want this, yet chopping off pieces of path seems
	       like a sure-fire way to introduce vulnerabilities. */
	    return NULL; /* XXX throw IOException */
	  while (dsti > 1 && dst[dsti - 1] != '/')
	    dsti--;
	  if (dsti != 1)
	    dsti--;
	  /* Reenable filesystem checking if disabled: we might have
	     reversed over whatever caused the problem before.  At
	     least one proprietary JVM has inconsistencies because it
	     does not do this. */
	  fschecks = 1;
	  continue;
	}

      /* Handle real path components. */
      if (dsti + len + 1 >= MAXPATHLEN)
	return NULL; /* XXX throw IOException */
      dsti_save = dsti;
      if (dsti > 1)
	dst[dsti++] = '/';
      strncpy(&dst[dsti], &src[tmpi], len);
      dsti += len;
      if (fschecks == 0)
	continue;
	      
      dst[dsti] = '\0';
      if (lstat(dst, &sb) == 0)
	{
	  if (S_ISLNK(sb.st_mode))
	    {
	      char tmp[MAXPATHLEN];

	      tmpi = readlink(dst, tmp, MAXPATHLEN);
	      if (tmpi < 1 || tmpi == MAXPATHLEN)
		return NULL; /* XXX throw IOException */

	      /* Prepend the link's path to src. */
	      if (tmpi + strlen(&src[srci]) >= MAXPATHLEN)
		return NULL; /* XXX throw IOException */
	      while (src[srci] != '\0')
		tmp[tmpi++] = src[srci++];
	      tmp[tmpi] = '\0';
	      strcpy(src, tmp);
	      srci = 0;

	      /* Either replace or append dst depending on whether the
		 link is relative or absolute. */
	      dsti = tmp[0] == '/' ? 1 : dsti_save;
	    }
	}
      else
	{
	  /* Something doesn't exist, or we don't have permission to
	     read it, or a previous path component is not a directory, or
	     a symlink is looped.  Whatever, we can't check the
	     filesystem any more. */
	  fschecks = 0;
	}
    }
  dst[dsti] = '\0';

  /* XXX Presumably this bit will be replaced by some call to
     convert the array of UTF-8 bytes into a Java String. */
  return strdup(dst);
}

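int
main(int argc, char *argv[])
{
  int i;

  for (i = 1; i < argc; i++)
    {
      char *canon = getCanonicalPath(argv[i]);

      /* Passing NULL to %s is undefined, so report failures explicitly. */
      printf("%s -> %s\n", argv[i], canon != NULL ? canon : "(error)");
      free(canon);
    }
  return 0;
}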
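
One case getCanonicalPath() does not bound is the number of symlink expansions. A link that points to itself (say loop -> loop) passes lstat() every time, because lstat() does not follow the final component, so the main loop would keep re-expanding it. The following is a minimal sketch of a hop limit, written as a standalone function rather than as a patch to the code above. The names follow_links_bounded, DEMO_PATHLEN and DEMO_MAX_HOPS are hypothetical, the limit of 8 is arbitrary, and only the final path component is followed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_PATHLEN 256  /* hypothetical buffer size */
#define DEMO_MAX_HOPS 8   /* hypothetical limit on link expansions */

/* Follow the chain of symlinks starting at PATH (final component
   only), giving up after DEMO_MAX_HOPS expansions.  OUT must hold at
   least DEMO_PATHLEN bytes.  Returns 0 and fills OUT with the first
   path that is not a symlink (it need not exist), or returns -1 with
   errno set; errno is ELOOP when the hop limit is exceeded. */
int
follow_links_bounded(const char *path, char *out)
{
  char cur[DEMO_PATHLEN], target[DEMO_PATHLEN], next[DEMO_PATHLEN];
  struct stat sb;
  int hops = 0;

  if (strlen(path) >= DEMO_PATHLEN)
    {
      errno = ENAMETOOLONG;
      return -1;
    }
  strcpy(cur, path);

  while (lstat(cur, &sb) == 0 && S_ISLNK(sb.st_mode))
    {
      ssize_t n;
      size_t dirlen = 0;
      char *slash;

      if (++hops > DEMO_MAX_HOPS)
        {
          errno = ELOOP;
          return -1;
        }

      n = readlink(cur, target, sizeof target - 1);
      if (n < 0)
        return -1;
      if ((size_t) n >= sizeof target - 1)
        {
          /* The target may have been truncated. */
          errno = ENAMETOOLONG;
          return -1;
        }
      target[n] = '\0';

      /* A relative target is resolved against the link's directory. */
      slash = strrchr(cur, '/');
      if (target[0] != '/' && slash != NULL)
        dirlen = (size_t) (slash - cur) + 1;
      if (dirlen + (size_t) n >= DEMO_PATHLEN)
        {
          errno = ENAMETOOLONG;
          return -1;
        }
      memcpy(next, cur, dirlen);
      strcpy(next + dirlen, target);
      strcpy(cur, next);
    }

  strcpy(out, cur);
  return 0;
}
```

In getCanonicalPath() the equivalent would presumably be a counter incremented each time the S_ISLNK branch is taken, returning NULL once it passes the limit.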
