On Thu, Mar 10, 2011 at 11:25:38AM -0500, William Cohen wrote:
> Shared libraries are heavily used throughout Linux distributions.
> Unfortunately, there are cases of functions in the libraries having
> undefined behavior. Rather than immediately reporting the dependence on
> that undefined behavior, the applications may later fail in odd and
> seemingly random ways. One particular example of this problem is the
> memcpy function, which has undefined behavior when the source and
> destination regions overlap. This resulted in the following bug being
> filed about "Strange sound on mp3 flash website":
>
> http://koji.fedoraproject.org/koji/taskinfo?taskID=2898613
>
> The diagnosis of this problem was not straightforward because the memcpy
> silently corrupted the data in the copy. There are many other examples of
> this type of memcpy problem in bugzilla.
>
> What would be desirable is catching the dependency on undefined behavior
> when it occurs. The LD_PRELOAD environment variable allows wrappers for
> shared library functions to be inserted. These wrappers can do additional
> checks and flag those issues when they occur. The mutrace package in
> Fedora is one example of this approach. It makes use of this mechanism to
> instrument the mutex operations and can trigger a gdb breakpoint when a
> problem mutex operation occurs.
>
> I have taken the code in the mutrace package and made memstomp, which looks
> for memcpy calls on overlapping regions.
>
> git repo at:
>
> http://fedorapeople.org/gitweb?p=wcohen/public_git/memstomp;a=summary
>
> A Fedora scratch package RPM is at:
>
> http://koji.fedoraproject.org/koji/taskinfo?taskID=2898613
>
>
> Valgrind does check the arguments for memcpy (and does many other memory
> related checks). The main advantage of using specialized wrappers
> like memstomp is lower overhead. Most people are not willing to pay for
> the overhead that valgrind introduces (4x-100x slowdowns).
> The overhead for the memstomp wrappers should be low enough that it would
> be feasible to set LD_PRELOAD for Fedora alpha releases. This would make
> problems that depend on undefined behavior obvious, rather than spending a
> large amount of time trying to replicate the problem and then diagnosing
> it.

Nice, but I think the dlsym (NULL, "main") lookup should not be done, at
least not by default. We really don't want to encourage people to link
programs with -rdynamic, since that adds a runtime penalty.

Also, it would be nice if such a library checked more than just memcpy;
there are plenty of other commonly used calls which could be warned about.
Just to name a few from <string.h>:

  memcpy, strcpy, strncpy, strcat, strncat, strtok, strtok_r, mempcpy,
  strsep, stpcpy, stpncpy, memccpy

then for -D_FORTIFY_SOURCE also:

  __memcpy_chk, __mempcpy_chk, __strcpy_chk, __stpcpy_chk, __strncpy_chk,
  __stpncpy_chk, __strcat_chk, __strncat_chk

In <wchar.h>, e.g.:

  wcscpy, wcsncpy, wcscat, wcsncat, wcstok, wmemcpy, wmempcpy

and maybe:

  mbrtowc, wcrtomb, mbrlen, mbsrtowcs, wcsrtombs, mbsnrtowcs, wcsnrtombs,
  wcstol, wcstoul, wcstoll, wcstoull, ...

Maybe also sprintf/snprintf, if the format string contains some %s/%ls/%S
specifiers and those arguments overlap the target.

Basically, most of the __restrict/restrict qualified prototypes in glibc
headers would be good candidates for overlap tests (if it is possible to
determine the length).

In the implementation of the checking library you probably want to

  #include <sys/cdefs.h>
  #undef __restrict
  #define __restrict

and similarly for restrict and __restrict_arr, and compile the file with
-fno-builtin, to make sure gcc doesn't optimize your checks away based on
the arguments being restricted pointers.

	Jakub
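
For illustration, the overlap test discussed in this thread can be sketched
in a few lines of C. This is only a minimal sketch, not memstomp's actual
code: the names regions_overlap and checked_memcpy are hypothetical, and
falling back to memmove after reporting the problem is an assumption made
here so that the example stays well defined. It shows only the check itself,
not the LD_PRELOAD interposition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return 1 if the n-byte regions starting at a and b
 * share at least one byte, 0 otherwise. */
static int regions_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t distance = pa > pb ? pa - pb : pb - pa;

    if (n == 0)
        return 0;          /* empty regions cannot overlap */
    return distance < n;   /* the starts are closer together than the length */
}

/* Hypothetical checked copy: warn on stderr when the regions overlap.
 * Assumption: after warning, use memmove so the copy itself is defined. */
static void *checked_memcpy(void *dst, const void *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));

    if (regions_overlap(dst, src, n)) {
        fprintf(stderr, "memcpy(dst=%p, src=%p, n=%zu): regions overlap\n",
                dst, (void *)src, n);
        return memmove(dst, src, n);
    }
    return memcpy(dst, src, n);
}
```

A caller would use checked_memcpy (buf + 2, buf, 4) in place of memcpy and
get a warning on stderr for the overlapping case. As noted above, a file
like this would be compiled with -fno-builtin so that gcc does not replace
or drop the calls.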