Hi guys,
My use-case is somewhat unusual, so I thought I'd report these since
they may not get tripped under "normal" use. This is against 2.0.0rc1.
Setup is a shared root on GlusterFS/AFR with only one node (2nd node not
yet built).
Something weird seems to happen with some uses of ncurses libraries and
headers. If I try to compile the kernel using "make menuconfig" while
the root FS (including ncurses libraries and headers) is on GlusterFS,
it fails complaining that it couldn't find the headers. The exact same
system image (same tarball) expanded onto an ext3 file system doesn't
exhibit this problem. In both cases /usr/src/linux is on an NFS file
system, so since that is the same in both test cases, the problem seems
to be connected to the root FS. Unfortunately, I cannot seem to get the
logs out of the root FS mount since they are initially created on the
initrd which gets deallocated after booting. :(
A similar, but wider-ranging problem happens when installing nVidia
drivers from the nVidia supplied shell archives. It, too, fails to find
curses and falls back to text mode. But it also then fails to install
properly. It first complains about being unable to back up the files it
is replacing, then it complains that it couldn't install some of the
files. The files it places on the FS end up badly corrupted and are not
even identified as ELF files (including the kernel module it builds).
This works fine on the ext3 FS, and putting the same files into the
tarball and extracting them onto the Gluster root that way creates
the files correctly.
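
For anyone trying to reproduce this, a quick way to tell whether an installed file is still a valid ELF object is to look at its first four bytes (0x7f 'E' 'L' 'F'). This is only a sketch; the function name is_elf is made up for illustration, and it checks nothing beyond the magic number.

```shell
#!/bin/sh
# Hypothetical helper (not part of GlusterFS or the nVidia installer):
# succeeds if the file starts with the ELF magic bytes 7f 45 4c 46.
is_elf() {
    # Dump the first four bytes as hex and strip all whitespace.
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d '[:space:]')
    [ "$magic" = "7f454c46" ]
}
```

Running something like "is_elf /path/to/installed/file || echo corrupted" against the same file on the ext3 root and on the GlusterFS root would show whether the header itself is being damaged.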
I know this is hard to troubleshoot without logs, but those are
difficult to get when the glusterfs daemon gets started before the
root FS gets properly mounted. Any update on when syslog logging might
be available? That would make this much easier, as the logs could then
be redirected to a remote syslog.
Another thing that I have found is that just after mounting the gluster
fs that holds the real root fs (the one that gets chroot-ed to just
before the real init gets called), the fs is initially inaccessible for
a little while. If the root setup script mounts glusterfs, chroots to it
and runs init, the whole process fails because it cannot find a vital
directory it checks for in the gluster root fs it just mounted. In this
particular case (just in case it's relevant), it's the /cluster
directory in the glusterfs root, and the script tries to execute a
bind-mount from /cluster/cdsl/2 to /cdsl.local. This fails.
However, if I add something like
ls -la /path/to/glusterfs-root/mountpoint/cluster > /dev/null
just after mounting the gluster fs, it succeeds and proceeds to mount,
chroot and init correctly.
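
A slightly more defensive version of the same workaround would be to poll for the directory a few times before giving up. This is a rough sketch only: wait_for_dir, the retry count and the delay are all made-up names and values, and it assumes the initrd shell supports POSIX arithmetic expansion.

```shell
#!/bin/sh
# Hypothetical helper for the root setup script: retry until a directory
# on the freshly mounted fs becomes visible.
# Usage: wait_for_dir DIR [TRIES] [DELAY_SECONDS]
wait_for_dir() {
    dir=$1
    tries=${2:-10}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Listing the directory forces a lookup, like the ls workaround.
        if ls -la "$dir" > /dev/null 2>&1 && [ -d "$dir" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

The setup script could then call "wait_for_dir /path/to/glusterfs-root/mountpoint/cluster 10 1" right after the mount and only go on to the bind-mount, chroot and init if it returns success.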
This seems to be related or similar to a bug somebody reported before
where the gluster fs is inaccessible on the first access attempt after
mounting.
Gordan