Re: Recent unexplained quota problems

Ryan Golhar <golharam@xxxxxxxxx> · Mon, 17 Sep 2007 10:07:14 -0400

I'm running a RHEL v3 server, completely up to date...

I tried to edit a user's quota (as root) using the command
'/usr/sbin/edquota someuser' and I got the error:

edquota: Can't open quotafile /home/aquota.user: Read-only file system
No filesystems with quota detected.

Doing a listing of aquota.user reports:

[root@server log]# ll /home/aquota.user
-rw-------    1 root     root        15360 Sep  9 04:22 /home/aquota.user

/etc/fstab has:
LABEL=/home            /home          ext3    defaults,usrquota 1 2

A listing of /home shows:
drwxr-xr-x  133 root     root         4096 Sep  7 12:49 home

If I try to 'touch test' in /home I get:
[root@server home]# touch test
touch: creating `test': Read-only file system

I rebooted the server and everything seems to be okay.  I'm a little
concerned about this though because I can't explain it.

Chris St. Pierre wrote:

You should be concerned about this.  The kernel will change a
filesystem to read-only when it detects an IO error against that FS.
This can happen for a number of reasons:

  - Your connection to your SAN dropped;
  - Your hard drive(s) are dying;
  - You have significant data corruption;
  - and on and on...

Except for the first reason I listed, all of the other reasons I know
of are Real Bad.

If you're lucky, you've got some minor data corruption that caused the
kernel to try to write beyond the end of the drive or something like
that; you should try running fsck on the filesystem first.  Be warned,
though, that if you have significant data corruption, fsck may
completely hose the filesystem, so get as good a backup as you can first.

You should check /var/log/messages for kernel messages about this.  If
it happens again, dmesg will also have useful information (at least,
it will until you reboot).
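
For what it's worth, that log check can be sketched as a small shell
function.  The name ext3_kernel_messages is made up for illustration, and
the pattern assumes the messages are tagged "kernel: EXT3-fs" the way
syslog writes them on an ext3 system; adjust it for other filesystems.

```shell
# Print kernel EXT3 messages found in a syslog-style file.
# $1: log file to search (defaults to /var/log/messages).
# Exits non-zero when nothing matches, like grep itself.
ext3_kernel_messages() {
    grep -F 'kernel: EXT3-fs' "${1:-/var/log/messages}"
}

# Typical use (as root, since the log is usually not world-readable):
#   ext3_kernel_messages /var/log/messages
#   ext3_kernel_messages /var/log/messages.1
# Before a reboot, the same pattern can be applied to the kernel ring buffer:
#   dmesg | grep -F 'EXT3-fs'
```

Rotated logs (messages.1 and so on) are worth searching as well, since the
first error may be older than the current file.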

If the problem is transient, a simple userspace mount call will fix
it:

mount -o remount,rw,usrquota /home

But that's a gamble.
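
If you do try the remount, it is worth confirming that the filesystem
really accepts writes afterwards rather than trusting the exit status
alone.  A minimal sketch, where fs_accepts_writes is a hypothetical helper
name and the probe simply creates and removes a temporary file in the
directory:

```shell
# Return 0 if a file can be created in directory $1, non-zero otherwise.
# The probe file is removed again immediately.
fs_accepts_writes() {
    probe=$(mktemp "$1/.rwprobe.XXXXXX" 2>/dev/null) || return 1
    rm -f "$probe"
}

# Typical use after the remount attempt:
#   mount -o remount,rw,usrquota /home && fs_accepts_writes /home \
#       && echo "/home is writable again"
```

This is the same check as the 'touch test' in the original report, only
scripted so it cleans up after itself.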

Despite what other posters have said, when the kernel changes the
status of the volume, it does so using kernel-level tools, _not_
userspace mount calls, so the options shown by the mount(1) command
will _not_ reflect the read-only status of the drive.  If mount(1)
shows that the drive is 'ro', then a person or a program, not the
kernel, has mounted it read-only.
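
To see the kernel's own view instead, one option is to read /proc/mounts
directly.  The sketch below assumes that /proc/mounts reports the state the
kernel is actually enforcing, while mount(1) on a system of this vintage
prints what userspace recorded in /etc/mtab; kernel_ro_mounts is a name
invented for this example.

```shell
# Print the mount points that are flagged read-only in a mounts table.
# $1: file in /proc/mounts format (defaults to /proc/mounts).
# Field 2 is the mount point, field 4 the comma-separated option list.
# Only an exact "ro" option counts, so "errors=remount-ro" is ignored.
kernel_ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "${1:-/proc/mounts}"
}

# Typical use:
#   kernel_ro_mounts            # what the kernel enforces
#   kernel_ro_mounts /etc/mtab  # what mount(1) would report
```

A mount point that shows up in the first listing but not the second is one
the kernel switched to read-only behind userspace's back.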

The /home partition is just that - a partition on the local hard drive.
I don't think the hard drive is failing, as I haven't seen the usual I/O
error messages in the nightly log files indicating such, but I suppose
it's possible.

/var/log/messages shows:
Sep 16 04:22:28 aspartic kernel: EXT3-fs: ide0(3,3): couldn't remount RDWR because of unprocessed orphan inode list.  Please umount/remount instead.

I notice the time is at 4:22.  The nightly cron jobs start at 4am.

Looking at messages.1, I see:
Sep  9 20:53:49 aspartic shutdown: shutting down for system reboot
Sep  9 20:53:49 aspartic init: Switching to runlevel: 6
Sep  9 20:53:50 aspartic su(pam_unix)[19525]: session closed for user root
...
Sep  9 20:53:53 aspartic rpc.mountd: Caught signal 15, un-registering and exiting.
Sep  9 20:53:53 aspartic nfs: rpc.mountd shutdown succeeded
Sep  9 20:53:57 aspartic kernel: nfsd: last server has exited
Sep  9 20:53:57 aspartic kernel: nfsd: unexporting all filesystems
Sep  9 20:53:57 aspartic kernel: EXT3-fs error (device ide0(3,3)) in start_transaction: Readonly filesystem
Sep  9 20:53:57 aspartic kernel: EXT3-fs error (device ide0(3,3)) in ext3_delete_inode: Readonly filesystem
Sep  9 20:53:57 aspartic kernel: EXT3-fs error (device ide0(3,3)) in start_transaction: Readonly filesystem
Sep  9 20:53:57 aspartic kernel: EXT3-fs error (device ide0(3,3)) in ext3_delete_inode: Readonly filesystem
Sep  9 20:53:57 aspartic nfs: nfsd shutdown succeeded
Sep  9 20:53:58 aspartic nfs: rpc.rquotad shutdown succeeded
Sep  9 20:53:58 aspartic nfs: Shutting down NFS services:  succeeded

<After reboot>
Sep  9 20:55:00 aspartic fsck: /home:
Sep  9 20:55:00 aspartic fsck: Clearing orphaned inode 1409343 (uid=556, gid=500, mode=040700, size=4096)
Sep  9 20:55:01 aspartic fsck: /home: Clearing orphaned inode 1409458 (uid=556, gid=500, mode=0100700, size=617)
Sep  9 20:55:01 aspartic fsck: /home: clean, 116383/3850240 files, 5306683/7691118 blocks

It looks like the hard drive is going bad, but it's odd, because I'm
accustomed to seeing error messages in the nightly log when that happens.

--
redhat-list mailing list
unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
https://www.redhat.com/mailman/listinfo/redhat-list