Hi all. I have been running FS sanity on daily builds (glusterfs mounts only at this point) for a few days and I have been hitting a couple of problems:
================ final pass/fail report =================
Test Date: Sat Jul 5 01:53:00 EDT 2014
Total : [44]
Passed: [41]
Failed: [3]
Abort : [0]
Crash : [0]
---------------------------------------------------------
[ PASS ] FS Sanity Setup
[ PASS ] Running tests.
[ PASS ] FS SANITY TEST - arequal
[ PASS ] FS SANITY LOG SCAN - arequal
[ PASS ] FS SANITY LOG SCAN - bonnie
[ PASS ] FS SANITY TEST - glusterfs_build
[ PASS ] FS SANITY LOG SCAN - glusterfs_build
[ PASS ] FS SANITY TEST - compile_kernel
[ PASS ] FS SANITY LOG SCAN - compile_kernel
[ PASS ] FS SANITY TEST - dbench
[ PASS ] FS SANITY LOG SCAN - dbench
[ PASS ] FS SANITY TEST - dd
[ PASS ] FS SANITY LOG SCAN - dd
[ PASS ] FS SANITY TEST - ffsb
[ PASS ] FS SANITY LOG SCAN - ffsb
[ PASS ] FS SANITY TEST - fileop
[ PASS ] FS SANITY LOG SCAN - fileop
[ PASS ] FS SANITY TEST - fsx
[ PASS ] FS SANITY LOG SCAN - fsx
[ PASS ] FS SANITY LOG SCAN - fs_mark
[ PASS ] FS SANITY TEST - iozone
[ PASS ] FS SANITY LOG SCAN - iozone
[ PASS ] FS SANITY TEST - locks
[ PASS ] FS SANITY LOG SCAN - locks
[ PASS ] FS SANITY TEST - ltp
[ PASS ] FS SANITY LOG SCAN - ltp
[ PASS ] FS SANITY TEST - multiple_files
[ PASS ] FS SANITY LOG SCAN - multiple_files
[ PASS ] FS SANITY TEST - posix_compliance
[ PASS ] FS SANITY LOG SCAN - posix_compliance
[ PASS ] FS SANITY TEST - postmark
[ PASS ] FS SANITY LOG SCAN - postmark
[ PASS ] FS SANITY TEST - read_large
[ PASS ] FS SANITY LOG SCAN - read_large
[ PASS ] FS SANITY TEST - rpc
[ PASS ] FS SANITY LOG SCAN - rpc
[ PASS ] FS SANITY TEST - syscallbench
[ PASS ] FS SANITY LOG SCAN - syscallbench
[ PASS ] FS SANITY TEST - tiobench
[ PASS ] FS SANITY LOG SCAN - tiobench
[ PASS ] FS Sanity Cleanup
[ FAIL ] FS SANITY TEST - bonnie
[ FAIL ] FS SANITY TEST - fs_mark
[ FAIL ] /rhs-tests/beaker/rhs/auto-tests/components/sanity/fs-sanity-tests-v2
Bonnie++ is just very slow (it has been running for 10+ hours on a single 16 GB file) and fs_mark has been failing. The Bonnie++ slowness is in the rewrite phase; here is the best explanation I can find of it:
https://blogs.oracle.com/roch/entry/decoding_bonnie
Rewriting...done
This gets a little interesting. It actually reads 8K, lseeks back to the start of the block, overwrites the 8K with new data and loops. (See the article for more.)
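For anyone who wants to reproduce that access pattern outside of Bonnie++, here is a rough sketch of the loop as described in the article. This is not Bonnie++'s actual code: the function name `rewrite_pass` and the way the block is modified (flipping its first byte) are my own assumptions for illustration. Only the read 8K / lseek back / overwrite / loop sequence comes from the description above.

```python
import os

BLOCK_SIZE = 8192  # 8K blocks, as in the description above


def rewrite_pass(path, block_size=BLOCK_SIZE):
    """Read each block, seek back to its start, overwrite it, and move on.

    Returns the number of blocks rewritten. Hypothetical helper, not part
    of Bonnie++ itself.
    """
    blocks = 0
    fd = os.open(path, os.O_RDWR)
    try:
        while True:
            data = os.read(fd, block_size)
            if not data:
                break  # end of file
            # Assumption: flip the first byte so the block carries new data.
            new_data = bytes([data[0] ^ 0xFF]) + data[1:]
            # Go back to the start of the block that was just read.
            os.lseek(fd, -len(data), os.SEEK_CUR)
            os.write(fd, new_data)
            blocks += 1
    finally:
        os.close(fd)
    return blocks
```

With 8K blocks, a 16 GB file works out to about two million read/seek/write cycles, so any per-operation latency on the mount adds up quickly.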
On FS mark I am seeing:
# fs_mark -d . -D 4 -t 4 -S 5
# Version 3.3, 4 thread(s) starting at Sat Jul 5 00:54:00 2014
# Sync method: POST: Reopen and fsync() each file in order after main write loop.
# Directories: Time based hash between directories across 4 subdirectories with 180 seconds per subdirectory.
# File names: 40 bytes long, (16 initial bytes of time stamp with 24 random bytes at end of name)
# Files info: size 51200 bytes, written with an IO size of 16384 bytes per write
# App overhead is time in microseconds spent in the test not doing file writing related system calls.

FSUse% Count Size Files/sec App Overhead
Error in unlink of ./00/53b784e8~~~~~~~~SKZ0QS9BO7O2EG1DIFQLRDYY : No such file or directory
fopen failed to open: fs_log.txt.26676
fs-mark pass # 5 failed

I am working on reporting, so look for a daily status report email from my Jenkins server soon. How do we want to handle failures like this moving forward? Should I just open a BZ after I triage? Do you guys open a new BZ for every failure in the normal regression tests?

-b
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel