> [ ... ] entitled to point out how tenuous your thread of logic
> is - if he didn't I was going to say exactly the same thing

You are both entitled to your opinions, but not to have them go
unchallenged, especially when they are bare assertions.

> based entirely on a house of assumptions you haven't actually
> verified.

That seems highly imaginative, as the guesses behind my conclusion
that:

http://oss.sgi.com/archives/xfs/2014-10/msg00335.html
> This issue should be moved to the 'linux-raid' mailing list as
> from the reported information it has nothing to do with XFS.
=============================

were factually based:

http://oss.sgi.com/archives/xfs/2014-10/msg00335.html
> There is a ratio of 31 (thirty one) between 'swidth' and
> 'sunit' and assuming that this reflects the geometry of the
> RAID5 set and given commonly available disk sizes it can be
> guessed that with amazing "bravery" someone has configured a
> RAID5 out of 32 (thirty two) high capacity/low IOPS 3TB
> drives, or something similar.

That there is a ratio of 31 is a verified fact, and so is the
reported size of the block device being 100TB. Much of the rest is
arithmetic, and I indicated that there was some guesswork involved,
mostly in assuming that those reported facts were descriptive of the
actual configuration.

Simply for brevity I did not also point out specifically that the
reported facts of «high r_await(160) and w_await(200000)» and the
"Subject:" of «very high Average Read/Write Request Time» contributed
to indicating a (big) issue with the storage layer, and that the
presumed array width of 32 is congruent with typical enclosure
capacities.

Another poster went much further in guesswork, and stated as obvious
facts what I had described as guesses:

http://oss.sgi.com/archives/xfs/2014-10/msg00337.html
> As others mentioned this isn't an XFS problem. The problem is that
> your RAID geometry doesn't match your workload. Your very wide
> parity stripe is apparently causing excessive seeking with your
> read+write workload due to read-modify-write operations.

and went on to build a whole discussion, wholly unrelated to XFS, on
that basis:

> To mitigate this, and to increase resiliency, you should
> switch to RAID6 with a smaller chunk. If you need maximum
> capacity make a single RAID6 array with 16 KiB chunk size.
> This will yield a 496 KiB stripe width, increasing the odds
> that all writes are a full stripe, and hopefully eliminating
> much of the RMW problem.

> A better option might be making three 10 drive RAID6 arrays
> (two spares) with 32 KiB chunk, 256 KiB stripe width, and
> concatenating the 3 arrays with mdadm --linear.

The above assumptions and offtopic suggestions have gone
unquestioned, by myself too, even if I disagree with some of the
recommendations, also because I think them premature: we don't know
what the requirements really are beyond what can be guessed from «the
reported information». That is also why I suggested continuing the
discussion on the Linux RAID list.

The guess that the filesystem was meant to be an object store is also
based on a verified fact:

> if the device name "/data/fhgfs/fhgfs_storage" is descriptive,
> this "brave" RAID5 set is supposed to hold the object storage
> layer of a BeeFS

Also, BP did not initially question my analysis of the 100TB
filesystem case, but asked a wholly separate question, asking me to
explain this aside:

> the object storage layer of a BeeFS highly parallel filesystem,
> and therefore will likely have mostly-random accesses.
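Since the arithmetic behind those guesses keeps being questioned,
here is a minimal sketch of it (Python, purely illustrative; the
variable names are mine, only the 31 ratio and the ~100TB device size
come from «the reported information», and the premise that
swidth/sunit reflects the number of RAID5 data drives is exactly the
guess stated in the quote above, not a verified fact):

# Sketch of the arithmetic, using only the two reported facts.
RATIO_SWIDTH_SUNIT = 31      # swidth/sunit ratio from the reported geometry
DEVICE_SIZE_TB = 100         # reported block device size, roughly

# Guess: for a RAID5 set the swidth/sunit ratio equals the number of
# data drives, so one more drive for parity gives the set width.
data_drives = RATIO_SWIDTH_SUNIT
total_drives = data_drives + 1
per_drive_tb = DEVICE_SIZE_TB / data_drives

print(f"guessed RAID5 set: {total_drives} drives of ~{per_drive_tb:.1f}TB each")
# -> 32 drives of ~3.2TB each, i.e. commodity "3TB" high capacity drives

# Cross-check of the figures in the quoted RAID6 suggestions, where
# stripe width = (member drives - 2 parity) * chunk size:
assert (10 - 2) * 32 == 256  # three 10-drive RAID6 sets, 32KiB chunk -> 256KiB
print(496 // 16)             # 496KiB/16KiB implies 31 data chunks per stripe

None of this proves anything about the actual hardware, which is
exactly why asking for more details, on the Linux RAID list, seemed
the useful next step.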
To that question I provided a reasonable and detailed *technical*
explanation, both as to the specific case and in general, linking it
both to the original question by QH and to the list topic, which is
XFS.

As a reminder, this thread seems to me to contain 3 distinct, even if
connected, *technical* topics:

* Whether the report about the 100TB RAID based XFS filesystem
  contained evidence indicating an XFS issue or a RAID issue; this
  was introduced by QH.

* Whether concurrent randomish read-writes tend to be the workload
  observed by object stores in large parallel HPC systems; this was
  introduced by BP.

* Whether concurrent randomish read-writes would happen in the use of
  that specific filesystem as an object store; this was introduced by
  myself to link QH's original question to BP's new question, because
  strictly speaking BP's question seemed to me offtopic on the XFS
  mailing list.

Then BP seemed to switch topics again by mentioning 1- and 2-threaded
read-writes in the context of the general issue of the access
patterns of large parallel HPC filesystem object stores, and that
seemed strange to me, as I commented, so I ignored it.

> appropriate response would be to ask the OP to describe their
> workload and storage in more detail

Indeed, and I suggested moving the discussion to the Linux RAID
mailing list for that purpose, because the evidence quoted above
seemed to indicate that a 32-wide RAID5 was involved, as in:

> This issue should be moved to the 'linux-raid' mailing list as
> from the reported information it has nothing to do with XFS.
=============================

This left QH free either to report more information, as someone
asked, indicating that the issue was more relevant to the XFS list
than to the Linux RAID list, or to move to the Linux RAID list with
more details.

Again, the suggestion to continue the discussion on another list that
seemed more useful to QH was based on simple inferences from 3
reported facts: the 31 ratio, the 100TB size, and the "fast" single
threaded speed vs. slow concurrent read/write speeds (plus the
concurrently high wait times).

You and BP are entitled to think those are not good guesses (just as
SH instead took them as good ones), and it would be interesting if
you provided a substantive reason why the suggestion to continue the
discussion on the Linux RAID list was inappropriate, but you haven't
contributed any other than your say-so.

Also, while suggestions have been made to QH by different people to
provide more details and/or move the discussion to the Linux RAID
list, notably this has not happened yet.

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs