Call for Submissions IO500 ISC23

IO500 Committee <committee@xxxxxxxxx> · Thu, 30 Mar 2023 17:48:11 -0600

Stabilization Period: Monday, April 3rd - Friday, April 14th, 2023
Submission Deadline: Tuesday, May 16th, 2023 AoE

The IO500 is now accepting and encouraging submissions for the upcoming 
12th semi-annual IO500 list, in conjunction with ISC23. Once again, we 
are also accepting submissions to the 10 Client Node Challenge to 
encourage the submission of small scale results. The new ranked lists 
will be announced at the ISC23 BoF [1]. We hope to see many new 
results.

What's New
1. Creation of Production and Research Lists - At ISC'22, we 
proposed splitting the list into separate Production and Research 
lists.  This better reflects the important distinction between storage 
systems that run in production environments and those that may use more 
experimental hardware and software configurations.  At ISC23, we will 
formally create these two lists and users will be able to submit to 
either of the two lists (and their 10 client-node counterparts).  Please 
see the requirements for each list on the IO500 rules page [3].
2. New Submission Tool - There is now a new IO500 submission tool that 
improves the overall submission experience.  Users can create accounts 
and then update and manage all of their submissions through that 
account.  As part of this new tool, we have improved the submission 
fields that describe the hardware and software of the system under test. 
For reproducibility and analysis reasons, the easily obtainable fields 
are now mandatory. Because data about storage servers is often difficult 
for users to obtain, most of those fields remain optional. As this is a 
new system, there may be quirks; please reach out on Slack or the mailing 
list if you see any issues.  Further details will be released on the 
submission page [2].
3. Reproducibility - Every submission will now receive a reproducibility 
score based upon the provided system details and the reproducibility 
questionnaire. This score will inform the community about the amount of 
detail provided in the submission and the obtainability of the storage 
system. Further, this score will be used to evaluate whether a 
submission is eligible for the Production list.
4. New Phases - We are continuing to evaluate the inclusion of optional 
test phases for additional key workloads - split easy/hard find phases, 
4KB and 1MB random read/write phases, and concurrent metadata 
operations. This is called an extended run. At the moment, we collect 
results from both a standard run and an extended run, to verify that the 
additional phases do not significantly impact the standard results and 
to facilitate comparisons between the existing and new benchmark phases. 
In a future 
release, we may include some or all of these results as part of the 
standard benchmark. The extended results are not currently included in 
the scoring of any ranked list.

Background
The benchmark suite is designed to be easy to run and the community has 
multiple active support channels to help with any questions. Please note 
that submissions of all sizes are welcome; the site has customizable 
sorting, so it is possible to submit on a small system and still get a 
very good per-client score, for example. Additionally, the list is about 
much more than just the raw rank; all submissions help the community by 
collecting and publishing a wider corpus of data. More details below.

Following the success of the Top500 in collecting and analyzing 
historical trends in supercomputer technology and evolution, the IO500 
was created in 2017, published its first list at SC17, and has grown 
continually since then. The need for such an initiative has long been 
known within High-Performance Computing; however, defining appropriate 
benchmarks has long been challenging. Despite this challenge, the 
community, after long and spirited discussion, finally reached consensus 
on a suite of benchmarks and a metric for resolving the scores into a 
single ranking.

The multi-fold goals of the benchmark suite are as follows:
1. Maximizing simplicity in running the benchmark suite
2. Encouraging optimization and documentation of tuning parameters for 
performance
3. Allowing submitters to highlight their "hero run" performance numbers
4. Forcing submitters to simultaneously report performance for 
challenging IO patterns.

Specifically, the benchmark suite includes a hero run of both IOR and 
mdtest, configured however possible to maximize performance and establish 
an upper bound on performance. It also includes an IOR and mdtest run 
with highly prescribed parameters in an attempt to determine a lower 
performance bound. Finally, it includes a namespace search as this has 
been determined to be a highly sought-after feature in HPC storage 
systems that has historically not been well-measured. Submitters are 
encouraged to share their tuning insights for publication.
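
As a rough illustration of how per-phase results can be resolved into a 
single score, here is a minimal Python sketch. It assumes the 
geometric-mean scoring described on the IO500 rules page [3]: the 
bandwidth phases (GiB/s) and the metadata phases (kIOPS) are each 
reduced to a geometric mean, and the final score is the geometric mean 
of those two values. This email does not spell out the formula, and the 
function names and sample numbers below are hypothetical, so treat this 
as a sketch rather than the official scoring code.

```python
import math


def geometric_mean(values):
    """Geometric mean of a non-empty sequence of positive numbers."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one positive value")
    # Averaging logarithms avoids overflow from multiplying many values.
    return math.exp(sum(math.log(v) for v in values) / len(values))


def combined_score(bandwidth_gib_s, metadata_kiops):
    """Return (bandwidth score, metadata score, overall score).

    Assumption: the overall score is sqrt(bandwidth score * metadata score),
    i.e. the geometric mean of the two sub-scores.
    """
    bw = geometric_mean(bandwidth_gib_s)
    md = geometric_mean(metadata_kiops)
    return bw, md, math.sqrt(bw * md)


# Made-up phase results, for illustration only (not from any real system).
bw_phases = [4.0, 1.0, 8.0, 2.0]          # e.g. easy/hard write and read, GiB/s
md_phases = [100.0, 25.0, 400.0, 100.0]   # e.g. metadata and find phases, kIOPS

bw, md, total = combined_score(bw_phases, md_phases)
per_client = total / 10                   # per-client view for a 10-node run
print(f"bandwidth={bw:.2f} GiB/s  metadata={md:.2f} kIOPS  score={total:.2f}")
print(f"score per client node={per_client:.2f}")
```

Because a geometric mean is pulled down sharply by any single weak 
phase, a system cannot score well on its hero numbers alone, which 
matches the stated goal of reporting easy and hard patterns together.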

The goals of the community are also multi-fold:
1. Gather historical data for the sake of analysis and to aid 
predictions of storage futures
2. Collect tuning information to share valuable performance 
optimizations across the community
3. Encourage vendors and designers to optimize for workloads beyond 
"hero runs"
4. Establish bounded expectations for users, procurers, and 
administrators

The IO500 follows a two-stage approach. First, there will be a two-week 
stabilization period during which we encourage the community to verify 
that the benchmark runs properly on a variety of storage systems. During 
this period the benchmark may be updated based upon feedback from the 
community. The final benchmark will then be released. We expect that 
rule-compliant runs made during the stabilization period will be valid 
as final submissions unless a significant defect is found.

10 Client Node I/O Challenge
The 10 Client Node Challenge uses the regular IO500 benchmark, with the 
additional rule that exactly 10 client nodes must be used to run the 
benchmark. You may use any shared storage with any 
number of servers. We will announce the results in the Production and 
Research lists as well as in separate derived lists.

Birds-of-a-Feather
Once again, we encourage you to submit [2] to join our community, and to 
attend the ISC23 BoF [1], where we will announce the new IO500 
Production and Research lists and their 10 client node counterparts. The 
current list includes results from 20 different storage system types 
and 70 institutions. We hope that the upcoming list grows even more.

[1] https://io500.org/pages/bof-isc23
[2] https://io500.org/submission
[3] https://io500.org/rules-submission

--
The IO500 Committee
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx