Hi Tony,

On 7/28/22 17:12, Luck, Tony wrote:
>>>> Speculating myself as far as I understand IFS is not for factory
>>>> tests but for testing in the field since big cloud vendors have
>>>> found that sometimes there are hard to catch CPU defects which
>>>> they only find out by running statistics which show that certain
>>>> tasks only crash when run on machine a, socket b, core c.
>>>
>>> Who knows, Intel doesn't say so we can't really guess :(
>>
>> Right, for version 3 the commit message and ABI documentation changes
>> really need to clarify why multiple test-pattern files may be needed
>> much better. If possible please also include 1 or 2 _clear_ examples
>> of cases where more than 1 test-pattern file may be used.
>
> Sorry for the radio silence. We took Greg's suggestion to go back and
> think this out completely to heart. As he said, there is no rush to get
> this in. We need to do it right.

That (taking your time to get this right) is good to hear, thanks.

> Your summary above on how this works is completely correct.
>
> The reason for adding more files is to cover more transistors in the
> core. The base file that we started with gets mumble-mumble percent
> of the transistors checked. Adding a few more files will increase that
> quite significantly.
>
> So testing a system may look like:
>
>     for each scan file
>     do
>         load the scan file
>         for each core
>         do
>             test the core with this set of tests
>         done
>     done
>
> Our internal discussions on naming are following the same direction that
> you suggested, but likely even more restrictive. The "suffix" may just be
> a two-digit hex number (allowing for up to 256 files ... though for Sapphire
> Rapids we are looking at just 6).
>
> So our current direction is to name six "parts" something like this:
>
>     06-8f-06-00.scan
>     06-8f-06-01.scan
>     06-8f-06-02.scan
>     06-8f-06-03.scan
>     06-8f-06-04.scan
>     06-8f-06-05.scan
>
> but we are still checking to make sure this will work for future CPUs.
> Once we have something solid we will come back to the mailing list.

Thanks, this sounds good to me.

> As also suggested in an earlier thread we will change the name of the "reload"
> file (since skipping to a new file isn't a "reload"). The "load a scan file"
> step will write the "part" number to this new file.

Regards,

Hans
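
For illustration only, the naming scheme and the test loop quoted above could be sketched as below. This is a rough sketch, not the driver's interface: reading the first three fields of the file name as family, model and stepping is an assumption, and `scan_file_name`, `run_all_parts`, `load_part` and `test_core` are hypothetical names made up for this example. No sysfs file names are assumed, since those are still being decided.

```python
def scan_file_name(family, model, stepping, part):
    """Build a name such as 06-8f-06-00.scan.

    Assumption: every field is printed as a two-digit lowercase hex
    number, which limits the part number to 0..255 (256 files).
    """
    for value in (family, model, stepping, part):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"field out of two-digit hex range: {value}")
    return f"{family:02x}-{model:02x}-{stepping:02x}-{part:02x}.scan"


def run_all_parts(family, model, stepping, num_parts, cores, load_part, test_core):
    """Outer loop over scan files, inner loop over cores.

    load_part(part, name) and test_core(core) are hypothetical callbacks
    standing in for whatever mechanism ends up loading a part and
    triggering a test on one core.
    """
    results = {}
    for part in range(num_parts):
        name = scan_file_name(family, model, stepping, part)
        load_part(part, name)          # load the scan file
        for core in cores:             # test every core with this part
            results[(name, core)] = test_core(core)
    return results
```

With six parts, `[scan_file_name(6, 0x8f, 6, p) for p in range(6)]` gives the six names listed in the quoted mail.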