On 11/3/24 13:32, "Kai Mäkisara (Kolumbus)" wrote:
On 31. Oct 2024, at 3.00, John Meneghini <jmeneghi@xxxxxxxxxx> wrote:
Be sure to clear was_reset when ever we clear pos_unknown. If we don't
then the code in st_chk_result() will turn on pos_unknown again.
STp->pos_unknown |= STp->device->was_reset;
This results in confusion as future commands fail unexpectedly.
This brings in my mind (again) the question: is the hack using was_reset set
by scsi_error to detect device reset necessary any more? It was introduced
as a temporary method somewhere between 1.3.20 and 1.3.30 (in 1995)
when the POR (power on/reset) UAs (Unit Attention) were not passed to st.
The worst problem with this hack is clearing was_reset. St should not clear
something set by error handling (layering violation).
OK. I wasn't aware of the history here... but I agree we want to get rid of the layering violation.
Your earlier patch added code to st to set pos_unknown when a POR UA
was found. So, is this now alone enough to catch the resets?
As long at the tape device actually sends a UA, no I don't think we need the was_reset code anymore.
We should remove the device->was_reset code from st.c.
I now did some experiments with scsi_debug and in those experiments
reset initiated using sg_reset did result in st getting POR UA.
But this was just a simple and somewhat artificial test.
Yes, I've noticed that not all of the different emulators out there like MHVTL and scsi_debug
will reliably send a POR UA following sg_reset. Especially scsi_debug. The tape emulation in
scsi_debug is inadequate when it comes to resets AFAIAC.
I'll see if I can develop some further tests for MHVLT, but scsi_debug isn't worth the
trouble (for me) and I've told our QE group here at Red Hat to stop testing the st driver
with scsi_debug. Doing this has led to too many false positives and passing tests. That's
why I finally purchased a tape drive and started testing this myself.
If you want to remove the device->was_reset from the st driver I can help by running
the patch through my tests with the tape device.
--
John A. Meneghini
Senior Principal Platform Storage Engineer
RHEL SST - Platform Storage Group
jmeneghi@xxxxxxxxxx