Re: [SLOF] [PATCH v4] slof/fs/packages/disk-label.fs: improve checking for DOS boot partitions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 2024-04-05 13:46:44, Kautuk Consul wrote:
> Hi,
> 
> On 2024-04-05 14:46:14, Alexey Kardashevskiy wrote:
> > 
> > 
> > On Thu, 4 Apr 2024, at 18:18, Kautuk Consul wrote:
> > > On 2024-04-04 11:35:43, Alexey Kardashevskiy wrote:
> > > > First, sorry I am late into the discussion. Comments below.
> > > > 
> > > > 
> > > > On Thu, 28 Mar 2024, at 17:00, Kautuk Consul wrote:
> > > > > While testing with a qcow2 with a DOS boot partition it was found that
> > > > > when we set the logical_block_size in the guest XML to >512 then the
> > > > > boot would fail in the following interminable loop:
> > > > 
> > > > Why would anyone tweak this? And when you do, what happens inside the SLOF, does it keep using 512?
> > > Well, we had an image with DOS boot partition and we tested it with
> > > logical_block_size = 1024 and got this infinite loop.
> > 
> > This does not really answer to "why" ;)
> 
> Well, my point is that it is tweakable in virsh/qemu and maybe we should be
> handling this error in configuration in SLOF properly ? There shouldn't
> be the possibility of an interminable loop in the software anywhere,
> right ? :-)
> 
> > 
> > > In SLOF the block-size becomes what we configure in the
> > > logical_block_size parameter. This same issue doesn't arise with GPT.
> > 
> > How is GPT different in this regard?
> 
> In GPT the sector number 1 contains the GPT headers, not
> sector 0 as in the case of DOS boot partition. So if the block-size is
> incorrect, the GPT magic number itself isn't found and the "E3404: Not a
> bootable device!" error is immediately thrown.
> 
> > 
> > > > > <SNIP>
> > > > > Trying to load:  from: /pci@800000020000000/scsi@3 ...
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > </SNIP>
> > > > > 
> > > > > Change the "read-sector" Forth subroutine to throw an exception whenever
> > > > > it fails to read a full block-size length of sector from the disk.
> > > > 
> > > > Why not throwing an exception from the "beyond end" message point?
> > > > Or fail to open a device if SLOF does not like the block size? I forgot the internals :(
> > > This loop is interminable and this "Access beyond end of device!"
> > > message continues forever.
> > 
> > Where is that loop exactly? Put CATCH in there.
> 
> That loop is in count-dos-logical-partitions. The reason why I didn't
> put a CATCH in there is because there is already another CATCH statement
> in do-load in slof/fs/boot.fs which covers this throw. For the other
> path that doesn't have a CATCH I have inserted a CATCH in the open
> subroutine in disk-label.fs.
> 
> > 
> > > SLOF doesn't have any option other than to use the block-size that was
> > > set in the logical_block_size parameter. It doesn't have any preference
> > > as the code is very generic for both DOS as well as GPT.
> > > > 
> > > > > Also change the "open" method to initiate CATCH exception handling for the calls to
> > > > > try-partitions/try-files which will also call read-sector which could potentially
> > > > > now throw this new exception.
> > > > > 
> > > > > After making the above changes, it fails properly with the correct error
> > > > > message as follows:
> > > > > <SNIP>
> > > > > Trying to load:  from: /pci@800000020000000/scsi@3 ...
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > virtioblk_transfer: Access beyond end of device!
> > > > > 
> > > > > E3404: Not a bootable device!
> > > > > 
> > > > > E3407: Load failed
> > > > > 
> > > > >   Type 'boot' and press return to continue booting the system.
> > > > >   Type 'reset-all' and press return to reboot the system.
> > > > > 
> > > > > Ready!
> > > > > 0 >
> > > > > </SNIP>
> > > > > 
> > > > > Signed-off-by: Kautuk Consul <kconsul@xxxxxxxxxxxxx>
> > > > > ---
> > > > > slof/fs/packages/disk-label.fs | 12 +++++++++---
> > > > > 1 file changed, 9 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/slof/fs/packages/disk-label.fs b/slof/fs/packages/disk-label.fs
> > > > > index 661c6b0..a6fb231 100644
> > > > > --- a/slof/fs/packages/disk-label.fs
> > > > > +++ b/slof/fs/packages/disk-label.fs
> > > > > @@ -136,7 +136,8 @@ CONSTANT /gpt-part-entry
> > > > > : read-sector ( sector-number -- )
> > > > >     \ block-size is 0x200 on disks, 0x800 on cdrom drives
> > > > >     block-size * 0 seek drop      \ seek to sector
> > > > > -   block block-size read drop    \ read sector
> > > > > +   block block-size read         \ read sector
> > > > > +   block-size < IF throw THEN    \ if we read less than the block-size then throw an exception
> > > > 
> > > > When it fails, is the number of bytes ever non zero? Thanks,
> > > No, it doesn't reach 0. It is lesser than the block-size. For example if
> > > we set the logcial_block_size to 1024, the block-size is that much. if
> > > we are reading the last sector which is physically only 512 bytes long
> > > then we read that 512 bytes which is lesser than 1024, which should be
> > > regarded as an error.
> > 
> > Ah so it only happens when there is an odd number of 512 sectors so reading the last one with block-size==1024 only reads a half => failure, is that right?
> 
> Yes. Or, block-size if set to 2048 or 4096 will also show the same problem.
> 
> > 
> > > > 
> > > > > ;
> > > > >  
> > > > > : (.part-entry) ( part-entry )
> > > > > @@ -723,10 +724,15 @@ CREATE GPT-LINUX-PARTITION 10 allot
> > > > >     THEN
> > > > >  
> > > > >     partition IF
> > > > > -       try-partitions
> > > > > +       ['] try-partitions
> > > > >     ELSE
> > > > > -       try-files
> > > > > +       ['] try-files
> > > > >     THEN
> > > > > +
> > > > > +   \ Catch any exception that might happen due to read-sector failing to read
> > > > > +   \ block-size number of bytes from any sector of the disk.
> > > > > +   CATCH IF false THEN
> > > Segher/Alexey, can we keep this CATCH block or should I remove it ?
> > 
> > imho the original bug should be handled more gracefully. Seeing exceptions in the code just triggers exceptions in my small brain :) Thanks,
> 
> :-). But this is the only other path that doesn't have a CATCH
> like the do-load subroutine in slof/fs/boot.fs. According to Segher
> there shouldn't ever be a problem with throw because if nothing else the
> outer-most interpreter loop's CATCH will catch the exception. But I
> thought to cover this throw in read-sector more locally in places near
> to this functionality. Because the outermost FORTH SLOF interpreter loop is not
> really so related to reading a sector in disk-label.fs.
> 
Alexey/Segher, so what should be the next steps ?
Do you find my explanation above okay or should I simply remove these
CATCH blocks ? Putting a CATCH block in count-dos-logical-partitions is
out of the question as there is already a CATCH in do-load in
slof/fs/boot.fs. This CATCH block in the open subroutine is actually to
prevent the exception to be caught at some non-local place in the code.




[Index of Archives]     [KVM Development]     [KVM ARM]     [KVM ia64]     [Linux Virtualization]     [Linux USB Devel]     [Linux Video]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux