Am 03.06.19 um 21:52 schrieb Luboš Luňák: > On Monday 03 of June 2019, Caolán McNamara wrote: >> On Mon, 2019-06-03 at 12:23 +0200, Luboš Luňák wrote: >>> On Thursday 30 of May 2019, Michael Meeks wrote: >>>> + some in the zip area - assuming they are threading related. >>> >>> Is this about those documents such >>> as /srv/crashtestdata/files/caolan/opendocument_stack_overflow_2.odt >>> ? How >>> can I reproduce that problem? If I try to fetch >>> buildslave@xxxxxxxxxxxxxxxxxxxxxxxxxxxx: >>> an/opendocument_stack_overflow_2.odt , >>> it doesn't exist. >> >> I attach two of the examples here. The input name was foo.sample, the >> output to odt name appears higher up in the bt during the export. >> >> ./instdir/program/soffice.bin --headless --convert-to odt >> opendocument_stack_overflow.sample > > > Ok, so it's not a problem with my code, my changes just happened to show the > problem, and the problem is that those documents are broken. If you try to > unzip the documents, it will complain about incorrect CRC (although it still > will uncompress them). And what happens is that when we try to save the file, > apparently only by that point we'll read those zip streams, there will be a > ZipException about that, and the code in package/ is not exception-safe. So > ZipOutputStream::writeLOC() gets called but not the matching > ZipOutputStream::rawCloseEntry(). > > But this is actually broken on several levels. If I make the code to catch > the exception better, I'll need to make it somehow handle the fact that > writeLOC() prepared for writing en entry, but then there's nothing to write. > But that's actually not important, since ZipPackageStream::saveChild() will > still return failure, so ZipPackageFolder::saveContents() will throw an > exception, making the whole document saving fail. Which in turn means this > whole save business is irrelevant, as there's just no way to save the > document, even though we can load it and we can edit it. Which seems rather > lame. > > Any idea what to do about that? Is it really ok that we just refuse to save > it? Or should we save it even though the contents may be broken? IMHO the only sane solution would be to detect the broken CRCs on read and report a broken file to the user. Eventually we could offer some recovery option: mark the broken CRCs to be recalculated and keep the stuff or drop the broken ZIP entries. I guess most users can't make this decision, so I would opt for optional recovery of the entries, ignoring the CRCs. Easier solution: we just deny / abort loading the file and tell the user babout the broken file. It really strange that the broken CRCs are just detected on write and the document loads without a problem. Or do we ignore all ZIP entries, which we don't know, which would be strange too? _______________________________________________ LibreOffice mailing list LibreOffice@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/libreoffice