Hello everyone,

I've been working on a document where typesetting was quite important. I had a special layout in mind, including spatially arranged text blocks, special alignments, images, etc. Typst makes it very easy to design all sorts of things, but however great the syntax is, I can still make a mistake, misunderstand something, or simply not be sure how given content will arrange itself, so I needed a way to check how my idea turned out.

So I implemented a few useful layout checking functions in math_scanner, since it has already shown interesting results when it comes to working with documents' graphics. Right now, after Tesseract processes the input image, you can:

* For any character, check the percentage distance to the left, right, top and bottom edge of the image.
* If you mark the borders of a region of the image, check its size as a percentage of the page (or of the column, if you've made use of the program's column splitting functionality).
* Check the size of any focused character in pixels. Note this may not always be accurate, since the size is calculated from the bounding boxes determined by Tesseract.

Using these functions, you can easily check, say, whether a heading is centered, how much vertical space is left in a column while writing the text, whether and how paragraphs are aligned, how big individual text blocks are on the paper, or whether your figures were aligned correctly, as long as there is text around them you can use to mark a region (note that only the horizontal / vertical borders are needed for determining the height / width, respectively).

math_scanner can split the input image into columns, which are afterwards treated like standalone images (including re-recognition by Tesseract, which can clear out a lot of clutter). The new layout checking functions respect this mechanism too, so if you have a multi-column document, you can review the layout of each column separately.
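
To illustrate the arithmetic behind the checks listed above, here is a minimal Python sketch. It is not math_scanner's actual code: the `Box` class and the functions `edge_distances`, `region_size` and `is_centered` are hypothetical names made up for this example, and the bounding boxes are assumed to be given in pixels with the origin in the top-left corner of the page (or column) image.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box in pixels, origin at the top-left corner."""
    left: int
    top: int
    width: int
    height: int

    @property
    def right(self) -> int:
        return self.left + self.width

    @property
    def bottom(self) -> int:
        return self.top + self.height


def edge_distances(box: Box, page_width: int, page_height: int) -> dict[str, float]:
    """Distance from each side of the box to the matching page edge, in percent."""
    return {
        "left": 100 * box.left / page_width,
        "right": 100 * (page_width - box.right) / page_width,
        "top": 100 * box.top / page_height,
        "bottom": 100 * (page_height - box.bottom) / page_height,
    }


def region_size(first: Box, second: Box, page_width: int, page_height: int) -> tuple[float, float]:
    """Width and height, in percent of the page, of the region spanned by two marker boxes."""
    left = min(first.left, second.left)
    right = max(first.right, second.right)
    top = min(first.top, second.top)
    bottom = max(first.bottom, second.bottom)
    return (100 * (right - left) / page_width, 100 * (bottom - top) / page_height)


def is_centered(first: Box, last: Box, page_width: int, tolerance: float = 1.0) -> bool:
    """True if the left and right margins differ by at most `tolerance` percentage points."""
    left_margin = 100 * first.left / page_width
    right_margin = 100 * (page_width - last.right) / page_width
    return abs(left_margin - right_margin) <= tolerance


# Example: first and last character of a heading on a 1000 x 2000 pixel page.
first_char = Box(left=300, top=100, width=20, height=40)
last_char = Box(left=680, top=100, width=20, height=40)

print(edge_distances(first_char, 1000, 2000))
print(region_size(first_char, last_char, 1000, 2000))
print(is_centered(first_char, last_char, 1000))
```

With the example values, the heading starts 30% from the left edge and ends 30% from the right edge, so it counts as centered. Since the boxes come from OCR, a small tolerance is used instead of an exact comparison.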

Of course, it's still a good idea to have your work checked by a sighted reviewer, but there is a difference between calling someone for a check 5 times and 50 times, because something turns out differently than you expected, you change your mind, rework, etc.

This particular implementation also has its limitations. Being driven by OCR has a few advantages, but also some significant downsides, like recognition errors and the program's general unawareness of things like figures in the document. It would be very interesting to implement something similar working directly with the information in the PDF, since tools like Typst or LaTeX tend to include it in a somewhat semantic form, so it may be possible to get very interesting results. Right now, however, I don't think I have the time to study the structure of PDF documents or to build a layout explorer from scratch. Since math_scanner already had most of the prerequisites, which were good enough for my use case, this is the optimal route for me for the time being.

I'm just letting people know in case someone is interested in my little experiment. You can find the new commits in the development branch of math_scanner:

https://github.com/RastislavKish/math_scanner

Still Linux only. The Windows branch is actually complete and functional, but it has yet to be merged, since I haven't found anybody on Windows willing/able to install Python and the necessary dependencies for my little program. :) Perhaps I should merge it though; the changes are really subtle, there is very little to go wrong, and even if something does, it can be solved when someone notices. I will take a look into it at some point.

Have fun

Best regards

Rastislav

_______________________________________________
Blinux-list mailing list
Blinux-list@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/blinux-list