Re: PDF processing

Hi,

Michael Weghorn wrote:
> On 03/03/2020 12.26, Pietro Paolini wrote:
> > I wanted to have a look at the source code
> > to see if there is some sort of PDF "model" being built from the
> > original PDF document, for instance a  set of objects each describing
> > the graphic meanings of a particular region within the page.
> > 
> 
> At a quick glance, 'sdext/source/pdfimport' looks like a good place to
> start with; I personally don't know more related to your more specific
> question.
>
Yep, that's the place - we currently use poppler to parse the PDF,
then generate a tree of quite basic drawing operations from it.

Check sdext/source/pdfimport/tree/genericelements.cxx for the type of
objects in that tree, and
sdext/source/pdfimport/tree/{draw|writer}treevisiting.cxx for a
visitor-pattern kind of tree walking. For your needs, you could
e.g. check the boundaries of each visited object to see whether
they intersect with your region of interest.
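
A minimal sketch of that intersection check, using hypothetical stand-in
types rather than the real pdfimport classes: Rect, Element, ElementVisitor
and RegionCollector are all invented for illustration, and the actual
element and visitor interfaces in sdext/source/pdfimport/tree will differ.
It only shows the general shape of walking a tree with a visitor and
collecting the nodes whose bounding box overlaps a region of interest.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the
// classes defined in sdext/source/pdfimport/tree.
struct Rect {
    double x, y, w, h;  // top-left corner plus width and height
};

// True if two axis-aligned rectangles overlap with non-zero area.
inline bool intersects(const Rect& a, const Rect& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

struct Element;

// Visitor interface: one callback per visited tree node.
struct ElementVisitor {
    virtual ~ElementVisitor() = default;
    virtual void visit(const Element& elem) = 0;
};

// A tree node with a bounding box and owned children.
struct Element {
    Rect bounds{};
    std::vector<std::unique_ptr<Element>> children;

    // Pre-order walk: visit this node first, then each child subtree.
    void accept(ElementVisitor& visitor) const {
        visitor.visit(*this);
        for (const auto& child : children)
            child->accept(visitor);
    }
};

// Collects every element whose bounds overlap the region of interest.
struct RegionCollector : ElementVisitor {
    Rect region;
    std::vector<const Element*> hits;

    explicit RegionCollector(const Rect& r) : region(r) {}

    void visit(const Element& elem) override {
        if (intersects(elem.bounds, region))
            hits.push_back(&elem);
    }
};
```

To use it, you would construct a RegionCollector with the rectangle you
care about, call accept() on the root element, and then read the matching
nodes from hits. Rectangles that merely touch along an edge are not
counted as intersecting in this sketch.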

Cheers,

-- Thorsten


_______________________________________________
LibreOffice mailing list
LibreOffice@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/libreoffice
