Re: Reading parquet files in Calc

Hi Jacek,

On 06.10.2022 05:54, Jacek Pliszka wrote:

I found an old thread about adding it to Orcus library instead.

Is it the best approach?

It is an approach, but I wouldn't say it's the best one. The Orcus library has traditionally been geared more toward text-based file formats such as csv, xlsx, ods, gnumeric, etc., whereas my understanding of the parquet file format is that it is a binary format.

If Orcus could use the arrow library, then it should be relatively easy, similar to .csv files.

Yes, I believe that's doable. Having said that, my understanding is that the arrow library provides a nice abstraction optimized for columnar in-memory formats. So, if we were to use it in orcus, which is not necessarily optimized for columnar in-memory formats, we may lose some efficiency simply by going through two layers of abstraction with different focuses. Someone would need to take a closer look at the design of the arrow library and decide which approach makes more sense: using it in orcus or using it directly in the libreoffice codebase.
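
To make the two-layer concern a bit more concrete, here is a rough sketch in plain Python. It does not use the real arrow or orcus APIs. The names `ColumnarTable`, `columns_to_cell_events` and `columns_to_sheet_direct` are hypothetical, and the cell-by-cell callback is only an assumption about what an intermediate import layer might look like. It contrasts pushing columnar data through a per-cell interface with handing whole columns over at once.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical columnar table: one list per column, loosely mimicking
# how a columnar in-memory format keeps each column's values together.
ColumnarTable = Dict[str, List[object]]


def columns_to_cell_events(
    table: ColumnarTable,
    set_cell: Callable[[int, int, object], None],
) -> int:
    """Push every value through a cell-by-cell callback (assumed interface).

    Returns the number of callback invocations, i.e. rows * columns.
    """
    calls = 0
    for col_index, values in enumerate(table.values()):
        for row_index, value in enumerate(values):
            set_cell(row_index, col_index, value)
            calls += 1
    return calls


def columns_to_sheet_direct(table: ColumnarTable) -> List[List[object]]:
    """Hand each column over as a whole block (one copy per column)."""
    return [list(values) for values in table.values()]


table: ColumnarTable = {"id": [1, 2, 3], "name": ["a", "b", "c"]}

# Path 1: columnar data funnelled through a per-cell layer.
sheet: Dict[Tuple[int, int], object] = {}


def store(row: int, col: int, value: object) -> None:
    sheet[(row, col)] = value


calls = columns_to_cell_events(table, store)

# Path 2: columns consumed directly, keeping the columnar layout.
direct = columns_to_sheet_direct(table)
```

In the first path the number of calls grows with rows times columns, while the second path does one transfer per column. Whether that difference matters in practice would depend on the actual interfaces involved, which is exactly what someone would need to check.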

I would have been very happy to take a closer look at the arrow library. But right now I'm trying to finish up all the features that need to go into the next release of orcus, so I won't be able to do that anytime soon unfortunately.

Kohei


