On 7/6/24 13:09, sud wrote:
On Fri, Jul 5, 2024 at 8:24 PM Adrian Klaver <adrian.klaver@xxxxxxxxxxx> wrote:
On 7/5/24 02:08, sud wrote:
> Hello all,
>
> It's a Postgres database. We have the option of getting files in CSV
> and/or Avro format messages from another system to load into our
> Postgres database. The volume will be 300 million messages per day
> across many files in batches.
Are you dumping the entire contents of each file, or are you pulling a
portion of the data out?
Yes, all the fields in the file have to be loaded into the columns of the
tables in Postgres. But how does that matter for deciding whether we
should request the data in .csv or .avro format from the outside system
for loading into the Postgres database in row and column format? Again, my
understanding was that, irrespective of anything else, the .csv file load
will always be faster because the data is already stored in row and
column format, whereas with the .avro file the parser has to do
additional work to convert it to row and column format or map it to the
columns of the database table. Is my understanding correct here?
If you are going to use complete rows and all rows then COPY of CSV in
Postgres would be your best choice.
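
As a rough illustration of what a COPY-based CSV load could look like from a client, here is a minimal Python sketch. It assumes a psycopg2-style cursor (its copy_expert(sql, file) method), a CSV file with a header row, and hypothetical table and column names (messages, id, payload); none of these come from your setup, so adjust to your schema. The names are interpolated directly, so they are assumed to be trusted identifiers.

```python
import io


def build_copy_sql(table, columns):
    """Build a COPY ... FROM STDIN statement for a CSV file with a header row.

    Table and column names are assumed to be trusted identifiers;
    this sketch does no quoting or validation.
    """
    cols = ", ".join(columns)
    return f"COPY {table} ({cols}) FROM STDIN WITH (FORMAT csv, HEADER true)"


def load_csv(cursor, csv_file, table, columns):
    """Stream an open CSV file into the table through the given cursor.

    `cursor` is assumed to expose copy_expert(sql, file), as a psycopg2
    cursor does. Committing is left to the caller.
    """
    cursor.copy_expert(build_copy_sql(table, columns), csv_file)


if __name__ == "__main__":
    # Show the statement that would be sent for a hypothetical table.
    print(build_copy_sql("messages", ["id", "payload"]))
```

With a real connection you would call something like load_csv(conn.cursor(), open("batch.csv"), "messages", ["id", "payload"]) once per file and then commit. One COPY per file keeps the whole load as a single streamed statement rather than one INSERT per row.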
--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx