Re: how to make duplicate finding query faster?

> On Dec 30, 2020, at 12:36 AM, Sachin Kumar <sachinkumaras@xxxxxxxxx> wrote:
> 
> Hi All,
> 
> I am uploading data into PostgreSQL from a CSV file, and if a value already exists in the DB the upload should report a duplicate error. I am using the query mentioned below.
> 
> if Card_Bank.objects.filter(Q(ACCOUNT_NUMBER=card_number)).exists():
>     flag = 2
> else:
>     flag = 1
> It is taking too much time; the CSV contains 600k cards.
> 
> Kindly help me in making the query faster.
> 
> I am using Python, Django & PostgreSQL.
> -- 
> 
> Best Regards, 
> Sachin Kumar

Are you checking one-by-one because your goal is not to fail the whole upload when it contains duplicates, but rather to skip only the duplicate rows?

If that's the case, I think you'd be better off copying the CSV straight into a temp table, deleting the duplicates from it with a join against the target table, inserting the remainder into the target table, and finally dropping the temp table.
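
Here is a minimal sketch of that flow using psycopg2. The table and column names (card_bank, account_number) are guesses based on the Django model above, the connection string is a placeholder, and the CSV is assumed to hold one account number per row:

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")  # hypothetical connection settings
    with conn, conn.cursor() as cur:
        # Stage the CSV in a temp table; COPY is much faster than per-row checks.
        cur.execute("CREATE TEMP TABLE staging (account_number text)")
        with open("cards.csv") as f:
            cur.copy_expert("COPY staging FROM STDIN WITH (FORMAT csv)", f)

        # Delete the rows that already exist in the target table.
        cur.execute("""
            DELETE FROM staging s
            USING card_bank c
            WHERE s.account_number = c.account_number
        """)

        # Insert the remainder (DISTINCT also drops duplicates within the
        # file itself), then drop the temp table.
        cur.execute("""
            INSERT INTO card_bank (account_number)
            SELECT DISTINCT account_number FROM staging
        """)
        cur.execute("DROP TABLE staging")

If ACCOUNT_NUMBER has a unique index, an INSERT ... ON CONFLICT DO NOTHING from the staging table achieves the same skip-the-duplicates behavior without the explicit DELETE.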




