On 5/27/21 4:10 PM, Ron wrote:
On 5/27/21 4:58 PM, Rob Sargent wrote:
I have the json file which feeds the staging table, and I have code which is designed to load such files. I suppose an SQL file with those 16 insert statements would approximate the app's call. Let me give that a shot.
On 5/27/21 3:08 PM, Sam Gendler wrote:
The same JDBC connection that is resulting in lost data? Sounds to me like you aren't connecting to the DB you think you are connecting to.

I almost wish that were true.
However, looking at AWS "Performance Insights" I see the SQL statements generated by my app being executed on the server. Not coincidentally, this is from the "Top SQL (10)/Load by waits" view. That view does NOT, in point of fact, name the database in which the SQL is running, but the rest of my environment pretty much rules out silently switching tracks on myself. I have to read from the correct database, using a UUID, to get data to analyze, then save the results of the analysis back. I'm using my wonderful webapp to run this, and I've successfully analyzed and stored results for small starting data sets.
I just noticed the "Top database" tab on that screen: there is only one database listed, and it's the correct one.
I've reproduced the behaviour. I'm pretty convinced it's a size thing, but which part of the system is being maxed out is not clear. Not CPU, but that's the only resource "Performance Insights" mentions (suggesting this UI wasn't designed by a database person).
The loss of the staging table is the most spectacular part. It was filled from a file which has 7.5M records. It's clear in the tomcat logs that it was created and written to, and one record was read. Then the "top sql" happens:
insert into segment select * from bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s where s.probandset_id >= ? and s.probandset_id < ?

The "bulk" table is the staging table. The params are filled in with a quasi-UUID which grabs roughly 1/16th of the data. In the stack trace on my tomcat server I get:
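If the sixteen ranges follow the obvious scheme (stepping the first hex digit of the UUID from 0 through f, everything else zero — an assumption based on the bound values in the error below), the lower bounds can be generated like this:

```sql
-- Sketch: generate the sixteen assumed quasi-UUID range boundaries.
select (to_hex(i) || '0000000-0000-0000-0000-000000000000')::uuid as lower_bound
from generate_series(0, 15) as g(i);
```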
Caused by: org.jooq.exception.DataAccessException: SQL [insert into segment select * from bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s where s.probandset_id >= '30000000-0000-0000-0000-000000000000' and s.probandset_id < '40000000-0000-0000-0000-000000000000' ]; An I/O error occurred while sending to the backend.

So this would have been the fourth such insert statement read from that staging table.
That table is not deleted by my code. It is renamed after the last insert into segment, by appending "_done" to the name. But that table, by either name, is nowhere to be seen on the server.
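For what it's worth, one way to confirm the table really is gone (and not just renamed or sitting in an unexpected schema) is to query the catalogs directly. A sketch, assuming the "bulk" schema from the qualified name above and a guessed name prefix:

```sql
-- Look for the staging table under any name it might have ended up with.
select n.nspname as schema, c.relname as table_name
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
where n.nspname = 'bulk'
  and c.relname like 'rjs_GEV15_01_%';
```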
For those scoring at home, the trouble in the tomcat log starts with:
0:02:11.452 [https-jsse-nio-10.0.2.28-15002-exec-7] INFO edu.utah.camplab.jx.PayloadFromMux -
Is there any way to replicate this in psql?
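A minimal sketch of such a replay, assuming the staging table can be reloaded first (the \copy line and file name are placeholders, and the bounds shown are just the fourth of the sixteen ranges):

```sql
\timing on
-- Reload the staging table (placeholder; the app normally does this load):
-- \copy bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" from 'staging.data'
insert into segment
select * from bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s
where s.probandset_id >= '30000000-0000-0000-0000-000000000000'
  and s.probandset_id <  '40000000-0000-0000-0000-000000000000';
```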
--
Angular momentum makes the world go 'round.