El día viernes, abril 23, 2021 a las 11:32:02a. m. +0200, Matthias Apitz escribió: > El día viernes, abril 23, 2021 a las 10:48:24a. m. +0200, Laurenz Albe escribió: > > > > The serverlog has around this time (sorry for German): > > > > > > 2021-04-23 05:55:23.591 CEST [2317] LOG: unvollständige Message vom Client > > > 2021-04-23 05:55:23.593 CEST [2317] FEHLER: Speicher aufgebraucht > > > 2021-04-23 05:55:23.593 CEST [2317] DETAIL: Kann Zeichenkettenpuffer mit 0 Bytes nicht um 1380986158 Bytes vergrößern. > > > 2021-04-23 05:55:23.593 CEST [2317] LOG: konnte Daten nicht an den Client senden: Datenübergabe unterbrochen (broken pipe) > > > 2021-04-23 05:55:23.593 CEST [2317] FATAL: Verbindung zum Client wurde verloren > > > 2021-04-23 05:57:20.339 CEST [27190] LOG: unerwartetes EOF auf Client-Verbindung mit einer offenen Transaktion > > > 2021-04-23 05:59:17.419 CEST [26827] LOG: unerwartetes EOF auf Client-Verbindung mit einer offenen Transaktion > > > > > > What else could be checked for this? Any hints about this issue? > > > > It seems pretty clear that both the client and the server are waiting for each other. > > Yes, they are both alive as processes. Both are running on the same > Linux server, i.e. no real network between. And any loss of data would > have caused perhaps the end of one (and then the other) process. > I will try to gather a TCPDUMP for this. > I hit the same situation again. The ESQL/C client sent an UPDATE to the PostgreSQL server: # gdb /opt/lib/sisis/catserver/bin/INDEX 31864 ... #0 0x00007f688bb955c4 in poll () from /lib64/libc.so.6 #1 0x00007f688a82622f in pqSocketPoll () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5 #2 0x00007f688a8260ff in pqSocketCheck () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5 #3 0x00007f688a825f83 in pqWaitTimed () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5 #4 0x00007f688a825f55 in pqWait () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5 #5 0x00007f688a821d3a in PQgetResult () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5 #6 0x00007f688a822478 in PQexecFinish () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5 #7 0x00007f688a82212e in PQexecParams () from /usr/local/sisis-pap/pgsql/lib/libpq.so.5 #8 0x00007f688b792216 in ecpg_execute () from /usr/local/sisis-pap/pgsql/lib/libecpg.so.6 #9 0x00007f688b7935de in ecpg_do () from /usr/local/sisis-pap/pgsql/lib/libecpg.so.6 #10 0x00007f688b7936ef in ECPGdo () from /usr/local/sisis-pap/pgsql/lib/libecpg.so.6 #11 0x00007f688dd831c6 in update_record () at per_kette.pgc:1968 #12 0x00007f688dd7ed16 in per_kette (zugriff=10, scroll=-20000, lock=-20000, key=-20000, sto=-20000, p_daten=0x7ffef0499950, sel_anw=0x0, where_anw=0x0, p_btw_daten=0x0, order_by=0x0, auf_ab=0x0, group_by=0x0, having=0x0, into_temp=0x0, count=0x7ffef0498670) at per_kette.pgc:497 #13 0x00007f688dcf770f in DB_rewr (tabmodul=0x7f688dd7e7ca <per_kette>, p_daten=0x7ffef0499950) at dbcall.pgc:3274 #14 0x00007f688ea58821 in BKDeskInsert (setid=FSTAB_Personen, kette=..., katkey=168952979) at ./BKDesk.c:1447 and the Pos server sits idle: # select * from pg_stat_activity where client_port = 52288; datid | datname | pid | leader_pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type --------+---------+-------+------------+----------+---------+---------------------------------+-------------+-----------------+-------------+---------------------------------+---------------------------------+---------------------------------+---------------------------------+-----------------+------------+---------------------+-------------+--------------+---------------------+---------------- 951941 | sisis | 31894 | | 16384 | sisis | SunRise DBCALL V7.2 (pid=31864) | 127.0.0.1 | | 52288 | 25.04.2021 10:57:15.922137 CEST | 25.04.2021 10:57:15.968081 CEST | 25.04.2021 10:57:16.051256 CEST | 25.04.2021 10:57:16.051273 CEST | Client | ClientRead | idle in transaction | 18314796 | 18314796 | fetch per_kette_seq | client backend But this time I was collecting all TCP traffic of all clients to and from the port 5432. This TCPDUMP command writes all traffic as binary data to files of 100.000.000 bytes in size (-C 100) and appends a number when a new file is needed: # /usr/sbin/tcpdump -n -i lo -C 100 -w /var/spool/sisis/tcp5432. port 5432 & Later one can look into the files for the local port 52288 of the connection with: # tcpdump -n -r tcp5432.2 -Xx -s1024 port 52288 > INDEX.52288.txt 10:57:15.921783 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [S], seq 3652783347, win 65495, options [mss 65495,sackOK,TS val 3424445579 ecr 0,nop,wscale 7], length 0 0x0000: 4500 003c b720 4000 4006 8599 7f00 0001 E..<..@.@....... 0x0010: 7f00 0001 cc40 1538 d9b9 0cf3 0000 0000 .....@.8........ 0x0020: a002 ffd7 fe30 0000 0204 ffd7 0402 080a .....0.......... 0x0030: cc1c e48b 0000 0000 0103 0307 ............ 10:57:15.921792 IP 127.0.0.1.5432 > 127.0.0.1.52288: Flags [S.], seq 4241480135, ack 3652783348, win 65483, options [mss 65495,sackOK,TS val 3424445579 ecr 3424445579,nop,wscale 7], length 0 0x0000: 4500 003c 0000 4000 4006 3cba 7f00 0001 E..<..@.@.<..... 0x0010: 7f00 0001 1538 cc40 fccf d9c7 d9b9 0cf4 .....8.@........ 0x0020: a012 ffcb fe30 0000 0204 ffd7 0402 080a .....0.......... 0x0030: cc1c e48b cc1c e48b 0103 0307 ............ 10:57:15.921799 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [.], ack 1, win 512, options [nop,nop,TS val 3424445579 ecr 3424445579], length 0 0x0000: 4500 0034 b721 4000 4006 85a0 7f00 0001 E..4.!@.@....... 0x0010: 7f00 0001 cc40 1538 d9b9 0cf4 fccf d9c8 .....@.8........ 0x0020: 8010 0200 fe28 0000 0101 080a cc1c e48b .....(.......... 0x0030: cc1c e48b .... And here you see the last command (the UPDATE) sent to the Pos server: 10:57:16.051388 IP 127.0.0.1.52288 > 127.0.0.1.5432: Flags [P.], seq 14407:14812, ack 120304, win 512, options [nop,nop,TS val 3424445708 ecr 3424445708], length 405 0x0000: 4500 01c9 b84f 4000 4006 82dd 7f00 0001 E....O@.@....... 0x0010: 7f00 0001 cc40 1538 d9b9 453a fcd1 afb7 .....@.8..E:.... 0x0020: 8018 0200 ffbd 0000 0101 080a cc1c e50c ................ 0x0030: cc1c e50c 5000 0000 7e00 7570 6461 7465 ....P...~.update 0x0040: 2070 6572 5f6b 6574 7465 2073 6574 2064 .per_kette.set.d 0x0050: 6573 6b6e 7220 3d20 2431 2020 2c20 6672 esknr.=.$1..,.fr 0x0060: 6569 203d 2024 3220 202c 206d 6f64 6966 ei.=.$2..,.modif 0x0070: 203d 2024 3320 202c 2061 6e7a 203d 2024 .=.$3..,.anz.=.$ 0x0080: 3420 202c 206b 6174 6c69 7374 203d 2024 4..,.katlist.=.$ 0x0090: 3520 2077 6865 7265 2063 7572 7265 6e74 5..where.current 0x00a0: 206f 6620 6863 5f70 6572 5f6b 6574 7465 .of.hc_per_kette 0x00b0: 0000 0042 0000 00ff 0000 0000 0005 0000 ...B............ 0x00c0: 0007 3133 3834 3131 3800 0000 0130 0000 ..1384118....0.. 0x00d0: 0001 3000 0000 0232 3600 0000 d25c 7865 ..0....26....\xe 0x00e0: 6539 6530 3130 3031 6561 6630 3130 3035 e9e01001eaf01005 0x00f0: 3761 6630 3130 3030 3563 6530 3130 3038 7af010005ce01008 0x0100: 3264 3230 3130 3064 3466 3830 3130 3038 2d20100d4f801008 0x0110: 6530 6130 3230 3061 6633 3430 3230 3032 e0a0200af3402002 0x0120: 3533 6130 3230 3061 3538 6530 3230 3061 53a0200a58e0200a 0x0130: 6538 6530 3230 3034 3461 3630 3230 3033 e8e020044a602003 0x0140: 6561 3730 3230 3032 3066 3130 3330 3032 ea7020020f103002 0x0150: 3466 3130 3330 3035 3431 3730 3430 3061 4f1030054170400a 0x0160: 3034 3330 3430 3033 6636 3730 3430 3062 04304003f670400b 0x0170: 6138 3930 3430 3033 6265 6430 3030 3030 a8904003bed00000 0x0180: 3534 3930 3130 3039 3235 3230 3130 3037 5490100925201007 0x0190: 3435 3430 3130 3038 3336 3030 3130 3038 4540100836001008 0x01a0: 3238 3330 3130 3039 3330 3431 3230 6100 28301009304120a. 0x01b0: 0100 0044 0000 0006 5000 4500 0000 0900 ...D....P.E..... 0x01c0: 0000 0000 5300 0000 04 ....S.... 10:57:16.051395 IP 127.0.0.1.5432 > 127.0.0.1.52288: Flags [.], ack 14812, win 512, options [nop,nop,TS val 3424445708 ecr 3424445708], length 0 0x0000: 4500 0034 453c 4000 4006 f785 7f00 0001 E..4E<@.@....... 0x0010: 7f00 0001 1538 cc40 fcd1 afb7 d9b9 46cf .....8.@......F. 0x0020: 8010 0200 fe28 0000 0101 080a cc1c e50c .....(.......... 0x0030: cc1c e50c .... The send buffer is complete, the bytea string \x ....00 is terminated by a NULL byte. TCP ACK'ed the package and then no further reaction of the Pos server on this connection. It still sits there if someone needs more information about this connection. >From the collected data, I tend to say: this must be a bug in the Pos server... Any ideas? matthias -- Matthias Apitz, ✉ guru@xxxxxxxxxxx, http://www.unixarea.de/ +49-176-38902045 Public GnuPG key: http://www.unixarea.de/key.pub