"Raschick, Hartmut" <Hartmut.Raschick@xxxxxxxxxxx> writes:
> recently we have seen a lot of occurrences of "out of file descriptors:
> Too many open files; release and retry" in our postgres log files, every
> night when a "vacuum full analyze" is run. After some digging into the
> code we found that postgres potentially tries to open as many as a
> pre-determined maximum number of file descriptors when vacuuming. That
> number is the lesser of the one from the configuration file
> (max_files_per_process) and the one determined at start-up by
> "src/backend/storage/file/fd.c::count_usable_fds()". Under Solaris now,
> it would seem, finding out that number via dup(0) is not sufficient, as
> the actual number of interest might be/is the number of usable stream
> file descriptors (up until Solaris 10, at least). Also, closing the last
> recently used file descriptor might therefore not solve a temporary
> problem (as something below 256 is needed). Now, this can be fixed by
> setting/leaving the descriptor limit at 256 or changing the
> postgresql.conf setting accordingly. Still, the function for determining
> the max number is not working as intended under Solaris, it would
> appear. One might try using fopen() instead of dup() or have a different
> handling for stream and normal file descriptors (including moving
> standard file descriptors to above 255 to leave room for stream
> ones). Maybe though, all this is not worth the effort; then it might
> perhaps be a good idea to mention the limitations/specialties in the
> platform specific notes (e.g. have u/limit at 256 maximum).

TBH this sounds like unfounded speculation.  AFAIK a Postgres backend
will not open anything but regular files after its initial startup.
I'm not sure what a "stream" is on Solaris, but guessing that it refers
to pipes or sockets, I don't think we have a problem with an OS
restriction that those be below FD 256.
In any case, if we did, it would presumably show up as hard errors,
not release-and-retry events.  Our usual experience is that you get
release-and-retry log messages when the OS is up against the system-wide
open-file limit rather than the per-process limit (ie, the underlying
error code is ENFILE not EMFILE).  I don't know exactly how Solaris'
strerror() spells those codes, so it's difficult to tell from your
reported log message which case is happening.

If it is the system-wide limit that's at issue, then of course the
dup(0) loop isn't likely to find it, and adjusting max_files_per_process
(or maybe better, reducing max_connections) is the expected solution.

			regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general