On Wed, Jun 09, 2021 at 10:55:08PM -0500, Don Seiler wrote:
> On Wed, Jun 9, 2021, 21:03 P C <puravc@xxxxxxxxx> wrote:
> > I agree, it's confusing for many, and that confusion arises from the
> > fact that you usually talk of shared_buffers in MB or GB, whereas huge
> > pages have to be configured in units of 2 MB. But once people
> > understand that, they realize it's pretty simple.
> >
> > Don, we have experienced the same not just with Postgres but also with
> > Oracle. I haven't been able to get to the root of it, but what we
> > usually do is add another 100-200 pages, and that works for us. If the
> > SGA or shared_buffers is high, e.g. 96 GB, then we add 250-500 pages.
> > Those few hundred MB may be wasted (because the moment you configure
> > huge pages, the operating system considers them used and does not use
> > them for anything else), but nowadays servers easily have 64 or 128 GB
> > of RAM, and wasting 500 MB to 1 GB does not really hurt.
>
> I don't have a problem with the math, just wanted to know if it was
> possible to better estimate what the actual requirements would be at
> deployment time. My fallback will probably be to do what you did and
> just pad with an extra 512MB by default.

It's because the huge allocation isn't just shared_buffers, but also
wal_buffers:

| The amount of shared memory used for WAL data that has not yet been
| written to disk. The default setting of -1 selects a size equal to
| 1/32nd (about 3%) of shared_buffers, ...

.. and other stuff:

src/backend/storage/ipc/ipci.c

 * Size of the Postgres shared-memory block is estimated via
 * moderately-accurate estimates for the big hogs, plus 100K for the
 * stuff that's too small to bother with estimating.
 *
 * We take some care during this phase to ensure that the total size
 * request doesn't overflow size_t.  If this gets through, we don't
 * need to be so careful during the actual allocation phase.
 */
	size = 100000;
	size = add_size(size, PGSemaphoreShmemSize(numSemas));
	size = add_size(size, SpinlockSemaSize());
	size = add_size(size, hash_estimate_size(SHMEM_INDEX_SIZE,
											 sizeof(ShmemIndexEnt)));
	size = add_size(size, dsm_estimate_size());
	size = add_size(size, BufferShmemSize());
	size = add_size(size, LockShmemSize());
	size = add_size(size, PredicateLockShmemSize());
	size = add_size(size, ProcGlobalShmemSize());
	size = add_size(size, XLOGShmemSize());
	size = add_size(size, CLOGShmemSize());
	size = add_size(size, CommitTsShmemSize());
	size = add_size(size, SUBTRANSShmemSize());
	size = add_size(size, TwoPhaseShmemSize());
	size = add_size(size, BackgroundWorkerShmemSize());
	size = add_size(size, MultiXactShmemSize());
	size = add_size(size, LWLockShmemSize());
	size = add_size(size, ProcArrayShmemSize());
	size = add_size(size, BackendStatusShmemSize());
	size = add_size(size, SInvalShmemSize());
	size = add_size(size, PMSignalShmemSize());
	size = add_size(size, ProcSignalShmemSize());
	size = add_size(size, CheckpointerShmemSize());
	size = add_size(size, AutoVacuumShmemSize());
	size = add_size(size, ReplicationSlotsShmemSize());
	size = add_size(size, ReplicationOriginShmemSize());
	size = add_size(size, WalSndShmemSize());
	size = add_size(size, WalRcvShmemSize());
	size = add_size(size, PgArchShmemSize());
	size = add_size(size, ApplyLauncherShmemSize());
	size = add_size(size, SnapMgrShmemSize());
	size = add_size(size, BTreeShmemSize());
	size = add_size(size, SyncScanShmemSize());
	size = add_size(size, AsyncShmemSize());
#ifdef EXEC_BACKEND
	size = add_size(size, ShmemBackendArraySize());
#endif

	/* freeze the addin request size and include it */
	addin_request_allowed = false;
	size = add_size(size, total_addin_request);

	/* might as well round it off to a multiple of a typical page size */
	size = add_size(size, 8192 - (size % 8192));

BTW, I think it'd be nice if this were a NOTICE:

| elog(DEBUG1, "mmap(%zu) with MAP_HUGETLB failed, huge pages disabled: %m",
|	  allocsize);

-- 
Justin