On 1/14/24 9:46 AM, Tom Lane wrote:
Ron Johnson<ronljohnsonjr@xxxxxxxxx> writes:
You can't scan just one column of a row-oriented table.
Technically you kinda can, depending on your definition of "scan".
The real question is why it mentions c2.
The planner did that so that the SeqScan step doesn't have to
perform a projection: it can just return (a pointer to)
the physical tuple it found in the table, without doing extra
work to form a tuple containing only c1.
The piece of info that's not mentioned here is how tuples (rows) are
actually processed to extract individual datums (columns). The full
details are in heap_deform_tuple() in backend/access/common/heaptuple.c,
but the general gist is that (ignoring nulls) to read a tuple the code
has to go datum by datum, computing the size of each datum to determine
the physical location of the *next* datum. So if you want the 3rd datum
in a tuple, you need to calculate the size of the 1st datum to see where
the 2nd datum lives, and then compute the size of the 2nd datum to see
where the 3rd one lives.
In this example, if c1 is literally the first column in the table, then
heap_deform_tuple is free to ignore everything else in the tuple, so
long as the code calling heap_deform_tuple() knows to ask for only 1 datum.
If all that sounds kinda expensive and tedious: you're right, it is, and
it's why deforming tuples is generally done as late as possible. Based
on what Tom's saying, in 9.6+ the HashAggregate code would be calling
heap_deform_tuple(), and I'd expect it to only be retrieving c1. Without
going through all the code, I think what's happening in 9.4 is the
projection ends up calling heap_deform_tuple instead of the HashAgg
code. It's still only grabbing c1; it's just doing it sooner rather than
later. In this particular case I don't think it'd make much difference,
but in more complicated queries it could certainly have a noticable effect.
Either way, the explain output is kinda confusing. IMO it'd be more
accurate if it said something like "Output: t1". And Ron's main point
that you're going to be reading an entire row of t1 from the OS is also
true.
BTW, there's another place where the code waits as long as possible to
access actual data in the hopes of avoiding needless work and that's
values that have been TOASTed. heap_deform_tuple() doesn't actually need
to de-toast data, so it will simply return a Datum that is a "pointer"
(not a C pointer) to the toasted data. That will only be detoasted if
something actually needs the actual data. In some cases that be a big
performance win.
--
Jim Nasby, Data Architect, Austin TX