To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch make the necessary
adjustments for the following functions:
- finalize_agg_ffunc(internal,text,name,name,name[],bytea,anyelement)
- finalize_agg_sfunc(internal,text,name,name,name[],bytea,anyelement)
- partialize_agg(anyelement)
- finalize_agg(text,name,name,name[][],bytea,anyelement)
Make `partialize_agg()` support parallel query execution. To make this
work, the finalize node need combine the individual partials from each
parallel worker, but the final step that turns the resulting partial
into the finished aggregate should not happen. Thus, in the case of
distributed hypertables, each data node can run a parallel query to
compute a partial, and the access node can later combine and finalize
these partials into the final aggregate. Esssentially, there will be
one combine step (minus final) on each data node, and then another one
plus final on the access node.
To implement this, the finalize aggregate plan is simply modified to
elide the final step, and to reserialize the partial. It is only
possible to do this at the plan stage; if done at the path stage, the
PostgreSQL planner will hit assertions that assume that the node has
certain values (e.g., it doesn't expect combine Paths to skip the
final step).
Previous PR #4307 mark `partialize_agg` and `finalize_agg` as parallel
safe but this change is leading to incorrect results in some cases.
Those functions are supposed work in parallel but seems is not the case
and it is not evident yet the root cause and how to properly use it in
parallel queries so we decided to revert this change and provide correct
results to users.
Fixes#4922
Postgres knows whether a given aggregate is parallel-safe, and creates
parallel aggregation plans based on that. The `partialize_agg` is a
wrapper we use to perform partial aggregation on data nodes. It is a
pure function that produces serialized aggregation state as a result.
Being pure, it doesn't influence parallel safety. This means we don't
need to mark it parallel-unsafe to artificially disable the parallel
plans for partial aggregation. They will be chosen as usual based on
the parallel-safety of the underlying aggregate function.
From PG12 on, CREATE OR REPLACE is supported for aggregates,
therefore, since we have dropped support for PG11, we can avoid
going through the rigamarole of having our aggregates in a separate
file from the functions we define to support them. Nor do we need to
handle aggregates separately from other functions as their creation
is now idempotent.