When looking for the Aggref to determine whether partial or full
aggregation is used, get_aggsplit only checked for top-level Aggrefs
in the targetlist, so a targetlist in which all Aggrefs were nested
in other expressions would lead to a crash. The function also
looked for the Aggref only in the targetlist, but in a query with
a HAVING clause the aggregate might not be in the targetlist at all
if it is referenced only in the HAVING clause.
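A minimal sketch of the lookup described above, assuming it walks each
targetlist expression recursively and falls back to the HAVING clause.
The `Node` type, the enums, and the functions `find_aggref` and
`lookup_aggsplit` are hypothetical stand-ins for illustration, not the
PostgreSQL node types or the actual get_aggsplit code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for PostgreSQL expression nodes. */
typedef enum { NODE_CONST, NODE_OPEXPR, NODE_AGGREF } NodeKind;
typedef enum { SPLIT_SIMPLE, SPLIT_PARTIAL } AggSplitKind;

typedef struct Node
{
	NodeKind kind;
	AggSplitKind split;         /* meaningful only for NODE_AGGREF */
	const struct Node *args[2]; /* child expressions; NULL when unused */
} Node;

/* Depth-first search for the first aggregate node inside an expression. */
static const Node *
find_aggref(const Node *expr)
{
	if (expr == NULL)
		return NULL;
	if (expr->kind == NODE_AGGREF)
		return expr;
	for (size_t i = 0; i < 2; i++)
	{
		const Node *found = find_aggref(expr->args[i]);

		if (found != NULL)
			return found;
	}
	return NULL;
}

/*
 * Search the targetlist first, then the HAVING clause. Returns 0 and
 * stores the split mode in *out on success, or -1 if no aggregate is
 * found, so the caller never dereferences a missing node.
 */
static int
lookup_aggsplit(const Node *const *tlist, size_t ntlist, const Node *having,
				AggSplitKind *out)
{
	const Node *agg = NULL;

	for (size_t i = 0; i < ntlist && agg == NULL; i++)
		agg = find_aggref(tlist[i]);
	if (agg == NULL)
		agg = find_aggref(having);
	if (agg == NULL)
		return -1;
	*out = agg->split;
	return 0;
}
```

With this shape, an aggregate wrapped in an operator expression and an
aggregate that appears only in HAVING are both found, and the
no-aggregate case is reported as an error code instead of crashing.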

Fixes #3664
Fixes #3672

To improve remote query push down, do the following:
* Import changes to remote cost estimates from PostgreSQL 14
`postgres_fdw`. The cost estimations for distributed (remote)
queries were originally based on the `postgres_fdw` code in
PG11. However, fixes and changes have been applied in newer
PostgreSQL versions, which improve, among other things, the costing
of sorts and `HAVING` clauses.
* Increase the cost of transferring tuples. This penalizes doing
grouping/aggregation on the access node (AN), since that requires
transferring more tuples, so the planner prefers the push-down plans.
* As a result of the above, the improved costing also makes
distributed queries behave similarly across all currently supported
PostgreSQL versions for our test cases.
* Enable `dist_query` tests on PG14 (since they now pass).
* Update the `dist_partial_agg` test to use additional ordering
columns so that there is no diff in the test output due to ordering
of input to the `first` and `last` functions.
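
The effect of the higher tuple transfer cost can be illustrated with a
toy cost model. This is only a sketch: the `PlanInput` struct, the
function names, and all constants are hypothetical and are not the
planner's actual cost formulas.

```c
#include <assert.h>

/* Toy cost model; every name and constant here is hypothetical. */
typedef struct
{
	double rows;              /* tuples produced by the data node scan */
	double groups;            /* tuples left after grouping */
	double pushdown_overhead; /* extra cost charged to the push-down plan */
} PlanInput;

/* Ship every row to the access node and aggregate there. */
static double
cost_access_node_agg(const PlanInput *in, double transfer_cost_per_tuple)
{
	return in->rows * transfer_cost_per_tuple;
}

/* Aggregate on the data node and ship only one tuple per group. */
static double
cost_pushdown_agg(const PlanInput *in, double transfer_cost_per_tuple)
{
	return in->groups * transfer_cost_per_tuple + in->pushdown_overhead;
}

/* Returns 1 when the push-down plan is the cheaper of the two. */
static int
prefers_pushdown(const PlanInput *in, double transfer_cost_per_tuple)
{
	return cost_pushdown_agg(in, transfer_cost_per_tuple) <
		   cost_access_node_agg(in, transfer_cost_per_tuple);
}
```

In this model the access-node plan pays the transfer cost for every
row while the push-down plan pays it only per group, so raising the
per-tuple cost widens the gap in favor of push-down.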

This change introduces two ways of fetching data from data nodes: one
using cursors and the other using row-by-row mode. The major
benefit of row-by-row mode is that it enables running parallel plans
on data nodes. The default data fetcher uses row-by-row mode. A new
GUC, `timescaledb.remote_data_fetcher`, has been added to enable
switching between these two implementations (`rowbyrow` or `cursor`).

This change modifies the `timescale_fdw` to allow aggregates on a subset
of partitioning dimensions to be pushed down to data nodes. The
deparsing code has also been modified to wrap the pushed aggregate in
a `_timescaledb_internal.partialize_agg` call, which returns the
array of values that PostgreSQL can combine into the finalized
value.
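
The wrapping step can be sketched as a small string helper. The
function `wrap_partial_agg` and the column name `temp` are
hypothetical; the actual deparser works on expression nodes and string
buffers rather than plain C strings.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: wrap an already-deparsed aggregate expression in
 * a partialize_agg call. Returns 0 on success, or -1 if the result does
 * not fit in buf.
 */
static int
wrap_partial_agg(const char *agg_sql, char *buf, size_t buflen)
{
	int n = snprintf(buf, buflen,
					 "_timescaledb_internal.partialize_agg(%s)", agg_sql);

	return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

For example, passing `avg(temp)` yields
`_timescaledb_internal.partialize_agg(avg(temp))` as the expression
sent to the data node.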