69 Commits

Author SHA1 Message Date
Alexander Kuzmenkov
5c0110cbbf Mark partialize_agg as parallel safe
Postgres knows whether a given aggregate is parallel-safe, and creates
parallel aggregation plans based on that. `partialize_agg` is a
wrapper we use to perform partial aggregation on data nodes. It is a
pure function that produces serialized aggregation state as a result.
Being pure, it doesn't influence parallel safety. This means we don't
need to mark it parallel-unsafe to artificially disable the parallel
plans for partial aggregation. They will be chosen as usual based on
the parallel-safety of the underlying aggregate function.
2022-05-31 14:53:58 +05:30
Sven Klemm
1fbe2eb36f Support intervals with month component when constifying now()
When dealing with intervals with a month component, timezone changes
can result in differences of multiple days in the outcome of these
calculations due to different month lengths. When dealing with
months we add a 7-day safety buffer.
For all these calculations it is fine if we exclude fewer chunks
than strictly required for the operation; additional exclusion
with exact values will happen in the executor. But under no
circumstances must we exclude too much, because there would be
no way for the executor to get those chunks back.
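
The asymmetry described above can be illustrated with a toy model. This is a Python sketch rather than the planner's C code; the chunk names, the integer values, and both function names are made up for illustration. A plan-time bound that is looser than the exact bound still gives the right answer once the executor filters rows, while a bound that is too tight loses rows for good.

```python
def plan_time_exclusion(chunks, plan_bound):
    """Keep only chunks that may contain values greater than plan_bound."""
    return {name: rows for name, rows in chunks.items() if max(rows) > plan_bound}


def executor_filter(chunks, exact_bound):
    """Apply the exact qual to every row of the chunks that survived planning."""
    return sorted(v for rows in chunks.values() for v in rows if v > exact_bound)


# Hypothetical chunks holding integer "timestamps".
chunks = {"c1": [1, 5, 9], "c2": [10, 15, 19], "c3": [20, 25, 29]}
exact_bound = 12

# Reference result: no plan-time exclusion at all.
expected = executor_filter(chunks, exact_bound)

# Loose plan-time bound (9 < 12): c1 is excluded, nothing needed is lost.
loose = executor_filter(plan_time_exclusion(chunks, 9), exact_bound)

# Too-tight plan-time bound (19 > 12): c2 is excluded and its rows are gone.
tight = executor_filter(plan_time_exclusion(chunks, 19), exact_bound)

print(expected)  # [15, 19, 20, 25, 29]
print(loose)     # [15, 19, 20, 25, 29]
print(tight)     # [20, 25, 29]
```

In this model the safety buffer plays the role of moving the plan-time bound from the tight side to the loose side.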
2022-05-30 18:02:58 +02:00
Sven Klemm
12574dc8ec Support intervals with day component when constifying now()
The initial patch to use now() expressions during planner hypertable
expansion only supported intervals with no day or month component.
This patch adds support for intervals with day component.

If the interval has a day component, the calculation needs to take
daylight saving time (DST) switches into account, since a day is
not always exactly 24 hours. We mitigate this by adding a safety
buffer to account for these DST switches when dealing with
intervals with a day component. These calculations
will be repeated with exact values during execution.
Since DST switches seem to range between -1 and 2 hours, we set
the safety buffer to 4 hours.
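
A minimal sketch of how such a buffer could be applied, written in Python for illustration only. The 4-hour value comes from the text above; the function name, its parameters, and the simplification that every day of the interval counts as 24 hours are assumptions of this sketch, not the extension's actual code.

```python
from datetime import datetime, timedelta

# 4 hours is the buffer stated in the commit message.
DST_SAFETY_BUFFER = timedelta(hours=4)


def plan_time_lower_bound(plan_now, days=0, time_part=timedelta(0)):
    """Constant for the plan-time copy of `column > now() + interval`.

    Assumption of this sketch: each day of the interval counts as 24 hours,
    and the buffer only ever moves the bound earlier, so fewer chunks are
    excluded at plan time, never more.
    """
    nominal = plan_now + timedelta(days=days) + time_part
    if days != 0:
        return nominal - DST_SAFETY_BUFFER
    return nominal


plan_now = datetime(2022, 5, 28, 10, 0, 0)

# Interval without a day component: no buffer is needed.
no_day = plan_time_lower_bound(plan_now, time_part=-timedelta(minutes=5))

# now() - '1 day': the bound is pushed 4 hours further into the past.
one_day = plan_time_lower_bound(plan_now, days=-1)

print(no_day)   # 2022-05-28 09:55:00
print(one_day)  # 2022-05-27 06:00:00
```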

This patch also refactors the tests since the previous tests
made it hard to tell that the feature was working once the constified
values had been removed from the plans.
2022-05-28 10:02:33 +02:00
Sven Klemm
dcb7dcc506 Remove constified now() constraints from plan
Commit 35ea80ff added an optimization to enable expressions with
now() to be used during plan-time chunk exclusion by constifying
the now() expression. The added constified constraints were left
in the plan even though they were only required during the
hypertable expansion. This patch marks those constified constraints
and removes them once they are no longer required.
2022-05-24 17:19:18 +02:00
Sven Klemm
eab4efa323 Move metrics_dist1 out of shared_setup
The table metrics_dist1 was only used by a single test and therefore
should not be part of shared_setup but instead be created in the
test that actually uses it. This reduces the execution time of
regresscheck-shared when that test is not run.
2022-05-19 21:33:33 +02:00
Sven Klemm
43c8e51510 Fix Var handling for Vars of different level in constify_now
This patch fixes the constify_now optimization to ignore Vars of
a different level. Previously this could lead to an assertion
failure because the varno of such a Var might be bigger
than the number of entries in the rangetable. Found by sqlsmith.
2022-05-19 11:45:17 +02:00
Sven Klemm
4988dac273 Fix sqlsmith CI workflow
Commit 3b35da76 changed the setup script for regresscheck-shared
so that it was no longer directly usable by the sqlsmith workflow.
This patch sets TEST_DBNAME at the top of the script so it is easier
to use the script outside of the regression check environment.
2022-05-18 11:47:07 +02:00
Sven Klemm
35ea80ffdf Enable now() usage in plan-time chunk exclusion
This implements an optimization to allow now() expressions to be
used during plan-time chunk exclusion. Since now() is stable it
would not normally be considered for plan-time chunk exclusion.
To enable this behaviour we convert `column > now()` expressions
into `column > const AND column > now()`. Assuming that time
always moves forward this is safe even for prepared statements.
This optimization works for SELECT, UPDATE and DELETE.
On hypertables with many chunks this can lead to a considerable
speedup for certain queries.

The following expressions are supported:
- column > now()
- column >= now()
- column > now() - Interval
- column > now() + Interval
- column >= now() - Interval
- column >= now() + Interval

Interval must not have a day or month component as those depend
on timezone settings.
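
The rewrite can be pictured with a small model. This is a Python sketch, not the planner code: the chunk names, their time ranges, and both function names are hypothetical. The constant is taken from the clock at plan time, and because execution never happens before planning, any row that satisfies `column > now() - interval` at execution time also satisfies `column > const`.

```python
from datetime import datetime, timedelta


def constify_lower_bound(plan_time_now, offset=timedelta(0)):
    """Constant used in the added `column > const` qual."""
    return plan_time_now + offset


def chunks_to_scan(chunks, const_bound):
    """Keep chunks whose time range can still hold rows newer than const_bound.

    Each chunk is (name, range_start, range_end) with an exclusive range_end.
    """
    return [name for name, _start, end in chunks if end > const_bound]


plan_now = datetime(2022, 5, 17, 12, 0, 0)
chunks = [
    ("chunk_1", datetime(2022, 5, 15), datetime(2022, 5, 16)),
    ("chunk_2", datetime(2022, 5, 16), datetime(2022, 5, 17)),
    ("chunk_3", datetime(2022, 5, 17), datetime(2022, 5, 18)),
]

# WHERE time > now() - '5m'::interval, constified at plan time.
bound = constify_lower_bound(plan_now, -timedelta(minutes=5))
selected = chunks_to_scan(chunks, bound)
print(selected)  # ['chunk_3']

# A prepared statement executed an hour later has a stricter exact bound,
# so the plan-time constant can only have excluded too little, not too much.
exec_bound = (plan_now + timedelta(hours=1)) - timedelta(minutes=5)
print(exec_bound >= bound)  # True
```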

Some microbenchmarks to show the improvements; each timing is the
best of five runs.

-- hypertable with 1k chunks
-- with optimization
select * from metrics1k where time > now() - '5m'::interval;
Time: 3.090 ms

-- without optimization
select * from metrics1k where time > now() - '5m'::interval;
Time: 145.640 ms

-- hypertable with 5k chunks
-- with optimization
select * from metrics5k where time > now() - '5m'::interval;
Time: 4.317 ms

-- without optimization
select * from metrics5k where time > now() - '5m'::interval;
Time: 775.259 ms

-- hypertable with 10k chunks
-- with optimization
select * from metrics10k where time > now() - '5m'::interval;
Time: 4.853 ms

-- without optimization
select * from metrics10k where time > now() - '5m'::interval;
Time: 1766.319 ms (00:01.766)

-- hypertable with 20k chunks
-- with optimization
select * from metrics20k where time > now() - '5m'::interval;
Time: 6.141 ms

-- without optimization
select * from metrics20k where time > now() - '5m'::interval;
Time: 3321.968 ms (00:03.322)

Speedup with 1k chunks: 47x
Speedup with 5k chunks: 179x
Speedup with 10k chunks: 363x
Speedup with 20k chunks: 540x
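
The speedup figures can be re-derived from the timings listed above. The short Python check below copies the measured values from this message; only the variable names are made up. It suggests the reported factors are truncated rather than rounded (for example, 1766.319 / 4.853 is about 363.96).

```python
# (with optimization, without optimization) in milliseconds, copied from above.
timings_ms = {
    "1k": (3.090, 145.640),
    "5k": (4.317, 775.259),
    "10k": (4.853, 1766.319),
    "20k": (6.141, 3321.968),
}

speedups = {
    chunks: without_opt / with_opt
    for chunks, (with_opt, without_opt) in timings_ms.items()
}

for chunks, factor in speedups.items():
    # int() truncates toward zero, which reproduces the reported numbers.
    print(f"Speedup with {chunks} chunks: {int(factor)}x")
```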
2022-05-17 21:47:39 +02:00
Fabrízio de Royes Mello
047d6b175b Revert "Pushdown of gapfill to data nodes"
This reverts commit eaf3a38fe9553659e515fac72aaad86cf1a06d1e.
2022-05-16 15:21:32 -03:00
Fabrízio de Royes Mello
4083e48a1c Revert "Add missing gitignore entry"
This reverts commit 57411719fb1f5e4d5863089bb4b840abea3bc3db.
2022-05-16 15:21:32 -03:00
Alexander Kuzmenkov
3b35da7607 More tests for errors when fetching from data nodes
Add a special function that allows injecting these errors.
2022-05-16 18:57:42 +05:30
Alexander Kuzmenkov
6e26a1187a Use binary format in row-by-row fetcher
The general idea is to have two types of fetcher: "fast" and "general
purpose". We use the row-by-row fetcher as the "fast" one. This commit
removes support for the text protocol in this fetcher, because it's only
needed for some niche types that don't have a binary serialization, and
it is also slower than the binary one. Because the row-by-row fetcher now
only understands the binary protocol, we must check that binary
serialization is actually available for the participating data types.
If not, we have to revert to using the cursor fetcher unless row-by-row
was explicitly requested by the user. This happens at execution time,
specifically at the creation of the TupleFactory, because that's when we
look up the conversion functions.
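
The fallback rule in the paragraph above can be summarised as a small decision function. This is a Python sketch of the described behaviour, not the C implementation; the function name, the string labels for the fetchers, and the example type names are all assumptions made for illustration.

```python
def fetcher_at_execution(planned, explicitly_requested, column_types, binary_capable):
    """Pick the fetcher once the participating data types are known.

    planned:              "rowbyrow" or "cursor" (labels are hypothetical)
    explicitly_requested: True if the user asked for row-by-row themselves
    column_types:         types of the columns being fetched
    binary_capable:       set of types that have a binary serialization
    """
    if planned != "rowbyrow":
        return planned

    all_binary = all(t in binary_capable for t in column_types)
    if all_binary:
        return "rowbyrow"

    # Some type lacks binary serialization: fall back to the cursor fetcher,
    # unless the user insisted on row-by-row. The commit message does not say
    # what happens next in that case, so this sketch just keeps the request.
    return "rowbyrow" if explicitly_requested else "cursor"


binary_capable = {"int4", "timestamptz", "float8"}  # hypothetical lookup result

print(fetcher_at_execution("rowbyrow", False, ["int4", "float8"], binary_capable))
# rowbyrow
print(fetcher_at_execution("rowbyrow", False, ["int4", "text_only_type"], binary_capable))
# cursor
print(fetcher_at_execution("rowbyrow", True, ["int4", "text_only_type"], binary_capable))
# rowbyrow
```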

The rest of the commit is removing the text protocol support
from row-by-row, plus EXPLAIN changes (we don't know the fetcher type
at the planning stage anymore, so not showing it).
2022-05-06 22:13:17 +05:30
Fabrízio de Royes Mello
57411719fb Add missing gitignore entry
Pull request #4033 introduced a new template SQL test file but did not
add the proper gitignore entry to ignore generated test files.
2022-04-08 18:41:10 -03:00
Rafia Sabih
eaf3a38fe9 Pushdown of gapfill to data nodes
Allow the calls of time_bucket_gapfill to be executed at the
data nodes for improved query performance. With this, time_bucket_gapfill
is pushed to data nodes under the following conditions:

1. when only one data node has all the chunks
2. when space dimension does not overlap across data nodes
3. when group-by matches space dimension
2022-04-07 21:09:49 +02:00
Sven Klemm
06d8375594 Enhance extension function test
This patch changes the extension function list to include the
signature as well since functions with different signature are
separate objects in postgres. This also changes the list to include
all functions. Even though functions in internal schemas are not
considered public API, they still need to be treated the same as functions
in other schemas with regard to extension upgrade/downgrade.

This patch also moves the test to regresscheck-shared since we do
not need a dedicated database to run these tests.
2022-03-10 11:22:33 +01:00
Sven Klemm
5c22ef3da2 Rename continuous aggregate tests
Change the prefix for continuous aggregate tests from
continuous_aggs_ to cagg_. This is similar to commit 6a8c2b66
which did this adjustment for isolation tests because we were
running into length limitations for the spec name. This patch
adjusts the remaining tests to be consistent with the naming
used in isolation tests.
2022-01-24 14:12:56 +01:00
Sven Klemm
29856fd0ac Eliminate float rounding instabilities in interpolate
When interpolating float values the result of the calculation
might be unstable for certain values when y0 and y1 are equal.
This patch short circuits the formula and returns y0 immediately
when y0 and y1 are identical.
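
A minimal sketch of the short circuit in Python. The weighted formula used here is one common way to write linear interpolation and is an assumption of this example; the commit message does not spell out the exact expression used by the extension.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linearly interpolate y at x between the points (x0, y0) and (x1, y1)."""
    # Short circuit: with equal endpoints the result is known exactly, so no
    # floating-point arithmetic (and no rounding) is involved at all.
    if y0 == y1:
        return y0
    # Assumed formula, not taken from the commit.
    return (y0 * (x1 - x) + y1 * (x - x0)) / (x1 - x0)


# Equal endpoints: every x yields exactly y0.
flat = [interpolate(x, 0, 0.1, 3, 0.1) for x in (0, 1, 2, 3)]
print(flat)  # [0.1, 0.1, 0.1, 0.1]

# Ordinary case still goes through the formula.
print(interpolate(5, 0, 0.0, 10, 10.0))  # 5.0
```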

Fixes #1528
2022-01-24 13:33:26 +01:00
Sven Klemm
39645d56da Fix subtract_integer_from_now on 32-bit platforms
This patch fixes subtract_integer_from_now on 32-bit platforms,
improves error handling and adds some basic tests.
subtract_integer_from_now would trigger an assert when called
on a hypertable without integer time dimension (found by sqlsmith).
Additionally subtract_integer_from_now would segfault when called
on a hypertable without partitioning dimensions.
2021-12-20 10:02:57 +01:00
Sven Klemm
a760887145 Fix projection handling in gapfill
When getting the next tuple from the subplan, gapfill would apply
the projection to it, which was incorrect since the subplan already
did the projection. The projection for the gapfill tuple has to
be done when the tuple is handed to the parent node.

Fixes #3834
2021-12-17 23:58:43 +01:00
Fabrízio de Royes Mello
244568f23a Add regression tests for caggs+compression
Closes timescale/timescaledb-private#962
2021-12-17 10:51:33 -05:00
Sven Klemm
7f494077ed Fix DataNodeScan plans with One-Time Filter
When a query has a filter that only needs to be evaluated once per
query it will be represented as a Result node with the filter
condition on the Result node and the actual query as child of the
Result node. find_data_node_scan_state_child did not consider a
Result node as a valid node to contain a DataNodeScan node, leading
to an `unexpected child node of Append or MergeAppend: 62` error for
queries that had a one-time filter with a subquery.
2021-12-13 21:10:30 +01:00
Sven Klemm
1b4780df31 Fix assertion failure in cursor_fetcher_rewind
The code in cursor_fetcher_rewind asserted that there always
is an associated request which is not true if EOF was reached
already. Found by sqlsmith.

Fixes #3786
2021-12-13 20:03:57 +01:00
Alexander Kuzmenkov
0f81a60cbb Use row-by-row fetcher to enable parallel plans on data nodes
The row-by-row fetcher is more efficient, so we want to use it when we
can -- that is, when we have to read only one table from the data
node, without interleaving it with anything else. This patch adds an
option of choosing the fetcher type automatically. It detects the
simplest case of only one distributed table in the entire query, and
enables row-by-row fetcher. For other cases, the cursor fetcher is
used.
2021-12-10 14:40:34 +03:00
Alexander Kuzmenkov
f1e103fab1 Fix DISTINCT ON queries for distributed hypertables
Previously, we would push DISTINCT ON down to the data nodes even when
the pathkeys of the resulting paths on the data nodes were not
compatible with the given DISTINCT ON columns. This commit disables
pushdown when the sorting is not compatible.

Fixes #3784
2021-11-17 15:42:40 +03:00
Fabrízio de Royes Mello
d117d8772f Add missing gitignore entry
Pull request #3717 introduced a new template SQL test file but did not
add the proper gitignore entry to ignore generated test files.
2021-10-27 12:37:30 -03:00
Sven Klemm
acc6abee92 Support transparent decompression on individual chunks
This patch adds support for transparent decompression in queries
on individual chunks.
This is required for distributed hypertables with compression
when enable_per_data_node_queries is set to false. Without
this functionality queries on distributed hypertables with
compression would not return data for compressed chunks as
the generated FDW queries would target individual chunks.

Fixes #3714
2021-10-20 20:42:21 +02:00
Fabrízio de Royes Mello
f25e795ec8 Add regression tests for Memoize Node
PostgreSQL 14 introduced a new `Memoize Node` that serves as a cache of
results from parameterized nodes.

We should make sure it works correctly together with the ChunkAppend
custom node over hypertables (compressed and uncompressed).

Closes #3684
2021-10-15 19:20:33 -03:00
Sven Klemm
4d425d9470 Disable memoize node for append and transparent_decompression tests
With memoize enabled PG14 append tests produce a very different
plan compared to previous PG versions. To make comparing plans
between PG versions easier we disable memoize for PG14.
PG14 also modified how EXTRACT is shown in EXPLAIN output
so any query using EXTRACT will have different EXPLAIN output
between PG14 and previous versions.
2021-10-09 00:15:23 +02:00
Erik Nordström
f071f89ade Fix issues in the dist_chunk test
This change fixes issues in the shared `dist_chunk` test that caused
flakiness. Since the test shares a database and state with other tests
running in parallel, it should not modify the database (e.g., by creating
new tables and chunks) while the test runs. Such modifications will cause
non-deterministic behavior that varies depending on the order the
tests are run in.

To fix this issue, all the table creations have been moved into the
shared setup script and the test itself has been made less dependent
on hard-coded IDs and chunk names. One of the tables used in the test
has been changed to use space-partitioning to make chunk placement on
nodes more predictable.
2021-07-29 16:53:12 +03:00
Nikhil
3651e6e102 Move related tests into dist_chunk
The "chunk_drop_replica" test is part of the overall chunk copy/move
functionality. All related tests will go into this dist_chunk test.

Also, fix the earlier flakiness in the dist_chunk test by moving the
compress_chunk calls into the shared setup.
2021-07-29 16:53:12 +03:00
Nikhil
762053431e Implement drop_chunk_replica API
This function drops a chunk on a specified data node. It then removes
the metadata about the data node/chunk association on the access node.

This function is meant for internal use as part of the "move chunk"
functionality.

If only one chunk replica remains then this function refuses to drop it
to avoid data loss.
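
The safety check can be sketched as follows. This is a Python model of the described behaviour only: the dictionary standing in for the access node's metadata, the function signature, and the chunk and node names are hypothetical and do not reflect the real API.

```python
def drop_chunk_replica(replicas, chunk, data_node):
    """Remove `data_node` from the set of nodes holding `chunk`.

    `replicas` maps a chunk name to the set of data nodes that store it,
    standing in for the chunk/data node metadata on the access node.
    """
    nodes = replicas[chunk]
    if data_node not in nodes:
        raise ValueError(f"chunk {chunk} has no replica on {data_node}")
    if len(nodes) <= 1:
        # Dropping the only remaining copy would lose the data.
        raise RuntimeError(f"refusing to drop the last replica of {chunk}")
    nodes.remove(data_node)


replicas = {"chunk_1": {"dn1", "dn2"}}

drop_chunk_replica(replicas, "chunk_1", "dn1")
print(replicas)  # {'chunk_1': {'dn2'}}

try:
    drop_chunk_replica(replicas, "chunk_1", "dn2")
    refused = False
except RuntimeError:
    refused = True
print(refused)  # True
```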
2021-07-29 16:53:12 +03:00
Ruslan Fomkin
404f1cdbad Create chunk table from access node
Creates a table for chunk replica on the given data node. The table
gets the same schema and name as the chunk. The created chunk replica
table is not added into metadata on the access node or data node.

The primary goal is to use it during copy/move chunk.
2021-07-29 16:53:12 +03:00
Sven Klemm
41261e4e58 Stabilize timestamp_limits test output
Remove the chunk name completely from output as the name might have
different length leading to different output as table headers are
adjusted to the length of the field values.
2021-06-23 19:32:46 +02:00
Sven Klemm
7b67f72f86 Move timestamp_limits and with_clause_parser test
Move the timestamp_limits and with_clause_parser test to
regresscheck-shared since those tests don't need a private
database; this incurs less overhead when running them.
Also add missing ORDER BY clauses to some of the queries
in timestamp_limits to make the output more stable.
2021-06-23 14:51:09 +02:00
Mats Kindahl
71e8f13871 Add workflow and CMake support for formatting
Add a workflow to check that CMake files are correctly formatted as
well as a custom target to format CMake files in the repository. This
commit also runs the formatting on all CMake files in the repository.
2021-06-17 22:52:29 +02:00
Sven Klemm
110d77a2fe Combine test files
Merge test files that no longer need to be version specific
after the removal of PG11 support.
2021-06-01 20:21:06 +02:00
Sven Klemm
22ceabcb83 Remove PG11 specific test output files
2021-06-01 20:21:06 +02:00
Nikhil
a3d8f9fecd Make SELECT DISTINCT handle non-var targetlists
The current SELECT DISTINCT pushdown code assumed that the targetlist
will always contain references to column attributes of the target
table.

So, even a simple "SELECT DISTINCT 1 from table;" causes a segmentation
fault because the "varno" field is not assigned. Fix this oversight.

Issue reported by @svenklemm

Fixes timescale/timescaledb-private#920
2021-04-22 20:19:35 +05:30
Ruslan Fomkin
507b5e5c15 Ignore generated SQL test files from templates
The `index` test was moved into a SQL test template in #3123 and
`generated_columns` in #2927. Both are added to the relevant gitignore.
2021-04-21 12:26:19 +02:00
Nikhil
425cdd16e4 Add "SELECT DISTINCT" pushdown in multi-node
Construct "SELECT DISTINCT target_list" or "SELECT DISTINCT ON (col1,
col..) target_list" statement to push down the DISTINCT clause to the
remote side.

We only allow references to basic "Vars" or constants in the DISTINCT
expressions.

So, "SELECT DISTINCT col1" is fine but "SELECT DISTINCT 2*col1" is not.

"SELECT DISTINCT col1, 'const1', NULL, col2" which is a mix of column
references and constants is also supported. Everything else is not
supported.
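
The eligibility rule above can be modelled in a few lines of Python. The tagged tuples used to represent expressions and the function name are assumptions of this sketch; the real check works on PostgreSQL expression nodes.

```python
def can_push_down_distinct(distinct_exprs):
    """True if every DISTINCT expression is a plain column or a constant.

    Each expression is modelled as a (kind, text) pair, where kind is one of
    "var", "const", or "expr" (anything more complex).
    """
    return all(kind in ("var", "const") for kind, _text in distinct_exprs)


# SELECT DISTINCT col1
plain_column = [("var", "col1")]

# SELECT DISTINCT 2*col1
computed = [("expr", "2*col1")]

# SELECT DISTINCT col1, 'const1', NULL, col2
mixed = [("var", "col1"), ("const", "'const1'"), ("const", "NULL"), ("var", "col2")]

print(can_push_down_distinct(plain_column))  # True
print(can_push_down_distinct(computed))      # False
print(can_push_down_distinct(mixed))         # True
```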

This pushdown also needs to work when
timescaledb.enable_per_data_node_queries is disabled.

All existing test cases in which "SELECT DISTINCT" is now being pushed
down have been modified. New test cases have been added to check that
the remote side uses "Skip Scans" as is suitable in some cases.
2021-04-21 12:53:53 +05:30
Sven Klemm
f54fc16fed Fix gapfill/hashagg planner interaction
The hashagg optimization adds a hashagg plan with modified costs
to the list of paths. This happens after the gapfill paths have been
created, so the newly created paths would miss the GapFill node.
If those paths turned out to be cheaper than the other
paths, GapFill would fail to work as no GapFill node would be
executed.
This patch changes hashagg path creation to skip adding paths
when there is a GapFill path.

Fixes #3048
2021-03-26 14:23:58 +01:00
gayyappan
0d99592b20 Add tests for ALTER COLUMN .. DROP EXPRESSION
This PR adds tests for dropping defaults for generated
columns. This is done by executing
ALTER TABLE .. ALTER COLUMN .. DROP EXPRESSION
This feature is introduced in PG13.
2021-02-12 12:32:28 -05:00
Sven Klemm
b95e93a651 Run regresscheck-shared on PG13
This patch enables the regresscheck-shared testsuite to run on PG13
2021-02-10 12:59:39 +01:00
Erik Nordström
36c1cd849a Fix corruption in gapfill plan
This change fixes a bug with gapfill that caused certain query plans
to fail with the error "could not find pathkey item to sort". This was
caused by a corruption in the query plan which happened as a result of
removing the last two arguments to the `time_bucket_gapfill` function
expression during query planning. Since the function expression was
modified via a reference to the original query plan tree, it affected
also the expression in the target list. When the planner couldn't
match the target list with the corresponding equivalence member (which
still included the two removed arguments), the error was generated.

The original reason for removing the two arguments was to avoid
passing them on to `time_bucket`, which is called internally by
`time_bucket_gapfill`. However, the last two arguments aren't passed on
anyway, so it isn't necessary to modify the original argument list.
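
The aliasing problem described above can be reproduced in miniature. This Python sketch only models the effect of changing a shared object through one reference; the dictionaries, the argument placeholders, and the variable names are made up and do not mirror PostgreSQL's node structures. Note that the fix in this commit was to stop modifying the argument list at all, not to copy it.

```python
import copy


def make_call():
    # Placeholder argument names; only the count of four matters here.
    return {"func": "time_bucket_gapfill",
            "args": ["bucket_width", "ts", "start", "finish"]}


# Pattern with the bug: the target list and the planner share one object.
call = make_call()
target_list = [call]                      # same object, not a copy
equivalence_member = copy.deepcopy(call)  # still has all four arguments
call["args"] = call["args"][:2]           # trimming through one reference
mismatch = target_list[0] != equivalence_member
print(mismatch)  # True: the target list entry silently changed too

# Leaving the original untouched keeps both views consistent.
call = make_call()
target_list = [call]
equivalence_member = copy.deepcopy(call)
trimmed = copy.deepcopy(call)             # work on a private copy instead
trimmed["args"] = trimmed["args"][:2]
still_matches = target_list[0] == equivalence_member
print(still_matches)  # True
```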

Fixes #2232
2021-01-27 23:41:54 +01:00
Mats Kindahl
b1dc0305d3 Fix path to dist_gapfill
The `TEST_OUTPUT_DIR` for shared tests `dist_gapfill` is incorrect
inside the test file (when called from `pg_regress`) because
`TEST_OUTPUT_DIR` is set to the parent directory rather than the
subdirectory.

This commit fixes the paths in `dist_gapfill`.
2021-01-22 22:18:18 +01:00
Ruslan Fomkin
3448bcf2af Move gapfill tests into using shared database
2020-12-15 13:16:53 +01:00
Mats Kindahl
9b5f20dd74 Fix ABI check build dependencies and tests
Some tests contain code that only works if the build is based on a Git
clone, which caused these tests to fail when Git was not available.
This commit splits out those tests and only enables them if Git is
found and Git information can be retrieved.
2020-12-07 18:10:26 +01:00
Sven Klemm
dc913ef0d4 Fix DecompressChunk path generation
The non-parallel paths generated by DecompressChunk were
incorrectly marked as parallel_safe even when the child scan
was not parallel aware, leading to incorrect query results
when those paths were used in a parallel plan.
Additionally, the DecompressChunk code didn't set total_table_pages on
PlannerInfo, leading to an assertion failure in BitmapHeapScan
path creation.
2020-10-19 22:10:12 +02:00
Sven Klemm
87f78b4844 Move distributed insert tests to shared test
Change the distributed insert test to shared test so it can run in
parallel and doesn't require dedicated distributed setup.
2020-10-13 14:22:17 +02:00
Dmitry Simonenko
a51aa6d04b Move enterprise features to community
This patch removes enterprise license support and moves
move_chunk() function under community license (TSL).

The licensing validation code has been reworked and simplified.
The previously used timescaledb.license_key GUC has been renamed to
timescaledb.license.

This change also makes the testing code stricter about the
license in use. The Apache test suite can now test only Apache-licensed
functions.

Fixes #2359
2020-09-30 15:14:17 +03:00