115 Commits

Author SHA1 Message Date
Dmitry Simonenko
ea5038f263 Add connection cache invalidation ignore logic
Calling the `ts_dist_cmd_invoke_on_data_nodes_using_search_path()` function
without an active transaction allows a connection invalidation event to
happen between applying `search_path` and the actual command
execution, which leads to an error.

This change introduces a way to ignore connection cache invalidations
using the `remote_connection_cache_invalidation_ignore()` function.

This work is based on @nikkhils's original fix and research into the problem.

Fix #4022
2022-10-04 10:50:45 +03:00
Konstantina Skovola
9bd772de25 Add interface for troubleshooting job failures
This commit gives more visibility into job failures by making the
information regarding a job runtime error available in an extension
table (`job_errors`) that users can directly query.
This commit also adds an informational view on top of the table for
convenience.
To prevent the `job_errors` table from growing too large,
a retention job is also set up with a default retention interval
of 1 month. The retention job is registered with a custom check
function that requires that a valid "drop_after" interval be provided
in the config field of the job.
2022-09-30 15:22:27 +02:00
Sven Klemm
1d4b9d6977 Fix join on time column of compressed chunk
Do not allow paths that are parameterized on a
compressed column to exist when creating paths
for a compressed chunk.
2022-09-29 10:36:02 +02:00
Sven Klemm
940187936c Fix segfault when INNER JOINing hypertables
This fixes a segfault when INNER JOINing two hypertables that are
ordered by time.
2022-09-28 17:12:45 +02:00
Sven Klemm
2529ae3f68 Fix chunk exclusion for prepared statements and dst changes
The code that constifies TIMESTAMPTZ expressions during chunk
exclusion did not account for daylight saving time switches, which
lead to different calculation outcomes when the timezone changes.
This patch adds a 4 hour safety buffer to any such calculations.
2022-09-22 18:16:20 +02:00
Sven Klemm
ffd9dfb7eb Fix assertion failure in constify_now
The code added to support VIEWs did not account for the fact that
varno could be from a different nesting level and therefore not
be present in the current range table.
2022-09-16 17:40:03 +02:00
Alexander Kuzmenkov
fee27484ce Do not use row-by-row fetcher for parameterized plans
We have to prepare the data node statement in this case, and COPY
queries don't work with prepared statements.
2022-09-15 22:59:06 +03:00
Sven Klemm
d2baef3ef3 Fix planner chunk exclusion for VIEWs
Allow planner chunk exclusion in subqueries. When we decide
whether a query may benefit from constifying now() and encounter a
subquery, peek into the subquery and check if the constraint
references a hypertable partitioning column.

Fixes #4524
2022-09-12 17:29:14 +02:00
Bharathy
b869f91e25 Show warnings during create_hypertable().
The schema of the base table on which a hypertable is created should define
columns with proper data types. As per the PostgreSQL best practices wiki
(https://wiki.postgresql.org/wiki/Don't_Do_This), one should not define
columns with CHAR, VARCHAR or VARCHAR(N), and should use the TEXT data type
instead. Similarly, one should use timestamptz instead of timestamp.
This patch reports a WARNING to the end user when creating a hypertable
if the underlying parent table has columns of the above-mentioned data types.

Fixes #4335
2022-09-12 18:47:47 +05:30
Sven Klemm
a26a5974dc Improve space constraint exclusion datatype handling
This patch adjusts the operator logic for valid space dimension
constraints to no longer look for an exact match on both sides
of the operator but instead allow mismatched datatypes.

Previously a constraint like `col = value` would require `col`
and `value` to have matching datatypes. With this change `col` and
`value` can have different datatypes as long as they have an equality
operator in a btree family.

Mismatched datatypes commonly occur when using int8 columns
and comparing them with integer literals. Integer literals default
to int4 so the datatypes would not match unless special care has
been taken in writing the constraints and therefore the optimization
would never apply in those cases.
2022-09-11 10:57:54 +02:00
Sven Klemm
f27e627341 Fix chunk exclusion for space partitions in SELECT FOR UPDATE queries
Since we do not use our own hypertable expansion for SELECT FOR UPDATE
queries we need to make sure to add the extra information necessary to
get hashed space partitions with the native postgres inheritance
expansion working.
2022-09-11 10:57:54 +02:00
Sven Klemm
b34b91f18b Add timezone support to time_bucket_gapfill
This patch adds a new time_bucket_gapfill function that
allows bucketing in a specific timezone.

You can gapfill with explicit timezone like so:
`SELECT time_bucket_gapfill('1 day', time, 'Europe/Berlin') ...`

Unfortunately this introduces an ambiguity with some previous
call variations when an untyped start/finish argument was passed
to the function. Some queries might need to be adjusted and either
explicitly name the positional argument or resolve the type ambiguity
by casting to the intended type.
2022-09-07 16:37:53 +02:00
Dmitry Simonenko
c697700add Add hypertable distributed argument and defaults
This PR introduces a new `distributed` argument to the
create_hypertable() function as well as two new GUCs to
control its default behaviour: timescaledb.hypertable_distributed_default
and timescaledb.hypertable_replication_factor_default.

The main idea of this change is to allow automatic creation
of the distributed hypertables by default.
2022-08-29 17:44:16 +03:00
Fabrízio de Royes Mello
e34218ce29 Migrate Continuous Aggregates to the new format
Timescale 2.7 released a new version of Continuous Aggregates (#4269)
that stores the final aggregation state instead of the byte array of
the partial aggregate state, offering multiple optimization
opportunities as well as a more compact form.

When upgrading to Timescale 2.7, newly created Continuous Aggregates
use the new format, but existing Continuous Aggregates keep
using the format they were defined with.

This commit adds a procedure that migrates existing Continuous
Aggregates from the old format to the new format:

test=# CALL cagg_migrate('conditions_summary_daily');

Closes #4424
2022-08-25 17:49:09 -03:00
Matvey Arye
c43307387e Add runtime exclusion for hypertables
In some cases, entire hypertables can be excluded
at runtime. Some examples:

   WHERE col @> ANY(subselect)
   if the subselect returns empty set

   WHERE col op (subselect)
   if the op is a strict operator and
   the subselect returns empty set.

When qual clauses are not on partition columns, we use
the old chunk exclusion, otherwise we try hypertable exclusion.

Hypertable exclusion is executed once per hypertable.
This is cheaper than chunk exclusion, which is
executed once per chunk.
2022-08-25 13:17:21 -04:00
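The second example in c43307387e (strict operator, empty subselect) can be sketched outside the planner. The following is Python rather than the extension's C code, and `strict`, `scalar_subselect` and `scan_hypertable` are hypothetical names used only for illustration. It assumes the operator passed in is strict and shows why a single check per hypertable is enough to skip every chunk.

```python
def strict(op):
    """Wrap op so that any NULL (None) argument yields NULL, like a strict operator."""
    def wrapped(a, b):
        if a is None or b is None:
            return None
        return op(a, b)
    return wrapped


def scalar_subselect(rows):
    """A scalar subselect that returns no rows evaluates to NULL."""
    return rows[0] if rows else None


def scan_hypertable(chunks, op, subselect_rows):
    """Return (matching values, number of chunks scanned).

    The subselect is evaluated once for the whole hypertable. If it is
    NULL, a strict operator can never return true, so no chunk is scanned.
    """
    rhs = scalar_subselect(subselect_rows)
    if rhs is None:
        return [], 0
    rows = [v for chunk in chunks for v in chunk if op(v, rhs)]
    return rows, len(chunks)


greater = strict(lambda a, b: a > b)
chunks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(scan_hypertable(chunks, greater, []))   # empty subselect: nothing scanned
print(scan_hypertable(chunks, greater, [6]))  # non-empty subselect: all chunks scanned
```

The second call still scans every chunk because this sketch does no per-chunk exclusion; it only models the once-per-hypertable check.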
Sven Klemm
5d934baf1d Add timezone support to time_bucket
This patch adds a new function time_bucket(period,timestamp,timezone)
which supports bucketing for arbitrary timezones.
2022-08-25 12:59:05 +02:00
Konstantina Skovola
dc145b7485 Add parameter check_config to alter_job
Previously users had no way to update the check function
registered with add_job. This commit adds a parameter check_config
to alter_job to allow updating the check function field.

Also, previously the signature expected from a check was of
the form (job_id, config) and there was no validation
that the check function given had the correct signature.
This commit removes the job_id as it is not required and
also checks that the check function has the correct signature
when it is registered with add_job, preventing an error being
thrown at job runtime.
2022-08-25 10:38:03 +03:00
Mats Kindahl
e0f3e17575 Use new validation functions
The old patch used old validation functions, but there are already
validation functions that both read and validate the policy, so use
those instead. Also remove the old `job_config_check` function since it is
no longer used and add a new `job_config_check` that calls the
checking function with the configuration.
2022-08-25 10:38:03 +03:00
Sven Klemm
1c0bf4b777 Support bucketing by month in time_bucket_gapfill
2022-08-22 19:07:32 +02:00
Rafia Sabih
16fdb6ca5e Checks for policy validation and compatibility
When adding or updating policies, check that the
policies are compatible with each other and with
those already on the CAgg.
These checks are:
- refresh and compression policies should not overlap
- refresh and retention policies should not overlap
- compression and retention policies should not overlap

Co-authored-by: Markos Fountoulakis <markos@timescale.com>
2022-08-12 00:55:18 +03:00
Rafia Sabih
088f688780 Miscellaneous
-Add infinity for refresh window range
 To create an open-ended refresh policy,
 use +/- infinity for end_offset and start_offset
 respectively.
-Add remove_all_policies function
 This will remove all the policies on a given
 CAgg.
-Remove parameter refresh_schedule_interval
-Fix downgrade scripts
-Fix IF EXISTS case

Co-authored-by: Markos Fountoulakis <markos@timescale.com>
2022-08-12 00:55:18 +03:00
Rafia Sabih
bca65f4697 1 step CAgg policy management
This simplifies the process of adding policies
for CAggs. Now all the policies for a given CAgg
can be added with a single SQL statement.
Similarly, all the policies can be removed or modified
with a single SQL statement.

This also adds a new function as well as a view to show all
the policies on a continuous aggregate.
2022-08-12 00:55:18 +03:00
Sven Klemm
49b6486dad Change get_git_commit to return full commit hash
This patch changes get_git_commit to always return the full hash.
Since different git versions do not agree on the length of the
abbreviated hash this made the length flaky. To make the length
consistent change it to always be the full hash.
2022-08-01 10:45:17 +02:00
Sven Klemm
eccd6df782 Throw better error message on incompatible row fetcher settings
When a query has multiple distributed hypertables, the row-by-row
fetcher cannot be used. This patch changes the fetcher selection
logic to throw a better error message in those situations.
Previously the following error would be produced in those situations:
unexpected PQresult status 7 when starting COPY mode
2022-07-29 11:40:00 +02:00
Sven Klemm
d5619283f3 Fix gapfill group comparison
The gapfill mechanism to detect an aggregation group change was
using datumIsEqual to compare the group values. datumIsEqual does
not detoast values, so when one value is toasted and the other value
is not, it will not return the correct result. This patch changes
the gapfill code to use the correct equality operator for the type
of the group column instead of datumIsEqual.
2022-07-19 19:14:30 +02:00
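The failure mode described in d5619283f3 can be shown with a loose analogy. The sketch below is Python, not PostgreSQL code: the one-byte flag format and the `store`, `load`, `bytewise_equal` and `type_equal` helpers are hypothetical stand-ins for TOAST storage, datumIsEqual and a type's equality operator. It only illustrates that comparing stored representations can disagree with comparing the values themselves.

```python
import zlib


def store(value: bytes, compress: bool) -> bytes:
    """Hypothetical stored form: a 1-byte flag, then a raw or compressed payload."""
    if compress:
        return b"\x01" + zlib.compress(value)
    return b"\x00" + value


def load(stored: bytes) -> bytes:
    """Recover the original value from its stored form."""
    payload = stored[1:]
    return zlib.decompress(payload) if stored[:1] == b"\x01" else payload


def bytewise_equal(a: bytes, b: bytes) -> bool:
    """Compare stored representations directly, without decoding them."""
    return a == b


def type_equal(a: bytes, b: bytes) -> bool:
    """Decode both sides first, then compare the actual values."""
    return load(a) == load(b)


group_value = b"device-" * 500
compressed = store(group_value, compress=True)
plain = store(group_value, compress=False)

print("bytewise:", bytewise_equal(compressed, plain))
print("by value:", type_equal(compressed, plain))
```

With a bytewise comparison the same group value looks like two different groups, which is the kind of spurious group change the commit removes.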
Sven Klemm
0d175b262e Fix prepared statement param handling in ChunkAppend
This patch fixes the param handling in prepared statements for generic
plans in ChunkAppend making those params usable in chunk exclusion.
Previously those params would not be resolved and therefore not used
for chunk exclusion.

Fixes #3719
2022-07-19 14:50:17 +02:00
Sven Klemm
597b71881a Fix assertion hit in row_by_row_fetcher_close
When executing multinode queries that initialize the row-by-row fetcher
but never execute it, the node cleanup code would hit an assertion
checking the state of the fetcher. Found by sqlsmith.
2022-07-18 09:39:48 +02:00
Alexander Kuzmenkov
1bbb6059cb Add more tests for distributed INSERT and COPY
More interleavings of INSERT/COPY, and test with slow recv() to check
waiting.
2022-07-04 22:38:53 +05:30
gayyappan
6c20e74674 Block drop chunk if chunk is in frozen state
A chunk in frozen state cannot be dropped.
drop_chunks will skip over frozen chunks without erroring.
The internal API drop_chunk will error if you attempt to drop
a chunk without unfreezing it.

This PR also adds a new internal API to unfreeze a chunk.
2022-06-30 09:56:50 -04:00
gayyappan
79bf4f53b1 Add api to associate a hypertable with custom jobs
This PR introduces a new SQL function to associate a
hypertable or continuous aggregate with a custom job. If
this dependency is set up, the job is automatically
deleted when the hypertable/cagg is dropped.
2022-06-23 13:33:33 -04:00
gayyappan
131f58ee60 Add internal api for foreign table chunk
Add _timescaledb_internal.attach_osm_table_chunk.
This treats a pre-existing foreign table as a
hypertable chunk by adding dummy metadata to the
catalog tables.
2022-06-23 10:11:56 -04:00
Nikhil Sontakke
e3b2fbdf15 Fix empty bytea handling with distributed tables
An "empty" bytea value in a column of a distributed table was
returned as "null" when selected. The value was stored correctly on
the data nodes, but the return code path converted it into "null" on
the access node (AN). This is now handled by using the
PQgetisnull() function.

Fixes #3455
2022-06-22 12:25:54 +05:30
Alexander Kuzmenkov
5c69adfb7e Add more tests for errors on data nodes
Use a data type with faulty send/recv functions to test various error
handling paths.
2022-06-21 14:55:14 +05:30
Erik Nordström
19b3f67b9c Drop remote data when detaching data node
Add a parameter `drop_remote_data` to `detach_data_node()` which
allows dropping the hypertable on the data node when detaching
it. This is useful when detaching a data node and then immediately
attaching it again. If the data remains on the data node, the
re-attach will fail with an error complaining that the hypertable
already exists.

The new parameter is analogous to the `drop_database` parameter of
`delete_data_node`. The new parameter is `false` by default for
compatibility and ensures that a data node can be detached without
requiring communication with the data node (e.g., if the data node is
not responding due to a failure).

Closes #4414
2022-06-14 15:53:41 +02:00
Sven Klemm
308ce8c47b Fix various misspellings
2022-06-13 10:53:08 +02:00
Sven Klemm
216ea65937 Enable chunk exclusion for space dimensions in UPDATE/DELETE
This patch transforms constraints on hash-based space partitions to make
them usable by postgres constraint exclusion.

If we have an equality condition on a space partitioning column, we add
a corresponding condition on get_partition_hash on this column. These
conditions match the constraints on chunks, so postgres' constraint
exclusion is able to use them and exclude the chunks.

The following transformations are done:

device_id = 1
becomes
((device_id = 1) AND (_timescaledb_internal.get_partition_hash(device_id) = 242423622))

s1 = ANY ('{s1_2,s1_2}'::text[])
becomes
((s1 = ANY ('{s1_2,s1_2}'::text[])) AND
(_timescaledb_internal.get_partition_hash(s1) = ANY ('{1583420735,1583420735}'::integer[])))

These transformations are not visible in EXPLAIN output as we remove
them again after hypertable expansion is done.
2022-06-07 13:10:28 +02:00
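The effect of the added hash condition in 216ea65937 can be sketched as follows. This is a Python illustration, not the extension's code: `partition_hash` is a stand-in built on CRC32 and is NOT the algorithm behind `_timescaledb_internal.get_partition_hash`, and the `Chunk` type with its `[lo, hi)` hash range is a hypothetical model of a chunk's space constraint.

```python
import zlib
from dataclasses import dataclass


def partition_hash(value) -> int:
    """Stand-in hash into [0, 2**31); not the real get_partition_hash."""
    return zlib.crc32(repr(value).encode()) & 0x7FFFFFFF


@dataclass(frozen=True)
class Chunk:
    """Hypothetical chunk constrained to hash values in [lo, hi)."""
    name: str
    lo: int
    hi: int


def exclude_chunks(chunks, value):
    """Keep only chunks whose hash range can contain partition_hash(value).

    This mirrors adding `hash(col) = <const>` next to `col = value`:
    the constant is computed once and compared with each chunk constraint.
    """
    h = partition_hash(value)
    return [c for c in chunks if c.lo <= h < c.hi]


# Four equal-width space partitions covering the whole 31-bit hash range.
width = 2**31 // 4
chunks = [Chunk(f"chunk_{i}", i * width, (i + 1) * width) for i in range(4)]

matching = exclude_chunks(chunks, 1)
print([c.name for c in matching])
```

Because the ranges do not overlap and cover the full hash space, an equality condition leaves exactly one space partition to scan.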
Sven Klemm
ce59820678 Fix removal of constified constraints
Commit dcb7dcc5 removed the constified intermediate values used
during hypertable expansion but only did so completely for PG14.
For PG12 and PG13 some constraints remained in the plan.
2022-06-06 15:47:02 +02:00
Konstantina Skovola
b6a974e7f3 Add schedule_interval to policies
Add a parameter `schedule_interval` to retention and
compression policies to allow users to define the schedule
interval. Fall back to previous default if no value is
specified.

Fixes #3806
2022-06-06 16:22:22 +03:00
Erik Nordström
8f9975d7be Fix crash during insert into distributed hypertable
For certain inserts on a distributed hypertable, e.g., involving CTEs
and upserts, plans can be generated that weren't properly handled by
the DataNodeCopy and DataNodeDispatch execution nodes. In particular,
the nodes expect ChunkDispatch as a child node, but PostgreSQL can
sometimes insert a Result node above ChunkDispatch, causing the crash.

Further, behavioral changes in PG14 also caused the DataNodeCopy node
to sometimes wrongly believe a RETURNING clause was present. The check
for returning clauses has been updated to fix this issue.

Fixes #4339
2022-06-02 17:25:33 +02:00
Alexander Kuzmenkov
5c0110cbbf Mark partialize_agg as parallel safe
Postgres knows whether a given aggregate is parallel-safe, and creates
parallel aggregation plans based on that. The `partialize_agg` is a
wrapper we use to perform partial aggregation on data nodes. It is a
pure function that produces serialized aggregation state as a result.
Being pure, it doesn't influence parallel safety. This means we don't
need to mark it parallel-unsafe to artificially disable the parallel
plans for partial aggregation. They will be chosen as usual based on
the parallel-safety of the underlying aggregate function.
2022-05-31 14:53:58 +05:30
Sven Klemm
1fbe2eb36f Support intervals with month component when constifying now()
When dealing with intervals with a month component, timezone changes
can result in multiple day differences in the outcome of these
calculations due to different month lengths. When dealing with
months we add a 7 day safety buffer.
For all these calculations it is fine if we exclude fewer chunks
than strictly required for the operation, since additional exclusion
with exact values will happen in the executor. But under no
circumstances must we exclude too much, because there would be
no way for the executor to get those chunks back.
2022-05-30 18:02:58 +02:00
Sven Klemm
12574dc8ec Support intervals with day component when constifying now()
The initial patch to use now() expressions during planner hypertable
expansion only supported intervals with no day or month component.
This patch adds support for intervals with day component.

If the interval has a day component then the calculation needs
to take into account daylight saving time switches, because of
which a day is not always exactly 24 hours. We mitigate this by
adding a safety buffer to account for these DST switches when
dealing with intervals with a day component. These calculations
will be repeated with exact values during execution.
Since DST switches seem to range between -1 and 2 hours we set
the safety buffer to 4 hours.

This patch also refactors the tests since the previous tests
made it hard to tell the feature was working after the constified
values have been removed from the plans.
2022-05-28 10:02:33 +02:00
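The need for the safety buffer in 12574dc8ec can be checked with a small worked example. The sketch below is Python, not the extension's code; `plan_time_bound` is a hypothetical helper, and two fixed UTC offsets stand in for a timezone that switched from summer time (+02:00) to winter time (+01:00) on 2022-10-30. Only the 4 hour buffer size is taken from the commit message.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for one timezone around a DST switch (assumption).
SUMMER = timezone(timedelta(hours=2))  # offset before the switch
WINTER = timezone(timedelta(hours=1))  # offset after the switch

DST_SAFETY_BUFFER = timedelta(hours=4)  # buffer size stated in the commit


def plan_time_bound(plan_now: datetime, days: int) -> datetime:
    """Conservative constant for `column > now() - 'N days'`.

    Treats each day as 24 hours, then subtracts the buffer so the bound
    is never later than the exact value computed during execution.
    """
    return plan_now - timedelta(days=days) - DST_SAFETY_BUFFER


plan_now = datetime(2022, 10, 30, 12, 0, tzinfo=WINTER)
# Exact `now() - '1 day'`: the same wall-clock time on the previous day,
# which was still on summer time.
exact = datetime(2022, 10, 29, 12, 0, tzinfo=SUMMER)
naive = plan_now - timedelta(days=1)  # assumes a day is always 24 hours
bound = plan_time_bound(plan_now, days=1)

print("elapsed hours:", (plan_now - exact) / timedelta(hours=1))
print("naive bound is too late:", naive > exact)
print("buffered bound is safe:", bound <= exact)
```

A bound that is too late would exclude chunks the executor still needs, while a bound that is too early only keeps a few extra chunks that the exact check removes later.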
Sven Klemm
dcb7dcc506 Remove constified now() constraints from plan
Commit 35ea80ff added an optimization to enable expressions with
now() to be used during plan-time chunk exclusion by constifying
the now() expression. The added constified constraints were left
in the plan even though they were only required during
hypertable expansion. This patch marks those constified constraints
and removes them once they are no longer required.
2022-05-24 17:19:18 +02:00
Sven Klemm
8c5c7bb4ad Filter out chunk ids in shared tests
Multinode queries use _timescaledb_internal.chunks_in to specify
the chunks from which to select data. The chunk id in
regresscheck-shared is not stable and may differ depending on
execution order leading to flaky tests.
2022-05-19 21:33:33 +02:00
Sven Klemm
eab4efa323 Move metrics_dist1 out of shared_setup
The table metrics_dist1 was only used by a single test and therefore
should not be part of shared_setup but instead be created in the
test that actually uses it. This reduces the execution time of
regresscheck-shared when that test is not run.
2022-05-19 21:33:33 +02:00
Sven Klemm
43c8e51510 Fix Var handling for Vars of different level in constify_now
This patch fixes the constify_now optimization to ignore Vars of
a different level. Previously this could lead to an
assertion failure because the varno of that Var might be bigger
than the number of entries in the range table. Found by sqlsmith.
2022-05-19 11:45:17 +02:00
Dmitry Simonenko
f1575bb4c3 Support moving compressed chunks between data nodes
This change allows copying or moving compressed chunks
between data nodes by including the compressed chunk in the
chunk copy command stages.
2022-05-18 22:14:50 +03:00
Sven Klemm
11c6813b1d Fix flaky regresscheck-shared
While we do filter out chunk ids and hypertable ids from the test
output, the output was still unstable when those ids switch between
single and double digit as that changes the length of the query
decorator in EXPLAIN output. This patch removes this decorator
entirely from all shared test output.
2022-05-18 17:34:11 +02:00
Nikhil Sontakke
ddd02922c9 Support non-superuser move chunk operations
The non-superuser needs to have at least REPLICATION privileges. A
new function "subscription_cmd" has been added to allow running
subscription-related commands on data nodes. This function implicitly
upgrades to the bootstrapped superuser and then performs subscription
creation/alteration/deletion commands. It only accepts
subscription-related commands and errors out otherwise.
2022-05-18 16:56:31 +05:30
Sven Klemm
35ea80ffdf Enable now() usage in plan-time chunk exclusion
This implements an optimization to allow now() expressions to be
used during plan-time chunk exclusion. Since now() is stable, it
would not normally be considered for plan-time chunk exclusion.
To enable this behaviour we convert `column > now()` expressions
into `column > const AND column > now()`. Assuming that time
always moves forward, this is safe even for prepared statements.
This optimization works for SELECT, UPDATE and DELETE.
On hypertables with many chunks this can lead to a considerable
speedup for certain queries.

The following expressions are supported:
- column > now()
- column >= now()
- column > now() - Interval
- column > now() + Interval
- column >= now() - Interval
- column >= now() + Interval

Interval must not have a day or month component as those depend
on timezone settings.

Some microbenchmarks to show the improvements. Each timing is the
best of five runs.

-- hypertable with 1k chunks
-- with optimization
select * from metrics1k where time > now() - '5m'::interval;
Time: 3.090 ms

-- without optimization
select * from metrics1k where time > now() - '5m'::interval;
Time: 145.640 ms

-- hypertable with 5k chunks
-- with optimization
select * from metrics5k where time > now() - '5m'::interval;
Time: 4.317 ms

-- without optimization
select * from metrics5k where time > now() - '5m'::interval;
Time: 775.259 ms

-- hypertable with 10k chunks
-- with optimization
select * from metrics10k where time > now() - '5m'::interval;
Time: 4.853 ms

-- without optimization
select * from metrics10k where time > now() - '5m'::interval;
Time: 1766.319 ms (00:01.766)

-- hypertable with 20k chunks
-- with optimization
select * from metrics20k where time > now() - '5m'::interval;
Time: 6.141 ms

-- without optimization
select * from metrics20k where time > now() - '5m'::interval;
Time: 3321.968 ms (00:03.322)

Speedup with 1k chunks: 47x
Speedup with 5k chunks: 179x
Speedup with 10k chunks: 363x
Speedup with 20k chunks: 540x
2022-05-17 21:47:39 +02:00
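The rewrite described in 35ea80ffdf can be sketched in a few lines. The following is Python, not the planner code: `Chunk`, `constify_now` and `chunks_after_exclusion` are hypothetical names, and chunks are modelled as half-open time ranges. It illustrates why a constant taken at plan time is a safe extra condition as long as time only moves forward.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Chunk:
    """Hypothetical chunk covering the half-open time range [start, end)."""
    start: datetime
    end: datetime


def constify_now(plan_time: datetime, offset: timedelta) -> datetime:
    """Plan-time constant for `column > now() + offset`.

    now() during execution is never earlier than plan_time, so every row
    that satisfies the original predicate also satisfies `column > const`.
    """
    return plan_time + offset


def chunks_after_exclusion(chunks, const):
    """Keep only chunks that can contain rows with `column > const`."""
    return [c for c in chunks if c.end > const]


plan_time = datetime(2022, 5, 17, 12, 0, tzinfo=timezone.utc)
hour = timedelta(hours=1)
# 24 one-hour chunks before plan time, plus one chunk starting at plan time.
chunks = [Chunk(plan_time + i * hour, plan_time + (i + 1) * hour) for i in range(-24, 1)]

# Equivalent of: time > now() - '5m'::interval
const = constify_now(plan_time, -timedelta(minutes=5))
remaining = chunks_after_exclusion(chunks, const)
print(len(chunks), "chunks before exclusion,", len(remaining), "after")
```

The original `column > now()` condition stays in the query, so rows are still filtered with the exact execution-time value; the constant only narrows the set of chunks that are planned.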