3703 Commits

Author SHA1 Message Date
Nikhil Sontakke
c92e29ba3a Fix DML HA in multi-node
If a data node goes down for whatever reason, then DML activity to
chunks residing on (or targeted to) that DN will start erroring out.
We now handle this by marking the target chunk as "stale" for this
DN by changing the metadata on the access node. This allows us to
continue to do DML to replicas of the same chunk data on other DNs
in the setup. This obviously only works for chunks with
"replication_factor" > 1. Note that chunks which do not undergo any
change will continue to carry the appropriate DN-related metadata on
the AN.

This means that such "stale" chunks become under-replicated and need
to be re-balanced using the copy_chunk functionality, e.g. by a
microservice or some similar process.

Fixes #4846
2022-11-25 17:42:26 +05:30
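
For illustration, re-balancing such an under-replicated chunk could look
like the following sketch using the experimental copy_chunk procedure
(chunk and node names are hypothetical):

    -- Copy the stale chunk's data from a healthy replica on
    -- data_node_2 to data_node_3 to restore the replication factor.
    CALL timescaledb_experimental.copy_chunk(
        '_timescaledb_internal._dist_hyper_1_1_chunk',
        'data_node_2', 'data_node_3');
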
Dmitry Simonenko
26e3be1452 Test dist caggs with an unavailable data node
Add test cases to ensure cagg functionality on distributed
hypertables while a data node is unavailable.

Fix #4978
2022-11-24 19:15:40 +02:00
Dmitry Simonenko
826dcd2721 Ensure nodes availability using dist restore point
Make sure that a data node list does not have unavailable data nodes
when using the create_distributed_restore_point() API.

Fix #4979
2022-11-24 16:08:06 +02:00
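
For reference, the API in question is invoked like this (the restore
point name is arbitrary):

    -- Creates a restore point on the access node and all data nodes;
    -- now errors out early if any data node is marked unavailable.
    SELECT * FROM create_distributed_restore_point('before_upgrade');
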
Bharathy
7bfd28a02f Fix dist_fetcher_type test on PG15 2022-11-24 18:41:46 +05:30
Dmitry Simonenko
5813173e07 Introduce drop_stale_chunks() function
This function drops chunks on a specified data node if those chunks are
not known by the access node.

Call drop_stale_chunks() automatically when a data node becomes
available again.

Fix #4848
2022-11-23 19:21:05 +02:00
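
A manual invocation might look like the following sketch (the internal
schema and the node name are assumptions):

    -- Drop chunks on 'data_node_1' that the access node no longer
    -- tracks; normally run automatically when the node comes back.
    SELECT _timescaledb_internal.drop_stale_chunks('data_node_1');
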
Alexander Kuzmenkov
bdae647f0a Add i386 check results to database
Also add some more gdb commands to give us more context.
2022-11-23 18:53:40 +04:00
Alexander Kuzmenkov
26db866637 Fix GITHUB_OUTPUT on Windows
We have to add it to WSLENV and translate it as a path, so that it
properly passes the WSL <-> native process boundary.
2022-11-23 18:53:40 +04:00
Konstantina Skovola
40297f1897 Fix TRUNCATE on hierarchical caggs
When truncating a cagg that had another cagg defined on
top of it, the nested cagg would not get invalidated accordingly.
That was because we were not adding a corresponding entry in
the hypertable invalidation log for the materialization hypertable
of the base cagg.
This commit adds an invalidation entry in the table so that
subsequent refreshes see and properly process this invalidation.

Co-authored-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
2022-11-23 11:17:17 +02:00
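
A minimal sketch of the scenario, with hypothetical cagg names:

    -- metrics_daily is defined on top of metrics_hourly. Truncating
    -- the base cagg now writes an invalidation entry, so the next
    -- refresh of the nested cagg picks up the change.
    TRUNCATE metrics_hourly;
    CALL refresh_continuous_aggregate('metrics_daily', NULL, NULL);
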
Fabrízio de Royes Mello
35fa891013 Add missing gitignore entry
Pull request #4998 introduced a new template SQL test file but missed
adding the proper `.gitignore` entry to ignore the generated test files.
2022-11-23 05:08:05 -03:00
Fabrízio de Royes Mello
e84a6e2e65 Remove the refresh step from CAgg migration
We're facing some weird `portal snapshot` issues when running the
`refresh_continuous_aggregate` procedure from other procedures.

Fixed it by removing the Refresh Continuous Aggregate step from
`cagg_migrate` and warning users to run it manually after the
execution.

Fixes #4913
2022-11-22 16:49:13 -03:00
Lakshmi Narayanan Sreethar
7bc6e56cb7 Fix plan_hashagg test failure in PG15
Updated the expected output of plan_hashagg to reflect changes introduced
by postgres/postgres@4b160492.
2022-11-22 22:36:22 +05:30
Sven Klemm
639a5018a3 Change time of scheduled CI run
Since we now use the date as part of the cache key to ensure that no
stale cache entries hide build failures, we need to make sure a cache
entry is present before workflows that depend on the cache are run.
2022-11-22 14:49:28 +01:00
Konstantina Skovola
48d9733fda Add telemetry for caggs on top of caggs
PR #4668 introduced hierarchical caggs. This patch adds
a field `num_caggs_nested` to the telemetry report to include the
number of caggs defined on top of other caggs.
2022-11-22 13:39:27 +02:00
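
One way to locate the new counter in the report, assuming it is
returned as jsonb and without assuming its exact path:

    SELECT jsonb_path_query(get_telemetry_report(),
                            '$.**.num_caggs_nested');
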
Jan Nidzwetzki
fd84bf42a5 Use Ensure in get_or_add_baserel_from_cache
This patch changes an Assert in get_or_add_baserel_from_cache to an
Ensure. Therefore, this check is also performed in release builds. This
is done to detect metadata corruptions at an early stage.
2022-11-22 10:37:45 +01:00
Fabrízio de Royes Mello
a5b8c9b084 Fix caggs on caggs tests on PG15
PR #4668 introduced the Hierarchical Continuous Aggregates (aka
Continuous Aggregate on top of another Continuous Aggregate) but
unfortunately we missed fixing the regression tests on PG15.
2022-11-21 15:19:44 -03:00
Bharathy
89cede81bd Fix PG15 specific tests. 2022-11-21 16:09:42 +05:30
Fabrízio de Royes Mello
3b5653e4cc Ignore trailing whitespaces changes in git blame 2022-11-19 11:42:36 -03:00
Fabrízio de Royes Mello
a4356f342f Remove trailing whitespaces from test code 2022-11-18 16:31:47 -03:00
Fabrízio de Royes Mello
b1742969d0 Add SQL test files to trailing whitespace CI check
In commit 1f807153 we added a CI check for trailing whitespaces over
our source code files (.c and .h).

This commit adds SQL test files (.sql and .sql.in) to this check.
2022-11-18 16:31:47 -03:00
Fabrízio de Royes Mello
3749953e97 Hierarchical Continuous Aggregates
Enable users to create Hierarchical Continuous Aggregates (aka
Continuous Aggregates on top of other Continuous Aggregates).

With this PR users can create levels of aggregation granularity in
Continuous Aggregates, making the refresh process even faster.

A caveat of this feature is that at upper levels we can end up with
an "average of averages". To get the "real average" we can rely on
the "stats_agg" TimescaleDB Toolkit function, which calculates and
stores the partials that can be finalized with other Toolkit
functions like "average" and "sum".

Closes #1400
2022-11-18 14:34:18 -03:00
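
A minimal sketch of the feature (table and view names are hypothetical):

    -- Hourly cagg directly on the hypertable
    CREATE MATERIALIZED VIEW metrics_hourly
    WITH (timescaledb.continuous) AS
    SELECT time_bucket('1 hour', time) AS bucket, device,
           avg(value) AS avg_value
    FROM metrics
    GROUP BY 1, 2;

    -- Daily cagg defined on top of the hourly cagg; note this
    -- computes an "average of averages" (see caveat above)
    CREATE MATERIALIZED VIEW metrics_daily
    WITH (timescaledb.continuous) AS
    SELECT time_bucket('1 day', bucket) AS bucket, device,
           avg(avg_value) AS avg_value
    FROM metrics_hourly
    GROUP BY 1, 2;
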
Jan Nidzwetzki
fd11479700 Speed up get_or_add_baserel_from_cache operation
Commit 9f4dcea30135d1e36d1c452d631fc8b8743b3995 introduces the
get_or_add_baserel_from_cache function. It contains a performance
regression, since an expensive metadata scan
(ts_chunk_get_hypertable_id_by_relid) is performed even when it could be
avoided.
2022-11-18 15:29:49 +01:00
Jan Nidzwetzki
380464df9b Perform frozen chunk status check via trigger
The commit 9f4dcea30135d1e36d1c452d631fc8b8743b3995 introduces frozen
chunks. Checking whether a chunk is frozen or not has been done so far
in the query planner. If it is not possible to determine which chunks
are affected by a query in the planner (e.g., due to a cast in the WHERE
condition), all chunks are checked. This led (1) to increased
planning time and (2) to the situation that a single frozen chunk
could reject queries even if it was not addressed by the query. This
patch therefore performs the frozen-status check via a trigger
instead.
2022-11-18 15:29:49 +01:00
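
For illustration (the internal function and chunk name are assumptions):

    -- Freeze a chunk; DML against it is now rejected by a trigger on
    -- the chunk itself instead of a planner-time check over all chunks.
    SELECT _timescaledb_internal.freeze_chunk(
        '_timescaledb_internal._hyper_1_1_chunk');
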
Lakshmi Narayanan Sreethar
7c32ceb073 Fix perl test import in PG15
Removed an invalid import from the 007_healthcheck.pl test.
Also enabled all the perl tests and a couple of others in PG15.
2022-11-18 13:55:59 +05:30
gayyappan
b9ca06d6e3 Move freeze/unfreeze chunk to tsl
Move code for freeze and unfreeze chunk to tsl directory.
2022-11-17 15:28:47 -05:00
Bharathy
bfa641a81c INSERT .. SELECT on distributed hypertable fails on PG15
An INSERT .. SELECT query containing distributed hypertables generates
a plan with a DataNodeCopy node, which is not supported. The issue is
in the function tsl_create_distributed_insert_path(), where we decide
whether to generate a DataNodeCopy or a DataNodeDispatch node based on
the kind of query. On PG15 the timescaledb planner generates
DataNodeCopy for INSERT .. SELECT queries because rte->subquery is set
to NULL, a change introduced by a PG15 commit as part of a fix.

This patch checks whether the SELECT subquery contains distributed
hypertables by looking into root->parse->jointree, which represents
the subquery.

Fixes #4983
2022-11-17 21:18:23 +05:30
Sachin
1e3200be7d Use C function for time_bucket() offset
Instead of using a SQL UDF to handle the offset parameter, add
ts_timestamp/tz/date_offset_bucket() functions, which handle the
offset directly in C.
2022-11-17 13:08:19 +00:00
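
The user-facing behavior is unchanged; for example:

    -- Hourly buckets shifted by 15 minutes; the offset arithmetic is
    -- now handled by the new C functions instead of a SQL UDF.
    SELECT time_bucket('1 hour'::interval, ts, '15 minutes'::interval)
    FROM generate_series('2022-01-01'::timestamptz,
                         '2022-01-01 06:00', '30 minutes') AS ts;
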
Lakshmi Narayanan Sreethar
839e42dd0c Use async API to drop database from delete_data_node
PG15 introduced a ProcSignalBarrier mechanism in the drop database
implementation to force all backends to close the file handles for
dropped tables. The backend that is executing the drop database command
will emit a new process signal barrier and wait for other backends to
accept it. But the backend executing the delete_data_node function
cannot process that signal, as it is stuck waiting for the drop
database query to return. Thus the two backends end up waiting for
each other, causing a deadlock.

Fixed it by using the async API to execute the drop database command
from delete_data_node instead of the blocking remote_connection_cmdf_ok
call.

Fixes #4838
2022-11-17 18:09:39 +05:30
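
For reference, the affected call path is the drop_database option of
delete_data_node:

    -- On PG15 this used to deadlock against the ProcSignalBarrier;
    -- the DROP DATABASE is now issued via the async API.
    SELECT delete_data_node('data_node_1', drop_database => true);
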
Alexander Kuzmenkov
1b65297ff7 Fix memory leak with INSERT into compressed hypertable
We used to allocate some temporary data in the ExecutorContext.
2022-11-16 13:58:52 +04:00
Alexander Kuzmenkov
7e4ebd131f Escape the quotes in gdb command 2022-11-15 21:49:39 +04:00
Alexander Kuzmenkov
676d1fb1f1 Fix const null clauses in runtime chunk exclusion
The code we inherited from postgres expects that if we have a const null
or false clause, it's going to be the single one, but that's not true
for runtime chunk exclusion because we don't try to fold such
restrictinfos after evaluating the mutable functions. Fix it to also
work for multiple restrictinfos.
2022-11-15 21:49:39 +04:00
Mats Kindahl
f3a3da7804 Take advisory lock for job tuple
Job ids are locked using an advisory lock rather than a row lock on the
jobs table, but this lock is not taken in the job API functions
(`alter_job`, `delete_job`, etc.), which appears to cause a race
condition resulting in the addition of multiple rows with the same job id.

This commit adds an advisory `RowExclusiveLock` on the job id while
altering it to match the advisory locks taken while performing other
modifications.

Closes #4863
2022-11-15 17:58:49 +01:00
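
For example, concurrent job API calls like the following now serialize
on the job id's advisory lock instead of racing (the job id is
hypothetical):

    SELECT alter_job(1000, schedule_interval => interval '1 hour');
    SELECT delete_job(1000);
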
Ante Kresic
51e5f31918 Update compress chunk interval on compressed data
Compress chunk interval is set using an ALTER TABLE statement.
This change makes it so you can update the compress chunk interval
while keeping the rest of the compression settings intact.
Updating it will only affect chunks that are compressed and rolled
up after the change.
2022-11-15 15:31:16 +01:00
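
A sketch of such an update, assuming the option name used for chunk
rollup at the time:

    -- Change only the rollup interval; compress_segmentby/orderby
    -- settings remain as previously configured.
    ALTER TABLE metrics
        SET (timescaledb.compress_chunk_time_interval = '24 hours');
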
Sven Klemm
8b6eb9024f Check for interrupts in gapfill main loop
Add CHECK_FOR_INTERRUPTS() macro to gapfill main loop.
2022-11-15 15:02:46 +01:00
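
The practical effect is that a long-running gapfill query can now be
cancelled (e.g. via pg_cancel_backend) between loop iterations; for
example (table name hypothetical):

    SELECT time_bucket_gapfill('1 minute', time) AS bucket, avg(value)
    FROM metrics
    WHERE time >= now() - interval '30 days' AND time < now()
    GROUP BY 1;
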
Sven Klemm
87756bcff9 Bump postgres versions used in CI
Use PG 12.13, 13.9 and 14.6 in our CI
2022-11-15 13:48:10 +01:00
Sven Klemm
2f237e6e57 Add Enterprise Linux 9 packages to RPM package test 2022-11-15 12:13:55 +01:00
Lakshmi Narayanan Sreethar
33531212b2 Disable dist_move_chunk test in PG15
The dist_move_chunk test causes the CI to hang when compiled and run with
PG15 as explained in #4972.

Also fixed schema permission issues in data_node and dist_param tests.
2022-11-15 14:10:45 +05:30
Bharathy
8afdddc2da Deprecate continuous aggregates with old format
This patch reports a warning when upgrading to the new timescaledb
extension if there exist any caggs with partial aggregates (the
warning is emitted only on release builds). It also restricts users
from creating caggs with the old format on timescaledb with PG15.
2022-11-15 08:38:03 +05:30
Mats Kindahl
b085833fda Print errors in release builds for jobs
Old assertions checking the integrity of job metadata now print an
error message in release builds instead of continuing to execute with
bad metadata.
2022-11-14 16:54:13 +01:00
Alexander Kuzmenkov
121631c70f Support parameterized data node scans in joins
This allows us to perform a nested loop join of a small outer local
table to an inner distributed hypertable, without downloading the
entire hypertable to the access node.
2022-11-14 18:57:15 +04:00
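
A hedged sketch of a query shape that benefits (table names are
hypothetical):

    -- The small local table drives a nested loop; each outer row
    -- sends a parameterized query to the data nodes instead of
    -- pulling the whole distributed hypertable to the access node.
    EXPLAIN (COSTS OFF)
    SELECT *
    FROM local_devices d
    JOIN metrics_dist m ON m.device_id = d.device_id
    WHERE d.site_id = 42;
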
Alexander Kuzmenkov
9964ba8ba6 Remove accidental debug output
Was added in #4890
2022-11-14 18:57:15 +04:00
Alexander Kuzmenkov
0d30155b26 Upload test results into a database
This will help us find the flaky tests or the rare failures.
2022-11-14 17:35:50 +04:00
Alexander Kuzmenkov
0360812e3c Simplify llvm configuration for linux/macos builds
Set it only in the matrixbuilder.
2022-11-14 17:35:50 +04:00
Alexander Kuzmenkov
feb09c54e9 Rebuild cached PG daily and on config changes
Otherwise it's easy to break these builds and not notice it until much
later.
2022-11-14 17:35:50 +04:00
Mats Kindahl
141e114ccb Fix race in bgw_db_scheduler_fixed
When deleting a job in the test, the job does not necessarily terminate
immediately, so wait for log entries from the job before checking the
jobs table.

Fixed #4859
2022-11-14 13:23:23 +01:00
Markos Fountoulakis
e2b7c76c9c Disable MERGE when using hypertables
Fixes #4930

Co-authored-by: Lakshmi Narayanan Sreethar <lakshmi@timescale.com>
2022-11-14 13:57:17 +02:00
Fabrízio de Royes Mello
9e276c58ee Revert "Upload test results into the database"
This reverts commit 252cefb509153fadcb32741a27ec3fa977487049 because it
broke our CI globally.
2022-11-11 15:25:01 -03:00
Fabrízio de Royes Mello
6ae192631e Fix CAgg migration with timestamp without timezone
This was a leftover from the original implementation, where we didn't
add tests for time dimensions using `timestamp without time zone`.

Fixed it by handling this datatype and adding regression tests.

Fixes #4956
2022-11-11 15:25:01 -03:00
Alexander Kuzmenkov
252cefb509 Upload test results into the database
This will help us find the flaky tests or the rare failures.
2022-11-11 20:02:29 +04:00
Erik Nordström
f13214891c Add function to alter data nodes
Add a new function, `alter_data_node()`, which can be used to change
the data node's configuration originally set up via `add_data_node()`
on the access node.

The new function introduces an option "available" that allows
configuring the availability of the data node. Setting
`available=>false` means that the node should no longer be used for
reads and writes. Only read "failover" is implemented as part of this
change, however.

To fail over reads, the alter data node function finds all the chunks
for which the unavailable data node is the "primary" query target and
"fails over" to a chunk replica on another data node instead. If some
chunks do not have a replica to fail over to, a warning will be
raised.

When a data node is available again, the function can be used to
switch back to using the data node for queries.

Closes #2104
2022-11-11 13:59:42 +01:00
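
Usage, per the description above:

    -- Stop routing reads/writes to a failed node; reads fail over to
    -- chunk replicas on other data nodes.
    SELECT alter_data_node('data_node_1', available => false);

    -- Switch back once the node has been restored.
    SELECT alter_data_node('data_node_1', available => true);
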
Sven Klemm
fe6731cead Fix compress_segmentby in isolation tests
compress_segmentby should never be set on a column with random()
values, as that results in very inefficient compression: the batches
will only contain one tuple each.
2022-11-10 17:38:57 +01:00
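
For contrast, a reasonable choice segments on a low-cardinality column
(names hypothetical):

    -- Many rows share each device_id, so batches compress well;
    -- segmenting on a random()-filled column would yield one-tuple
    -- batches.
    ALTER TABLE metrics SET (
        timescaledb.compress,
        timescaledb.compress_segmentby = 'device_id');
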