3228 Commits

Author SHA1 Message Date
Rafia Sabih
eaf3a38fe9 Pushdown of gapfill to data nodes
Allow calls to time_bucket_gapfill to be executed on the
data nodes for improved query performance. With this, time_bucket_gapfill
is pushed down to data nodes under the following conditions (see the
example after the list):

1. when only one data node has all the chunks
2. when the space dimension does not overlap across data nodes
3. when the group-by matches the space dimension
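
For illustration, a query roughly like the following (the hypertable and
column names are hypothetical; device_id stands for the space dimension)
could be pushed down under condition 3:

 SELECT time_bucket_gapfill('1 hour', time) AS bucket, device_id, avg(temp)
 FROM conditions_dist
 WHERE time >= '2022-01-01' AND time < '2022-01-02'
 GROUP BY bucket, device_id;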
2022-04-07 21:09:49 +02:00
Mats Kindahl
1b2926c076 Do not modify aggregation state in finalize
The function `tsl_finalize_agg_ffunc` modified the aggregation state by
setting `trans_value` to the final result when computing the final
value. Since the state can be re-used several times, there could be
several calls to the finalization function, and the finalization
function would be confused when passed a final value instead of an
aggregation state transition value.

This commit fixes this by not modifying the `trans_value` when
computing the final value and instead just returning it (or the original
`trans_value` if there is no finalization function).

Fixes #3248
2022-04-06 20:50:47 +02:00
Sven Klemm
ae50a53485 Add chunk exclusion for UPDATE for PG14
Currently, only IMMUTABLE constraints will exclude chunks from an UPDATE plan;
with this patch, STABLE expressions will be used to exclude chunks as well.
This is a big performance improvement, as chunks not matching the partitioning
column constraints no longer have to be scanned for UPDATEs.
Since the code path for UPDATE is different for PG < 14, this patch only adds
the optimization for PG14.
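
As a sketch, an UPDATE along these lines (the updated column name is
illustrative) uses a STABLE filter expression and can now benefit from
startup chunk exclusion:

 UPDATE metrics_int2 SET value = '123' WHERE "time" = length(version());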

With this patch the plan for UPDATE on hypertables looks like this:

 Custom Scan (HypertableModify) (actual rows=0 loops=1)
   ->  Update on public.metrics_int2 (actual rows=0 loops=1)
         Update on public.metrics_int2 metrics_int2_1
         Update on _timescaledb_internal._hyper_1_1_chunk metrics_int2
         Update on _timescaledb_internal._hyper_1_2_chunk metrics_int2
         Update on _timescaledb_internal._hyper_1_3_chunk metrics_int2
         ->  Custom Scan (ChunkAppend) on public.metrics_int2 (actual rows=0 loops=1)
               Output: '123'::text, metrics_int2.tableoid, metrics_int2.ctid
               Startup Exclusion: true
               Runtime Exclusion: false
               Chunks excluded during startup: 3
               ->  Seq Scan on public.metrics_int2 metrics_int2_1 (actual rows=0 loops=1)
                     Output: metrics_int2_1.tableoid, metrics_int2_1.ctid
                     Filter: (metrics_int2_1."time" = length(version()))
2022-04-06 12:41:14 +02:00
Alexander Kuzmenkov
ff945a7a94 Data node scan doesn't support system columns: move this check to an appropriate place
Before, we would complain that we don't support fetching the system
columns with per-data node queries enabled, but would still execute the
code that fetches them. Don't do this, and complain earlier instead.
2022-04-05 16:01:16 +05:30
Konstantina Skovola
a064fd3b48 Add logging for retention policy
Also remove unused code from compression_api: the function
policy_compression_get_verbose_log was unused there. Move it to
policy_utils and rename it to policy_get_verbose_log so that it can
be used by all policies.
2022-04-04 17:47:13 +03:00
Konstantina Skovola
f2e43900c9 Update changelog for 4159 2022-04-04 13:01:55 +03:00
Dmitry Simonenko
a4b151b024 Fix owner change for distributed hypertable
Allow the ALTER TABLE ... OWNER TO command to be used with distributed
hypertables.
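
For example (hypertable and role names are illustrative), a statement like
the following now works on a distributed hypertable:

 ALTER TABLE conditions_dist OWNER TO new_owner;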

Fix #4180
2022-04-04 09:56:23 +03:00
Alexander Kuzmenkov
4ee8872177 Use virtual tuples in row-by-row fetcher
Currently, we needlessly form/deform heap tuples. Sometimes we do
need this, when we have row marks and need a ctid (UPDATE RETURNING),
but not in this case. The implementation has three parts:

1. Change data fetcher interface to store a tuple into given slot
instead of returning a heap tuple.

2. Expose the creation of virtual tuple in tuple factory.

3. Use these facilities in row-by-row fetcher.

This gives a small speedup. It will become more important in the
future, as other parts of the row-by-row fetcher are optimized.
2022-04-01 23:03:03 +05:30
Sven Klemm
e16908ccd7 Ignore bulk formatting changes in git blame
This patch adds a .git-blame-ignore-revs file that contains a list of
commits with bulk formatting changes to be ignored by git blame.
This file will be used by GitHub, but to use it locally you need
to tell git about it, e.g. with the following command:
`git config blame.ignoreRevsFile .git-blame-ignore-revs`
2022-04-01 16:25:48 +02:00
Konstantina Skovola
a154ae56e9 Fix ADD COLUMN IF NOT EXISTS error on compressed hypertable
Stop throwing an exception with the message "column of relation already
exists" when running the command ALTER TABLE ... ADD COLUMN IF NOT EXISTS ...
on compressed hypertables.
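
For example (table and column names are illustrative), repeating the
following on a compressed hypertable no longer errors out if the column
already exists:

 ALTER TABLE metrics ADD COLUMN IF NOT EXISTS note TEXT;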

Fix #4087
2022-04-01 00:44:07 -07:00
Erik Nordström
972afe0096 Add TAP tests for extension state
Add a TAP test that checks that the extension's state is updated across
concurrent sessions/backends when the extension is "dropped" or
"created".
2022-04-01 08:31:34 +02:00
Erik Nordström
01c724b9be Fix relcache callback handling causing crashes
Fix a crash that could corrupt indexes when running VACUUM FULL
pg_class.

The crash happens when caches are queried/updated within a cache
invalidation function, which can lead to corruption and recursive
cache invalidations.

To solve the issue, make sure the relcache invalidation callback is
simple and never invokes the relcache or syscache directly or
indirectly.

Some background: the extension is preloaded and thus has planner
hooks installed irrespective of whether the extension is actually
installed in the current database. However, the hooks need to
be disabled as long as the extension is not installed. To avoid always
having to dynamically check for the presence of the extension, the
state is cached in the session.

However, the cached state needs to be updated if the extension changes
(altered/dropped/created). Therefore, the relcache invalidation
callback mechanism is (ab)used in TimescaleDB to signal updates to the
extension state across all active backends.

The signaling is implemented by installing a dummy table as part of
the extension and any invalidation on the relid for that table signals
a change in the extension state. However, as of this change, the
actual state is no longer determined in the callback itself, since it
requires use of the relcache and causes the bad behavior. Therefore,
the only thing that remains in the callback after this change is to
reset the extension state.

The actual state is instead resolved on-demand, but can still be
cached when the extension is in the installed state and the dummy
table is present with a known relid. However, if the extension is not
installed, the extension state can no longer be cached as there is no
way to signal other backends that the state should be reset when they
don't know the dummy table's relid, and cannot resolve it from within
the callback itself.

Fixes #3924
2022-04-01 08:31:34 +02:00
Mats Kindahl
9c46a5d5c6 Abort sessions after extension reload
If a session is started and loads (and caches by OID) functions from the
extension to use them in, for example, a `SELECT` query on a continuous
aggregate, the extension will be marked as loaded internally.

If an `ALTER EXTENSION` is then executed in a separate session, it will
update `pg_extension` to hold the new version, and any other sessions
will see this as the new version, including the session that already
loaded the previous version of the shared library.

Since the pre-update session has already loaded some functions from the
old version, running the same queries with the old function names
will trigger a reload of the new version of the shared library to get
the new functions (same name, but different OID). But since the library
has already been loaded in a different version, this triggers an error
that GUC variables are re-defined.

Further queries after that will then corrupt the database, causing a
crash.

This commit fixes this by recording the version loaded rather than just
whether it has been loaded, and by checking that the version did not
change after a query has been analyzed (in the `post_analyze_hook`).
If the version changed, a fatal error is generated to force an abort of
the session.

Fixes #4191
2022-03-30 15:21:12 +02:00
Konstantina Skovola
915bd032bd Fix spelling errors and omissions 2022-03-30 05:21:54 -07:00
Mats Kindahl
81b71b685c Remove signal-unsafe calls from signal handlers
Functions `elog` and `ereport` are unsafe to use in signal handlers
since they call `malloc`. This commit removes them from signal
handlers.

Fixes #4200
2022-03-30 13:37:45 +02:00
Sven Klemm
347b45f109 Add chunk exclusion for DELETE for PG14
Currently, only IMMUTABLE constraints will exclude chunks from a DELETE plan;
with this patch, STABLE expressions will be used to exclude chunks as well.
This is a big performance improvement, as chunks not matching the partitioning
column constraints no longer have to be scanned for DELETEs.
Additionally, this improves the usability of DELETEs on hypertables with some
chunks compressed: previously, a DELETE with non-IMMUTABLE constraints was not
possible on those hypertables. Since the code path for DELETE is
different for PG < 14, this patch only adds the optimization for PG14.
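
For instance, a DELETE with a STABLE filter expression like the following
(matching the filter in the plan below) can now exclude non-matching chunks
at startup:

 DELETE FROM metrics WHERE "time" > now() - interval '3 years';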

With this patch the plan for DELETE on hypertables looks like this:

 Custom Scan (HypertableModify) (actual rows=0 loops=1)
   ->  Delete on metrics (actual rows=0 loops=1)
         Delete on metrics metrics_1
         Delete on _hyper_5_8_chunk metrics
         Delete on _hyper_5_11_chunk metrics
         Delete on _hyper_5_12_chunk metrics
         Delete on _hyper_5_13_chunk metrics
         Delete on _hyper_5_14_chunk metrics_2
         ->  Custom Scan (ChunkAppend) on metrics (actual rows=1 loops=1)
               Chunks excluded during startup: 4
               ->  Seq Scan on metrics metrics_1 (actual rows=0 loops=1)
                     Filter: ("time" > (now() - '3 years'::interval))
               ->  Bitmap Heap Scan on _hyper_5_14_chunk metrics_2 (actual rows=1 loops=1)
                     Recheck Cond: ("time" > (now() - '3 years'::interval))
                     Heap Blocks: exact=1
                     ->  Bitmap Index Scan on _hyper_5_14_chunk_metrics_time_idx (actual rows=1 loops=1)
                           Index Cond: ("time" > (now() - '3 years'::interval))
2022-03-28 18:54:05 +02:00
Alexander Kuzmenkov
935684c83a Cache whether a rel is a chunk in classify_relation
Use a per-query hash table for this. This speeds up the repeated calls
to classify_relation by avoiding the costly chunk lookup.
2022-03-23 16:49:02 +05:30
Alexander Kuzmenkov
ae79ba6eb4 Scan less chunk metadata when planning ForeignModify
Instead of loading the entire Chunk struct, just look up the data
nodes.
2022-03-23 14:03:34 +05:30
Rafia Sabih
fb8dec9fa4 Update comments to Postgresql standard style 2022-03-22 20:16:12 +01:00
Erik Nordström
c1cf067c4f Improve restriction scanning during hypertable expansion
Improve the performance of metadata scanning during hypertable
expansion.

When a hypertable is expanded to include all children chunks, only the
chunks that match the query restrictions are included. To find the
matching chunks, the planner first scans for all matching dimension
slices. The chunks that reference those slices are the chunks to
include in the expansion.

This change optimizes the scanning for slices by avoiding repeated
open/close of the dimension slice metadata table and index.

At the same time, related dimension slice scanning functions have been
refactored along the same lines.

An index on the chunk constraint metadata table is also changed to
allow scanning on dimension_slice_id. Previously, dimension_slice_id
was the second key in the index, which made scans on this key less
efficient.
2022-03-21 15:18:44 +01:00
Nikhil Sontakke
966c5eb2c2 Fix remote EXPLAIN with parameterized queries
In certain multi-node queries, we end up using a parameterized query
on the data nodes. If "timescaledb.enable_remote_explain" is enabled, we
run an EXPLAIN on the data node with the remote query. EXPLAIN doesn't
work with parameterized queries, so we check for that case and avoid
invoking a remote EXPLAIN.
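
As a hedged sketch (the query is illustrative; one way to end up with a
parameterized data-node query is via an InitPlan), a session hitting the
problem looked roughly like this; with this fix, the remote EXPLAIN step is
simply skipped when the data-node query is parameterized:

 SET timescaledb.enable_remote_explain = on;
 EXPLAIN (VERBOSE) SELECT * FROM conditions_dist WHERE device_id = (SELECT 1);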

Fixes #3974

Reported and test case provided by @daydayup863
2022-03-21 17:29:47 +05:30
Sven Klemm
e101b3ea60 Set minimum required cmake version to 3.10
cmake > 3.10 is not packaged for some of the platforms we build
packages for, e.g. old Ubuntu and Debian versions. Currently we modify
the CMakeLists.txt in those build environments and set the
minimum version to 3.10 already, which proves that timescaledb
builds fine with cmake 3.10.
2022-03-21 09:36:14 +01:00
Sven Klemm
566a4ff104 Route UPDATE through HypertableModify
Route UPDATE on Hypertables through our custom HypertableModify
node. This patch by itself does not make any other changes to
UPDATE but is the foundation for other features regarding UPDATE
on hypertables.
2022-03-18 10:47:45 +01:00
Erik Nordström
846878c6bb Ensure scan functions use long-lived memory context
PostgreSQL scan functions might allocate memory that needs to live for
the duration of the scan. This applies also to functions that are
called during the scan, such as getting the next tuple. To avoid
situations when such functions are accidentally called on, e.g., a
short-lived per-tuple context, add an explicit scan memory context to
the Scanner interface that wraps the PostgreSQL scan API.
2022-03-18 08:06:21 +01:00
Erik Nordström
b954c00fa8 Fix memory handling during scans
Scan functions cannot be called on a per-tuple memory context as they
might allocate data that needs to live until the end of the scan. Fix
this in a couple of places to ensure correct memory handling.

Fixes #4148, #4145
2022-03-18 08:06:21 +01:00
Sven Klemm
a759b2b2b9 Show number of chunks excluded in ConstraintAwareAppend EXPLAIN
This patch changes the ConstraintAwareAppend EXPLAIN output to show
the number of chunks excluded instead of the number of chunks left.
The number of chunks left can be seen from other EXPLAIN output
while the actual number of exclusions that happened cannot. This
also makes the output consistent with output of ChunkAppend.
2022-03-17 17:46:01 +01:00
Fabrízio de Royes Mello
ce3e04a9ec Rename forgotten master branch name references 2022-03-15 15:47:02 -03:00
Fabrízio de Royes Mello
332dffeebc Rename master branch to main
Following what many communities have already done, we agreed to rename
the `master` branch to `main`.

Resources:
- https://sfconservancy.org/news/2020/jun/23/gitbranchname/
- https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de

Closes #4163
2022-03-15 15:04:30 -03:00
Sven Klemm
077b2edbc5 Change ChunkAppend file organization
This patch changes the organization of the ChunkAppend code. It
removes all header files except chunk_append/chunk_append.h.
It also merges exec.c and explain.c to remove unnecessary function
exports, since the code from explain.c was only used by exec.c.
2022-03-15 16:25:57 +01:00
Sven Klemm
cc89f1dc84 Remove duplicate contain_param functions
This patch exports the contain_param function in planner.c and
changes ChunkAppend to use that version instead of having two
implementations of that function.
2022-03-15 16:25:57 +01:00
Erik Nordström
f00bdadf0c Trigger Sqlsmith tests manually or by push to branch
Add workflow events to allow running Sqlsmith tests manually or when
pushing to the 'sqlsmith' branch. This is useful when submitting PRs
that one wants to run extra checks on, including Sqlsmith.
2022-03-15 16:08:03 +01:00
Sven Klemm
ab6b90caff Reference CVE ID in CHANGELOG
The CVE ID was already referenced in the commit introducing the fix
but not in the CHANGELOG.
2022-03-15 01:27:05 +01:00
Mats Kindahl
f5fd06cabb Ignore invalid relid when deleting hypertable
When running `performDeletion` it is necessary to have a valid relation
id, but when doing a lookup using `ts_hypertable_get_by_id` this might
actually return a hypertable entry pointing to a table that does not
exist because it has been deleted previously. In this case, only the
catalog entry should be removed, but it is not necessary to delete the
actual table.

This scenario can occur if both the hypertable and a compressed table
are deleted as part of running a `sql_drop` event, for example, if a
compressed hypertable is defined inside an extension. In this case, the
compressed hypertable (indeed all tables) will be deleted first, and
the lookup of the compressed hypertable will find it in the metadata
but a lookup of the actual table will fail since the table does not
exist.

Fixes #4140
2022-03-14 14:03:49 +01:00
Dmitry Simonenko
f82df7ca4a Allow ANALYZE command on a data node directly
Allow execution of VACUUM/ANALYZE commands on a data node without
enabling the timescaledb.enable_client_ddl_on_data_nodes GUC.
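
For example, running the following directly on a data node (the table name
is illustrative) now works without first enabling the GUC:

 -- previously required: SET timescaledb.enable_client_ddl_on_data_nodes = true;
 ANALYZE conditions;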

Fix #4157
2022-03-11 11:14:23 +02:00
Sven Klemm
8f56ced825 Add workflow for running sqlsmith
sqlsmith is a random SQL query generator and is very useful for finding
bugs in our implementation, as it tests complex queries and thereby
hits code paths and interactions between different features not tested
in our normal regression checks.
2022-03-10 13:26:10 +01:00
Sven Klemm
06d8375594 Enhance extension function test
This patch changes the extension function list to include the
signature as well, since functions with different signatures are
separate objects in postgres. This also changes the list to include
all functions. Even though functions in internal schemas are not
considered public API, they still need to be treated the same as
functions in other schemas with regards to extension upgrade/downgrade.

This patch also moves the test to regresscheck-shared since we do
not need a dedicated database to run these tests.
2022-03-10 11:22:33 +01:00
Fabrízio de Royes Mello
33bbdccdcd Refactor function hypertable_local_size
Reorganize the code and fix a minor bug that was not computing the size
of the FSM, VM and INIT forks of the parent hypertable.

The bug is fixed by exposing the `ts_relation_size` function to the SQL
level to encapsulate the logic to compute `heap`, `indexes` and `toast`
sizes.
2022-03-07 16:38:40 -03:00
Fabrízio de Royes Mello
18afcfd62f Refactor function ts_relation_size
The current implementation iterates over fork types to calculate the size of
each one by calling the `pg_relation_size` PostgreSQL function, plus other
calls to calculate the index and table sizes (six function calls in total).

Improve it by halving the PostgreSQL function calls needed to calculate the
size of the relations (now three function calls).
2022-03-04 10:05:47 -03:00
Markos Fountoulakis
e9fb9acbbb Fix regressions found in nightly CI
Add concurrent_query_and_drop_chunks to the ignore list and fix a C
compiler warning.
2022-03-03 19:29:58 +02:00
Mats Kindahl
15d33f0624 Add option to compile without telemetry
Add option `USE_TELEMETRY` that can be used to exclude telemetry from
the compile.

Telemetry-specific SQL is moved so that it is only included when the
extension is compiled with telemetry, and the notice is changed so that
the message about telemetry is not printed when telemetry is not compiled
in.

The following code is not compiled in when telemetry is not used:
- Cross-module functions for telemetry.
- Checks for telemetry job in job execution.
- GUC variables `telemetry_level` and `telemetry_cloud`.

The telemetry subsystem is not included when compiling without telemetry,
which requires some functions to be moved out of the telemetry
subsystem:
- Metadata handling is moved out of the telemetry module since it is
  used not only with telemetry.
- UUID functions are moved into a separate module instead of being
  part of the telemetry subsystem.
- Telemetry functions are either added or removed when updating from a
  previous version.

Tests are updated to:
- Not use telemetry functions to get UUID or Metadata and instead use
  the moved UUID and metadata functions.
- Not include telemetry information in tests that do not require it.
- Not set telemetry variables in configuration files when telemetry is
  not compiled in.
- Replace usage of telemetry functions in non-telemetry tests with
  other sources of the same information.

Fixes #3931
2022-03-03 12:21:07 +01:00
Sven Klemm
642a745767 Fix arm64 apt package test
This patch changes the workflow to run apt-get update before
installing any packages in case the local package database is
outdated and references packages no longer available.
2022-03-02 18:00:11 +01:00
Mats Kindahl
b909d4857d Fixes to smoke update tests
Smoke tests were missing critical files, and some tests had changed
since the last run and did not handle update smoke tests, so fix all
necessary issues.
2022-03-01 13:15:46 +01:00
Erik Nordström
14deea6bd5 Improve chunk scan performance
Chunk scan performance during querying is improved by avoiding
repeated open and close of relations and indexes when joining chunk
information from different metadata tables.

When executing a query on a hypertable, it is expanded to include all
its child chunks. However, during the expansion, the chunks that
don't match the query constraints should also be excluded. The
following changes are made to make the scanning and exclusion more
efficient:

* Ensure metadata relations and indexes are only opened once even
  though metadata for multiple chunks is scanned. This avoids
  repeated open and close of tables and indexes for each chunk
  scanned.
* Avoid interleaving scans of different relations, ensuring better
  data locality, and having, e.g., indexes warm in cache.
* Avoid unnecessary scans that repeat work already done.
* Ensure chunks are locked in a consistent order (based on Oid).

To enable the above changes, some refactoring was necessary. The chunk
scans that happen during constraint exclusion are moved into a separate
source file (`chunk_scan.c`) for better structure and readability.

Some test outputs are affected due to the new ordering of chunks in
append relations.
2022-02-28 16:53:01 +01:00
Erik Nordström
32c1e3aef2 Allow control of relation open/close in Scanner
Make the Scanner module more flexible by allowing optional control
over when the scanned relation is opened and closed. Relations can
then remain open over multiple scans, which can improve performance
and efficiency.

Closes #2173
2022-02-28 16:53:01 +01:00
Erik Nordström
0f351ff612 Simplify Scanner by embedding internal state
As part of adding a scan iterator interface on top of the Scanner
module (commit 8baaa98), the internal scanner state, which was
previously private, was made public. Now that it is public, it makes
more sense to make it part of the standard user-facing `ScannerCtx`
struct, which also simplifies the code elsewhere.
2022-02-28 16:53:01 +01:00
Sven Klemm
91820e26f6 Improve planner error messages regarding nodes
Change error messages when unexpected nodes are encountered to
actually show the node name instead of the node id.
2022-02-23 19:43:30 +01:00
Sven Klemm
3f303c7d42 Refactor tsl_debug_append_path
This patch splits the node name logic from the child path logic
to allow getting a string representation for any postgres node.
This adds a new function:
const char * ts_get_node_name(Path *path)

This patch doesn't add any new callers to the function, but it will
be used in subsequent patches to produce more user-friendly error
messages when unexpected node types are encountered during planning.
2022-02-23 16:44:19 +01:00
Dmitry Simonenko
57368dd98e Fix RENAME TO/SET SCHEMA on distributed hypertable
This PR fixes the ON_END logic for distributed DDL execution
by removing an old leftover check, which marked those commands
as unsupported.

Fix: #4106
2022-02-23 14:26:45 +03:00
Sven Klemm
a4648b11b4 Fix segfault on INSERT in distributed hypertables
When inserting into a distributed hypertable using a query on a
distributed hypertable, a segfault would occur when all the chunks
in the query got pruned.
2022-02-23 07:24:31 +01:00
Sven Klemm
58fd0c5cef Test APT ARM64 packages
Add tests for ARM64 Debian and Ubuntu packages
2022-02-22 14:20:54 +01:00