3919 Commits

Author SHA1 Message Date
Sven Klemm
04f43335de Move aggregate support function into _timescaledb_functions
This patch moves the support functions for histogram, first and last
into the _timescaledb_functions schema. Since we alter the schema
of the existing functions in upgrade scripts and do not change the
aggregates this should work completely transparently for any user
objects using those aggregates.
2023-04-11 13:11:07 +02:00
Konstantina Skovola
3814a3f351 Properly format license error hint
Commit 57fde383b3dddd0b52263218e65a0135981c2d34 changed the
messaging but did not format the error hint correctly.
This patch fixes the error hint.

Fixes #5490
2023-04-10 14:06:39 +03:00
Alexander Kuzmenkov
8c77be6c68 Look up compressed column metadata only at planning time
Currently we look them up again at execution time, which adds up for tables
with a large number of chunks.

This gives about 15% speedup (100 µs) on a small query on a table from
tests with 50 chunks:
`select id, ts, value from metric_compressed order by id, ts limit 100;`
2023-04-10 14:45:11 +04:00
Konstantina Skovola
df70f3e050 Remove unused variable in tsl_get_compressed_chunk_index_for_recompression
Commit 72c0f5b25e569015aacb98cc1be3169a1720116d introduced
an unused variable. This patch removes it.
2023-04-06 10:58:57 +03:00
Zoltan Haindrich
975e9ca166 Fix segfault after column drop on compressed table
Decompression produces records which have all the decompressed data
set, but it also retains the fields which are used internally during
decompression.
These didn't cause any problem - unless an operation is being done
with the whole row - in which case all the fields which have ended up
being non-null can be a potential segfault source.

Fixes #5458 #5411
2023-04-06 08:49:54 +02:00
Sven Klemm
feef9206fa Add _timescaledb_functions schema
Currently internal user objects like chunks and our functions
live in the same schema making locking down that schema hard.
This patch adds a new schema _timescaledb_functions that is meant
to be the schema used for timescaledb internal functions to
allow separation of code and chunks or other user objects.
2023-04-05 21:01:24 +02:00
Fabrízio de Royes Mello
6440bb3477 Remove unused function
Remove unused function `invalidation_threshold_htid_found`.
2023-04-05 11:38:08 -03:00
Bharathy
1fb058b199 Support UPDATE/DELETE on compressed hypertables.
This patch does the following:

1. Executor changes to parse qual ExprState to check if SEGMENTBY
   column is specified in WHERE clause.
2. Based on step 1, we build scan keys.
3. Executor changes to do heapscan on compressed chunk based on
   scan keys and move only those rows which match the WHERE clause
   to staging area aka uncompressed chunk.
4. Mark affected chunk as partially compressed.
5. Perform regular UPDATE/DELETE operations on staging area.
6. Since there is no Custom Scan (HypertableModify) node for
   UPDATE/DELETE operations on PG versions < 14, we don't support this
   feature on PG12 and PG13.
2023-04-05 17:19:45 +05:30
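The steps listed in 1fb058b199 can be sketched roughly as follows. This is a simplified Python model, not TimescaleDB code: the `Chunk` class, its fields, and `delete_where` are hypothetical names, rows are plain dicts, and compressed batches are modelled as lists keyed by a single SEGMENTBY value.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    # Hypothetical model: compressed batches keyed by their SEGMENTBY value.
    compressed: dict = field(default_factory=dict)
    # Staging area, i.e. the uncompressed part of the chunk.
    staging: list = field(default_factory=list)
    partially_compressed: bool = False


def delete_where(chunk, segment_col, segment_value, predicate):
    """Delete rows of one SEGMENTBY value that satisfy `predicate`."""

    def matches(row):
        return row[segment_col] == segment_value and predicate(row)

    # Steps 1-2: the SEGMENTBY value from the WHERE clause acts as the scan key.
    batch = chunk.compressed.get(segment_value, [])
    # Step 3: move only the matching rows to the staging area.
    moved = [row for row in batch if matches(row)]
    if moved:
        chunk.compressed[segment_value] = [r for r in batch if not matches(r)]
        chunk.staging.extend(moved)
        # Step 4: the chunk now mixes compressed and uncompressed rows.
        chunk.partially_compressed = True
    # Step 5: run the regular DELETE against the staging area.
    before = len(chunk.staging)
    chunk.staging = [r for r in chunk.staging if not matches(r)]
    return before - len(chunk.staging)
```

In this model, batches for other SEGMENTBY values are never touched, which is the point of building scan keys from the WHERE clause.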
Sven Klemm
c2941a3f9a Fix windows package test
Use the windows packages from the github release for package testing.
2023-04-04 15:53:22 +02:00
Erik Nordström
2e6c6b5c58 Refactor and optimize distributed COPY
Refactor the code path that handles remote distributed COPY. The
main changes include:

* Use a hash table to look up data node connections instead of a list.
* Refactor the per-data node buffer code that accumulates rows into
  bigger CopyData messages.
* Reduce the default number of rows in a CopyData message to 100. This
  seems to improve throughput, probably striking a better balance
  between message overhead and latency.
* The number of rows to send in each CopyData message can now be
  changed via a new foreign data wrapper option.
2023-04-04 15:35:54 +02:00
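The buffering idea in 2e6c6b5c58 can be illustrated with a small Python sketch. It is not the C implementation: `CopyBuffers` and the `send` callback are hypothetical, a dict stands in for the hash table, and a list of rows stands in for one CopyData message. Only the default of 100 rows per message comes from the commit message.

```python
from collections import defaultdict

ROWS_PER_MESSAGE = 100  # default mentioned in the commit message


class CopyBuffers:
    """Accumulate rows per data node and send them in batches."""

    def __init__(self, send, rows_per_message=ROWS_PER_MESSAGE):
        self._send = send  # hypothetical callback: send(node, rows)
        self._rows_per_message = rows_per_message
        # Hash lookup by node name instead of scanning a list of nodes.
        self._buffers = defaultdict(list)

    def add_row(self, node, row):
        buf = self._buffers[node]
        buf.append(row)
        if len(buf) >= self._rows_per_message:
            self.flush(node)

    def flush(self, node):
        rows = self._buffers.pop(node, [])
        if rows:
            self._send(node, rows)

    def flush_all(self):
        # Copy the keys first because flush() mutates the dict.
        for node in list(self._buffers):
            self.flush(node)
```

A smaller `rows_per_message` sends messages more often, which is the overhead-versus-latency trade-off the commit describes.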
Ildar Musin
c6b9f50978 Fix OSM chunks exclusion from append paths
OSM chunks have their own fdw_private which conflicts with checks in
the MergeAppend code path causing segfaults. This commit fixes this by
returning early when there is an OSM chunk in the MergeAppendPath.
2023-04-03 17:46:23 +02:00
Nikhil Sontakke
517dee9f6b Add test for superuser chunk copy/move
Add isolation test case to check that the chunk object created during
chunk copy/move operation on the destination datanode always has
superuser credentials till the end of the operation.
2023-04-03 11:31:58 +05:30
Rafia Sabih
ff5959f8f9 Handle when FROM clause is missing in continuous aggregate definition
It now errors out for such a case.

Fixes #5500
2023-03-29 22:29:16 +02:00
Konstantina Skovola
cb81c331ae Allow named time_bucket arguments in Cagg definition
Fixes #5450
2023-03-28 18:45:41 +03:00
Rafia Sabih
98218c1d07 Enable joins for hierarchical continuous aggregates
The joins could be between a continuous aggregate and hypertable,
continuous aggregate and a regular Postgres table,
and continuous aggregate and a regular Postgres view.
2023-03-28 15:12:54 +02:00
Mats Kindahl
777c599a34 Do not segfault on large histogram() parameters
There is a bug in `width_bucket()` causing an overflow and subsequent
NaN value as a result of dividing with `+inf`. The NaN value is
interpreted as an integer and hence generates an index out of range for
the buckets.

This commit fixes this by generating an error rather than
segfaulting for bucket indexes that are out of range.
2023-03-28 12:47:02 +02:00
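The failure mode described in 777c599a34 can be reproduced with ordinary IEEE 754 doubles. The Python sketch below is an illustration only: `bucket_index` is a hypothetical function, not the actual `width_bucket()` or `histogram()` code, and it assumes a histogram with `nbuckets + 2` slots (one underflow and one overflow bucket).

```python
import math


def bucket_index(value, lo, hi, nbuckets):
    """Return an index in [0, nbuckets + 1] or raise ValueError."""
    if value < lo:
        return 0  # underflow bucket
    if value >= hi:
        return nbuckets + 1  # overflow bucket
    # With extreme bounds both differences overflow to +inf,
    # and inf / inf evaluates to NaN.
    position = (value - lo) / (hi - lo) * nbuckets
    if math.isnan(position):
        raise ValueError("bucket index out of range")
    index = int(position) + 1
    # Reject anything that would index outside the bucket array.
    if not 0 <= index <= nbuckets + 1:
        raise ValueError("bucket index out of range")
    return index
```

Calling `bucket_index(1.0e308, -1.7e308, 1.7e308, 5)` raises an error instead of turning the NaN into an array index.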
Konstantina Skovola
22841abdf0 Update community license related errors
Update the error message printed when attempting to use
a community license feature with apache license installed.

Fixes #5438
2023-03-27 16:25:28 +03:00
Erik Nordström
a51d21efbe Fix issue creating dimensional constraints
During chunk creation, the chunk's dimensional CHECK constraints are
created via an "upcall" to PL/pgSQL code. However, creating
dimensional constraints in PL/pgSQL code sometimes fails, especially
during high-concurrency inserts, because PL/pgSQL code scans metadata
using a snapshot that might not see the same metadata as the C
code. As a result, chunk creation sometimes fails during constraint
creation.

To fix this issue, implement dimensional CHECK-constraint creation in
C code. Other constraints (FK, PK, etc.) are still created via an
upcall, but should probably also be rewritten in C. However, since
these constraints don't depend on recently updated metadata, this is
left to a future change.

Fixes #5456
2023-03-24 10:55:08 +01:00
Konstantina Skovola
72c0f5b25e Rewrite recompress_chunk in C for segmentwise processing
This patch introduces a C-function to perform the recompression at
a finer granularity instead of decompressing and subsequently
compressing the entire chunk.

This improves performance for the following reasons:
- it needs to sort less data at a time and
- it avoids recreating the decompressed chunk and the heap
inserts associated with that by decompressing each segment
into a tuplesort instead.

If no segmentby is specified when enabling compression or if an
index does not exist on the compressed chunk then the operation is
performed as before, decompressing and subsequently
compressing the entire chunk.
2023-03-23 11:39:43 +02:00
Nikhil Sontakke
7e43f45ccb Ensure superuser perms during copy/move chunk
There is a security loophole in current core Postgres, due to which
it's possible for a non-superuser to gain superuser access by attaching
dependencies like expression indexes, triggers, etc. before logical
replication commences.

To avoid this, we now ensure that the chunk objects that get created
for the subscription are done so as a superuser. This avoids malicious
dependencies by regular users.
2023-03-23 13:26:47 +05:30
Fabrízio de Royes Mello
38fcd1b76b Improve Realtime Continuous Aggregate performance
When calling the `cagg_watermark` function to get the watermark of a
Continuous Aggregate we execute a `SELECT MAX(time_dimension)` query
in the underlying materialization hypertable.

The problem is that a `SELECT MAX(time_dimension)` query can be
expensive because it will scan all hypertable chunks, increasing the
planning time for Realtime Continuous Aggregates.

Improved it by creating a new catalog table to serve as a cache table
to store the current Continuous Aggregate watermark in the following
situations:
- Create CAgg: store the minimum value of hypertable time dimension
  data type;
- Refresh CAgg: store the last value of the time dimension materialized
  in the underlying materialization hypertable (or the minimum value of
  materialization hypertable time dimension data type if there's no
  data materialized);
- Drop CAgg Chunks: the same as refresh cagg.

Closes #4699, #5307
2023-03-22 16:35:23 -03:00
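The caching rules in 38fcd1b76b can be summarised in a short Python sketch. The names are hypothetical and a dict stands in for the new catalog table; `MIN_TIME` assumes an integer time dimension and is only a stand-in for the minimum value of the real time type.

```python
MIN_TIME = -2**63  # stand-in for the minimum value of the time data type


class WatermarkCache:
    """Keep one cached watermark per continuous aggregate."""

    def __init__(self):
        self._watermarks = {}  # cagg name -> cached watermark

    def on_create(self, cagg):
        # Create CAgg: start at the minimum value of the time type.
        self._watermarks[cagg] = MIN_TIME

    def on_refresh(self, cagg, materialized_times):
        # Refresh CAgg (and Drop CAgg Chunks): store the last materialized
        # time, or the minimum value when nothing is materialized.
        self._watermarks[cagg] = max(materialized_times, default=MIN_TIME)

    def watermark(self, cagg):
        # A single lookup replaces the SELECT MAX(...) over all chunks.
        return self._watermarks[cagg]
```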
shhnwz
699fcf48aa Stats improvement for Uncompressed Chunks
During compression, autovacuum used to be disabled for the uncompressed
chunk and re-enabled after decompression. This leads to postgres
maintenance issues. Let's not disable autovacuum for the uncompressed
chunk anymore. Let postgres take care of the stats in its natural way.

Fixes #309
2023-03-22 23:51:13 +05:30
Alexander Kuzmenkov
5c07a57a02 Simplify control flow in decompress_chunk_exec
No functional changes, mostly just reshuffles the code to prepare for
batch decompression.

Also removes unneeded repeated column value stores and ExecStoreTuple,
to save 3-5% execution time on some queries.
2023-03-22 13:08:22 +04:00
Erik Nordström
63b416b6b0 Use consistent snapshots when scanning metadata
Invalidate the catalog snapshot in the scanner to ensure that any
lookups into `pg_catalog` use a snapshot that is consistent with the
snapshot used to scan TimescaleDB metadata.

This fixes an issue where a chunk could be looked up without having a
proper relid filled in, causing an assertion failure
(`ASSERT_IS_VALID_CHUNK`). When a chunk is scanned and found (in
`chunk_tuple_found()`), the Oid of the chunk table is filled in using
`get_relname_relid()`, which could return InvalidOid due to use of a
different snapshot when scanning `pg_class`. Calling
`InvalidateCatalogSnapshot()` before starting the metadata scan in
`Scanner` ensures the pg_catalog snapshot used is refreshed.

Due to the difficulty of reproducing this MVCC issue, no regression or
isolation test is provided, but it is easy to hit this bug when doing
highly concurrent COPY:s into a distributed hypertable.
2023-03-21 10:34:23 +01:00
Fabrízio de Royes Mello
7d6cf90ee7 Add missing gitignore entry
Pull request #4827 introduced a new template SQL test file but did not
add the proper `.gitignore` entry to ignore generated test files.
2023-03-20 14:43:05 -03:00
Bharathy
cc51e20e87 Add support for ON CONFLICT DO UPDATE for compressed hypertables
This patch fixes execution of INSERT with ON CONFLICT DO UPDATE by
removing the error and allowing UPDATE to happen on the given compressed
hypertable.
2023-03-20 22:55:27 +05:30
Konstantina Skovola
8cccc375fb Add license information to extension description
Fixes #5436
2023-03-20 13:27:41 -03:00
Konstantina Skovola
736c20fb71 Hardcode PG14-15 versions for chocolatey
2023-03-20 17:38:32 +02:00
syvb
2570ab1110 Add new clangd cache location to gitignore
2023-03-17 14:13:25 -04:00
Mats Kindahl
67ff84e8f2 Add check for malloc failure in libpq calls
The functions `PQconndefaults` and `PQmakeEmptyPGresult` call
`malloc` and can return NULL if it fails to allocate memory for the
defaults and the empty result. It is checked with an `Assert`, but this
will be removed in production builds.

Replace the `Assert` with checks to generate an error in production
builds rather than trying to de-reference the pointer and cause a
crash.
2023-03-16 14:20:54 +01:00
Zoltan Haindrich
790b322b24 Fix DEFAULT value handling in decompress_chunk
The sql function decompress_chunk did not fill in
default values during its operation.

Fixes #5412
2023-03-16 09:16:50 +01:00
Alexander Kuzmenkov
827684f3e2 Use prepared statements for parameterized data node scans
This allows us to avoid replanning the inner query on each new loop,
speeding up the joins.
2023-03-15 18:22:01 +04:00
Sven Klemm
03a799b874 Mention that new status values need handling in downgrade script
When adding new status values we must make sure to add special
handling for these values to the downgrade script as previous
versions will not know how to deal with those.
2023-03-14 23:59:10 +01:00
Dmitry Simonenko
f8022eb332 Add additional tests for compression with HA
Make sure inserts into compressed chunks work when a DN is down

Fix #5039
2023-03-13 17:43:48 +02:00
Sven Klemm
65562f02e8 Support unique constraints on compressed chunks
This patch allows unique constraints on compressed chunks. When
trying to INSERT into compressed chunks with unique constraints
any potentially conflicting compressed batches will be decompressed
to let postgres do constraint checking on the INSERT.
With this patch only INSERT ON CONFLICT DO NOTHING will be supported.
For decompression only segment by information is considered to
determine conflicting batches. This will be enhanced in a follow-up
patch to also include orderby metadata to require decompressing
fewer batches.
2023-03-13 12:04:38 +01:00
Sven Klemm
c02cb76b38 Don't reindex relation during decompress_chunk
Reindexing a relation requires AccessExclusiveLock which prevents
queries on that chunk. This patch changes decompress_chunk to update
the index during decompression instead of reindexing. This patch
does not change the required locks as there are locking adjustments
needed in other places to make it safe to weaken that lock.
2023-03-13 10:58:26 +01:00
Sven Klemm
20ea406616 Add utility function to map attribute numbers
This patch adds a function ts_map_attno that can be used to map
the attribute number from one relation to another by column name.
2023-03-13 10:57:17 +01:00
Jan Nidzwetzki
356a20777c Handle user-defined FDW options properly
This patch changes the way user-defined FDW options (e.g., startup
costs, per-tuple costs) are handled. So far, these values were retrieved
in apply_fdw_and_server_options() but reset to default values afterward.
2023-03-13 10:39:52 +01:00
Maheedhar PV
5e0391392a Out of on_proc_exit slots on guc license change
Problem:

When the GUC timescaledb.license = 'timescale' is set in the conf file
and a SIGHUP is sent to the postgres process, a reload of the tsl
module is triggered.

This reload happens in two phases: 1. tsl_module_load is called, which
loads the module only if it is not already loaded, and 2. ts_module_init
is called for every ts_license_guc_assign_hook, irrespective of whether
it is a new load. This ts_module_init initialization function also
registers an on_proc_exit function to be called on exit.

The list of on_proc_exit methods are maintained in a fixed array
on_proc_exit_list of size MAX_ON_EXITS (20) which gets filled up on
repeated SIGHUPs and hence an error.

Fix:

The fix is to make the ts_module_init() register the on_proc_exit
callback, only in case the module is reloaded and not in every init
call.

Closes #5233
2023-03-13 06:24:01 +05:30
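The problem and fix in 5e0391392a can be modelled in a few lines of Python. This is a sketch under assumptions, not the C code: `ProcExitList`, `reload_module`, and the `state` dict are hypothetical; only the fixed limit of 20 slots is taken from the commit message.

```python
MAX_ON_EXITS = 20  # size of on_proc_exit_list named in the commit message


class ProcExitList:
    """Fixed-capacity list of exit callbacks."""

    def __init__(self):
        self.callbacks = []

    def register(self, callback):
        if len(self.callbacks) >= MAX_ON_EXITS:
            raise RuntimeError("out of on_proc_exit slots")
        self.callbacks.append(callback)


def reload_module(exit_list, state, register_always=False):
    """Model one SIGHUP-triggered reload of the module."""
    newly_loaded = not state.get("loaded", False)
    state["loaded"] = True  # load only if not already loaded
    # Old behaviour: register on every init call (register_always=True).
    # Fixed behaviour: register only when the module was actually loaded.
    if register_always or newly_loaded:
        exit_list.register(lambda: None)
```

With `register_always=True` the 21st reload raises, while the default path keeps a single registered callback no matter how many reloads happen.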
Alexander Kuzmenkov
e92d5ba748 Add more tests for compression
Unit tests for different data sequences, and SQL test for float4.
2023-03-10 20:34:17 +04:00
Jan Nidzwetzki
f5db023152 Track file trailer only in debug builds
The commit 96574a7 changes the handling of the file_trailer_received
flag. It is now only used in asserts and not in any other kind of logic.
This patch encapsulates the file_trailer_received in a
USE_ASSERT_CHECKING macro.
2023-03-10 10:44:53 +01:00
Sven Klemm
217ba014a7 Use version checks to decide about RelationGetSmgr backporting
Use explicit version checks to decide whether to define backported
RelationGetSmgr function or rely on the function being available.
This simplifies the cmake code a bit and makes the backporting similar
to how we handle this for other functions.
2023-03-09 16:55:50 +01:00
Jan Nidzwetzki
7b8177aa74 Fix file trailer handling in the COPY fetcher
The copy fetcher fetches tuples in batches. When the last element in the
batch is the file trailer, the trailer was not handled correctly. The
existing logic did not perform a PQgetCopyData in that case. Therefore
the state of the fetcher was not set to EOF and the copy operation was
not correctly finished at this point.

Fixes: #5323
2023-03-09 14:29:06 +01:00
Sven Klemm
a854b2760f Simplify ts_indexing_relation_has_primary_or_unique_index
Rely on postgres functionality for index column tracking instead
of rolling our own.
2023-03-08 22:43:59 +01:00
Bharathy
f54dd7b05d Fix SEGMENTBY columns predicates to be pushed down
WHERE clause predicates that use non-equality operators on SEGMENTBY
columns of type text/bytea are not pushed down to the Seq Scan
node of the compressed chunk. This patch fixes this issue.

Fixes #5286
2023-03-08 19:17:43 +05:30
Erik Nordström
c76a0cff68 Add parallel support for partialize_agg()
Make `partialize_agg()` support parallel query execution. To make this
work, the finalize node needs to combine the individual partials from each
parallel worker, but the final step that turns the resulting partial
into the finished aggregate should not happen. Thus, in the case of
distributed hypertables, each data node can run a parallel query to
compute a partial, and the access node can later combine and finalize
these partials into the final aggregate. Essentially, there will be
one combine step (minus final) on each data node, and then another one
plus final on the access node.

To implement this, the finalize aggregate plan is simply modified to
elide the final step, and to reserialize the partial. It is only
possible to do this at the plan stage; if done at the path stage, the
PostgreSQL planner will hit assertions that assume that the node has
certain values (e.g., it doesn't expect combine Paths to skip the
final step).
2023-03-08 14:14:25 +01:00
Bharathy
c13ed17fbc Fix DELETE command tag
DELETE on hypertables always reports 0 as affected rows.
This patch fixes this issue.
2023-03-07 20:45:12 +05:30
Sven Klemm
f680b99529 Fix assertion in calculate_chunk_interval for negative target size
When called with negative chunk_target_size_bytes
calculate_chunk_interval will throw an assertion. This patch adds
error handling for this condition. Found by sqlsmith.
2023-03-07 14:50:57 +01:00
Sven Klemm
00321dba41 2.10.1 Post-release adjustments
Add 2.10.1 to update test scripts and adjust the downgrade versioning.
2023-03-07 13:44:54 +01:00
Konstantina Skovola
5a3cacd06f Fix sub-second intervals in hierarchical caggs
Previously we used date_part("epoch", interval) and integer division
internally to determine whether the top cagg's interval is a
multiple of its parent's.
This led to precision loss and wrong results
in the case of intervals with sub-second components.

Fixed by using the `ts_interval_value_to_internal` function to convert
intervals to appropriate integer representation for division.

Fixes #5277
2023-03-07 13:25:49 +02:00
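The precision loss described in 5a3cacd06f is easy to see in a small Python sketch. Both functions are hypothetical illustrations, not the TimescaleDB code: the first truncates to whole seconds before dividing, the second divides in integer microseconds, which is assumed here as the internal integer representation.

```python
from datetime import timedelta

ONE_MICROSECOND = timedelta(microseconds=1)


def is_multiple_truncated(top, parent):
    """Check divisibility after truncating both intervals to whole seconds.

    Assumes parent is at least one second, otherwise the modulo divides by zero.
    """
    return int(top.total_seconds()) % int(parent.total_seconds()) == 0


def is_multiple_exact(top, parent):
    """Check divisibility on integer microseconds, keeping sub-second parts."""
    return (top // ONE_MICROSECOND) % (parent // ONE_MICROSECOND) == 0
```

For a 4 second bucket on top of a 1.5 second bucket, the truncated check compares 4 against 1 and wrongly accepts it, while the microsecond check rejects it.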