119 Commits

Ante Kresic
fb0df1ae4e Insert into indexes during chunk compression
If there are any indexes on the compressed chunk, insert into them while
inserting the heap data rather than reindexing the relation at the
end. This reduces the amount of locking on the compressed chunk
indexes, which created issues when merging chunks, and should help
with future updates of compressed data.
2023-06-26 09:37:12 +02:00
Sven Klemm
e302aa2ae9 Fix handling of Result nodes below Sort nodes in ConstraintAwareAppend
With PG15, Result nodes can appear between Sort nodes and
DecompressChunk when PostgreSQL tries to adjust the targetlist.
2023-06-13 18:42:02 +02:00
Sotiris Stamokostas
1a93c2d482 Improve parallel workers for decompression
So far, we have set the number of desired workers for decompression to
1. If a query touches only one chunk, we end up with one worker in a
parallel plan. Only if the query touches multiple chunks does
PostgreSQL spin up multiple workers. These workers could then be used
to process the data of a single chunk.

This patch removes our custom worker calculation and relies on
PostgreSQL's logic to calculate the desired degree of parallelism.

Co-authored-by: Jan Kristof Nidzwetzki <jan@timescale.com>
2023-06-02 16:16:08 +03:00
Bharathy
b38c920266 MERGE support on hypertables
This patch does the following:

1. Planner changes to create ChunkDispatch node when MERGE command
   has INSERT action.
2. Changes to map partition attributes from a tuple returned from
   child node of ChunkDispatch against physical targetlist, so that
   ChunkDispatch node can read the correct value from partition column.
3. Fixed issues with MERGE on compressed hypertable.
4. Added more testcases.
5. MERGE in distributed hypertables is not supported.
6. Since there is no Custom Scan (HypertableModify) node for MERGE
   with UPDATE/DELETE on compressed hypertables, we don't support this.

Fixes #5139
2023-05-27 10:29:11 +05:30
Jan Nidzwetzki
df32ad4b79 Optimize compressed chunk resorting
This patch adds an optimization to the DecompressChunk node. If the
query 'order by' and the compression 'order by' are compatible (query
'order by' is equal or a prefix of compression 'order by'), the
compressed batches of the segments are decompressed in parallel and
merged using a binary heap. This preserves the ordering, so a separate
sort of the result can be avoided. LIMIT queries especially benefit from
this optimization because only the first tuples of some batches have to
be decompressed. Previously, all segments were completely decompressed
and sorted.

Fixes: #4223

Co-authored-by: Sotiris Stamokostas <sotiris@timescale.com>
2023-05-02 10:46:15 +02:00
Bharathy
44dc042bb3 Fixed the transparent decompress chunk test, which seemed to be flaky. 2023-04-24 19:58:02 +05:30
Sven Klemm
744b44cc52 Fix parameterization in DecompressChunk path generation
All children of an append path are required to have the same
parameterization, so we have to reparameterize when the selected path
does not have the right parameterization.
2023-04-20 17:20:04 +02:00
shhnwz
ca472ebb0d Fixed transparent decompress chunk
The transparent decompress chunk test was added to the ignore list due
to a side effect of #5118. This commit fixes the flaky nature of
the test.
2023-04-18 14:12:22 +05:30
Sven Klemm
90e54def8a Improve interpolate error message on datatype mismatch
Include information about the expected and the returned datatype
in the error details of interpolate.
2023-04-15 09:38:42 +02:00
Alexander Kuzmenkov
827684f3e2 Use prepared statements for parameterized data node scans
This allows us to avoid replanning the inner query on each new loop,
speeding up the joins.
2023-03-15 18:22:01 +04:00
Pallavi Sontakke
6be14423d5
Flag test space_constraint.sql.in for release run (#5380)
It was incorrectly flagged as requiring a debug build.

Disable-check: force-changelog-changed
2023-03-03 15:52:34 +05:30
Erik Nordström
b81033b835 Make data node command execution interruptible
The function to execute remote commands on data nodes used a blocking
libpq API that doesn't integrate with PostgreSQL interrupt handling,
making it impossible for a user or statement timeout to cancel a
remote command.

Refactor the remote command execution function to use a non-blocking
API and integrate with PostgreSQL signal handling via WaitEventSets.

Partial fix for #4958.

Refactor remote command execution function
2023-02-03 13:15:28 +01:00
Bharathy
3a8d294d58 SELECT from partial compressed chunks crashes
SELECT from a partially compressed chunk crashed due to a NULL pointer
dereference. When generating paths for DecompressChunk,
uncompressed_partial_path can be NULL, which was not checked, thus causing
a crash. This patch checks for NULL before calling create_append_path().

Fixes #5134
2023-01-02 20:40:30 +05:30
Sven Klemm
c0e9bb4a30 Fix enabling compression on caggs with renamed columns
On caggs with realtime aggregation, changing the column name does
not update all the column aliases inside the view metadata.
This patch changes the code that creates the compression
configuration for caggs to get the column name from the materialization
hypertable instead of the view internals.

Fixes #5100
2022-12-19 10:46:37 +01:00
Lakshmi Narayanan Sreethar
3b3846b0ff Fix assertion failure in cursor_fetcher_rewind
The cursor_fetcher_rewind method assumes that the data node cursor is
rewound either after EOF or when there is an associated request. But the
rewind can also occur once the server has generated the required number
of rows by joining the relation being scanned with another regular
relation. In this case, the fetch would not have reached EOF and there
will be no associated request, as the rows would have already been
loaded into the cursor, causing the assertion in cursor_fetcher_rewind
to fail. Fix this by removing the Assert and updating
cursor_fetcher_rewind to discard the response only if there is an
associated request.

Fixes #5053
2022-12-14 21:00:53 +05:30
Erik Nordström
fd42fe76fa Read until EOF in COPY fetcher
Ensure the COPY fetcher implementation reads data until EOF with
`PQgetCopyData()`. Also ensure the malloc'ed copy data is freed with
`PQfreemem()` if an error is thrown in the processing loop.

Previously, the COPY fetcher didn't read until EOF, and instead
assumed EOF when the COPY file trailer was received. Since EOF wasn't
reached, it required terminating the COPY with an extra call to the
(deprecated) `PQendcopy()` function.

Still, there are cases when a COPY needs to be prematurely terminated,
for example, when querying with a LIMIT clause. Therefore, distinguish
between "normal" end (when receiving EOF) and forceful end (cancel the
ongoing query).
2022-12-05 18:28:35 +01:00
Bharathy
7bfd28a02f Fix dist_fetcher_type test on PG15 2022-11-24 18:41:46 +05:30
Fabrízio de Royes Mello
a4356f342f Remove trailing whitespaces from test code 2022-11-18 16:31:47 -03:00
Bharathy
bfa641a81c INSERT .. SELECT on distributed hypertable fails on PG15
An INSERT .. SELECT query containing distributed hypertables generates a
plan with a DataNodeCopy node, which is not supported. The issue is in
the function tsl_create_distributed_insert_path(), where we decide
whether to generate a DataNodeCopy or a DataNodeDispatch node based on
the kind of query. On PG15, the timescaledb planner generates
DataNodeCopy for INSERT .. SELECT queries because rte->subquery is set
to NULL; this is due to a commit in PG15 that sets rte->subquery to
NULL as part of a fix.

This patch checks whether the SELECT subquery has distributed
hypertables by looking into root->parse->jointree, which represents the
subquery.

Fixes #4983
2022-11-17 21:18:23 +05:30
Sven Klemm
9744b4f3bc Remove BitmapScan support in DecompressChunk
We don't want to support BitmapScans below DecompressChunk
as this adds additional complexity to support and there
is little benefit in doing so.
This fixes a bug that could happen when a BitmapScan was
parameterized on a compressed column, leading to an execution
failure with an error about incorrect attribute types in the
expression.
2022-11-08 23:29:29 +01:00
Bharathy
2a64450651 Add new tests to gitignore list
Since new tests specific to PG15 were added, their generated .sql files need to be added to .gitignore.
2022-11-07 22:14:39 +05:30
Bharathy
3a9688cc97 Extra Result node on top of CustomScan on PG15
On PG15, CustomScan is not projection capable by default, so PostgreSQL
wraps the node in a Result node. This change in PG15 causes test result
files that contain EXPLAIN output to fail. This patch fixes the plan outputs.

Fixes #4833
2022-11-07 21:20:08 +05:30
Alexander Kuzmenkov
da9af2c05d Do not cache the classify_relation result
It depends on the context, not only on the relation id. The same chunk
can be expanded both as a child of hypertable and as an independent
table.
2022-10-26 17:05:39 +04:00
Bharathy
0e32656b54 Support for PG15.
As part of this patch, added and fixed some of the regress checks which
fail on PG15.
2022-10-17 21:43:44 +05:30
Alexander Kuzmenkov
066bcbed6d Rename row-by-row fetcher to COPY fetcher
This name better reflects its characteristics, and I'm thinking about
resurrecting the old row-by-row fetcher later, because it can be useful
for parameterized queries.
2022-10-14 23:04:27 +03:00
Bharathy
38878bee16 Fix segmentation fault during INSERT into compressed hypertable.
INSERT into a compressed hypertable with the number of open chunks
greater than ts_guc_max_open_chunks_per_insert caused a segmentation
fault. A new row that needs to be inserted into a compressed chunk has
to be compressed first. The memory required for compressing a row is
allocated from the RowCompressor::per_row_ctx memory context. Once the
row is compressed, ExecInsert() is called, where memory from the same
context is allocated and freed instead of using the "Executor State"
context. This caused memory corruption.

Fixes: #4778
2022-10-13 20:48:23 +05:30
Alexander Kuzmenkov
30596c0c47 Batch rows on access node for distributed COPY
Group the incoming rows into batches on access node before COPYing to
data nodes. This gives 2x-5x speedup on various COPY queries to
distributed hypertables.

Also fix the text format passthrough, and prefer text transfer format
for text input to be able to use this passthrough. It saves a lot of
CPU on the access node.
2022-10-10 16:30:53 +03:00
Sven Klemm
8cda0e17ec Extend the now() optimization to also apply to CURRENT_TIMESTAMP
The optimization that constifies certain now() expressions before
hypertable expansion did not apply to CURRENT_TIMESTAMP even
though it is functionally similar to now(). This patch extends the
optimization to CURRENT_TIMESTAMP.
2022-10-05 20:46:30 +02:00
Dmitry Simonenko
ea5038f263 Add connection cache invalidation ignore logic
Calling the `ts_dist_cmd_invoke_on_data_nodes_using_search_path()`
function without an active transaction allows a connection invalidation
event to happen between applying `search_path` and the actual command
execution, which leads to an error.

This change introduces a way to ignore connection cache invalidations
using `remote_connection_cache_invalidation_ignore()` function.

This work is based on @nikkhils original fix and the problem research.

Fix #4022
2022-10-04 10:50:45 +03:00
Sven Klemm
1d4b9d6977 Fix join on time column of compressed chunk
Do not allow paths that are parameterized on a
compressed column to exist when creating paths
for a compressed chunk.
2022-09-29 10:36:02 +02:00
Sven Klemm
940187936c Fix segfault when INNER JOINing hypertables
This fixes a segfault when INNER JOINing two hypertables that are
ordered by time.
2022-09-28 17:12:45 +02:00
Sven Klemm
2529ae3f68 Fix chunk exclusion for prepared statements and dst changes
The code that constifies TIMESTAMPTZ expressions when doing chunk
exclusion did not account for daylight saving time switches,
leading to different calculation outcomes when the timezone changes.
This patch adds a 4-hour safety buffer to any such calculations.
2022-09-22 18:16:20 +02:00
Sven Klemm
ffd9dfb7eb Fix assertion failure in constify_now
The code added to support VIEWs did not account for the fact that
varno could be from a different nesting level and therefore not
be present in the current range table.
2022-09-16 17:40:03 +02:00
Alexander Kuzmenkov
fee27484ce Do not use row-by-row fetcher for parameterized plans
We have to prepare the data node statement in this case, and COPY
queries don't work with prepared statements.
2022-09-15 22:59:06 +03:00
Sven Klemm
d2baef3ef3 Fix planner chunk exclusion for VIEWs
Allow planner chunk exclusion in subqueries. When we decide whether
a query may benefit from constifying now() and encounter a subquery,
peek into the subquery and check whether the constraint references
a hypertable partitioning column.

Fixes #4524
2022-09-12 17:29:14 +02:00
Sven Klemm
a26a5974dc Improve space constraint exclusion datatype handling
This patch adjusts the operator logic for valid space dimension
constraints to no longer look for an exact match on both sides
of the operator but instead allow mismatched datatypes.

Previously, a constraint like `col = value` required `col` and `value`
to have matching datatypes. With this change, `col` and `value` can
have different datatypes as long as they have an equality operator in
a btree operator family.

Mismatched datatypes commonly occur when using int8 columns
and comparing them with integer literals. Integer literals default
to int4, so the datatypes would not match unless special care had
been taken in writing the constraints, and therefore the optimization
would never apply in those cases.
2022-09-11 10:57:54 +02:00
Sven Klemm
f27e627341 Fix chunk exclusion for space partitions in SELECT FOR UPDATE queries
Since we do not use our own hypertable expansion for SELECT FOR UPDATE
queries, we need to make sure to add the extra information necessary to
get hashed space partitions with the native postgres inheritance
expansion working.
2022-09-11 10:57:54 +02:00
Sven Klemm
b34b91f18b Add timezone support to time_bucket_gapfill
This patch adds a new time_bucket_gapfill function that
allows bucketing in a specific timezone.

You can gapfill with explicit timezone like so:
`SELECT time_bucket_gapfill('1 day', time, 'Europe/Berlin') ...`

Unfortunately this introduces an ambiguity with some previous
call variations when an untyped start/finish argument was passed
to the function. Some queries might need to be adjusted and either
explicitly name the positional argument or resolve the type ambiguity
by casting to the intended type.
2022-09-07 16:37:53 +02:00
Sven Klemm
1c0bf4b777 Support bucketing by month in time_bucket_gapfill 2022-08-22 19:07:32 +02:00
Sven Klemm
49b6486dad Change get_git_commit to return full commit hash
This patch changes get_git_commit to always return the full hash.
Since different git versions do not agree on the length of the
abbreviated hash, the length was flaky. To make the length
consistent, change it to always be the full hash.
2022-08-01 10:45:17 +02:00
Sven Klemm
eccd6df782 Throw better error message on incompatible row fetcher settings
When a query has multiple distributed hypertables, the row-by-row
fetcher cannot be used. This patch changes the fetcher selection
logic to throw a better error message in those situations.
Previously the following error would be produced in those situations:
unexpected PQresult status 7 when starting COPY mode
2022-07-29 11:40:00 +02:00
Sven Klemm
d5619283f3 Fix gapfill group comparison
The gapfill mechanism to detect an aggregation group change was
using datumIsEqual to compare the group values. datumIsEqual does
not detoast values, so when one value is toasted and the other
is not, it does not return the correct result. This patch changes
the gapfill code to use the correct equality operator for the type
of the group column instead of datumIsEqual.
2022-07-19 19:14:30 +02:00
Sven Klemm
0d175b262e Fix prepared statement param handling in ChunkAppend
This patch fixes the param handling in prepared statements for generic
plans in ChunkAppend making those params usable in chunk exclusion.
Previously those params would not be resolved and therefore not used
for chunk exclusion.

Fixes #3719
2022-07-19 14:50:17 +02:00
Sven Klemm
597b71881a Fix assertion hit in row_by_row_fetcher_close
When executing multinode queries that initialize the row-by-row fetcher
but never execute it, the node cleanup code would hit an assertion
checking the state of the fetcher. Found by sqlsmith.
2022-07-18 09:39:48 +02:00
Alexander Kuzmenkov
1bbb6059cb Add more tests for distributed INSERT and COPY
More interleavings of INSERT/COPY, and test with slow recv() to check
waiting.
2022-07-04 22:38:53 +05:30
Nikhil Sontakke
e3b2fbdf15 Fix empty bytea handlng with distributed tables
When selected, an "empty" bytea value in a column of a distributed
table was returned as "null". The actual value on the data nodes was
stored appropriately, but the return code path converted it into
"null" on the access node. This is now handled via the use of the
PQgetisnull() function.

Fixes #3455
2022-06-22 12:25:54 +05:30
Alexander Kuzmenkov
5c69adfb7e Add more tests for errors on data nodes
Use a data type with faulty send/recv functions to test various error
handling paths.
2022-06-21 14:55:14 +05:30
Sven Klemm
308ce8c47b Fix various misspellings 2022-06-13 10:53:08 +02:00
Sven Klemm
216ea65937 Enable chunk exclusion for space dimensions in UPDATE/DELETE
This patch transforms constraints on hash-based space partitions to make
them usable by postgres constraint exclusion.

If we have an equality condition on a space partitioning column, we add
a corresponding condition on get_partition_hash on this column. These
conditions match the constraints on chunks, so postgres' constraint
exclusion is able to use them and exclude the chunks.

The following transformations are done:

device_id = 1
becomes
((device_id = 1) AND (_timescaledb_internal.get_partition_hash(device_id) = 242423622))

s1 = ANY ('{s1_2,s1_2}'::text[])
becomes
((s1 = ANY ('{s1_2,s1_2}'::text[])) AND
(_timescaledb_internal.get_partition_hash(s1) = ANY ('{1583420735,1583420735}'::integer[])))

These transformations are not visible in EXPLAIN output as we remove
them again after hypertable expansion is done.
2022-06-07 13:10:28 +02:00
Erik Nordström
8f9975d7be Fix crash during insert into distributed hypertable
For certain inserts on a distributed hypertable, e.g., involving CTEs
and upserts, plans can be generated that weren't properly handled by
the DataNodeCopy and DataNodeDispatch execution nodes. In particular,
the nodes expect ChunkDispatch as a child node, but PostgreSQL can
sometimes insert a Result node above ChunkDispatch, causing the crash.

Further, behavioral changes in PG14 also caused the DataNodeCopy node
to sometimes wrongly believe a RETURNING clause was present. The check
for returning clauses has been updated to fix this issue.

Fixes #4339
2022-06-02 17:25:33 +02:00