214 Commits

Sven Klemm
789bb26dfb Lock down search_path in SPI calls 2023-02-01 07:54:03 +01:00
Nikhil Sontakke
c92e29ba3a Fix DML HA in multi-node
If a datanode goes down for whatever reason then DML activity to
chunks residing on (or targeted to) that DN will start erroring out.
We now handle this by marking the target chunk as "stale" for this
DN by changing the metadata on the access node. This allows us to
continue to do DML to replicas of the same chunk data on other DNs
in the setup. This obviously will only work for chunks which have
"replication_factor" > 1. Note that chunks which do not undergo any
change will continue to carry the appropriate DN-related metadata on
the AN.

This means that such "stale" chunks will become underreplicated and
need to be re-balanced using the copy_chunk functionality, e.g., by a
microservice or some such external process.
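
A minimal sketch of the kind of setup this relies on, assuming a
distributed hypertable named `conditions` (hypothetical):

```
-- With "replication_factor" > 1, each chunk has replicas on several
-- DNs, so DML can continue on the remaining replicas if one DN is down.
SELECT create_distributed_hypertable('conditions', 'time',
                                     replication_factor => 2);
```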

Fixes #4846
2022-11-25 17:42:26 +05:30
Erik Nordström
f13214891c Add function to alter data nodes
Add a new function, `alter_data_node()`, which can be used to change
the data node's configuration originally set up via `add_data_node()`
on the access node.

The new function introduces a new option "available" that allows
configuring the availability of the data node. Setting
`available=>false` means that the node should no longer be used for
reads and writes. Only read "failover" is implemented as part of this
change, however.

To fail over reads, the alter data node function finds all the chunks
for which the unavailable data node is the "primary" query target and
"fails over" to a chunk replica on another data node instead. If some
chunks do not have a replica to fail over to, a warning will be
raised.

When a data node is available again, the function can be used to
switch back to using the data node for queries.
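
An illustrative use of the new function (data node name hypothetical):

```
-- Mark the node as unavailable; reads fail over to chunk replicas.
SELECT alter_data_node('dn1', available => false);
-- Once the node is reachable again, switch queries back to it.
SELECT alter_data_node('dn1', available => true);
```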

Closes #2104
2022-11-11 13:59:42 +01:00
Fabrízio de Royes Mello
f1535660b0 Honor usage of OidIsValid() macro
The Postgres source code defines the macro `OidIsValid()` to check if an
Oid is valid or not (i.e., comparing against `InvalidOid`). See
`src/include/c.h` in the Postgres source tree.

Changed all direct comparisons against `InvalidOid` to use the
`OidIsValid()` call and added a Coccinelle check to make sure future
changes will use it correctly.
2022-11-03 16:10:50 -03:00
Ante Kresic
2475c1b92f Roll up uncompressed chunks into compressed ones
This change introduces a new option to the compression procedure which
decouples the uncompressed chunk interval from the compressed chunk
interval. It does this by allowing multiple uncompressed chunks to be
rolled up into one compressed chunk as part of the compression
procedure. The main use-case
is to allow much smaller uncompressed chunks than compressed ones. This
has several advantages:
- Reduce the size of btrees on uncompressed data (thus allowing faster
inserts because those indexes are memory-resident).
- Decrease disk-space usage for uncompressed data.
- Reduce number of chunks over historical data.

From a UX point of view, we simply add a compression WITH clause option
`compress_chunk_time_interval`. The user should set that according to
their needs for constraint exclusion over historical data. Ideally, it
should be a multiple of the uncompressed chunk interval and so we throw
a warning if it is not.
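
A sketch of the new option, assuming a hypertable `metrics` with a
1 hour chunk interval (names hypothetical):

```
-- Roll up uncompressed 1 hour chunks into 24 hour compressed chunks.
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_chunk_time_interval = '24 hours'
);
```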
2022-11-02 15:14:18 +01:00
Alexander Kuzmenkov
313845a882 Enable -Wextra
Our code mostly has warnings about comparisons between values of
different signedness.
2022-10-27 16:06:58 +04:00
Alexander Kuzmenkov
864da20cee Build on Ubuntu 22.04
It has newer GCC which should detect more warnings.
2022-10-26 23:32:05 +04:00
Mats Kindahl
276d3a331d Add macro to assert or error
For some unexpected conditions, we have a check and generate an error.
Since this always generates an error, it is more difficult to find the
bug than if an assert had fired and generated a core dump. Conversely,
some asserts can trigger in production builds and lead to strange
situations causing a crash. For those cases we should instead generate
an error.

This commit introduces a macro `Ensure` that will result in an assert
in debug builds, but an error message in release build. This macro
should only be used for conditions that should not occur during normal
runtime, but which can happen in odd corner-cases in release builds and
therefore warrant an error message.

It also replaces some existing checks with such errors to demonstrate
usage.
2022-10-20 13:35:09 +02:00
Alexander Kuzmenkov
7758f5959c Update .clang-format for version 14
The only configuration we're missing is the newline for braces after
case labels. The rest of the differences look like bugs/omissions in
version 8, which we use now.

Require clang-format-14 in cmake and use it in the CI check. We can't
support versions earlier than 14 because they have some
formatting differences that can't be configured.
2022-10-10 17:12:36 +03:00
Bharathy
d00a55772c error compressing wide table
Consider a compressed hypertable with many columns (e.g., more than 600).
In a call to compress_chunk(), the compressed tuple size can exceed 8K,
which causes an error like "row is too big: size 10856, maximum size 8160."

This patch estimates the tuple size of the compressed hypertable and
reports a warning when compression is enabled on the hypertable, so the
user becomes aware of the issue before calling compress_chunk().

Fixes #4398
2022-09-17 11:24:23 +05:30
Bharathy
b869f91e25 Show warnings during create_hypertable().
The schema of the base table on which a hypertable is created should
define columns with proper data types. As per the Postgres best
practices Wiki (https://wiki.postgresql.org/wiki/Don't_Do_This), one
should not define columns with CHAR, VARCHAR, or VARCHAR(N); instead use
the TEXT data type. Similarly, instead of using timestamp, one should
use timestamptz. This patch reports a WARNING to the end user when
creating a hypertable if the underlying parent table has columns of the
above-mentioned data types.
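
An illustrative pair of definitions (table and column names
hypothetical):

```
-- Creating a hypertable on this table triggers the new WARNINGs:
CREATE TABLE bad_readings (time timestamp, device varchar(32));
SELECT create_hypertable('bad_readings', 'time');
-- The recommended data types avoid them:
CREATE TABLE good_readings (time timestamptz, device text);
SELECT create_hypertable('good_readings', 'time');
```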

Fixes #4335
2022-09-12 18:47:47 +05:30
Dmitry Simonenko
c697700add Add hypertable distributed argument and defaults
This PR introduces a new `distributed` argument to the
create_hypertable() function as well as two new GUCs to
control its default behaviour: timescaledb.hypertable_distributed_default
and timescaledb.hypertable_replication_factor_default.

The main idea of this change is to allow automatic creation
of distributed hypertables by default.
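
A sketch of how this might be used (the GUC value shown is an
assumption; table name hypothetical):

```
-- Make new hypertables distributed by default.
SET timescaledb.hypertable_distributed_default = 'distributed';
SET timescaledb.hypertable_replication_factor_default = 2;
-- Or request it explicitly per table via the new argument:
SELECT create_hypertable('conditions', 'time', distributed => true);
```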
2022-08-29 17:44:16 +03:00
Alexander Kuzmenkov
51259b31c4 Fix OOM in large INSERTs
Do not allocate various temporary data in PortalContext, such as the
hyperspace point corresponding to the row, or the intermediate data
required for chunk lookup.
2022-08-23 19:40:51 +03:00
Erik Nordström
025bda6a81 Add stateful partition mappings
Add a new metadata table `dimension_partition` which explicitly and
statefully details how a space dimension is split into partitions, and
(in the case of multi-node) which data nodes are responsible for
storing chunks in each partition. Previously, partition and data nodes
were assigned dynamically based on the current state when creating a
chunk.

This is the first in a series of changes that will add more advanced
functionality over time. For now, the metadata table simply writes out
what was previously computed dynamically in code. Future code changes
will alter the behavior to do smarter updates to the partitions when,
e.g., adding and removing data nodes.

The idea of the `dimension_partition` table is to minimize changes in
the partition to data node mappings across various events, such as
changes in the number of data nodes, number of partitions, or the
replication factor, which affect the mappings. For example, increasing
the number of partitions from 3 to 4 currently leads to redefining all
partition ranges and data node mappings to account for the new
partition. Complete repartitioning can be disruptive to multi-node
deployments. With stateful mappings, it is possible to split an
existing partition without affecting the other partitions (similar to
partitioning using consistent hashing).

Note that the dimension partition table expresses the current state of
space partitions; i.e., the space-dimension constraints and data nodes
to be assigned to new chunks. Existing chunks are not affected by
changes in the dimension partition table, although an external job
could rewrite, move, or copy chunks as desired to comply with the
current dimension partition state. As such, the dimension partition
table represents the "desired" space partitioning state.
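
An illustrative way to inspect the desired partitioning state (a
sketch; assumes direct access to the catalog schema):

```
-- The mappings written out by this change live in the new catalog table:
SELECT * FROM _timescaledb_catalog.dimension_partition;
```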

Part of #4125
2022-08-02 11:38:32 +02:00
Alexander Kuzmenkov
3c56d3eceb Faster lookup of chunks by point
Don't keep the chunk constraints while searching. The number of
candidate chunks can be very large, so keeping these constraints is a
lot of work and uses a lot of memory. For finding the matching chunk,
it is enough to track the number of dimensions that matched a given
chunk id. After finding the chunk id, we can look up only the matching
chunk data with the usual function.

This saves some work when doing INSERTs.
2022-06-07 18:10:20 +05:30
Alexander Kuzmenkov
9012e2a20d Do not create a memory context for each Chunk
For some reason, we create a MemoryContext for each Chunk. This context
then is almost never used. Just don't do this.
2022-05-02 17:49:00 +05:30
Josh Soref
68aec9593c Fix various misspellings
This patch fixes various misspellings of committed, constraint and
insufficient in code, comments and documentation.
2022-04-22 11:06:52 +02:00
Mats Kindahl
f5fd06cabb Ignore invalid relid when deleting hypertable
When running `performDeletion` it is necessary to have a valid relation
id, but when doing a lookup using `ts_hypertable_get_by_id` this might
actually return a hypertable entry pointing to a table that does not
exist because it has been deleted previously. In this case, only the
catalog entry should be removed, but it is not necessary to delete the
actual table.

This scenario can occur if both the hypertable and a compressed table
are deleted as part of running a `sql_drop` event, for example, if a
compressed hypertable is defined inside an extension. In this case, the
compressed hypertable (indeed all tables) will be deleted first, and
the lookup of the compressed hypertable will find it in the metadata
but a lookup of the actual table will fail since the table does not
exist.

Fixes #4140
2022-03-14 14:03:49 +01:00
Erik Nordström
e56b95daec Add telemetry stats based on type of relation
Refactor the telemetry function and format to include stats broken
down on common relation types. The types include:

- Tables
- Partitioned tables
- Hypertables
- Distributed hypertables
- Continuous aggregates
- Materialized views
- Views

and for each of these types report (when applicable):

- Total number of relations
- Total number of children/chunks
- Total data volume (broken into heap, toast, and indexes).
- Compression stats
- PG stats, like reltuples

The telemetry function has also been refactored to return `jsonb`
instead of `text`. This makes it easier to query and manipulate the
resulting JSON format, and also gives cleaner output.
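
For example, the report can now be inspected and manipulated as JSON (a
sketch using the existing `get_telemetry_report` entry point):

```
SELECT jsonb_pretty(get_telemetry_report());
```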

Closes #3932
2022-02-08 09:44:55 +01:00
gayyappan
9f64df8567 Add ts_catalog subdirectory
Move files that are related to timescaledb catalog
access to this subdirectory
2022-01-24 16:58:09 -05:00
Sven Klemm
39645d56da Fix subtract_integer_from_now on 32-bit platforms
This patch fixes subtract_integer_from_now on 32-bit platforms,
improves error handling and adds some basic tests.
subtract_integer_from_now would trigger an assert when called
on a hypertable without an integer time dimension (found by sqlsmith).
Additionally subtract_integer_from_now would segfault when called
on a hypertable without partitioning dimensions.
2021-12-20 10:02:57 +01:00
Sven Klemm
7d9ea6237c Remove namein calls from scankey initialization
A lot of our scankey initialization when scanning indexes for
a name had a superfluous namein call. This patch removes those
unnecessary calls.
2021-11-03 14:56:32 +01:00
Dmitry Simonenko
3d11927567 Rework distributed DDL processing logic
This patch refactors and reworks the logic behind the
dist_ddl_preprocess() function.

The idea behind it is to simplify the process by splitting each DDL
command's logic into a separate function and to avoid relying on the
hypertable list count to make decisions.

This change makes it easier to process more complex commands (such as
GRANT) that require a query rewrite or execution on different data
nodes. Additionally, it makes the code easier to follow and more
similar to the main code path inside src/process_util.c.
2021-10-29 16:15:58 +03:00
Nikhil Sontakke
68697859df Fix GRANT/REVOKE ALL IN SCHEMA handling
Fix the "GRANT/REVOKE ALL IN SCHEMA" handling uniformly across
single-node and multi-node.

Even though this is a SCHEMA-specific activity, we decided to
include the chunks even if they are part of another SCHEMA. So
they will also end up getting/resetting the same privileges.

Includes test case changes for both single-node and multi-node use
cases.
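
An illustrative pair of statements covered by the fix (role name
hypothetical):

```
-- Chunks are now included even when they live in another schema.
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;
REVOKE SELECT ON ALL TABLES IN SCHEMA public FROM readonly_user;
```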
2021-10-22 16:48:16 +05:30
Sven Klemm
f0ba729acb Adjust code to PG14 es_result_rel_info removal
PG14 removes es_result_relation_info from executor state.

https://github.com/postgres/postgres/commit/a04daa97
2021-10-04 13:26:22 +02:00
Sven Klemm
36a82d0851 Fix compiler warning about missing braces
Older gcc versions will throw a warning about missing braces when
a nested struct is initialized with {0}.
2021-08-17 18:39:54 +02:00
Erik Nordström
98110af75b Constify parameters and return values of core APIs
Harden core APIs by adding the `const` qualifier to pointer parameters
and return values passed by reference. Adding `const` to APIs has
several benefits and potentially reduces bugs.

* Allows core APIs to be called using `const` objects.
* Callers know that objects passed by reference are not modified as a
  side-effect of a function call.
* Returning `const` pointers enforces "read-only" usage of pointers to
  internal objects, forcing users to copy objects when mutating them
  or using explicit APIs for mutations.
* Allows compiler to apply optimizations and helps static analysis.

Note that these changes are so far only applied to core API
functions. Further work can be done to improve other parts of the
code.
2021-06-14 22:09:10 +02:00
Matvey Arye
b72dab16c0 Add some more randomness to chunk assignment
Previously the assignment of data nodes to chunks had a bit
of a thundering-herd problem for multiple hypertables
without space partitions: the data node assigned for the
first chunk was always the same across hypertables.
We fix this by adding the hypertable_id to the
index into the datanode array. This de-synchronizes
across hypertables but maintains consistency for any
given hypertable.

We could make this consistent for space partitioned tables
as well but avoid doing so now to prevent partitions
jumping nodes due to this change.

This also affects tablespace selection in the same way.
2021-06-08 14:04:23 +02:00
Sven Klemm
fb863f12c7 Remove support for PG11
Remove support for compiling against PostgreSQL 11. This patch also
removes PG11 specific compatibility macros.
2021-06-01 20:21:06 +02:00
Sven Klemm
99ffe8fd6c Fix blocking triggers with transition tables
Block creation of triggers with transition tables on trigger creation
instead of erroring out when a chunk is created.
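
An illustrative trigger that is now rejected at creation time (table
and function names hypothetical):

```
-- Transition tables (the REFERENCING clause) are unsupported on
-- hypertables, so this now errors at CREATE TRIGGER time.
CREATE TRIGGER track_inserts
    AFTER INSERT ON conditions
    REFERENCING NEW TABLE AS new_rows
    FOR EACH STATEMENT EXECUTE FUNCTION log_new_rows();
```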

Fixes #3234
2021-05-20 21:55:26 +02:00
Sven Klemm
eef71fdfb1 Replace StrNCpy with strlcpy
PG14 removes StrNCpy and some Name helper functions.

https://github.com/postgres/postgres/commit/1784f278a6
2021-05-20 08:54:54 +02:00
Ruslan Fomkin
639aef76a4 Refactor chunk creation for future extension
Separates chunk preparation and metadata update. Separates preparation
of constraint names, since there is no overlap between preparing names
for dimension constraints and other constraints. Factors out creation
of the JSON string describing the dimension slices of a chunk.

This refactoring is preparation for implementing new functionalities.
2021-04-06 14:02:22 +02:00
gayyappan
f649736f2f Support ADD COLUMN for compressed hypertables
Support ALTER TABLE .. ADD COLUMN <colname> <typname>
for hypertables with compressed chunks.
2021-01-14 09:32:50 -05:00
Erik Nordström
ec82d16154 Remove unused field for ignoring invalidations
Remove an unused field called `max_ignore_invalidation_older_than`
that was left in the `Hypertable` struct after refactoring of
continuous aggregates.
2020-12-21 13:46:43 +01:00
gayyappan
bca1e35a52 Support disabling compression on distributed hypertables
This change allows a user to execute
ALTER TABLE hyper SET (timescaledb.compress = false)
on distributed hypertables.

Fixes #2716
2020-12-08 12:08:54 -05:00
gayyappan
d41aa2aff5 Rename macro TS_HYPERTABLE_HAS_COMPRESSION
This PR cleans up some macro names.
Rename macro TS_HYPERTABLE_HAS_COMPRESSION
to TS_HYPERTABLE_HAS_COMPRESSION_TABLE.
2020-12-02 15:44:48 -05:00
gayyappan
7c76fd4d09 Save compression settings on access node for distributed hypertables
1. Add a compression_state column to the hypertable catalog by
renaming the catalog table's compressed column.
compression_state is a tri-state column: it indicates whether the
hypertable has compression enabled (value = 1) or is an internal
compression table (value = 2).

2. Save compression settings on the access node when compression
is turned on for a distributed hypertable. For a distributed
hypertable that has compression enabled, compression_state is set,
but we don't create any internal compression tables on the access
node.

Fixes #2660
2020-12-02 10:42:57 -05:00
Dmitry Simonenko
bbcb2b22fa Fix SCHEMA DROP CASCADE with continuous aggregates
This change fixes the situation when a schema object is dropped before
the cagg, which leads to an error when the cagg tries to resolve it.

Issue: #2350
2020-11-10 11:46:19 +03:00
Mats Kindahl
e9cb14985e Read function name dynamically
In some cases the function name is hard-coded in the C function, so
this commit instead defines and uses a macro that extracts the
function name from the `fcinfo` structure. This prevents mismatches
between the hard-coded names and the actual function name.

Closes #2579
2020-10-21 15:03:32 +02:00
Erik Nordström
3cf9c857c4 Make errors and messages conform to style guide
Errors and messages are overhauled to conform to the official
PostgreSQL style guide. In particular, the following things from the
guide have been given special attention:

* Correct capitalization of first letter: capitalize only hint
  and detail messages.
* Correct handling of periods at the end of messages (should be elided
  for primary message, but not detail and hint messages).
* The primary message should be short, factual, and avoid reference to
  implementation details such as specific function names.

Some messages have also been reworded for clarity and to better
conform with the last bullet above (short primary message). In other
cases, messages have been updated to fix references to, e.g., function
parameters that used the wrong parameter name.

Closes #2364
2020-10-20 16:49:32 +02:00
Erik Nordström
2cc2df23bd Lock dimension slices when creating new chunk
This change makes two changes to address issues with processes doing
concurrent inserts and `drop_chunks` calls:

- When a new chunk is created, any dimension slices that existed prior
  to creating the new chunk are locked to prevent them from being
  dropped before the chunk-creating process commits.

- When a chunk is being dropped, concurrent inserts into the chunk
  that is being dropped will try to lock the dimension slices of the
  chunk. In case the locking fails (due to the slices being
  concurrently deleted), the insert process will treat the chunk as
  not existing and will instead recreate it. Previously, the chunk
  slices (and thus chunk) would be found, but the insert would fail
  when committing since the chunk was concurrently deleted.

A prior commit (PR #2150) partially solved a related problem, but
didn't lock all the slices of a chunk. That commit also threw an error
when a lock on a slice could not be taken due to the slice being
deleted by another transaction. This is now changed to treat that case
as a missing slice instead, causing it to be recreated.

Fixes #1986
2020-10-15 21:56:10 +02:00
Mats Kindahl
da97ce6e8b Make function parameter names consistent
Renaming the parameter `hypertable_or_cagg` in functions `drop_chunks`
and `show_chunks` to `relation`, and changing the parameter name from
`main_table` to `hypertable` or `relation` depending on context.
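
An illustrative call with the renamed parameter (table name
hypothetical):

```
SELECT drop_chunks(relation => 'conditions',
                   older_than => INTERVAL '3 months');
```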
2020-10-02 08:52:20 +02:00
Erik Nordström
c884fe43f0 Remove unused materializer code
The new refresh functionality for continuous aggregates replaces the
old materializer, which means some code is no longer used and should
be removed.

Closes #2395
2020-09-15 17:18:59 +02:00
Erik Nordström
f49492b83d Cap invalidation threshold at last data bucket
When refreshing with an "infinite" refresh window going forward in
time, the invalidation threshold is also moved forward to the end of
the valid time range. This effectively renders the invalidation
threshold useless, leading to unnecessary write amplification.

To handle infinite refreshes better, this change caps the refresh
window at the end of the last bucket of data in the underlying
hypertable, so as not to move the invalidation threshold further than
necessary. For instance, if the max time value in the hypertable is
11, a refresh command such as:

```
CALL refresh_continuous_aggregate(NULL, NULL);
```
would be turned into
```
CALL refresh_continuous_aggregate(NULL, 20);
```

assuming that a bucket starts at 10 and ends at 20 (exclusive). Thus
the invalidation threshold would at most move to 20, allowing the
threshold to still do its work once time again moves forward and
beyond it.

Note that one must never process invalidations beyond the invalidation
threshold without also moving it, as that would clear that area from
invalidations and thus prohibit refreshing that region once the
invalidation threshold is moved forward. Therefore, if we do not move
the threshold further than a certain point, we cannot refresh beyond
it either. An alternative, and perhaps safer, approach would be to
always invalidate the region over which the invalidation threshold is
moved (i.e., new_threshold - old_threshold). However, that is left for
a future change.

It would be possible to also cap non-infinite refreshes, e.g.,
refreshes that end at a higher time value than the max time value in
the hypertable. However, when an explicit end is specified, it might
be on purpose so optimizing this case is also left for the future.

Closes #2333
2020-09-09 19:46:28 +02:00
Mats Kindahl
4f32439362 Update tablespace of table on attach and detach
If a tablespace is attached to a hypertable, the tablespace of the
hypertable is not set; but if the hypertable's tablespace is set, that
tablespace is also attached. A similar situation occurs when
tablespaces are detached.
This means that if a hypertable is created with a tablespace and then
all tablespaces are detached, the chunks will still be put in the
tablespace of the hypertable.

With this commit, attaching a tablespace to a hypertable will set the
tablespace of the hypertable if it does not already have one. Detaching
a tablespace from a hypertable will set the tablespace to the default
tablespace if the tablespace being detached is the tablespace for the
hypertable.

If `detach_tablespace` is called with only a tablespace name, it will
be detached from all tables it is attached to. This commit ensures that
the tablespace for the hypertable is set to the default tablespace if
it was set to the tablespace being detached.
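
Illustrative calls (tablespace and table names hypothetical):

```
SELECT attach_tablespace('fast_ssd', 'conditions');
-- Detach from this hypertable only; if it was the hypertable's
-- tablespace, the hypertable is reset to the default tablespace.
SELECT detach_tablespace('fast_ssd', 'conditions');
-- Or detach the tablespace from all tables it is attached to:
SELECT detach_tablespace('fast_ssd');
```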

Fixes #2299
2020-09-09 09:30:07 +02:00
Erik Nordström
4538fc6c40 Optimize continuous aggregate refresh
This change ensures a refresh of a continuous aggregate only
re-materializes the part of the aggregate that has been
invalidated. This makes refreshing much more efficient, and sometimes
eliminates the need to materialize data entirely (i.e., in case there
are no invalidations in the refresh window).

The ranges to refresh are the remainders of invalidations after they
are cut by the refresh window (i.e., all invalidations, or parts of
invalidations, that fall within the refresh window). The invalidations
used for a refresh are collected in a tuple store (which spills to
disk) so as not to allocate too much memory in case of many
invalidations. Invalidations are, however, merged and deduplicated
before being added to the tuplestore, similar to how invalidations are
processed in the invalidation logs.

Currently, the refreshing proceeds with just materializing all
invalidated ranges in the order they appear in the tuple store, and
the ordering does not matter since all invalidated regions are
refreshed in the same transaction.
2020-08-31 10:22:32 +02:00
Mats Kindahl
c054b381c6 Change syntax for continuous aggregates
We change the syntax for defining continuous aggregates to use `CREATE
MATERIALIZED VIEW` rather than `CREATE VIEW`. The command still creates
a view, even though a plain `CREATE MATERIALIZED VIEW` creates a table. Raise an
error if `CREATE VIEW` is used to create a continuous aggregate and
redirect to `CREATE MATERIALIZED VIEW`.

In a similar vein, `DROP MATERIALIZED VIEW` is used for continuous
aggregates and continuous aggregates cannot be dropped with `DROP
VIEW`.

Continuous aggregates are altered using `ALTER MATERIALIZED VIEW`
rather than `ALTER VIEW`, so we ensure that it works for `ALTER
MATERIALIZED VIEW` and gives an error if you try to use `ALTER VIEW` to
change a continuous aggregate.

Note that we allow `ALTER VIEW ... SET SCHEMA` to be used with the
partial view as well as with the direct view, so this is handled as a
special case.
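
An illustrative continuous aggregate under the new syntax (table and
column names hypothetical):

```
CREATE MATERIALIZED VIEW conditions_daily
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS day, avg(temperature) AS avg_temp
FROM conditions
GROUP BY day
WITH NO DATA;

-- Dropping likewise requires the MATERIALIZED VIEW form:
DROP MATERIALIZED VIEW conditions_daily;
```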

Fixes #2233

Co-authored-by: Erik Nordström <erik@timescale.com>
Co-authored-by: Mats Kindahl <mats@timescale.com>
2020-08-27 17:16:10 +02:00
Mats Kindahl
769bc31dc2 Lock dimension slice tuple when scanning
In the function `ts_hypercube_from_constraints` a hypercube is built
from constraints which reference dimension slices in `dimension_slice`.
As part of a run of `drop_chunks`, or when a chunk is explicitly dropped
as part of other operations, dimension slices can be removed from this
table, which makes the hypercube reference non-existent dimension
slices and subsequently causes a crash.

This commit fixes this by adding a tuple lock on the dimension slices
that are used to build the hypercube.

If two `drop_chunks` are running concurrently, there can be a race if
dimension slices are removed as a result of removing a chunk. We treat
this case in the same way as if the dimension slice was updated: report
an error that another session locked the tuple.

Fixes #1986
2020-08-26 09:44:20 +02:00
Mats Kindahl
aec7c59538 Block data migration for distributed hypertables
Option `migrate_data` does not currently work for distributed
hypertables, so we block it for the time being and generate an error if
an attempt is made to migrate data when creating a distributed
hypertable.
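
An illustrative attempt that now fails up front (table name
hypothetical):

```
-- Blocked for distributed hypertables until data migration is supported:
SELECT create_distributed_hypertable('conditions', 'time',
                                     migrate_data => true);
```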

Fixes #2230
2020-08-20 15:07:01 +02:00
Sven Klemm
bb891cf4d2 Refactor retention policy
This patch changes the retention policy to store its configuration
in the bgw_job table and removes the bgw_policy_drop_chunks table.
2020-08-03 22:33:54 +02:00