Remove functions that are no longer used due to refactorings.
Removes the following functions:
- hypertable_tuple_match_name
- ts_hypertable_get_all_by_name
- ts_hypertable_has_tuples
This change ensures that API functions and DDL operations
which modify data respect the read-only transaction state
set by the default_transaction_read_only option.
Function `ts_hypertable_create_from_info` will error if the hypertable
already exists unless the if-not-exists flag is set. If we reach
this point, either the if-not-exists flag was set or the hypertable did
not exist and was created above.
In `ts_hypertable_create_from_info`, a call to
`ts_dimension_info_validate` with `space_dim_info` is made if (and
only if) the table did not exist. The function does not only validate
the dimension, it also sets the `dimension_id` field.
If the table already existed, `created` will be false and the
`dimension_id` will be set to the invalid OID, which means that
`ts_hypertable_check_partitioning` will crash since it expects a
proper OID to be passed.
This commit fixes that by checking if the hypertable exists prior to
calling `ts_hypertable_create_from_info` in
`ts_hypertable_create_internal` and aborting with an error if the
hypertable already exists.
Fixes #1987
This patch removes code support for PG9.6 and PG10. In addition to
removing the PG96 and PG10 macros, the following changes are done:
- remove HAVE_INT64_TIMESTAMP since this is always true on PG10+
- remove PG_VERSION_SUPPORTS_MULTINODE
This change modifies the default 'associated_table_prefix' string
for distributed hypertables from "_hyper_n" to "_dist_hyper_n" where
n is the hypertable id of the table. This change makes it so that
when the backend hypertable for a distributed hypertable is created
on a data node, it won't have an 'associated_table_prefix' that may
conflict with locally created tables.
As part of this change, chunk names for distributed tables no longer
append the string "dist" as this is now part of the prefix for those
chunk names (i.e. instead of "_hyper_1_1_dist_chunk" we will now
have "_dist_hyper_1_1_chunk").
Implements the SQL function set_replication_factor, which changes the
replication factor of a distributed hypertable. The change of the
replication factor doesn't affect existing chunks. Newly created
chunks are replicated according to the new replication factor.
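For illustration only (the hypertable name is hypothetical), raising
the replication factor used for future chunks could look like:
`SELECT set_replication_factor('disttable', 2);`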
Prior to this change attempting to add a dimension to a distributed
hypertable which currently or previously contained data would fail
with an opaque error. This change will properly test distributed
hypertables when adding dimensions and will print appropriate errors.
It is not possible to properly reference another table from a
distributed hypertable since this would require replication of the
referenced table.
This commit adds a warning message when a distributed hypertable
attempts to reference any other table using a foreign key.
This change includes telemetry fixes which extend HypertablesStat
with num_hypertables_compressed. It also updates the way the number
of regular hypertables is calculated: a regular hypertable is now
treated as non-compressed and not related to continuous aggregates.
Since a NULL value for the replication factor in SQL DDL now
corresponds to HYPERTABLE_REGULAR, which is different from
HYPERTABLE_DISTRIBUTED_MEMBER, there is no need to check for a
non-NULL value; comparing with HYPERTABLE_DISTRIBUTED_MEMBER is enough.
This change will make `create_distributed_hypertable` attach only
those data nodes that the creating user has USAGE permission on
instead of trying to attach all data nodes (even when permissions do
not allow it). This prevents an ERROR when a user tries to create a
distributed hypertable and lacks USAGE on one or more data nodes. This
new behavior *only* applies when no explicit set of data nodes is
specified. When an explicit set is specified the behavior remains
unchanged, i.e., USAGE is required on all the explicitly specified
data nodes.
Note that, with distributed hypertables, USAGE permission on a data
node (foreign server) only governs the ability to attach it to a
hypertable. This is analogous to the regular behavior of foreign
servers where USAGE governs who can create foreign tables using the
server. The actual usage of the server once associated with a table is
not affected as that would break the table if permissions are revoked.
The new behavior introduced with this change makes it simpler to use
`create_distributed_hypertable` in a multi-user and multi-permission
environment where users have different permissions on data nodes
(DNs). For instance, imagine user A being allowed to attach DN1 and
DN2, while user B can attach DN2 and DN3. Without this change,
`create_distributed_hypertable` will always fail since it tries to
attach DN1, DN2, and DN3 irrespective of the user that calls the
function. Even worse, if only DN1 and DN2 existed initially, user A
would be able to create distributed hypertables without errors, but,
as soon as DN3 is added, `create_distributed_hypertable` would start
to fail for user A.
The only way to avoid the above described errors when creating
distributed hypertables is for users to pass an explicit set of data
nodes that only includes the data nodes they have USAGE on. If a
user is forced to do that, the result would in any case be the same as
that introduced with this change. Unfortunately, users currently have
no way of knowing which data nodes they have USAGE on unless they
query the PostgreSQL catalogs. In many cases, a user might not even
care about the data nodes they are using if they aren't DBAs
themselves.
To summarize the new behavior:
* If public (everyone) has USAGE on all data nodes, there is no
change in behavior.
* If a user has USAGE on a subset of data nodes, it will by default
attach only those data nodes since it cannot attach the other ones
anyway.
* If the user specifies an explicit set of data nodes to attach, all
of those nodes still require USAGE permissions. (Behavior
unchanged.)
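For illustration (node and table names hypothetical, and assuming the
`data_nodes` parameter of `create_distributed_hypertable`), explicitly
pinning the set of data nodes still requires USAGE on each of them:
`SELECT create_distributed_hypertable('disttable', 'time', data_nodes => '{ "dn1", "dn2" }');`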
This change extends the telemetry report and adds a new
'distributed_db' section which includes the following keys:
'distributed_member', 'data_nodes_count' and
'distributed_hypertables_count'.
This change ensures that all dimension-related DDL commands on
hypertables are distributed to their data nodes. Most importantly,
`add_dimension()` now applies across all data nodes of a hypertable
when issued on the access node. This ensures the dimension
configuration is the same on all nodes of a particular hypertable.
While the configuration of chunk time interval and number of
partitions do not matter that much for data nodes (since the access
node takes care of sizing chunks), functions like
`set_chunk_time_interval()` and `set_number_partitions()` are
distributed as well. This ensures that dimension-related configuration
changes apply across all nodes.
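As a sketch (names hypothetical), adding a space dimension on the
access node now also applies it on every data node of the hypertable:
`SELECT add_dimension('disttable', 'device', number_partitions => 3);`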
Running `drop_chunks` on a distributed hypertable should remove chunks
from all its data nodes. To make it work we send the same SQL command
to all involved data nodes.
For convenience, this adds the option to create a distributed
hypertable without specifying the number of partitions in the space
dimension even in the case when no data nodes are specified
(defaulting to the data nodes added to the database).
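For example (names hypothetical), the following now works without
listing data nodes or a partition count; both default based on the
data nodes added to the database:
`SELECT create_distributed_hypertable('disttable', 'time', 'device');`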
Distributed hypertables are now repartitioned when attaching new data
nodes and the current number of partitions (slices) in the first closed
(space) dimension is less than the number of data nodes. Increasing
the number of partitions is necessary to make use of a newly attached
data node. However, repartitioning is optional and can be avoided via
a boolean parameter in `attach_server()`.
In addition to the above repartitioning, this change also adds
informational messages to `create_hypertable` and
`set_number_partitions` to raise awareness of situations when the
number of partitions in the space dimensions is lower than the number
of attached data nodes.
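As a rough sketch (the argument order and the name of the boolean
parameter are assumptions), skipping repartitioning when attaching a
new data node might look like:
`SELECT attach_server('disttable', 'server_4', repartition => false);`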
The timescale clustering code so far has been written referring to the
remote databases as 'servers'. This terminology is a bit overloaded,
and in particular we don't enforce any network topology limitations
that the term 'server' would suggest. In light of this we've decided
to change to use the term 'node' when referring to the different
databases in a distributed database. Specifically we refer to the
frontend as an 'access node' and to the backends as 'data nodes',
though we may omit the access or data qualifier where it's unambiguous.
As the vast bulk of the code so far has been written for the case where
there was a single access node, almost all instances of 'server' were
references to data nodes. This change has updated the code to rename
those instances.
This change includes only the rename changes required by the renaming
of server to data node across the clustering codebase. This change
is being committed separately from the bulk of the rename changes to
prevent git from losing the file history of renamed files (merging the
rename with extensive code modifications resulted in git treating some
of the file moves as a file delete and new file creation).
This refactors the `hypertable_distributed` test to make better use of
the `remote_exec` utility function. The refactoring also makes sure we
actually use space partitioning when testing distributed hypertables.
Chunks are placed across data nodes based on the ordinal of the slice
in the first space dimension, if such a dimension exists. For
instance, if a chunk belongs to the second slice in the space
dimension, this ordinal number will be used modulo the number of
data nodes to find the data node to place the chunk on.
However, the ordinal is calculated based on the existing slices in the
dimension, and, because slices are created lazily, the ordinal of a
specific slice might vary until all slices are created in the space
dimension. This has the result that chunks aren't consistently placed
on data nodes based on their space partition, prohibiting some
push-down optimizations that rely on consistent partitioning.
This change ensures the ordinal of a space slice is calculated as if
all slices in the dimension are pre-existing. This might still lead to
inconsistencies during times of repartitioning, but fixes issues that
occur initially when no slices exist.
Prevent server delete if the server contains data, unless the user
specifies `force => true`. In case the server is the only data
replica, we don't allow delete/detach unless the tables/chunks are
dropped. The idea is to have the same semantics for delete as for
detach since delete actually calls detach.
We also try to update pg_foreign_table when we delete a server if
there is another server containing the same chunk.
An internal function is added to enable updating the foreign table
server, which might be useful in some cases since the foreign table
server is considered the default server for that particular chunk.
Since this command needs to work even if the server we're trying to
remove is not responsive, we're not removing any data on the remote
data node.
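For example (server name hypothetical), forcing removal of a server
that still holds data:
`SELECT delete_server('server_1', force => true);`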
This change adds support for blocking DDL operations on data nodes
when not executed via the access node.
Specifically, DDL operations can only be executed on data nodes if the
request came on a connection from the access node or DDL operations
have been explicitly allowed by setting
`timescaledb.enable_client_ddl_on_data_servers=true`.
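For example, a data node administrator could explicitly allow client
DDL (shown here via `SET` as a sketch; the setting can also go in the
configuration file):
`SET timescaledb.enable_client_ddl_on_data_servers = true;`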
This functionality enables users to block or allow creation of new
chunks on a data node for one or more hypertables. Use cases for this
include the ability to block new chunks when a data node is running
low on disk space or to affect chunk distribution across data nodes.
Sometimes blocking data nodes for new chunks can make a hypertable
under-replicated. For that case an additional argument `force => true`
can be supplied to force blocking new chunks.
Here are some examples.
Block for a specific hypertable:
`SELECT * FROM block_new_chunks_on_server('server_1', 'disttable');`
Block for all hypertables on the server:
`SELECT * FROM block_new_chunks_on_server('server_1', force => true);`
Unblock:
`SELECT * FROM allow_new_chunks_on_server('server_1', true);`
This change adds the `force` argument to `detach_server` as well. If
detaching or blocking new chunks would make a hypertable
under-replicated, then `force => true` needs to be used.
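For example (names hypothetical, and assuming the hypertable is passed
as the second argument), detaching a server even if it leaves a
hypertable under-replicated:
`SELECT detach_server('server_1', 'disttable', force => true);`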
Hypertables created on a data node by an access node (via
`create_distributed_hypertable()`) will now have their
`replication_factor` set to -1. This makes it possible to distinguish
regular data node hypertables from those that are part of a larger
distributed hypertable.
This functionality will be needed for decision making based on the
connection type, for example to allow or block DDL commands on a data
node.
This change adds support for pushing down FULL partitionwise
aggregates to remote servers. Partial partitionwise aggregates cannot
yet be pushed down since that requires a way to tell the remote server
to compute a specific partial.
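As an illustration (table and column names hypothetical), a full
aggregate grouped on the space partitioning column can now be computed
entirely on the remote servers:
`SELECT device, avg(temp) FROM disttable GROUP BY device;`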
NOTE: Push-down aggregates are a PG11-only feature as they build on
top of the partitionwise aggregate push-down only available in
PG11. Therefore, a number of query-running tests now only run on PG11,
since these have different output on PG10.
To make push-downs work on a per-server basis, hypertables are now
first expanded into chunk append plans. This is useful to let the
planner do chunk exclusion and cost estimation of individual
chunks. The append path is then converted into a per-server plan by
grouping chunks by server, with reduced cost because there is only
one startup cost per server instead of per chunk.
Future optimizations might consider avoiding the original per-chunk
plan computation, in order to increase planning speed.
To make use of existing PostgreSQL planning code for partitionwise
aggregates, we need to create range table entries for the server
relations even though these aren't "real" tables in the system. This
is because the planner code expects those entries to be present for
any "partitions" it is planning aggregates on (note that in
"declarative partitioning" all partitions are system tables). For this
purpose, we create range table entries for each server that points to
the root hypertable relation. This is in a sense "correct" since each
per-server relation is an identical (albeit partial) hypertable on the
remote server. The upside of pointing the server rel's range table
entry to the root hypertable is that the planner can make use of the
indexes on the hypertable for planning purposes. This leads to more
efficient remote queries when, e.g., ordering is important (i.e., we
get push down sorts for free).
Chunk server mappings are now cleaned up when dropping chunk tables
and foreign servers. In particular, this works even when such objects
are removed as a result of cascading deletions of other objects.
Some refactoring has been done to the event trigger handling code in
addition to adding support for new event objects.
When creating a hypertable with backend servers (replication_factor >
0), this will deparse the table structure and send the commands to
create it on all of the backend nodes. It will then send a command
to create the hypertable on each backend.
There is still some more work needed to handle assigning hypertables
with multiple space dimensions and supporting partitioning functions.
In distributed hypertables, chunks are foreign tables and such tables
do not support (or should not support) indexes, certain constraints,
and triggers. Therefore, such objects should not recurse to foreign
table chunks nor add mappings in the `chunk_constraint` or
`chunk_index` tables.
This change ensures that we properly filter out the indexes, triggers,
and constraints that should not recurse to chunks on distributed
hypertables.
A frontend node will now maintain mappings from a local chunk to the
corresponding remote chunks in a `chunk_server` table.
The frontend creates local chunks as foreign tables and adds entries
to `chunk_server` for each chunk it creates on a remote data node.
Currently, the creation of remote chunks is not implemented, so a
dummy chunk_id for the remote chunk will be added instead for testing
purposes.
In a multi-node (clustering) setup, TimescaleDB needs to track which
remote servers have data for a particular distributed hypertable. It
also needs to know which servers to place new chunks on and to use in
queries against a distributed hypertable.
A new metadata table, `hypertable_server`, is added to map a local
hypertable ID to a hypertable ID on a remote server. We require that
the remote hypertable has the same schema and name as the local
hypertable.
When a local server is removed (using `DROP SERVER` or our
`delete_server()`), all remote hypertable mappings for that server
should also be removed.
This adds an internal API function to create a chunk using explicit
constraints (dimension slices). A function to export a chunk in a
format consistent with the chunk creation function is also added.
The chunk export/create functions are needed for distributed
hypertables so that an access node can create chunks on data nodes
according to its own (global) partitioning configuration.
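As a rough sketch (the internal function names and the slices format
shown here are assumptions, not the confirmed API), the export/create
pair might be used like:
`SELECT * FROM _timescaledb_internal.show_chunk('_timescaledb_internal._dist_hyper_1_1_chunk');`
`SELECT * FROM _timescaledb_internal.create_chunk('disttable', '{"time": [1514764800000000, 1517356800000000]}');`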
The internal chunk API is updated to avoid returning `Chunk` objects
that are marked `dropped=true` along with some refactoring, hardening,
and cleanup of the internal chunk APIs. In particular, apart from
being returned in a dropped state, chunks could also be returned in a
partial state (without all fields set, partial constraints,
etc.). None of this is allowed as of this change. Further, lock
handling was unclear when joining chunk metadata from different
catalog tables. This is made clear by having chunks built within
nested scan loops so that proper locks are held when joining in
additional metadata (such as constraints).
This change also fixes issues with dropped chunks that caused chunk
metadata to be processed many times instead of just once, leading to
potential bugs or bad performance.
In particular, since the introduction of the `dropped` flag, chunk
metadata can exist in two states:
1. `dropped=false`
2. `dropped=true`
When dropping chunks (e.g., via `drop_chunks`, `DROP TABLE <chunk>`,
or `DROP TABLE <hypertable>`) there are also two modes of dropping:
1. DELETE row
2. UPDATE row and SET `dropped=true`
The deletion mode and the current state of the chunk lead to a
cross-product resulting in 4 cases when dropping/deleting a chunk:
1. DELETE row when dropped=false
2. DELETE row when dropped=true
3. UPDATE row when dropped=false
4. UPDATE row when dropped=true
Unfortunately, the code didn't distinguish between these cases. In
particular, case (4) should not be able to happen, but since it did,
it led to a recursion loop where an UPDATE created a new tuple that
was then recursed to in the same loop, and so on.
To fix this recursion loop and make the code for dropping chunks less
error prone, a number of assertions have been added, including some
new light-weight scan functions to access chunk information without
building a full-blown chunk.
This change also removes the need to provide the number of constraints
when scanning for chunks. This was really just a hint, and it is no
longer needed since all constraints are joined in regardless.
This change fixes various compiler warnings that show up on different
compilers and platforms. In particular, MSVC is sensitive to functions
that do not return a value after throwing an error since it doesn't
realize that the code path is not reachable.
When calling show_chunks or drop_chunks without specifying
a particular hypertable TimescaleDB iterates through all
existing hypertables and builds a list. While doing this
it adds the internal '_compressed_hypertable_*' tables
which leads to incorrect behaviour of the
ts_chunk_get_chunks_in_time_range function. This fix
filters out the internal compressed tables when scanning
in the ts_hypertable_get_all function.
Cache queries support multiple optional behaviors, such as "missing
ok" (do not fail on cache miss) and "no create" (do not create a new
entry if one doesn't exist in the cache). With multiple boolean
parameters, the query API has become unwieldy so this change turns
these booleans into one flag parameter.
This change includes a major refactoring to support PostgreSQL
12. Note that many tests aren't passing at this point. Changes
include, but are not limited to:
- Handle changes related to table access methods
- New way to expand hypertables since expansion has changed in
PostgreSQL 12 (more on this below).
- Handle changes related to table expansion for UPDATE/DELETE
- Fixes for various TimescaleDB optimizations that were affected by
planner changes in PostgreSQL (gapfill, first/last, etc.)
Before PostgreSQL 12, planning was organized roughly as follows:
1. construct `RelOptInfo`s for base rels and appendrels
2. add restrict info, joins, etc.
3. perform the actual planning with `make_one_rel`
For our optimizations we would expand hypertables in the middle of
step 1; since nothing in the query planner before `make_one_rel` cared
about the inheritance children, we didn't have to be too precise
about where we were doing it.
However, with PG12 and the optimizations around declarative
partitioning, PostgreSQL now does care about when the children are
expanded, since it wants as much information as possible to perform
partition pruning. Now planning is organized like:
1. construct `RelOptInfo`s for base rels only
2. add restrict info, joins, etc.
3. expand appendrels, removing irrelevant declarative partitions
4. perform the actual planning with `make_one_rel`
Step 3 always expands appendrels, so when we also expand them during
step 1, the hypertable gets expanded twice, and things in the planner
break.
The changes to support PostgreSQL 12 attempt to solve this problem by
keeping the hypertable root marked as a non-inheritance table until
`make_one_rel` is called, and only then revealing to PostgreSQL that
it does in fact have inheritance children. While this strategy entails
the least code change on our end, the fact that the first hook we can
use to re-enable inheritance is `set_rel_pathlist_hook` does entail
a number of annoyances:
1. this hook is called after the sizes of tables are calculated, so we
must recalculate the sizes of all hypertables, as they will not
have taken the chunk sizes into account
2. the table upon which the hook is called will have its paths planned
   under the assumption it has no inheritance children, so if it's a
   hypertable we have to replan its paths
Unfortunately, the PostgreSQL code for doing these things is static,
so we need to copy it into our own codebase, instead of just using
PostgreSQL's.
In PostgreSQL 12, UPDATE/DELETE on inheritance relations have also
changed and are now planned in two stages:
- In stage 1, the statement is planned as if it were a `SELECT` and all
leaf tables are discovered.
- In stage 2, the original query is planned directly against each leaf
table discovered in stage 1, not as part of an Append.
Unfortunately, this means we cannot look in the appendrelinfo during
UPDATE/DELETE planning, in particular to determine if a table is a
chunk, as the appendrelinfo is not initialized at the point where we
wish to do so. This has consequences for how we identify operations on
chunks (sometimes for blocking and sometimes for enabling
functionality).
Refactors multiple implementations of finding hypertables in cache
and failing with different error messages if not found. The
implementations are replaced with calls to functions that encapsulate
a single error message. This provides a unified error message and
removes the need for copy-paste.
If a chunk is dropped but it has a continuous aggregate that is
not dropped, we want to preserve the chunk catalog row instead of
deleting the row. This is to prevent dangling identifiers in the
materialization hypertable. It also preserves the dimension slice
and chunk constraint rows for the chunk, since those will be necessary
when enabling this with multinode and are needed to recreate the
chunk. The postgres objects associated with the chunk are all
dropped (table, constraints, indexes).
If data is ever reinserted to the same data region, the chunk is
recreated with the same dimension definitions as before. The postgres
objects are simply recreated.
Allow dropping raw chunks on the raw hypertable while keeping
the continuous aggregate. This allows for downsampling data
and allows users to save on TCO. We only allow dropping
such data when the dropped data is older than the
`ignore_invalidation_older_than` parameter on all the associated
continuous aggs. This ensures that any modifications to the
region of data which was dropped should never be reflected
in the continuous agg and thus avoids semantic ambiguity
if chunks are dropped but then again recreated due to an
insert.
Before we drop a chunk we need to make sure to process any
continuous aggregate invalidations that were registered on
data inside the chunk. Thus we add options to materialization
to perform materialization transactionally, to only process
invalidations, and to process invalidations only before a timestamp.
We fix drop_chunks and the policy to properly process
`cascade_to_materialization` as a tri-state variable (unknown,
true, false). Existing policy rows should change false to NULL
(unknown), while true stays as true since it was explicitly set.
Remove the form data for bgw_policy_drop_chunk because there
is no good way to represent the tri-state variable in the
form data.
When dropping chunks with cascade_to_materialization = false, all
invalidations on the chunks are processed before dropping the chunk.
If we are so far behind that even the completion threshold is inside
the chunks being dropped, we error. There are 2 reasons that we error:
1) We can't safely process new ranges transactionally without taking
heavy-weight locks and potentially locking the entire system
2) If a completion threshold is that far behind, the system probably
has some serious issues anyway.
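For example (hypertable name hypothetical; the exact `drop_chunks`
signature differs between versions), dropping raw chunks while keeping
the continuous aggregate and processing invalidations first:
`SELECT drop_chunks(interval '3 months', 'conditions', cascade_to_materialization => false);`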
We added a timescaledb.ignore_invalidation_older_than parameter for
continuous aggregates. This parameter accepts a time interval (e.g. 1
month). If set, it limits the amount of time for which to process
invalidations. Thus, if
timescaledb.ignore_invalidation_older_than = '1 month'
then any modifications for data older than 1 month from the current
timestamp at insert time will not cause updates to the continuous
aggregate. This limits the amount of work that a backfill can trigger.
This parameter must be >= 0. A value of 0 means that invalidations are
never processed.
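As a sketch (the continuous aggregate view name is hypothetical), the
parameter can be set on an existing continuous aggregate:
`ALTER VIEW conditions_summary SET (timescaledb.ignore_invalidation_older_than = '1 month');`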
When recording invalidations for the hypertable at insert time, we use
the maximum ignore_invalidation_older_than of any continuous agg attached
to the hypertable as a cutoff for whether to record the invalidation
at all. When materializing a particular continuous agg, we use that
agg's ignore_invalidation_older_than cutoff. However, we have to apply
that cutoff relative to the insert time, not the materialization
time, to make it easier for users to reason about. Therefore,
we record the insert time as part of the invalidation entry.