Errors and messages are overhauled to conform to the official
PostgreSQL style guide. In particular, the following points from the
guide have been given special attention:
* Correct capitalization of the first letter: capitalize only for hint
and detail messages.
* Correct handling of periods at the end of messages (should be omitted
for primary messages, but not for detail and hint messages).
* The primary message should be short, factual, and avoid reference to
implementation details such as specific function names.
Some messages have also been reworded for clarity and to better
conform with the last bullet above (short primary message). In other
cases, messages have been updated to fix references to, e.g., function
parameters that used the wrong parameter name.
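As an illustration only, the capitalization and period rules above can be approximated with a small Python checker. The function name and the `kind` labels are made up for this sketch and are not part of the code base; the capitalization test is a heuristic, since a primary message may legitimately start with an upper-case identifier or keyword.

```python
from typing import List


def check_message_style(kind: str, text: str) -> List[str]:
    """Return a list of style problems for one message.

    `kind` is "primary", "detail", or "hint" (labels chosen for this sketch).
    """
    if not text:
        return ["message is empty"]
    problems = []
    first = text[0]
    if kind == "primary":
        # Primary messages: lower-case first letter, no trailing period.
        if first.isalpha() and first.isupper():
            problems.append("primary message should not be capitalized")
        if text.endswith("."):
            problems.append("primary message should not end with a period")
    elif kind in ("detail", "hint"):
        # Detail and hint messages: full sentences, capitalized, with a period.
        if first.isalpha() and first.islower():
            problems.append(kind + " message should be capitalized")
        if not text.endswith("."):
            problems.append(kind + " message should end with a period")
    else:
        raise ValueError("unknown message kind: " + kind)
    return problems
```

A message that follows the rules yields an empty list, so the function could be used in a test that walks over all message strings.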
Closes #2364
This change will call a function on a remote database to validate
its configuration before following through with an add_data_node
call. Right now the check will ensure that the data node is able to use
prepared transactions, but further checks can be easily added in
the future.
Since this uses the timescaledb extension to validate the remote
database, it runs at the end of bootstrapping. We may want to
consider adding code to undo our bootstrapping changes if this check
fails.
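A minimal sketch of such a check, written in Python for illustration only. It assumes the check boils down to the PostgreSQL setting `max_prepared_transactions` being greater than zero (a value of zero disables prepared transactions). The function name and the settings dictionary are hypothetical; the real validation runs as a function on the remote database.

```python
def validate_data_node_settings(settings):
    """Return a list of problems found in a data node's configuration.

    `settings` is a hypothetical dict of setting name -> value, e.g. as
    read from pg_settings on the remote database.
    """
    problems = []
    # PostgreSQL disables PREPARE TRANSACTION when this setting is 0.
    if int(settings.get("max_prepared_transactions", 0)) <= 0:
        problems.append("prepared transactions are disabled")
    # Further checks could be appended here in the same way.
    return problems
```

Returning a list rather than failing on the first problem leaves room for the additional checks mentioned above.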
Distributed hypertables are now repartitioned when attaching new data
nodes if the current number of partitions (slices) in the first closed
(space) dimension is less than the number of data nodes. Increasing
the number of partitions is necessary to make use of a newly attached
data node. However, repartitioning is optional and can be avoided via
a boolean parameter in `attach_server()`.
In addition to the above repartitioning, this change also adds
informational messages to `create_hypertable` and
`set_number_partitions` to raise awareness of situations when the
number of partitions in the space dimensions is lower than the number
of attached data nodes.
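The decision described above can be sketched in Python as follows. This is a model under assumptions, not the actual implementation: it assumes that repartitioning raises the partition count to exactly the number of data nodes, and the function names and the message text are hypothetical.

```python
def partitions_after_attach(num_partitions, num_data_nodes, repartition=True):
    """Number of space partitions after attaching a data node.

    Assumption for this sketch: when repartitioning happens, the
    partition count is raised to the number of data nodes.
    """
    if repartition and num_partitions < num_data_nodes:
        return num_data_nodes
    return num_partitions


def partitioning_notice(num_partitions, num_data_nodes):
    """Informational message (or None) for under-partitioned hypertables."""
    if num_partitions < num_data_nodes:
        # Hypothetical wording, not the message emitted by the extension.
        return "insufficient number of partitions for the number of data nodes"
    return None
```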
This change fixes the following:
* Refactor the code for setting the default data node for a chunk. The
`set_chunk_default_data_node()` API function now takes a
`regclass`/`oid` instead of separate schema + table names and
returns `true` when a new data node is set and `false` if called
with a data node that is already the default. Like before,
exceptions are thrown on errors. It also does proper permissions
checks. The related code has been moved from `data_node.c` to
`chunk.c` since this is an operation on a chunk, and the code now
also lives in the `tsl` directory since this is non-trivial logic
that should fall under the TSL license.
* When setting the default data node on a chunk (failing over to
another data node), it is now verified that the new data node
actually has a replica of the chunk and that the corresponding
foreign server belongs to the "right" foreign data wrapper.
* Error messages and permissions handling have been tweaked.
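The return-value contract and the replica check can be modeled with a toy Python sketch. The `Chunk` class and the function name are hypothetical stand-ins for the C code, and the permissions and foreign data wrapper checks are left out.

```python
class Chunk:
    """Toy model of a chunk: which data nodes hold a replica, and the default."""

    def __init__(self, replica_nodes, default_node):
        self.replica_nodes = set(replica_nodes)
        self.default_node = default_node


def set_default_data_node(chunk, node_name):
    """Return True if the default changed, False if it was already set.

    Raises ValueError if the node has no replica of the chunk. Permission
    and foreign data wrapper checks are omitted from this sketch.
    """
    if node_name not in chunk.replica_nodes:
        raise ValueError("data node has no replica of the chunk")
    if chunk.default_node == node_name:
        return False
    chunk.default_node = node_name
    return True
```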
The timescale clustering code so far has been written referring to the
remote databases as 'servers'. This terminology is a bit overloaded,
and in particular we don't enforce any network topology limitations
that the term 'server' would suggest. In light of this we've decided
to change to use the term 'node' when referring to the different
databases in a distributed database. Specifically we refer to the
frontend as an 'access node' and to the backends as 'data nodes',
though we may omit the access or data qualifier where it's unambiguous.
As the vast bulk of the code so far has been written for the case where
there was a single access node, almost all instances of 'server' were
references to data nodes. This change has updated the code to rename
those instances.
Prevent server delete if the server contains data, unless the user
specifies `force => true`. In case the server is the only data
replica, we don't allow delete/detach unless the table/chunks are dropped.
The idea is to have the same semantics for delete as for detach since
delete actually calls detach.
We also try to update pg_foreign_table when we delete a server if there
is another server containing the same chunk.
An internal function is added to enable updating the foreign table server,
which might be useful in some cases since the foreign table server is
considered the default server for that particular chunk.
Since this command needs to work even if the server we're trying to
remove is unresponsive, we're not removing any data on the remote
data node.
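One reading of the delete rules above, sketched in Python. The `chunk_replicas` structure and the function name are hypothetical, and the sketch assumes that `force` never overrides the only-replica restriction.

```python
def can_delete_server(chunk_replicas, server, force=False):
    """Decide whether `server` may be deleted.

    `chunk_replicas` maps a chunk name to the set of servers that hold a
    replica of it (a hypothetical structure for this sketch).
    """
    chunks_on_server = [c for c, s in chunk_replicas.items() if server in s]
    if not chunks_on_server:
        return True  # no data on the server
    if any(chunk_replicas[c] == {server} for c in chunks_on_server):
        return False  # only replica: not allowed, even with force
    return force  # data exists elsewhere: allowed only when forced
```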
A server can now be detached from one or more distributed hypertables
so that it is no longer in use. We only allow detaching a server if there
is no data on the server and detaching it doesn't risk making a
hypertable under-replicated.
A user can detach a server for a specific hypertable, or for all
hypertables to which the server is attached.
`SELECT * FROM detach_server('server1', 'my_hypertable');`
`SELECT * FROM detach_server('server2');`
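The detach precondition can be summarized in a small Python sketch. It assumes that "under-replicated" means fewer attached servers than the hypertable's replication factor; the function and parameter names are hypothetical.

```python
def can_detach_server(server_has_data, num_attached_servers, replication_factor):
    """Return True if detaching one server from a hypertable is allowed."""
    if server_has_data:
        return False  # servers holding data cannot be detached
    # After detaching, enough servers must remain to hold all replicas.
    return num_attached_servers - 1 >= replication_factor
```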
In a multi-node (clustering) setup, TimescaleDB needs to track which
remote servers have data for a particular distributed hypertable. It
also needs to know which servers to place new chunks on and to use in
queries against a distributed hypertable.
A new metadata table, `hypertable_server`, is added to map a local
hypertable ID to a hypertable ID on a remote server. We require that
the remote hypertable has the same schema and name as the local
hypertable.
When a local server is removed (using `DROP SERVER` or our
`delete_server()`), all remote hypertable mappings for that server
should also be removed.
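As a rough model of the cleanup, the mapping table can be pictured as a list of rows that is filtered when a server goes away. The key names below are assumptions for this sketch and may not match the actual columns of `hypertable_server`.

```python
def remove_server_mappings(mappings, server_name):
    """Return the mappings that remain after `server_name` is removed.

    Each mapping is a dict with the hypothetical keys "hypertable_id",
    "server_name", and "server_hypertable_id".
    """
    return [m for m in mappings if m["server_name"] != server_name]
```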
This adds an internal API function to create a chunk using explicit
constraints (dimension slices). A function to export a chunk in a
format consistent with the chunk creation function is also added.
The chunk export/create functions are needed for distributed
hypertables so that an access node can create chunks on data nodes
according to its own (global) partitioning configuration.
The functions for adding and updating dimensions have been refactored
in C to:
- improve usage of proper error codes
- produce messages that better conform to the PostgreSQL standard
- improve security by avoiding running lots of code under SECURITY DEFINER
A new if_not_exists option has also been added to add_dimension() and
the number of partitions can now be set using the new
set_number_partitions() function.
A bug in the validation of smallint time intervals has been fixed. The
previous code didn't check that intervals are greater than 0, and smallint
intervals accepted values up to UINT16_MAX instead of INT16_MAX.
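The corrected bounds can be illustrated with a short Python sketch. The real check lives in the C code; the function name here is hypothetical.

```python
INT16_MAX = 2**15 - 1   # 32767, the largest value a smallint can hold
UINT16_MAX = 2**16 - 1  # 65535, the upper bound the old code used by mistake


def is_valid_smallint_interval(interval):
    """Return True if `interval` is a usable interval for a smallint column."""
    return 0 < interval <= INT16_MAX
```

Under the old behavior described above, both `0` and `UINT16_MAX` would have been accepted.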
Tablespaces can now be detached from hypertables using
`tablespace_detach()`. This function can either detach
a tablespace from all tables or only a specific table.
Having the ability to detach tablespaces allows more
advanced storage management; for instance, one can detach
tablespaces that are running low on disk space while attaching
new ones to replace the old ones.
Attaching tablespaces to hypertables is now handled
in native code, with improved permissions checking and
caching of tablespaces in the Hypertable data object.
This change fixes two things that were overlooked in a prior
refactoring of chunk index handling.
First, column attribute numbers of a hypertable might not match a
chunk if, e.g., a column on the hypertable has been removed. In such
circumstances, indexes created on chunks based on a corresponding
hypertable index need to account for differences in column attribute
numbers. This change ensures that column attributes are always
translated to match the chunk an index is created on.
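The translation can be sketched in Python by matching columns by name. This is a simplified model with hypothetical names, not the C implementation. It relies on the fact that a dropped column keeps its attribute number in the parent table, while a chunk created afterwards is numbered without the gap.

```python
def translate_index_attnos(index_attnos, hypertable_cols, chunk_cols):
    """Map index column attribute numbers from a hypertable to a chunk.

    `hypertable_cols` and `chunk_cols` map attribute number -> column
    name for the live (non-dropped) columns of each relation.
    """
    chunk_attno_by_name = {name: attno for attno, name in chunk_cols.items()}
    return [chunk_attno_by_name[hypertable_cols[attno]] for attno in index_attnos]


# The hypertable dropped its second column, leaving a gap at attno 2.
hypertable_cols = {1: "time", 3: "device", 4: "temp"}
# A chunk created after the drop has no such gap.
chunk_cols = {1: "time", 2: "device", 3: "temp"}
# An index on (device, time) uses different attribute numbers on the chunk.
chunk_index_attnos = translate_index_attnos([3, 1], hypertable_cols, chunk_cols)
```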
Second, ShareLock was acquired by mistake on each hypertable index
when recursing these indexes to chunks, potentially causing
deadlocks. ShareLock should only be taken on the heap relation that an
index is created on. This is now fixed. Further, locking during index
creation has been cleaned up so that it is easier to get an overview of
the locks taken on various relations.
This PR adds more regression tests for index creation and tests for more
user errors. Significantly, it checks for the presence of both the time
and space-partition columns in unique indexes. This is needed because
Timescale cannot guarantee uniqueness if colliding rows don't land in the
same chunk. Fixes #29.
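The check on unique indexes amounts to a set-containment test, sketched here in Python with hypothetical names. A unique index is acceptable only if the returned list is empty.

```python
def missing_partitioning_columns(index_columns, partitioning_columns):
    """Return the partitioning columns that a unique index does not cover."""
    return [c for c in partitioning_columns if c not in index_columns]
```

For example, a unique index on `device` alone would be rejected for a hypertable partitioned on `time` and `device`, because two rows with the same `device` but different `time` values can land in different chunks.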