This change replaces UserMappings with the newly introduced TSConnectionId
object, which represents a pair of a foreign server id and a local user id.
Authentication has been moved to a non-password-based setup, since the
original UserMappings were also used to store data node user passwords.
This is a temporary step until the introduction of certificate-based
authentication.
List of changes:
* the password and bootstrap_password arguments of add_data_node() have
  been removed
* introduced authentication using a pgpass file
* the RemoteTxn format string that represents a transaction has changed to
  tx-version-xid-server_id-user_id
* the data_node_dispatch, remote transaction cache, and connection cache
  hash table keys switched to TSConnectionId instead of user mappings
  (see the sketch after this list)
* remote_connection_open() has been reworked to exclude user options
* tests have been updated to remove the use of user mappings and passwords
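The sketch below illustrates the connection key idea, assuming PG11's hash
helpers; the field names, hashing, and exact layout are illustrative
assumptions, not the actual source:

```c
#include "postgres.h"
#include "utils/hashutils.h"

/*
 * Illustrative sketch of the connection key described above: a pair of
 * the foreign server OID and the local user OID. Field names are
 * assumptions.
 */
typedef struct TSConnectionId
{
	Oid server_id; /* foreign server (data node) OID */
	Oid user_id;   /* local user (role) OID */
} TSConnectionId;

/* Hash the pair for use as a hash table key instead of a UserMapping */
static uint32
connection_id_hash(TSConnectionId id)
{
	return hash_combine(murmurhash32(id.server_id),
						murmurhash32(id.user_id));
}
```

Keying the caches on the (server, user) pair rather than on a UserMapping
removes the dependency on passwords stored in the catalog.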
This change refactors and hardens parts of data node management
functionality.
* A number of permissions checks have been added to data node
  management functions. This includes checking that the user has
  proper permissions for both table and server objects (see the sketch
  after this list).
* Permissions checks are now done when creating remote chunks on data
nodes.
* The add_data_node() API function has been simplified and now returns
more intuitive status about created objects (foreign server,
database, extension). It is no longer necessary to specify a user to
connect with as this is always assumed to be the current user. The
bootstrap user can still be specified explicitly, however, as that
user might require elevated permissions on the remote node to
bootstrap.
* Functions that capture exceptions without re-throwing, such as
  `ping_data_node()` and `get_user_mapping()`, have been refactored to
  no longer do so, since the transaction state and memory contexts are
  not in a state where it is safe to proceed as normal.
* Data node management functions now consistently check that any
foreign servers operated on are actually TimescaleDB server objects.
* Tests now run with both a superuser and a regular user specific to
  clustering. These users have password auth enabled in `pg_hba.conf`,
  which is required by the connection library when connecting as a
  non-superuser. Tests have been refactored to bootstrap data nodes
  using these user roles.
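The sketch below shows the kind of server-object check referred to above,
using standard PostgreSQL ACL and foreign-server helpers; the function
name, error wording, and FDW name are assumptions, not the actual
implementation:

```c
#include "postgres.h"
#include "foreign/foreign.h"
#include "miscadmin.h"
#include "utils/acl.h"

/*
 * Sketch: verify that the current user may use the foreign server and
 * that the server belongs to the TimescaleDB foreign data wrapper. The
 * FDW name below is an assumption for illustration.
 */
static ForeignServer *
data_node_get_server_checked(const char *node_name)
{
	ForeignServer *server = GetForeignServerByName(node_name, false);
	ForeignDataWrapper *fdw =
		GetForeignDataWrapperByName("timescaledb_fdw", false);
	AclResult aclresult;

	if (server->fdwid != fdw->fdwid)
		ereport(ERROR,
				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
				 errmsg("server \"%s\" is not a TimescaleDB data node",
						node_name)));

	aclresult = pg_foreign_server_aclcheck(server->serverid, GetUserId(),
										   ACL_USAGE);
	if (aclresult != ACLCHECK_OK)
		aclcheck_error(aclresult, OBJECT_FOREIGN_SERVER, node_name);

	return server;
}
```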
The timescale clustering code so far has been written referring to the
remote databases as 'servers'. This terminology is a bit overloaded,
and in particular we don't enforce any network topology limitations
that the term 'server' would suggest. In light of this, we have decided
to use the term 'node' when referring to the different
databases in a distributed database. Specifically we refer to the
frontend as an 'access node' and to the backends as 'data nodes',
though we may omit the access or data qualifier where it's unambiguous.
As the vast bulk of the code so far has been written for the case where
there was a single access node, almost all instances of 'server' were
references to data nodes. This change has updated the code to rename
those instances.
This change includes only the file renames required by the renaming of
server to data node across the clustering codebase. It
is being committed separately from the bulk of the rename changes to
prevent git from losing the file history of renamed files (merging the
rename with extensive code modifications resulted in git treating some
of the file moves as a file delete and new file creation).
Since distributed hypertables will only be supported on PG11 or greater,
ensure that we do not compile multinode-related files on previous
versions. Also raise appropriate errors when trying to invoke
multinode-related functionality on versions prior to PG11.
The idea here is to allow multiple async requests to be created for the
same connection. Since a connection can process only one request at a
time, only one request can be running while the rest must be deferred. A
deferred async request is started when a response is retrieved, provided
the connection is not already busy with a running async request.
This support should pave the way for async creation of cursors.
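A minimal sketch of this queueing discipline follows, using hypothetical
type and function names; the actual internal API and the libpq
send/receive handling are omitted:

```c
#include "postgres.h"
#include "nodes/pg_list.h"

/* Hypothetical types for illustration only */
typedef struct AsyncRequest AsyncRequest;

typedef struct TSConnection
{
	AsyncRequest *current;	/* request currently running on the connection */
	List	   *deferred;	/* requests waiting for the connection */
} TSConnection;

/* Submit a request: run it now if the connection is idle, else defer it */
static void
async_request_submit(TSConnection *conn, AsyncRequest *req)
{
	if (conn->current == NULL)
		conn->current = req;	/* would be sent immediately */
	else
		conn->deferred = lappend(conn->deferred, req);
}

/* After a response is consumed, start the next deferred request, if any */
static void
async_request_completed(TSConnection *conn)
{
	conn->current = NULL;

	if (conn->deferred != NIL)
	{
		conn->current = linitial(conn->deferred);
		conn->deferred = list_delete_first(conn->deferred);
	}
}
```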
Multinode-related APIs now raise errors when called on any PostgreSQL
version below 11, as these versions do not have the required features
to support multinode or have different behavior.
Raising errors at runtime on affected APIs is preferred over excluding
these functions altogether. Having a different user-facing SQL API
would severely complicate the upgrade process for the extension.
A new CMake check has been added to disable multinode features on
unsupported PostgreSQL versions. It also generates a macro in
`config.h` that can be used in code to check for multinode support.
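As a rough illustration, the runtime guard could look like the sketch
below; the macro name assumed to come from the generated `config.h` and
the error wording are illustrative, not the actual ones:

```c
#include "postgres.h"
/* "config.h" is generated by CMake; the macro name below is illustrative */

static void
error_if_multinode_unsupported(void)
{
#if !defined(TIMESCALEDB_HAS_MULTINODE)
	ereport(ERROR,
			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
			 errmsg("multi-node functionality is not supported on "
					"PostgreSQL versions older than 11")));
#endif
}
```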
This change ensures that chunk replicas are created on remote
(data node) servers whenever a chunk is created in a local distributed
hypertable.
Remote chunks are created using the `create_chunk()` function, which
has been slightly refactored to allow specifying an explicit chunk
table name. The access node making the remote call also records the resulting
remote chunk IDs in its `chunk_server` mappings table.
Since remote command invocation without superuser permissions requires
password authentication, the test configuration files have been updated
to require password authentication for the cluster test user used in the
tests.
A frontend node will now maintain mappings from a local chunk to the
corresponding remote chunks in a `chunk_server` table.
The frontend creates local chunks as foreign tables and adds entries
to `chunk_server` for each chunk it creates on a remote data node.
Currently, the creation of remote chunks is not implemented, so a
dummy chunk_id for the remote chunk will be added instead for testing
purposes.
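Conceptually, each mapping entry pairs a local chunk with its remote
counterpart, roughly as in the sketch below; field names and types are
assumptions, not the actual catalog definition:

```c
#include "postgres.h"

/*
 * Illustrative shape of a `chunk_server` mapping entry: the local chunk
 * on the access node paired with the chunk id it maps to on a given data
 * node.
 */
typedef struct ChunkServerEntry
{
	int32	 chunk_id;		  /* local chunk id on the access node */
	int32	 server_chunk_id; /* chunk id on the remote data node */
	NameData server_name;	  /* foreign server holding the remote chunk */
} ChunkServerEntry;
```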
This adds an internal API function to create a chunk using explicit
constraints (dimension slices). A function to export a chunk in a
format consistent with the chunk creation function is also added.
The chunk export/create functions are needed for distributed
hypertables so that an access node can create chunks on data nodes
according to its own (global) partitioning configuration.