Since pg_prepared_xacts is shared between databases, the healing
function tried to resolve prepared transactions created by other
distributed databases.
This change makes the healing function work only with the current
database.
Fix #3433
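A minimal sketch of the filtering described above, written in Python rather than the extension's own code. It relies only on the fact that each row of `pg_prepared_xacts` carries a `gid` and a `database` column. The `PreparedXact` type, the function name, and the sample gids and database names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreparedXact:
    """Hypothetical stand-in for one row of pg_prepared_xacts."""
    gid: str       # global transaction identifier
    database: str  # database in which the transaction was prepared


def xacts_for_current_database(rows, current_database):
    """Keep only the prepared transactions that belong to current_database."""
    return [row for row in rows if row.database == current_database]


# Two distributed databases sharing one PostgreSQL instance (made-up data).
rows = [
    PreparedXact(gid="ts-1-100-1-10", database="dist_a"),
    PreparedXact(gid="ts-1-200-1-10", database="dist_b"),
]

# Healing run from dist_a must not touch the transaction owned by dist_b.
to_heal = xacts_for_current_database(rows, "dist_a")
```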
Add an optional password parameter to `add_data_node` so that users
that don't have a password in a `passfile` on the access node can add
data nodes using password authentication. Together with user mappings,
this allows full multinode configuration without relying on passwords
or certificates provided in external/on-disk files.
While passwords can be provided in the database via a user mapping
object, such a mapping is created on a per-server basis and requires
the foreign server to exist prior to creating the mapping. When adding
a data node, however, bootstrapping and/or validation of the data node
happens at the same time as the server object is created, which means
no user mapping can be created prior to adding the data
node. Therefore, the password must be provided as an argument to
`add_data_node` instead of via a user mapping.
Fortunately, using a function parameter might be preferred to a user
mapping since the (plaintext) password won't be stored in the
database. A user mapping for the user that created the data node can
optionally be added after the data node has been added. But it might
be desirable to only create user mappings for unprivileged users that
will mostly interact only with specific distributed hypertables.
When the access node executes `add_data_node`, bootstrapping the data
node is done by:
1. Optionally creating the database on the remote server.
2. Creating a schema for the TimescaleDB extension objects.
3. Creating the TimescaleDB extension in the database.
After bootstrapping, the `dist_uuid` of the data node and access node
is set to the `uuid` of the access node.
If `bootstrap` is `true`, bootstrapping of the data node is done.
If `bootstrap` is `false`, bootstrapping is not done, but the procedure
attempts to connect to the database and verify that the TimescaleDB
extension is loaded and that the `dist_uuid` is clear. If it is not
possible to connect to the database, or if `dist_uuid` is set,
`add_data_node` will fail.
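The two code paths above can be modelled roughly as follows. This is an illustrative Python sketch, not the actual implementation: `FakeDataNode` and `simulate_add_data_node` are hypothetical names, the remote database is replaced by an in-memory object, and the sketch assumes that `dist_uuid` is set after a successful validation as well as after bootstrapping.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeDataNode:
    """Hypothetical in-memory stand-in for a remote database."""
    database_reachable: bool = False   # can we connect to the target database?
    has_schema: bool = False
    has_extension: bool = False
    dist_uuid: Optional[uuid.UUID] = None


def simulate_add_data_node(node, access_node_uuid, bootstrap=True):
    if bootstrap:
        node.database_reachable = True  # 1. create the database if missing
        node.has_schema = True          # 2. create the extension schema
        node.has_extension = True       # 3. create the extension
    else:
        # No bootstrapping: only validate the existing database.
        if not node.database_reachable:
            raise ConnectionError("cannot connect to the database")
        if not node.has_extension:
            raise RuntimeError("TimescaleDB extension is not loaded")
        if node.dist_uuid is not None:
            raise RuntimeError("dist_uuid is already set")
    # Assumption: both paths end by marking the node as part of the
    # access node's distributed database.
    node.dist_uuid = access_node_uuid


access_uuid = uuid.uuid4()
fresh = FakeDataNode()
simulate_add_data_node(fresh, access_uuid, bootstrap=True)
```

Calling `simulate_add_data_node(fresh, access_uuid, bootstrap=False)` a second time raises, because `dist_uuid` is no longer clear.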
This change replaces UserMappings with the newly introduced
TSConnectionId object, which represents a pair of foreign server id and
local user id. Authentication has been moved to non-password based,
since the original UserMappings were also used to store data node user
passwords. This is a temporary step until certificate-based
authentication is introduced.
List of changes:
* add_data_node() password and bootstrap_password arguments removed
* introduced authentication using pgpass file
* RemoteTxn format string which represents tx changed to
tx-version-xid-server_id-user_id
* data_node_dispatch, remote transaction cache, connection cache hash
tables keys switched to TSConnectionId instead of user mappings
* remote_connection_open() has been reworked to exclude user options
* Tests updated; usage of user mappings and passwords has been removed
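To illustrate the new identifier and key layout, here is a small Python sketch. It is not the C implementation: it assumes the leading `tx` is a literal prefix, that all fields are non-negative decimal integers separated by `-`, and the function names are hypothetical.

```python
from typing import NamedTuple


class TSConnectionId(NamedTuple):
    """Pair of foreign server id and local user id (usable as a hash key)."""
    server_id: int
    user_id: int


PREFIX = "tx"  # assumption: literal prefix taken from the format above


def format_remote_txn_id(version, xid, conn_id):
    """Render tx-version-xid-server_id-user_id."""
    return f"{PREFIX}-{version}-{xid}-{conn_id.server_id}-{conn_id.user_id}"


def parse_remote_txn_id(text):
    """Inverse of format_remote_txn_id; raises ValueError on malformed input."""
    parts = text.split("-")
    if len(parts) != 5 or parts[0] != PREFIX:
        raise ValueError(f"not a remote transaction id: {text!r}")
    version, xid, server_id, user_id = (int(p) for p in parts[1:])
    return version, xid, TSConnectionId(server_id, user_id)


# A cache keyed by TSConnectionId instead of a user mapping (made-up ids).
conn_id = TSConnectionId(server_id=16400, user_id=10)
connection_cache = {conn_id: "connection handle placeholder"}
txn_id = format_remote_txn_id(1, 1234, conn_id)
```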
This change refactors and hardens parts of data node management
functionality.
* A number of permissions checks have been added to data node
management functions. This includes checking that the user has
proper permissions for both table and server objects.
* Permissions checks are now done when creating remote chunks on data
nodes.
* The add_data_node() API function has been simplified and now returns
more intuitive status about created objects (foreign server,
database, extension). It is no longer necessary to specify a user to
connect with as this is always assumed to be the current user. The
bootstrap user can still be specified explicitly, however, as that
user might require elevated permissions on the remote node to
bootstrap.
* Functions that capture exceptions without re-throwing, such as
`ping_data_node()` and `get_user_mapping()`, have been refactored to
not do this as the transaction state and memory contexts are not in
states where it is safe to proceed as normal.
* Data node management functions now consistently check that any
foreign servers operated on are actually TimescaleDB server objects.
* Tests now run with a superuser and a regular user specific to
clustering. These users have password auth enabled in `pg_hba.conf`,
which is required by the connection library when connecting as a
non-superuser. Tests have been refactored to bootstrap data nodes
using these user roles.
The timescale clustering code so far has been written referring to the
remote databases as 'servers'. This terminology is a bit overloaded,
and in particular we don't enforce any network topology limitations
that the term 'server' would suggest. In light of this we've decided
to change to use the term 'node' when referring to the different
databases in a distributed database. Specifically we refer to the
frontend as an 'access node' and to the backends as 'data nodes',
though we may omit the access or data qualifier where it's unambiguous.
As the vast bulk of the code so far has been written for the case where
there was a single access node, almost all instances of 'server' were
references to data nodes. This change has updated the code to rename
those instances.
This change adds a distributed database id to the installation data for a
database. It also provides a number of utilities that can be used for
getting/setting/clearing this value or using it to determine if a database is
a frontend, a backend, or not a member of a distributed database.
This change also includes modifications to the add_server and delete_server
functions to check the distributed id to ensure the operation is allowed, and
then update or clear it appropriately. After this change it will no longer
be possible to add a database as a backend to multiple frontend databases, nor
will it be possible to add a frontend database as a backend to any other
database.
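The membership check can be pictured with the following Python sketch. It assumes that a frontend is recognised by its distributed id being equal to its own database uuid, and a backend by a distributed id that differs from its own uuid; that rule, and all names below, are illustrative assumptions rather than the extension's actual utilities.

```python
import uuid
from enum import Enum
from typing import Optional


class Membership(Enum):
    NOT_A_MEMBER = "not a member"
    FRONTEND = "frontend"
    BACKEND = "backend"


def membership(own_uuid: uuid.UUID, dist_id: Optional[uuid.UUID]) -> Membership:
    """Classify a database from its own uuid and its distributed id."""
    if dist_id is None:
        return Membership.NOT_A_MEMBER
    if dist_id == own_uuid:
        return Membership.FRONTEND  # assumption: frontend stores its own uuid
    return Membership.BACKEND


def can_be_added_as_backend(own_uuid, dist_id) -> bool:
    """Only a database that is not yet a member may become a backend."""
    return membership(own_uuid, dist_id) is Membership.NOT_A_MEMBER


frontend_uuid = uuid.uuid4()
backend_uuid = uuid.uuid4()
```

With these rules, an existing backend or frontend is rejected by `can_be_added_as_backend`, which matches the two restrictions described above.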
This patch adds functionality for automatic database and extension
creation on a remote server. New function arguments: bootstrap_database,
bootstrap_user and bootstrap_password.
To make the heal function safe for non-ts prepared txns we
introduce a prefix "ts" to our prepared txns. This allows
us to separate cases where there is a ts vs non-ts prepared txn
and have heal ignore non-ts txns.
An alternative would be to consider all txns that don't parse
correctly as non-ts transactions. But, that is less robust to
bugs in our parsing/printing code.
One downside to the current approach is that all prepared txns
with a "ts" prefix are considered reserved for ts. That
should be acceptable.
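The prefix rule is simple enough to show in a few lines of Python. This is only a sketch of the idea: the function names and sample gids are made up, and the check is a plain string-prefix test as described above.

```python
TS_PREFIX = "ts"


def is_ts_prepared_txn(gid: str) -> bool:
    """A prepared txn is treated as ours if its gid starts with the prefix."""
    return gid.startswith(TS_PREFIX)


def txns_to_heal(gids):
    """Heal considers only ts-prefixed txns and ignores everything else."""
    return [gid for gid in gids if is_ts_prepared_txn(gid)]


# Made-up gids: one of ours, one from an unrelated application, and one
# unrelated gid that happens to start with "ts".
gids = ["ts-1-10-1", "my_app_txn_1", "tsunami_batch"]
```

Note that `tsunami_batch` is also selected, which is the downside mentioned above: any gid starting with "ts" is treated as reserved.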
This commit adds the ability to resolve whether or not 2PC
transactions have been committed or aborted and also adds a heal
function to resolve transactions that have been prepared but not
committed or rolled back.
This commit also removes the server id from the primary key of the
remote_txn table and adds another index. This was done because
`remote_txn_persistent_record_exists` should not rely on the server
being contacted but should rather just check for the existence of the
id. This makes the resolution safe for setups where two frontend server
definitions point to the same database. While this may not be a
properly configured setup, it's better if the resolution process is
robust to this case.
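A rough Python sketch of id-only resolution follows. It assumes that a prepared transaction is committed when a persistent record with its id exists and rolled back otherwise; transactions that are still in progress are left out for brevity. The names and sample ids are hypothetical, and the set stands in for the remote_txn table.

```python
def persistent_record_exists(records, txn_id):
    """Look the transaction up by id only, independent of the server."""
    return txn_id in records


def resolve(records, txn_id):
    """Decide how to finish a prepared transaction (assumed rule)."""
    if persistent_record_exists(records, txn_id):
        return "COMMIT PREPARED"
    return "ROLLBACK PREPARED"


# Made-up ids recorded by the frontend when its local transaction committed.
remote_txn_records = {"ts-1-100-1"}

# Two server definitions pointing to the same database see the same
# prepared transaction id and therefore reach the same decision.
decision_via_server_a = resolve(remote_txn_records, "ts-1-100-1")
decision_via_server_b = resolve(remote_txn_records, "ts-1-100-1")
```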