90 Commits

Author SHA1 Message Date
Erik Nordström
5d12a3883d Make connection establishment interruptible
Refactor the data node connection establishment so that it is
interruptible, e.g., by ctrl-c or `statement_timeout`.

Previously, the connection establishment used blocking libpq calls. By
instead using asynchronous connection APIs and integrating with
PostgreSQL interrupt handling, the connection establishment can be
canceled by an interrupt caused by a statement timeout or a user.
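
For illustration (node and host names are placeholders), a hanging
connection attempt can now be cut short by a statement timeout:

```sql
-- Abort the data node connection attempt if it takes longer than 10s.
SET statement_timeout = '10s';
SELECT add_data_node('dn1', host => 'dn1.example.com');
```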

Fixes #2757
2023-01-30 17:48:59 +01:00
Erik Nordström
1676259840 Fix repartition behavior when attaching data node
When attaching a data node and specifying `repartition=>false`, the
current number of partitions should remain unchanged instead of
recalculating the partitioning based on the number of data nodes.
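
For example (node and hypertable names are placeholders), attaching
without repartitioning now keeps the existing partition count:

```sql
SELECT attach_data_node('dn3', 'conditions', repartition => false);
```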

Fixes #5157
2023-01-18 16:20:49 +01:00
Dmitry Simonenko
5c897ff75d Fix default data node availability status
The alter_data_node() function returns an uninitialized value for the
"available" option when it is not present in the option list.
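
A minimal illustration of the fix (node and host names are
placeholders); the returned "available" column now reflects the stored
value even when the option is not passed:

```sql
SELECT node_name, available
FROM alter_data_node('dn1', host => 'dn1.example.com');
```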

Fix #5154
2023-01-16 16:33:10 +02:00
Erik Nordström
1e7b9bc558 Fix issue with deleting data node and dropping database
When deleting a data node with the option `drop_database=>true`, the
database is deleted even if the command fails.

Fix this behavior by dropping the remote database at the end of the
delete data node operation so that other checks fail first.
2023-01-13 13:54:27 +01:00
Dmitry Simonenko
826dcd2721 Ensure nodes availability using dist restore point
Make sure that a data node list does not have unavailable data nodes
when using create_distributed_restore_point() API.
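
For example (the restore point name is arbitrary), the call now checks
that no attached data node is marked unavailable:

```sql
SELECT * FROM create_distributed_restore_point('my_restore_point');
```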

Fix #4979
2022-11-24 16:08:06 +02:00
Dmitry Simonenko
5813173e07 Introduce drop_stale_chunks() function
This function drops chunks on a specified data node if those chunks are
not known by the access node.
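
A hypothetical invocation; the internal schema and argument list are
assumptions, not taken from this commit:

```sql
-- Drop chunks on data node 'dn1' that the access node does not know about.
SELECT _timescaledb_internal.drop_stale_chunks('dn1');
```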

Call drop_stale_chunks() automatically when a data node becomes
available again.

Fix #4848
2022-11-23 19:21:05 +02:00
Lakshmi Narayanan Sreethar
839e42dd0c Use async API to drop database from delete_data_node
PG15 introduced a ProcSignalBarrier mechanism in the drop database
implementation to force all backends to close the file handles for
dropped tables. The backend that is executing the drop database command
will emit a new process signal barrier and wait for other backends to
accept it. But the backend that is executing the delete_data_node
function will not be able to process the above-mentioned signal as it
will be stuck waiting for the drop database query to return. Thus the
two backends end up waiting for each other, causing a deadlock.

Fixed it by using the async API to execute the drop database command
from delete_data_node instead of the blocking remote_connection_cmdf_ok
call.

Fixes #4838
2022-11-17 18:09:39 +05:30
Erik Nordström
f13214891c Add function to alter data nodes
Add a new function, `alter_data_node()`, which can be used to change
the data node's configuration originally set up via `add_data_node()`
on the access node.

The new function introduces a new option "available" that allows
configuring the availability of the data node. Setting
`available=>false` means that the node should no longer be used for
reads and writes. Only read "failover" is implemented as part of this
change, however.

To fail over reads, the alter data node function finds all the chunks
for which the unavailable data node is the "primary" query target and
"fails over" to a chunk replica on another data node instead. If some
chunks do not have a replica to fail over to, a warning will be
raised.

When a data node is available again, the function can be used to
switch back to using the data node for queries.
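
A sketch of the intended usage (the node name is a placeholder):

```sql
-- Mark the node as unavailable, failing over reads to chunk replicas
-- on other data nodes where possible ...
SELECT alter_data_node('dn2', available => false);
-- ... and switch back once the node is reachable again.
SELECT alter_data_node('dn2', available => true);
```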

Closes #2104
2022-11-11 13:59:42 +01:00
Fabrízio de Royes Mello
f1535660b0 Honor usage of OidIsValid() macro
The Postgres source code defines the macro `OidIsValid()` to check if an
Oid is valid or not (comparing against `InvalidOid`). See
`src/include/c.h` in the Postgres source tree.

Changed all direct comparisons against `InvalidOid` to use the
`OidIsValid()` macro and added a coccinelle check to make sure future
changes will use it correctly.
2022-11-03 16:10:50 -03:00
Alexander Kuzmenkov
313845a882 Enable -Wextra
Our code mostly has warnings about comparison with different
signedness.
2022-10-27 16:06:58 +04:00
Erik Nordström
025bda6a81 Add stateful partition mappings
Add a new metadata table `dimension_partition` which explicitly and
statefully details how a space dimension is split into partitions, and
(in the case of multi-node) which data nodes are responsible for
storing chunks in each partition. Previously, partitions and data
nodes were assigned dynamically based on the current state when
creating a chunk.

This is the first in a series of changes that will add more advanced
functionality over time. For now, the metadata table simply writes out
what was previously computed dynamically in code. Future code changes
will alter the behavior to do smarter updates to the partitions when,
e.g., adding and removing data nodes.

The idea of the `dimension_partition` table is to minimize changes in
the partition to data node mappings across various events, such as
changes in the number of data nodes, number of partitions, or the
replication factor, which affect the mappings. For example, increasing
the number of partitions from 3 to 4 currently leads to redefining all
partition ranges and data node mappings to account for the new
partition. Complete repartitioning can be disruptive to multi-node
deployments. With stateful mappings, it is possible to split an
existing partition without affecting the other partitions (similar to
partitioning using consistent hashing).

Note that the dimension partition table expresses the current state of
space partitions; i.e., the space-dimension constraints and data nodes
to be assigned to new chunks. Existing chunks are not affected by
changes in the dimension partition table, although an external job
could rewrite, move, or copy chunks as desired to comply with the
current dimension partition state. As such, the dimension partition
table represents the "desired" space partitioning state.
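
For illustration, the stored mappings can be inspected directly; the
column names below are assumptions for this sketch:

```sql
SELECT dimension_id, range_start, data_nodes
FROM _timescaledb_catalog.dimension_partition
ORDER BY dimension_id, range_start;
```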

Part of #4125
2022-08-02 11:38:32 +02:00
Nikhil Sontakke
fdb12f7abe Handle timescaledb versions aptly in multinode
The current check where we deem a DN incompatible if it's on a newer
version is exactly the opposite of what we want it to be. Fix that and
also add relevant test cases.
2022-07-06 20:09:09 +05:30
Sven Klemm
02d4aefb85 Fix flaky data_node_bootstrap test
Copy collation and chartype before releasing syscache since we need
them past the lifetime of the current context.
2022-06-21 11:53:18 +02:00
Nikhil Sontakke
ed55654a32 Retain hypertable ownership on attach_data_node
If a superuser is used to invoke attach_data_node on a hypertable,
then we need to ensure that the object created on the data node
retains the hypertable's original ownership.

Fixes #4433
2022-06-17 18:05:25 +05:30
Erik Nordström
19b3f67b9c Drop remote data when detaching data node
Add a parameter `drop_remote_data` to `detach_data_node()` which
allows dropping the hypertable on the data node when detaching
it. This is useful when detaching a data node and then immediately
attaching it again. If the data remains on the data node, the
re-attach will fail with an error complaining that the hypertable
already exists.

The new parameter is analogous to the `drop_database` parameter of
`delete_data_node`. The new parameter is `false` by default for
compatibility and ensures that a data node can be detached without
requiring communicating with the data node (e.g., if the data node is
not responding due to a failure).
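
For example (node and hypertable names are placeholders):

```sql
-- Detach the node and drop the hypertable's data on it, so the node
-- can be attached again later without a "table already exists" error.
SELECT detach_data_node('dn1', 'conditions', drop_remote_data => true);
```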

Closes #4414
2022-06-14 15:53:41 +02:00
Sven Klemm
308ce8c47b Fix various misspellings 2022-06-13 10:53:08 +02:00
Sven Klemm
96202a99bd Adjust code to PG15 pg_database changes
PG15 changes the type of collate and ctype from name to text.

https://github.com/postgres/postgres/commit/54637508
2022-06-05 14:43:55 +02:00
Sven Klemm
2715b5564a Replace pg_atoi with pg_strtoint16/32
PG 15 removes pg_atoi, so this patch changes all callers to use
pg_strtoint16/32.

https://github.com/postgres/postgres/commit/73508475
2022-05-30 08:32:50 +02:00
Mats Kindahl
34bf695444 Add initializer to auto variable
Compilers are not smart enough to check that `conn` is initialized
inside the loop so not initializing it gives an error. Added an
initializer to the auto variable to get rid of the error.
2022-05-25 14:36:23 +02:00
gayyappan
9f64df8567 Add ts_catalog subdirectory
Move files that are related to timescaledb catalog
access to this subdirectory
2022-01-24 16:58:09 -05:00
Mats Kindahl
e320679c4c Remove grants on data node bootstrap
Starting with PG15, default permissions on the public schema are
restricted for any non-superuser non-owner. This causes test failures
since tables can no longer be created without explicitly adding
permissions, so we remove the grant when bootstrapping the data nodes
and instead grant permissions to the users in the regression tests.
This keeps the default permissions on data nodes, but allows regression
tests to run.

Fixes #3957

Reference: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b073c3cc
2022-01-17 17:36:33 +01:00
Aleksander Alekseev
d7eaa55a47 Override the default ACL for public schema on PG15
Since PG15, by default, non-superuser accounts are not allowed to create
tables in the public schema of databases they don't own. This default can be
changed manually. This patch ensures that the permissions are going to be the
same regardless of the PostgreSQL version used.
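
For context, the pre-PG15 default on the public schema roughly amounts
to the following grant (not a quote from this patch):

```sql
GRANT CREATE ON SCHEMA public TO PUBLIC;
```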

Without this patch, none of our tests pass on PG15 because they fail with the
"access denied to schema public" error. This is why runner.sh was modified.
Then, some other tests keep failing because calling create_distributed_hypertable()
creates a new database on each of the data nodes, again without granting enough
permissions to non-privileged users. This is what the fix in data_node.c
addresses.

This is not necessarily the best approach possible, but it preserves the same
behavior on PostgreSQL >= 15 and PostgreSQL < 15. Maybe one day we will come up
with something better (especially when there will be no need to support PG < 15)
but until then the patch seems to be good enough.

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b073c3cc
2021-12-23 16:18:17 +03:00
Erik Nordström
e0f02c8c1a Add option to drop database when deleting data node
When deleting a data node, it is often convenient to be able to also
drop the database on the data node so that the node can be added again
using the same database name. However, dropping the database is
optional since it should be possible to delete a data node even if it
is no longer responding.

With the new functionality, a data node's database can be dropped as
follows:

```sql
SELECT delete_data_node('dn1', drop_database=>true);
```

Note that the default behavior is still to not drop the database in
order to be compatible with the old behavior. Enabling the option also
makes the function non-transactional, since dropping a database is not
transactional. Therefore, it is not possible to use this option in a
transaction block.

Closes #3876
2021-12-16 15:59:50 +01:00
Nikhil Sontakke
4cecdb50f9 Fix remote txn heal logic
* A few tweaks to the remote txn resolution logic
* Add logic to delete a specific record in remote_txn table by GID
* Allow heal logic to move on to other cleanup if one specific GID
fails
* Do not rely on ongoing txns while cleaning up entries from remote_txn
table

Includes test case changes to try out various failure scenarios in the
healing function.

Fixes #3219
2021-12-09 20:44:07 +05:30
Dmitry Simonenko
3d11927567 Rework distributed DDL processing logic
This patch refactors and reworks the logic behind the
dist_ddl_preprocess() function.

The idea is to simplify the process by splitting the logic for each
DDL command into a separate function and avoiding reliance on the
hypertable list count to make decisions.

This change makes it easier to process more complex commands
(such as GRANT), which may require a query rewrite or execution on
different data nodes. Additionally, this makes the code easier to
follow and more similar to the main code path in src/process_util.c.
2021-10-29 16:15:58 +03:00
Erik Nordström
d740e19c5f Fix DirectFunctionCall crash in distributed_exec
This change fixes a crash that occurred when calling
`distributed_exec` via a direct function call.

The crash was triggered by a dynamic lookup of the function name via
the function Oid in the `FunctionCallInfo` struct in order to generate
error messages for read-only and transaction block checks. However,
this information is provided by the parsing stage, which is not
executed when doing direct function calls, thus leading to a
segmentation fault when trying to dereference a pointer in the
`FunctionCallInfo` that wasn't set.

Note that this problem is not limited to `distributed_exec`; it is
present in all SQL-callable functions that use the same pattern and
macros.

To fix the problem, update the macros and patterns used for checking
for read-only mode and transaction blocks to avoid doing the function
name lookup when the pointer is not set. Instead fall back to the C
function name in that case (via C macro `__func__`).

A test case is added in C code to call `distributed_exec` via a direct
function call within a transaction block in order to hit the
previously crashing error message.

The `distributed_exec` function has also been updated with better
handling of input parameters, like empty arrays of data nodes, or
arrays containing NULL elements.
2021-10-26 15:20:56 +02:00
Erik Nordström
28a5650382 Allow anyone to use size utilities on distributed hypertables
This change removes a check for `USAGE` privileges on data nodes
required to query the data node using utility commands, such as
`hypertable_size`. Normally, PostgreSQL doesn't require `USAGE` on a
foreign server to query its remote tables. Also, size utilities, like
`pg_table_size` can be used by anyone---even roles without any
privileges on a table. The behavior on distributed hypertables is now
consistent with PostgreSQL.
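
For example (the hypertable name is a placeholder), the size utilities
no longer require `USAGE` on the data nodes' foreign servers:

```sql
SELECT hypertable_size('conditions');
```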

Fixes #3698
2021-10-15 15:01:44 +02:00
Sven Klemm
d0426ff234 Move all compatibility related files into compat directory 2021-08-28 05:17:22 +02:00
Ruslan Fomkin
404f1cdbad Create chunk table from access node
Creates a table for a chunk replica on the given data node. The table
gets the same schema and name as the chunk. The created chunk replica
table is not added to the metadata on the access node or the data node.

The primary goal is to use it during chunk copy/move.
2021-07-29 16:53:12 +03:00
Erik Nordström
98110af75b Constify parameters and return values of core APIs
Harden core APIs by adding the `const` qualifier to pointer parameters
and return values passed by reference. Adding `const` to APIs has
several benefits and potentially reduces bugs.

* Allows core APIs to be called using `const` objects.
* Callers know that objects passed by reference are not modified as a
  side-effect of a function call.
* Returning `const` pointers enforces "read-only" usage of pointers to
  internal objects, forcing users to copy objects when mutating them
  or using explicit APIs for mutations.
* Allows compiler to apply optimizations and helps static analysis.

Note that these changes are so far only applied to core API
functions. Further work can be done to improve other parts of the
code.
2021-06-14 22:09:10 +02:00
Erik Nordström
c224bc7994 Always validate existing database and extension
This change ensures the database and extension are validated whenever
these objects aren't created, instead of only doing validation when
`bootstrap=>false` is passed when adding a data node.
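
For illustration (node and host names are placeholders), validation now
applies in both forms whenever the objects already exist:

```sql
SELECT add_data_node('dn1', host => 'dn1.example.com');                      -- bootstrap (default)
SELECT add_data_node('dn1', host => 'dn1.example.com', bootstrap => false);  -- use existing objects
```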

This fixes a corner case where a data node could be added and removed
several times, even though the data node's database was already marked
as having been part of a multi-node setup.

A new test checks that a data node cannot be re-added after deleting
it on the access node, irrespective of whether one bootstraps the data
node or not when it is added.
2020-12-29 13:37:27 +01:00
Erik Nordström
3bd29da988 Bootstrap data nodes with versioned extension
When the access node bootstraps a data node and creates the extension,
it should use the extension version of the access node. This change
adds the `VERSION` option to the `CREATE EXTENSION` statement sent to
a data node so that the extension versions on the access node and data
nodes will be the same. Without the version option, data nodes will be
bootstrapped with the latest version installed, potentially leading to
data nodes running different versions of the extension compared to the
access node.
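
Roughly, the bootstrap now sends a statement of the following form,
where the version string is a placeholder for the access node's
extension version:

```sql
CREATE EXTENSION timescaledb VERSION '2.0.0';
```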
2020-12-21 13:01:25 +01:00
Erik Nordström
eede24bcf2 Add error detail when adding a data node fails
When `add_data_node` fails, it often gives an opaque error that it
couldn't connect to the data node. This change adds the libpq
connection error as a detailed message in the error.
2020-12-18 11:31:56 +01:00
Erik Nordström
877f48230c Fix crash and cancel when adding data node
This change fixes two issues with `add_data_node`:

1. In one case, a check for a valid connection pointer was not done,
   causing a segmentation fault when connection attempts failed.

2. Connections were made with a blocking API that hangs
   indefinitely when the receiving end is not responding. The user
   couldn't cancel the connection attempt with CTRL-C, since no wait
   latch or interrupt checking was used. The code is now updated to
   use a non-blocking connection API, where it is possible to wait on
   the socket and latch, respecting interrupts.
2020-12-17 09:39:09 +01:00
niksa
7f3feb8200 Introduce additional db for data node bootstrapping
We now try connecting to three databases before giving up:
postgres, template1 and defaultdb.
2020-12-02 18:20:54 +01:00
Erik Nordström
e284b2dfc0 Fix uninitialized variable in data node validation
Initialize a boolean variable used to check for a compatible extension
on a data node. Leaving it uninitialized might lead to a potential
read of a garbage value and unpredictable behavior.
2020-11-11 23:25:30 +01:00
Erik Nordström
47d26b422e Allow optional password when adding data node
Add an optional password parameter to `add_data_node` so that users
that don't have a password in a `passfile` on the access node can add
data nodes using password authentication. Together with user mappings,
this allows full multinode configuration without relying on passwords
or certificates provided in external/on-disk files.
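
For example (host and password are placeholders):

```sql
SELECT add_data_node('dn1', host => 'dn1.example.com', password => 'xyzzy');
```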

While passwords can be provided in the database via a user mapping
object, such a mapping is created on a per-server basis and requires
the foreign server to exist prior to creating the mapping. When adding
a data node, however, bootstrapping and/or validation of the data node
happens at the same time as the server object is created, which means
no user mapping can be created prior to adding the data
node. Therefore, the password must be provided as an argument to add
data node instead of via a user mapping.

Fortunately, using a function parameter might be preferred to a user
mapping since the (plaintext) password won't be stored in the
database. A user mapping for the user that created the data node can
optionally be added after the data node has been added. But it might
be desirable to only create user mappings for unprivileged users that
will mostly interact only with specific distributed hypertables.
2020-11-10 13:48:21 +01:00
niksa
f8f53aaeed Fix validation of available extensions on data node
We want to check all available extension versions
and not just the installed one. This is because we
might be setting up a cluster for a database that
has a different extension version than the `postgres` or
`template1` database which we actually use to perform
this validation.
So instead of using the `pg_available_extensions` view we
use `pg_available_extension_versions`, which should return
the same list of extension versions no matter which database
we connect to.
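
The `pg_available_extension_versions` view is a standard PostgreSQL
catalog view whose contents do not depend on the connected database,
e.g.:

```sql
SELECT name, version
FROM pg_available_extension_versions
WHERE name = 'timescaledb';
```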

This should also make it possible to add a data node that
has run ALTER EXTENSION UPDATE.

This gives no guarantee that the installed version will be
compatible because we currently use the default version
(the one specified in the control file) when
installing an extension.
2020-11-06 12:11:55 +01:00
Mats Kindahl
e9cb14985e Read function name dynamically
The function name is hard-coded in some cases in the C functions, so
this commit instead defines and uses a macro that extracts the
function name from the `fcinfo` structure. This prevents mismatches
between the hard-coded names and the actual function name.

Closes #2579
2020-10-21 15:03:32 +02:00
Erik Nordström
3cf9c857c4 Make errors and messages conform to style guide
Errors and messages are overhauled to conform to the official
PostgreSQL style guide. In particular, the following things from the
guide have been given special attention:

* Correct capitalization of the first letter: capitalize only hint
  and detail messages.
* Correct handling of periods at the end of messages (should be elided
  for primary message, but not detail and hint messages).
* The primary message should be short, factual, and avoid reference to
  implementation details such as specific function names.

Some messages have also been reworded for clarity and to better
conform with the last bullet above (short primary message). In other
cases, messages have been updated to fix references to, e.g., function
parameters that used the wrong parameter name.

Closes #2364
2020-10-20 16:49:32 +02:00
Dmitry Simonenko
ebc4fd9b9e Add if_attached argument to detach_data_node()
This change makes the detach_data_node() function consistent with
other data node management functions by adding the missing
if_attached argument.

The function will not show an error if the data node is not
attached and if_attached is set to true.
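
For example (node and hypertable names are placeholders), this no
longer raises an error when 'dn1' is not attached:

```sql
SELECT detach_data_node('dn1', 'conditions', if_attached => true);
```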

Issue: #2506
2020-10-08 20:53:14 +03:00
Mats Kindahl
8ddaef66ea Remove error for correct bootstrap of data node
If the database exists on the data node when executing `add_data_node`,
it will generate an error in the data node log, which can cause
problems since there is an error indication in the log but there are no
failing operations.

This commit fixes this by first validating the database and creating it
only if it does not exist.

Closes #2503
2020-10-08 09:03:15 +02:00
niksa
65f31122ee Fix validation logic when adding a new data node
We stop enforcing that the extension owner is the same as the user adding a
data node since that's not strictly necessary. In multi-node setups
it is common that a data node is pre-bootstrapped and an extension owner
is already set. This will prevent getting an error when a user who is not
the extension owner tries to add a data node.
2020-10-07 21:11:30 +02:00
Mats Kindahl
03d2f32178 Add self-reference check to add_data_node
If the access node is adding itself as a data node using `add_data_node`,
it will deadlock since transactions will be opened on both the access
node and the data node, each trying to update the metadata.

This commit fixes this by updating `set_dist_id` to check if the UUID
being added as `dist_uuid` is the same as the `uuid` of the node. If
that is the case, it raises an error.

Fixes #2133
2020-07-30 21:19:33 +02:00
Dmitry Simonenko
04bcc949c1 Add checks for read-only transactions
This change ensures that API functions and DDL operations
which modify data respect the read-only transaction state
set by the default_transaction_read_only option.
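
A minimal sketch of the intended behavior (node and host names are
placeholders):

```sql
SET default_transaction_read_only TO on;
-- A metadata-modifying API call such as this should now be rejected:
SELECT add_data_node('dn1', host => 'dn1.example.com');
```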
2020-06-22 17:03:04 +03:00
Erik Nordström
47a4d931f3 Remove need to pin connection cache
Since the connection cache is no longer replaced on a transaction
rollback, it is not necessary to pin the connection cache (this wasn't
done correctly in some places in any case, e.g.,
`data_node_get_connection`).
2020-06-13 12:05:41 +02:00
Sven Klemm
db617bf1d6 Fix typos in comments and documentation 2020-06-10 15:09:31 +02:00
Dmitry Simonenko
f3b7907778 Cleanup remote_txn on data node delete
Remove any 2PC transaction records associated with the data node
from _timescaledb_catalog.remote_txn on delete_data_node() call.
2020-06-10 15:34:40 +03:00
Sven Klemm
c90397fd6a Remove support for PG9.6 and PG10
This patch removes code support for PG9.6 and PG10. In addition to
removing the PG96 and PG10 macros, the following changes are made:

* remove HAVE_INT64_TIMESTAMP since this is always true on PG10+
* remove PG_VERSION_SUPPORTS_MULTINODE
2020-06-02 23:48:35 +02:00
Mats Kindahl
c2744e13ad Show error message on unavailable extension
If the extension is not available on the data node, a strange error
message will be displayed since the extension cannot be installed. This
commit checks for the availability of the extension before trying to
bootstrap the node and prints a more helpful informational message if
the extension is not available.
2020-05-27 17:31:09 +02:00