This PR adds a timeout argument and new logic to the
timescale_internal.ping_data_node() function, which allows it
to handle I/O timeouts when a data node is unresponsive.
Fix #5312
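As a rough illustration of the timeout handling described above, the following is a minimal sketch of waiting for a reply on a socket with a bounded wait, using POSIX `poll()`. It is not the code in this PR: the `PingResult` type and the `wait_for_reply()` helper are hypothetical names made up for this example, and a one-byte read stands in for a real reply.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical result type for this sketch (not part of TimescaleDB). */
typedef enum
{
	PING_OK,
	PING_TIMEOUT,
	PING_ERROR
} PingResult;

/*
 * Hypothetical helper: wait up to timeout_ms milliseconds for a reply
 * byte on fd. Returns PING_TIMEOUT if nothing arrives in time.
 */
static PingResult
wait_for_reply(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	char reply;
	int rc;

	assert(timeout_ms >= 0);

	/*
	 * Retry when interrupted by a signal. For simplicity the full
	 * timeout is reused on each retry instead of the remaining time.
	 */
	do
	{
		rc = poll(&pfd, 1, timeout_ms);
	} while (rc < 0 && errno == EINTR);

	if (rc == 0)
		return PING_TIMEOUT; /* node did not respond in time */
	if (rc < 0)
		return PING_ERROR;

	/* Readable or hung up: only a successful read counts as a reply. */
	return read(fd, &reply, 1) == 1 ? PING_OK : PING_ERROR;
}
```

The point of the bounded wait is that an unresponsive node yields a distinct timeout result instead of blocking the caller indefinitely.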
This change does the following:
* Refactor the code for setting the default data node for a chunk. The
`set_chunk_default_data_node()` API function now takes a
`regclass`/`oid` instead of separate schema + table names and
returns `true` when a new data node is set and `false` if called
with a data node that is already the default. Like before,
exceptions are thrown on errors. It also does proper permissions
checks. The related code has been moved from `data_node.c` to
`chunk.c` since this is an operation on a chunk, and the code now
also lives in the `tsl` directory since this is non-trivial logic
that should fall under the TSL license.
* When setting the default data node on a chunk (failing over to
another data node), it is now verified that the new data node
actually has a replica of the chunk and that the corresponding
foreign server belongs to the "right" foreign data wrapper.
* Error messages and permissions handling have been tweaked.
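The return-value and replica-check behavior described in the list above can be sketched roughly as follows. This is a simplified, self-contained model and not the actual `chunk.c` code: the `ChunkSketch` struct and the `sketch_set_default_data_node()` function are hypothetical, and a return value of -1 stands in for the exception thrown on errors. Permission and foreign data wrapper checks are omitted.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_REPLICAS 4

/* Hypothetical, simplified view of a chunk (assumption for this sketch). */
typedef struct ChunkSketch
{
	const char *default_node;                /* current default data node */
	const char *replica_nodes[MAX_REPLICAS]; /* nodes holding a replica */
	int num_replicas;
} ChunkSketch;

/* Check whether the named node holds a replica of the chunk. */
static bool
chunk_has_replica_on(const ChunkSketch *chunk, const char *node)
{
	for (int i = 0; i < chunk->num_replicas; i++)
	{
		if (strcmp(chunk->replica_nodes[i], node) == 0)
			return true;
	}
	return false;
}

/*
 * Returns 1 when a new default data node is set, 0 when the node is
 * already the default, and -1 when the node has no replica of the chunk
 * (standing in for the error raised in that case).
 */
static int
sketch_set_default_data_node(ChunkSketch *chunk, const char *node)
{
	if (!chunk_has_replica_on(chunk, node))
		return -1;

	if (strcmp(chunk->default_node, node) == 0)
		return 0;

	chunk->default_node = node;
	return 1;
}
```

Checking for the replica before changing anything means a failed call leaves the chunk's default data node untouched.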
The timescale clustering code so far has been written referring to the
remote databases as 'servers'. This terminology is a bit overloaded,
and in particular we don't enforce any network topology limitations
that the term 'server' would suggest. In light of this, we've decided
to use the term 'node' instead when referring to the different
databases in a distributed database. Specifically we refer to the
frontend as an 'access node' and to the backends as 'data nodes',
though we may omit the access or data qualifier where it's unambiguous.
As the vast bulk of the code so far has been written for the case where
there was a single access node, almost all instances of 'server' were
references to data nodes. This change has updated the code to rename
those instances.
This change includes only the file renames required by the renaming
of server to data node across the clustering codebase. This change
is being committed separately from the bulk of the rename changes to
prevent git from losing the file history of renamed files (merging the
rename with extensive code modifications resulted in git treating some
of the file moves as a file delete and new file creation).