For a statement that only specifies the database, we expect the data
node to be created on the same Postgres instance as the one where the
statement is executed.
`SELECT * FROM add_data_node('data1', database => 'base1');`
However, if the port for the server is changed in the configuration
file to not use the default port, the command will try to connect to
the wrong Postgres server, namely the one listening on port 5432.
This commit fixes this by letting the `host` and `port` parameters be NULL
by default and using the following logic to decide which port should be
used:
- If a port is explicitly provided, use that.
- If a port is not provided but a host is provided, it is assumed that
the intention is to connect to a default-installed Postgres server on
a different address, so use the default Postgres port (5432).
- If neither port nor host is provided, it is assumed that the intention
is to connect to the same server as where the command is executed, so
use the port that was written in the configuration file.
The default host to use is still 'localhost', but it is not written
explicitly in the function definition in `ddl_api.sql`.
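For illustration, using the same call shape as above (the port and host
values are placeholders):
`SELECT * FROM add_data_node('data1', database => 'base1', port => 6543);` -- explicit port is used
`SELECT * FROM add_data_node('data2', database => 'base1', host => 'db.example.com');` -- default port 5432 is used
`SELECT * FROM add_data_node('data3', database => 'base1');` -- port from the local configuration file is used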
The commit also fixes one warning where an uninitialized variable could
be used.
This function allows users to execute a SQL query on a list of data
nodes. The purpose is to provide users a way to, e.g., create roles on
data nodes.
The current implementation is straightforward: it simply executes the
provided query on each data node in the list. The query executes with
the current user role. The function does not return or print any
result values. In case of error, it prints the data node name and
a related error message.
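A hypothetical call shape; this message does not name the function or
its exact signature, so `distributed_exec` and `node_list` here are
illustrative:
`SELECT distributed_exec('CREATE ROLE dist_user LOGIN', node_list => '{data_node_1, data_node_2}');`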
This change replaces UserMappings with the newly introduced TSConnectionId
object, which represents a pair of a foreign server id and a local user id.
Authentication has been moved to non-password-based methods, since the
original UserMappings were also used to store data node user
passwords. This is a temporary step until the introduction of
certificate-based authentication.
List of changes:
* add_data_node() password and bootstrap_password arguments removed
* introduced authentication using pgpass file
* RemoteTxn format string representing a tx changed to
tx-version-xid-server_id-user_id
* data_node_dispatch, remote transaction cache, and connection cache hash
table keys switched to TSConnectionId instead of user mappings
* remote_connection_open() has been reworked to exclude user options
* Tests updated; user mapping and password usage has been removed
All data node functions except `attach_data_node` take the node name as
the first parameter. This commit changes the order of the first two
parameters to `attach_data_node` so that the node name is the first
parameter and the hypertable is the second parameter.
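With the new order, a call looks as follows (node and table names are
illustrative):
`SELECT * FROM attach_data_node('node_1', 'my_hypertable');`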
Distributed hypertables are now repartitioned when attaching new data
nodes and the current number of partitions (slices) in the first closed
(space) dimension is less than the number of data nodes. Increasing
the number of partitions is necessary to make use of a newly attached
data node. However, repartitioning is optional and can be avoided via
a boolean parameter in `attach_server()`, as sketched below.
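A sketch of opting out; the parameter name `repartition` and the
argument order are hypothetical, as this message does not give them:
`SELECT * FROM attach_server('disttable', 'server_2', repartition => false);`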
In addition to the above repartitioning, this change also adds
informational messages to `create_hypertable` and
`set_number_partitions` to raise awareness of situations when the
number of partitions in the space dimensions is lower than the number
of attached data nodes.
This change refactors and hardens parts of data node management
functionality.
* A number of permissions checks have been added to data node
management functions. This includes checking that the user has
proper permissions for both table and server objects.
* Permissions checks are now done when creating remote chunks on data
nodes.
* The add_data_node() API function has been simplified and now returns
more intuitive status about created objects (foreign server,
database, extension). It is no longer necessary to specify a user to
connect with as this is always assumed to be the current user. The
bootstrap user can still be specified explicitly, however, as that
user might require elevated permissions on the remote node to
bootstrap.
* Functions that capture exceptions without re-throwing, such as
`ping_data_node()` and `get_user_mapping()`, have been refactored to
not do this as the transaction state and memory contexts are not in
states where it is safe to proceed as normal.
* Data node management functions now consistently check that any
foreign servers operated on are actually TimescaleDB server objects.
* Tests now run with a superuser and a regular user specific to
clustering. These users have password auth enabled in `pg_hba.conf`,
which is required by the connection library when connecting as a
non-superuser. Tests have been refactored to bootstrap data nodes
using these user roles.
The timescale clustering code so far has been written referring to the
remote databases as 'servers'. This terminology is a bit overloaded,
and in particular we don't enforce any network topology limitations
that the term 'server' would suggest. In light of this we've decided
to change to use the term 'node' when referring to the different
databases in a distributed database. Specifically we refer to the
frontend as an 'access node' and to the backends as 'data nodes',
though we may omit the access or data qualifier where it's unambiguous.
As the vast bulk of the code so far has been written for the case where
there was a single access node, almost all instances of 'server' were
references to data nodes. This change has updated the code to rename
those instances.
Prevent server delete if the server contains data, unless the user
specifies `force => true`. In case the server is the only data
replica, we don't allow delete/detach unless tables/chunks are dropped.
The idea is to have the same semantics for delete as for detach, since
delete actually calls detach.
We also try to update pg_foreign_table when we delete a server if there
is another server containing the same chunk.
An internal function is added to enable updating a foreign table's
server, which might be useful in some cases since the foreign table's
server is considered the default server for that particular chunk.
Since this command needs to work even if the server we're trying to
remove is unresponsive, we're not removing any data on the remote
data node.
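For example, forcing deletion of a server that still holds data (the
server name is illustrative):
`SELECT * FROM delete_server('server_1', force => true);`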
This functionality enables users to block or allow creation of new
chunks on a data node for one or more hypertables. Use cases for this
include the ability to block new chunks when a data node is running
low on disk space or to affect chunk distribution across data nodes.
Sometimes blocking data nodes for new chunks can make a hypertable
under-replicated. For that case an additional argument `force => true`
can be supplied to force blocking new chunks.
Here are some examples.
Block for a specific hypertable:
`SELECT * FROM block_new_chunks_on_server('server_1', 'disttable');`
Block for all hypertables on the server:
`SELECT * FROM block_new_chunks_on_server('server_1', force => true);`
Unblock:
`SELECT * FROM allow_new_chunks_on_server('server_1', true);`
This change adds the `force` argument to `detach_server` as well. If
detaching or blocking new chunks would make a hypertable
under-replicated, then `force => true` needs to be used.
A server can now be detached from one or more distributed hypertables
so that it is no longer in use. We only allow detaching a server if there
is no data on the server and detaching it doesn't risk making a
hypertable under-replicated.
A user can detach a server for a specific hypertable, or for all
hypertables to which the server is attached.
`SELECT * FROM detach_server('server1', 'my_hypertable');`
`SELECT * FROM detach_server('server2');`
This patch adds functionality for automatic database and extension
creation on the remote server. New function arguments:
`bootstrap_database`, `bootstrap_user`, and `bootstrap_password`.
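A sketch of how these arguments might be used, presumably with
`add_server()` (all values are placeholders):
`SELECT * FROM add_server('server_1', bootstrap_database => 'postgres', bootstrap_user => 'postgres', bootstrap_password => 'secret');`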
This change ensures that chunk replicas are created on remote
(datanode) servers whenever a chunk is created in a local distributed
hypertable.
Remote chunks are created using the `create_chunk()` function, which
has been slightly refactored to allow specifying an explicit chunk
table name. The node making the remote call also records the resulting
remote chunk IDs in its `chunk_server` mappings table.
Since remote command invocation without super-user permissions
requires password authentication, the test configuration files have
been updated to require password authentication for a cluster test
user that is used in tests.
Establishing a remote connection requires a password, unless the
connection is made as a superuser. Therefore, this change adds the
option to specify a password in the `add_server()` command. This is a
required parameter unless called as a superuser.
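For example (the password value is a placeholder):
`SELECT * FROM add_server('server_1', password => 'secret');`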
In a multi-node (clustering) setup, TimescaleDB needs to track which
remote servers have data for a particular distributed hypertable. It
also needs to know which servers to place new chunks on and to use in
queries against a distributed hypertable.
A new metadata table, `hypertable_server`, is added to map a local
hypertable ID to a hypertable ID on a remote server. We require that
the remote hypertable has the same schema and name as the local
hypertable.
When a local server is removed (using `DROP SERVER` or our
`delete_server()`), all remote hypertable mappings for that server
should also be removed.
Servers for a scale-out clustering setup can now be added and deleted
with `add_server()` and `delete_server()`, providing a convenience API
for server management.
While similar functionality can be achieved using the standard
PostgreSQL `CREATE SERVER` and `CREATE USER MAPPING` commands, this
new API makes it easier to add clustering servers and user mappings
consistent with the needs of TimescaleDB's particular clustering setup.
The API currently works with the `postgres_fdw` foreign data
wrapper. It will be updated to use our own foreign data wrapper once
it is available.
Previously, drop_chunks returned an empty table, giving the user
no indication of what (if anything) had happened.
Now, drop_chunks returns a list of the chunk identifiers in the
same style as show_chunks, with the chunk's schema and table name.
Notably, when show_chunks is called directly before drop_chunks, the
output should be the same.
For hypertables that have continuous aggregates, calling drop_chunks now
drops all of the rows in the materialization table that were based on
the dropped chunks. Since we don't know what the correct default
behavior for drop_chunks is, we've added a new argument,
cascade_to_materializations, which must be set to true in order to call
drop_chunks on a hypertable which has a continuous aggregate.
drop_chunks is blocked on the materialization tables of continuous
aggregates.
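A sketch of the new argument (the hypertable name and interval are
illustrative):
`SELECT drop_chunks(interval '4 weeks', 'conditions', cascade_to_materializations => true);`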
TimescaleDB has always supported functions on closed (space)
dimension, i.e., for hash partitioning. However, functions have not
been supported on open (time) dimensions, instead requiring columns to
have a supported time type (e.g., integer or timestamp). This restricts
the tables that can be time partitioned. Tables with custom "time"
types, which can be transformed by a function expression into a
supported time type, are not supported.
This change generalizes partitioning so that both open and closed
dimensions can have an associated partitioning function that
calculates a dimensional value. Fortunately, since we already support
functions on closed dimensions, the changes necessary to support this
on any dimension are minimal. Thus, open dimensions now support an
(optional) partitioning function that transforms the input type to a
supported time type (e.g., integer or timestamp type). Any indexes on
such dimensional columns become expression indexes.
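For illustration, assuming a hypothetical `time_partitioning_func`
argument to `create_hypertable()` (this message does not name the
argument) and a user-defined function `extract_time` that converts the
column to a supported time type:
`SELECT create_hypertable('events', 'payload', time_partitioning_func => 'extract_time');`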
Tests have been added for chunk expansion and the hashagg and sort
transform optimizations on tables that are using a time partitioning
function.
Currently, not all of these optimizations are well supported, but this
could potentially be fixed in the future.
Remove the existing PLPGSQL function that implements drop_chunks, replacing it with a direct call to the C function, which also implements the old PLPGSQL checks in C. Refactor out much of the code shared between the C implementations of show_chunks and drop_chunks.
Timescale provides an efficient and easy-to-use API to drop individual
chunks from a TimescaleDB database through drop_chunks. This PR builds on
that functionality and, through a new show_chunks function, gives the
opportunity to see the chunks that would be dropped if drop_chunks were run.
Additionally, it adds a newer_than option to drop_chunks (also supported
by show_chunks) that allows seeing/dropping chunks in an interval or newer
than a point in time.
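For example, assuming a call shape like the following (the hypertable
name is illustrative):
`SELECT show_chunks('conditions', newer_than => INTERVAL '2 days');`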
This commit includes:
- Implementation of show_chunks in C
- Additional helper functions to work with chunks
- New version of drop_chunks in SQL that uses show_chunks. This
also adds a newer_than option to drop_chunks
- More extensive tests of drop_chunks and new tests for show_chunks
Among other reasons, show_chunks was implemented in C in order
to be able to have both older_than and newer_than arguments be null. This
was not possible in SQL because the arguments had to have polymorphic
types, and whether they are used in the function body or not, PL/pgSQL
requires these arguments to typecheck.
Add a bool `created` to the return value of create_hypertable and
add_dimension. When if_not_exists is true and creation is skipped because
the object already exists, `created` will be false; otherwise it will be
true. This modifies the functions to return metadata even when no object
was created.
Change the return value of add_dimension to return a record consisting
of dimension_id, schema_name, table_name, column_name. This improves
user feedback about success of the operation but also gives the function
an API returning useful information for non-human consumption.
Change create_hypertable to return a record consisting of
(hypertable_id, schema_name, table_name). This improves user feedback
about success of the operation but also gives the function an API
returning useful information for non-human consumption.
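For example (table and column names are illustrative):
`SELECT * FROM create_hypertable('conditions', 'time');`
now returns a record with `hypertable_id`, `schema_name`, `table_name`,
and the `created` flag described above.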
We've decided to adopt the ts_ prefix on all exported C functions in
order to avoid having symbol conflicts with future postgres functions.
We've already started using this prefix on new functions, and this commit
adds the prefix to the old functions.
Adaptive chunking uses the min and max value of previous chunks
to estimate their "fill factor". Ideally, min and max should be
retrieved using an index, but if no index exists we fall back
to a heap scan. A heap scan can be very expensive, so we now
raise a WARNING if no index exists.
This change also renames set_adaptive_chunk_sizing() to simply
set_adaptive_chunking().
Users can now (optionally) set a target chunk size and TimescaleDB
will try to adapt the interval length of the first open ("time")
dimension in order to reach that target chunk size. If a hypertable
has more than one open dimension, only the first one will have a
dynamically adapting interval.
Users can optionally specify their own function that calculates the
new dimension interval. They can also set a target size of 0 in order
to estimate a suitable target size for a chunk based on available
memory.
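For example (the target size is illustrative):
`SELECT * FROM set_adaptive_chunking('conditions', '100MB');`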
Tables can now hold existing data, which is optionally migrated from
the main table to chunks when create_hypertable() is called.
The data migration is similar to the COPY path, with the single
difference that the inserted/copied tuples come from an existing table
instead of being read from a file. After the data has been migrated,
the main table is truncated.
One potential downside of this approach is that all of this happens in
a single transaction, which means that the table is blocked while
migration is ongoing, preventing inserts by other transactions.
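A sketch of the migration path, assuming a hypothetical `migrate_data`
flag (this message does not name the option):
`SELECT create_hypertable('conditions', 'time', migrate_data => true);`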
This change improves the handling of tablespaces as follows:
- Add if_not_attached / if_attached options to attach_tablespace() and
detach_tablespace(), respectively (see the example after this list)
- Block DROP tablespace if it is still attached to a table
- Block REVOKE if it means the table owner no longer has CREATE
permissions on an attached tablespace
- Make error messages follow the PostgreSQL style guide
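For example (tablespace and table names are illustrative):
`SELECT attach_tablespace('disk2', 'conditions', if_not_attached => true);`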
This is a continuation of prior efforts to refactor API functions in C
to:
- improve usage of proper error codes
- use error messages that better conform with the PostgreSQL standard.
- improve security by avoiding running large amounts of code under SECURITY DEFINER
- move towards doing all metadata updates using a consistent catalog API
Most importantly, `create_hypertable()` has been refactored in C,
which simplifies a lot of code that previously required
upcalls/downcalls between C code and plpgsql code, or duplicated
functionality between the two environments.
The functions for adding and updating dimensions have been refactored
in C to:
- improve usage of proper error codes
- use error messages that better conform with the PostgreSQL standard
- improve security by avoiding running large amounts of code under SECURITY DEFINER
A new if_not_exists option has also been added to add_dimension(), and
the number of partitions can now be set using the new
set_number_partitions() function.
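For example (table and column names are illustrative):
`SELECT add_dimension('conditions', 'device', number_partitions => 2, if_not_exists => true);`
`SELECT set_number_partitions('conditions', 4);`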
A bug in the validation of smallint time intervals has been fixed. The
previous code didn't check for intervals > 0 and smallint intervals
accepted values up to UINT16_MAX instead of INT16_MAX.
A hypertable's associated schema is used to create and store internal
data tables (chunks). A hypertable creates tables in that schema,
typically with full superuser permissions, regardless of whether the
hypertable's owner or the current user have permissions for the schema.
If the schema doesn't exist, the hypertable will create it when
creating the first chunk, even though the user or table owner does
not have permissions to create schemas in the database.
This change adds proper permissions checks to create_hypertable() so
that users cannot create hypertables with a custom associated schema
unless they have the proper permissions on the schema or the database.
Chunks are also no longer created with internal schema permissions if
the associated schema is something different from the internal schema.
Compatibility with pg_upgrade required 2 changes:
1) search_path on functions cannot be blank for pg_upgrade.
2) The timescaledb.restoring GUC had to apply to more code (now moved to
higher-level check)
`pg_upgrade` must be passed the following option: `-O "-c timescaledb.restoring='on'"`
Tablespaces can now be detached from hypertables using
`tablespace_detach()`. This function can either detach
a tablespace from all tables or only a specific table.
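For example, following the naming in this message (tablespace and table
names are illustrative):
`SELECT tablespace_detach('disk1', 'conditions');` -- detach from a specific table
`SELECT tablespace_detach('disk1');` -- detach from all tables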
Having the ability to detach tablespaces allows more
advanced storage management; for instance, one can detach
tablespaces that are running low on disk space while attaching
new ones to replace the old ones.
Attaching tablespaces to hypertables is now handled
in native code, with improved permissions checking and
caching of tablespaces in the Hypertable data object.
The user should be able to add time dimensions using INTERVAL when
the column type is TIMESTAMP/TIMESTAMPTZ/DATE, so this change adds
that support.
It also adds further tests and checks for
add_dimension, e.g., a nice error when the table is not a
hypertable.
For convenience, the user should be able to specify the new
chunk time intervals using INTERVAL datatype if the hypertable is
using a TIMESTAMP/TIMESTAMPTZ/DATE datatype for its time column.
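For example, presumably via `set_chunk_time_interval()` (this message
does not name the function; the table name is illustrative):
`SELECT set_chunk_time_interval('conditions', INTERVAL '1 day');`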
This PR fixes the handling of drop_chunks when the hypertable's
time field is a TIMESTAMP or DATE field. Previously, such
hypertables needed drop_chunks to be given a timestamptz in UTC.
Now, drop_chunks can take a DATE or TIMESTAMP. Also, the INTERVAL
version of drop_chunks correctly handles these cases.
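For example (names are illustrative):
`SELECT drop_chunks(DATE '2017-01-01', 'conditions');`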
A consequence of this change is that drop_chunks cannot be called
on multiple tables (with table_name = NULL or schema_name = NULL)
if the tables have different time column types.
This change reduces the usage of SECURITY DEFINER on SQL
functions and fixes related permissions issues. It also
properly checks hypertable permissions relative to the current_user
instead of the session_user, which otherwise breaks SET ROLE,
among other things.