This change replaces the existing `clang-tidy` linter target with
CMake's built-in support for it. The old way of invoking the linter
relied on the `run-clang-tidy` wrapper script, which is not installed
by default on some platforms. Discovery of the `clang-tidy` tool has
also been improved to work with more installation locations.
As a result, linting now happens at compile time and is enabled
automatically when `clang-tidy` is installed and found.
When enabling `clang-tidy`, several non-trivial issues were discovered
in compression-related code. These might be false positives but, until
a proper solution is found, warnings-as-errors has been disabled for
that code to allow compilation to succeed with the linter enabled.
Change our GUC names to use an `enable` prefix for all boolean GUCs,
similar to PostgreSQL GUC names.
This patch renames `disable_optimizations` to `enable_optimizations`
and `constraint_aware_append` to `enable_constraint_aware_append`, and
removes `optimize_non_hypertables`.
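A minimal usage sketch of the renamed settings, assuming they are set
with the usual `timescaledb.` prefix like other extension GUCs:

    -- Formerly disable_optimizations (boolean sense presumably inverted)
    SET timescaledb.enable_optimizations = on;
    -- Formerly constraint_aware_append
    SET timescaledb.enable_constraint_aware_append = on;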
This change updates the AppVeyor configuration for multinode-related
tests. These changes include, but are not limited to:
* Set `max_prepared_transactions` for 2PC.
* Add SSL/TLS configuration (although this is off for now due to
failing `loader` test when SSL is on).
* Update port settings since `add_data_node` outputs the port.
* Ignore `remote_connection` and `remote_txn` since they use a "node
killer" which does not work on Windows (SIGTERM not supported).
* Set timezone and datestyle.
If the extension is not available on the data node, a strange error
message will be displayed since the extension cannot be installed. This
commit checks for the availability of the extension before trying to
bootstrap the node and prints a more helpful informational message if
the extension is not available.
Related to issue #1702.
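A sketch of the kind of availability check that can be run against the
data node before bootstrapping, using the standard
`pg_available_extensions` view:

    -- If no row is returned, the extension cannot be installed on the
    -- node and the informational message is printed instead.
    SELECT default_version, installed_version
      FROM pg_available_extensions
     WHERE name = 'timescaledb';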
The current insert batching implementation depends on the number of
table columns and a batch size. It does not take into account the
maximum number of prepared statement arguments, which by default
can be exceeded with tables having a large number of columns.
This PR has two effects:
1) It automatically recalculates the insert batch size, instead of
using the fixed TUPLE_THRESHOLD value, if the expected total number
of prepared statement arguments would exceed the limit.
2) It fixes an integer overflow in INSERT statement deparsing if
the number of arguments is greater than 16k.
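A rough illustration of the recalculation, assuming for the sake of the
example that the relevant cap is the PostgreSQL wire protocol's limit
of 65535 bind parameters and a hypothetical 100-column table:

    -- Each row in a batched INSERT consumes one parameter per column,
    -- so the batch size must not exceed limit / columns (here 655 rows).
    SELECT 65535 / 100 AS max_rows_per_batch;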
Add test for grant propagation when attaching a data node to a table.
Function `data_node_attach` already calls
`hypertable_assign_data_nodes`, which assigns data nodes, so grants are
properly propagated to data nodes when they are attached.
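A usage sketch of the scenario being tested (table, role, and node
names are hypothetical):

    -- Grants given on the distributed hypertable...
    GRANT SELECT, INSERT ON conditions TO app_role;
    -- ...should also apply on a data node attached afterwards.
    SELECT attach_data_node('dn3', 'conditions');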
This change introduces a fix and adds support for serial columns
on distributed hypertables.
It fixes issue #1663.
A SERIAL type is essentially syntactic sugar that automatically
creates a SEQUENCE object and makes it dependent on the column. It
also sets a DEFAULT expression on the column that uses the sequence.
The idea behind the fix is to avoid using the default expression
when deparsing and recreating tables on data nodes when the column
has a dependent sequence object.
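For reference, a SERIAL column is roughly equivalent to the following
(names hypothetical); the fix skips recreating the DEFAULT expression
when the table is deparsed for the data nodes:

    CREATE SEQUENCE devices_id_seq;
    CREATE TABLE devices (
        id integer NOT NULL DEFAULT nextval('devices_id_seq'),
        time timestamptz NOT NULL
    );
    ALTER SEQUENCE devices_id_seq OWNED BY devices.id;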
User certificates and keys for logging into data nodes are stored at
the top level of the `ssl_dir` or in the data directory. This can cause
some confusion, since many files named after users, resembling existing
configuration files, will be created as users are added. This commit
therefore changes the location of the user certificates and keys to the
`timescaledb/certs` subdirectory of either the `ssl_dir` or the data
directory.
In addition, since user names can contain strange characters (quoted
names are allowed as role names and can contain anything), the commit
changes the names of certificates and keys to use the MD5 sum of the
user name, as a hex string, for the base name of the files. This
prevents strange user names from being used to access files outside the
certificate directory.
The subdirectory is currently hardcoded.
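A sketch of how the base file name can be derived from an arbitrary
role name (the exact file naming is an implementation detail):

    -- Quoted role names may contain slashes, dots, and quotes; the MD5
    -- hex digest cannot, so it cannot escape the certificate directory.
    SELECT md5('strange "role"/../name') AS cert_base_name;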
This change makes it possible to use the ALTER TABLE SET/RESET, SET
OIDS, and SET WITHOUT OIDS clauses with a distributed hypertable.
This PR has two effects:
1. It avoids having to copy storage options to foreign table chunks
when their objects are created on the access node. The command updates
only the root table options on the access node and passes the command
on for execution on the data nodes.
2. It prevents distributed hypertable chunks from being updated in the
'ddl_command_end' event trigger on the access node, because PostgreSQL
does not support altering storage options for foreign tables.
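A usage sketch on a distributed hypertable (table name hypothetical):

    ALTER TABLE conditions SET (fillfactor = 70);
    ALTER TABLE conditions RESET (fillfactor);
    ALTER TABLE conditions SET WITHOUT OIDS;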
This change makes it possible to deparse and include the main table's
storage options in the CREATE TABLE command that is executed during
the create_distributed_hypertable() call.
This change fixes an issue with port conversion in the `add_data_node`
command that results in an error when a port is not explicitly given
and PostgreSQL is configured to use a high port number. Note that this
issue does _not_ occur when the port number is given as an explicit
argument to `add_data_node`.
The underlying issue is that, without an explicit port number, the
remote port is assumed to be the same as the port configured for the
local server instance. The conversion of that port number was done
using a _signed_ two-byte integer, while the valid port range fits
within an _unsigned_ two-byte integer.
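An illustration of the ranges involved: valid port numbers go up to
65535, while a signed two-byte integer overflows above 32767:

    SELECT 32767 AS max_signed_smallint, 65535 AS max_port_number;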
To exercise higher port ranges without an explicit argument to
`add_data_node`, the default port for test instances has been changed
to a high port number that overflows a small signed integer.
This initial implementation makes it possible to deparse the LIMIT
clause and include it in the push-down query sent to the data nodes.
The current implementation is quite restrictive and allows LIMIT only
for simple queries without aggregates, or in conjunction with an
ORDER BY clause.
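A sketch of a query whose LIMIT can now be pushed down to the data
nodes (no aggregates; names hypothetical):

    SELECT time, device, temp
      FROM conditions
     ORDER BY time DESC
     LIMIT 100;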
Initial support for compression on distributed hypertables. This
_only_ includes the ability to run `compress_chunk` and
`decompress_chunk` on a distributed hypertable. There is no support
for automation, at least not beyond what one can do individually on
each data node.
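A usage sketch from the access node (names hypothetical), assuming
compression has been enabled on the distributed hypertable in the
usual way:

    ALTER TABLE conditions SET (timescaledb.compress);
    SELECT compress_chunk(chunk)
      FROM show_chunks('conditions', older_than => interval '7 days') AS chunk;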
Note that an access node keeps no local metadata about which
distributed hypertables have compressed chunks. This information needs
to be fetched directly from data nodes, although such functionality is
not yet implemented. For example, informational views on the access
nodes will not yet report the correct compression states for
distributed hypertables.
This change adds a new case to the data_node test that verifies that
attaching a data node to a hypertable on a data node will fail (as
hypertables are not marked as distributed on data nodes).
Implements the SQL function set_replication_factor, which changes the
replication factor of a distributed hypertable. The change of the
replication factor doesn't affect existing chunks. Newly created
chunks are replicated according to the new replication factor.
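A usage sketch (hypertable name hypothetical):

    -- New chunks will be replicated to two data nodes; existing chunks
    -- keep their current replication.
    SELECT set_replication_factor('conditions', 2);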
This change adds a new parameter to the detach_data_node and
delete_data_node functions that allows the user to automatically
shrink the space dimension of affected hypertables to match the
number of nodes.
Prior to this change, attempting to add a dimension to a distributed
hypertable that currently or previously contained data would fail
with an opaque error. This change properly checks distributed
hypertables when adding dimensions and prints appropriate errors.
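A sketch of the case that now reports a proper error (names
hypothetical):

    -- Fails with a clear error if the distributed hypertable currently
    -- or previously contained data.
    SELECT add_dimension('conditions', 'device', number_partitions => 3);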
It is not possible to properly reference another table from a
distributed hypertable since this would require replication of the
referenced table.
This commit adds a warning message when a distributed hypertable
attempts to reference any other table using a foreign key.
This fixes a couple of warnings about unused variables used for assert
checking that appear in release builds. The `PG_USED_FOR_ASSERTS_ONLY`
attribute has been applied to the variable declarations to quench the
warnings.
This release includes user experience improvements for managing data
nodes, more efficient statistics collection for distributed
hypertables, and miscellaneous fixes and improvements.
Previously, the memory context for the chunk insert state was freed
using a reset callback on the per-tuple context. This created an
unfortunate cyclic dependency between memory contexts, since both the
per-tuple context and chunk insert state shared the same memory
context parent (the query memory context).
Thus, when deletion happens by calling MemoryContextDelete on the
parent, without having deleted the children first, the parent could
first delete the chunk insert state child, followed by the per-tuple
context which then tried to delete the chunk insert state again.
A better way to handle this is to simply switch the parent of the
chunk insert state's memory context to be the per-tuple context as
long as it is still valid, thus breaking the cycle.
The problem lies in incorrect handling of the reset callback for the
ChunkInsertState during the transaction abort procedure, which frees
the parent memory context.
Because reset callbacks are executed only after all child memory
contexts have been deleted, it is possible to end up in a situation
where the context is already deleted before the callback function is
called.
Tests that run `ANALYZE` on distributed hypertables are susceptible to
non-deterministic behavior because `ANALYZE` does random sampling and
might have a different seed depending on the data node. This change
fixes the issue by running `setseed()` on all data node sessions before
`ANALYZE` is run.
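A sketch of what each data node session now does before statistics are
gathered (table name hypothetical):

    SELECT setseed(1);
    ANALYZE conditions;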
Unfortunately, while this makes the behavior on a specific machine
more deterministic, it doesn't produce the exact same statistics
across different machines and/or C-libraries since those might have
different PRNG implementations for `random()`.
When fetching remote column statistics (`pg_statistic`) from data
nodes, the `stanumbers` field was not turned into an array
correctly. This caused values to be corrupted when importing them to
the access node. This issue has been fixed along with some compiler
warning issues (e.g., mixed declaration and code).
This change adds a new command to return a subset of the column
stats for a hypertable (column width, percent null, and percent
distinct). As part of the execution of this command on an access
node, these stats will be collected for distributed chunks and
updated on the access node.
The `postgres` database might not exist on a data node, but
`template1` will always exist, so if a connection using `postgres`
fails, we use `template1` as a secondary database.
This is similar to how `connectMaintenanceDatabase` in the PostgreSQL
code base works.
Tests for queries on distributed hypertables are now consolidated in
the `dist_query` test. Not only does this test provide more consistent
EXPLAIN output, but it also runs all queries against different types
of tables holding the same data, including comparing the result output
with `diff`.
The different types of tables compared are:
- Regular table (for reference)
- One-dimensional distributed hypertable
- Two-dimensional distributed hypertable (which is partially
repartitioned)
EXPLAINs are provided on the two-dimensional table showing the effect
on plans when querying repartitioned time ranges. In most cases, full
push-down is not possible for such queries.
In addition to test refactoring, this change includes a fix for
handling `HAVING` clauses in remote partialized queries. Such clauses
should not be sent to the remote end in case of partial queries since
any aggregates in the `HAVING` clause must be returned in the result
target list. Fortunately, modification of the target list is already
taken care of by the main planner.
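A sketch of a partially pushed-down aggregate with a HAVING clause
(names hypothetical): the partial aggregates are computed on the data
nodes, while the HAVING qual is evaluated on the access node after
finalization:

    SELECT device, avg(temp)
      FROM conditions
     GROUP BY device
    HAVING avg(temp) > 20;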
This change makes the dist_query test PG version-specific in
preparation for test changes that will produce different output
between, e.g., PG11 and PG12.
The tuple expression context's memory is not properly reset during
chunk dispatch execution, which eventually consumes all available
memory during query execution:
INSERT INTO test_table
SELECT now() - random() * interval '2 years', (i/100)::text, random()
FROM
generate_series(1,700000) AS sub(i);
This problem does not reproduce for distributed hypertables with
batching disabled, nor for regular hypertables, because the tuple
expression context luckily gets freed during ModifyTable node
execution.
When printing paths in debug mode, the "considered" paths saved in the
private rel info is a superset of the paths in the rel's pathlist. The
result of this is that many paths are printed multiple times.
This change makes sure we only print the "diff", i.e., the pruned paths
that were considered but are no longer in the pathlist.
A number of other issues with the debug output have also been
addressed, like consistent function naming and being more robust when
printing rels that might not have `fdw_private` set.
In the case when there are no stats (number of tuples/pages), we can
use two approaches to estimate the relation size: interpolate the
relation size using stats from previous chunks (if they exist), or
estimate it using the shared buffer size (the shared buffer size
should align with the chunk size).
Before this commit, an existing extension would cause an error and
abort the addition of a data node if `bootstrap` was `true`. To allow
the extension to already exist on the data node, this commit will first
check if the extension exists on the data node. If the extension
exists, it will be validated; otherwise, the extension will be created
on the data node.
To better understand the choices that the planner makes, we need to
print all the paths (with costs) that the planner considered. Otherwise
it might be hard to understand why a certain path is not picked (e.g.,
due to high startup/total costs), since it will never show up in the
relation path list that we print. This should help while working on
improving the distributed cost model. This fix focuses only on paths
that involve data nodes.
If a table contains a column with the same name as the table name, the
query parser will get confused when parsing the `chunks_in` function.
The parser would think that we are passing in a column instead of a
table. Using a qualified table name fixes this problem. Note that we
needed to expand the table using .* in order to avoid parser confusion
caused by the schema.table syntax.
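A sketch of the deparsed form (names hypothetical), where the table
reference is schema-qualified and expanded with `.*` to avoid the
ambiguity:

    SELECT *
      FROM public.metrics
     WHERE _timescaledb_internal.chunks_in(public.metrics.*, ARRAY[1, 2]);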
Currently, if the major version of the extension on the access node is
later than the version of the extension on the data node, the data node
is accepted. Since major versions are not compatible, it should not be
accepted.
Changed the check to only accept the data node if:
- The major version is the same on the data node and the access node.
- The minor version on the data node is the same as or earlier than on
the access node.
In addition, the code will print a warning if the version on the data
node is older than the version on the access node.
This change includes telemetry fixes that extend HypertablesStat
with num_hypertables_compressed. It also updates how the number of
regular hypertables is calculated: a regular hypertable is now one that
is neither compressed nor related to continuous aggregates.
Previously, when a data node needed bootstrapping and the database to
be bootstrapped already existed, it was blindly assumed to be
configured correctly. With this commit, we validate the database if it
already exists before proceeding and raise an error if it is not
correctly configured.
When validating the data node and bootstrap is `true`, we are connected
to the `postgres` database rather than the database to validate.
This means that we cannot use `current_database()` and instead pass the
database name as a parameter to `data_node_validate_database`.
Before this commit, grants and revokes were not propagated to data
nodes. After this commit, grants and revokes on a distributed
hypertable are propagated to the data nodes of the hypertable.
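A sketch of the new behavior (names hypothetical):

    -- Now also executed on each data node of the distributed hypertable.
    GRANT SELECT, INSERT ON conditions TO app_role;
    REVOKE INSERT ON conditions FROM app_role;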
When creating a hypertable, grants were not propagated to the table on
the remote node, which causes later statements to fail when not
executed as the owner of the table.
This commit deparses grant statements from the table definition and
adds the grants to the deparsed statement sent when creating the table
on the data node.
Since a NULL value for the replication factor in SQL DDL now
corresponds to HYPERTABLE_REGULAR, which is different from
HYPERTABLE_DISTRIBUTED_MEMBER, there is no need to check for a non-NULL
value; comparing with HYPERTABLE_DISTRIBUTED_MEMBER is enough.
Test code in remote_exec fails to build due to a maybe-uninitialized
error on a string variable in the 32-bit Alpine package build. This fix
moves the initialization to the string variable's declaration and
refactors a loop to have a single place with the exit condition, which
checks for both a NULL value and an empty string.