214 Commits

Sven Klemm
789bb26dfb Lock down search_path in SPI calls 2023-02-01 07:54:03 +01:00
Nikhil Sontakke
c92e29ba3a Fix DML HA in multi-node
If a datanode goes down for whatever reason then DML activity to
chunks residing on (or targeted to) that DN will start erroring out.
We now handle this by marking the target chunk as "stale" for this
DN by changing the metadata on the access node. This allows us to
continue to do DML to replicas of the same chunk data on other DNs
in the setup. This obviously will only work for chunks which have
"replication_factor" > 1. Note that chunks which do not undergo any
change will continue to carry the appropriate DN-related metadata on
the AN.

This means that such "stale" chunks will become underreplicated and
need to be re-balanced using the copy_chunk functionality, e.g., by a
microservice or some such external process.
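
A minimal sketch of the kind of setup this relies on, assuming a
distributed hypertable named `conditions` (hypothetical):

```
-- With "replication_factor" > 1, each chunk has replicas on several
-- DNs, so DML can continue on the remaining replicas if one DN is down.
SELECT create_distributed_hypertable('conditions', 'time',
                                     replication_factor => 2);
```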

Fixes #4846
2022-11-25 17:42:26 +05:30
Erik Nordström
f13214891c Add function to alter data nodes
Add a new function, `alter_data_node()`, which can be used to change
the data node's configuration originally set up via `add_data_node()`
on the access node.

The new function introduces a new option "available" that allows
configuring the availability of the data node. Setting
`available=>false` means that the node should no longer be used for
reads and writes. Only read "failover" is implemented as part of this
change, however.

To fail over reads, the alter data node function finds all the chunks
for which the unavailable data node is the "primary" query target and
"fails over" to a chunk replica on another data node instead. If some
chunks do not have a replica to fail over to, a warning will be
raised.

When a data node is available again, the function can be used to
switch back to using the data node for queries.
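
An illustrative use of the new function (data node name hypothetical):

```
-- Mark the node as unavailable; reads fail over to chunk replicas.
SELECT alter_data_node('dn1', available => false);
-- Once the node is reachable again, switch queries back to it.
SELECT alter_data_node('dn1', available => true);
```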

Closes #2104
2022-11-11 13:59:42 +01:00
Fabrízio de Royes Mello
f1535660b0 Honor usage of OidIsValid() macro
The Postgres source code defines the macro `OidIsValid()` to check if an
Oid is valid or not (i.e., comparing against `InvalidOid`). See
`src/include/c.h` in the Postgres source tree.

Changed all direct comparisons against `InvalidOid` to use the
`OidIsValid()` call and added a Coccinelle check to make sure future
changes will use it correctly.
2022-11-03 16:10:50 -03:00
Ante Kresic
2475c1b92f Roll up uncompressed chunks into compressed ones
This change introduces a new option to the compression procedure which
decouples the uncompressed chunk interval from the compressed chunk
interval. It does this by allowing multiple uncompressed chunks to be
rolled up into one compressed chunk as part of the compression
procedure. The main use-case
is to allow much smaller uncompressed chunks than compressed ones. This
has several advantages:
- Reduce the size of btrees on uncompressed data (thus allowing faster
inserts because those indexes are memory-resident).
- Decrease disk-space usage for uncompressed data.
- Reduce number of chunks over historical data.

From a UX point of view, we simply add a compression WITH clause option
`compress_chunk_time_interval`. The user should set that according to
their needs for constraint exclusion over historical data. Ideally, it
should be a multiple of the uncompressed chunk interval and so we throw
a warning if it is not.
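
A sketch of the new option, assuming a hypertable `metrics` with a
1 hour chunk interval (names hypothetical):

```
-- Roll up uncompressed 1 hour chunks into 24 hour compressed chunks.
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_chunk_time_interval = '24 hours'
);
```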
2022-11-02 15:14:18 +01:00
Alexander Kuzmenkov
313845a882 Enable -Wextra
Our code mostly has warnings about comparisons between values of
different signedness.
2022-10-27 16:06:58 +04:00
Alexander Kuzmenkov
864da20cee Build on Ubuntu 22.04
It has newer GCC which should detect more warnings.
2022-10-26 23:32:05 +04:00
Mats Kindahl
276d3a331d Add macro to assert or error
For some unexpected conditions, we have a check and generate an error.
Since this always generates an error, it is more difficult to find the
bug than if an assert had fired and generated a core dump. Conversely,
some asserts can trigger in production builds and lead to strange
situations causing a crash. For those cases we should instead generate
an error.

This commit introduces a macro `Ensure` that will result in an assert
in debug builds, but an error message in release build. This macro
should only be used for conditions that should not occur during normal
runtime, but which can happen in odd corner-cases in release builds and
therefore warrant an error message.

It also replaces some existing checks with such errors to demonstrate
usage.
2022-10-20 13:35:09 +02:00
Alexander Kuzmenkov
7758f5959c Update .clang-format for version 14
The only configuration we're missing is the newline for braces after
case labels. The rest of the differences look like bugs/omissions in
version 8, which we use now.

Require clang-format-14 in cmake and use it in the CI check. We can't
support versions earlier than 14 because they have some
formatting differences that can't be configured.
2022-10-10 17:12:36 +03:00
Bharathy
d00a55772c error compressing wide table
Consider a compressed hypertable with many columns (e.g., more than 600).
In a call to compress_chunk(), the compressed tuple size can exceed 8K,
which causes an error like "row is too big: size 10856, maximum size 8160."

This patch estimates the tuple size of the compressed hypertable and
reports a warning when compression is enabled on the hypertable, so the
user becomes aware of the issue before calling compress_chunk().

Fixes #4398
2022-09-17 11:24:23 +05:30
Bharathy
b869f91e25 Show warnings during create_hypertable().
The schema of the base table on which a hypertable is created should
define columns with proper data types. As per the Postgres best
practices Wiki (https://wiki.postgresql.org/wiki/Don't_Do_This), one
should not define columns with CHAR, VARCHAR, or VARCHAR(N); instead use
the TEXT data type. Similarly, instead of using timestamp, one should
use timestamptz. This patch reports a WARNING to the end user when
creating a hypertable if the underlying parent table has columns of the
above-mentioned data types.
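
An illustrative pair of definitions (table and column names
hypothetical):

```
-- Creating a hypertable on this table triggers the new WARNINGs:
CREATE TABLE bad_readings (time timestamp, device varchar(32));
SELECT create_hypertable('bad_readings', 'time');
-- The recommended data types avoid them:
CREATE TABLE good_readings (time timestamptz, device text);
SELECT create_hypertable('good_readings', 'time');
```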

Fixes #4335
2022-09-12 18:47:47 +05:30
Dmitry Simonenko
c697700add Add hypertable distributed argument and defaults
This PR introduces a new `distributed` argument to the
create_hypertable() function as well as two new GUCs to
control its default behaviour: timescaledb.hypertable_distributed_default
and timescaledb.hypertable_replication_factor_default.

The main idea of this change is to allow automatic creation
of distributed hypertables by default.
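
A sketch of how this might be used (the GUC value shown is an
assumption; table name hypothetical):

```
-- Make new hypertables distributed by default.
SET timescaledb.hypertable_distributed_default = 'distributed';
SET timescaledb.hypertable_replication_factor_default = 2;
-- Or request it explicitly per table via the new argument:
SELECT create_hypertable('conditions', 'time', distributed => true);
```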
2022-08-29 17:44:16 +03:00
Alexander Kuzmenkov
51259b31c4 Fix OOM in large INSERTs
Do not allocate various temporary data in PortalContext, such as the
hyperspace point corresponding to the row, or the intermediate data
required for chunk lookup.
2022-08-23 19:40:51 +03:00
Erik Nordström
025bda6a81 Add stateful partition mappings
Add a new metadata table `dimension_partition` which explicitly and
statefully details how a space dimension is split into partitions, and
(in the case of multi-node) which data nodes are responsible for
storing chunks in each partition. Previously, partition and data nodes
were assigned dynamically based on the current state when creating a
chunk.

This is the first in a series of changes that will add more advanced
functionality over time. For now, the metadata table simply writes out
what was previously computed dynamically in code. Future code changes
will alter the behavior to do smarter updates to the partitions when,
e.g., adding and removing data nodes.

The idea of the `dimension_partition` table is to minimize changes in
the partition to data node mappings across various events, such as
changes in the number of data nodes, number of partitions, or the
replication factor, which affect the mappings. For example, increasing
the number of partitions from 3 to 4 currently leads to redefining all
partition ranges and data node mappings to account for the new
partition. Complete repartitioning can be disruptive to multi-node
deployments. With stateful mappings, it is possible to split an
existing partition without affecting the other partitions (similar to
partitioning using consistent hashing).

Note that the dimension partition table expresses the current state of
space partitions; i.e., the space-dimension constraints and data nodes
to be assigned to new chunks. Existing chunks are not affected by
changes in the dimension partition table, although an external job
could rewrite, move, or copy chunks as desired to comply with the
current dimension partition state. As such, the dimension partition
table represents the "desired" space partitioning state.
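
An illustrative way to inspect the desired partitioning state (a
sketch; assumes direct access to the catalog schema):

```
-- The mappings written out by this change live in the new catalog table:
SELECT * FROM _timescaledb_catalog.dimension_partition;
```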

Part of #4125
2022-08-02 11:38:32 +02:00
Alexander Kuzmenkov
3c56d3eceb Faster lookup of chunks by point
Don't keep the chunk constraints while searching. The number of
candidate chunks can be very large, so keeping these constraints is a
lot of work and uses a lot of memory. For finding the matching chunk,
it is enough to track the number of dimensions that matched a given
chunk id. After finding the chunk id, we can look up only the matching
chunk data with the usual function.

This saves some work when doing INSERTs.
2022-06-07 18:10:20 +05:30
Alexander Kuzmenkov
9012e2a20d Do not create a memory context for each Chunk
For some reason, we create a MemoryContext for each Chunk. This context
then is almost never used. Just don't do this.
2022-05-02 17:49:00 +05:30
Josh Soref
68aec9593c Fix various misspellings
This patch fixes various misspellings of committed, constraint and
insufficient in code, comments and documentation.
2022-04-22 11:06:52 +02:00
Mats Kindahl
f5fd06cabb Ignore invalid relid when deleting hypertable
When running `performDeletion` it is necessary to have a valid relation
id, but when doing a lookup using `ts_hypertable_get_by_id` this might
actually return a hypertable entry pointing to a table that does not
exist because it has been deleted previously. In this case, only the
catalog entry should be removed, but it is not necessary to delete the
actual table.

This scenario can occur if both the hypertable and a compressed table
are deleted as part of running a `sql_drop` event, for example, if a
compressed hypertable is defined inside an extension. In this case, the
compressed hypertable (indeed all tables) will be deleted first, and
the lookup of the compressed hypertable will find it in the metadata
but a lookup of the actual table will fail since the table does not
exist.

Fixes #4140
2022-03-14 14:03:49 +01:00
Erik Nordström
e56b95daec Add telemetry stats based on type of relation
Refactor the telemetry function and format to include stats broken
down on common relation types. The types include:

- Tables
- Partitioned tables
- Hypertables
- Distributed hypertables
- Continuous aggregates
- Materialized views
- Views

and for each of these types report (when applicable):

- Total number of relations
- Total number of children/chunks
- Total data volume (broken into heap, toast, and indexes).
- Compression stats
- PG stats, like reltuples

The telemetry function has also been refactored to return `jsonb`
instead of `text`. This makes it easier to query and manipulate the
resulting JSON format, and also gives cleaner output.
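
For example, the report can now be inspected and manipulated as JSON (a
sketch using the existing `get_telemetry_report` entry point):

```
SELECT jsonb_pretty(get_telemetry_report());
```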

Closes #3932
2022-02-08 09:44:55 +01:00
gayyappan
9f64df8567 Add ts_catalog subdirectory
Move files that are related to timescaledb catalog
access to this subdirectory
2022-01-24 16:58:09 -05:00
Sven Klemm
39645d56da Fix subtract_integer_from_now on 32-bit platforms
This patch fixes subtract_integer_from_now on 32-bit platforms,
improves error handling and adds some basic tests.
subtract_integer_from_now would trigger an assert when called
on a hypertable without an integer time dimension (found by sqlsmith).
Additionally subtract_integer_from_now would segfault when called
on a hypertable without partitioning dimensions.
2021-12-20 10:02:57 +01:00
Sven Klemm
7d9ea6237c Remove namein calls from scankey initialization
A lot of our scankey initialization when scanning indexes for
a name had a superfluous namein call. This patch removes those
unnecessary calls.
2021-11-03 14:56:32 +01:00
Dmitry Simonenko
3d11927567 Rework distributed DDL processing logic
This patch refactors and reworks the logic behind the
dist_ddl_preprocess() function.

The idea behind it is to simplify the process by splitting each DDL
command's logic into a separate function and to avoid relying on the
hypertable list count to make decisions.

This change makes it easier to process more complex commands (such as
GRANT) that require a query rewrite or execution on different data
nodes. Additionally, it makes the code easier to follow and more
similar to the main code path inside src/process_util.c.
2021-10-29 16:15:58 +03:00
Nikhil Sontakke
68697859df Fix GRANT/REVOKE ALL IN SCHEMA handling
Fix the "GRANT/REVOKE ALL IN SCHEMA" handling uniformly across
single-node and multi-node.

Even though this is a SCHEMA-specific activity, we decided to
include the chunks even if they are part of another SCHEMA. So
they will also end up getting/resetting the same privileges.

Includes test case changes for both single-node and multi-node use
cases.
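
An illustrative pair of statements covered by the fix (role name
hypothetical):

```
-- Chunks are now included even when they live in another schema.
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;
REVOKE SELECT ON ALL TABLES IN SCHEMA public FROM readonly_user;
```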
2021-10-22 16:48:16 +05:30
Sven Klemm
f0ba729acb Adjust code to PG14 es_result_rel_info removal
PG14 removes es_result_relation_info from executor state.

https://github.com/postgres/postgres/commit/a04daa97
2021-10-04 13:26:22 +02:00
Sven Klemm
36a82d0851 Fix compiler warning about missing braces
Older gcc versions will throw a warning about missing braces when
a nested struct is initialized with {0}.
2021-08-17 18:39:54 +02:00
Erik Nordström
98110af75b Constify parameters and return values of core APIs
Harden core APIs by adding the `const` qualifier to pointer parameters
and return values passed by reference. Adding `const` to APIs has
several benefits and potentially reduces bugs.

* Allows core APIs to be called using `const` objects.
* Callers know that objects passed by reference are not modified as a
  side-effect of a function call.
* Returning `const` pointers enforces "read-only" usage of pointers to
  internal objects, forcing users to copy objects when mutating them
  or using explicit APIs for mutations.
* Allows compiler to apply optimizations and helps static analysis.

Note that these changes are so far only applied to core API
functions. Further work can be done to improve other parts of the
code.
2021-06-14 22:09:10 +02:00
Matvey Arye
b72dab16c0 Add some more randomness to chunk assignment
Previously the assignment of data nodes to chunks had a bit
of a thundering-herd problem for multiple hypertables
without space partitions: the data node assigned for the
first chunk was always the same across hypertables.
We fix this by adding the hypertable_id to the
index into the datanode array. This de-synchronizes
across hypertables but maintains consistency for any
given hypertable.

We could make this consistent for space partitioned tables
as well but avoid doing so now to prevent partitions
jumping nodes due to this change.

This also affects tablespace selection in the same way.
2021-06-08 14:04:23 +02:00
Sven Klemm
fb863f12c7 Remove support for PG11
Remove support for compiling against PostgreSQL 11. This patch also
removes PG11 specific compatibility macros.
2021-06-01 20:21:06 +02:00
Sven Klemm
99ffe8fd6c Fix blocking triggers with transition tables
Block creation of triggers with transition tables on trigger creation
instead of erroring out when a chunk is created.
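
An illustrative trigger that is now rejected at creation time (table
and function names hypothetical):

```
-- Transition tables (the REFERENCING clause) are unsupported on
-- hypertables, so this now errors at CREATE TRIGGER time.
CREATE TRIGGER track_inserts
    AFTER INSERT ON conditions
    REFERENCING NEW TABLE AS new_rows
    FOR EACH STATEMENT EXECUTE FUNCTION log_new_rows();
```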

Fixes #3234
2021-05-20 21:55:26 +02:00
Sven Klemm
eef71fdfb1 Replace StrNCpy with strlcpy
PG14 removes StrNCpy and some Name helper functions.

https://github.com/postgres/postgres/commit/1784f278a6
2021-05-20 08:54:54 +02:00
Ruslan Fomkin
639aef76a4 Refactor chunk creation for future extension
Separates chunk preparation and metadata update. Separates preparation
of constraint names, since there is no overlap between preparing names
for dimension constraints and other constraints. Factors out creation
of the JSON string describing the dimension slices of a chunk.

This refactoring is preparation for implementing new functionalities.
2021-04-06 14:02:22 +02:00
gayyappan
f649736f2f Support ADD COLUMN for compressed hypertables
Support ALTER TABLE .. ADD COLUMN <colname> <typname>
for hypertables with compressed chunks.
2021-01-14 09:32:50 -05:00
Erik Nordström
ec82d16154 Remove unused field for ignoring invalidations
Remove an unused field called `max_ignore_invalidation_older_than`
that was left in the `Hypertable` struct after refactoring of
continuous aggregates.
2020-12-21 13:46:43 +01:00
gayyappan
bca1e35a52 Support disabling compression on distributed hypertables
This change allows a user to execute
ALTER TABLE hyper SET (timescaledb.compress = false)
on distributed hypertables.

Fixes #2716
2020-12-08 12:08:54 -05:00
gayyappan
d41aa2aff5 Rename macro TS_HYPERTABLE_HAS_COMPRESSION
This PR cleans up some macro names.
Rename macro TS_HYPERTABLE_HAS_COMPRESSION
to TS_HYPERTABLE_HAS_COMPRESSION_TABLE.
2020-12-02 15:44:48 -05:00
gayyappan
7c76fd4d09 Save compression settings on access node for distributed hypertables
1. Add a compression_state column to the hypertable catalog by
renaming the catalog table's compressed column.
compression_state is a tri-state column: it indicates whether the
hypertable has compression enabled (value = 1) or is an internal
compression table (value = 2).

2. Save compression settings on the access node when compression
is turned on for a distributed hypertable. For a distributed
hypertable that has compression enabled, compression_state is set,
but we don't create any internal compression tables on the access
node.

Fixes #2660
2020-12-02 10:42:57 -05:00
Dmitry Simonenko
bbcb2b22fa Fix SCHEMA DROP CASCADE with continuous aggregates
This change fixes the situation when a schema object is dropped before
the cagg, which leads to an error when the cagg tries to resolve it.

Issue: #2350
2020-11-10 11:46:19 +03:00
Mats Kindahl
e9cb14985e Read function name dynamically
In some cases the function name is hard-coded in the C function, so
this commit instead defines and uses a macro that extracts the
function name from the `fcinfo` structure. This prevents mismatches
between the hard-coded names and the actual function name.

Closes #2579
2020-10-21 15:03:32 +02:00
Erik Nordström
3cf9c857c4 Make errors and messages conform to style guide
Errors and messages are overhauled to conform to the official
PostgreSQL style guide. In particular, the following things from the
guide have been given special attention:

* Correct capitalization of first letter: capitalize only hint
  and detail messages.
* Correct handling of periods at the end of messages (should be elided
  for primary message, but not detail and hint messages).
* The primary message should be short, factual, and avoid reference to
  implementation details such as specific function names.

Some messages have also been reworded for clarity and to better
conform with the last bullet above (short primary message). In other
cases, messages have been updated to fix references to, e.g., function
parameters that used the wrong parameter name.

Closes #2364
2020-10-20 16:49:32 +02:00
Erik Nordström
2cc2df23bd Lock dimension slices when creating new chunk
This change makes two changes to address issues with processes doing
concurrent inserts and `drop_chunks` calls:

- When a new chunk is created, any dimension slices that existed prior
  to creating the new chunk are locked to prevent them from being
  dropped before the chunk-creating process commits.

- When a chunk is being dropped, concurrent inserts into the chunk
  that is being dropped will try to lock the dimension slices of the
  chunk. In case the locking fails (due to the slices being
  concurrently deleted), the insert process will treat the chunk as
  not existing and will instead recreate it. Previously, the chunk
  slices (and thus chunk) would be found, but the insert would fail
  when committing since the chunk was concurrently deleted.

A prior commit (PR #2150) partially solved a related problem, but
didn't lock all the slices of a chunk. That commit also threw an error
when a lock on a slice could not be taken due to the slice being
deleted by another transaction. This is now changed to treat that case
as a missing slice instead, causing it to be recreated.

Fixes #1986
2020-10-15 21:56:10 +02:00
Mats Kindahl
da97ce6e8b Make function parameter names consistent
Renaming the parameter `hypertable_or_cagg` in functions `drop_chunks`
and `show_chunks` to `relation`, and changing the parameter name from
`main_table` to `hypertable` or `relation` depending on context.
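
An illustrative call with the renamed parameter (table name
hypothetical):

```
SELECT drop_chunks(relation => 'conditions',
                   older_than => INTERVAL '3 months');
```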
2020-10-02 08:52:20 +02:00
Erik Nordström
c884fe43f0 Remove unused materializer code
The new refresh functionality for continuous aggregates replaces the
old materializer, which means some code is no longer used and should
be removed.

Closes #2395
2020-09-15 17:18:59 +02:00
Erik Nordström
f49492b83d Cap invalidation threshold at last data bucket
When refreshing with an "infinite" refresh window going forward in
time, the invalidation threshold is also moved forward to the end of
the valid time range. This effectively renders the invalidation
threshold useless, leading to unnecessary write amplification.

To handle infinite refreshes better, this change caps the refresh
window at the end of the last bucket of data in the underlying
hypertable, so as not to move the invalidation threshold further than
necessary. For instance, if the max time value in the hypertable is
11, a refresh command such as:

```
CALL refresh_continuous_aggregate(NULL, NULL);
```
would be turned into
```
CALL refresh_continuous_aggregate(NULL, 20);
```

assuming that a bucket starts at 10 and ends at 20 (exclusive). Thus
the invalidation threshold would at most move to 20, allowing the
threshold to still do its work once time again moves forward and
beyond it.

Note that one must never process invalidations beyond the invalidation
threshold without also moving it, as that would clear that area from
invalidations and thus prohibit refreshing that region once the
invalidation threshold is moved forward. Therefore, if we do not move
the threshold further than a certain point, we cannot refresh beyond
it either. An alternative, and perhaps safer, approach would be to
always invalidate the region over which the invalidation threshold is
moved (i.e., new_threshold - old_threshold). However, that is left for
a future change.

It would be possible to also cap non-infinite refreshes, e.g.,
refreshes that end at a higher time value than the max time value in
the hypertable. However, when an explicit end is specified, it might
be on purpose so optimizing this case is also left for the future.

Closes #2333
2020-09-09 19:46:28 +02:00
Mats Kindahl
4f32439362 Update tablespace of table on attach and detach
If a tablespace is attached to a hypertable, the tablespace of the
hypertable is not set; but if the hypertable's tablespace is set, that
tablespace is also attached. A similar situation occurs when
tablespaces are detached.
This means that if a hypertable is created with a tablespace and then
all tablespaces are detached, the chunks will still be put in the
tablespace of the hypertable.

With this commit, attaching a tablespace to a hypertable will set the
tablespace of the hypertable if it does not already have one. Detaching
a tablespace from a hypertable will set the tablespace to the default
tablespace if the tablespace being detached is the tablespace for the
hypertable.

If `detach_tablespace` is called with only a tablespace name, it will
be detached from all tables it is attached to. This commit ensures that
the tablespace for the hypertable is set to the default tablespace if
it was set to the tablespace being detached.
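
Illustrative calls (tablespace and table names hypothetical):

```
SELECT attach_tablespace('fast_ssd', 'conditions');
-- Detach from this hypertable only; if it was the hypertable's
-- tablespace, the hypertable is reset to the default tablespace.
SELECT detach_tablespace('fast_ssd', 'conditions');
-- Or detach the tablespace from all tables it is attached to:
SELECT detach_tablespace('fast_ssd');
```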

Fixes #2299
2020-09-09 09:30:07 +02:00
Erik Nordström
4538fc6c40 Optimize continuous aggregate refresh
This change ensures a refresh of a continuous aggregate only
re-materializes the part of the aggregate that has been
invalidated. This makes refreshing much more efficient, and sometimes
eliminates the need to materialize data entirely (i.e., in case there
are no invalidations in the refresh window).

The ranges to refresh are the remainders of invalidations after they
are cut by the refresh window (i.e., all invalidations, or parts of
invalidations, that fall within the refresh window). The invalidations
used for a refresh are collected in a tuple store (which spills to
disk) so as not to allocate too much memory in case of many
invalidations. Invalidations are, however, merged and deduplicated
before being added to the tuplestore, similar to how invalidations are
processed in the invalidation logs.

Currently, the refreshing proceeds with just materializing all
invalidated ranges in the order they appear in the tuple store, and
the ordering does not matter since all invalidated regions are
refreshed in the same transaction.
2020-08-31 10:22:32 +02:00
Mats Kindahl
c054b381c6 Change syntax for continuous aggregates
We change the syntax for defining continuous aggregates to use `CREATE
MATERIALIZED VIEW` rather than `CREATE VIEW`. The command still creates
a view, even though a plain `CREATE MATERIALIZED VIEW` creates a table. Raise an
error if `CREATE VIEW` is used to create a continuous aggregate and
redirect to `CREATE MATERIALIZED VIEW`.

In a similar vein, `DROP MATERIALIZED VIEW` is used for continuous
aggregates and continuous aggregates cannot be dropped with `DROP
VIEW`.

Continuous aggregates are altered using `ALTER MATERIALIZED VIEW`
rather than `ALTER VIEW`, so we ensure that it works for `ALTER
MATERIALIZED VIEW` and gives an error if you try to use `ALTER VIEW` to
change a continuous aggregate.

Note that we allow `ALTER VIEW ... SET SCHEMA` to be used with the
partial view as well as with the direct view, so this is handled as a
special case.
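
An illustrative continuous aggregate under the new syntax (table and
column names hypothetical):

```
CREATE MATERIALIZED VIEW conditions_daily
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS day, avg(temperature) AS avg_temp
FROM conditions
GROUP BY day
WITH NO DATA;

-- Dropping likewise requires the MATERIALIZED VIEW form:
DROP MATERIALIZED VIEW conditions_daily;
```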

Fixes #2233

Co-authored-by: Erik Nordström <erik@timescale.com>
Co-authored-by: Mats Kindahl <mats@timescale.com>
2020-08-27 17:16:10 +02:00
Mats Kindahl
769bc31dc2 Lock dimension slice tuple when scanning
In the function `ts_hypercube_from_constraints` a hypercube is built
from constraints which reference dimension slices in `dimension_slice`.
As part of a run of `drop_chunks`, or when a chunk is explicitly dropped
as part of other operations, dimension slices can be removed from this
table, which makes the hypercube reference non-existent dimension
slices and subsequently causes a crash.

This commit fixes this by adding a tuple lock on the dimension slices
that are used to build the hypercube.

If two `drop_chunks` are running concurrently, there can be a race if
dimension slices are removed as a result of removing a chunk. We treat
this case in the same way as if the dimension slice was updated: report
an error that another session locked the tuple.

Fixes #1986
2020-08-26 09:44:20 +02:00
Mats Kindahl
aec7c59538 Block data migration for distributed hypertables
Option `migrate_data` does not currently work for distributed
hypertables, so we block it for the time being and generate an error if
an attempt is made to migrate data when creating a distributed
hypertable.
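
An illustrative attempt that now fails up front (table name
hypothetical):

```
-- Blocked for distributed hypertables until data migration is supported:
SELECT create_distributed_hypertable('conditions', 'time',
                                     migrate_data => true);
```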

Fixes #2230
2020-08-20 15:07:01 +02:00
Sven Klemm
bb891cf4d2 Refactor retention policy
This patch changes the retention policy to store its configuration
in the bgw_job table and removes the bgw_policy_drop_chunks table.
2020-08-03 22:33:54 +02:00