When a hypertable has many chunks, the current PL/pgSQL function that
computes the size of each chunk via a nested loop is quite slow.
Additionally, it makes a system call to get the on-disk file size of
each chunk every time it is called, which slows things down further.
We now provide an approximate size function implemented in C to avoid
the issues in the PL/pgSQL function. This function also uses
per-backend caching via the smgr layer to compute the approximate size
cheaply. The PG cache invalidation clears the cached size for a chunk
when DML happens on it, so the size cache picks up the latest size
within minutes. Also, thanks to the backend caching, a long-running
session only fetches fresh data for new or modified chunks and can
effectively reuse the cached data (calculated afresh the first time
around) for older chunks.
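For illustration, a minimal usage sketch, assuming the new C implementation is exposed at the SQL level as hypertable_approximate_size (function and hypertable names are illustrative):
    -- approximate, cache-backed size of the hypertable in bytes
    SELECT hypertable_approximate_size('metrics');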
The approximate_row_count function was using reltuples from compressed
chunks and multiplying it by 1000, the default batch size. This led to
a huge skew between the actual row count and the approximate one. We
now use the numrows_pre_compression value from the TimescaleDB
catalog, which accurately represents the number of rows before
compression.
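For example (hypertable name is illustrative), the estimate for a compressed hypertable is now based on the pre-compression row counts stored in the catalog:
    SELECT approximate_row_count('metrics');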
In ae21ee96 we fixed a race condition that occurred when a query
computing the hypertable sizes ran while one or more chunks were
dropped in a concurrent session, leading to an exception because the
chunks no longer exist.
In fact, the table lock introduced there is unnecessary because we
also added proper joins with the Postgres catalog tables to ensure
that the relation exists in the database when calculating the sizes.
Even worse, with this table lock, dropping chunks now waits for the
functions that calculate the hypertable sizes.
Fix it by removing the unnecessary table lock, and add isolation tests
to make sure we don't end up with race conditions again.
The approximate_row_count function was executed directly on the user
view instead of the corresponding materialized hypertable, which
returns 0 for continuous aggregates. The function is updated to look
up the materialized hypertable corresponding to the continuous
aggregate and then return the approximate_row_count for that
materialized hypertable.
Fixes #6051
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- relation_size(regclass)
- data_node_hypertable_info(name, name, name)
- data_node_chunk_info(name, name, name)
- hypertable_local_size(name, name)
- hypertable_remote_size(name, name)
- chunks_local_size(name, name)
- chunks_remote_size(name, name)
- range_value_to_pretty(bigint, regtype)
- get_approx_row_count(regclass)
- data_node_compressed_chunk_stats(name, name, name)
- compressed_chunk_local_stats(name, name)
- compressed_chunk_remote_stats(name, name)
- indexes_local_size(name, name)
- data_node_index_size(name, name, name)
- indexes_remote_size(name, name, name)
To increase schema security we do not want to mix our own internal
objects with user objects. Since chunks are created in the
_timescaledb_internal schema our internal functions should live in
a different dedicated schema. This patch makes the necessary
adjustments for the following functions:
- to_unix_microseconds(timestamptz)
- to_timestamp(bigint)
- to_timestamp_without_timezone(bigint)
- to_date(bigint)
- to_interval(bigint)
- interval_to_usec(interval)
- time_to_internal(anyelement)
- subtract_integer_from_now(regclass, bigint)
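As an illustration, such helpers now need to be called with the new schema qualification; a sketch assuming the dedicated schema is named _timescaledb_functions:
    -- convert an internal microsecond epoch value to timestamptz
    SELECT _timescaledb_functions.to_timestamp(1700000000000000);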
This patch adds support for passing continuous aggregate names to
`chunk_detailed_size`, aligning its behavior with that of other
functions such as `show_chunks`, `drop_chunks`, and `hypertable_size`.
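For example (continuous aggregate name is illustrative):
    SELECT * FROM chunk_detailed_size('daily_summary');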
Internal Server Error when loading Explorer tab (SDC #995)
This refers to a weird scenario where a chunk table entry exists in
the TimescaleDB catalog but does not exist in the PG catalog. The
stale entry blocks executing the hypertable_size function on the
hypertable.
The changes in this patch are related to improvements suggested for
the hypertable_size function, which involve:
1. Locking the hypertable in ACCESS SHARE mode in the hypertable_size
function to avoid the risk of chunks being dropped by another
concurrent process.
2. Joining the hypertable and inherited chunk tables with "pg_class"
to make sure that a stale table without an entry in the PG catalog is
not included in the hypertable size calculation.
3. Adding an additional filter (schema_name) on pg_class to avoid
calculating the size of multiple hypertables with the same name in
different schemas.
NOTE: With this change, calling the hypertable_size function requires
SELECT privilege on the table.
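A sketch of the privilege now needed before calling the function (role and table names are hypothetical):
    GRANT SELECT ON metrics TO readonly_user;
    SELECT hypertable_size('metrics');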
Disable-check: force-changelog-file
During chunk creation, the chunk's dimensional CHECK constraints are
created via an "upcall" to PL/pgSQL code. However, creating
dimensional constraints in PL/pgSQL code sometimes fails, especially
during high-concurrency inserts, because PL/pgSQL code scans metadata
using a snapshot that might not see the same metadata as the C
code. As a result, chunk creation sometimes fails during constraint
creation.
To fix this issue, implement dimensional CHECK-constraint creation in
C code. Other constraints (FK, PK, etc.) are still created via an
upcall, but should probably also be rewritten in C. However, since
these constraints don't depend on recently updated metadata, this is
left to a future change.
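For reference, the dimensional CHECK constraints created here have roughly this shape (chunk name, constraint name, and range values are illustrative):
    ALTER TABLE _timescaledb_internal._hyper_1_1_chunk
      ADD CONSTRAINT constraint_1
      CHECK ("time" >= '2023-01-01 00:00:00+00' AND "time" < '2023-01-08 00:00:00+00');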
Fixes #5456
During compression, autovacuum used to be disabled for the
uncompressed chunk and re-enabled after decompression. This leads to
Postgres maintenance issues. Let's not disable autovacuum for the
uncompressed chunk anymore and let Postgres take care of the stats in
its natural way.
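For context, the reloption previously toggled on the uncompressed chunk looked roughly like this (chunk name is illustrative) and is no longer set:
    ALTER TABLE _timescaledb_internal._hyper_1_1_chunk SET (autovacuum_enabled = false);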
Fixes #309
This small patch adds support for continuous aggregates to the
`hypertable_detailed_size` (and with that `hypertable_size`).
It adds an additional check to see if a continuous aggregate exists
if a hypertable with the given regclass name isn't found.
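For example (continuous aggregate name is illustrative):
    SELECT * FROM hypertable_detailed_size('daily_summary');
    SELECT hypertable_size('daily_summary');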
Changed queries to use LATERAL joins on size functions and views
instead of CTEs, which eliminates a lot of unnecessary projections and
gives the planner a chance to push down predicates.
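A simplified sketch of the LATERAL pattern (illustrative, not the exact query from this patch):
    SELECT h.table_name, s.total_bytes
    FROM _timescaledb_catalog.hypertable h,
         LATERAL hypertable_detailed_size(
             format('%I.%I', h.schema_name, h.table_name)::regclass) s;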
Closes #4775
Postgres will prepend pg_temp to the effective search_path if it
is not present in the search_path. While pg_temp will never be
used to look up functions or operators unless explicitly requested,
it will be used to look up relations. Explicitly putting pg_temp at
the end of the search_path makes sure objects in pg_temp will be
considered last and pg_temp cannot be used to mask existing objects.
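In effect, the search_path set by the extension ends with an explicit pg_temp, along the lines of:
    SET search_path TO pg_catalog, pg_temp;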
Reorganize the code and fix a minor bug where the size of the FSM, VM,
and INIT forks of the parent hypertable was not computed.
Fix the bug by exposing the `ts_relation_size` function at the SQL
level to encapsulate the logic for computing `heap`, `indexes` and
`toast` sizes.
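For reference, the per-fork sizes in question can be inspected with pg_relation_size (table name is illustrative):
    SELECT pg_relation_size('metrics', 'main') AS main_bytes,
           pg_relation_size('metrics', 'fsm')  AS fsm_bytes,
           pg_relation_size('metrics', 'vm')   AS vm_bytes,
           pg_relation_size('metrics', 'init') AS init_bytes;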
This patch locks down search_path in extension install and update
scripts to only contain pg_catalog; this requires that any reference
in those scripts is fully qualified. Additionally, we add explicit
create commands to all update scripts for objects added to the
public schema. This change will make update scripts fail if a
function with an identical signature already exists when installing
or upgrading, instead of reusing the existing object.
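A sketch of the pattern now used in the scripts (function name and body are illustrative):
    SET LOCAL search_path TO pg_catalog;
    -- plain CREATE (not CREATE OR REPLACE) fails if an object with this signature already exists
    CREATE FUNCTION public.example_fn(val pg_catalog.int4) RETURNS pg_catalog.int4
        LANGUAGE sql AS 'SELECT val';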
Size utility functions, such as `hypertable_size()`, excluded
non-responding data nodes from size calculations, which led to the
functions succeeding but returning the wrong size information. To
avoid reporting confusing numbers, it is better to fail.
This change updates the SQL queries for the relevant functions to no
longer exclude non-responding data nodes and also adds a TAP test to
illustrate the error when data nodes are not responding.
Fixes #3713
Simplify the CTE that recursively inspects all partitions of a
relation and calculates the sum of `pg_class.reltuples`, taking into
account the differences introduced by PG14.
Rewrite approximate_row_count in SQL instead of PL/pgSQL and remove
superfluous JOINs against pg_namespace. Adjust the tuple calculation
for PG14 since, in PG14, reltuples for partitioned tables is the sum
of its children, so we need to exclude those from the calculation to
avoid double counting.
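A simplified sketch of the idea (illustrative, not the exact query): recursively collect the relation and its inheritance children, sum reltuples, and skip partitioned parents since on PG14 their reltuples already includes their children:
    WITH RECURSIVE relations AS (
        SELECT 'metrics'::regclass::oid AS relid
        UNION ALL
        SELECT i.inhrelid
        FROM pg_inherits i
        JOIN relations r ON i.inhparent = r.relid
    )
    SELECT sum(c.reltuples)::bigint AS approx_rows
    FROM relations r
    JOIN pg_class c ON c.oid = r.relid
    WHERE c.relkind <> 'p';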
The view uses cached information from compression_chunk_size to
report the size of compressed chunks. Since compressed chunks
can be modified, we now call pg_relation_size on the compressed chunk
while reporting the size.
The view also incorrectly used the hypertable's reltoastrelid to
calculate toast bytes. It has been changed to use the chunk's
reltoastrelid.
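For reference, the chunk's own toast size can be obtained along these lines (chunk name is illustrative):
    SELECT pg_relation_size(c.reltoastrelid::regclass) AS toast_bytes
    FROM pg_class c
    WHERE c.oid = '_timescaledb_internal._hyper_1_1_chunk'::regclass
      AND c.reltoastrelid <> 0;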
Fix a number of issues with size and stats functions:
* Return `0` size instead of `NULL` in several functions when
hypertables have no chunks (e.g., `hypertable_size`,
`hypertable_detailed_size`).
* Return `NULL` when functions are called on non-hypertables instead
of simply failing with generic error `query returned no rows`.
* Include size of "root" hypertable, which can have non-zero size
indexes and other objects even if the root table holds no data.
* Make `hypertable_detailed_size` include one additional row for
storage size of objects on the access node. While the access node
stores no data, the empty hypertable may still take up some disk
space.
* Improve test coverage for all size utility functions. In particular,
add tests on regular tables as well as empty and compressed
hypertables.
* Several size utility functions that were defined as `PL/pgSQL`
functions have been converted to simple `SQL` functions since they
ran only a single SQL query.
The `dist_util` test is moved to the solo test group because,
otherwise, it gives different size output when run in parallel vs. in
isolation.
Fixes #2871
Renaming the parameter `hypertable_or_cagg` in functions `drop_chunks`
and `show_chunks` to `relation` and changing parameter name from
`main_table` to `hypertable` or `relation` depending on context.
This change renames the function to approximate_row_count() and adds
support for regular tables. Return a row count estimate for a
table instead of a table list.
The timescale clustering code so far has been written referring to the
remote databases as 'servers'. This terminology is a bit overloaded,
and in particular we don't enforce any network topology limitations
that the term 'server' would suggest. In light of this we've decided
to change to use the term 'node' when referring to the different
databases in a distributed database. Specifically we refer to the
frontend as an 'access node' and to the backends as 'data nodes',
though we may omit the access or data qualifier where it's unambiguous.
As the vast bulk of the code so far has been written for the case where
there was a single access node, almost all instances of 'server' were
references to data nodes. This change has updated the code to rename
those instances.
This change adds a new utility function for postgres
`server_hypertable_info`. This function will contact a provided node
and pull down the space information for all the distributed hypertables
on that node.
Additionally, a new view `distributed_server_info` has been added to
timescaledb_information. This view leverages the new
remote_hypertable_data function to display a list of nodes, along with
counts of tables, chunks, and total bytes used by distributed data.
Finally, this change also adds a `hypertable_server_relation_size`
function, which, given the name of a distributed hypertable, will print
the space information for that hypertable on each node of the
distributed database.
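For example (hypertable name is illustrative; exact signatures may differ slightly from this sketch):
    SELECT * FROM timescaledb_information.distributed_server_info;
    SELECT * FROM hypertable_server_relation_size('conditions');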
The hypertable_relation_size function includes chunks that were
dropped, which causes a failure when looking up the size of dropped
chunks.
This patch adds a constraint to ignore dropped chunks when determining
the size of the hypertable.
A bug in the SQL for getting the size of chunks would use the
TOAST size of the main/dummy table as the toast size for the
chunks rather than each chunks' own toast size.
Getting an approximate row count for a hypertable involves getting
estimates for all of its chunks rather than just looking up a
single value in the catalog tables. This PR provides a convenience
function for doing the JOINs/summing.
We now use INT64_MAX and INT64_MIN as the max and min values for
dimension_slice ranges. If a dimension_slice has a range_start of
INT64_MIN or the range_end is INT64_MAX, we remove the corresponding
check constraint on the chunk since it signifies that this end of the
range is infinite. Closed ranges now always have INT64_MIN as the
range_start of the first slice and INT64_MAX as the range_end of the
last slice.
Also, points corresponding to INT64_MAX are always
put in the same slice as INT64_MAX-1 to avoid problems with the
semantics that coordinate < range_end.
This change refactors the chunk index handling to make better use
of standard PostgreSQL catalog information, while removing the
hypertable_index metadata table and associated triggers, including
those on the chunk_index table. The chunk_index table itself is
also simplified.
A benefit of this refactoring is that indexes are no longer
created using string mangling to construct the CREATE INDEX command
for a chunk, based on the string definition of the hypertable
index. Instead, indexes are created in C using proper index-related
internal data structures.
Chunk indexes can now also be renamed and are added in the parent
index tablespace. Changing tablespace on a hypertable index also
recurses to chunks, as expected. Default indexes that are added when
creating a hypertable use the hypertable's tablespace.
Creating hypertable indexes with the CONCURRENTLY modifier is
currently blocked, due to unclear semantics regarding concurrent
creation over many tables, including how to deal with snapshots.
The hypertable, chunk, and index size functions are
now split into a main function and a corresponding `pretty`
function. In chunk_relation_size_pretty() the ranges are
now converted into a human-readable form when they are time types.
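For example (hypertable name is illustrative):
    SELECT * FROM chunk_relation_size_pretty('conditions');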