Change queries to use LATERAL joins on size functions and views instead
of CTEs. This eliminates a lot of unnecessary projections and gives the
planner a chance to push down predicates.
Closes #4775
Postgres will prepend pg_temp to the effective search_path if it
is not present in the search_path. While pg_temp will never be
used to look up functions or operators unless explicitly requested,
it will be used to look up relations. Putting pg_temp explicitly at
the end of search_path makes sure objects in pg_temp will be
considered last and pg_temp cannot be used to mask existing objects.
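The lookup order described above can be modelled with a small Python
sketch. It is illustrative only: `effective_search_path` and
`resolve_relation` are hypothetical helpers, the catalog is a plain
dict, and the implicit handling of pg_catalog is left out.

```python
def effective_search_path(search_path):
    """Return the schema order used for relation lookup.

    Models only the rule from the text: pg_temp is searched first
    unless it is listed explicitly in search_path.
    """
    if "pg_temp" in search_path:
        return list(search_path)
    return ["pg_temp"] + list(search_path)


def resolve_relation(name, search_path, catalog):
    """Return the first schema in the effective path that contains `name`."""
    for schema in effective_search_path(search_path):
        if name in catalog.get(schema, set()):
            return schema
    return None


# Hypothetical catalog: a temporary table shadows a table in public.
catalog = {"public": {"metrics"}, "pg_temp": {"metrics"}}

# pg_temp not listed: it is searched first and masks public.metrics.
masked = resolve_relation("metrics", ["public"], catalog)

# pg_temp listed last: public.metrics wins.
safe = resolve_relation("metrics", ["public", "pg_temp"], catalog)
```

With pg_temp listed last, the temporary object can no longer shadow the
existing relation.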
Reorganize the code and fix a minor bug where the size of the FSM, VM
and INIT forks of the parent hypertable was not computed.
Fix the bug by exposing the `ts_relation_size` function to the SQL
level to encapsulate the logic to compute `heap`, `indexes` and `toast`
sizes.
This patch locks down search_path in extension install and update
scripts to only contain pg_catalog. This requires that any reference
in those scripts is fully qualified. Additionally, we add explicit
create commands to all update scripts for objects added to the
public schema. This change will make update scripts fail if a
function with an identical signature already exists when installing
or upgrading, instead of reusing the existing object.
Size utility functions, such as `hypertable_size()`, excluded
non-responding data nodes from size calculations, which led to the
functions succeeding but returning the wrong size information. To
avoid reporting confusing numbers, it is better to fail.
This change updates the SQL queries for the relevant functions to no
longer exclude non-responding data nodes and also adds a TAP test to
illustrate the error when data nodes are not responding.
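The difference between silently excluding a node and failing can be
sketched in Python. This is a model of the behavior, not the actual SQL:
`total_size_or_fail` and `DataNodeNotResponding` are hypothetical names,
and a non-responding node is represented by a size of `None`.

```python
class DataNodeNotResponding(Exception):
    """Raised when at least one data node did not report its size."""


def total_size_or_fail(node_sizes):
    """Sum per-node sizes in bytes, failing if any node did not respond.

    `node_sizes` maps a node name to its size, or to None when the node
    did not respond. Skipping None entries would return a total that is
    too small, so the sketch raises instead.
    """
    missing = sorted(name for name, size in node_sizes.items() if size is None)
    if missing:
        raise DataNodeNotResponding("no response from: " + ", ".join(missing))
    return sum(node_sizes.values())


total = total_size_or_fail({"dn1": 100, "dn2": 200})
```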
Fixes #3713
Simplify the CTE to recursively inspect all partitions of a relation
and calculate the sum of `pg_class.reltuples`, taking into account the
differences introduced by PG14.
Rewrite approximate_row_count in SQL instead of PL/pgSQL and remove
superfluous JOINs against pg_namespace. Adjust tuple calculation
for PG14: since in PG14 reltuples for partitioned tables is the sum
of its children, we need to exclude the partitioned tables themselves
from the calculation to avoid double counting.
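The double-counting issue can be illustrated with a short Python
sketch. It models `pg_class` rows as dicts with `relkind` and
`reltuples`; `approx_rows` is a hypothetical helper and does not
reproduce the actual SQL of approximate_row_count.

```python
def approx_rows(relations, pg_version):
    """Sum reltuples over a partition tree without double counting.

    On PG14 and later, a partitioned table (relkind 'p') carries the
    sum of its children in reltuples, so it is skipped here.
    """
    total = 0
    for rel in relations:
        if pg_version >= 14 and rel["relkind"] == "p":
            continue
        total += rel["reltuples"]
    return int(total)


# Hypothetical partition tree: one parent with two leaf partitions.
pg14_tree = [
    {"relkind": "p", "reltuples": 300},  # parent already holds the sum
    {"relkind": "r", "reltuples": 100},
    {"relkind": "r", "reltuples": 200},
]

estimate = approx_rows(pg14_tree, pg_version=14)
```

Including the parent row on PG14 would report 600 rows for this tree
instead of 300.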
The view uses cached information from compression_chunk_size to
report the size of compressed chunks. Since compressed chunks
can be modified, we call pg_relation_size on the compressed chunk
while reporting the size.
The view also incorrectly used the hypertable's reltoastrelid to
calculate toast bytes. It has been changed to use the chunk's
reltoastrelid.
Fix a number of issues with size and stats functions:
* Return `0` size instead of `NULL` in several functions when
hypertables have no chunks (e.g., `hypertable_size`,
`hypertable_detailed_size`).
* Return `NULL` when functions are called on non-hypertables instead
of simply failing with generic error `query returned no rows`.
* Include size of "root" hypertable, which can have non-zero size
indexes and other objects even if the root table holds no data.
* Make `hypertable_detailed_size` include one additional row for
storage size of objects on the access node. While the access node
stores no data, the empty hypertable may still take up some disk
space.
* Improve test coverage for all size utility functions. In particular,
add tests on regular tables as well as empty and compressed
hypertables.
* Several size utility functions that were defined as `PL/pgSQL`
functions have been converted to simple `SQL` functions since they
ran only a single SQL query.
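The return-value rules from the list above can be summarized in a
Python sketch. The `catalog` dict and the `size_of` helper are
hypothetical stand-ins for the metadata lookups. Only the `0` versus
`NULL` behavior and the inclusion of the root table are modelled.

```python
def size_of(catalog, name):
    """Return the total size in bytes, or None for non-hypertables.

    The root table size is always included, and a hypertable without
    chunks contributes 0 for its chunks rather than None.
    """
    hypertable = catalog.get(name)
    if hypertable is None:
        return None  # models SQL NULL for a non-hypertable
    return hypertable["root_bytes"] + sum(hypertable["chunk_bytes"])


# Hypothetical metadata: sizes are made up for illustration.
catalog = {
    "empty_ht": {"root_bytes": 0, "chunk_bytes": []},
    "indexed_ht": {"root_bytes": 8192, "chunk_bytes": []},
    "conditions": {"root_bytes": 8192, "chunk_bytes": [16384, 32768]},
}

empty_size = size_of(catalog, "empty_ht")
regular_table_size = size_of(catalog, "plain_table")
```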
The `dist_util` test is moved to the solo test group because,
otherwise, it gives different size output when run in parallel vs. in
isolation.
Fixes #2871
Rename the parameter `hypertable_or_cagg` in functions `drop_chunks`
and `show_chunks` to `relation`, and change the parameter name
`main_table` to `hypertable` or `relation` depending on context.
This change renames the function to approximate_row_count() and adds
support for regular tables. The function now returns a row count
estimate for a table instead of a table list.
The timescale clustering code so far has been written referring to the
remote databases as 'servers'. This terminology is a bit overloaded,
and in particular we don't enforce any network topology limitations
that the term 'server' would suggest. In light of this we've decided
to change to use the term 'node' when referring to the different
databases in a distributed database. Specifically we refer to the
frontend as an 'access node' and to the backends as 'data nodes',
though we may omit the access or data qualifier where it's unambiguous.
As the vast bulk of the code so far has been written for the case where
there was a single access node, almost all instances of 'server' were
references to data nodes. This change has updated the code to rename
those instances.
This change adds a new Postgres utility function,
`server_hypertable_info`. This function will contact a provided node
and pull down the space information for all the distributed hypertables
on that node.
Additionally, a new view `distributed_server_info` has been added to
timescaledb_information. This view leverages the new
remote_hypertable_data function to display a list of nodes, along with
counts of tables, chunks, and total bytes used by distributed data.
Finally, this change also adds a `hypertable_server_relation_size`
function, which, given the name of a distributed hypertable, will print
the space information for that hypertable on each node of the
distributed database.
Function hypertable_relation_size includes chunks that were dropped,
which causes a failure when looking up the size of dropped chunks.
This patch adds a constraint to ignore dropped chunks when determining
the size of the hypertable.
A bug in the SQL for getting the size of chunks would use the
TOAST size of the main/dummy table as the TOAST size for the
chunks rather than each chunk's own TOAST size.
Getting an approximate row count for a hypertable involves getting
estimates for all of its chunks rather than just looking up a
single value in the catalog tables. This PR provides a convenience
function for doing the JOINs/summing.
We now use INT64_MAX and INT64_MIN as the max and min values for
dimension_slice ranges. If a dimension_slice has a range_start of
INT64_MIN or a range_end of INT64_MAX, we remove the corresponding
check constraint on the chunk since it signifies that this end of the
range is infinite. Closed ranges now always have a range_start of
INT64_MIN for the first slice and a range_end of INT64_MAX for the
last slice. Also, points corresponding to INT64_MAX are always put in
the same slice as INT64_MAX-1 to avoid problems with the semantics
that coordinate < range_end.
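A minimal Python sketch of these two rules follows. The helper names
`slice_check_constraint` and `slice_coordinate` are hypothetical, and
the constraint is rendered as a plain string for illustration rather
than as the actual catalog representation.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def slice_check_constraint(column, range_start, range_end):
    """Build the check expression for a slice, or None if unbounded.

    A bound equal to INT64_MIN or INT64_MAX is treated as infinite, so
    the corresponding half of the constraint is omitted.
    """
    parts = []
    if range_start != INT64_MIN:
        parts.append(f"{column} >= {range_start}")
    if range_end != INT64_MAX:
        parts.append(f"{column} < {range_end}")
    return " AND ".join(parts) if parts else None


def slice_coordinate(value):
    """Map INT64_MAX to INT64_MAX - 1 so that coordinate < range_end holds."""
    return INT64_MAX - 1 if value == INT64_MAX else value


first_slice = slice_check_constraint("time", INT64_MIN, 100)
last_slice = slice_check_constraint("time", 100, INT64_MAX)
```

Here `first_slice` keeps only the upper bound and `last_slice` keeps
only the lower bound.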
This change refactors the chunk index handling to make better use
of standard PostgreSQL catalog information, while removing the
hypertable_index metadata table and associated triggers, including
those on the chunk_index table. The chunk_index table itself is
also simplified.
A benefit of this refactoring is that indexes are no longer
created using string mangling to construct the CREATE INDEX command
for a chunk, based on the string definition of the hypertable
index. Instead, indexes are created in C using proper index-related
internal data structures.
Chunk indexes can now also be renamed and are added in the parent
index tablespace. Changing tablespace on a hypertable index also
recurses to chunks, as expected. Default indexes that are added when
creating a hypertable use the hypertable's tablespace.
Creating hypertable indexes with the CONCURRENTLY modifier is
currently blocked, due to unclear semantics regarding concurrent
creation over many tables, including how to deal with snapshots.
The hypertable, chunk, and index size functions are
now split into a main function and a corresponding 'pretty'
function. In chunk_relation_size_pretty() the ranges are
now converted into a human-readable form when they are time types.