Cache queries support multiple optional behaviors, such as "missing
ok" (do not fail on cache miss) and "no create" (do not create a new
entry if one doesn't exist in the cache). With multiple boolean
parameters, the query API has become unwieldy, so this change turns
these booleans into a single flags parameter.
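As a rough illustration (the type, function, and flag names below are made up for the example and are not necessarily the identifiers used in the code), the separate booleans collapse into a single bitwise-OR-able flags argument:

```c
/* Hypothetical flag bits replacing the separate boolean parameters. */
typedef enum CacheQueryFlags
{
	CACHE_FLAG_NONE = 0,
	CACHE_FLAG_MISSING_OK = 1 << 0, /* do not fail on a cache miss */
	CACHE_FLAG_NOCREATE = 1 << 1,	/* do not create a missing entry */
} CacheQueryFlags;

/* One flags parameter instead of several booleans (signature is illustrative). */
extern void *cache_fetch(struct Cache *cache, const void *key, unsigned int flags);

/* Example call: tolerate a miss and avoid creating a new entry. */
entry = cache_fetch(cache, &key, CACHE_FLAG_MISSING_OK | CACHE_FLAG_NOCREATE);
```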
This change refactors our main planner hooks in `planner.c` with the
intention of providing a consistent way to classify planned relations
across hooks. In our hooks, we'd like to know whether a planned
relation (`RelOptInfo`) is one of the following:
* Hypertable
* Hypertable child (a hypertable can appear as a child of itself)
* Chunk as a child of hypertable (from expansion)
* Chunk as standalone (operation directly on chunk)
* Any other relation
Previously, there was no consistent way to know which of these
categories one was dealing with. Instead, a mix of functions was used without
"remembering" the classification for reuse in later sections of the
code.
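A sketch of what such a classification could look like (the enum and member names are illustrative, not necessarily those used in `planner.c`):

```c
/* Illustrative classification of a planned relation. */
typedef enum PlannedRelationType
{
	REL_HYPERTABLE,			/* hypertable parent */
	REL_HYPERTABLE_CHILD,	/* hypertable appearing as a child of itself */
	REL_CHUNK_CHILD,		/* chunk as a child of a hypertable (from expansion) */
	REL_CHUNK_STANDALONE,	/* chunk targeted directly by the operation */
	REL_OTHER,				/* any other relation */
} PlannedRelationType;
```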
When classifying relations according to the above categories, the only
source of truth about a relation is our catalog metadata. In the case
of hypertables, this is cached in the hypertable cache. However, this
cache is read-through, so on a cache miss the metadata will
always be scanned to resolve a new entry. To avoid unnecessary
metadata scans, this change introduces a way to do cache-only
queries. This requires maintaining a single warmed cache throughout
planning and is enabled by using a planner-global cache object. The
pre-planning query processing warms the cache by populating it with
all hypertables in the to-be-planned query.
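In hedged pseudocode (the cache function and flags below are illustrative, building on the flags parameter described above), the flow looks roughly like this:

```c
/*
 * Illustrative sketch only; the actual function and flag names may differ.
 * Pre-planning warms the planner-global cache with every hypertable
 * referenced by the query, so later lookups during planning can be
 * answered from the cache alone without triggering a metadata scan.
 */
ListCell   *lc;

/* Pre-planning: resolve (and thereby cache) all hypertables in the query. */
foreach(lc, query->rtable)
{
	RangeTblEntry *rte = lfirst(lc);

	if (rte->rtekind == RTE_RELATION)
		hypertable_cache_get(planner_cache, rte->relid, CACHE_FLAG_MISSING_OK);
}

/* During planning: cache-only lookup that never falls back to a scan. */
ht = hypertable_cache_get(planner_cache, relid,
						  CACHE_FLAG_MISSING_OK | CACHE_FLAG_NOCREATE);
```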
New cluster-like command which writes to a new index and then swaps
it in, much like is done for the data table, and only acquires
exclusive locks for said swap. This trades off disk usage for
lower contention: we hold locks for a much shorter period of time,
allowing reads to proceed concurrently, but we have both the old
and new versions of the table existing at once, approximately
doubling storage usage while the reorder is running.
Currently only works on chunks.
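A rough sketch of the locking pattern (both helper functions below are placeholders invented for the illustration; only the lock call is standard PostgreSQL API):

```c
#include "postgres.h"
#include "storage/lmgr.h"

/* Placeholder helpers for the sketch; not real functions. */
extern Oid rewrite_chunk_in_index_order(Oid heap_relid, Oid index_relid);
extern void swap_heap_and_indexes(Oid heap_relid, Oid new_heap_relid);

/*
 * Illustrative only: rewrite into a new heap while readers continue to use
 * the old one, then take the exclusive lock just long enough for the swap.
 */
static void
reorder_sketch(Oid chunk_relid, Oid index_relid)
{
	/* Phase 1: copy the chunk's rows, in index order, into a new heap.
	 * Concurrent reads of the old heap keep working during this phase. */
	Oid			new_heap = rewrite_chunk_in_index_order(chunk_relid, index_relid);

	/* Phase 2: swap old and new under an exclusive lock. The lock is held
	 * only for the swap, not for the (long) rewrite above. */
	LockRelationOid(chunk_relid, AccessExclusiveLock);
	swap_heap_and_indexes(chunk_relid, new_heap);

	/* The old heap is dropped afterwards; until then both copies exist,
	 * roughly doubling disk usage for the chunk. */
}
```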
Future proofing: if we ever want to make our functions available to
others, they'd need to be prefixed to prevent name collisions. In
order to avoid having some functions with the `ts_` prefix and
others without, we’re adding the prefix to all non-static
functions now.
Previously, cache lookups were run on the cache's memory
context. While simple, this risked allocating transient (work) data on
that memory context, e.g., when scanning for new cache data during
cache misses.
This change makes scan functions take a memory context on which the
found data should be allocated. All other data is allocated on the
current memory context (typically the transaction's memory
context). With this functionality, a cache can pass its own memory
context to the scan, thus avoiding having transient scan data
allocated on its long-lived context.
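A sketch of the pattern (the scan and copy functions below are illustrative placeholders):

```c
#include "postgres.h"
#include "access/htup.h"
#include "utils/memutils.h"

typedef struct Hypertable Hypertable;

/* Placeholders for the sketch; not the actual scan/copy functions. */
extern HeapTuple scan_catalog_for_hypertable(Oid relid);
extern Hypertable *hypertable_from_tuple(HeapTuple tuple);

/*
 * Scan for a hypertable by relid. Transient scan state lives on the current
 * memory context (typically the transaction's); only the returned object is
 * allocated on result_mcxt, e.g., the calling cache's memory context.
 */
static Hypertable *
hypertable_scan(Oid relid, MemoryContext result_mcxt)
{
	Hypertable *result = NULL;
	HeapTuple	tuple = scan_catalog_for_hypertable(relid);

	if (HeapTupleIsValid(tuple))
	{
		MemoryContext oldcxt = MemoryContextSwitchTo(result_mcxt);

		/* Only the found data is copied onto the caller-provided context. */
		result = hypertable_from_tuple(tuple);
		MemoryContextSwitchTo(oldcxt);
	}

	return result;
}
```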
Source code indentation has been updated in PostgreSQL 10 to fix a
number of issues. This update applies this new indentation to the
entire code base.
The new indentation requires a new version of pg_bsd_indent, which can
be found here:
https://git.postgresql.org/git/pg_bsd_indent.git
This commit adds logic for cache pinning to handle subtransactions. It
also makes it easier to find cache pinning leaks. Finally, it fixes
handling of cross-commit operations like VACUUM and CLUSTER.
Previously, such operations incorrectly released the cache pin on the
first internal commit even though the pinned object was still used
afterwards.
For all exported functions, the macro PGDLLEXPORT needs to be
prepended. Additionally, on Windows `open` is defined as a macro, so a
function with that name had to be renamed. A few other small changes
are made to keep Visual Studio's compiler happy and get rid of
warnings (e.g., adding return statements after elog).
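For example (the function itself is an arbitrary illustration):

```c
#include "postgres.h"
#include "fmgr.h"

/* PGDLLEXPORT is prepended so the symbol is exported from the Windows DLL. */
PGDLLEXPORT Datum ts_example_function(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1(ts_example_function);

Datum
ts_example_function(PG_FUNCTION_ARGS)
{
	if (PG_ARGISNULL(0))
	{
		elog(ERROR, "argument must not be NULL");
		/* elog(ERROR, ...) never returns, but Visual Studio's compiler
		 * cannot see that, so the explicit return silences the warning
		 * about a missing return value. */
		PG_RETURN_NULL();
	}

	PG_RETURN_BOOL(true);
}
```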
Currently, if a transaction ends (normally or abnormally) while a
cache is pinned, that cache will never be released, leading to a
memory leak. This change ensures that all taken cache pins are
released when a transaction ends by registering all pins in a list and
cleaning it up in a transaction end callback. This makes the use of
cache_release() optional. However, timely calls to cache_release() are
still encouraged, since it is better to release memory as soon as
possible during long-running transactions.
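A sketch of the mechanism (the `Cache` struct and pinning functions are illustrative; `RegisterXactCallback` is the standard PostgreSQL hook):

```c
#include "postgres.h"
#include "access/xact.h"
#include "nodes/pg_list.h"

/* Minimal stand-in for the real cache struct. */
typedef struct Cache
{
	int			refcount;
} Cache;

/* All pins taken during the current transaction are remembered here. */
static List *pinned_caches = NIL;

void
cache_pin(Cache *cache)
{
	cache->refcount++;
	pinned_caches = lappend(pinned_caches, cache);
}

void
cache_release(Cache *cache)
{
	cache->refcount--;
	pinned_caches = list_delete_ptr(pinned_caches, cache);
}

/*
 * Transaction-end callback: drop any pins that are still registered, so
 * neither an abort nor a forgotten cache_release() leaks memory.
 */
static void
cache_xact_end(XactEvent event, void *arg)
{
	if (event == XACT_EVENT_COMMIT || event == XACT_EVENT_ABORT)
	{
		ListCell   *lc;

		foreach(lc, pinned_caches)
			((Cache *) lfirst(lc))->refcount--;

		list_free(pinned_caches);
		pinned_caches = NIL;
	}
}

/* Registered once, e.g., at extension load time. */
void
cache_init_xact_callback(void)
{
	RegisterXactCallback(cache_xact_end, NULL);
}
```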