This commit adds support for dynamically loaded submodules to timescaledb
as well as an initial license-key implementation in the tsl subdirectory.
Dynamically loaded modules allow our users to determine which licenses they
wish to use for their version of timescaledb; if they wish to only use
Apache-Licensed code, they do not load the Timescale-Licensed submodule. Calls
from the Apache-Licensed code into the Timescale-Licensed submodule are
handled via dynamically-set function pointers; see tsl/src/Readme.module.md for
more details.
This commit also adds code for license keys for the ApacheOnly, Community, and
Enterprise editions. The license key determines which features are enabled,
and controls loading the submodule: when a license key that requires the
submodule is installed, the module is automatically loaded.
Currently, the ApacheOnly and Community license keys are hardcoded to be
"ApacheOnly" and "Community", respectively. The first version of the enterprise
license key is described in tsl/src/Readme.module.md.
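For illustration, a hedged sketch of how an installation might select an
edition; the GUC name timescaledb.license_key and the values shown follow this
commit's description and may differ from the final interface (see
tsl/src/Readme.module.md):

    SHOW timescaledb.license_key;   -- e.g., 'Community'

    -- Installing a key that requires the Timescale-Licensed submodule
    -- (e.g., an enterprise key) triggers loading of the module:
    ALTER SYSTEM SET timescaledb.license_key = '<enterprise key>';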
Fixes get_create_command to produce a valid create_hypertable
call when the column name is a keyword
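As a hedged illustration, consider a hypertable whose time column is named
after a reserved keyword (the table is hypothetical; the generated output is
abbreviated):

    -- "offset" is a reserved keyword in PostgreSQL
    CREATE TABLE readings ("offset" TIMESTAMPTZ NOT NULL, value DOUBLE PRECISION);
    SELECT create_hypertable('readings', 'offset');

    -- get_create_command should now emit a call that round-trips:
    SELECT _timescaledb_internal.get_create_command('readings');
    -- => SELECT create_hypertable('public.readings', 'offset', ...);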
Add test.execute_sql helper function to test support functions
Remove the following unused functions:
_timescaledb_internal.to_microseconds(TIMESTAMPTZ)
_timescaledb_internal.to_timestamp_pg(BIGINT)
_timescaledb_internal.time_to_internal(anyelement)
TimescaleDB has always supported functions on closed (space)
dimensions, i.e., for hash partitioning. However, functions have not
been supported on open (time) dimensions; instead, columns were required
to have a supported time type (e.g., integer or timestamp). This
restricted which tables could be time partitioned: tables with a custom
"time" type that could be transformed by a function expression into a
supported time type were not supported.
This change generalizes partitioning so that both open and closed
dimensions can have an associated partitioning function that
calculates a dimensional value. Fortunately, since we already support
functions on closed dimensions, the changes necessary to support this
on any dimension are minimal. Thus, open dimensions now support an
(optional) partitioning function that transforms the input type to a
supported time type (e.g., integer or timestamp type). Any indexes on
such dimensional columns become expression indexes.
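For example, a minimal sketch of time partitioning via a function; the
time_partitioning_func argument follows the create_hypertable API, while the
table and function here are hypothetical:

    -- A custom "time" type: events store their timestamp inside a jsonb blob
    CREATE TABLE events (data JSONB NOT NULL);

    -- An IMMUTABLE function that transforms the column into a supported time type
    CREATE FUNCTION event_time(data JSONB) RETURNS TIMESTAMPTZ
        LANGUAGE SQL IMMUTABLE AS
        $$ SELECT ($1->>'time')::TIMESTAMPTZ $$;

    -- Open (time) dimension partitioned by the function; any index on the
    -- dimension column becomes an expression index on event_time(data)
    SELECT create_hypertable('events', 'data', time_partitioning_func => 'event_time');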
Tests have been added for chunk expansion and the hashagg and sort
transform optimizations on tables that are using a time partitioning
function.
Currently, not all of these optimizations are well supported, but this
could potentially be fixed in the future.
Remove the existing PL/pgSQL function that implements drop_chunks, replacing it with a direct call to the C function, which now also implements the old PL/pgSQL checks. Refactor out much of the code shared between the C implementations of show_chunks and drop_chunks.
Previously, delete/insert_job were exposed, even though they were only used for testing. Moved both the C implementations and the SQL functions to the test module.
Timescale provides an efficient and easy-to-use API to drop individual
chunks from a TimescaleDB database through drop_chunks. This PR builds on
that functionality: a new show_chunks function gives the opportunity to
see the chunks that would be dropped if drop_chunks were run.
Additionally, it adds a newer_than option to drop_chunks (also supported
by show_chunks) that allows seeing/dropping chunks in an interval or newer
than a point in time.
This commit includes:
- Implementation of show_chunks in C
- Additional helper functions to work with chunks
- New version of drop_chunks in SQL that uses show_chunks. This
also adds a newer_than option to drop_chunks
- Enhanced tests of drop_chunks and new tests for show_chunks
Among other reasons, show_chunks was implemented in C so that both the
older_than and newer_than arguments can be NULL. This was not possible in
SQL because the arguments must have polymorphic types, and PL/pgSQL
requires such arguments to typecheck whether or not they are used in the
function body.
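For example (a sketch against the API as described here; interval arguments
are interpreted relative to now()):

    -- Preview the chunks whose data is entirely older than 3 months...
    SELECT show_chunks('conditions', older_than => INTERVAL '3 months');

    -- ...or bound both ends to target chunks between 6 and 3 months old
    SELECT show_chunks('conditions',
                       older_than => INTERVAL '3 months',
                       newer_than => INTERVAL '6 months');

    -- Then drop what show_chunks reported
    SELECT drop_chunks(INTERVAL '3 months', 'conditions');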
Because the server now sends a boolean indicating whether the local version is up-to-date, we simply validate the server's response and remove any version-checking code from the client side. The client now only checks that versions are well-formed before printing them, in the case where the local version is not up-to-date.
Previously, the scheduler only populated its jobs list once at start time. This commit enables the scheduler to receive notifications for updates (insert, update, delete) to the bgw_job table. Notifications are sent via the cache invalidation framework. Whenever the scheduler receives a notification, it re-reads the bgw_job table. For each job currently in the bgw_job table, it either instantiates new scheduler state for the job or copies over any existing scheduler state, for persisting jobs. For jobs that have disappeared from the bgw_job table, the scheduler deletes any local state it has.
Note that any updates to the bgw_job table must now go through C code, so that the cache invalidation framework in catalog.c can run. In particular, this commit includes a rudimentary API for interacting with the bgw_job table, for testing purposes. This API will be rewritten in the future.
Modify the restart action to start schedulers if they do not exist. This fixes
a potential race condition in which a scheduler could be started for a given
database but, before it has shut down (because the extension does not exist), a
CREATE EXTENSION command is run; the start action then would not change the
state of the worker, which would be waiting on the wrong vxid and thus not see
that the extension exists. This also makes the start action truly idempotent:
it does not set the vxid on its startup, thereby respecting any restart action
that has taken place before and better defining how each interacts with the
system.
Additionally, we decided that the previous behavior, in which launchers were
not started up on, say, ALTER EXTENSION UPDATE actions, was not all that
desirable: it worked if the stop action had happened and the database had not
restarted, but once the database restarted, the stop action had no effect. We
decided that if we desire the ability to disable schedulers for a particular
database, we will implement it in the future as a standalone feature that takes
effect across server restarts, rather than as somewhat ill-defined implicit
behavior of the stop action.
Add a bool created to the return value of create_hypertable and add_dimension.
When if_not_exists is true and creation is skipped because the object
already exists, created will be false; otherwise it will be true. This
modifies the functions to return metadata even when no object was
created.
Change the return value of add_dimension to return a record consisting
of (dimension_id, schema_name, table_name, column_name). This improves
user feedback about the success of the operation and also gives the
function an API returning useful information for non-human consumption.
Change create_hypertable to return a record consisting of
(hypertable_id, schema_name, table_name). This improves user feedback
about the success of the operation and also gives the function an API
returning useful information for non-human consumption.
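For example, a sketch of the new return values (column values illustrative):

    SELECT * FROM create_hypertable('conditions', 'time', if_not_exists => true);
    --  hypertable_id | schema_name | table_name | created
    -- ---------------+-------------+------------+---------
    --              2 | public      | conditions | t

    SELECT * FROM add_dimension('conditions', 'device_id', number_partitions => 4);
    --  dimension_id | schema_name | table_name | column_name | created
    -- --------------+-------------+------------+-------------+---------
    --             3 | public      | conditions | device_id   | t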
We've decided to adopt the ts_ prefix on all exported C functions in
order to avoid having symbol conflicts with future postgres functions.
We've already started using this prefix on new functions, and this commit
adds the prefix to the old functions.
When timescaledb is installed in template1 and a user with only createdb
privileges creates a database, the user won't be able to dump the
database due to lacking permissions. This patch grants the missing
permissions to PUBLIC so that pg_dump can succeed.
We need to grant SELECT to PUBLIC on all tables, even those not
marked as being dumped, because pg_dump initially tries to access all
tables to detect inheritance chains and only then decides
which objects actually need to be dumped.
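A hedged sketch of the kind of grants involved (the actual update script
enumerates the concrete schemas and tables):

    GRANT USAGE ON SCHEMA _timescaledb_catalog, _timescaledb_config TO PUBLIC;
    GRANT SELECT ON ALL TABLES IN SCHEMA _timescaledb_catalog, _timescaledb_config
        TO PUBLIC;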
The integer variants of time_bucket don't handle negative time values
properly: they return the upper boundary of the bucket instead of the
lower boundary for negative time values.
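For example, with an integer bucket width of 10, the value -5 falls in the
bucket [-10, 0):

    SELECT time_bucket(10, -5);
    -- buggy result:    0  (upper boundary; truncating division rounds toward zero)
    -- correct result: -10 (lower boundary)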
The `dimension` metadata table had both a `(hypertable_id)` index and a
`UNIQUE(hypertable_id, column_name)` index. The latter alone suffices,
since lookups on hypertable_id can use the prefix of the composite index.
This change removes the unnecessary index, which saves some space
and makes the schema clearer.
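A sketch of the redundancy (index names hypothetical):

    -- Redundant: served by the prefix of the composite unique index below
    CREATE INDEX dimension_hypertable_id_idx
        ON _timescaledb_catalog.dimension (hypertable_id);
    CREATE UNIQUE INDEX dimension_hypertable_id_column_name_key
        ON _timescaledb_catalog.dimension (hypertable_id, column_name);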
Version checks failed on pre-release versions (e.g., `1.0.0-rc1`) since
the version parsing didn't expect a pre-release tag at the end.
This change adds support for pre-release tags, comparing them
lexicographically. Note that, given otherwise identical versions, the one
with a pre-release tag is strictly "smaller" than the one without
(e.g., `1.0.0-rc1` < `1.0.0`).
This adds the telemetry job to the job scheduler. Telemetry is
scheduled to run every 24 hours with a 1-hour exponential-backoff
retry period. Additional fixes related to the telemetry job:
- Add separate ID space to the bgw_job table for default jobs. We do not dump this ID space inside pg_dump to prevent job insertion conflicts.
- Add check to update scripts for default jobs.
- Change shmem_callback so that it doesn't modify state since state transitions are not atomic with respect to interrupts and interrupt callbacks.
- Disable default telemetry job in regression and docker tests.
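For example, a sketch of inspecting the default telemetry job added above
(catalog location, column set, and job name assumed):

    SELECT id, application_name, schedule_interval, retry_period
    FROM _timescaledb_config.bgw_job;
    --  id | application_name   | schedule_interval | retry_period
    -- ----+--------------------+-------------------+--------------
    --   1 | Telemetry Reporter | 1 day             | 01:00:00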
The installation metadata has been refactored:
- The installation metadata store now takes Datums of any
type as input and output
- Move metadata functions from uuid.c -> metadata.c
- Make metadata functions return native types rather than text,
including for tests
Telemetry tests for ssl and nossl have been combined.
Note that PG 9.6 does not have pg_backend_random(), which gives us
secure random numbers for the UUIDs that we send in telemetry. Therefore,
we fall back to generating the UUID from the timestamp if we are
on PG 9.6.
This change also fixes a number of test issues. For instance, in the
telemetry test the escape char `E` was passed on as part of the
response string when set as a variable with `\set`. This was not
detected before because the response parser didn't parse the start of
the response properly.
A number of fixes have been made to the formatting of log messages for
telemetry to conform to the PostgreSQL standard as well as being
consistent with other messages.
Numerous build issues on Windows have been resolved. There is also new
functionality to get OS version info on Windows (for telemetry),
including a SQL function get_os_info() to retrieve this information.
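A sketch of the new function (schema placement and column names assumed):

    SELECT * FROM _timescaledb_internal.get_os_info();
    --  sysname | version | release
    -- ---------+---------+---------
    --  Windows | ...     | ...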
The net library will now allow connecting to a servicename, e.g., http
or https. A port is resolved from this service name via getaddrinfo().
An explicit port can still be given, and in that case the service name
will not be resolved.
Databases that are updated to the new version of the extension will
have an install_timestamp in their installation metadata that does not
reflect the actual original install date of the extension. To be able
to distinguish these updated installations from those that are freshly
installed, we add a bogus "epoch" install_timestamp in the update
script.
Parsing of the version string in the telemetry response has been
refactored to be more amenable to testing. Tests have been added.
Adding the telemetry BGW and all auxiliary functions, such as generating a UUID, creating the internal metadata
table for storing UUIDs, and parsing the server-side response with the latest version of TimescaleDB.
TimescaleDB will want to run multiple background jobs. This PR
adds a simple scheduler so that jobs inserted into a jobs table
can be run on a schedule. This first implementation has two limitations:
1) The list of jobs to be run is read from the database when the scheduler
is first started. We do not update this list if the jobs table changes.
2) There is no prioritization for when to run jobs.
The design of the scheduler is as follows:
The scheduler itself is a background job that continuously runs and waits
for a time when jobs need to be scheduled. It then launches jobs as new
background workers that it controls through the background worker handle.
Aggregate statistics about a job are kept in the job_stat catalog table.
These statistics include the start and finish times of the last run of the job
as well as whether or not the job succeeded. The next_start is used to
figure out when next to run a job after a scheduler is restarted.
The statistics table also tracks consecutive failures and crashes for the job
which is used for calculating the exponential backoff after a crash or failure
(which is used to set the next_start after the crash/failure). Note also that
there is a minimum time between the db scheduler starting up and a crashed job
being restarted. This is to allow the operator enough time to disable the job
if needed.
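For example, a sketch of inspecting the per-job statistics (table location and
column names assumed from the description above):

    SELECT job_id, last_start, last_finish, last_run_success,
           next_start, consecutive_failures, consecutive_crashes
    FROM _timescaledb_internal.bgw_job_stat;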
Note that the number of crashes is an overestimate of the actual number of
crashes for a job. This is so that we are conservative and never miss a crash
and fail to use the appropriate backoff logic. Note that there is some
complexity in ensuring that all crashes are counted, since a crash in Postgres
causes /all/ processes to SIGQUIT: we must commit changes to the stats table
/before/ a job starts so that, after a crash and a scheduler restart, we can
deduce that a job had started but not finished before the crash (meaning it
could have been the crashing process).
The launcher controls how TimescaleDB schedulers for each database are
stopped/started, both at server start time and when they are started or
stopped while the server is running, which can happen when, say, an update of
the extension is performed.
Includes tests for multiple types of behavior within the launcher, but only a
mock for the db schedulers, which will be dealt with in future commits. This
launcher code lives mostly in the loader; as such, it must remain backwards
compatible for the foreseeable future, so significant thought and design have
gone into making interactions with this code well defined and consistent so
that maintaining backwards compatibility is relatively easy.
Previously, the pg_dump test was broken because it is not possible to reference psql variables
from inside bash commands run through psql. This is fixed by hardcoding the username passed to
the bash commands inside the test.
Also, we changed the insert-blocking trigger that prevents inserts into a hypertable to a non-internal
trigger, because internal triggers are not dumped by pg_dump. We need to dump the trigger so that
it is already in place after a pg_restore, to prevent users from accidentally inserting rows into
a hypertable while timescaledb_restoring=on.