mirror of
https://github.com/timescale/timescaledb.git
synced 2025-05-18 03:23:37 +08:00
Currently, CREATE INDEX creates the indexes for all chunks in a single transaction, which holds a lock on the root hypertable and all chunks. This means that no new inserts can occur during CREATE INDEX, even into a table whose index we are not currently building.

This commit adds the option to create indexes using a separate transaction for each chunk. This option, used like CREATE INDEX ON <table> WITH (timescaledb.transaction_per_chunk); should cause less contention than a regular CREATE INDEX, in exchange for the possibility that the index will be created on only some, or none, of the chunks if the command fails partway through.

The command holds a lock on the root index, which is used as a template, for its entire duration; each per-chunk transaction additionally locks only the chunk being indexed. This means that chunks which are not currently being indexed can be inserted into, and new chunks can be created, while the CREATE INDEX command is in progress.

To enable detection of failed transaction_per_chunk CREATE INDEX commands, the hypertable's index is marked as invalid while the CREATE INDEX is in progress. If the command fails partway through, the index remains invalid. If such an invalid index is discovered, it can be dropped and recreated to ensure that all chunks have a copy of the index. In the future, we may add a command to create indexes on only those chunks which are missing them. Note that even though the hypertable's index is marked as invalid, new chunks will have a copy of the index built as normal.

As part of the refactoring to make this command work, normal index creation was slightly modified. Instead of getting the column names an index uses one at a time, we get them all at once at the beginning of index creation. This allows us to close the hypertable's root table once we've determined all of them, rather than holding it open while we create the index info for each chunk.

Secondly, this commit changes our function to look up a tablespace, ts_hypertable_get_tablespace_at_offset_from, to take only a hypertable id instead of the hypertable's entire cache entry. The function only ever used the id, so this allows us to release the hypertable cache earlier.
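
For illustration, the transaction_per_chunk workflow described above might look roughly like the following. This is a sketch, not taken from the commit: the hypertable name conditions, the column device_id, and the index name conditions_device_idx are hypothetical, and it assumes the invalid marker is visible through the standard PostgreSQL pg_index.indisvalid flag.

```sql
-- Hypothetical hypertable, column, and index names, for illustration only.
-- Build the index one chunk at a time, each chunk in its own transaction.
CREATE INDEX conditions_device_idx ON conditions (device_id)
    WITH (timescaledb.transaction_per_chunk);

-- Assumption: a build that failed partway through leaves the hypertable's
-- index flagged as invalid in pg_index, so it can be found like this.
SELECT c.relname AS index_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
WHERE i.indrelid = 'conditions'::regclass
  AND NOT i.indisvalid;

-- If the query above returns the index, drop and recreate it so that
-- every chunk ends up with a copy.
DROP INDEX conditions_device_idx;
CREATE INDEX conditions_device_idx ON conditions (device_id)
    WITH (timescaledb.transaction_per_chunk);
```

The check is only meaningful after the command has finished or failed, since the index is also marked invalid while a build is still in progress.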