mirror of https://github.com/apple/foundationdb.git synced 2025-05-28 02:48:09 +08:00

525 Commits

Author SHA1 Message Date
Dan Adkins
5dcece90e1 Increase buggified lock bytes for backup workers to at least 256 MB.
We are still encountering simulation failures where the backup worker
is waiting on the lock and an assertion fails.
2023-01-19 17:36:06 -08:00
neethuhaneesha
ca4a964df1
Adding rocksDB control compaction on deletion knobs. () 2023-01-18 15:40:34 -08:00
Ata E Husain Bohra
3f2404cc25
[EaR]: Update KMS request/response to embed version details ()
* [EaR]: Update KMS request/response to embed version details

Description

 diff-1 : Address review comments

Embed the 'version_tag' detail in the KMS JSON request/response
payload. This feature enables future expansion and opens the path
to supporting multiple versions simultaneously if needed.

Testing

RESTKmsConnectorUnit.toml updated as per new code
devRunCorrectness - 100K
2023-01-16 12:18:25 -08:00
A.J. Beamon
811593e093 Merge branch 'main' into add-tenant-lookup-interface 2023-01-12 09:56:17 -08:00
A.J. Beamon
281083822b Trigger a commit if none happens within some amount of time when a tenant lookup is performed 2023-01-12 09:11:30 -08:00
Zhe Wu
37e026366c
Merge pull request from halfprice/zhewu/add-txn-server-initialization-event-1
Add event for txn server initialization and a warning for TLog slow catching up
2023-01-11 22:00:53 -08:00
Zhe Wu
087d37d10b Add event for txn server initialization and a warning for TLog slow catching up 2023-01-11 10:02:06 -08:00
Ata E Husain Bohra
f673fce975
[EaR]: Update KMS APIs to split encryption keys endpoints ()
* [EaR]: Update KMS APIs to split encryption keys endpoints

Description
  diff-1: Address review comments

Major changes proposed:
1. Extend fdbserver to allow parsing two endpoints for encryption at-rest
support: getEncryptionKeys, getLatestEncryptionKeys
2. Update RESTKmsConnector to do the following:
 2.1. Split the getLatest and getCipher requests.
 2.2. "domain_id" for point lookup marked as 'optional'

Testing

devRunCorrectness - 100K
2023-01-09 10:55:53 -08:00
A.J. Beamon
f999623bb1 Add a tenant lookup interface and use it when starting transactions 2023-01-06 15:51:12 -08:00
Hui Liu
e3bf79cf71 Add correctness test for blob restore 2023-01-04 11:10:34 -08:00
Meng Xu
a1d513b355 Fix: Exclusion stuck because DD cannot build new teams
Bug behavior:
When DD has zero healthy machine teams but more unhealthy machine teams
than the max machine teams DD plans to build, DD will stop building
new machine teams. With zero healthy machine teams (and zero healthy
server teams), DD cannot find a healthy destination team to relocate data.
When data relocation stops, exclusion stops progressing and gets stuck.

The bug happens when we *shrink* a k-host cluster by
first adding k/2 new hosts,
then quickly excluding all old hosts.

Fix:
Let DD build temporary extra teams to relocate data.
The extra teams will be cleaned up later by DD's remove extra teams logic.
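
The bug condition and the fix described above can be sketched roughly as follows. This is a minimal illustration only: `TeamCounts`, `teamsToBuildBefore`, `teamsToBuildAfter`, and the `extraTeams` parameter are hypothetical names, not the data distributor's actual types or knobs.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical snapshot of machine-team counts (illustrative, not FDB's types).
struct TeamCounts {
    int healthy;   // machine teams that can accept relocated data
    int unhealthy; // machine teams that contain excluded or failed machines
};

// Behavior described as the bug: building stops once the total number of
// teams (healthy + unhealthy) reaches the planned maximum.
int teamsToBuildBefore(const TeamCounts& c, int maxTeams) {
    return std::max(0, maxTeams - (c.healthy + c.unhealthy));
}

// Behavior described as the fix: with no healthy team and no remaining budget,
// build a few temporary extra teams so relocation has a destination.
// The extra teams are assumed to be removed later by separate cleanup logic.
int teamsToBuildAfter(const TeamCounts& c, int maxTeams, int extraTeams) {
    assert(maxTeams >= 0 && extraTeams >= 0);
    int normal = teamsToBuildBefore(c, maxTeams);
    if (c.healthy == 0 && normal == 0) {
        return extraTeams;
    }
    return normal;
}
```

With 0 healthy and 12 unhealthy teams against a maximum of 10, the first function yields 0 new teams (relocation is stuck), while the second yields the extra-team allowance.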

Simulation test:
There is no simulation test to cover the cluster expansion scenario.
To simulate this behavior as closely as possible, we intentionally overbuild all possible
machine teams to trigger the condition that the number of unhealthy teams is larger than
the maximum number of teams DD wants to build later.
2022-12-19 15:28:01 -08:00
Xiaoxi Wang
919c512cdc fix wiggler state setting 2022-12-15 12:14:40 -08:00
Xiaoxi Wang
ab4778bd19 Merge branch 'main' of https://github.com/apple/foundationdb into feature/main/ppwLoadBalance 2022-12-15 11:36:20 -08:00
He Liu
2024237e5d
Fetch checkpoint as key-value pairs ()
* Allow multiple keyranges in CheckpointRequest.
Include DataMove ID in CheckpointMetaData.

* Use UID dataMoveId instead of Optional<UID>.

* Implemented ShardedRocks::checkpoint().

* Implementing createCheckpoint().

* Attempted to change getCheckpointMetaData*() for a single keyrange.

* Added getCheckpointMetaDataForRange.

* Minor fixes for NativeAPI.actor.cpp.

* Replace UID CheckpointMetaData::ssId with std::vector<UID>
CheckpointMetaData::src;

* Implemented getCheckpointMetaData() and completed checkpoint creation
and fetch in test.

* Refactoring CheckpointRequest and CheckpointMetaData

rename `dataMoveId` as `actionId` and make it Optional.

* Fixed ctor of CheckpointMetaData.

* Implemented ShardedRocksDB::restore().

* Tested checkpoint restore, and added range check for restore, so that
the target ranges can be a subset of the checkpoint ranges.

* Added test to partially restore a checkpoint.

* Refactor: added checkpointRestore().

* Sort ranges for comparison.

* Cleanups.

* Check restore ranges are empty; Add ranges in main thread.

* Resolved comments.

* Fixed GetCheckpointMetaData range check issue.

* Refactor CheckpointReader for CF checkpoint.

* Added CheckpointAsKeyValues as a parameter for newCheckpointReader.

* PhysicalShard::restoreKvs().

* Added `ranges` in fetchCheckpoint.

* Added RocksDBCheckpointKeyValues::ranges.

* Added ICheckpointIterator and implemented for RocksDBCheckpointReader.

* Refactored OpenAction for CheckpointReader, handled failure cases.

* Use RocksDBCheckpointIterator::end() in readRange.

* Set CheckpointReader timeout and other Rocks read options.

* Implementing fetchCheckpointRange().

* Added more CheckpointReader tests.

* Cleanup.

* More cleanup.

* Resolved comments.
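
The restore range check mentioned in the list above (target ranges may be a subset of the checkpoint ranges) could look roughly like the sketch below. It assumes half-open ranges, non-overlapping checkpoint ranges, and that each target range must fit inside a single checkpoint range; `KeyRange` and `rangesCovered` are hypothetical stand-ins, not the actual FoundationDB types.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Half-open key range [begin, end); a stand-in for a real key range type.
struct KeyRange {
    std::string begin;
    std::string end;
};

// Returns true if every target range lies inside one checkpoint range.
// Checkpoint ranges are taken by value so they can be sorted for lookup.
bool rangesCovered(std::vector<KeyRange> checkpoint, const std::vector<KeyRange>& targets) {
    std::sort(checkpoint.begin(), checkpoint.end(),
              [](const KeyRange& a, const KeyRange& b) { return a.begin < b.begin; });
    for (const KeyRange& t : targets) {
        assert(t.begin <= t.end);
        // First checkpoint range that starts strictly after t.begin.
        auto it = std::upper_bound(checkpoint.begin(), checkpoint.end(), t.begin,
                                   [](const std::string& key, const KeyRange& r) { return key < r.begin; });
        if (it == checkpoint.begin()) {
            return false; // no checkpoint range starts at or before t.begin
        }
        --it; // last checkpoint range with begin <= t.begin
        if (t.end > it->end) {
            return false; // target extends past the containing range
        }
    }
    return true;
}
```

A target that spans two adjacent checkpoint ranges is rejected by this sketch; merging adjacent ranges first would relax that.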

Co-authored-by: He Liu <heliu@apple.com>
2022-12-14 17:44:47 -08:00
Andrew Noyes
dd0036f09c
Automatically clean old idempotency ids ()
* Add cleanIdempotencyIds

Delete zero or more idempotency ids older than minAgeSeconds

* Automatically clean idempotency ids from first proxy

* Add test for cleaner

* Fix formatting

* Address review comments
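
The age-based cleanup described above (delete zero or more idempotency ids older than minAgeSeconds) can be illustrated with an in-memory stand-in. This is only a sketch: `IdStore` and `cleanOldIds` are hypothetical, and the real cleaner operates on database keys rather than a `std::multimap`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

// Hypothetical store: commit timestamp in seconds -> idempotency id.
using IdStore = std::multimap<std::int64_t, std::string>;

// Removes every id strictly older than minAgeSeconds relative to nowSeconds.
// Returns the number of ids removed, which may be zero.
std::size_t cleanOldIds(IdStore& store, std::int64_t nowSeconds, std::int64_t minAgeSeconds) {
    assert(minAgeSeconds >= 0);
    // Entries with timestamp < cutoff have age > minAgeSeconds.
    const std::int64_t cutoff = nowSeconds - minAgeSeconds;
    auto end = store.lower_bound(cutoff);
    const std::size_t removed = static_cast<std::size_t>(std::distance(store.begin(), end));
    store.erase(store.begin(), end);
    return removed;
}
```

Keying the store by timestamp keeps the old entries contiguous, so a single range erase is enough.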
2022-12-14 14:24:24 -08:00
Xiaoxi Wang
16d11143fa add smallLoadThreshold logic and change knobs 2022-12-07 11:45:49 -05:00
Xiaoxi Wang
5d01d33531 Merge branch 'main' of https://github.com/apple/foundationdb into feature/main/ppwLoadBalance 2022-12-07 09:11:55 -05:00
Hui Liu
d76822bc12
Merge pull request from sfc-gh-huliu/applylog
blobrestore - apply mutation log
2022-12-02 13:03:13 -08:00
Jon Fu
c7b4f80ac6
Merge pull request from sfc-gh-jfu/build-cop-too-many-traces-2
Disable buggify for DD_QUEUE_MAX_KEY_SERVERS knob
2022-12-01 16:06:42 -08:00
Hui Liu
b38520ec4e blobrestore - apply mutation log 2022-12-01 14:16:18 -08:00
Jon Fu
dace2927a5 disable buggify for DD_QUEUE_MAX_KEY_SERVERS knob 2022-12-01 14:10:01 -08:00
Jingyu Zhou
c908f32d42 Increase buggified lock bytes for backup workers
To fix simulation failures where the knob value is too small.
2022-11-28 10:38:17 -08:00
Jingyu Zhou
2a74624cbd
Merge pull request from neethuhaneesha/suggest-compacts
Rocksdb suggest compact range checks
2022-11-17 20:01:19 -08:00
Yao Xiao
eda9d701dc
Set timeout to 5 sec. () 2022-11-17 19:56:26 -08:00
Ankita Kejriwal
4b52560dbc Merge branch 'main' of github.com:apple/foundationdb into monitorusage 2022-11-17 15:47:39 -08:00
Jingyu Zhou
186d30be95
Merge pull request from xumengpanda/mengxu/main-tmp
Increase memtable and writebuffer size for rocksdb simulation test
2022-11-17 14:45:14 -08:00
neethuhaneesha
b46c9eb67d Rocksdb suggest compact range checks 2022-11-17 14:20:18 -08:00
Steve Atherton
8e8c4b4489
Merge pull request from sfc-gh-sgwydir/ddsketch
Use DDSketch for sample data
2022-11-17 10:38:12 -08:00
Xiaoxi Wang
c89d74fa1b rewrite loadBytesBalanceRatio; rename knobs; update comments 2022-11-16 12:52:25 -08:00
Ankita Kejriwal
959bf9f4e7 Merge branch 'main' of github.com:apple/foundationdb into monitorusage 2022-11-15 18:32:21 -08:00
Ankita Kejriwal
7382975a6c Add a server knob to control the interval between the trace events 2022-11-15 15:29:28 -08:00
Xiaoxi Wang
907d7af966 solve merge conflict upstream/main 2022-11-15 14:59:31 -08:00
Sam Gwydir
99d4bacf5d Merge remote-tracking branch 'origin/main' into ddsketch 2022-11-15 13:19:42 -08:00
Hui Liu
00c270fc3f BlobManifest - add limits for getRange and transactions for resiliency with large manifest 2022-11-14 20:09:44 -08:00
Meng Xu
68eb129c71 RocksDB:Use knob to control readValueTimeout value in simulation 2022-11-14 16:24:28 -08:00
Meng Xu
b699ba4c23 Increase memtable and writebuffer size for rocksdb simulation test
The memtable and writebuffer sizes are too small in simulation, which causes
thousands of sst files and at least 6 levels of ssts.
Both make compaction slower in simulation and contribute to timeout errors.

After increasing the size, failure rate (timeout failures) when we only run rocksdb and
sharded rocksdb engines in simulation drops from 10 out of 332339 tests to 10 out of 497532 tests.
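
As a rough check of the figures above, the per-test timeout failure rates before and after the change work out to:

```latex
\[
\frac{10}{332339} \approx 3.01 \times 10^{-5},
\qquad
\frac{10}{497532} \approx 2.01 \times 10^{-5},
\qquad
\frac{10/497532}{10/332339} = \frac{332339}{497532} \approx 0.668
\]
```

That is, the per-test failure rate fell by roughly a third.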

For Apple devs who want to look into the Joshua details,
before the change, joshua ensemble id is 20221111-223720-mengxudebugrocks-505ede1c55664ddf
after the change, joshua ensemble id is 20221114-192042-mengxurocksdebugknobchange-1e4c047d112e9a38
2022-11-14 16:24:15 -08:00
Xiaoxi Wang
f997e73758 rename variable and solve some light comments 2022-11-14 13:11:27 -08:00
Steve Atherton
2b133e5bd1
Merge pull request from sfc-gh-satherton/pml-delay
Another PriorityMultiLock refactor and re-add StorageServer priority read locking without perf regression
2022-11-14 11:30:01 -08:00
Sam Gwydir
7ea42841a4 Merge remote-tracking branch 'origin/main' into ddsketch 2022-11-12 13:52:57 -08:00
Sam Gwydir
23706c957b Use DDSketch for Sample Data. 2022-11-12 13:45:46 -08:00
Steve Atherton
d61b88e6b3 Bug fix, Redwood's ioLock was not a Reference<>. Renamed several knobs, functions, and Redwood metrics for clarity. 2022-11-11 20:07:48 -08:00
Hao Fu
b7629ce56e
store and reset original Knob value in GetMappedRange test () 2022-11-11 15:20:26 -08:00
Hao Fu
7e78795284
add bytelimit for prefetch ()
* add bytelimit for prefetch

A fraction of byteLimit will be used as the limit to fetch index.
For the indexes fetched, fetch records for them in batch.

byteLimit always counts the index size, and also counts the record size if the record exists.
At least 1 index-record entry is returned, and the last entry is always included
even though adding it might exceed the limit.

There is a knob, STRICTLY_ENFORCE_BYTE_LIMIT. When it is set, records
will be discarded once the byteLimit is hit, even though they have been fetched.
Otherwise, the whole batch is returned.
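
One possible reading of the byte-limit rule above is sketched below. It is an interpretation of the commit message, not the storage server's code: `Entry` and `entriesToReturn` are hypothetical, and the `strictlyEnforce` flag stands in for the STRICTLY_ENFORCE_BYTE_LIMIT knob.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical index-record pair; recordBytes is 0 when no record exists.
struct Entry {
    std::size_t indexBytes;
    std::size_t recordBytes;
};

// Returns how many entries of an already fetched batch go back to the caller.
// Non-strict mode returns the whole batch. Strict mode stops at the entry
// that reaches the limit, which is still included, so a non-empty batch
// always yields at least one entry.
std::size_t entriesToReturn(const std::vector<Entry>& batch, std::size_t byteLimit, bool strictlyEnforce) {
    assert(byteLimit > 0);
    if (!strictlyEnforce) {
        return batch.size();
    }
    std::size_t used = 0;
    std::size_t count = 0;
    for (const Entry& e : batch) {
        used += e.indexBytes + e.recordBytes; // index size always counts
        ++count;                              // the crossing entry is kept
        if (used >= byteLimit) {
            break; // later entries are discarded even though they were fetched
        }
    }
    return count;
}
```

For three entries of 100 bytes each and a limit of 150 bytes, strict mode returns two entries and non-strict mode returns all three.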
2022-11-11 13:36:06 -08:00
Steve Atherton
e5e4457c6e Merge commit '8ad98dc9db2a1f9c3c1b44b22e0532bfa8c89ee5' into pml-delay
# Conflicts:
#	fdbserver/storageserver.actor.cpp
2022-11-11 11:49:31 -08:00
Xiaoxi Wang
b79268326a Merge branch 'main' of https://github.com/apple/foundationdb into feature/main/dataApi 2022-11-11 08:29:22 -08:00
Steve Atherton
d7b7af9e98 Change default read priority configuration to use a separate priority level per ReadType because the PriorityMultiLock now supports more priority ids with less overhead. 2022-11-11 00:02:47 -08:00
Ankita Kejriwal
3b4e0056a7 Add a knob to enable/disable storage quota enforcement 2022-11-10 19:35:21 -08:00
Xiaoge Su
c489dbcb26 Disable forwarding log of RocksDB to FDB by default 2022-11-10 18:58:19 -08:00
Xiaoxi Wang
ac923cfbcd add knobs; make ppw wait for byte load balance 2022-11-10 12:25:51 -08:00
Ankita Kejriwal
105648b888 Merge branch 'main' of github.com:apple/foundationdb into commitproxies 2022-11-09 17:38:30 -08:00