foundationdb

mirror of https://github.com/apple/foundationdb.git synced 2025-06-02 11:15:50 +08:00

Author	SHA1	Message	Date
Steve Atherton	f4e8854e8c	Merge pull request #8517 from sfc-gh-etschannen/feature-disk-queue-perf added a disk queue load generator	2022-10-28 14:15:55 -07:00
Jingyu Zhou	d672b1cbce	Merge pull request #8613 from sfc-gh-etschannen/fix-specific-unit-test Specific unit test should only run one test instead of all tests	2022-10-28 11:02:12 -07:00
Evan Tschannen	dd970a5c99	Specific unit test should only run one test instead of all tests	2022-10-28 10:55:20 -07:00
Evan Tschannen	51e2f8e74b	made the test clean up after itself	2022-10-28 09:26:48 -07:00
Andrew Noyes	0a15f081a1	Proactively clean up idempotency ids for successful commits (#8578) * Proactively clean up idempotency ids for successful commits This change also includes some minor changes from my branch working on an idempotency ids cleaner, that I'd like to get merged sooner rather than later. - Adding a timestamp to idempotency values - Making IdempotencyId an actor file - Adding commit_unknown_result_fatal - Checking idempotencyIdsExpiredVersion in determineCommitStatus - Some testing QOL changes * Factor out decodeIdempotencyKey logic * Fix formatting * Update flow/include/flow/error_definitions.h Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com> * Use KeyBackedObjectProperty for idempotencyIdsExpiredVersion * Add IDEMPOTENCY_ID_IN_MEMORY_LIFETIME knob * Rename ExpireIdempotencyKeyValuePairRequest Also add a code probe for the case where an ExpireIdempotencyIdRequest is received before the count is known, and add an assert * Fix formatting and add TODO for nwijetunga Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>	2022-10-28 09:07:54 -07:00
Jingyu Zhou	4d789b2fd9	Merge pull request #8602 from apple/revert-8498-mmmm Revert "Cancel watch when the key is not being waited"	2022-10-27 21:14:39 -07:00
Jingyu Zhou	49645a7755	Revert "Clean up unused comment in flow.h" This reverts commit 03b102d86aecbe700aa8402ae31d0431bfb0b2b9.	2022-10-27 19:46:05 -07:00
Jingyu Zhou	dc60f63f9b	Revert "Cancel watch when the key is not being waited" This reverts commit 639afbe62cc157a3428261bf8783088becc9ac13.	2022-10-27 19:46:05 -07:00
Jingyu Zhou	fbe9802be5	Revert "configurationMonitor does not need to check watch reference count" This reverts commit ab0f827058c21dfab66462c3ce8545c6eec6a6e5.	2022-10-27 19:46:05 -07:00
Jingyu Zhou	634bd529e7	Revert "Record the version of each watch" This reverts commit 4bd24e4d6460c5cf38117b89246561bb0d83e3ef.	2022-10-27 19:46:05 -07:00
Jingyu Zhou	19ae4e7eb7	Revert "Reformat source" This reverts commit ec47c261bf743e4ffefbea2e70641afdf8f16491.	2022-10-27 19:46:05 -07:00
Jingyu Zhou	e460933b52	Revert "Remove debugging output" This reverts commit 41d1d6404d933f0574d88d1fa2a68c642413bf4b.	2022-10-27 19:46:05 -07:00
Jingyu Zhou	e7fd3eda00	Revert "Update fdbclient/NativeAPI.actor.cpp" This reverts commit 812243bafab4b8cb9cad49c7c22f16063f39b37e.	2022-10-27 19:46:05 -07:00
Lukas Joswiak	9625efd5b9	Add comment about configuration database	2022-10-27 13:56:13 -07:00
Lukas Joswiak	8e76621653	Disable shared state updates on configuration database	2022-10-27 13:56:13 -07:00
Lukas Joswiak	91146a03f0	Write cluster ID to `ClientDBInfo` This enables clients to receive the cluster ID.	2022-10-27 13:56:13 -07:00
Lukas Joswiak	28540e5962	Format	2022-10-27 13:56:13 -07:00
Lukas Joswiak	a8f8757f77	Rename cluster ID key In FDB 7.1, this key was stored in the txnStateStore. In 7.2, it has been moved to the database. This was causing protocol compatibility issues during upgrades, so we need to rename the key.	2022-10-27 13:56:13 -07:00
Lukas Joswiak	02bc5edbf8	Avoid blocking in choose when	2022-10-27 13:56:13 -07:00
Lukas Joswiak	9d3c3b1efe	Remove cluster ID logic from individual roles The logic to determine the validity of a process joining a cluster now belongs on the worker and the cluster controller. It is no longer restricted to tlogs and storages, but instead applies to all processes (even stateless ones).	2022-10-27 13:56:13 -07:00
Lukas Joswiak	1fca3b7ddc	Modify how cluster ID tests are run in simulation	2022-10-27 13:56:13 -07:00
Lukas Joswiak	bba05b7c9b	Move cluster ID from txnStateStore to the database The cluster ID is now stored in the database instead of in the txnStateStore. The cluster controller will read it on boot and send it to all processes to persist.	2022-10-27 13:56:13 -07:00
Lukas Joswiak	5ca2b89bdf	Fix simulation issue where process switch was ignored The simulator tracks only active processes. Rebooted or killed processes are removed from the list of processes, and only get added back when the process is rebooted and starts up again. This causes a problem for the `RebootProcessAndSwitch` kill type, which wants to simultaneously reboot all machines in a cluster and change their cluster file. If a machine is currently being rebooted, it will miss the reboot process and switch command. The fix is to add a check when a process is being started in simulation. If the process has had its cluster file changed and the cluster is in a state where all processes should have had their cluster files reverted to the original value, the simulator will now send a `RebootProcessAndSwitch` signal right when the process is started. This will cause an extra reboot, but should correctly switch the process back to its original, correct cluster file, allowing the cluster to fully recover all clusters. Note that the above issue should only affect simulation, due to how the simulator tracks processes and handles kill signals. This commit also adds a field to each process struct to determine whether the process is being run in a DR cluster in the simulation run. This is needed because simulation does not differentiate between processes in different clusters (other than by the IP), and some processes needed to switch clusters and some simply needed to be rebooted.	2022-10-27 13:56:13 -07:00
Lukas Joswiak	f43011e4b7	Notify processes joining the wrong cluster And have these processes enter a "zombie" state where they cancel all their actors and then wait forever, refusing to do any additional work until they are manually handled by the operator.	2022-10-27 13:56:13 -07:00
Lukas Joswiak	72a97afcd6	Avoid recruiting workers with different cluster ID	2022-10-27 13:56:13 -07:00
Lukas Joswiak	a72066be33	Add simulation support for changing the cluster file	2022-10-27 13:56:13 -07:00
Jingyu Zhou	6e0835f8a8	Merge pull request #8599 from technmsg/main updated copyright year on web site	2022-10-27 13:36:56 -07:00
Xiaoge Su	812243bafa	Update fdbclient/NativeAPI.actor.cpp Co-authored-by: Jingyu Zhou <jingyuzhou@gmail.com>	2022-10-27 12:42:05 -07:00
Xiaoge Su	41d1d6404d	Remove debugging output	2022-10-27 12:42:05 -07:00
Xiaoge Su	ec47c261bf	Reformat source	2022-10-27 12:42:05 -07:00
Xiaoge Su	4bd24e4d64	Record the version of each watch Consider the following case: 1. A watch on key A is set, and the watchValueMap ACTOR, denoted X, starts waiting. 2. All watches are cleared due to a connection string change. 3. The watch on key A is restarted with watchValueMap ACTOR Y. 4. X receives the cancel exception and tries to dereference the counter. This causes Y to be cancelled: the reference count makes the watch terminate prematurely. Recording the version of each watch helps prevent this issue.	2022-10-27 12:42:05 -07:00
Xiaoge Su	ab0f827058	configurationMonitor does not need to check watch reference count	2022-10-27 12:42:05 -07:00
Xiaoge Su	639afbe62c	Cancel watch when the key is not being waited Currently, there is a cyclic reference: DatabaseContext -> WatchMetadata -> watchStorageServerResp -> DatabaseContext. If a watch is created in the DatabaseContext, then even after the corresponding wait ACTOR is cancelled, the WatchMetadata still holds a reference to the watchStorageServerResp ACTOR, which holds a reference to the DatabaseContext. In this situation, any DatabaseContext that holds a watch will not be automatically destructed, since its reference count will not drop to 0 until the watched value changes. Every time the cluster recovers, several watches are created, and when the cluster restarts, the unused DatabaseContext cannot be destructed because of these watches. With this patch, each wait on the watch is counted. When the watch is either triggered or cancelled, the corresponding count is reduced. If a watch is no longer being waited on, the watch is cancelled, effectively reducing the reference count of the DatabaseContext. This will hopefully fix the issue mentioned above. The code was tested by 1) manually changing the number of logs of a local cluster and observing the cluster recovery and the previous DatabaseContext being destructed; 2) a 100K joshua run, with 1 failure; the same test also fails on the current git main branch.	2022-10-27 12:42:05 -07:00
Xiaoge Su	03b102d86a	Clean up unused comment in flow.h	2022-10-27 12:42:05 -07:00
Alex Moundalexis	67049518b9	updated copyright year on web site	2022-10-27 15:05:52 -04:00
Nim Wijetunga	bf01d9b879	Bulk Setup Workload Improvements (#8573) * bulk setup workload improvements * fix workload * modify	2022-10-27 11:10:14 -07:00
Jingyu Zhou	fe66c026b4	Merge pull request #8598 from jzhou77/fix Fix restarting restore test failure	2022-10-27 10:44:17 -07:00
Josh Slocum	4d3553481f	Blob connection provider test (#8478) * Refactoring test blob metadata creation * Implementing BlobConnectionProviderTest * createRandomTestBlobMetadata supports blobstore and works outside simulation	2022-10-27 10:44:06 -05:00
Jingyu Zhou	6c0f890f78	Fix restarting restore test failure Old fdbserver may not set the "enableSnapshotBackupEncryption" key, thus we should allow the key to be not present.	2022-10-27 08:43:55 -07:00
Vaidas Gasiunas	c6adb3a98c	Building fdb_c_shim to a shared library (#8586)	2022-10-27 12:37:20 +02:00
Markus Pilman	2bf9c2f448	Merge pull request #8588 from sfc-gh-mpilman/bugfixes/fix-build-dependencies Fix AWS SDK build and removed check for old build system	2022-10-26 12:36:08 -06:00
Dennis Zhou	deeedfc3f8	Merge pull request #8537 from sfc-gh-dzhou/unblob blob: allow purge ranges to begin and end in unblobbified regions	2022-10-26 11:11:09 -07:00
Markus Pilman	989731f7f4	Fix AWS SDK build and removed check for old build system	2022-10-26 11:48:10 -06:00
Aaron Molitor	f620f391f5	make same change to Dockerfile.eks (from #8583)	2022-10-26 12:24:37 -05:00
Josh Slocum	623e6ef761	adding delay in bw forced shutdown to prevent crash races (#8552)	2022-10-26 12:22:41 -05:00
Nim Wijetunga	6f37f55917	Restore System Keys First in Backup/Restore Workloads (#8475) * system key restore ordering * restore system keys before regular data * atomic restore backup fix * change testing * fix compile error * fix compile issue * fix compile issues * Trigger Build * only split restore if encryption is enabled * revert knob changes * Update fdbserver/workloads/AtomicSwitchover.actor.cpp Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com> * Update fdbserver/workloads/AtomicSwitchover.actor.cpp Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com> * Update fdbserver/workloads/BackupCorrectness.actor.cpp Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com> * Update fdbserver/workloads/AtomicRestore.actor.cpp Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com> * add todo * strengthen check * separate system restore for atomic restore * address pr comments * address pr comments Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>	2022-10-26 09:38:27 -07:00
Josh Slocum	ab6953be7d	Blob Granule read-driven compaction (#8572)	2022-10-26 09:02:50 -07:00
Aaron Molitor	b8b7b46d8f	update kubectl and awscli	2022-10-26 10:52:05 -05:00
Marian Dvorsky	3c5d3f7a94	Fix SpanContext for GP:getLiveCommittedVersion (#8565) * Fix SpanContext for GP:getLiveCommittedVersion	2022-10-26 16:29:28 +02:00
Junhyun Shim	32099bfce5	Merge pull request #8564 from sfc-gh-jshim/enable-authz-benchmark-in-mako Enable authz/TLS-enabled benchmark in mako	2022-10-26 14:55:53 +02:00

23470 Commits