125 Commits

Author SHA1 Message Date
Lukas Joswiak
7fb427f4ec Update fdbserver/GrvProxyServer.actor.cpp
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2022-07-22 10:37:29 -07:00
Lukas Joswiak
a26344b877 Update fdbserver/GrvProxyServer.actor.cpp
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2022-07-22 10:37:29 -07:00
Lukas Joswiak
8395c6cf3f Fix Tuple error 2022-07-22 10:37:29 -07:00
Lukas Joswiak
5fa8e3f7d0 Simplify refresh condition 2022-07-22 10:37:29 -07:00
Lukas Joswiak
703aa1d279 Mess with timeout values 2022-07-22 10:37:29 -07:00
Lukas Joswiak
40d403ed5f Reduce global configuration system key reads from proxy
Clients now poll the proxy for the latest global config for a specific
version. The proxy now periodically requests the latest global
configuration data and stores it in memory, enabling it to respond
immediately to clients with the appropriate version.
2022-07-22 10:37:29 -07:00
Lukas Joswiak
56dfdbda83 Add migration timeout 2022-07-22 10:37:29 -07:00
Lukas Joswiak
e024db85a0 Update fdbserver/GrvProxyServer.actor.cpp
Co-authored-by: A.J. Beamon <aj.beamon@snowflake.com>
2022-07-22 10:37:29 -07:00
Lukas Joswiak
7a48a53778 Perform migration automatically on proxy boot 2022-07-22 10:37:29 -07:00
Lukas Joswiak
2e99d5f6cc Batch global config refresh requests 2022-07-22 10:37:29 -07:00
Lukas Joswiak
c33e44b0f4 Proxy GlobalConfig reads through GRV proxies
Clients should avoid reading system keys unless authorized. Under global
config, each client reads from the system keyspace to check for new
global config keys. This commit moves these reads to a server role (the
GRV proxies) and sends the results back to GlobalConfig for an in-memory
update.
2022-07-22 10:37:29 -07:00
Markus Pilman
1de37afd52
Make TEST macros C++ only (#7558)
* proof of concept

* use code-probe instead of test

* code probe working on gcc

* code probe implemented

* renamed TestProbe to CodeProbe

* fixed refactoring typo

* support filtered output

* print probes at end of simulation

* fix missed probes print

* fix deduplication

* Fix refactoring issues

* revert bad refactor

* make sure file paths are relative

* fix more wrong refactor changes
2022-07-19 13:15:51 -07:00
Ray Jenkins
dc9e782ccc
OpenTelemetry Tracing Perf Fixes (#6990) 2022-05-02 14:56:51 -05:00
Ray Jenkins
1c5bf135d5
Revert "Migrate to OpenTelemetry tracing. (#6855)" (#6941)
This reverts commit 5df3bac110d9b5b88931b008b852433688bb7eb0.
2022-04-25 09:29:56 -05:00
Ray Jenkins
5df3bac110
Migrate to OpenTelemetry tracing. (#6855) 2022-04-20 09:26:37 -05:00
Sreenath Bodagala
f038f37513 - Do not invoke version vector related code on the sequencer and
GRVs when version vector feature is disabled.
2022-04-12 20:05:32 +00:00
Markus Pilman
16467262f0 Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-04-10 14:12:37 -06:00
Markus Pilman
bf956f5630 Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-04-07 13:29:27 -06:00
Jingyu Zhou
cfcf0f152c Merge branch 'main-4a085fc84' into vv
Fix Conflicts:
	fdbclient/NativeAPI.actor.cpp
	fdbserver/ClusterRecovery.actor.cpp
	fdbserver/MasterInterface.h
	fdbserver/masterserver.actor.cpp
	flow/error_definitions.h
2022-03-30 22:28:06 -07:00
Jingyu Zhou
00b57d4cce Merge branch 'main-67eba5ec7' into vv
Fix Conflicts:
	fdbclient/CommitProxyInterface.h
	fdbclient/NativeAPI.actor.cpp
	fdbclient/StorageServerInterface.h
	fdbserver/CommitProxyServer.actor.cpp
	fdbserver/storageserver.actor.cpp
2022-03-30 20:05:55 -07:00
Jingyu Zhou
e9659b5dd4 Merge branch 'master-PR-6500' into vv
Fix Conflicts:
	fdbclient/CommitProxyInterface.h
	fdbclient/NativeAPI.actor.cpp
	fdbserver/masterserver.actor.cpp
2022-03-30 14:53:49 -07:00
sfc-gh-tclinkenbeard
a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
Markus Pilman
118b53b7cf Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-03-17 12:06:44 +01:00
Jingyu Zhou
2acc222d9a Detect stale GRV proxy replies
Each GetReadVersionReply now has the GRV proxy ID that sent it so that the
client can detect old GRV proxy's replies and avoid apply stale version vector
changes.
2022-03-15 14:23:07 -07:00
Markus Pilman
bed799220a Addressed review comments, added test 2022-03-15 16:57:26 +01:00
sfc-gh-tclinkenbeard
8dcac2f76d Fix typos 2022-03-13 10:02:11 -03:00
Markus Pilman
8fac0081a8 Merge remote-tracking branch 'origin/main' into features/private-request-streams 2022-03-09 11:00:00 +01:00
A.J. Beamon
cdebda35ab
Merge pull request #5725 from sfc-gh-jfu/jfu-grv-cache
Add transaction option for clients to use cached read versions
2022-03-04 09:17:27 -08:00
A.J. Beamon
250a88e682 Enforce that trace event suppression calls happen first when using trace event call chaining. Fix various instances where we weren't following this requirement. 2022-02-24 12:25:52 -08:00
Jon Fu
ae071a7211 clean up debug trace lines 2022-02-22 17:16:11 -05:00
Markus Pilman
dc973fb67e Allow List and first test 2022-02-22 11:15:16 +01:00
Jon Fu
e0d3b0a488 format times in trace event 2022-02-16 16:55:14 -05:00
Jon Fu
6199faadc3 fix bug which constantly overwrote start time of last throttled queue 2022-02-16 16:52:38 -05:00
Jon Fu
5846dda410 temporary changes and extra traces for debugging 2022-02-16 15:05:25 -05:00
Jon Fu
6f1c3d50bb add debug traces for testing 2022-02-15 15:08:53 -05:00
Jon Fu
8129c4e21c simplify sidebandsingle workload and be stricter with batch throttling on rk 2022-02-14 13:58:56 -05:00
Jon Fu
a63d218e9d simplify test workload and adjust ratekeeper throttling strategy 2022-02-11 16:41:14 -05:00
Jon Fu
458e708272 addressed code review comments: renamed variables, small functional changes, style changes 2022-02-10 16:17:54 -05:00
Jon Fu
ec2bbf0343 clean up some more trace lines and leftover code snippets 2022-02-07 14:50:04 -05:00
Jon Fu
d8e7fea421 clean up some comments and debug changes 2022-02-02 14:03:32 -05:00
Jon Fu
915e2f6c1c Merge branch 'main' of github.com:apple/foundationdb into jfu-grv-cache 2022-01-20 16:17:20 -05:00
Dan Lambright
9544379cdf rebase 2022-01-20 11:12:33 -05:00
Ata E Husain Bohra
936bf5336a
Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine" (#6191)
* Revert "Revert "Refactor: ClusterController driving cluster-recovery state machine""

Major changes includes:
1. Re-revert Sequencer refactor commits listed below (in listed order):
1.a. This reverts commit bb17e194d9c9888e203421290959bd7f2c075d7f.
1.b. This reverts commit d174bb2e06bff01157d16c652073536c54d17f7f.
1.c. This reverts commit 30b05b469c87d9b526b427751c211fb5cf7ff9cd.

2. Update Status.actor to track ClusterController interface to track
   recovery status.
3. Introduce a ServerKnob to define "cluster recovery trace event"
   prefix; for now keeping it as "Master", however, it should allow
   smooth transition to "Cluster" prefix as it seems more appropriate.
2022-01-06 12:15:51 -08:00
Aaron Molitor
30b05b469c Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit dfe9d184ff5dd66bdbbc5b984688ac3ebb15b901.
2021-12-24 11:25:51 -08:00
Aaron Molitor
d174bb2e06 Revert "Refactor: ClusterController driving cluster-recovery state machine"
This reverts commit abd2959702b0027ab23b8d42d8082b79c3b197f3.
2021-12-24 11:25:51 -08:00
Ata E Husain Bohra
abd2959702 Refactor: ClusterController driving cluster-recovery state machine
diff-1: Address Jingyu's review comments

At present, cluster recovery process consists of following steps:
1. ClusterController clusterWatchDatabase actor recruits
   master/sequencer process.
2. Sequencer process implements the cluster recovery state machine,
   responsible to recruit all other processes as well restore the
   cluster state.

Patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. Simplified design where ClusterController recruits "sequencer"
   process like other worker processes compared to current scheme
   where "sequencer" process gets special treatment. In newer scheme
   sequencer is responsible for maintaining/providing
   "committed version" (as expected).
2. ClusterController is responsible for worker processes recruitment,
   the sequencer though orchestrating the recovery state machine, it
   need to reachout to the ClusterController for recruiting worker
   processes etc.

NOTE:
Patch has moved the recovery state machine code from
'sequencer' -> 'cluster-controller' process, however, necessary
updates were done for both functionality as well as performance
improvement reasons.

Next Steps:
Cluster recovery documentation will be updated in near future.
2021-12-22 14:06:27 -08:00
Ata E Husain Bohra
dfe9d184ff Refactor: ClusterController driving cluster-recovery state machine
At present, cluster recovery process consists of following steps:
1. ClusterController clusterWatchDatabase actor recruits
   master/sequencer process.
2. Sequencer process implements the cluster recovery state machine,
   responsible to recruit all other processes as well restore the
   cluster state.

Patch proposes a scheme where the cluster recovery state machine
is implemented and driven by the ClusterController process instead
of the Sequencer process.

Advantages of the scheme could be:
1. Simplified design where ClusterController recruits "sequencer"
   process like other worker processes compared to current scheme
   where "sequencer" process gets special treatment. In newer scheme
   sequencer is responsible for maintaining/providing
   "committed version" (as expected).
2. ClusterController is responsible for worker processes recruitment,
   the sequencer though orchestrating the recovery state machine, it
   need to reachout to the ClusterController for recruiting worker
   processes etc.

NOTE:
Patch has moved the recovery state machine code from
'sequencer' -> 'cluster-controller' process, however, necessary
updates were done for both functionality as well as performance
improvement reasons.

Next Steps:
Cluster recovery documentation will be updated in near future.
2021-12-22 14:06:27 -08:00
Jon Fu
2b0ade5250 Change throttling threshold to count loop iterations instead of time 2021-11-25 13:31:55 -05:00
Jon Fu
beae9ccfa1 tweak some knob default settings and trace formatting 2021-11-23 14:58:17 -05:00
Jon Fu
33ee5fa372 add tracing to proxy throttling check codepath 2021-11-22 13:00:53 -05:00