560 Commits

Author SHA1 Message Date
Suraj Gupta
d3fbad74a2 cleanup debugging 2021-12-10 14:00:34 -06:00
Suraj Gupta
cb568bbd55 Add watch on config key. 2021-12-10 14:00:34 -06:00
Suraj Gupta
fc3376fe8f Move client knob to database config for blob granules. 2021-12-10 14:00:34 -06:00
Evan Tschannen
130def7897 fix: make sure both conditions are true before better master exists is executed 2021-12-06 13:50:31 -08:00
Evan Tschannen
951bc4acd7 fix: do not call better master exists until the long lived stateless processes have settled into their desired locations 2021-12-06 13:12:27 -08:00
Evan Tschannen
a4fff19320 Revert "fix: long lived stateless processes need to be the last comparison criteria to avoid better master exists from changing behavior from the original recruitment"
This reverts commit b55b095ed050d494ca729cfbe1e0e92a2d64fa7f.
2021-12-05 20:26:13 -08:00
Evan Tschannen
b55b095ed0 fix: long lived stateless processes need to be the last comparison criteria to avoid better master exists from changing behavior from the original recruitment 2021-12-05 20:18:03 -08:00
Evan Tschannen
ff47013158 fix: check if the master has been killed while waiting for getNextBMEpoch 2021-12-04 17:08:23 -08:00
Evan Tschannen
557186ed17
Merge pull request #5909 from sfc-gh-jfu/jfu-cc-request-dbinfo
Change dbinfo broadcast to be explicitly requested by the worker registration message
2021-11-16 15:01:42 -08:00
Evan Tschannen
964d0209ca
Merge pull request #5637 from sfc-gh-ljoswiak/features/data-loss-prevention
Data loss protection when joining new cluster
2021-11-15 15:26:32 -08:00
Vaidas Gasiunas
51b8ccf7d3 Merge remote-tracking branch 'apple/master' into notify-client-lib-changes 2021-11-10 18:40:34 +01:00
Lukas Joswiak
74cf64fe0f Sync cluster ID through ServerDBInfo 2021-11-09 12:29:48 -08:00
Lukas Joswiak
3e2c65bb11 Allow tlog to join another cluster but retain its data 2021-11-09 12:29:48 -08:00
Lukas Joswiak
30867750b5 Add protection against storage and tlog data deletion when joining a new cluster 2021-11-09 12:29:47 -08:00
Jon Fu
00f4bd8536 Check ccInterface against serverDbInfo's cc and make broadcast unconditional for first registration 2021-11-08 12:43:02 -05:00
Jon Fu
4e8625ccc0 retain old behaviour along with explicit request 2021-11-03 17:23:07 -04:00
Jon Fu
59f0a2c3e5 Change dbinfo broadcast to be explicitly requested by the worker registration message 2021-11-03 15:51:21 -04:00
Evan Tschannen
ee00135a6b skip good recruitment errors when doing simulation only validation 2021-11-01 13:24:15 -07:00
Evan Tschannen
78e36e7590 fix: simulation only validation could throw errors which would impact the behavior of the cluster controller 2021-11-01 13:24:15 -07:00
Evan Tschannen
ddf235713e strengthen assert 2021-10-28 16:40:30 -07:00
Evan Tschannen
4d8ee2ed33 fix: simple recruitment could succeed with less than the required replication factor 2021-10-28 16:38:04 -07:00
Vaidas Gasiunas
875824b186 MVC2.0: Notify clients about relevant changes of client libraries 2021-10-27 23:43:40 +02:00
Josh Slocum
0ff8ddc2b6 Merge branch 'master' into blob_full_clean 2021-10-25 13:38:48 -05:00
A.J. Beamon
e882eb33fc Abstract the cluster file into a cluster connection record that can be backed by something other than the filesystem. 2021-10-22 11:05:18 -07:00
Josh Slocum
773886515e Merge branch 'feature-range-feed' into blob_full_clean 2021-10-22 11:07:51 -05:00
Josh Slocum
912ef76f1c cleanup before merge 2021-10-18 17:11:14 -05:00
Suraj Gupta
5466bdb569 Gate more entry points to BM recruitment. 2021-10-18 15:04:22 -04:00
A.J. Beamon
507a09893c
Add ClientCount to ClusterControllerMetrics (#5748) 2021-10-17 20:47:11 -07:00
Josh Slocum
5f0ec0612a Merge branch 'feature-range-feed' into blob_full 2021-10-13 15:44:35 -05:00
Suraj Gupta
2ec8781224 Merge knobs into one. 2021-10-13 14:00:37 -04:00
Suraj Gupta
5a6a052c55 Add a knob to gate blob-related work. 2021-10-13 09:48:02 -04:00
Zhe Wu
645cfc85a0 fix remote health variables declaration order 2021-10-07 21:54:25 -07:00
Zhe Wu
6540b6eec5 Some improvements for grey failure failover 2021-10-07 20:42:55 -07:00
Zhe Wu
c07a07dbbe Take uptime into account when making failover decision 2021-10-07 11:19:34 -07:00
Zhe Wu
62197faa46 Add more comments to the code 2021-10-07 11:19:34 -07:00
Zhe Wu
c0fbe5471f Implement the core logic of grey failure triggered failover 2021-10-07 11:19:34 -07:00
Suraj Gupta
282f9d35cd Cleanup comments and debugging code. 2021-10-04 11:07:08 -04:00
Suraj Gupta
4d54669ccd Recruit the blob workers via blob manager.
In this PR, the blob manager now recruits blob workers
(via communication with the cluster controller). Blob workers
are onboarded as blob worker processes enter the cluster.
2021-10-04 11:07:08 -04:00
Chang Liu
c523964ff7 Fix roll trace event issue
2021-09-24 09:53:32 -07:00
Chang Liu
8427e40cbe Fix roll trace event issue
2021-09-24 09:53:32 -07:00
Chang Liu
48990058a3 Fix roll trace event issue
2021-09-24 09:53:32 -07:00
Zhe Wu
e28fef6264 Fix failover logic in checkRecoveryStalled: failover only when remote is enabled 2021-09-23 20:12:22 -07:00
Suraj Gupta
5fa6c687d6 Add blob manager as a singleton. 2021-09-23 10:45:37 -04:00
Suraj Gupta
95c004f80b Add missing namespace qualifier to vector. 2021-09-22 16:57:04 -05:00
Suraj Gupta
2b9dfc1371 Simplify count increments. 2021-09-22 16:56:59 -05:00
Suraj Gupta
4530e746d6 Address PR comments.
Adds comment for constant and changes method name for `setonDb`.
2021-09-22 16:56:49 -05:00
Suraj Gupta
4a71f3d0f8 Fix recruitment bug. 2021-09-22 16:56:44 -05:00
Suraj Gupta
72edcd8d73 Address PR comments.
Revert knob name change, fix comparison between new and old
recruitments, and get rid of empty `if` block.
2021-09-22 16:56:34 -05:00
Suraj Gupta
10807ddebc Rename function to be more clear. 2021-09-22 16:56:27 -05:00
Suraj Gupta
0b6fecddbc Refactor logic for recruiting singletons.
This commit refactors the logic for recruiting singletons,
which is done by the ClusterController. This allows for far
easier additions of new singletons in the future, and also
cleans up the code.

Also, the logic for recruiting DD was changed to mirror
the logic for recruiting RK. Although the logic for RK
allows there to be many RKs existing at once, the moveKeysLock
mechanism used by DD still prevents multiple DDs existing at once.
2021-09-22 16:56:18 -05:00