mirror of https://github.com/apple/foundationdb.git synced 2025-05-31 18:19:35 +08:00

76 Commits

Author SHA1 Message Date
Renxuan Wang
c69a07a858
Check in the new Hostname logic. ()
* Revert .

20220407-031010-renxuan-c101052c21da8346           compressed=True data_size=31004844 duration=4310801 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=1:04:15 sanity=False started=100047 stopped=20220407-041425 submitted=20220407-031010 timeout=5400 username=renxuan

* Revert .

20220407-051532-renxuan-470f0fe6aac1c217           compressed=True data_size=30982370 duration=3491067 ended=100002 fail_fast=10 max_runs=100000 pass=100002 priority=100 remaining=0 runtime=0:59:57 sanity=False started=100141 stopped=20220407-061529 submitted=20220407-051532 timeout=5400 username=renxuan

* Revert .

Remove resolving-related functionalities in connection string. Connection string will be used for storing purpose only, and non-mutable.

20220407-175119-renxuan-55d30ee1a4b42c2f           compressed=True data_size=30970443 duration=5437659 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=0:59:31 sanity=False started=100154 stopped=20220407-185050 submitted=20220407-175119 timeout=5400 username=renxuan

* Add hostname to coordinator interfaces.

* Turn on the new hostname logic.

* Add the corresponding change in config txns.

The most notable change is before calling basicLoadBalance(), we need to call tryInitializeRequestStream() to initialize request streams first.

Passed correctness tests.

* Return error when hostnames cannot be resolved in coordinators command.

* Minor fixes.
2022-04-27 21:54:13 -07:00
sfc-gh-tclinkenbeard
a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
Xiaoge Su
abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
Neethu Haneesha Bingi
decfb610ff Minor review comments. 2021-06-25 15:04:49 -07:00
Neethu Haneesha Bingi
66f2518405 exclude to work with any locality data match. 2021-06-23 18:03:27 -07:00
Neethu Haneesha Bingi
cbe714acd0 Status json schema update, includelocalities back for consistency check, review comments. 2021-06-23 18:03:27 -07:00
Neethu Haneesha Bingi
4ad5926a25 Snake naming of keys and added comments to all new functions. 2021-06-23 18:03:27 -07:00
Neethu Haneesha Bingi
73752f441b exclude locality:clang-format, ranged loops, documentation, tracking addStoragesever for exclusion. 2021-06-23 18:03:27 -07:00
Neethu Haneesha Bingi
62355571d0 exclude servers based on locality match 2021-06-23 18:03:27 -07:00
Evan Tschannen
272e649a3c The checkSafeExclusions function only ensures the exclusion is safe from the storage server perspective, but does not confirm it is safe in terms of the tlog replication 2021-03-23 13:31:16 -07:00
FDB Formatster
df90cc89de apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-03-10 10:18:07 -08:00
David Youngworth
d64cf8b9e3 Merge branch 6.3 into master 2020-11-17 11:22:45 -08:00
Jon Fu
cc13ef08bd Sort the failed sets before modifying them in attempts to make changes consistent 2020-11-12 16:26:34 -05:00
sfc-gh-tclinkenbeard
7f0d14c8e4 Modernize/refactor workloads directory 2020-10-04 22:29:07 -07:00
sfc-gh-tclinkenbeard
6c726ba8dd Improve ConfigurationResult and CoordinatorsResult type safety 2020-09-27 15:29:15 -07:00
sfc-gh-tclinkenbeard
0814841827 Replace NULL with nullptr in fdbserver 2020-09-20 11:31:49 -07:00
Meng Xu
17056b2a24
Fix comment grammar typo
Co-authored-by: Xin Dong <jiangzian1987dx@gmail.com>
2020-07-30 10:02:11 -07:00
Meng Xu
a2089b354a RemoveServersSafely:Safety check toKill1 to avoid cluster getting stuck
toKill1 and toKill2 are random subsets of all processes. If we simply kill all processes in toKill1 or toKill2,
we may kill so many processes that the cluster becomes unavailable and gets stuck.

Just as toKill2 is already modified when killing it would make the cluster unavailable,
we should do the same thing for toKill1.
2020-07-28 21:07:31 -07:00
Meng Xu
7f559bc712 Cleanup code and apply clang-format
Self code review
2020-03-16 15:08:32 -07:00
Meng Xu
1513df22f3 AutoQuorumChange:Exclude unreliable node from coordinator in simulation 2020-03-16 14:39:25 -07:00
Meng Xu
15c48b9e19 Add event for getDesired coordinators 2020-03-16 09:40:35 -07:00
Meng Xu
bd345f85db ConsistencyCheck:Fix failure due to address inconsistency between process and worker
With TLS, a worker (or process) can have a TLS address and non-TLS address.
When a process is created in simulation, the primary address is TLS by default.
The non-TLS one is the TLS address port plus one.

In a connection between two workers, if their primary addresses do not enable
or disable TLS together, one worker will swap its primary address and secondary address
so that the TLS config of the two endpoints can match.

After the swap, the primary address may no longer be the TLS one that was assigned
when the process was created. The swap only happens for the worker, not for the
process struct in simulation.

This swap can cause worker->address != process->address.
In the checkForExtraDataStores actor, we use worker->address to check if a process
is killable and use process->address to kill the process. The inconsistency
can cause simulation to kill a protected process that is not killable, which leads
to simulation failure.
2020-03-10 21:07:16 -07:00
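
The address mismatch described in bd345f85db can be illustrated with a minimal sketch. All names here (AddrSketch, AddrPairSketch, makeSimulatedProcess, swapIfTlsMismatch) are hypothetical and are not FoundationDB types; the only behavior taken from the commit message is that the non-TLS port is the TLS port plus one and that the worker side may swap primary and secondary. Matching on either address is shown as one possible way to compare, not as the fix the commit actually applied.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical stand-in for a network address with a TLS flag.
struct AddrSketch {
    std::string ip;
    uint16_t port;
    bool tls;
    bool operator==(const AddrSketch& o) const {
        return ip == o.ip && port == o.port && tls == o.tls;
    }
};

// Hypothetical pair of addresses owned by one process or worker.
struct AddrPairSketch {
    AddrSketch primary;
    AddrSketch secondary;

    // Swap so that the primary address matches the peer's TLS setting.
    void swapIfTlsMismatch(bool peerUsesTls) {
        if (primary.tls != peerUsesTls && secondary.tls == peerUsesTls) {
            std::swap(primary, secondary);
        }
    }

    // True if `a` is either of the two addresses.
    bool contains(const AddrSketch& a) const {
        return a == primary || a == secondary;
    }
};

// Per the commit message: primary is TLS, non-TLS is the TLS port plus one.
inline AddrPairSketch makeSimulatedProcess(const std::string& ip, uint16_t tlsPort) {
    return AddrPairSketch{ AddrSketch{ ip, tlsPort, true },
                           AddrSketch{ ip, static_cast<uint16_t>(tlsPort + 1), false } };
}
```

In this sketch, a worker copied from a process and then swapped for a non-TLS peer no longer has the same primary address as the process, but a lookup through contains() still identifies it as the same endpoint.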
Jon Fu
3de7ae5b0c Added size assertion in test workload 2019-11-08 09:39:25 -08:00
Jon Fu
f7b3686fc7 fixed bug in maintaining kill set size 2019-11-05 11:27:10 -08:00
Jon Fu
b1fd6b4443 addressed review comments 2019-10-18 09:43:25 -07:00
Jon Fu
6e1af6b2d9 changed check in movekeys for matching of srcSet and intendedTeam 2019-10-10 10:58:28 -07:00
Jon Fu
ac7369d27c Changed logic and reordered swap of coordinator exclusion in workload 2019-10-09 10:22:42 -07:00
Jon Fu
6fc3ef17fb included stricter checks when adding coordinator to the workload's kill set 2019-10-08 13:32:57 -07:00
Jon Fu
0f0a6c5431 reworked retry/timeout logic in workload to avoid forcefully putting db in broken state 2019-10-07 10:18:19 -07:00
Jon Fu
f721a444ae moved ordering of coordinator exclusion to fill all containers 2019-09-30 15:43:44 -07:00
Jon Fu
09c48cf3ab use management api to get coordinators instead of simulator 2019-09-27 12:14:36 -07:00
Jon Fu
4a69e43fe1 fixed mechanism to get coordinators from simulator processes 2019-09-26 16:10:37 -07:00
Jon Fu
061c98c13d explicitly exclude a coordinator if buggified 2019-09-26 15:13:08 -07:00
Jon Fu
450a09e117 Code Review Changes 2019-09-24 15:48:50 -07:00
Jon Fu
779ed5cc6c added timeout retry limit to safetycheck in workload 2019-09-09 09:38:34 -07:00
Jon Fu
d6e0c460f1 adjusted range in picking random subset of excluded servers 2019-08-27 14:39:44 -07:00
Jon Fu
202900bd79 adjusted priority of relocateShard requests if team contains failed server 2019-08-27 14:39:44 -07:00
Jon Fu
3666c0c776 added more trace lines and added timeout to safety check in test workload 2019-08-27 14:39:44 -07:00
Jon Fu
04d514c483 added a wait to check for master proxies changed and put in a few more trace events 2019-08-27 14:39:44 -07:00
Jon Fu
b9c73632e7 adjusted workload exclusions and addressed a few pre-existing bugs 2019-08-27 14:39:44 -07:00
Jon Fu
00c2025d4b fixed removeKeys impl, adjusted test workload, and introduced extra safety checks to NativeAPI and proxy 2019-08-27 14:39:44 -07:00
Jon Fu
807b02551e updated help message and changed existing workload to use mark as failed feature 2019-08-27 14:39:43 -07:00
Vishesh Yadav
d9a8657096 fdbcli: Add no_wait option in exclude command to avoid blocking
RESOLVES 
2019-07-18 13:07:31 -07:00
A.J. Beamon
5f55f3f613 Replace g_random and g_nondeterministic_random with functions deterministicRandom() and nondeterministicRandom() that return thread_local random number generators. Delete g_debug_random and trace_random. Allow only deterministicRandom() to be seeded, and require it to be seeded from each thread on which it is used. 2019-05-10 14:01:52 -07:00
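
The pattern named in 5f55f3f613 (accessor functions returning thread_local generators, with only the deterministic one seedable and seeding required per thread) can be sketched as follows. This is an illustration under assumptions, not the FoundationDB implementation: the *Sketch names, the use of std::mt19937_64, and the exception thrown when unseeded are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <random>
#include <stdexcept>

// One generator slot per thread; empty until that thread seeds it.
inline std::unique_ptr<std::mt19937_64>& deterministicSlotSketch() {
    thread_local std::unique_ptr<std::mt19937_64> slot;
    return slot;
}

// Hypothetical seeding call: affects only the calling thread.
inline void setDeterministicSeedSketch(uint64_t seed) {
    deterministicSlotSketch().reset(new std::mt19937_64(seed));
}

// Returns the calling thread's deterministic generator.
// Throws if this thread has not seeded it yet.
inline std::mt19937_64& deterministicRandomSketch() {
    std::unique_ptr<std::mt19937_64>& slot = deterministicSlotSketch();
    if (!slot) {
        throw std::logic_error("deterministic generator not seeded on this thread");
    }
    return *slot;
}

// The nondeterministic generator seeds itself and exposes no seeding call.
inline std::mt19937_64& nondeterministicRandomSketch() {
    thread_local std::mt19937_64 gen{ std::random_device{}() };
    return gen;
}
```

Reseeding with the same value on the same thread replays the same sequence, which is the property a deterministic simulation depends on.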
mpilman
c008e16c81 Defer formatting in traces to make them cheaper
This is the first part of making `TraceEvent` cheaper. The main idea is
to defer calls to any code that formats strings. These are the main
changes:

- TraceEvent::detail now takes a c-string instead of std::string for
  literals. This prevents unnecessary allocations if the trace is not
  going to be printed in the first place (for example for SevDebug).
  Before that, `detail` expected a `std::string` as key, which meant that
  any string literal would be copied on each call.
- Templates Traceable and SpecialTraceMetricType. These templates can be
  specialized for any type that needs to be printed. The actual
  formatting will be deferred to after the `enabled` check. This
  provides two benefits: (1) if a TraceEvent is disabled, we don't pay
  for the formatting and (2) TraceEvent can trace types that it doesn't
  know about.
- TraceEvent::enabled will be set in the constructor if the Severity is
  passed. This will make sure that `TraceEvent::init` is not called.
- `TraceEvent::detail` will be inlined. So for disabled TraceEvent
  calls, a call to detail will only introduce an if-branch, which is much
  cheaper than a function call.
2019-04-05 13:12:19 -07:00
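
The deferral idea in c008e16c81 can be sketched in a few lines. This is a simplified illustration, not the real TraceEvent: TraceEventSketch, TraceableSketch, SeveritySketch, ExpensiveSketch and formatCalls are hypothetical names, and the minimum severity is passed to the constructor only to keep the example self-contained. It shows the three points from the message: keys are C strings, formatting goes through a specializable template, and a disabled event skips formatting after a single branch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

enum class SeveritySketch { Debug = 5, Info = 10, Warn = 20 };

// Specialize this template for every type that should be traceable.
template <class T>
struct TraceableSketch;

template <>
struct TraceableSketch<int> {
    static std::string toString(int v) { return std::to_string(v); }
};

// A type with "expensive" formatting; the counter records how often it runs.
struct ExpensiveSketch { int id; };
inline int& formatCalls() { static int n = 0; return n; }

template <>
struct TraceableSketch<ExpensiveSketch> {
    static std::string toString(const ExpensiveSketch& e) {
        ++formatCalls();
        return "Expensive#" + std::to_string(e.id);
    }
};

class TraceEventSketch {
public:
    // `enabled_` is decided once, in the constructor.
    TraceEventSketch(SeveritySketch severity, SeveritySketch minSeverity, const char* type)
      : enabled_(severity >= minSeverity), type_(type) {}

    // Key is a C string, so a literal is never copied into a std::string
    // unless the event is enabled. Formatting happens after the check.
    template <class T>
    TraceEventSketch& detail(const char* key, const T& value) {
        if (!enabled_) return *this; // the only cost for a disabled event
        fields_.emplace_back(key, TraceableSketch<T>::toString(value));
        return *this;
    }

    bool enabled() const { return enabled_; }
    const char* type() const { return type_; }
    const std::vector<std::pair<std::string, std::string>>& fields() const { return fields_; }

private:
    bool enabled_;
    const char* type_;
    std::vector<std::pair<std::string, std::string>> fields_;
};
```

Because detail() is a template defined in the class body, the compiler can inline it, and a disabled event never reaches TraceableSketch<T>::toString.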
Vishesh Yadav
57832e625d net: Support IPv6
- NetworkAddress now contains IPAddress object which can be either
IPv4 or IPv6 address. 128 bits are used even for IPv4 addresses,
however only 32 bits are used when using/serializing IPv4 address.

- ConnectPacket is updated to store IPv6 address. Backward compatible
with old format since the first 32 bits of the IP address field are used
for serialization of IPv4.

- Mainly updates rest of the code to use IPAddress structure instead
of plain uint32_t.

- IPv6 address/port pairs should be represented as `[ip]:port` as per
convention. This applies to both cluster files and command line
arguments.
2019-03-04 14:12:41 -08:00
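
The `[ip]:port` convention from 57832e625d can be sketched with two string helpers. These are hypothetical functions (formatEndpoint, parseEndpoint), not the FoundationDB parser, and they do not validate the IP text itself; they only show why the brackets are needed: an IPv6 address contains colons, so an unbracketed form cannot be split into host and port unambiguously.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Format an endpoint, bracketing the host when it contains ':' (IPv6).
inline std::string formatEndpoint(const std::string& ip, uint16_t port) {
    const bool isV6 = ip.find(':') != std::string::npos;
    return (isV6 ? "[" + ip + "]" : ip) + ":" + std::to_string(port);
}

// Parse "host:port" or "[host]:port". Returns false on malformed input.
// The host text is returned as-is; it is not checked to be a valid IP.
inline bool parseEndpoint(const std::string& s, std::string& host, uint16_t& port) {
    std::string h, p;
    if (!s.empty() && s.front() == '[') {
        const std::string::size_type close = s.find(']');
        if (close == std::string::npos || close + 1 >= s.size() || s[close + 1] != ':') {
            return false;
        }
        h = s.substr(1, close - 1);
        p = s.substr(close + 2);
    } else {
        const std::string::size_type colon = s.rfind(':');
        // Without brackets the host must not contain ':' itself.
        if (colon == std::string::npos || s.find(':') != colon) {
            return false;
        }
        h = s.substr(0, colon);
        p = s.substr(colon + 1);
    }
    if (h.empty() || p.empty() || p.size() > 5) return false;
    unsigned long value = 0;
    for (char c : p) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + static_cast<unsigned long>(c - '0');
    }
    if (value > 65535) return false;
    host = h;
    port = static_cast<uint16_t>(value);
    return true;
}
```

With these helpers, an IPv6 loopback endpoint is written as `[::1]:4500`, while an IPv4 endpoint keeps the plain `127.0.0.1:4500` form.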
mpilman
999ea09bfd Use correct fwd decls in TesterInterface
Also TesterInterface.h -> TesterInterface.actor.h
2019-02-19 15:16:59 -08:00
mpilman
699216f713 Use fwd decls in workloads
Also workloads.h -> workloads.actor.h
2019-02-19 15:16:59 -08:00
mpilman
3f0fd2a20c Use fwd decls in WorkerInterface
Also WorkerInterface.h -> WorkerInterface.actor.h
2019-02-19 15:16:59 -08:00
mpilman
0bb60e5a3b Use proper fwd decl in NativeAPI
Also NativeAPI.h -> NativeAPI.actor.h
2019-02-19 15:16:59 -08:00