7696 Commits

Author SHA1 Message Date
Evan Tschannen
4640edf5d6 do not recruit satellite tlogs when usable regions=1 2020-03-13 10:24:52 -07:00
Evan Tschannen
243c268d9d Limit the amount of requests the proxy can queue up in memory 2020-03-13 10:17:49 -07:00
Alex Miller
04498cbc0e Make policy failures be reported as per 1s and not over 5s. 2020-03-13 02:49:06 -07:00
Alex Miller
d86a601b84 Add cluster.processes.id.network.tls_policy.hz to status.
This allows monitoring of TLS policy failures, but one has to go scrape
for TLSPolicyFailure trace events to figure out why they're happening.
2020-03-13 02:46:10 -07:00
Alex Miller
75e2fffe5a Add a ProcessMetrics.TLSPolicyFailures metric
This reports the number of policy failures over the past 5s interval.
It also is step 1 towards getting this information into status json.
2020-03-13 02:24:37 -07:00
Alex Miller
0c558efcfe Add a tlsinfo command to fdbcli that prints the certificate chain.
This requires the certificate chain to load successfully, otherwise
fdbcli will error out at an earlier point due to Net2 not being able to
configure TLS.
2020-03-13 00:11:53 -07:00
A.J. Beamon
f7198c4ba3 Use the std::string constructor of StringRef, which will use the length of string correctly. 2020-03-12 12:35:08 -07:00
A.J. Beamon
6940d546f5 Fix bug where status is truncated when a null byte is included. This is implemented by escaping unprintable characters. 2020-03-12 12:27:53 -07:00
A.J. Beamon
555db50cd1 Avoid calling into SABTF so frequently. Use a cheaper call that only checks that shards exist. 2020-03-12 11:22:03 -07:00
A.J. Beamon
2466749648 Don't disallow allocation tracking when a trace event is open because we now have state trace events. Instead, only block allocation tracking while we are in the middle of allocation tracking already to prevent recursion. 2020-03-12 11:17:49 -07:00
A.J. Beamon
8cdf918316 Add logging when file identifiers don't match 2020-03-12 11:06:53 -07:00
tclinken
2017daf7d4 Ignore createDirectory error if directory already exists 2020-03-06 16:48:23 -08:00
Evan Tschannen
53d4798c75
Merge pull request #2789 from etschannen/post-release-cleanup-6.2.18
Post release cleanup 6.2.18
2020-03-06 13:58:56 -08:00
Evan Tschannen
f2cb743cfa update installer WIX GUID following release 2020-03-06 13:58:03 -08:00
Evan Tschannen
4020919185 update version to 6.2.19 2020-03-06 13:58:02 -08:00
Evan Tschannen
ca5782c2a4
Merge pull request #2788 from etschannen/release-6.2
updated documentation for 6.2.18
6.2.18
2020-03-06 11:20:49 -08:00
Evan Tschannen
15f1a75d4f updated documentation for 6.2.18 2020-03-06 11:16:10 -08:00
Evan Tschannen
dbfc0cbcc0
Merge pull request #2781 from alexmiller-apple/certificate-refresh
Refresh certificates used for handshaking when they change on disk
2020-03-06 11:12:04 -08:00
Alex Miller
f9969a853c Merge remote-tracking branch 'origin/certificate-refresh' into certificate-refresh 2020-03-06 11:10:05 -08:00
Alex Miller
188d9b8239 Don't swallow actor cancellation in certificate refreshing. 2020-03-06 11:09:17 -08:00
Alex Miller
9b760fae2d Rewrite all Errors into tls_errors if they happen as part of initializing TLS. 2020-03-06 11:06:19 -08:00
Alex Miller
1f56bf8933
Fix the build with success()
Co-Authored-By: A.J. Beamon <ajbeamon@users.noreply.github.com>
2020-03-06 10:15:04 -08:00
Evan Tschannen
98647a61fc
Merge pull request #2784 from ajbeamon/add-resolver-metrics
Add ResolverMetrics trace event
2020-03-06 09:38:30 -08:00
Evan Tschannen
a3662c68e8
Merge pull request #2786 from ajbeamon/add-new-transaction-metrics
Add more metrics to the TransactionMetrics event
2020-03-06 09:34:16 -08:00
A.J. Beamon
faf9101ad4
Update fdbserver/Resolver.actor.cpp
Co-Authored-By: Evan Tschannen <36455792+etschannen@users.noreply.github.com>
2020-03-06 09:20:38 -08:00
A.J. Beamon
d59e25b0dc
Merge pull request #2787 from etschannen/feature-log-router-logging
Added additional logging
2020-03-06 09:20:18 -08:00
Alex Miller
ac52b6b474 Rework a bit of error and exception handling.
I went back and dug through all of the "what functions can throw what
types", and made sane decisions about them.  boost errors are
aggressively translated into FDB ones, which might result in multiple
lines of logging about errors, but this is in infrequently run code, so
it should be fine.
2020-03-06 02:33:16 -08:00
Evan Tschannen
1076abdee5 fixed crash when interf was not created 2020-03-05 19:09:08 -08:00
Evan Tschannen
39050308ff lower accept batch size just to be conservative with the change 2020-03-05 18:17:49 -08:00
Evan Tschannen
1128666840 added additional logging on the log router 2020-03-05 18:17:06 -08:00
Alex Miller
ccef3f7d05 Attempt to fix TLS_DISABLED compiles. 2020-03-05 17:32:10 -08:00
Alex Miller
2d95a1e64d Implement certificate refreshing 2020-03-05 17:25:33 -08:00
A.J. Beamon
fd8d569b91 Fix a few typos. 2020-03-05 14:42:07 -08:00
A.J. Beamon
6479034645 Add more metrics to the TransactionMetrics event 2020-03-05 14:00:44 -08:00
A.J. Beamon
7fb8c3c080 Remove unused variable. 2020-03-05 11:38:30 -08:00
A.J. Beamon
effb6d2d49 Add ResolverMetrics trace event 2020-03-05 10:49:21 -08:00
Meng Xu
0cd4913990
Merge pull request #2783 from ajbeamon/release-note-update
Clarify fdbcli knob release note
2020-03-05 09:07:40 -08:00
A.J. Beamon
6d87800343 Clarify fdbcli knob release note to say that the knobs being set apply to the behavior of fdbcli. 2020-03-05 08:20:55 -08:00
Alex Miller
f657ca069e Fix bindings build breakage, because I hadn't built bindings. 2020-03-04 23:51:21 -08:00
Evan Tschannen
d005f3d3aa
Merge pull request #2780 from etschannen/release-6.2
Do not allow the cluster controller to mark any process as failed within 30 seconds of startup
2020-03-04 21:08:30 -08:00
Alex Miller
595dd77ed1 Merge remote-tracking branch 'upstream/release-6.2' into certificate-refresh 2020-03-04 20:25:42 -08:00
Alex Miller
9b5ef3416e Refactor TLSParams into TLSConfig + LoadedTLSConfig
The idea being that we keep around a TLSConfig that is the configuration
that the user has provided, and then when we want to initialize an SSL
context, we ask the TLSConfig to load all certificates and return us a
LoadedTLSConfig that is a concrete set of certificate bytes in memory.

initTLS now just takes the in-memory bytes and applies them to the ssl
context.

This is a large refactor to lead up into certificate refreshing, where we
will periodically check for changes to the certificates, and then
re-load them and apply them to a new SSL context.
2020-03-04 20:14:47 -08:00
Evan Tschannen
f3ac2c9180 renamed a variable 2020-03-04 18:49:21 -08:00
Evan Tschannen
45fb098ce0 updated release notes 2020-03-04 18:47:16 -08:00
Evan Tschannen
b3ea9d5896 Do not allow the cluster controller to mark any process as failed within 30 seconds of startup 2020-03-04 18:45:26 -08:00
Evan Tschannen
93becf1986
Merge pull request #2776 from etschannen/feature-dd-region-queue
When configured with multiple regions, the DD queue could start too many relocations
2020-03-04 18:42:39 -08:00
Evan Tschannen
b353ea1fd1 updated documentation 2020-03-04 17:40:59 -08:00
Evan Tschannen
e219c1671f Merge branch 'release-6.2' into feature-dd-region-queue
# Conflicts:
#	fdbserver/Knobs.h
2020-03-04 16:25:38 -08:00
Evan Tschannen
6d6f184e2f added a knob which reverts the new queue behavior 2020-03-04 16:23:49 -08:00
Evan Tschannen
b7834b2995
Merge pull request #2774 from etschannen/feature-dd-repopulate-priority
Make the DD priority of populating a region lower than machine failures
2020-03-04 16:15:18 -08:00