111 Commits

Author SHA1 Message Date
Ata E Husain Bohra
33ae398268
REST KmsConnector implementation (#6994)
* REST KmsConnector implementation

Description
  diff-1: Address review comments.
          Add utility interface to Platform namespace to
          create and operate on tmpfile
  diff-2: Address review comments
          Link Boost::filesystem to CMake build process

Major changes include:
1. Implement a REST-based KmsConnector.
2. Salient features of the connector:
 2.1. Required configurations are:
   a. Discovery KMS URLs - enable KMS discovery on bootstrap
   b. Endpoint path configuration to construct URI to fetch/refresh
      encryption keys
   c. Configuration to provide "validationTokens" to connect with
      external KMS. Patch implements file-based token validation scheme.
 2.2. On startup, RESTKmsConnector discovers KMS Urls and caches
      them in-memory. Extracts "validationTokens" based on input config.
 2.3. Expose endpoints to allow fetch/refresh of encryption keys.
 2.4. Defines JSON format to interact with external KMS - request &
      response payload format.
3. Extend Platform namespace with an interface to create and operate on
   tmp files.
4. Update Platform 'readFileBytes' and 'writeFileBytes' to leverage
   fstream supported implementation.

NOTE: KMS URLs fetched after initial discovery will be persisted using
      DynamicKnobs. It is TODO at the moment and shall be completed
      once DynamicKnobs is feature complete

Testing

Unit tests to validate the following:
1. Parsing of "validation tokens" logic.
2. Construction and parsing of REST JSON request and response strings.
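
Item 4 above mentions moving 'readFileBytes' and 'writeFileBytes' to an fstream-backed implementation. The commit message does not show the code, so the following is only a minimal sketch of what fstream-based helpers of that kind could look like. The names readFileBytesSketch and writeFileBytesSketch, the maxSize parameter, and the exception types are assumptions made for illustration and are not the Platform API.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: write `bytes` to `path`, replacing any existing file.
void writeFileBytesSketch(const std::string& path, const std::string& bytes) {
    assert(!path.empty());
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        throw std::runtime_error("cannot open for write: " + path);
    }
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    if (!out) {
        throw std::runtime_error("write failed: " + path);
    }
}

// Hypothetical helper: read the whole file, refusing files larger than maxSize.
std::string readFileBytesSketch(const std::string& path, std::size_t maxSize) {
    assert(!path.empty());
    // Open positioned at the end so tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) {
        throw std::runtime_error("cannot open for read: " + path);
    }
    const std::streamoff size = in.tellg();
    if (size < 0 || static_cast<std::size_t>(size) > maxSize) {
        throw std::runtime_error("file too large or unreadable: " + path);
    }
    std::string bytes(static_cast<std::size_t>(size), '\0');
    in.seekg(0);
    in.read(&bytes[0], static_cast<std::streamsize>(size));
    if (!in) {
        throw std::runtime_error("read failed: " + path);
    }
    return bytes;
}
```

Opening in binary mode keeps embedded NUL bytes and line endings intact, which matters when the file holds token or key material rather than text.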
2022-05-07 13:18:35 -07:00
Andrew Noyes
297d831192
Put guard pages next to fast alloc memory (#6885)
* Put guard pages next to fast alloc memory

I verified that we can now detect #6753 without creating tons of
threads.

* Use pageSize instead of 4096

* Don't include mmapInternal for windows
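
The idea of placing guard pages next to an allocation, and of querying the page size instead of hard-coding 4096, can be sketched as follows. This is not the code from the commit: allocateWithGuardPages and freeWithGuardPages are hypothetical names, and the sketch assumes a POSIX system with mmap and mprotect (the commit itself excludes Windows).

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: reserve `usablePages` read/write pages with one
// inaccessible guard page before and one after them.
void* allocateWithGuardPages(std::size_t usablePages) {
    assert(usablePages > 0);
    // Ask the OS for the page size rather than assuming 4096.
    const std::size_t pageSize = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t total = (usablePages + 2) * pageSize;

    // Map everything as PROT_NONE first, so both guard pages fault on access.
    void* base = mmap(nullptr, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return nullptr;
    }

    // Open up only the middle region for reading and writing.
    char* usable = static_cast<char*>(base) + pageSize;
    if (mprotect(usable, usablePages * pageSize, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, total);
        return nullptr;
    }
    return usable;
}

// Hypothetical helper: release a region returned by allocateWithGuardPages.
void freeWithGuardPages(void* usable, std::size_t usablePages) {
    const std::size_t pageSize = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    char* base = static_cast<char*>(usable) - pageSize;
    munmap(base, (usablePages + 2) * pageSize);
}
```

With this layout, a write that runs one byte past either end of the usable region touches a PROT_NONE page and faults immediately instead of silently corrupting neighbouring memory.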
2022-04-19 11:22:35 -07:00
Steve Atherton
38190ad7e7
Merge pull request #6737 from sfc-gh-satherton/fix-storage-timestamps
Change storage metadata and perpetual wiggle timestamps to double epoch seconds
2022-04-02 09:47:23 -07:00
Steve Atherton
6eb1c2ae48
Merge pull request #6574 from sfc-gh-satherton/redwood-rare-bugs
Rare correctness bug fixes in Redwood
2022-04-01 16:40:22 -07:00
Chaoguang Lin
7d365bd1bb
Remote ikvs debugging (#6465)
* initial structure for remote IKVS server

* moved struct to .h file, added new files to CMakeList

* happy path implementation, connection error when testing

* saved minor local change

* changed tracing to debug

* fixed onClosed and getError being called before init is finished

* fix spawn process bug, now use absolute path

* added server knob to set ikvs process port number

* added server knob for remote/local kv store

* implement simulator remote process spawning

* fixed bug for simulator timeout

* commit all changes

* removed print lines in trace

* added FlowProcess implementation by Markus

* initial debug of FlowProcess, stuck at parent sending OpenKVStoreRequest to child

* temporary fix for process factory throwing segfault on create

* specify public address in command

* change remote kv store knob to false for jenkins build

* made port 0 open random unused port

* change remote store knob to true for benchmark

* set listening port to randomly opened port

* added print lines for jenkins run open kv store timeout debug

* removed most tracing and print lines

* removed tutorial changes

* update handleIOErrors error handling to handle remote-ikvs cases

* Push all debugging changes

* A version where worker bug exists

* A version where restarting tests fail

* Use both the name and the port to determine the child process

* Remove unnecessary update on local address

* Disable remote-kvs for DiskFailureCycle test

* A version where restarting stuck

* A version where most restarting tests green

* Reset connection with child process explicitly

* Remove change on unnecessary files

* Unify flags from _ to -

* fix merging unexpected changes

* fix trac.error to .errorUnsuppressed

* Add license header

* Remove unnecessary header in FlowProcess.actor.cpp

* Fix Windows build

* Fix Windows build, add missing ;

* Fix a stupid bug caused by code dropped by code merging

* Disable remote kvs by default

* Pass the conn_file path to the flow process, though not needed, but the buildNetwork is difficult to tune

* serialization change on readrange

* Update traces

* Refactor the RemoteIKVS interface

* Format files

* Update sim2 interface to not clog connections between parent and child processes in simulation

* Update comments; remove debugging symbols; Add error handling for remote_kvs_cancelled

* Add comments, format files

* Change method name from isBuggifyDisabled to isStableConnection; Decrease(0.1x) latency for stable connections

* Commit the IConnection interface change, forgot in previous commit

* Fix the issue that onClosed request is cancelled by ActorCollection

* Enable the remote kv store knob

* Remove FlowProcess.actor.cpp and move functions to RemoteIKeyValueStore.actor.cpp; Add remote kv store delay to avoid race; Bind the child process to die with parent process

* Fix the bug where one process starts storage server more than once

* Add a please_reboot_remote_kv_store error to restart the storage server worker if remote kvs died abnormally

* Remove unreachable code path and add comments

* Clang format the code

* Fix a simple wait error

* Clang format after merging the main branch

* Testing mixed mode in simulation if remote_kvs knob is enabled, setting the default to false

* Disable remote kvs for PhysicalShardMove which is for RocksDB

* Cleanup #include orders, remove debugging traces

* Revert the reorder in fdbserver.actor.cpp, which fails the gcc build

Co-authored-by: Lincoln <lincoln.xiao@snowflake.com>
2022-03-31 17:08:59 -07:00
Steve Atherton
6744e9e4f9 Change timestamps used in storage server metadata and perpetual wiggle metrics to epoch seconds, stored as doubles, and stringified as either floating point epoch seconds or timestamp strings of the form "2013-04-28 20:57:01.000 +0000". 2022-03-30 18:57:06 -07:00
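
The timestamp string form quoted in the commit above can be produced from a double holding epoch seconds roughly as follows. This is an illustrative sketch only: formatEpochSeconds is a hypothetical name, the rounding to milliseconds is an assumption, and gmtime_r assumes a POSIX platform.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical helper: render epoch seconds (as a double) in the form
// "YYYY-MM-DD HH:MM:SS.mmm +0000", always in UTC.
std::string formatEpochSeconds(double epochSeconds) {
    assert(std::isfinite(epochSeconds));
    // Split into whole seconds and milliseconds.
    double whole = std::floor(epochSeconds);
    int millis = static_cast<int>(std::round((epochSeconds - whole) * 1000.0));
    if (millis == 1000) { // rounding carried into the next second
        whole += 1.0;
        millis = 0;
    }

    const std::time_t t = static_cast<std::time_t>(whole);
    std::tm utc{};
    gmtime_r(&t, &utc); // POSIX, thread-safe variant of gmtime

    char date[32];
    std::strftime(date, sizeof(date), "%Y-%m-%d %H:%M:%S", &utc);

    char out[48];
    std::snprintf(out, sizeof(out), "%s.%03d +0000", date, millis);
    return std::string(out);
}
```

Keeping the stored value as a double and formatting only at the edge means the same field can be emitted either as floating point epoch seconds or as a timestamp string, as the commit describes.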
Steve Atherton
2a52c76b7a Added INetwork::timer_int() for convenience. Clarified what timer_int() actually returns in header comments. 2022-03-30 14:47:24 -07:00
sfc-gh-tclinkenbeard
a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
vikasgupta8
595f50ce26 indentation corrected 2022-02-15 13:03:12 +00:00
root
9bf11ac2f3 Removed condition for arch64 2022-02-11 13:45:17 +00:00
root
459dd83583 resolved format error 2022-02-11 12:11:50 +00:00
root
15159a5deb resolved format error 2022-02-11 09:57:24 +00:00
vikasgupta8
edfff755bf added support for ppc64le 2022-02-11 06:17:15 +00:00
Xiaoxi Wang
6dc5921575
createdTime based storage wiggler (#6219)
* add storagemetadata

* add StorageWiggler;

* fix serverMetadataKey bug

* add metadata tracker in storage tracker

* finish StorageWiggler

* update next storage ID

* change pid to server id

* write metadata when seed SS

* add status json fields

* remove pid based ppw iteration

* fix time expression

* fix tss metadata nonexistence; fix transaction retry when retrieving metadata

* fix checkMetadata bug when store type is wrong

* fix remove storage status json

* format code

* refactor updateNextWigglingStoragePID

* separate storage metadata tracker and store type tracker

* rename pid

* wiggler stats

* fix completion between waitServerListChange and storageRecruiter

* solve review comments

* rename system key

* fix database lock timeout by adding lock_aware

* format code

* status json

* resolve code format/naming comments

* delete expireNow; change PerpetualStorageWiggleID's value to KeyBackedObjectMap<UID, StorageWiggleValue>

* fix omit start round

* format code

* status json reset

* solve status json format

* improve status json latency; replace binarywriter/reader to objectwriter/reader; refactor storagewigglerstats transactions

* status timestamp
2022-02-04 15:04:30 -08:00
Renxuan Wang
4a8e2a80e6 Improve/fix disk metrics.
1. Introduce processDiskReadSeconds and processDiskWriteSeconds, which stand for disk read/write times `since the last logging`. They can only be obtained on Linux and macOS, and will be 0 on Windows and FreeBSD;
2. Rename `busyTicks` to `IOMilliSecs`;
3. On FreeBSD, the metrics should be collected among all devices.
2022-01-27 14:40:32 -08:00
He Liu
2fb5c59440 Removed deprioritizeThread(). 2022-01-18 11:10:56 -08:00
He Liu
2c0c51dd6d Enabled setting thread pool priorities. 2022-01-10 17:50:56 -08:00
A.J. Beamon
264c75b9a6 Add some extra client logging details:
1. Add a trace event when a database is created and move the cluster file / connection string from ClientStart to the new trace event
2. Add a detail for the path to the image being loaded
3. Add a detail for whether a client library is primary or not
4. Set a thread name for each external client thread that includes the release version
2021-11-29 09:57:10 -08:00
Markus Pilman
0e6dd46f5a Correct code formatting 2021-11-11 08:38:25 -07:00
Markus Pilman
5af465aa29 FDB compiles on Apple Silicon 2021-11-10 20:05:38 -07:00
Tao Lin
fdb3b72e35 Introduce GetRangeAndFlatMap to push computations down to FDB
Re-introduce #5609
2021-11-09 13:52:28 -08:00
Tao Lin
586cc3b102
Revert "Introduce GetRangeAndFlatMap to push computations down to FDB" 2021-11-04 08:46:56 -07:00
Tao Lin
0853661d13 Introduce getRangeAndHop to push computations down to FDB 2021-11-03 13:21:16 -07:00
Xiaoge Su
cb9ee75d9b Fix the self-assign warning in Atomic.h
When compiling FDB using clang++, self-assign warning appears due to the
code

	pos = littleEndian32(pos);

in Atomic.h, which expands to

	pos = pos;

as littleEndian32 is defined as

        #define littleEndian32(value) value

This warning is not interesting, just annoying. Adding a
no-side-effect cast suppresses it.
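
The message does not show the exact cast that was added, so the following is only a sketch of the pattern. The macro littleEndian32Cast and the function normalizePosition are hypothetical names, and the use of static_cast to uint32_t is an assumption about what a no-side-effect cast could look like here.

```cpp
#include <cassert>
#include <cstdint>

// Identity macro for little-endian hosts, as quoted in the commit message.
#define littleEndian32(value) value

// Hypothetical variant: same value, but the explicit cast means the
// right-hand side is no longer a bare reference to the assigned variable.
#define littleEndian32Cast(value) static_cast<uint32_t>(value)

// Hypothetical caller standing in for the code in Atomic.h.
uint32_t normalizePosition(uint32_t pos) {
    // Expands to: pos = static_cast<uint32_t>(pos);
    pos = littleEndian32Cast(pos);
    return pos;
}
```

The cast changes neither the value nor the generated code on a little-endian host; it only changes the shape of the expression that the compiler's self-assignment check inspects.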
2021-08-31 23:54:56 -07:00
FDB Formatster
2c788c233d apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-08-27 17:07:47 -07:00
Lukas Joswiak
59d535149e Merge branch 'master' into fixes/alp6 2021-07-27 10:07:18 -07:00
sfc-gh-tclinkenbeard
826e70916c Merge remote-tracking branch 'origin/master' into config-db 2021-06-17 09:47:41 -07:00
sfc-gh-tclinkenbeard
399c2c96f0 Remove unnecessary std::string copies from flow 2021-06-09 11:40:01 -07:00
Lukas Joswiak
153de33f57 Revert "Merge pull request #4802 from sfc-gh-ljoswiak/revert/actor-lineage"
This reverts commit 6499fa178e8f65a22105c2cd062a67209b562973, reversing
changes made to 15126319577f915f28aa6308bbf066dc7ec992a2.
2021-06-04 13:31:55 -07:00
Lukas Joswiak
4ea760b2a9 Revert "Merge pull request #4136 from sfc-gh-mpilman/features/actor-lineage"
This reverts commit da41534618a2a1edbf6b0b760635175372a66294, reversing
changes made to e6300905d6f294c52ebd166f4714541b084f37b4.
2021-05-10 20:26:12 -07:00
sfc-gh-tclinkenbeard
e3be3bd90c Fix versionedMutationKey ordering bug and strengthen default ConfigurationDatabaseWorkload 2021-04-22 23:00:42 -07:00
Markus Pilman
09ddcb3bae remove old sample thread 2021-04-19 11:55:35 -06:00
Lukas Joswiak
2dfd420882 Add sampling profiler thread 2021-03-24 14:52:42 -07:00
FDB Formatster
7867cd454e apply clang-format to flow/Platform.h 2021-03-12 15:16:33 -08:00
A.J. Beamon
74f427d317 Change the macro that forbids exit() calls to be a static assertion 2021-03-12 14:47:19 -08:00
Vishesh Yadav
af36e52fdf Merge branch 'release-6.3-pre-format' into master-format
This merges release-6.3 branch right before it was fully formatted.
There were quite a few conflicts that are resolved here. CoroFlow had
a check for OOM errors introduced in 6.3, but didn't seem applicable in
the new implementation which seems to use boost.
2021-03-10 10:40:53 -08:00
Daniel Smith
4adce4eb83 Limit named thread support to Linux and add a comment documenting that. 2021-03-01 20:59:25 +00:00
Daniel Smith
179dea5a1b Name the RocksDB background threads 2021-03-01 20:35:55 +00:00
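
A minimal sketch of Linux-only thread naming, in the spirit of the two commits above. The helper nameCurrentThread is a hypothetical name and not the function used in the repository; on non-Linux platforms the sketch simply reports that no name was set.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>

// Hypothetical helper: name the calling thread on Linux, do nothing elsewhere.
// Returns true only if a name was actually applied.
bool nameCurrentThread(const std::string& name) {
    assert(!name.empty());
#if defined(__linux__)
    // Linux limits thread names to 15 characters plus the terminating NUL,
    // so longer names are truncated here instead of failing.
    const std::string truncated = name.substr(0, 15);
    return pthread_setname_np(pthread_self(), truncated.c_str()) == 0;
#else
    (void)name;
    return false;
#endif
}
```

Named threads show up in tools such as top -H and in debugger thread lists, which makes background work easier to attribute.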
sfc-gh-tclinkenbeard
5bfa6cea98 Merge remote-tracking branch 'origin/master' into misc-changes 2020-12-26 20:47:00 -04:00
sfc-gh-tclinkenbeard
d15441e85c Replace non-standard sealed with final 2020-12-08 09:09:30 -08:00
Markus Pilman
dae8ea24ad Move compiler definitions into config file 2020-11-25 15:06:59 -07:00
sfc-gh-tclinkenbeard
0ac08f6a9b Replace NULL with nullptr in flow 2020-09-20 11:31:49 -07:00
Evan Tschannen
a49cb41de7 Merge branch 'release-6.3'
# Conflicts:
#	CMakeLists.txt
#	cmake/ConfigureCompiler.cmake
#	fdbserver/Knobs.cpp
#	fdbserver/StorageCache.actor.cpp
#	fdbserver/storageserver.actor.cpp
#	flow/ThreadHelper.actor.h
#	flow/serialize.h
#	tests/CMakeLists.txt
2020-07-29 00:31:55 -07:00
Evan Tschannen
e0db748fb3
Merge pull request #3403 from satherton/tls-background-handshake
TLS handshaking in background threads
2020-07-27 10:55:00 -07:00
Daniel Smith
a88bbd6405 s/fake/declval/ 2020-07-15 23:33:01 +00:00
Steve Atherton
c3ce0034bf Background threads for TLS handshakes use a stack size based on a knob. Platform::startThread() now accepts a stack size. The generic thread pool implementation takes an optional stack size override which is used for each added thread. 2020-06-23 02:08:01 -07:00
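
Starting a thread with a caller-supplied stack size, as the commit above describes for Platform::startThread(), can be sketched with POSIX threads as follows. The function startThreadWithStackSize and its signature are hypothetical, and treating a stack size of 0 as "use the platform default" is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Hypothetical helper: start `func(arg)` on a new thread. A stackSize of 0
// keeps the platform default; any other value is requested explicitly.
bool startThreadWithStackSize(void* (*func)(void*), void* arg, std::size_t stackSize, pthread_t* out) {
    assert(func != nullptr && out != nullptr);
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) {
        return false;
    }
    bool ok = true;
    if (stackSize != 0) {
        // Fails if the size is below the platform minimum (PTHREAD_STACK_MIN).
        ok = pthread_attr_setstacksize(&attr, stackSize) == 0;
    }
    if (ok) {
        ok = pthread_create(out, &attr, func, arg) == 0;
    }
    pthread_attr_destroy(&attr);
    return ok;
}
```

A thread pool could pass the same optional override for every thread it adds, so that work with deep call stacks, such as TLS handshakes, gets a larger stack without changing the default for other threads.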
sfc-gh-tclinkenbeard
99bf993815 Replace BOOST_NOEXCEPT with noexcept 2020-06-09 22:39:19 -07:00
Kao Makino
c2e80fe47b Linux aarch64 port 2020-05-09 22:14:03 +00:00
Dave Cottlehuber
f6c5e207da flow: provide rdtsc if missing 2020-04-30 18:11:23 +00:00
Dave Cottlehuber
95bc24de11 flow: update headers and includes 2020-04-30 18:11:23 +00:00