208 Commits

Author SHA1 Message Date
Chaoguang Lin
7d365bd1bb
Remote ikvs debugging (#6465)
* initial structure for remote IKVS server

* moved struct to .h file, added new files to CMakeList

* happy path implementation, connection error when testing

* saved minor local change

* changed tracing to debug

* fixed onClosed and getError being called before init is finished

* fix spawn process bug, now use absolute path

* added server knob to set ikvs process port number

* added server knob for remote/local kv store

* implement simulator remote process spawning

* fixed bug for simulator timeout

* commit all changes

* removed print lines in trace

* added FlowProcess implementation by Markus

* initial debug of FlowProcess, stuck at parent sending OpenKVStoreRequest to child

* temporary fix for process factory throwing segfault on create

* specify public address in command

* change remote kv store knob to false for jenkins build

* made port 0 open random unused port

* change remote store knob to true for benchmark

* set listening port to randomly opened port

* added print lines for jenkins run open kv store timeout debug

* removed most tracing and print lines

* removed tutorial changes

* update handleIOErrors error handling to handle remote-ikvs cases

* Push all debugging changes

* A version where worker bug exists

* A version where restarting tests fail

* Use both the name and the port to determine the child process

* Remove unnecessary update on local address

* Disable remote-kvs for DiskFailureCycle test

* A version where restarting stuck

* A version where most restarting tests green

* Reset connection with child process explicitly

* Remove change on unnecessary files

* Unify flags from _ to -

* fix merging unexpected changes

* fix trace .error to .errorUnsuppressed

* Add license header

* Remove unnecessary header in FlowProcess.actor.cpp

* Fix Windows build

* Fix Windows build, add missing ;

* Fix a stupid bug caused by code dropped by code merging

* Disable remote kvs by default

* Pass the conn_file path to the flow process; it is not needed, but buildNetwork is difficult to tune

* serialization change on readrange

* Update traces

* Refactor the RemoteIKVS interface

* Format files

* Update sim2 interface to not clog connections between parent and child processes in simulation

* Update comments; remove debugging symbols; Add error handling for remote_kvs_cancelled

* Add comments, format files

* Change method name from isBuggifyDisabled to isStableConnection; Decrease(0.1x) latency for stable connections

* Commit the IConnection interface change, forgot in previous commit

* Fix the issue that onClosed request is cancelled by ActorCollection

* Enable the remote kv store knob

* Remove FlowProcess.actor.cpp and move functions to RemoteIKeyValueStore.actor.cpp; Add remote kv store delay to avoid race; Bind the child process to die with parent process

* Fix the bug where one process starts storage server more than once

* Add a please_reboot_remote_kv_store error to restart the storage server worker if remote kvs died abnormally

* Remove unreachable code path and add comments

* Clang format the code

* Fix a simple wait error

* Clang format after merging the main branch

* Testing mixed mode in simulation if remote_kvs knob is enabled, setting the default to false

* Disable remote kvs for PhysicalShardMove which is for RocksDB

* Cleanup #include orders, remove debugging traces

* Revert the reorder in fdbserver.actor.cpp, which fails the gcc build

Co-authored-by: Lincoln <lincoln.xiao@snowflake.com>
2022-03-31 17:08:59 -07:00
A.J. Beamon
48447c2788 Add the tenant management module to CMakeLists. Don't test tenants before API version 710. 2022-03-25 14:35:16 -07:00
A.J. Beamon
b4cfcc10d3 Move Python tenant management to its own module 2022-03-25 11:36:35 -07:00
A.J. Beamon
77ce0f4fc7 Add a unit test in Python to exercise some of the tenant code. Add some comments to the allocate and delete tenant implementations. 2022-03-23 15:50:06 -07:00
A.J. Beamon
8b92d3fccd Use special keys to create/delete tenants 2022-03-23 14:46:56 -07:00
Jon Fu
8e848f16df Support tuples in python tenants 2022-03-23 14:46:55 -07:00
A.J. Beamon
ce03f5783d Add tenant support to Python 2022-03-23 14:46:55 -07:00
A.J. Beamon
3f7365c433 Remove test debugging lines 2022-03-21 10:23:22 -07:00
A.J. Beamon
a23add6bc4 Add fdbcli test for tenants. Add documentation for new fdbcli tenant commands. Various output cleanup. Fix limit parsing bug in listtenants command. Update gettenant output format. 2022-03-17 12:10:39 -07:00
Ray Jenkins
dd45805312
Merge branch 'apple:main' into threadname-issue-6064 2022-02-01 17:40:07 -06:00
Andrew Noyes
96cbfe668c
Fix flaky ctest tests (#6310)
* Use localhost cluster for trace_partial_file_suffix_test

This way we get a predictable 127.0.0.1 in the trace file name

* Skip suspend test if pidof is not available

* Avoid writing to closed trace log

Calling fdb_network_stop sends a "close" message to the trace thread,
but the network thread might still be running and sending "flush"
messages to the trace thread. This change ignores any flushes that
come after a close.

* Ensure unique ports for multi-process tests
2022-01-28 13:16:44 -08:00
Ray Jenkins
783cbb0aea Merge branch 'main' into threadname-issue-6064 2022-01-27 09:57:11 -06:00
Chaoguang Lin
3dad130e72 Disable setclass test for now 2022-01-26 16:42:13 -08:00
Ray Jenkins
9e1fd3cee5 Add comment about python thread API not necessarily setting underlying OS thread name. 2022-01-26 12:36:37 -06:00
Ray Jenkins
95d4497e2b Add python binding network thread name. 2022-01-25 13:20:31 -06:00
Ray Jenkins
d3055cc59a Use single transaction for setProcessClass and add fdbcli unit test. 2022-01-24 13:32:44 -08:00
Lukas Joswiak
8a6bb8611a Update Python libfdb_c paths 2022-01-11 09:34:20 -08:00
Andrew Noyes
32ebdc6da2 Log status json if cluster is unavailable in fdbcli tests 2021-12-22 15:23:05 -08:00
Andrew Noyes
38a97a2e8f Increase default timeout to 5 minutes for add_fdbclient_test 2021-12-22 15:23:05 -08:00
Andrew Noyes
1ce9c0faed Add sleep 1 after killing/suspending a process
So that it's more likely to actually deliver the message
2021-12-08 16:44:03 -08:00
Aaron Molitor
08b635d405 rename prerelease_string, replace PRERELEASE with SNAPSHOT 2021-11-29 15:11:20 -08:00
Chaoguang Lin
e2fa511036 Add option --api-version for fdbcli 2021-10-05 13:00:28 -07:00
Chaoguang Lin
0b9f32a7d2 Remove the unnecessary check in the end of setclass 2021-08-25 14:50:52 -07:00
Chaoguang Lin
a1c8217260 Move setclass test from single-process_test to multi-process_test 2021-08-25 13:04:01 -07:00
Chaoguang Lin
b6dc20875e Add test coverage for triggerddteaminfolog command 2021-08-25 10:39:00 -07:00
Chaoguang Lin
b00cefc243 Add a safe wait in the fdbcli setclass test 2021-08-25 10:38:01 -07:00
Chaoguang Lin
6b01363f45 Remove commented test; fix issues 2021-08-25 10:29:48 -07:00
Chaoguang Lin
ec1fcfba57 Add test coverage for profile command 2021-08-25 10:04:22 -07:00
Chaoguang Lin
68b41392a0 Change to use ArgumentParser, set env to use external client library in Popen, enable logging in all tests 2021-08-19 12:13:26 -07:00
Chaoguang Lin
775ac3e27c Format fdbcli_tests.py file 2021-08-17 10:15:35 -07:00
Chaoguang Lin
cc18cc742c Add fdbcli external client tests 2021-08-17 10:14:39 -07:00
Chaoguang Lin
4cc2042783 Update debugging logs 2021-08-13 15:00:52 -07:00
Chaoguang Lin
9553427619 try to fix exclude fdbcli test 2021-08-13 11:36:49 -07:00
Chaoguang Lin
3b9cb1a85a Re-enable exclude command ctest 2021-08-09 21:18:27 +00:00
Chaoguang Lin
10484c426c Disable advanceversion ctest 2021-08-05 19:31:33 +00:00
Chaoguang Lin
a32cff08eb Add comments for the change 2021-08-02 22:34:08 +00:00
Chaoguang Lin
20f0a5a1f2 Disable multiprocess fdbcli tests while debugging flakiness 2021-08-02 21:55:07 +00:00
Chaoguang Lin
e5933dee7e Add test coverage for throttle 2021-07-27 17:28:59 +00:00
Chaoguang Lin
6bf5df6cc5 Update comments in fdbcli_tests.py 2021-07-21 18:38:13 +00:00
Chaoguang Lin
0b6d43fa6f Fix exclude test and re-enable it in ctest 2021-07-21 18:36:05 +00:00
Chaoguang Lin
f48a2b52f1 Disable the exclude test for now, since it can sometimes time out 2021-07-20 20:44:46 +00:00
Chaoguang Lin
3552080266 Update some comments of the change 2021-07-15 16:38:04 +00:00
Chaoguang Lin
4659c028f5 Add test coverage for coordinators command 2021-07-15 08:18:37 +00:00
Chaoguang Lin
07882d809d Add test coverage for exclude command 2021-07-15 07:19:25 +00:00
Chaoguang Lin
932058e64b Add tests for fdbcli commands running against multi-process cluster 2021-07-14 22:37:07 +00:00
Chaoguang Lin
3d438dfe6d Update suspend test to avoid flaky results 2021-06-25 01:09:44 +00:00
Chaoguang Lin
cd594be0f8 Update the setclass test to use a random class type and the specific network address 2021-06-23 23:40:34 +00:00
Chaoguang Lin
c4c78410ed update comments 2021-06-17 18:36:33 +00:00
Chaoguang Lin
9a4bfd48aa Add test coverage for consistencycheck, cache_range, datadistribution, lock, unlock, setclass, suspend and all transaction related fdbcli commands 2021-06-17 00:28:07 +00:00
Chaoguang Lin
a5e69c269a remove unused header, fix the CMake rule 2021-06-04 20:44:49 +00:00