27838 Commits

Author SHA1 Message Date
Dan Lambright
9e08874463
Disable enable_version_vector_reply_recovery in version vector tests. (#12032)
Co-authored-by: Dan Lambright <hlambright@apple.com>
2025-03-18 19:52:29 -04:00
Jingyu Zhou
deda04b845
Fix a restore bug due to a race (#12037)
Found by simulation:
seed:  -f tests/slow/ApiCorrectnessAtomicRestore.toml -s 177856328 -b on
Commit: 51ad8428e0fbe1d82bc76cf42b1579f51ecf2773
Compiler: clang++
Env: Rhel9 okteto

applyMutations() had processed versions 801400000-803141392 and was about to call sendCommitTransactionRequest(),
which would update the apply begin version to 803141392 but DID NOT wait for the transaction to commit.

Then the apply end version was updated to 845345760, which picked up the PREVIOUS apply begin version 801400000
and started another applyMutations() with version range 801400000-845345760. Because the previous
applyMutations() had finished without waiting for the transaction to commit, the starting version
was wrong. As a result, this applyMutations() re-processed version range 801400000-803141392.

The test failed during re-processing, because mutations are missing for the overlapped range.

The fix is to wait for the transaction to commit in sendCommitTransactionRequest().

This bug probably affects DR as well.

See rdar://146877552

20250317-162835-jzhou-ff4c4d6d7c51bfed
2025-03-17 16:12:33 -07:00
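The race described in the restore fix above (#12037) can be illustrated with a small sketch. This is a toy asyncio model, not the Flow code: `Store`, `commit_begin`, and the `wait_for_commit` switch are hypothetical names introduced only for illustration, and the version numbers are the ones quoted in the commit message.

```python
import asyncio


class Store:
    """Hypothetical stand-in for the persisted apply begin version."""

    def __init__(self, begin):
        self.begin = begin

    async def commit_begin(self, version):
        # Simulate commit latency: the new begin version becomes visible later.
        await asyncio.sleep(0.01)
        self.begin = version


async def apply_mutations(store, end, wait_for_commit):
    begin = store.begin  # read the currently persisted begin version
    commit = asyncio.create_task(store.commit_begin(end))
    if wait_for_commit:
        await commit  # the fix: do not finish until the commit is done
    return (begin, end), commit


async def run(wait_for_commit):
    store = Store(begin=801400000)
    first, c1 = await apply_mutations(store, 803141392, wait_for_commit)
    second, c2 = await apply_mutations(store, 845345760, wait_for_commit)
    await asyncio.gather(c1, c2)
    return first, second


buggy = asyncio.run(run(wait_for_commit=False))
fixed = asyncio.run(run(wait_for_commit=True))
print("without waiting:", buggy)
print("with waiting:   ", fixed)
```

Without waiting, the second range starts again at 801400000 and overlaps the first; with waiting, it starts at 803141392.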
Vishesh Yadav
43938d91aa Add more gRPC/TLS tests 2025-03-17 12:16:06 -07:00
Vishesh Yadav
38b7d6ff66 Implement TLS support for Flow/gRPC
This patch adds TLS support for GrpcServer and AsyncGrpcClient by
implementing `GrpcCredentialsProvider` and using that to get channel
credentials. It adds `FlowGrpc` which is a flow global instance, and
initializes TLS credentials that are consistent with the ones provided
to FlowTransport.

- Added `FlowGrpc` to manage gRPC server initialization and TLS
  configuration globally.
- `GrpcCredentialsProvider` abstracts secure/insecure communications
  configurations for server/clients.
- Introduced `GrpcTlsCredentialProvider` for dynamic TLS certificate
  reloading from filesystem and `GrpcTlsCredentialStaticProvider` for
  static in-memory credentials.
- Updated `GrpcServer` to accept a `GrpcCredentialProvider`, enabling
  dynamic TLS credential management.
- Modified `fdbserver` to use `FlowGrpc::init()` for gRPC server
  initialization instead of `GrpcServer::initInstance()`, aligning it
  with FlowTransport behavior.
- Modified `GrpcServer::run()` to use the provided
  `GrpcCredentialProvider` instead of hardcoded insecure credentials.

Testing:
- Implemented a basic mTLS test case (`/fdbrpc/grpc/basic_tls`) to
  verify secure gRPC connections using
  `GrpcTlsCredentialStaticProvider`.

Todo:
- Generate certificates during test runs instead of statically.
- Add test for `GrpcTlsCredentialProvider` which reads keys/certs from
  filesystem and monitors changes.
- Verify peer rules/criteria, like FDB's --verify-peer feature.
2025-03-17 12:16:06 -07:00
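The provider split described in the TLS commit above (insecure, static in-memory, and reloading from the filesystem) can be sketched roughly as follows. This is a plain-Python analogue under stated assumptions: the class names are hypothetical, credentials are modelled as raw bytes rather than gRPC channel credentials, and reloading is done by comparing file modification times on each call, which may not be how `GrpcTlsCredentialProvider` monitors changes.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class CredentialsProvider(ABC):
    @abstractmethod
    def credentials(self):
        """Return (cert_bytes, key_bytes), or None for an insecure channel."""


class InsecureProvider(CredentialsProvider):
    def credentials(self):
        return None


class StaticTlsProvider(CredentialsProvider):
    """Credentials fixed in memory at construction time."""

    def __init__(self, cert, key):
        self._creds = (cert, key)

    def credentials(self):
        return self._creds


class FileTlsProvider(CredentialsProvider):
    """Re-reads the files whenever their modification time changes."""

    def __init__(self, cert_path, key_path):
        self._paths = (cert_path, key_path)
        self._stamp = None
        self._creds = None

    def _read(self, path):
        with open(path, "rb") as f:
            return f.read()

    def credentials(self):
        stamp = tuple(os.stat(p).st_mtime_ns for p in self._paths)
        if stamp != self._stamp:
            self._creds = tuple(self._read(p) for p in self._paths)
            self._stamp = stamp
        return self._creds


with tempfile.TemporaryDirectory() as d:
    cert_path = os.path.join(d, "cert.pem")
    key_path = os.path.join(d, "key.pem")
    for path, data in ((cert_path, b"cert-v1"), (key_path, b"key-v1")):
        with open(path, "wb") as f:
            f.write(data)

    provider = FileTlsProvider(cert_path, key_path)
    before = provider.credentials()

    with open(cert_path, "wb") as f:
        f.write(b"cert-v2")
    st = os.stat(cert_path)
    # Force a visibly newer mtime so the demo does not depend on clock granularity.
    os.utime(cert_path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    after = provider.credentials()
```

A server written against `CredentialsProvider` does not need to know which of the three variants it was given, which is the point of the abstraction.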
Zhe Wang
0e736c68e7
Allow One BulkloadTask Do Multiple Manifests (#12036) 2025-03-17 11:45:15 -07:00
Zhe Wang
d5946157f0
avoid shard merge when bulkload (#12035) 2025-03-15 13:20:51 -07:00
Michael Stack
911fdb4eaf
Add a bulkload user guide (#12033)
* Add a bulkload user guide

* Forgot to add a file

* Address review comments

---------

Co-authored-by: stack <stack@duboce.com>
2025-03-15 13:04:59 -07:00
Zhe Wang
eb0d9f2028
Add Verbose Level for BulkLoad Trace Events (#12034)
* add level for DDBulkLoad except for datadistribution

* nits
2025-03-14 19:15:41 -07:00
Zhe Wang
6ae46b4917
BulkLoadJob Should Not Schedule Completed BulkLoadTask (#12030)
* make bulkload job manager logic clear

* bypass task if the task has been completed

* improve scheduleBulkLoadJob
2025-03-14 14:52:33 -07:00
Vishesh Yadav
bc4dec8e5d Fix use-after-move issue in AsyncTaskExecutor
`getFuture()` should be called before post, as the `send`/`sendError`
operations in `ThreadReturnPromise` move the underlying Promise to
`tagAndForward()`.

Ideally, `ThreadReturnPromise` behavior should stay consistent with the
`Promise`. However, the problem is that it relies on the invariant that
there will always be one owner of its internal `Promise` which is either
itself or `tagAndForward()` -- which is necessary to ensure that only one
thread can operate on the Promise's internal state (ref count, flags
etc) and avoid race conditions.

This patch (1) makes sure that `post()` gets the future before posting,
(2) adds an ASSERT, as this should never happen, (3) adds
documentation for future users, and (4) adds a test case for potentially
fixing this in the future.
2025-03-14 14:11:22 -07:00
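The ordering constraint in the use-after-move fix above can be modelled in a few lines. Python has no move semantics, so this is only a toy model: `HandOffPromise` is a hypothetical class whose `send()` gives up ownership of its inner future, mimicking the way `send`/`sendError` move the underlying Promise away. The explicit check in `get_future()` plays the role of the ASSERT the patch adds.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class HandOffPromise:
    """Toy model: send() hands the inner Future away, like a C++ move."""

    def __init__(self):
        self._future = Future()

    def get_future(self):
        if self._future is None:
            raise RuntimeError("get_future() called after send()")
        return self._future

    def send(self, value):
        inner, self._future = self._future, None  # ownership leaves the promise
        inner.set_result(value)


def post(pool, task):
    promise = HandOffPromise()
    future = promise.get_future()  # take the future BEFORE the task can send
    pool.submit(lambda: promise.send(task()))
    return future


with ThreadPoolExecutor(max_workers=1) as pool:
    value = post(pool, lambda: 42).result(timeout=5)

# The wrong order: sending first leaves nothing to take a future from.
late = HandOffPromise()
late.send(1)
try:
    late.get_future()
    wrong_order_detected = False
except RuntimeError:
    wrong_order_detected = True
```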
Zhe Wang
9f5fdd0bea
Add BulkLoad Task Count to BulkLoad FDBCLI Command (#12029)
* change a event name

* add bulkload task count to fdbcli

* nit
2025-03-13 21:07:47 -07:00
Syed Paymaan Raza
bc8eca15e1
Initialize lastShardMove for recovery txn and in CommitBatchContext (#12027) 2025-03-13 15:25:46 -07:00
Dan Lambright
96be535a1f
ENABLE_VERSION_VECTOR_REPLY_RECOVERY can be T only if ENABLE_VERSION_VECTOR_TLOG_UNICAST is T (#12021)
* ENABLE_VERSION_VECTOR_REPLY_RECOVERY can be T only if ENABLE_VERSION_VECTOR_TLOG_UNICAST is T

* Respond to review comments

---------

Co-authored-by: Dan Lambright <hlambright@apple.com>
2025-03-13 18:15:13 -04:00
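The knob constraint in the commit above is a simple implication: reply recovery may be on only when TLog unicast is on. A minimal sketch of that check, assuming a hypothetical `Knobs` record (the commit does not say whether the real code asserts or silently corrects an inconsistent setting):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Knobs:
    enable_version_vector_tlog_unicast: bool
    enable_version_vector_reply_recovery: bool


def knobs_are_consistent(k):
    # reply_recovery implies tlog_unicast
    return (not k.enable_version_vector_reply_recovery) or k.enable_version_vector_tlog_unicast


# Evaluate all four combinations, keyed by (tlog_unicast, reply_recovery).
combos = {
    (u, r): knobs_are_consistent(Knobs(u, r))
    for u in (False, True)
    for r in (False, True)
}
```

Only the combination with reply recovery on and TLog unicast off is rejected.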
Michael Stack
74f447cbd9
More cleanup of bulk* cli (#12015)
Tighten up options for bulk*. Compound 'local' and 'blobstore' as 'dump'/'load'. Ditto for 'history'.

Make it so 'bulkload mode' works like 'bulkdump mode': i.e. dumps current mode.

If mode is not on for bulk*, ERROR in same manner as for writemode.

Make it so we can return bulk* subcommand specific help rather than dump all help when there is an issue.

Make the commands match in the ctest
2025-03-13 13:49:53 -07:00
Zhe Wang
10fecd0a4e
Add Error Message To BulkLoadJob Metadata (#12024)
* add error message to bulkload metadata

* remove TODOs and add error message for bulkload job manifest map creation failures

* nits
2025-03-13 10:02:39 -07:00
Zhe Wang
529db211b2
persist bulkload task count in bulkload job (#12022) 2025-03-12 15:35:26 -07:00
neethuhaneesha
15e35ed3a1
Adding rocksdb obsolete files size property in metrics. (#12017) 2025-03-12 15:11:34 -07:00
Sreenath Bodagala
56402dbbf1
Extend the unicast based recovery algorithm to do the replication policy check (#11996)
* - Extend the unicast based recovery algorithm to do the replication policy check

* - Review comments related changes

* - Review and compilation related changes
2025-03-12 18:01:38 -04:00
Syed Paymaan Raza
f40a4cfdad
Add replica comparison wrong_shard_server trace event (#12020)
* Add replica comparison wrong_shard_server trace event

* Suppress trace for 1 sec
2025-03-12 13:49:45 -07:00
neethuhaneesha
1d9f16bf07
Added compaction knobs. (#12018) 2025-03-12 12:38:23 -07:00
Syed Paymaan Raza
a9deb3ef6b
Request reboot for TSS data move conflicts in simulation (#12008)
* Request reboot for TSS data move conflicts in simulation

* Add comment

* Update storageserver.actor.cpp

Co-authored-by: Jingyu Zhou <jingyuzhou@gmail.com>

---------

Co-authored-by: Jingyu Zhou <jingyuzhou@gmail.com>
2025-03-12 11:56:54 -07:00
Michael Stack
32f2ef9104
Add checksumming across multipart upload and download (#11988)
* Hash file before uploading. Add it as tag after successful
multipart upload. On download, after the file is on disk,
get its hash and compare to that of the tag we get from s3.

* fdbclient/CMakeLists.txt
 Be explicit what s3client needs.

* fdbclient/S3BlobStore.actor.cpp
* fdbclient/include/fdbclient/S3BlobStore.h
 Add putObjectTags and getObjectTags

* fdbclient/S3Client.actor.cpp
 Add calculating checksum, adding it as
 tags on upload, fetching on download,
 and verifying match if present.
 Clean up includes.
 Less logging.

* fdbclient/tests/s3client_test.sh
 Less logging.

* Make failed checksum check an error (and mark non-retryable)

---------

Co-authored-by: michael stack <stack@duboce.com>
2025-03-11 21:34:59 -07:00
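The checksum flow described in the commit above (hash before upload, store the hash as a tag, re-hash after download and compare) might look like the sketch below. Everything here is an assumption made for illustration: `FakeObjectStore` is an in-memory stand-in rather than the S3 client, the tag key `checksum` is made up, and SHA-256 is chosen arbitrarily because the commit message does not name the hash.

```python
import hashlib
import os
import tempfile


def file_checksum(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not held in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


class FakeObjectStore:
    """In-memory stand-in for an object store that supports tags."""

    def __init__(self):
        self.objects = {}
        self.tags = {}


class ChecksumMismatch(Exception):
    """Treated as non-retryable: retrying would fetch the same bad bytes."""


def upload(store, key, path):
    digest = file_checksum(path)  # hash before uploading
    with open(path, "rb") as f:
        store.objects[key] = f.read()
    store.tags[key] = {"checksum": digest}  # tag only after the upload succeeded


def download(store, key, path):
    with open(path, "wb") as f:
        f.write(store.objects[key])
    expected = store.tags.get(key, {}).get("checksum")
    # Verify only if a checksum tag is present.
    if expected is not None and file_checksum(path) != expected:
        raise ChecksumMismatch(key)


with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "src.bin")
    dst = os.path.join(d, "dst.bin")
    with open(src, "wb") as f:
        f.write(b"bulk data" * 1000)

    store = FakeObjectStore()
    upload(store, "k", src)
    download(store, "k", dst)
    round_trip_ok = file_checksum(src) == file_checksum(dst)

    store.objects["k"] = b"corrupted"  # simulate corruption in the store
    try:
        download(store, "k", dst)
        corruption_detected = False
    except ChecksumMismatch:
        corruption_detected = True
```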
Zhe Wang
51ad8428e0
A Couple of Fixes for BulkDump and RangeLock (#12013)
* fix lockrange test and improve bulk dump

* fix bulkdump stuck error

* remove unnecessary yield when read/write bulk files

* remove unnecessary string creation in read/write bulk files
2025-03-11 15:58:01 -07:00
Dan Lambright
8c6f8c1403
Track shard moves for version vector (#11977)
* Track shard moves for version vector

* Don't broadcast to all TL when a different CP had a metadata mutation, unless on shard moves

* update lastShardMove on resolver

* Respond to review comments

---------

Co-authored-by: Dan Lambright <hlambright@apple.com>
2025-03-11 13:19:57 -04:00
Michael Stack
6ee6e0bd7f
Edit of bulkload/bulkdump cli. (#12012)
* fdbcli/BulkDumpCommand.actor.cpp
* fdbcli/BulkLoadCommand.actor.cpp
 Print out the bulkdump description rather than usage so user
 has a chance of figuring out what it is they entered incorrectly.
 Make bulkdump and bulkload align by using 'cancel' instead of
 'clear' in both and ordering the sub-commands the same for
 bulkload and bulkdump.  Add more help to the description.
 Bulkload was missing mention of the jobid needed
 when specifying a bulkload.
* documentation/sphinx/source/bulkdump.rst
 s/clearBulkDumpJob/cancelBulkDumpJob/

Co-authored-by: stack <stack@duboce.com>
2025-03-11 08:52:13 -07:00
Syed Paymaan Raza
610ab21936 Increase TLOG_MAX_CREATE_DURATION in simulation 2025-03-10 19:47:23 -07:00
Zhe Wang
79a38c1dc0
Fix RangeLock in BulkDump Test and Avoid Memory Copy For Async Read/Write Bulk Files (#12007) 2025-03-10 15:13:29 -07:00
Jingyu Zhou
91acbbc0a5
Merge pull request #12010 from jzhou77/fix
Set max_read_transaction_life_versions for KillRegionCycle.toml
2025-03-10 13:59:08 -07:00
Jingyu Zhou
082cded30a Set max_read_transaction_life_versions for KillRegionCycle.toml
Simulation found an assertion failure in SS:
	ASSERT(rollbackVersion >= data->storageVersion());

The reason is that the storage version is updated to a version larger than the
forced recovery version, because max_read_transaction_life_versions was only 1'000'000.
Also added debugging for cumulative checksum mutations.

See rdar://144550725

20250309-185039-jzhou-5145c65b0e8071b7
2025-03-09 11:37:46 -07:00
Zhe Wang
6156975979
Improve BulkLoad Test Coverage And Fix Bugs (#12009) 2025-03-08 20:26:51 -08:00
Jingyu Zhou
0a9940aa23
Merge pull request #12005 from vishesh/grpc-sim
Handle exceptions in `AsyncTaskExecutor` and don't use pointers for ThreadPromise in `AsyncTaskExecutor`
2025-03-08 11:05:02 -08:00
Syed Paymaan Raza
beba524f48
Never absorb wrong_shard_server in LoadBalance replicaComparison (#12006)
* Never absorb wrong_shard_server in LoadBalance replicaComparison

* Add comment

* Throw wrong_shard_server() instead of Error(error_code_wrong_shard_server)
2025-03-07 14:41:37 -08:00
Zhe Wang
21b87ef6c8
Improve Range Lock and Add Documentation (#11986)
* rangelock doc

* nits

* fix ci

* fix ci

* nits

* address comments

* nits

* nit

* make read lock exclusive

* fix

* fix CI

* improve doc

* fix bug

* address simulation failures

* fix bugs

* nits
2025-03-07 14:11:23 -08:00
Vishesh Yadav
4b9cb2f25e ThreadReturnPromise* in AsyncTaskExecutor don't need to be pointers 2025-03-06 19:00:23 -08:00
Vishesh Yadav
4836a2e9ff Handle Exceptions in AsyncTaskExecutor
Forwards FDB's `Error` type thrown by tasks in `AsyncTaskExecutor`. Any other kind of exception is
forwarded as `unknown_error()`.
2025-03-06 17:31:20 -08:00
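The forwarding rule in the commit above (FDB errors pass through, anything else becomes `unknown_error()`) can be sketched with a standard thread pool. `FdbError` and this `unknown_error()` are hypothetical Python stand-ins for FDB's `Error` type and error constructor, not the real API.

```python
from concurrent.futures import ThreadPoolExecutor


class FdbError(Exception):
    """Hypothetical stand-in for FDB's `Error` type."""

    def __init__(self, name):
        super().__init__(name)
        self.name = name


def unknown_error():
    return FdbError("unknown_error")


def run_task(task):
    try:
        return task()
    except FdbError:
        raise  # known error type: forward unchanged
    except Exception:
        raise unknown_error() from None  # anything else is normalised


def post(pool, task):
    return pool.submit(run_task, task)


def fails_with_fdb_error():
    raise FdbError("operation_failed")


def fails_with_other():
    raise ValueError("boom")


forwarded = []
with ThreadPoolExecutor(max_workers=2) as pool:
    for task in (fails_with_fdb_error, fails_with_other):
        try:
            post(pool, task).result()
        except FdbError as e:
            forwarded.append(e.name)
```

The caller only ever has to handle one error type, whatever the task threw.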
Jingyu Zhou
cbb605d282
Merge pull request #12003 from vishesh:flow-bug
Hold `ThreadReturnPromiseStream` reference when sending value/error
2025-03-06 13:47:34 -08:00
Jingyu Zhou
9542f37aec
Merge pull request #12004 from vishesh:promise-move
Add move constructors for `ThreadReturnPromise*` and delete copy constructors
2025-03-06 13:38:58 -08:00
Michael Stack
e1138c30ee
Make bulkload file reads and writes async and memory parsimonious (#11997)
* fdbclient/S3Client.actor.cpp
 Change field names so they are capitalized (convention).
 Add duration as field to traces.

* fdbserver/BulkLoadUtil.actor.cpp
 When the job-manifest is big, processing blocks
 for so long that getBulkLoadJobFileManifestEntryFromJobManifestFile
 fails.

* Make bulkload file reads and writes async and memory parsimonious.
In tests at scale, processing a large job-manifest.txt was blocking
and causing the bulk job to fail. This is part 1 of two patches.
The second is to address data copy added in the below when we
made methods ACTORs (ACTOR doesn't allow passing by reference).

* fdbserver/BulkDumpUtil.actor.cpp
 Removed writeStringToFile and bulkDumpFileCopy in favor of new methods
 in BulkLoadUtil. Made hosting functions ACTORs so they could wait on
 async calls.

* fdbserver/BulkLoadUtil.actor.cpp
 Added async read and write functions.

* fdbserver/DataDistribution.actor.cpp
 Making uploadBulkDumpJobManifestFile async made it so big bulkloads
 work.

* fix memory corruption in writeBulkFileBytes and fix read options in getBulkLoadJobFileManifestEntryFromJobManifestFile

* If read or write < 1MB, do it in a single read else do multiple read/writes

* packaging/docker/fdb-aws-s3-credentials-fetcher/fdb-aws-s3-credentials-fetcher.go
 Just be blunt and write out the credentials. Trying to figure out when the
 blob credentials have expired is error-prone.

Co-authored-by: michael stack <stack@duboce.com>
Co-authored-by: Zhe Wang <zhe.wang@wustl.edu>
2025-03-06 10:43:04 -08:00
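The size rule from the commit above ("if read or write < 1MB, do it in a single read else do multiple read/writes") can be sketched as below. This is a synchronous illustration only: the real code is asynchronous, and `read_all` and its chunk size are hypothetical.

```python
import io

ONE_MB = 1 << 20


def read_all(stream, size, chunk_size=ONE_MB):
    """Return (data, number_of_reads) for `size` bytes from `stream`."""
    if size < ONE_MB:
        return stream.read(size), 1  # small payload: one read is enough

    parts = []
    reads = 0
    remaining = size
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break  # stream ended early
        parts.append(chunk)
        remaining -= len(chunk)
        reads += 1
    return b"".join(parts), reads


small = b"a" * 1000
large = b"b" * (2 * ONE_MB + ONE_MB // 2)  # 2.5 MB

small_data, small_reads = read_all(io.BytesIO(small), len(small))
large_data, large_reads = read_all(io.BytesIO(large), len(large))
```

Bounding each read keeps any single step short, which matters when one long blocking step is what made the job fail.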
dependabot[bot]
7f7de83cb0
Bump jinja2 from 3.1.5 to 3.1.6 in /flow/protocolversion (#12002)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-06 08:38:01 -08:00
Vishesh Yadav
6740eb2a2f Add move constructors for ThreadReturnPromise* and delete copy constructors
Copy-constructor can be added back if necessary. Meanwhile, it's simpler to enforce a single copy of the
ThreadReturnPromise* family, and avoid scattering copies all over the place.
2025-03-06 07:54:30 -08:00
Vishesh Yadav
55edaf5ed1 Hold ThreadReturnPromiseStream reference when sending value/error
When a value/error is sent via `ThreadReturnPromiseStream` we assume that the underlying
`PromiseStream` will be alive when the client waits. However, if the last
`ThreadReturnPromiseStream` gets destroyed after sending values/end_of_stream(), the underlying
`PromiseStream` will be destroyed as well, resulting in `broken_promise`. This happens because the actual work of
sending the value/error is deferred on the main thread.

This is likely to happen because the sender has done its work and isn't supposed to check whether the client
got the value. Hence, there is little reason to keep the promise. Meanwhile, the client is free to read values
from its future whenever it needs to.

This patch just holds the reference to underlying `NotifiedQueue` by copying `PromiseStream` until
the value/error is sent. The test added would fail without this patch.
2025-03-06 02:00:34 -08:00
Jingyu Zhou
2ef72cb649
Merge pull request #11984 from vishesh/grpc-sim
grpc: AsyncTaskExecutor and basic server lifecycle
2025-03-05 22:18:22 -08:00
Syed Paymaan Raza
61c9a81257
Reduce some parameter values for StoreFrontTest (#11998) 2025-03-05 17:41:31 -08:00
Jingyu Zhou
8b0c36924d
Update 7.3.63 as the stable latest release (#11999) 2025-03-05 17:20:20 -08:00
Vishesh Yadav
6329672513 gRPC server life-cycle management and AsyncTaskExecution
This patch has two sets of changes:

- Whenever a service is registered with or removed from the server, we need to restart the gRPC server.
  GrpcServer provides some methods that can be used by worker actors so that the life of
  services registered by them can be tied to the life of the worker role itself.

- Replace asio::thread_pool with AsyncTaskExecutor both in client and server.
2025-03-05 15:17:30 -08:00
Vishesh Yadav
60a0276e08 AsyncTaskExecutor: lightweight wrapper for IThreadPool
This patch implements `AsyncTaskExecutor` for asynchronous execution of tasks in a separate thread
pool. We already have `IThreadPool`; however, its API is better suited for bigger tasks. This just
provides an easier-to-use API.

There is `AsyncTaskThread` which is similar in nature, but this is not re-wrapping IThreadPool hence
has ability to have multiple worker threads. We can potentially replace that with this component by
setting `num_threads = 1`.

TODO: Move this to `flow/include` instead of here.
2025-03-05 15:17:30 -08:00
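The idea of a thin "post a callable, get a future" wrapper over an existing thread pool, as described in the `AsyncTaskExecutor` commit above, has a close analogue in Python's standard library. The sketch below is that analogue only: the class body and method names are assumptions and do not mirror the Flow API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AsyncTaskExecutor:
    """Hypothetical Python analogue: a small facade over a thread pool."""

    def __init__(self, num_threads):
        self._pool = ThreadPoolExecutor(max_workers=num_threads)

    def post(self, fn, *args):
        # Run fn(*args) on a worker thread and return a future for its result.
        return self._pool.submit(fn, *args)

    def shutdown(self):
        self._pool.shutdown(wait=True)


# num_threads=1 gives single-worker behaviour, as the commit suggests
# for replacing a one-thread component.
executor = AsyncTaskExecutor(num_threads=1)
futures = [executor.post(lambda x: x * x, n) for n in range(5)]
squares = [f.result() for f in futures]
ran_on_main_thread = executor.post(threading.get_ident).result() == threading.get_ident()
executor.shutdown()
```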
Zhe Wang
2e47e97613
Fix DCC tester (#11995) 2025-03-05 13:41:08 -08:00
Zhe Wang
8142ebd029
Add BulkLoad History (#11992)
* add bulkload history

* address comments

* address comments
2025-03-04 18:50:08 -08:00
Dan Lambright
8bba38b180
Version vector: compute locations only once during commits. (#11924)
* During commits with version vector enabled, compute the location list only once, as recalculating could
generate a different random number, and hence a different set of locations.

* Respond to review comments.

* Select replicas from locations returned from resolver.

* Respond to review comments

---------

Co-authored-by: Dan Lambright <hlambright@apple.com>
2025-03-04 17:09:06 -05:00
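The "compute once" point in the version vector commit above comes down to caching the result of a randomized choice so that every later step sees the same set. A minimal sketch, assuming a hypothetical `CommitContext` and a made-up replica-selection function (the real selection logic is not shown in the commit message):

```python
import random


def pick_locations(replicas, k, rng):
    """Randomly choose k replicas; a fresh call may return a different set."""
    return sorted(rng.sample(replicas, k))


class CommitContext:
    """Caches the chosen locations for the lifetime of one commit."""

    def __init__(self, replicas, k, rng):
        self._replicas = replicas
        self._k = k
        self._rng = rng
        self._locations = None

    def locations(self):
        if self._locations is None:  # compute on first use only
            self._locations = pick_locations(self._replicas, self._k, self._rng)
        return self._locations


ctx = CommitContext(replicas=list(range(10)), k=3, rng=random.Random(7))
first = ctx.locations()
second = ctx.locations()
```

Both calls return the same list object, so later stages of the commit cannot disagree about where the data was sent.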
neethuhaneesha
5872ef711b
Temporarily disabling backup dry run request until the issue is fixed (#11991) 2025-03-03 15:52:30 -08:00