This PR includes a few stability fixes for Backup Worker
* Fixed memory bookkeeping issue in Backup Worker. Previously
it didn't release flow lock correctly when erasing messages.
* Added TLogServer fix to return 0 from poppedVersion() for
unrecognized log router tags.
Description
Given Configurable encryption has been checked in and being tested via
simulation for more than a month and also to avoid penalty of accessing
KNOBS in inline commit path, patch retires the KNOB and make
ConfigurationEncryption default EaR mode for FDB.
BlobCipher still supports the old format header and encryption semantics,
will remove the dead code as a followup PR.
Testing
devRunCorrectness - 100K
* EaR: Update ApiWorkload to validate encryption at-rest guarantees
Description
FDB encryption data at-rest guarantees if cluster is configured with feature
enabled, all data written to persistent disks shall be "encrypted". Given FDB
maintains multiple persistent storages during lifecycle of the data, the patch
proposes a scheme to validate the invariant via "simulation testing"
Patch proposes updating ApiCorrectness workload to do the following:
1. Client supplied params and/randomly enable the validation feature.
2. Validation when enabled, allows injecting a known "marker string"
to workload generated Key and Value data patterns.
3. On shutdown, if the validation is enabled, all test files are
scanned for the known "marker" pattern.
Simulation tests are already capable of doing the following:
1. Randomly select TenantMode (disabled/optional/required)
2. Randomly select EncryptionAtRestMode (cluster_aware/domain_aware)
Hence, the updates test all possible combinations are validated. Also,
'defaultTenant' is present to cover 'domain_aware' encryption use cases.
Testing
devRunCorrectness
devRetryCorrectness - ApiCorrectness & EncryptedBackupCorrectness
* EaR: Update encryption methods to make 'cipherHeaderKey' optional
Description
diff-1: Address review comments
Major changes includes:
1. Update BlobCipher Encrypt/Decrypt classes to make 'headerCipher' optional
2. Update GetEncryptionCipherKeys actor methods to make 'headerCipherKey' optional
3. Update the usage across all encryption participant methods
Testing
BlobCipherUnitTest
EnryptedBackupCorrecctness
BlobGranuleCorrectness*
devRunCorrectness - 100K
Commit proxy needs to fetch additional cipher keys post-resolution, since tenant ids for raw access requests and cross-tenant clear ranges are calculated after resolution.
Previously with EaR we always enable authentication (e.g. we encrypt Redwood pages). The authentication is a form of checksum, so dedicated page checksum was not needed. This PR adds back xxhash page checksum when authentication is disabled. Also change the knob to default disable authentication.
* [EAR]: Remove usage of EncryptDomainName for Encryption at-rest operations
Description
diff-1: Address review comments
EncryptDomainName is an auxillary information, given EAR encryption domain
matches with Tenants, EncryptDomainName maps to TenantName in the current
code. However, this mapping adds EAR depedency has multiple drawbacks:
1. In some scenarios obtaning consistent mapping of TenantId <-> TenantName
is difficult to maintain. For instance: StorageServer (SS) TLog mutation
pop loop, it is possible that same commit batch contains: TenantMap update
mutation as well as a Tenant user mutation. SS would parse TenantMap update
mutation (FDB System Keyspace encryption domain), process the mutation, but,
doesn't apply it to the process local TenantMap. SS then attempts to process,
Tenant user mutation and fails to decrypt the mutation given TenantMetadaMap
isn't updated yet.
2. FDB codebase uses EncryptDomainId matching TenantId, TenantName is used as
an auxillary information source and feels better to be handled by an
external KMS.
Major changes include:
1. EAR to remove TenantName dependency across all participating processes
such as: CommitProxy, Redwood, BlobGranule and Backup agent.
2. Update EKP and KmsConnector APIs to avoid relying on "domainName"
information being passed around to external KMS EAR endpoints.
Testing
devRunCorrectness - 100K
EncryptKeyProxyTest - 100K
EncryptionOps Test - 100K