Skip to content

Add extstore s3 driver#2907

Open
cconstable wants to merge 3 commits into
extstore-foundationfrom
extstore-s3-driver
Open

Add extstore s3 driver#2907
cconstable wants to merge 3 commits into
extstore-foundationfrom
extstore-s3-driver

Conversation

@cconstable

@cconstable cconstable commented Jun 10, 2026

Copy link
Copy Markdown

What changed?

  • Added the experimental S3 external storage driver and AWS SDK v2 adapter at contrib/temporal-payload-storage-s3/src/main/java/io/temporal/payload/storage/s3/
  • Added shared PayloadHasher for drivers to consistently hash payload content.

Why?

  • Enables Temporal payloads to be offloaded to S3 through the external storage driver API.

Checklist

  • Added focused S3 driver tests covering storage, retrieval, key format, error handling, validation, and README examples.

@cconstable cconstable changed the base branch from master to extstore-foundation June 10, 2026 18:59
@cconstable cconstable changed the title extstore s3 driver Add extstore s3 driver Jun 10, 2026
- Missing values (including a missing `run-id`) are encoded as `null`.
- `hex-digest` is lower-case SHA-256 hex (64 characters).

### Percent-encoding rules

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are encoding rules that all temporal s3 drivers should conform to. If we have a better place to put information like this I will move it, otherwise I'll put something similar in the READMEs of the other SDKs.


/** Computes payload hashes shared by external storage drivers. */
@Experimental
public final class PayloadHasher {

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any driver we add that creates content-addressable keys will need to hash the content somehow. I opted to put this in a shared place. Happy to rename or move if needed.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move this to the s3 package for now. As it is, it's publicly available to anyone using the SDK, and I think this should be an implementation detail.

@cconstable cconstable marked this pull request as ready for review June 10, 2026 20:21
@cconstable cconstable requested a review from a team as a code owner June 10, 2026 20:21
@cconstable

Copy link
Copy Markdown
Author

The actual changes here are only a few hundred lines. The vast majority of the diff is tests, comments, and java ceremony. storage/s3/S3StorageDriver.java is the big change.


/** Interface for S3 {@link S3StorageDriver} operations: upload, existence check, and download. */
@Experimental
public interface S3Client {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S3StorageDriverClient

@@ -0,0 +1,18 @@
package io.temporal.payload.storage.s3;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

package io.temporal.payload.storage.s3driver to avoid collisions with the AWS S3 SDK package naming?

* be configured with credentials and a region by the caller.
*/
@Experimental
public final class S3AsyncClientAdapter implements S3Client {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the io.temporal.payload.storage.s3 package have a hard dependency on the AWS S3 SDK libraries, which is different than how we did it for Go and Python, which use a separate module for the client implementation.


/** Computes payload hashes shared by external storage drivers. */
@Experimental
public final class PayloadHasher {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd move this to the s3 package for now. As it is, it's publicly available to anyone using the SDK, and I think this should be an implementation detail.

*/
final class S3StorageKey {
private static final String KEY_VERSION = "v0";
private static final String PATH_SEGMENT_UNRESERVED = "-_.~$&+:=@";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you validate that these are acceptable characters in S3 object keys?

* larger payload fails the {@code store} call.
*/
public Builder setMaxPayloadSize(int maxPayloadSize) {
this.maxPayloadSize = maxPayloadSize;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

range check here instead of in builder? not sure what's idiomatic in Java

* setting both before {@link #build()} is an error.
*/
public Builder setBucket(@Nonnull String bucket) {
this.staticBucket = Objects.requireNonNull(bucket, "bucket");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove the staticBucket field and provide an annonymous implementation for the bucket resolver? Eliminates the field and the XOR check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants