
[Feature] Support cross-K8s-cluster replication for RedisReplication deployment type - add External-Master configuration #1772

@sergmour

Great work on Redis-K8s deployments! Thank you. We are evaluating this operator for enterprise-grade deployments.

Is your feature request related to a problem? Please describe.

We are looking into running Redis deployments in multi-region mode. This is required for enterprise-grade deployments to support DR (Disaster Recovery) events.
A single K8s cluster cannot span regions, so each region is supposed to have its own K8s cluster. Essentially, this feature implements a cross-K8s-cluster deployment, which allows a Redis deployment to extend across regions and across cloud providers.

We are evaluating RedisReplication deployment type where we would like to have multi-K8s-cluster capability. Later, the same capability can be extended to RedisCluster deployment type.

Describe the solution you'd like

For the time being, we considered a solution only for RedisReplication deployment type.

Below is the diagram depicting how a Redis deployment can span multiple K8s clusters:

[Image: diagram of a Redis deployment spanning multiple K8s clusters]

The regular (with master) RedisReplication is deployed in the primary-K8s-cluster (1 master, 2 slaves, 3 sentinels). It works out of the box, unchanged. An LB/NodePort needs to be created to expose the master endpoint outside of this primary-K8s-cluster.
Secondary-RedisReplication deployments can be created in other K8s clusters using the same operator and Helm charts. These secondary deployments will consist only of slaves, connected to the primary-RedisReplication master through the above-mentioned LB/NodePort. The secondary deployments have no Sentinels. Redis-operator will skip all leader-election/failover logic for a secondary-RedisReplication, as it will contain only read replicas.

Coding the solution: see PR: Support for cross-K8s-cluster replication for RedisReplication deployment type: added ExternalMaster configuration

Details:

RedisReplication CRD will contain a new configuration field:
externalMaster:
  host: "redis-master.primary.example.com"
  port: 6379
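
To make the shape of the proposed field concrete, here is a minimal Python sketch of how a spec carrying externalMaster could be parsed and validated. It is only an illustration, not the operator's code (the operator itself is not written in Python). The names ExternalMaster and parse_external_master, the default port of 6379, and the validation rules are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ExternalMaster:
    """Hypothetical in-memory form of the proposed externalMaster field."""
    host: str
    port: int = 6379  # assumed default; the issue does not specify one


def parse_external_master(spec: Mapping[str, Any]) -> Optional[ExternalMaster]:
    """Return the external master endpoint, or None when the field is absent.

    None means the deployment keeps the existing primary (with master) behavior.
    """
    raw = spec.get("externalMaster")
    if raw is None:
        return None

    host = raw.get("host", "")
    port = raw.get("port", 6379)

    if not isinstance(host, str) or not host.strip():
        raise ValueError("externalMaster.host must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError("externalMaster.port must be an integer in 1..65535")

    return ExternalMaster(host=host.strip(), port=port)
```

An absent field yields None, so existing manifests would be unaffected under this sketch.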

If configured, this field acts as a flag that marks the RedisReplication deployment as secondary (slave-only). The operator will skip creating a master StatefulSet and will configure all pods as slaves of the externalMaster host:port endpoint. When this configuration is absent, the existing primary RedisReplication (with master) behavior is preserved.

Authentication to the external Redis-master will reuse the existing redisSecret field for password.
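
The primary/secondary decision and the authentication reuse described above can be sketched as follows. This is a hedged illustration in Python rather than the PR's actual code. The function name plan_replication and the returned dictionary keys are hypothetical. The replicaof and masterauth directives are standard Redis configuration directives; the sketch assumes the password has already been read from the secret referenced by redisSecret, and it does not handle quoting of passwords that contain spaces.

```python
from typing import Any, Dict, Mapping, Optional


def plan_replication(spec: Mapping[str, Any], password: Optional[str] = None) -> Dict[str, Any]:
    """Decide how a RedisReplication deployment should be treated.

    The dictionary keys are hypothetical names used only in this sketch.
    """
    ext = spec.get("externalMaster")
    if not ext:
        # Field absent: existing primary (with master) behavior is preserved.
        return {"mode": "primary", "manage_failover": True, "replica_directives": []}

    # Field present: every pod becomes a replica of the external endpoint.
    directives = [f"replicaof {ext['host']} {ext['port']}"]
    if password:
        # Same password that the deployment already uses via redisSecret.
        directives.append(f"masterauth {password}")

    return {"mode": "secondary", "manage_failover": False, "replica_directives": directives}
```

Keeping the decision in one place makes it explicit that leader-election/failover handling is switched off only when externalMaster is set.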

A DR (Disaster Recovery) event is supposed to be handled by converting one of the secondary-RedisReplication deployments into a primary: one of its slaves is assigned as the master and the deployment type is changed to primary. The externalMaster.host field in the other secondary deployments may then need to be reconfigured to point to the new master, unless GSLB can do it automatically.
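
As a rough illustration of the manual DR sequence described above, the following Python sketch builds the ordered list of Redis commands that such a promotion would involve. It only produces a plan and does not connect to Redis. The function name failover_plan and its parameters are hypothetical; REPLICAOF NO ONE and REPLICAOF host port are standard Redis commands. Updating the CRDs (removing or changing externalMaster) is outside this sketch.

```python
from typing import List, Sequence, Tuple


def failover_plan(
    promoted: str,
    others: Sequence[str],
    new_master_host: str,
    new_master_port: int = 6379,
) -> List[Tuple[str, str]]:
    """Return (target, command) pairs for a manual promotion.

    promoted: identifier of the replica chosen to become the new master.
    others: identifiers of replicas that must follow the new master.
    """
    # Step 1: stop replicating on the chosen replica, which makes it a master.
    steps = [(promoted, "REPLICAOF NO ONE")]
    # Step 2: repoint the remaining replicas at the new master endpoint.
    for replica in others:
        steps.append((replica, f"REPLICAOF {new_master_host} {new_master_port}"))
    return steps
```

If a GSLB already redirects the existing hostname to the new master, the second step may be unnecessary for deployments that connect through that hostname.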

Keeping metadata for these cross-K8s-cluster deployments and automating the DR solution can and should be done but it is out-of-scope for this feature.

This solution can be extended to RedisCluster deployment type, but it does not seem to be a straightforward task.

Describe alternatives you've considered

We considered connecting K8s clusters on the infrastructure level (service mesh, VPC peering, etc.). This requires a lot of coordination and changes on the infrastructure level (non-overlapping CIDR blocks, etc.), which is not easy to implement.

What version of redis-operator are you using?
redis-operator version: v0.24.0
redis-version: v8.2.1

Additional context

Here is the initial POC-PR: Support for cross-K8s-cluster replication for RedisReplication deployment type: added ExternalMaster configuration (link-goes-here). It implements the above-mentioned solution for the RedisReplication deployment type and appears to work. We haven't conducted comprehensive tests with DR approaches yet.

It would make sense to extend the RedisReplication solution to RedisCluster. This looks like a considerably larger effort, as the code for RedisCluster differs a lot.


Labels: enhancement (New feature or request)
