diff --git a/docs/reference/system-views.md b/docs/reference/system-views.md index 912225bb8ad..14ff51a43f9 100644 --- a/docs/reference/system-views.md +++ b/docs/reference/system-views.md @@ -1,14 +1,113 @@ # System views -linkdb adds three cluster-aware system views to the standard -PostgreSQL catalog. All three are present in `--enable-cluster` -builds; in `--disable-cluster` builds they return zero rows. +linkdb adds cluster-aware system views to the standard PostgreSQL +catalog. These views are present in `--enable-cluster` builds; in +`--disable-cluster` builds the backing functions are unavailable or +return zero rows, depending on whether the function is read-only or +operator-facing. | View | Purpose | |---|---| | `pg_cluster_nodes` | Cluster topology (the parsed `pgrac.conf`) | | `pg_stat_cluster_wait_events` | Cluster-specific wait events on the local node | | `pg_stat_gcluster_wait_events` | Cluster wait events globally (cross-node placeholder) | +| `pg_stat_cluster_backup` | Current cluster backup state on the local node | +| `pg_cluster_backup_history` | Latest cluster backup manifest summary | +| `pg_cluster_restore_points` | Cluster restore points visible to PITR status | +| `pg_cluster_pitr_status` | Cluster PITR target reachability status | + +## Cluster Backup / PITR Views + +The cluster backup surface exposes the manifest and target-resolution +state used by `pg_cluster_backup_start`, `pg_cluster_backup_stop`, and +`pg_cluster_create_restore_point`. + +Current 6.5 scope is conservative: + +- The views, catalog entries, manifest validators, PITR target resolver, + shared-memory state, and IC wire format are present as substrate. +- Mutating physical backup and restore-point entry points fail closed + with `feature_not_supported` until the cluster physical capture, + durable WAL pin, restore-point commit-drain barrier, restore, and PITR + replay paths are implemented. 
+- No manifest is published unless WAL, undo, transaction-table, SCN, and + control-file inclusion are proven. The current substrate therefore + refuses to create a manifest instead of reporting a partial or unsound + backup as complete. + +### `pg_stat_cluster_backup` + +One row describing the current or most recent cluster backup on this +node. + +| Column | Type | Description | +|---|---|---| +| `in_progress` | `bool` | True while this session has an active cluster backup. | +| `backup_id` | `text` | Backup label/id, or NULL before the first backup. | +| `coordinator_node_id` | `int4` | Local node id that started the backup. | +| `start_redo_lsn` | `pg_lsn` | Checkpoint redo LSN used as the backup start contract. | +| `checkpoint_lsn` | `pg_lsn` | Checkpoint record LSN captured at backup start. | +| `stop_cut_lsn` | `pg_lsn` | WAL cut LSN captured at backup stop. | +| `consistent_scn` | `int8` | Cluster SCN selected for the backup cut. | +| `manifest_crc` | `int8` | CRC32C of the latest manifest image. | +| `started_at` | `timestamptz` | Local timestamp when the backup started. | +| `stopped_at` | `timestamptz` | Local timestamp when the backup stopped. | +| `backup_parallel_channels` | `int4` | Configured copy-channel capacity for the backup substrate. | +| `backup_wal_retention` | `int4` | Configured WAL retention hint, in MB. | +| `restore_points_enabled` | `bool` | Whether automatic PITR restore-point scheduling is enabled. | +| `restore_point_interval_ms` | `int4` | Automatic restore-point scheduling interval, in milliseconds. | + +### `pg_cluster_backup_history` + +Returns the latest cluster backup manifest summary retained in shared +memory. + +| Column | Type | Description | +|---|---|---| +| `backup_id` | `text` | Backup label/id. | +| `consistent_scn` | `int8` | SCN that defines the backup cut. | +| `scn_durable_peak` | `int8` | Highest durable SCN covered by the cut. | +| `timeline` | `int4` | WAL timeline recorded at backup stop. 
| +| `catversion` | `int8` | Catalog version used to reject incompatible restores. | +| `storage_id` | `int4` | Cluster shared-storage backend id. | +| `node_count` | `int4` | Number of nodes proven in the manifest. | +| `thread_count` | `int4` | Number of WAL threads proven in the manifest. | +| `manifest_crc` | `int8` | CRC32C of the manifest image. | + +### `pg_cluster_restore_points` + +Shows restore points created by the cluster-aware restore-point entry +point. + +| Column | Type | Description | +|---|---|---| +| `restore_point_name` | `text` | Restore point name. | +| `cut_scn` | `int8` | SCN selected for the restore point cut. | +| `thread_count` | `int4` | WAL threads covered by the cut. | +| `incarnation` | `int4` | Cluster incarnation recorded with the cut. | +| `created_at` | `timestamptz` | Local timestamp when the point was recorded. | + +### `pg_cluster_pitr_status` + +Resolves the configured cluster PITR target against known restore +points and the latest manifest. + +| Column | Type | Description | +|---|---|---| +| `target_type` | `text` | `latest` when no target is configured, `scn`, `name`, or `cluster_time`. | +| `target_action` | `text` | Configured PITR action: `pause`, `promote`, or `shutdown`. | +| `reachable` | `bool` | True if the configured target is reachable. | +| `reason` | `text` | `ok` or the fail-closed reason. | +| `resolved_scn` | `int8` | Restore-point SCN selected for the target, when reachable. | +| `restore_point_name` | `text` | Restore point used for the target, when reachable. 
| + +Mutating function execution is revoked from PUBLIC: + +```sql +SELECT * FROM pg_cluster_backup_start('b1', true); +SELECT * FROM pg_cluster_backup_stop(true); +SELECT * FROM pg_cluster_create_restore_point('rp1'); +``` ## pg_cluster_nodes diff --git a/docs/user-guide/configuration.md b/docs/user-guide/configuration.md index b01ce1fac5b..af41d7e6140 100644 --- a/docs/user-guide/configuration.md +++ b/docs/user-guide/configuration.md @@ -3,7 +3,7 @@ linkdb uses two configuration mechanisms layered on top of standard PostgreSQL configuration: -1. **`postgresql.conf`** — standard PG config plus the `cluster.*` +1. **`postgresql.conf`** — standard PG config plus `cluster.*` GUCs added by linkdb's cluster subsystem. 2. **`pgrac.conf`** — INI-style file describing the cluster topology (the list of nodes that participate in the cluster). @@ -62,6 +62,31 @@ an absolute path), `$PGDATA/pg_wal` must resolve to cluster.wal_threads_dir = '/shared/walroot' ``` +### Cluster backup / PITR GUCs + +These settings support the cluster-aware physical backup / restore / +PITR surface. + +| GUC | Type | Default | Context | Notes | +|---|---|---|---|---| +| `cluster.recovery_target_scn` | string | `''` | postmaster | Target SCN used by `pg_cluster_pitr_status`. Empty means latest unless a name or cluster-time target is set. | +| `cluster.recovery_target_cluster_time` | string | `''` | postmaster | Timestamp target reported by `pg_cluster_pitr_status`; current 6.5 recovery action remains fail-closed for this target type. | +| `cluster.recovery_target_name` | string | `''` | postmaster | Named restore-point target resolved by `pg_cluster_pitr_status`. | +| `cluster.recovery_target_action` | enum | `pause` | postmaster | Accepted values: `pause`, `promote`, `shutdown`; exposed in PITR status. | +| `cluster.enable_pitr_restore_points` | bool | `off` | sighup | Enables future automatic restore-point scheduling. Manual `pg_cluster_create_restore_point()` is independent. 
| +| `cluster.pitr_restore_point_interval_ms` | integer | `0` | sighup | Zero disables automatic scheduling. | +| `cluster.backup_wal_retention` | integer | `0` MB | sighup | Retention hint for the future backup-set writer. | +| `cluster.backup_parallel_channels` | integer | `1` | sighup | Reserved copy-channel capacity for the future backup-set writer. | +| `cluster.backup_manifest_checksums` | enum | `crc32c` | sighup | Manifest checksums are mandatory; unchecked manifests are not supported. | + +The current implementation is intentionally conservative. These GUCs +expose the 6.5 catalog and state surface, but mutating cluster physical +backup and restore-point entry points fail closed with +`feature_not_supported` until the physical capture, durable WAL pin, +commit-drain restore-point barrier, restore, and PITR replay paths are +implemented. The server refuses to publish a manifest or restore point +when those proofs are absent. + ### `cluster.interconnect_tier` | | | diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 70514361048..8dd6219cec2 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1685,6 +1685,71 @@ GRANT SELECT ON pg_cluster_node_removal_state TO PUBLIC; -- REVOKE EXECUTE FROM PUBLIC for defense-in-depth (L7). REVOKE ALL ON FUNCTION pg_cluster_remove_node(int) FROM PUBLIC; +-- PGRAC: cluster-aware backup / restore / PITR surface (spec-6.5). +-- The state/history/restore-point/PITR views are read-only observability. +-- Mutating entry points are superuser-gated in C and revoked from PUBLIC. 
+CREATE VIEW pg_stat_cluster_backup AS + SELECT in_progress, + backup_id, + coordinator_node_id, + start_redo_lsn, + checkpoint_lsn, + stop_cut_lsn, + consistent_scn, + manifest_crc, + started_at, + stopped_at, + backup_parallel_channels, + backup_wal_retention, + restore_points_enabled, + restore_point_interval_ms + FROM cluster_get_backup_state(); + +REVOKE ALL ON pg_stat_cluster_backup FROM PUBLIC; +GRANT SELECT ON pg_stat_cluster_backup TO PUBLIC; + +CREATE VIEW pg_cluster_backup_history AS + SELECT backup_id, + consistent_scn, + scn_durable_peak, + timeline, + catversion, + storage_id, + node_count, + thread_count, + manifest_crc + FROM cluster_get_backup_history(); + +REVOKE ALL ON pg_cluster_backup_history FROM PUBLIC; +GRANT SELECT ON pg_cluster_backup_history TO PUBLIC; + +CREATE VIEW pg_cluster_restore_points AS + SELECT restore_point_name, + cut_scn, + thread_count, + incarnation, + created_at + FROM cluster_get_restore_points(); + +REVOKE ALL ON pg_cluster_restore_points FROM PUBLIC; +GRANT SELECT ON pg_cluster_restore_points TO PUBLIC; + +CREATE VIEW pg_cluster_pitr_status AS + SELECT target_type, + target_action, + reachable, + reason, + resolved_scn, + restore_point_name + FROM cluster_get_pitr_status(); + +REVOKE ALL ON pg_cluster_pitr_status FROM PUBLIC; +GRANT SELECT ON pg_cluster_pitr_status TO PUBLIC; + +REVOKE ALL ON FUNCTION pg_cluster_backup_start(text, bool) FROM PUBLIC; +REVOKE ALL ON FUNCTION pg_cluster_backup_stop(bool) FROM PUBLIC; +REVOKE ALL ON FUNCTION pg_cluster_create_restore_point(text) FROM PUBLIC; + -- PGRAC: pg_cluster_ic_msg_types (spec-2.3 D8; 2026-05-08). -- Lists every IC message type registered in the process-local -- dispatch_table[] under cluster_ic_router.c. 
Diagnostic / diff --git a/src/backend/cluster/Makefile b/src/backend/cluster/Makefile index b3d41033867..83e44d0aa6e 100644 --- a/src/backend/cluster/Makefile +++ b/src/backend/cluster/Makefile @@ -41,6 +41,8 @@ OBJS = \ cluster.o \ cluster_advisory.o \ cluster_cancel_token.o \ + cluster_backup.o \ + cluster_backup_manifest.o \ cluster_cf_authority.o \ cluster_cf_enqueue.o \ cluster_cf_phase2.o \ @@ -208,7 +210,8 @@ else OBJS = cluster_conf.o cluster_debug.o cluster_ic.o cluster_inject.o cluster_undo_srf.o \ cluster_cr_srf.o cluster_block_apply_srf.o cluster_block_recovery_srf.o cluster_thread_recovery_apply_srf.o cluster_thread_recovery_replay_srf.o cluster_thread_recovery_driver_srf.o cluster_thread_recovery_orchestrator_srf.o cluster_pgstat.o cluster_scn.o cluster_views.o cluster_ges_mode_backend.o \ cluster_ir_srf.o cluster_ts_srf.o cluster_ko_srf.o \ - cluster_hang_resolve.o cluster_clean_leave_views.o cluster_node_remove_views.o + cluster_hang_resolve.o cluster_clean_leave_views.o cluster_node_remove_views.o \ + cluster_backup.o cluster_backup_manifest.o # spec-5.12: cluster_hang_resolve.o provides the pg_cluster_hang_victims / # pg_cluster_hang_resolve SQL symbols (real bodies #ifdef USE_PGRAC_CLUSTER, # --disable-cluster stubs raise ERRCODE_FEATURE_NOT_SUPPORTED); the symbols @@ -217,6 +220,8 @@ OBJS = cluster_conf.o cluster_debug.o cluster_ic.o cluster_inject.o cluster_undo # SRF + pg_cluster_clean_leave_request UDF symbols, same unconditional-link reason. # spec-5.18: cluster_node_remove_views.o provides cluster_get_node_removal_state + # pg_cluster_remove_node, same unconditional-link reason. +# spec-6.5: cluster_backup.o provides the cluster-aware backup/restore/PITR +# SQL symbols; --disable-cluster bodies raise ERRCODE_FEATURE_NOT_SUPPORTED. 
endif include $(top_srcdir)/src/backend/common.mk diff --git a/src/backend/cluster/cluster_backup.c b/src/backend/cluster/cluster_backup.c new file mode 100644 index 00000000000..7688b7922c6 --- /dev/null +++ b/src/backend/cluster/cluster_backup.c @@ -0,0 +1,1032 @@ +/*------------------------------------------------------------------------- + * + * cluster_backup.c + * Cluster-aware backup / restore / PITR SQL surface and shmem state. + * + * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * Portions Copyright (c) 2026, pgrac contributors + * + * Author: SqlRush + * + * IDENTIFICATION + * src/backend/cluster/cluster_backup.c + * + * NOTES + * This is a pgrac-original file. Linked in both build modes because + * pg_proc.dat references the SQL symbols unconditionally. + * Spec: spec-6.5-cluster-aware-backup-restore-pitr.md + * + *------------------------------------------------------------------------- + */ +#include "postgres.h" + +#include "access/htup_details.h" +#include "access/xlog.h" +#include "access/xlogbackup.h" +#include "catalog/catversion.h" +#include "catalog/pg_type.h" +#include "fmgr.h" +#include "funcapi.h" +#include "miscadmin.h" +#include "storage/lwlock.h" +#include "storage/shmem.h" +#include "utils/builtins.h" +#include "utils/elog.h" +#include "utils/errcodes.h" +#include "utils/memutils.h" +#include "utils/pg_lsn.h" +#include "utils/timestamp.h" +#include "utils/tuplestore.h" + +#include "cluster/cluster_backup.h" +#include "cluster/cluster_conf.h" +#include "cluster/cluster_guc.h" +#include "cluster/cluster_shmem.h" + +PG_FUNCTION_INFO_V1(pg_cluster_backup_start); +PG_FUNCTION_INFO_V1(pg_cluster_backup_stop); +PG_FUNCTION_INFO_V1(pg_cluster_create_restore_point); +PG_FUNCTION_INFO_V1(cluster_get_backup_state); +PG_FUNCTION_INFO_V1(cluster_get_backup_history); +PG_FUNCTION_INFO_V1(cluster_get_restore_points); 
+PG_FUNCTION_INFO_V1(cluster_get_pitr_status); + +#ifdef USE_PGRAC_CLUSTER + +#include "cluster/cluster_ic_envelope.h" +#include "cluster/cluster_ic_router.h" +#include "cluster/cluster_lmon.h" +#include "cluster/cluster_wal_thread.h" + +typedef struct ClusterBackupSharedState { + LWLockPadded lock; + ClusterBackupStatus status; + bool have_manifest; + ClusterBackupManifest last_manifest; + int restore_point_count; + int restore_point_next; + ClusterRestorePoint restore_points[CLUSTER_BACKUP_RESTORE_POINT_MAX]; + + uint64 next_request_id; + bool coordinator_send_pending; + ClusterBackupWireRequest coordinator_request; + uint8 coordinator_expected[CLUSTER_BACKUP_NODE_BITMAP_BYTES]; + uint8 coordinator_backup_peers[CLUSTER_BACKUP_NODE_BITMAP_BYTES]; + uint8 coordinator_acked[CLUSTER_BACKUP_NODE_BITMAP_BYTES]; + uint8 coordinator_nacked[CLUSTER_BACKUP_NODE_BITMAP_BYTES]; + ClusterBackupWireAck coordinator_acks[CLUSTER_MAX_NODES]; + ClusterBackupManifestThread coordinator_peer_threads[CLUSTER_MAX_NODES]; + SCN coordinator_peer_cut_scn[CLUSTER_MAX_NODES]; + + bool peer_command_pending; + ClusterBackupWireRequest peer_command; + bool peer_reply_pending; + int32 peer_reply_dest; + ClusterBackupWireAck peer_reply; +} ClusterBackupSharedState; + +static ClusterBackupSharedState *cluster_backup_state = NULL; +static BackupState *cluster_backup_session_state = NULL; +static StringInfo cluster_backup_tablespace_map = NULL; +static MemoryContext cluster_backup_context = NULL; +static BackupState *cluster_backup_lmon_state = NULL; +static StringInfo cluster_backup_lmon_tablespace_map = NULL; +static MemoryContext cluster_backup_lmon_context = NULL; + +static inline void +cluster_backup_bitmap_set(uint8 *bitmap, int node_id) +{ + if (bitmap == NULL || node_id < 0 || node_id >= CLUSTER_MAX_NODES) + return; + bitmap[node_id / 8] |= (uint8)(1u << (node_id % 8)); +} + +static inline bool +cluster_backup_bitmap_test(const uint8 *bitmap, int node_id) +{ + if (bitmap == NULL || node_id 
< 0 || node_id >= CLUSTER_MAX_NODES) + return false; + return (bitmap[node_id / 8] & (uint8)(1u << (node_id % 8))) != 0; +} + +static uint16 +cluster_backup_local_thread_id(void) +{ + uint16 thread_id = cluster_wal_thread_id(); + + if (thread_id == XLP_THREAD_ID_LEGACY) + thread_id = 1; + return thread_id; +} + +static const char * +cluster_pitr_action_name(int action) +{ + switch (action) { + case CLUSTER_RECOVERY_TARGET_ACTION_PAUSE: + return "pause"; + case CLUSTER_RECOVERY_TARGET_ACTION_PROMOTE: + return "promote"; + case CLUSTER_RECOVERY_TARGET_ACTION_SHUTDOWN: + return "shutdown"; + } + return "unknown"; +} + +Size +cluster_backup_shmem_size(void) +{ + return sizeof(ClusterBackupSharedState); +} + +void +cluster_backup_shmem_init(void) +{ + bool found; + + cluster_backup_state + = ShmemInitStruct("pgrac cluster backup", cluster_backup_shmem_size(), &found); + if (!found) { + MemSet(cluster_backup_state, 0, sizeof(*cluster_backup_state)); + LWLockInitialize(&cluster_backup_state->lock.lock, LWTRANCHE_CLUSTER_BACKUP); + } +} + +static const ClusterShmemRegion cluster_backup_region = { + .name = "pgrac cluster backup", + .size_fn = cluster_backup_shmem_size, + .init_fn = cluster_backup_shmem_init, + .lwlock_count = 1, + .owner_subsys = "cluster_backup", + .reserved_flags = 0, +}; + +void +cluster_backup_shmem_register(void) +{ + cluster_shmem_register_region(&cluster_backup_region); +} + +static void cluster_backup_cleanup_session_context(void); +static void cluster_backup_mark_native_stopped(const BackupState *state); + +static void +cluster_backup_error_if_unavailable(const char *op) +{ + if (!cluster_enabled) + ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("%s requires cluster.enabled", op))); + if (cluster_node_id < 0 || cluster_node_id >= CLUSTER_MAX_NODES) + ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("%s requires a valid cluster.node_id", op))); + if (cluster_backup_state == NULL) + ereport(ERROR, 
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("cluster backup shared state is not initialized"))); +} + +static void +cluster_backup_fail_closed_unimplemented(const char *op, const char *missing) +{ + ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("%s is not available in the current cluster backup substrate", op), + errdetail("%s is required before this operation can return a sound " + "cluster backup/PITR result.", + missing), + errhint("Refusing to create an unsound cluster restore point or " + "backup manifest."))); +} + +static void +cluster_backup_abort_local_session_if_running(void) +{ + if (get_backup_status() == SESSION_BACKUP_RUNNING) + do_pg_abort_backup(0, BoolGetDatum(false)); + cluster_backup_mark_native_stopped(NULL); + cluster_backup_cleanup_session_context(); +} + +typedef struct ClusterBackupCoordWaitResult { + bool ok; + bool timed_out; + int32 node_id; + ClusterBackupWireResult result; +} ClusterBackupCoordWaitResult; + +static void +cluster_backup_cleanup_session_context(void) +{ + cluster_backup_session_state = NULL; + cluster_backup_tablespace_map = NULL; + if (cluster_backup_context != NULL) { + MemoryContextDelete(cluster_backup_context); + cluster_backup_context = NULL; + } +} + +static void +cluster_backup_mark_native_stopped(const BackupState *state) +{ + LWLockAcquire(&cluster_backup_state->lock.lock, LW_EXCLUSIVE); + cluster_backup_state->status.in_progress = false; + if (state != NULL) + cluster_backup_state->status.stop_cut_lsn = state->stoppoint; + cluster_backup_state->status.stopped_at = GetCurrentTimestamp(); + LWLockRelease(&cluster_backup_state->lock.lock); +} + +void +cluster_backup_get_status(ClusterBackupStatus *out) +{ + if (out == NULL) + return; + MemSet(out, 0, sizeof(*out)); + if (cluster_backup_state == NULL) + return; + + LWLockAcquire(&cluster_backup_state->lock.lock, LW_SHARED); + *out = cluster_backup_state->status; + LWLockRelease(&cluster_backup_state->lock.lock); +} + +bool
+cluster_backup_get_last_manifest(ClusterBackupManifest *out) +{ + bool have_manifest; + + if (out == NULL) + return false; + MemSet(out, 0, sizeof(*out)); + if (cluster_backup_state == NULL) + return false; + + LWLockAcquire(&cluster_backup_state->lock.lock, LW_SHARED); + have_manifest = cluster_backup_state->have_manifest; + if (have_manifest) + *out = cluster_backup_state->last_manifest; + LWLockRelease(&cluster_backup_state->lock.lock); + return have_manifest; +} + +int +cluster_backup_get_restore_points(ClusterRestorePoint *out, int max_points) +{ + int count; + int start; + int i; + + if (out == NULL || max_points <= 0 || cluster_backup_state == NULL) + return 0; + + LWLockAcquire(&cluster_backup_state->lock.lock, LW_SHARED); + count = Min(cluster_backup_state->restore_point_count, max_points); + start = cluster_backup_state->restore_point_next - cluster_backup_state->restore_point_count; + if (start < 0) + start += CLUSTER_BACKUP_RESTORE_POINT_MAX; + for (i = 0; i < count; i++) { + int slot = (start + i) % CLUSTER_BACKUP_RESTORE_POINT_MAX; + + out[i] = cluster_backup_state->restore_points[slot]; + } + LWLockRelease(&cluster_backup_state->lock.lock); + return count; +} + +static void +cluster_backup_init_wire_ack(ClusterBackupWireAck *ack, const ClusterBackupWireRequest *request, + ClusterBackupWireResult result) +{ + MemSet(ack, 0, sizeof(*ack)); + ack->magic = CLUSTER_BACKUP_IC_MAGIC; + ack->version = CLUSTER_BACKUP_IC_VERSION; + ack->op = request != NULL ? request->op : CLUSTER_BACKUP_WIRE_OP_NONE; + ack->result = (uint16)result; + ack->sender_node_id = cluster_node_id; + ack->thread_id = cluster_backup_local_thread_id(); + ack->request_id = request != NULL ? 
request->request_id : 0; +} + +static void +cluster_backup_lmon_reset_context(void) +{ + cluster_backup_lmon_state = NULL; + cluster_backup_lmon_tablespace_map = NULL; + if (cluster_backup_lmon_context != NULL) { + MemoryContextDelete(cluster_backup_lmon_context); + cluster_backup_lmon_context = NULL; + } +} + +static ClusterBackupWireAck +cluster_backup_lmon_execute_request(const ClusterBackupWireRequest *request) +{ + ClusterBackupWireAck ack; + ClusterBackupWireResult result = CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR; + + cluster_backup_init_wire_ack(&ack, request, result); + + if (request == NULL || !cluster_backup_wire_request_valid(request) + || request->coordinator_node_id < 0 || request->coordinator_node_id >= CLUSTER_MAX_NODES) { + ack.result = CLUSTER_BACKUP_WIRE_RESULT_BAD_REQUEST; + cluster_backup_wire_ack_compute_crc(&ack); + return ack; + } + + PG_TRY(); + { + switch ((ClusterBackupWireOp)request->op) { + case CLUSTER_BACKUP_WIRE_OP_START: + if (request->backup_id[0] == '\0') + result = CLUSTER_BACKUP_WIRE_RESULT_BAD_REQUEST; + else if (cluster_backup_lmon_state != NULL + || get_backup_status() == SESSION_BACKUP_RUNNING) + result = CLUSTER_BACKUP_WIRE_RESULT_BUSY; + else + result = CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR; + break; + + case CLUSTER_BACKUP_WIRE_OP_STOP: + if (cluster_backup_lmon_state == NULL || get_backup_status() != SESSION_BACKUP_RUNNING) + result = CLUSTER_BACKUP_WIRE_RESULT_NOT_IN_BACKUP; + else { + do_pg_abort_backup(0, BoolGetDatum(false)); + cluster_backup_lmon_reset_context(); + result = CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR; + } + break; + + case CLUSTER_BACKUP_WIRE_OP_ABORT: + if (cluster_backup_lmon_state != NULL + || get_backup_status() == SESSION_BACKUP_RUNNING) { + do_pg_abort_backup(0, BoolGetDatum(false)); + cluster_backup_lmon_reset_context(); + } + result = CLUSTER_BACKUP_WIRE_RESULT_OK; + break; + + case CLUSTER_BACKUP_WIRE_OP_RESTORE_POINT: + if (request->restore_point_name[0] == '\0') + result =
CLUSTER_BACKUP_WIRE_RESULT_BAD_REQUEST; + else + result = CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR; + break; + + case CLUSTER_BACKUP_WIRE_OP_NONE: + result = CLUSTER_BACKUP_WIRE_RESULT_BAD_REQUEST; + break; + } + } + PG_CATCH(); + { + FlushErrorState(); + if ((ClusterBackupWireOp)request->op == CLUSTER_BACKUP_WIRE_OP_START) + cluster_backup_lmon_reset_context(); + else if ((ClusterBackupWireOp)request->op == CLUSTER_BACKUP_WIRE_OP_STOP + && get_backup_status() != SESSION_BACKUP_RUNNING) + cluster_backup_lmon_reset_context(); + result = CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR; + } + PG_END_TRY(); + + ack.result = (uint16)result; + cluster_backup_wire_ack_compute_crc(&ack); + return ack; +} + +static void +cluster_backup_request_handler(const ClusterICEnvelope *env, const void *payload) +{ + const ClusterBackupWireRequest *request = (const ClusterBackupWireRequest *)payload; + + if (cluster_backup_state == NULL || env == NULL || payload == NULL) + return; + if (env->payload_length != sizeof(ClusterBackupWireRequest)) + return; + if (!cluster_backup_wire_request_valid(request)) + return; + if (request->coordinator_node_id != (int32)env->source_node_id) + return; + if (request->coordinator_node_id == cluster_node_id) + return; + + LWLockAcquire(&cluster_backup_state->lock.lock, LW_EXCLUSIVE); + if (!cluster_backup_state->peer_command_pending && !cluster_backup_state->peer_reply_pending) { + cluster_backup_state->peer_command = *request; + cluster_backup_state->peer_command_pending = true; + } + LWLockRelease(&cluster_backup_state->lock.lock); + cluster_lmon_wakeup(); +} + +static void +cluster_backup_ack_handler(const ClusterICEnvelope *env, const void *payload) +{ + const ClusterBackupWireAck *ack = (const ClusterBackupWireAck *)payload; + int32 node_id; + + if (cluster_backup_state == NULL || env == NULL || payload == NULL) + return; + if (env->payload_length != sizeof(ClusterBackupWireAck)) + return; + if (!cluster_backup_wire_ack_valid(ack)) + return; + if 
(ack->sender_node_id != (int32)env->source_node_id) + return; + node_id = ack->sender_node_id; + + LWLockAcquire(&cluster_backup_state->lock.lock, LW_EXCLUSIVE); + if (cluster_backup_state->coordinator_request.request_id == ack->request_id + && cluster_backup_state->coordinator_request.op == ack->op + && cluster_backup_bitmap_test(cluster_backup_state->coordinator_expected, node_id)) { + cluster_backup_state->coordinator_acks[node_id] = *ack; + if (ack->result == CLUSTER_BACKUP_WIRE_RESULT_OK) { + ClusterBackupManifestThread *thread + = &cluster_backup_state->coordinator_peer_threads[node_id]; + + cluster_backup_bitmap_set(cluster_backup_state->coordinator_acked, node_id); + if (ack->op == CLUSTER_BACKUP_WIRE_OP_START) { + MemSet(thread, 0, sizeof(*thread)); + thread->present = true; + thread->wal_included = false; + thread->undo_included = false; + thread->tt_included = false; + thread->thread_id = ack->thread_id; + thread->node_id = node_id; + thread->start_redo_lsn = ack->start_redo_lsn; + thread->checkpoint_lsn = ack->checkpoint_lsn; + thread->start_tli = ack->timeline; + } else if (ack->op == CLUSTER_BACKUP_WIRE_OP_STOP) { + if (!thread->present && ack->start_redo_lsn != InvalidXLogRecPtr + && ack->checkpoint_lsn != InvalidXLogRecPtr) { + thread->present = true; + thread->wal_included = false; + thread->undo_included = false; + thread->tt_included = false; + thread->thread_id = ack->thread_id; + thread->node_id = node_id; + thread->start_redo_lsn = ack->start_redo_lsn; + thread->checkpoint_lsn = ack->checkpoint_lsn; + thread->start_tli = ack->timeline; + } + thread->stop_cut_lsn = ack->stop_cut_lsn; + cluster_backup_state->coordinator_peer_cut_scn[node_id] = ack->cut_scn; + } + } else + cluster_backup_bitmap_set(cluster_backup_state->coordinator_nacked, node_id); + } + LWLockRelease(&cluster_backup_state->lock.lock); +} + +void +cluster_backup_register_ic_msg_types(void) +{ + const ClusterICMsgTypeInfo request_info = { + .msg_type = 
PGRAC_IC_MSG_BACKUP_REQUEST, + .name = "backup_request", + .allowed_producer_mask = CLUSTER_IC_PRODUCER_LMON, + .broadcast_ok = true, + .handler = cluster_backup_request_handler, + }; + const ClusterICMsgTypeInfo ack_info = { + .msg_type = PGRAC_IC_MSG_BACKUP_ACK, + .name = "backup_ack", + .allowed_producer_mask = CLUSTER_IC_PRODUCER_LMON, + .broadcast_ok = false, + .handler = cluster_backup_ack_handler, + }; + + cluster_ic_register_msg_type(&request_info); + cluster_ic_register_msg_type(&ack_info); +} + +static void +cluster_backup_record_send_nak(int32 node_id, const ClusterBackupWireRequest *request, + ClusterBackupWireResult result) +{ + ClusterBackupWireAck ack; + + if (node_id < 0 || node_id >= CLUSTER_MAX_NODES || request == NULL) + return; + + cluster_backup_init_wire_ack(&ack, request, result); + ack.sender_node_id = node_id; + cluster_backup_wire_ack_compute_crc(&ack); + + LWLockAcquire(&cluster_backup_state->lock.lock, LW_EXCLUSIVE); + if (cluster_backup_state->coordinator_request.request_id == request->request_id + && cluster_backup_state->coordinator_request.op == request->op + && cluster_backup_bitmap_test(cluster_backup_state->coordinator_expected, node_id)) { + cluster_backup_state->coordinator_acks[node_id] = ack; + cluster_backup_bitmap_set(cluster_backup_state->coordinator_nacked, node_id); + } + LWLockRelease(&cluster_backup_state->lock.lock); +} + +static void +cluster_backup_lmon_send_coord_request(void) +{ + ClusterBackupWireRequest request; + uint8 expected[CLUSTER_BACKUP_NODE_BITMAP_BYTES]; + ClusterICFanoutResult per_peer[CLUSTER_MAX_NODES]; + int node_id; + + LWLockAcquire(&cluster_backup_state->lock.lock, LW_EXCLUSIVE); + if (!cluster_backup_state->coordinator_send_pending) { + LWLockRelease(&cluster_backup_state->lock.lock); + return; + } + request = cluster_backup_state->coordinator_request; + memcpy(expected, cluster_backup_state->coordinator_expected, sizeof(expected)); + cluster_backup_state->coordinator_send_pending = false; + 
LWLockRelease(&cluster_backup_state->lock.lock); + + cluster_ic_send_envelope_fanout(PGRAC_IC_MSG_BACKUP_REQUEST, &request, (uint32)sizeof(request), + per_peer); + for (node_id = 0; node_id < CLUSTER_MAX_NODES; node_id++) { + if (!cluster_backup_bitmap_test(expected, node_id)) + continue; + if (per_peer[node_id] == CLUSTER_IC_FANOUT_HARD_ERROR + || per_peer[node_id] == CLUSTER_IC_FANOUT_PEER_DOWN) + cluster_backup_record_send_nak(node_id, &request, + CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR); + } +} + +static void +cluster_backup_lmon_send_peer_reply(void) +{ + ClusterBackupWireAck reply; + int32 dest; + bool have_reply; + + LWLockAcquire(&cluster_backup_state->lock.lock, LW_EXCLUSIVE); + have_reply = cluster_backup_state->peer_reply_pending; + reply = cluster_backup_state->peer_reply; + dest = cluster_backup_state->peer_reply_dest; + cluster_backup_state->peer_reply_pending = false; + LWLockRelease(&cluster_backup_state->lock.lock); + + if (!have_reply || dest < 0) + return; + (void)cluster_ic_send_envelope(PGRAC_IC_MSG_BACKUP_ACK, dest, &reply, (uint32)sizeof(reply)); +} + +static void +cluster_backup_lmon_process_peer_command(void) +{ + ClusterBackupWireRequest request; + ClusterBackupWireAck reply; + bool have_command; + + LWLockAcquire(&cluster_backup_state->lock.lock, LW_EXCLUSIVE); + have_command = cluster_backup_state->peer_command_pending; + request = cluster_backup_state->peer_command; + cluster_backup_state->peer_command_pending = false; + LWLockRelease(&cluster_backup_state->lock.lock); + + if (!have_command) + return; + + reply = cluster_backup_lmon_execute_request(&request); + + LWLockAcquire(&cluster_backup_state->lock.lock, LW_EXCLUSIVE); + cluster_backup_state->peer_reply = reply; + cluster_backup_state->peer_reply_dest = request.coordinator_node_id; + cluster_backup_state->peer_reply_pending = true; + LWLockRelease(&cluster_backup_state->lock.lock); +} + +void +cluster_backup_lmon_tick(void) +{ + if (cluster_backup_state == NULL || 
!cluster_enabled) + return; + + cluster_backup_lmon_send_peer_reply(); + cluster_backup_lmon_send_coord_request(); + cluster_backup_lmon_process_peer_command(); + cluster_backup_lmon_send_peer_reply(); +} + +Datum +pg_cluster_backup_start(PG_FUNCTION_ARGS) +{ + text *backupid = PG_GETARG_TEXT_PP(0); + char *backupidstr; + + if (!superuser()) + ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("must be superuser to start a cluster backup"))); + cluster_backup_error_if_unavailable("pg_cluster_backup_start"); + + backupidstr = text_to_cstring(backupid); + if (strlen(backupidstr) >= CLUSTER_BACKUP_ID_MAX) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("cluster backup id is too long"), + errdetail("Maximum length is %d bytes.", CLUSTER_BACKUP_ID_MAX - 1))); + cluster_backup_fail_closed_unimplemented( + "pg_cluster_backup_start", + "cluster physical backup capture, durable WAL pinning, and restore integration"); + PG_RETURN_NULL(); +} + +Datum +pg_cluster_backup_stop(PG_FUNCTION_ARGS) +{ + if (!superuser()) + ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("must be superuser to stop a cluster backup"))); + cluster_backup_error_if_unavailable("pg_cluster_backup_stop"); + + if (get_backup_status() == SESSION_BACKUP_RUNNING) + cluster_backup_abort_local_session_if_running(); + cluster_backup_fail_closed_unimplemented( + "pg_cluster_backup_stop", + "cluster-wide restore-point commit-drain barrier and durable per-thread " + "WAL/undo/transaction-table capture"); + PG_RETURN_NULL(); +} + +Datum +pg_cluster_create_restore_point(PG_FUNCTION_ARGS) +{ + text *restore_name = PG_GETARG_TEXT_PP(0); + char *restore_name_str; + + if (!superuser()) + ereport(ERROR, (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("must be superuser to create a cluster restore point"))); + cluster_backup_error_if_unavailable("pg_cluster_create_restore_point"); + if (RecoveryInProgress()) + ereport(ERROR, 
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("recovery is in progress"), + errhint("WAL control functions cannot be executed during recovery."))); + if (!XLogIsNeeded()) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("WAL level not sufficient for creating a restore point"), + errhint("wal_level must be set to \"replica\" or \"logical\" at server start."))); + + restore_name_str = text_to_cstring(restore_name); + if (strlen(restore_name_str) >= CLUSTER_RESTORE_POINT_NAME_MAX) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("cluster restore point name is too long"), + errdetail("Maximum length is %d bytes.", CLUSTER_RESTORE_POINT_NAME_MAX - 1))); + cluster_backup_fail_closed_unimplemented("pg_cluster_create_restore_point", + "cluster-wide restore-point commit-drain barrier"); + PG_RETURN_NULL(); +} + +Datum +cluster_get_backup_state(PG_FUNCTION_ARGS) +{ + ReturnSetInfo *rsinfo; + ClusterBackupStatus status; + Datum values[14]; + bool nulls[14]; + + InitMaterializedSRF(fcinfo, 0); + rsinfo = (ReturnSetInfo *)fcinfo->resultinfo; + if (!cluster_enabled) + return (Datum)0; + + cluster_backup_get_status(&status); + MemSet(nulls, false, sizeof(nulls)); + values[0] = BoolGetDatum(status.in_progress); + if (status.backup_id[0] == '\0') + nulls[1] = true; + else + values[1] = CStringGetTextDatum(status.backup_id); + values[2] = Int32GetDatum(status.coordinator_node_id); + values[3] = LSNGetDatum(status.start_redo_lsn); + values[4] = LSNGetDatum(status.checkpoint_lsn); + values[5] = LSNGetDatum(status.stop_cut_lsn); + values[6] = Int64GetDatum((int64)status.consistent_scn); + values[7] = Int64GetDatum((int64)status.manifest_crc); + if (status.started_at == 0) + nulls[8] = true; + else + values[8] = TimestampTzGetDatum(status.started_at); + if (status.stopped_at == 0) + nulls[9] = true; + else + values[9] = TimestampTzGetDatum(status.stopped_at); + values[10] = 
Int32GetDatum(cluster_backup_parallel_channels); + values[11] = Int32GetDatum(cluster_backup_wal_retention); + values[12] = BoolGetDatum(cluster_enable_pitr_restore_points); + values[13] = Int32GetDatum(cluster_pitr_restore_point_interval_ms); + + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; +} + +Datum +cluster_get_backup_history(PG_FUNCTION_ARGS) +{ + ReturnSetInfo *rsinfo; + ClusterBackupManifest manifest; + Datum values[9]; + bool nulls[9]; + + InitMaterializedSRF(fcinfo, 0); + rsinfo = (ReturnSetInfo *)fcinfo->resultinfo; + if (!cluster_enabled || !cluster_backup_get_last_manifest(&manifest)) + return (Datum)0; + + MemSet(nulls, false, sizeof(nulls)); + values[0] = CStringGetTextDatum(manifest.backup_id); + values[1] = Int64GetDatum((int64)manifest.consistent_scn); + values[2] = Int64GetDatum((int64)manifest.scn_durable_peak); + values[3] = Int32GetDatum((int32)manifest.timeline); + values[4] = Int64GetDatum((int64)manifest.catversion); + values[5] = Int32GetDatum((int32)manifest.backend_storage_id); + values[6] = Int32GetDatum((int32)manifest.node_count); + values[7] = Int32GetDatum((int32)manifest.thread_count); + values[8] = Int64GetDatum((int64)manifest.manifest_crc); + + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; +} + +Datum +cluster_get_restore_points(PG_FUNCTION_ARGS) +{ + ReturnSetInfo *rsinfo; + ClusterRestorePoint points[CLUSTER_BACKUP_RESTORE_POINT_MAX]; + int count; + int i; + + InitMaterializedSRF(fcinfo, 0); + rsinfo = (ReturnSetInfo *)fcinfo->resultinfo; + if (!cluster_enabled) + return (Datum)0; + + count = cluster_backup_get_restore_points(points, CLUSTER_BACKUP_RESTORE_POINT_MAX); + for (i = 0; i < count; i++) { + Datum values[5]; + bool nulls[5] = { false }; + + values[0] = CStringGetTextDatum(points[i].name); + values[1] = Int64GetDatum((int64)points[i].cut_scn); + values[2] = Int32GetDatum((int32)points[i].thread_count); + values[3] = 
Int32GetDatum((int32)points[i].incarnation); + if (points[i].created_at == 0) + nulls[4] = true; + else + values[4] = TimestampTzGetDatum(points[i].created_at); + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + } + return (Datum)0; +} + +Datum +cluster_get_pitr_status(PG_FUNCTION_ARGS) +{ + ReturnSetInfo *rsinfo; + ClusterBackupManifest manifest; + ClusterRestorePoint points[CLUSTER_BACKUP_RESTORE_POINT_MAX]; + ClusterRestorePoint chosen; + ClusterPitrTargetReason reason; + Datum values[6]; + bool nulls[6] = { false }; + int count; + int i; + const char *target_action = cluster_pitr_action_name(cluster_recovery_target_action); + bool have_requested_scn = false; + bool invalid_requested_scn = false; + SCN requested_scn = InvalidScn; + + InitMaterializedSRF(fcinfo, 0); + rsinfo = (ReturnSetInfo *)fcinfo->resultinfo; + if (!cluster_enabled) + return (Datum)0; + + if ((cluster_recovery_target_scn == NULL || cluster_recovery_target_scn[0] == '\0') + && (cluster_recovery_target_name == NULL || cluster_recovery_target_name[0] == '\0') + && cluster_recovery_target_cluster_time != NULL + && cluster_recovery_target_cluster_time[0] != '\0') { + values[0] = CStringGetTextDatum("cluster_time"); + values[1] = CStringGetTextDatum(target_action); + values[2] = BoolGetDatum(false); + values[3] = CStringGetTextDatum("unsupported_target_type"); + nulls[4] = true; + nulls[5] = true; + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; + } + + if (cluster_recovery_target_scn != NULL && cluster_recovery_target_scn[0] != '\0') { + int64 parsed = pg_strtoint64(cluster_recovery_target_scn); + + have_requested_scn = true; + if (parsed > 0) + requested_scn = (SCN)parsed; + else + invalid_requested_scn = true; + } + + if (invalid_requested_scn) { + values[0] = CStringGetTextDatum("scn"); + values[1] = CStringGetTextDatum(target_action); + values[2] = BoolGetDatum(false); + values[3] = CStringGetTextDatum("invalid_target"); + 
nulls[4] = true; + nulls[5] = true; + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; + } + + if (!have_requested_scn && cluster_recovery_target_name != NULL + && cluster_recovery_target_name[0] != '\0') { + if (!cluster_backup_get_last_manifest(&manifest)) { + values[0] = CStringGetTextDatum("name"); + values[1] = CStringGetTextDatum(target_action); + values[2] = BoolGetDatum(false); + values[3] = CStringGetTextDatum("manifest"); + nulls[4] = true; + nulls[5] = true; + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; + } + + count = cluster_backup_get_restore_points(points, CLUSTER_BACKUP_RESTORE_POINT_MAX); + for (i = 0; i < count; i++) { + if (!points[i].present) + continue; + if (strcmp(points[i].name, cluster_recovery_target_name) == 0) { + if (scn_time_cmp(points[i].cut_scn, manifest.consistent_scn) < 0) { + values[0] = CStringGetTextDatum("name"); + values[1] = CStringGetTextDatum(target_action); + values[2] = BoolGetDatum(false); + values[3] = CStringGetTextDatum("before_backup"); + nulls[4] = true; + nulls[5] = true; + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; + } + if (points[i].thread_count == 0 || points[i].thread_count > CLUSTER_MAX_NODES) { + values[0] = CStringGetTextDatum("name"); + values[1] = CStringGetTextDatum(target_action); + values[2] = BoolGetDatum(false); + values[3] = CStringGetTextDatum("missing_thread"); + nulls[4] = true; + nulls[5] = true; + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; + } + + values[0] = CStringGetTextDatum("name"); + values[1] = CStringGetTextDatum(target_action); + values[2] = BoolGetDatum(true); + values[3] = CStringGetTextDatum("ok"); + values[4] = Int64GetDatum((int64)points[i].cut_scn); + values[5] = CStringGetTextDatum(points[i].name); + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; + 
} + } + values[0] = CStringGetTextDatum("name"); + values[1] = CStringGetTextDatum(target_action); + values[2] = BoolGetDatum(false); + values[3] = CStringGetTextDatum("no_restore_point"); + nulls[4] = true; + nulls[5] = true; + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; + } + + if (!SCN_VALID(requested_scn)) { + values[0] = CStringGetTextDatum("latest"); + values[1] = CStringGetTextDatum(target_action); + values[2] = BoolGetDatum(true); + values[3] = CStringGetTextDatum("ok"); + nulls[4] = true; + nulls[5] = true; + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; + } + + if (!cluster_backup_get_last_manifest(&manifest)) { + values[0] = CStringGetTextDatum("scn"); + values[1] = CStringGetTextDatum(target_action); + values[2] = BoolGetDatum(false); + values[3] = CStringGetTextDatum("manifest"); + nulls[4] = true; + nulls[5] = true; + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; + } + + count = cluster_backup_get_restore_points(points, CLUSTER_BACKUP_RESTORE_POINT_MAX); + reason + = cluster_pitr_resolve_scn(points, count, requested_scn, manifest.consistent_scn, &chosen); + values[0] = CStringGetTextDatum("scn"); + values[1] = CStringGetTextDatum(target_action); + values[2] = BoolGetDatum(reason == CLUSTER_PITR_TARGET_OK); + values[3] = CStringGetTextDatum(cluster_pitr_target_reason_name(reason)); + if (reason == CLUSTER_PITR_TARGET_OK) { + values[4] = Int64GetDatum((int64)chosen.cut_scn); + values[5] = CStringGetTextDatum(chosen.name); + } else { + nulls[4] = true; + nulls[5] = true; + } + tuplestore_putvalues(rsinfo->setResult, rsinfo->setDesc, values, nulls); + return (Datum)0; +} + +#else /* !USE_PGRAC_CLUSTER */ + +Size +cluster_backup_shmem_size(void) +{ + return 0; +} + +void +cluster_backup_shmem_init(void) +{} + +void +cluster_backup_shmem_register(void) +{} + +void +cluster_backup_get_status(ClusterBackupStatus *out) +{ 
+ if (out != NULL) + MemSet(out, 0, sizeof(*out)); +} + +bool +cluster_backup_get_last_manifest(ClusterBackupManifest *out) +{ + if (out != NULL) + MemSet(out, 0, sizeof(*out)); + return false; +} + +int +cluster_backup_get_restore_points(ClusterRestorePoint *out, int max_points) +{ + return 0; +} + +Datum +pg_cluster_backup_start(PG_FUNCTION_ARGS) +{ + ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("pg_cluster_backup_start requires a --enable-cluster build"))); + PG_RETURN_NULL(); +} + +Datum +pg_cluster_backup_stop(PG_FUNCTION_ARGS) +{ + ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("pg_cluster_backup_stop requires a --enable-cluster build"))); + PG_RETURN_NULL(); +} + +Datum +pg_cluster_create_restore_point(PG_FUNCTION_ARGS) +{ + ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("pg_cluster_create_restore_point requires a --enable-cluster build"))); + PG_RETURN_NULL(); +} + +Datum +cluster_get_backup_state(PG_FUNCTION_ARGS) +{ + ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cluster_get_backup_state requires a --enable-cluster build"))); + PG_RETURN_NULL(); +} + +Datum +cluster_get_backup_history(PG_FUNCTION_ARGS) +{ + ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cluster_get_backup_history requires a --enable-cluster build"))); + PG_RETURN_NULL(); +} + +Datum +cluster_get_restore_points(PG_FUNCTION_ARGS) +{ + ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cluster_get_restore_points requires a --enable-cluster build"))); + PG_RETURN_NULL(); +} + +Datum +cluster_get_pitr_status(PG_FUNCTION_ARGS) +{ + ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("cluster_get_pitr_status requires a --enable-cluster build"))); + PG_RETURN_NULL(); +} + +#endif /* USE_PGRAC_CLUSTER */ diff --git a/src/backend/cluster/cluster_backup_manifest.c b/src/backend/cluster/cluster_backup_manifest.c new file mode 100644 index 00000000000..31ed2090936 --- /dev/null 
+++ b/src/backend/cluster/cluster_backup_manifest.c @@ -0,0 +1,439 @@ +/*------------------------------------------------------------------------- + * + * cluster_backup_manifest.c + * Dependency-light helpers for cluster backup manifests and PITR cuts. + * + * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * Portions Copyright (c) 2026, pgrac contributors + * + * Author: SqlRush + * + * IDENTIFICATION + * src/backend/cluster/cluster_backup_manifest.c + * + * NOTES + * This is a pgrac-original file (no derivation from PostgreSQL). + * Spec: spec-6.5-cluster-aware-backup-restore-pitr.md + * + *------------------------------------------------------------------------- + */ +#include "postgres.h" + +#include "cluster/cluster_backup.h" +#include "port/pg_crc32c.h" + +static int +cluster_backup_scn_cmp(SCN a, SCN b) +{ +#ifdef USE_PGRAC_CLUSTER + return scn_time_cmp(a, b); +#else + uint64 alocal = scn_local(a); + uint64 blocal = scn_local(b); + + if (alocal < blocal) + return -1; + if (alocal > blocal) + return 1; + return 0; +#endif +} + +void +cluster_backup_manifest_init(ClusterBackupManifest *manifest, const char *backup_id) +{ + if (manifest == NULL) + return; + + memset(manifest, 0, sizeof(*manifest)); + manifest->magic = CLUSTER_BACKUP_MANIFEST_MAGIC; + manifest->version = CLUSTER_BACKUP_MANIFEST_VERSION; + if (backup_id != NULL) + strlcpy(manifest->backup_id, backup_id, sizeof(manifest->backup_id)); +} + +bool +cluster_backup_manifest_set_thread(ClusterBackupManifest *manifest, int thread_index, + const ClusterBackupManifestThread *thread) +{ + if (manifest == NULL || thread == NULL) + return false; + if (thread_index < 0 || thread_index >= CLUSTER_MAX_NODES) + return false; + if (thread->thread_id == 0 || thread->thread_id > CLUSTER_MAX_NODES) + return false; + + if (!manifest->threads[thread_index].present) + manifest->thread_count++; + 
manifest->threads[thread_index] = *thread; + manifest->threads[thread_index].present = true; + return true; +} + +uint32 +cluster_backup_manifest_compute_crc(const ClusterBackupManifest *manifest) +{ + ClusterBackupManifest copy; + pg_crc32c crc; + + if (manifest == NULL) + return 0; + + copy = *manifest; + copy.manifest_crc = 0; + INIT_CRC32C(crc); + COMP_CRC32C(crc, &copy, sizeof(copy)); + FIN_CRC32C(crc); + return crc; +} + +void +cluster_backup_manifest_seal(ClusterBackupManifest *manifest) +{ + if (manifest == NULL) + return; + manifest->manifest_crc = cluster_backup_manifest_compute_crc(manifest); +} + +ClusterBackupManifestReason +cluster_backup_manifest_validate(const ClusterBackupManifest *manifest) +{ + int i; + int present_count = 0; + + if (manifest == NULL) + return CLUSTER_BACKUP_MANIFEST_NULL; + if (manifest->magic != CLUSTER_BACKUP_MANIFEST_MAGIC) + return CLUSTER_BACKUP_MANIFEST_BAD_MAGIC; + if (manifest->version != CLUSTER_BACKUP_MANIFEST_VERSION) + return CLUSTER_BACKUP_MANIFEST_BAD_VERSION; + if (manifest->node_count == 0 || manifest->node_count > CLUSTER_MAX_NODES + || manifest->thread_count == 0 || manifest->thread_count > CLUSTER_MAX_NODES) + return CLUSTER_BACKUP_MANIFEST_BAD_COUNTS; + if (!manifest->control_included) + return CLUSTER_BACKUP_MANIFEST_MISSING_CONTROL; + if (!SCN_VALID(manifest->consistent_scn) || !SCN_VALID(manifest->scn_durable_peak) + || cluster_backup_scn_cmp(manifest->scn_durable_peak, manifest->consistent_scn) < 0) + return CLUSTER_BACKUP_MANIFEST_BAD_SCN_PEAK; + + for (i = 0; i < CLUSTER_MAX_NODES; i++) { + const ClusterBackupManifestThread *thread = &manifest->threads[i]; + + if (!thread->present) + continue; + + present_count++; + if (thread->thread_id == 0 || thread->thread_id > CLUSTER_MAX_NODES) + return CLUSTER_BACKUP_MANIFEST_MISSING_THREAD; + if (thread->start_redo_lsn == InvalidXLogRecPtr + || thread->checkpoint_lsn == InvalidXLogRecPtr + || thread->stop_cut_lsn == InvalidXLogRecPtr + || thread->stop_cut_lsn < 
thread->start_redo_lsn) + return CLUSTER_BACKUP_MANIFEST_BAD_LSN_RANGE; + if (!thread->wal_included) + return CLUSTER_BACKUP_MANIFEST_MISSING_WAL; + if (!thread->undo_included) + return CLUSTER_BACKUP_MANIFEST_MISSING_UNDO; + if (!thread->tt_included) + return CLUSTER_BACKUP_MANIFEST_MISSING_TT; + } + + if (present_count != (int)manifest->thread_count) + return CLUSTER_BACKUP_MANIFEST_MISSING_THREAD; + if (manifest->manifest_crc != cluster_backup_manifest_compute_crc(manifest)) + return CLUSTER_BACKUP_MANIFEST_BAD_CRC; + + return CLUSTER_BACKUP_MANIFEST_OK; +} + +const char * +cluster_backup_manifest_reason_name(ClusterBackupManifestReason reason) +{ + switch (reason) { + case CLUSTER_BACKUP_MANIFEST_OK: + return "ok"; + case CLUSTER_BACKUP_MANIFEST_NULL: + return "null"; + case CLUSTER_BACKUP_MANIFEST_BAD_MAGIC: + return "bad_magic"; + case CLUSTER_BACKUP_MANIFEST_BAD_VERSION: + return "bad_version"; + case CLUSTER_BACKUP_MANIFEST_BAD_COUNTS: + return "bad_counts"; + case CLUSTER_BACKUP_MANIFEST_MISSING_THREAD: + return "missing_thread"; + case CLUSTER_BACKUP_MANIFEST_BAD_LSN_RANGE: + return "bad_lsn_range"; + case CLUSTER_BACKUP_MANIFEST_MISSING_WAL: + return "missing_wal"; + case CLUSTER_BACKUP_MANIFEST_MISSING_UNDO: + return "missing_undo"; + case CLUSTER_BACKUP_MANIFEST_MISSING_TT: + return "missing_tt"; + case CLUSTER_BACKUP_MANIFEST_MISSING_CONTROL: + return "missing_control"; + case CLUSTER_BACKUP_MANIFEST_BAD_SCN_PEAK: + return "bad_scn_peak"; + case CLUSTER_BACKUP_MANIFEST_BAD_CRC: + return "bad_crc"; + } + return "unknown"; +} + +ClusterRestorePointCutReason +cluster_restore_point_build(ClusterRestorePoint *out, const char *name, const SCN *thread_scn, + const XLogRecPtr *thread_lsn, int max_threads, + bool pending_commits_empty, bool commit_fence_held, uint32 incarnation) +{ + SCN max_scn = InvalidScn; + int i; + int nthreads = 0; + + if (!pending_commits_empty) + return CLUSTER_RESTORE_POINT_CUT_PENDING_COMMITS; + if (!commit_fence_held) + return 
CLUSTER_RESTORE_POINT_CUT_NO_FENCE; + if (out == NULL || thread_scn == NULL || thread_lsn == NULL || max_threads <= 0 + || max_threads > CLUSTER_MAX_NODES) + return CLUSTER_RESTORE_POINT_CUT_NO_THREADS; + + memset(out, 0, sizeof(*out)); + out->present = true; + out->incarnation = incarnation; + if (name != NULL) + strlcpy(out->name, name, sizeof(out->name)); + + for (i = 0; i < max_threads; i++) { + if (!SCN_VALID(thread_scn[i]) && thread_lsn[i] == InvalidXLogRecPtr) + continue; + if (!SCN_VALID(thread_scn[i]) || thread_lsn[i] == InvalidXLogRecPtr) + return CLUSTER_RESTORE_POINT_CUT_BAD_THREAD; + + out->cut_lsn[i] = thread_lsn[i]; + if (!SCN_VALID(max_scn) || cluster_backup_scn_cmp(thread_scn[i], max_scn) > 0) + max_scn = thread_scn[i]; + nthreads++; + } + + if (nthreads == 0) + return CLUSTER_RESTORE_POINT_CUT_NO_THREADS; + + out->cut_scn = max_scn; + out->thread_count = (uint32)nthreads; + return CLUSTER_RESTORE_POINT_CUT_OK; +} + +const char * +cluster_restore_point_cut_reason_name(ClusterRestorePointCutReason reason) +{ + switch (reason) { + case CLUSTER_RESTORE_POINT_CUT_OK: + return "ok"; + case CLUSTER_RESTORE_POINT_CUT_PENDING_COMMITS: + return "pending_commits"; + case CLUSTER_RESTORE_POINT_CUT_NO_FENCE: + return "no_fence"; + case CLUSTER_RESTORE_POINT_CUT_NO_THREADS: + return "no_threads"; + case CLUSTER_RESTORE_POINT_CUT_BAD_THREAD: + return "bad_thread"; + } + return "unknown"; +} + +ClusterPitrTargetReason +cluster_pitr_resolve_scn(const ClusterRestorePoint *points, int npoints, SCN requested_scn, + SCN backup_consistent_scn, ClusterRestorePoint *out) +{ + const ClusterRestorePoint *best = NULL; + int i; + + if (!SCN_VALID(requested_scn) || !SCN_VALID(backup_consistent_scn) + || cluster_backup_scn_cmp(requested_scn, backup_consistent_scn) < 0) + return CLUSTER_PITR_TARGET_BEFORE_BACKUP; + if (points == NULL || npoints <= 0) + return CLUSTER_PITR_TARGET_NO_RESTORE_POINT; + + for (i = 0; i < npoints; i++) { + const ClusterRestorePoint *point = 
&points[i]; + + if (!point->present || !SCN_VALID(point->cut_scn)) + continue; + if (cluster_backup_scn_cmp(point->cut_scn, backup_consistent_scn) < 0) + continue; + if (cluster_backup_scn_cmp(point->cut_scn, requested_scn) > 0) + continue; + if (point->thread_count == 0 || point->thread_count > CLUSTER_MAX_NODES) + return CLUSTER_PITR_TARGET_MISSING_THREAD; + if (best == NULL || cluster_backup_scn_cmp(point->cut_scn, best->cut_scn) > 0) + best = point; + } + + if (best == NULL) + return CLUSTER_PITR_TARGET_NO_RESTORE_POINT; + + if (out != NULL) + *out = *best; + return CLUSTER_PITR_TARGET_OK; +} + +const char * +cluster_pitr_target_reason_name(ClusterPitrTargetReason reason) +{ + switch (reason) { + case CLUSTER_PITR_TARGET_OK: + return "ok"; + case CLUSTER_PITR_TARGET_NO_RESTORE_POINT: + return "no_restore_point"; + case CLUSTER_PITR_TARGET_BEFORE_BACKUP: + return "before_backup"; + case CLUSTER_PITR_TARGET_MISSING_THREAD: + return "missing_thread"; + case CLUSTER_PITR_TARGET_UNARCHIVED_WAL: + return "unarchived_wal"; + } + return "unknown"; +} + +ClusterRestoreCompatibilityReason +cluster_backup_manifest_compatible(const ClusterBackupManifest *manifest, uint32 current_catversion, + uint32 current_storage_id, uint32 expected_node_count) +{ + if (cluster_backup_manifest_validate(manifest) != CLUSTER_BACKUP_MANIFEST_OK) + return CLUSTER_RESTORE_COMPAT_MANIFEST; + if (manifest->catversion != current_catversion) + return CLUSTER_RESTORE_COMPAT_CATVERSION; + if (manifest->backend_storage_id != current_storage_id) + return CLUSTER_RESTORE_COMPAT_STORAGE; + if (manifest->node_count != expected_node_count) + return CLUSTER_RESTORE_COMPAT_TOPOLOGY; + return CLUSTER_RESTORE_COMPAT_OK; +} + +const char * +cluster_restore_compat_reason_name(ClusterRestoreCompatibilityReason reason) +{ + switch (reason) { + case CLUSTER_RESTORE_COMPAT_OK: + return "ok"; + case CLUSTER_RESTORE_COMPAT_CATVERSION: + return "catversion"; + case CLUSTER_RESTORE_COMPAT_STORAGE: + return "storage"; 
+ case CLUSTER_RESTORE_COMPAT_TOPOLOGY: + return "topology"; + case CLUSTER_RESTORE_COMPAT_MANIFEST: + return "manifest"; + } + return "unknown"; +} + +void +cluster_backup_wire_request_compute_crc(ClusterBackupWireRequest *request) +{ + pg_crc32c crc; + + if (request == NULL) + return; + + request->crc = 0; + INIT_CRC32C(crc); + COMP_CRC32C(crc, request, offsetof(ClusterBackupWireRequest, crc)); + FIN_CRC32C(crc); + request->crc = crc; +} + +bool +cluster_backup_wire_request_valid(const ClusterBackupWireRequest *request) +{ + ClusterBackupWireRequest copy; + + if (request == NULL) + return false; + if (request->magic != CLUSTER_BACKUP_IC_MAGIC || request->version != CLUSTER_BACKUP_IC_VERSION) + return false; + if (request->op == CLUSTER_BACKUP_WIRE_OP_NONE + || request->op > CLUSTER_BACKUP_WIRE_OP_RESTORE_POINT) + return false; + if (request->request_id == 0) + return false; + if (request->coordinator_node_id < 0 || request->coordinator_node_id >= CLUSTER_MAX_NODES) + return false; + if (request->backup_id[CLUSTER_BACKUP_ID_MAX - 1] != '\0') + return false; + if (request->restore_point_name[CLUSTER_RESTORE_POINT_NAME_MAX - 1] != '\0') + return false; + + copy = *request; + cluster_backup_wire_request_compute_crc(&copy); + return copy.crc == request->crc; +} + +void +cluster_backup_wire_ack_compute_crc(ClusterBackupWireAck *ack) +{ + pg_crc32c crc; + + if (ack == NULL) + return; + + ack->crc = 0; + INIT_CRC32C(crc); + COMP_CRC32C(crc, ack, offsetof(ClusterBackupWireAck, crc)); + FIN_CRC32C(crc); + ack->crc = crc; +} + +bool +cluster_backup_wire_ack_valid(const ClusterBackupWireAck *ack) +{ + ClusterBackupWireAck copy; + + if (ack == NULL) + return false; + if (ack->magic != CLUSTER_BACKUP_IC_MAGIC || ack->version != CLUSTER_BACKUP_IC_VERSION) + return false; + if (ack->op == CLUSTER_BACKUP_WIRE_OP_NONE || ack->op > CLUSTER_BACKUP_WIRE_OP_RESTORE_POINT) + return false; + if (ack->result > CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR) + return false; + if 
(ack->sender_node_id < 0 || ack->sender_node_id >= CLUSTER_MAX_NODES) + return false; + if (ack->request_id == 0) + return false; + if (ack->result == CLUSTER_BACKUP_WIRE_RESULT_OK) { + if (ack->thread_id == 0 || ack->thread_id > CLUSTER_MAX_NODES) + return false; + if (ack->op == CLUSTER_BACKUP_WIRE_OP_START + && (ack->start_redo_lsn == InvalidXLogRecPtr + || ack->checkpoint_lsn == InvalidXLogRecPtr)) + return false; + if ((ack->op == CLUSTER_BACKUP_WIRE_OP_STOP + || ack->op == CLUSTER_BACKUP_WIRE_OP_RESTORE_POINT) + && (ack->stop_cut_lsn == InvalidXLogRecPtr || !SCN_VALID(ack->cut_scn))) + return false; + } + + copy = *ack; + cluster_backup_wire_ack_compute_crc(&copy); + return copy.crc == ack->crc; +} + +const char * +cluster_backup_wire_result_name(ClusterBackupWireResult result) +{ + switch (result) { + case CLUSTER_BACKUP_WIRE_RESULT_OK: + return "ok"; + case CLUSTER_BACKUP_WIRE_RESULT_BUSY: + return "busy"; + case CLUSTER_BACKUP_WIRE_RESULT_BAD_REQUEST: + return "bad_request"; + case CLUSTER_BACKUP_WIRE_RESULT_NOT_IN_BACKUP: + return "not_in_backup"; + case CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR: + return "executor_error"; + } + return "unknown"; +} diff --git a/src/backend/cluster/cluster_guc.c b/src/backend/cluster/cluster_guc.c index fca70c883e7..0d46a1f12ef 100644 --- a/src/backend/cluster/cluster_guc.c +++ b/src/backend/cluster/cluster_guc.c @@ -81,6 +81,16 @@ int cluster_recovery_workers_max = 4; /* spec-4.5 D9: merged k-way recovery (default OFF, Q8) + wait timeout. */ bool cluster_merged_recovery = false; int cluster_recovery_merge_wait_timeout = 10000; +/* spec-6.5: cluster-aware backup / restore / PITR target knobs. 
*/ +char *cluster_recovery_target_scn = NULL; +char *cluster_recovery_target_cluster_time = NULL; +char *cluster_recovery_target_name = NULL; +int cluster_recovery_target_action = CLUSTER_RECOVERY_TARGET_ACTION_PAUSE; +bool cluster_enable_pitr_restore_points = false; +int cluster_pitr_restore_point_interval_ms = 0; +int cluster_backup_wal_retention = 0; +int cluster_backup_parallel_channels = 1; +int cluster_backup_manifest_checksums = CLUSTER_BACKUP_MANIFEST_CHECKSUM_CRC32C; int cluster_shared_storage_backend = CLUSTER_SHARED_FS_BACKEND_STUB; /* spec-4.5a D2: shared data root for the cluster_fs (shared_fs) backend. */ char *cluster_shared_data_dir = NULL; @@ -817,6 +827,15 @@ static const struct config_enum_entry cluster_shared_storage_backend_options[] { "multi_attach", CLUSTER_SHARED_FS_BACKEND_MULTI_ATTACH, false }, { NULL, 0, false } }; +static const struct config_enum_entry cluster_recovery_target_action_options[] + = { { "pause", CLUSTER_RECOVERY_TARGET_ACTION_PAUSE, false }, + { "promote", CLUSTER_RECOVERY_TARGET_ACTION_PROMOTE, false }, + { "shutdown", CLUSTER_RECOVERY_TARGET_ACTION_SHUTDOWN, false }, + { NULL, 0, false } }; + +static const struct config_enum_entry cluster_backup_manifest_checksum_options[] + = { { "crc32c", CLUSTER_BACKUP_MANIFEST_CHECKSUM_CRC32C, false }, { NULL, 0, false } }; + /* * check_cluster_shared_data_dir -- GUC check_hook for @@ -1145,6 +1164,71 @@ cluster_init_guc(void) &cluster_recovery_merge_wait_timeout, 10000, 0, 600000, PGC_POSTMASTER, GUC_UNIT_MS, NULL, NULL, NULL); + DefineCustomStringVariable( + "cluster.recovery_target_scn", gettext_noop("Cluster PITR target SCN."), + gettext_noop("When set, cluster PITR status resolves the requested SCN against " + "cluster restore points and refuses unreachable targets."), + &cluster_recovery_target_scn, "", PGC_POSTMASTER, 0, NULL, NULL, NULL); + + DefineCustomStringVariable( + "cluster.recovery_target_cluster_time", gettext_noop("Cluster PITR target timestamp."), + 
gettext_noop("Reserved target timestamp for cluster-aware recovery planning. " + "Spec-6.5 exposes the configuration and status surface; the " + "startup recovery action remains fail-closed until the " + "coordinator can prove all WAL threads are present."), + &cluster_recovery_target_cluster_time, "", PGC_POSTMASTER, 0, NULL, NULL, NULL); + + DefineCustomStringVariable( + "cluster.recovery_target_name", gettext_noop("Cluster PITR named restore point target."), + gettext_noop("Reserved named cluster restore-point target. The status view " + "reports restore points produced by pg_cluster_create_restore_point."), + &cluster_recovery_target_name, "", PGC_POSTMASTER, 0, NULL, NULL, NULL); + + DefineCustomEnumVariable( + "cluster.recovery_target_action", + gettext_noop("Action to take when a cluster PITR target is reached."), + gettext_noop("Accepted values are pause, promote, and shutdown. The setting is " + "advertised with the 6.5 target surface; startup recovery remains " + "fail-closed until every required WAL thread is proven present."), + &cluster_recovery_target_action, CLUSTER_RECOVERY_TARGET_ACTION_PAUSE, + cluster_recovery_target_action_options, PGC_POSTMASTER, 0, NULL, NULL, NULL); + + DefineCustomBoolVariable( + "cluster.enable_pitr_restore_points", + gettext_noop("Enable automatic cluster restore point creation."), + gettext_noop("Manual pg_cluster_create_restore_point is available regardless of " + "this setting. 
Automatic background creation is reserved until a " + "cluster-wide cut coordinator is present."), + &cluster_enable_pitr_restore_points, false, PGC_SIGHUP, 0, NULL, NULL, NULL); + + DefineCustomIntVariable("cluster.pitr_restore_point_interval_ms", + gettext_noop("Interval for automatic cluster PITR restore points."), + gettext_noop("Zero disables automatic restore point scheduling."), + &cluster_pitr_restore_point_interval_ms, 0, 0, 86400000, PGC_SIGHUP, + GUC_UNIT_MS, NULL, NULL, NULL); + + DefineCustomIntVariable( + "cluster.backup_wal_retention", + gettext_noop("Cluster backup WAL retention hint in megabytes."), + gettext_noop("The 6.5 manifest/status surface records the setting; actual " + "multi-thread retention enforcement is deferred to the backup-set " + "writer."), + &cluster_backup_wal_retention, 0, 0, INT_MAX, PGC_SIGHUP, GUC_UNIT_MB, NULL, NULL, NULL); + + DefineCustomIntVariable( + "cluster.backup_parallel_channels", gettext_noop("Maximum cluster backup copy channels."), + gettext_noop("Reserved capacity knob for the cluster backup-set writer."), + &cluster_backup_parallel_channels, 1, 1, CLUSTER_MAX_NODES, PGC_SIGHUP, 0, NULL, NULL, + NULL); + + DefineCustomEnumVariable( + "cluster.backup_manifest_checksums", + gettext_noop("Checksum mode for cluster backup manifests."), + gettext_noop("crc32c protects the in-memory and SQL-visible manifest substrate; " + "6.5 does not provide an unchecked manifest mode."), + &cluster_backup_manifest_checksums, CLUSTER_BACKUP_MANIFEST_CHECKSUM_CRC32C, + cluster_backup_manifest_checksum_options, PGC_SIGHUP, 0, NULL, NULL, NULL); + /* * cluster.injection_points -- comma-separated list of injection point * names to auto-arm at startup with fault_type=WARNING (counter-only + diff --git a/src/backend/cluster/cluster_ic_tier1.c b/src/backend/cluster/cluster_ic_tier1.c index a07a12837c6..42889369668 100644 --- a/src/backend/cluster/cluster_ic_tier1.c +++ b/src/backend/cluster/cluster_ic_tier1.c @@ -1477,6 +1477,7 @@ 
cluster_ic_tier1_continue_hello_recv(int anon_slot, int peer_fd, int32 *out_lear const char *self_name = (ClusterConfShmem != NULL) ? ClusterConfShmem->cluster_name : "(no-conf)"; + memset(&peer_sa, 0, sizeof(peer_sa)); if (getpeername(peer_fd, (struct sockaddr *)&peer_sa, &peer_sa_len) == 0) { if (inet_ntop(AF_INET, &peer_sa.sin_addr, peer_ip, sizeof(peer_ip)) == NULL) strcpy(peer_ip, "?"); diff --git a/src/backend/cluster/cluster_lmon.c b/src/backend/cluster/cluster_lmon.c index 1c4dae9ba38..cb6f4b1b697 100644 --- a/src/backend/cluster/cluster_lmon.c +++ b/src/backend/cluster/cluster_lmon.c @@ -55,6 +55,7 @@ #include "utils/ps_status.h" #include "utils/timestamp.h" +#include "cluster/cluster_backup.h" /* cluster_backup_register_ic_msg_types + lmon_tick (spec-6.5) */ #include "cluster/cluster_clean_leave.h" /* cluster_clean_leave_register_ic_msg_types (spec-5.13 D8) */ #include "cluster/cluster_node_remove.h" /* cluster_node_remove_lmon_tick + register (spec-5.18 D9/D10) */ #include "cluster/cluster_conf.h" @@ -425,6 +426,16 @@ cluster_lmon_shmem_init(void) node_remove_registered = true; } } + /* spec-6.5 D1/D4: register cluster backup coordinator/peer request + ACK + * messages. Backends enqueue requests in shmem; LMON owns IC fanout. */ + { + static bool backup_registered = false; + + if (!backup_registered) { + cluster_backup_register_ic_msg_types(); + backup_registered = true; + } + } } @@ -1024,6 +1035,7 @@ LmonMain(void) /* spec-3.2 D6: LMON drain cross-node TT status hint outbound. * Fire-and-forget; L172 family — only LMON owns tier1 fds. */ cluster_tt_status_hint_drain_outbound(); + cluster_backup_lmon_tick(); /* * spec-2.34 D6 (HC93 leg a): TTL sweep of the GCS block @@ -1581,6 +1593,7 @@ LmonMain(void) /* spec-3.2 D6: LMON drain cross-node TT status hint outbound. * Fire-and-forget; L172 family — only LMON owns tier1 fds. 
*/ cluster_tt_status_hint_drain_outbound(); + cluster_backup_lmon_tick(); /* spec-2.34 D6 (HC93 leg a): TTL sweep GCS block dedup HTAB. */ cluster_gcs_block_dedup_sweep_expired(GetCurrentTimestamp()); diff --git a/src/backend/cluster/cluster_reconfig.c b/src/backend/cluster/cluster_reconfig.c index af747c1154c..12974307ab3 100644 --- a/src/backend/cluster/cluster_reconfig.c +++ b/src/backend/cluster/cluster_reconfig.c @@ -1156,21 +1156,22 @@ cluster_reconfig_join_publish_proven(uint64 admitted_epoch) static void cluster_reconfig_drive_joins(int coordinator) { + ClusterReconfigState *state = ReconfigShmem; uint8 join_bitmap[CLUSTER_RECONFIG_DEAD_BITMAP_BYTES]; uint8 pending_snapshot[CLUSTER_RECONFIG_DEAD_BITMAP_BYTES]; uint64 joiner_incarnations[CLUSTER_MAX_NODES]; int n_join; int i; - if (ReconfigShmem == NULL) + if (state == NULL) return; /* Phase-1 detection + a snapshot of the current pending set, under the lock * (compute_join_bitmap reads membership_state). */ - LWLockAcquire(&ReconfigShmem->lock, LW_SHARED); + LWLockAcquire(&state->lock, LW_SHARED); n_join = cluster_reconfig_compute_join_bitmap(join_bitmap); - memcpy(pending_snapshot, ReconfigShmem->pending_join_bitmap, sizeof(pending_snapshot)); - LWLockRelease(&ReconfigShmem->lock); + memcpy(pending_snapshot, state->pending_join_bitmap, sizeof(pending_snapshot)); + LWLockRelease(&state->lock); if (n_join > 0) { memset(joiner_incarnations, 0, sizeof(joiner_incarnations)); diff --git a/src/backend/cluster/cluster_shmem.c b/src/backend/cluster/cluster_shmem.c index 524d8d3ac23..42d2386b371 100644 --- a/src/backend/cluster/cluster_shmem.c +++ b/src/backend/cluster/cluster_shmem.c @@ -64,6 +64,7 @@ #include "cluster/cluster_diag.h" /* cluster_diag_shmem_register (1.13 Sprint A) */ #include "cluster/cluster_clean_leave.h" /* cluster_clean_leave_shmem_register (spec-5.13 D2) */ #include "cluster/cluster_node_remove.h" /* cluster_node_remove_shmem_register (spec-5.18 D2) */ +#include "cluster/cluster_backup.h" /* 
cluster_backup_shmem_register (spec-6.5) */ #include "cluster/cluster_inject.h" /* CLUSTER_INJECTION_POINT */ #include "cluster/cluster_lck.h" /* cluster_lck_shmem_register (1.12 Sprint A) */ #include "cluster/cluster_epoch.h" /* cluster_epoch_shmem_register (2.4) */ @@ -596,6 +597,10 @@ cluster_init_shmem_module(void) if (cluster_shmem_lookup_region("pgrac cluster node_remove") == NULL) cluster_node_remove_shmem_register(); + /* spec-6.5: register cluster backup / restore / PITR state. */ + if (cluster_shmem_lookup_region("pgrac cluster backup") == NULL) + cluster_backup_shmem_register(); + /* spec-1.14 Sprint A D7: register cluster_stats shmem region. */ if (cluster_shmem_lookup_region("pgrac cluster stats") == NULL) cluster_stats_shmem_register(); diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c index c2113ba69f5..051def95f82 100644 --- a/src/backend/storage/lmgr/lwlock.c +++ b/src/backend/storage/lmgr/lwlock.c @@ -204,6 +204,8 @@ static const char *const BuiltinTrancheNames[] = { "ClusterCleanLeave", /* PGRAC LWTRANCHE_CLUSTER_NODE_REMOVE (spec-5.18): */ "ClusterNodeRemove", + /* PGRAC LWTRANCHE_CLUSTER_BACKUP (spec-6.5): */ + "ClusterBackup", /* PGRAC LWTRANCHE_CLUSTER_STATS: */ "ClusterStats", /* PGRAC LWTRANCHE_CLUSTER_SCN: */ diff --git a/src/backend/utils/errcodes.txt b/src/backend/utils/errcodes.txt index 07b3d6027f0..35aaea85810 100644 --- a/src/backend/utils/errcodes.txt +++ b/src/backend/utils/errcodes.txt @@ -827,6 +827,16 @@ Section: Class 53 - Insufficient Resources (pgrac extension) # relfilenode (stale writeback, 8.A). 53RAA E ERRCODE_CLUSTER_OBJECT_FLUSH_UNAVAILABLE cluster_object_flush_unavailable +# spec-6.5: cluster-aware backup / restore / PITR correctness band. +# These states are raised before any path can silently produce a partial +# cluster backup, choose an unreachable PITR target, or start from an +# incomplete / incompatible restore substrate. 
+53RAB E ERRCODE_CLUSTER_BACKUP_IN_PROGRESS cluster_backup_in_progress +53RAC E ERRCODE_CLUSTER_PITR_TARGET_UNREACHABLE cluster_pitr_target_unreachable +53RAD E ERRCODE_CLUSTER_BACKUP_INCOMPLETE cluster_backup_incomplete +53RAE E ERRCODE_CLUSTER_RESTORE_INCOMPATIBLE cluster_restore_incompatible +53RAF E ERRCODE_CLUSTER_RESTORE_POINT_DRAIN_TIMEOUT cluster_restore_point_drain_timeout + Section: Class 54 - Program Limit Exceeded # this is for wired-in limits, not resource exhaustion problems (class borrowed from DB2) diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h index 2101479e991..4ff9ce47929 100644 --- a/src/include/catalog/catversion.h +++ b/src/include/catalog/catversion.h @@ -713,7 +713,11 @@ * backward replay and reconstructs commit_scn=InvalidScn for v3. No catalog * surface change; the bump fences an old binary from replaying v3-format WAL * (unknown format_version -> redo PANIC). Bump 202606330 -> 202606340. */ -#define CATALOG_VERSION_NO 202606340 +/* spec-6.5: cluster-aware backup / restore / PITR catalog surface — + * pg_cluster_backup_start/stop, pg_cluster_create_restore_point, 4 state SRFs, + * 4 system views, LWTRANCHE_CLUSTER_BACKUP, and 53RAB..53RAF SQLSTATEs. + * Bump 202606340 -> 202606350. */ +#define CATALOG_VERSION_NO 202606350 /* spec-5.13 (2026-06-27): clean-leave catalog surface — cluster_get_clean_leave_state * SRF (oid 8960) + pg_cluster_clean_leave_state view + pg_cluster_clean_leave_request diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index 5653c113676..185d352c207 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -12540,6 +12540,67 @@ prorettype => 'text', proargtypes => 'int4', prosrc => 'pg_cluster_remove_node' }, +# spec-6.5 -- cluster-aware backup / restore / PITR SQL surface. 
+{ oid => '8965', descr => 'start a cluster-aware physical backup', + proname => 'pg_cluster_backup_start', provolatile => 'v', + prorettype => 'record', proargtypes => 'text bool', + proallargtypes => '{text,bool,text,pg_lsn,pg_lsn,int4}', + proargmodes => '{i,i,o,o,o,o}', + proargnames => '{label,fast,backup_id,start_redo_lsn,checkpoint_lsn,start_tli}', + prosrc => 'pg_cluster_backup_start' }, + +{ oid => '8966', descr => 'stop a cluster-aware physical backup', + proname => 'pg_cluster_backup_stop', provolatile => 'v', + prorettype => 'record', proargtypes => 'bool', + proallargtypes => '{bool,int8,pg_lsn,int8,text}', + proargmodes => '{i,o,o,o,o}', + proargnames => '{waitforarchive,consistent_scn,stop_cut_lsn,manifest_crc,backup_label}', + prosrc => 'pg_cluster_backup_stop' }, + +{ oid => '8967', descr => 'create a cluster-aware restore point', + proname => 'pg_cluster_create_restore_point', provolatile => 'v', + prorettype => 'record', proargtypes => 'text', + proallargtypes => '{text,text,int8,pg_lsn}', + proargmodes => '{i,o,o,o}', + proargnames => '{name,restore_point_name,cut_scn,cut_lsn}', + prosrc => 'pg_cluster_create_restore_point' }, + +{ oid => '8968', descr => 'show current cluster backup state', + proname => 'cluster_get_backup_state', prorows => '1', + proretset => 't', provolatile => 'v', proparallel => 'r', + prorettype => 'record', proargtypes => '', + proallargtypes => '{bool,text,int4,pg_lsn,pg_lsn,pg_lsn,int8,int8,timestamptz,timestamptz,int4,int4,bool,int4}', + proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o,o}', + proargnames => '{in_progress,backup_id,coordinator_node_id,start_redo_lsn,checkpoint_lsn,stop_cut_lsn,consistent_scn,manifest_crc,started_at,stopped_at,backup_parallel_channels,backup_wal_retention,restore_points_enabled,restore_point_interval_ms}', + prosrc => 'cluster_get_backup_state' }, + +{ oid => '8969', descr => 'show latest cluster backup manifest summary', + proname => 'cluster_get_backup_history', prorows => '16', + proretset 
=> 't', provolatile => 'v', proparallel => 'r', + prorettype => 'record', proargtypes => '', + proallargtypes => '{text,int8,int8,int4,int8,int4,int4,int4,int8}', + proargmodes => '{o,o,o,o,o,o,o,o,o}', + proargnames => '{backup_id,consistent_scn,scn_durable_peak,timeline,catversion,storage_id,node_count,thread_count,manifest_crc}', + prosrc => 'cluster_get_backup_history' }, + +{ oid => '8970', descr => 'show cluster restore points', + proname => 'cluster_get_restore_points', prorows => '16', + proretset => 't', provolatile => 'v', proparallel => 'r', + prorettype => 'record', proargtypes => '', + proallargtypes => '{text,int8,int4,int4,timestamptz}', + proargmodes => '{o,o,o,o,o}', + proargnames => '{restore_point_name,cut_scn,thread_count,incarnation,created_at}', + prosrc => 'cluster_get_restore_points' }, + +{ oid => '8971', descr => 'show cluster PITR target resolution status', + proname => 'cluster_get_pitr_status', prorows => '1', + proretset => 't', provolatile => 'v', proparallel => 'r', + prorettype => 'record', proargtypes => '', + proallargtypes => '{text,text,bool,text,int8,text}', + proargmodes => '{o,o,o,o,o,o}', + proargnames => '{target_type,target_action,reachable,reason,resolved_scn,restore_point_name}', + prosrc => 'cluster_get_pitr_status' }, + # spec-3.2 D5b (2026-05-22) -- test-only visibility fork injection # functions. These are SQL-visible so TAP can drive a real # HeapTupleSatisfiesMVCC cluster-path miss and assert 53R97. Production diff --git a/src/include/cluster/cluster_backup.h b/src/include/cluster/cluster_backup.h new file mode 100644 index 00000000000..4bb663adf19 --- /dev/null +++ b/src/include/cluster/cluster_backup.h @@ -0,0 +1,225 @@ +/*------------------------------------------------------------------------- + * + * cluster_backup.h + * Cluster-aware backup / restore / PITR substrate. 
+ * + * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * Portions Copyright (c) 2026, pgrac contributors + * + * Author: SqlRush + * + * IDENTIFICATION + * src/include/cluster/cluster_backup.h + * + * NOTES + * This is a pgrac-original file (no derivation from PostgreSQL). + * Spec: spec-6.5-cluster-aware-backup-restore-pitr.md + * + *------------------------------------------------------------------------- + */ +#ifndef CLUSTER_BACKUP_H +#define CLUSTER_BACKUP_H + +#include "access/xlogdefs.h" +#include "c.h" +#include "cluster/cluster_conf.h" /* CLUSTER_MAX_NODES */ +#include "cluster/cluster_scn.h" +#include "datatype/timestamp.h" + +#define CLUSTER_BACKUP_ID_MAX 64 +#define CLUSTER_RESTORE_POINT_NAME_MAX 64 +#define CLUSTER_BACKUP_MANIFEST_MAGIC 0x5047424BU /* "PGBK" */ +#define CLUSTER_BACKUP_MANIFEST_VERSION 1 +#define CLUSTER_BACKUP_RESTORE_POINT_MAX 16 +#define CLUSTER_BACKUP_NODE_BITMAP_BYTES (CLUSTER_MAX_NODES / 8) +#define CLUSTER_BACKUP_IC_MAGIC 0x50424249U /* "PBBI" */ +#define CLUSTER_BACKUP_IC_VERSION 1 + +typedef enum ClusterBackupManifestReason { + CLUSTER_BACKUP_MANIFEST_OK = 0, + CLUSTER_BACKUP_MANIFEST_NULL, + CLUSTER_BACKUP_MANIFEST_BAD_MAGIC, + CLUSTER_BACKUP_MANIFEST_BAD_VERSION, + CLUSTER_BACKUP_MANIFEST_BAD_COUNTS, + CLUSTER_BACKUP_MANIFEST_MISSING_THREAD, + CLUSTER_BACKUP_MANIFEST_BAD_LSN_RANGE, + CLUSTER_BACKUP_MANIFEST_MISSING_WAL, + CLUSTER_BACKUP_MANIFEST_MISSING_UNDO, + CLUSTER_BACKUP_MANIFEST_MISSING_TT, + CLUSTER_BACKUP_MANIFEST_MISSING_CONTROL, + CLUSTER_BACKUP_MANIFEST_BAD_SCN_PEAK, + CLUSTER_BACKUP_MANIFEST_BAD_CRC +} ClusterBackupManifestReason; + +typedef enum ClusterPitrTargetReason { + CLUSTER_PITR_TARGET_OK = 0, + CLUSTER_PITR_TARGET_NO_RESTORE_POINT, + CLUSTER_PITR_TARGET_BEFORE_BACKUP, + CLUSTER_PITR_TARGET_MISSING_THREAD, + CLUSTER_PITR_TARGET_UNARCHIVED_WAL +} ClusterPitrTargetReason; + +typedef enum 
ClusterRestoreCompatibilityReason { + CLUSTER_RESTORE_COMPAT_OK = 0, + CLUSTER_RESTORE_COMPAT_CATVERSION, + CLUSTER_RESTORE_COMPAT_STORAGE, + CLUSTER_RESTORE_COMPAT_TOPOLOGY, + CLUSTER_RESTORE_COMPAT_MANIFEST +} ClusterRestoreCompatibilityReason; + +typedef enum ClusterRestorePointCutReason { + CLUSTER_RESTORE_POINT_CUT_OK = 0, + CLUSTER_RESTORE_POINT_CUT_PENDING_COMMITS, + CLUSTER_RESTORE_POINT_CUT_NO_FENCE, + CLUSTER_RESTORE_POINT_CUT_NO_THREADS, + CLUSTER_RESTORE_POINT_CUT_BAD_THREAD +} ClusterRestorePointCutReason; + +typedef enum ClusterBackupWireOp { + CLUSTER_BACKUP_WIRE_OP_NONE = 0, + CLUSTER_BACKUP_WIRE_OP_START, + CLUSTER_BACKUP_WIRE_OP_STOP, + CLUSTER_BACKUP_WIRE_OP_ABORT, + CLUSTER_BACKUP_WIRE_OP_RESTORE_POINT +} ClusterBackupWireOp; + +typedef enum ClusterBackupWireResult { + CLUSTER_BACKUP_WIRE_RESULT_OK = 0, + CLUSTER_BACKUP_WIRE_RESULT_BUSY, + CLUSTER_BACKUP_WIRE_RESULT_BAD_REQUEST, + CLUSTER_BACKUP_WIRE_RESULT_NOT_IN_BACKUP, + CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR +} ClusterBackupWireResult; + +typedef struct ClusterBackupManifestThread { + bool present; + bool wal_included; + bool undo_included; + bool tt_included; + uint32 thread_id; + int32 node_id; + XLogRecPtr start_redo_lsn; + XLogRecPtr checkpoint_lsn; + TimeLineID start_tli; + XLogRecPtr stop_cut_lsn; +} ClusterBackupManifestThread; + +typedef struct ClusterBackupManifest { + uint32 magic; + uint32 version; + char backup_id[CLUSTER_BACKUP_ID_MAX]; + SCN consistent_scn; + SCN scn_durable_peak; + TimeLineID timeline; + uint32 catversion; + uint32 incarnation; + uint32 backend_storage_id; + uint32 node_count; + uint32 thread_count; + bool control_included; + bool voting_included; + ClusterBackupManifestThread threads[CLUSTER_MAX_NODES]; + uint32 manifest_crc; +} ClusterBackupManifest; + +typedef struct ClusterRestorePoint { + bool present; + char name[CLUSTER_RESTORE_POINT_NAME_MAX]; + SCN cut_scn; + XLogRecPtr cut_lsn[CLUSTER_MAX_NODES]; + uint32 thread_count; + uint32 incarnation; + 
TimestampTz created_at; +} ClusterRestorePoint; + +typedef struct ClusterBackupStatus { + bool in_progress; + char backup_id[CLUSTER_BACKUP_ID_MAX]; + int32 coordinator_node_id; + XLogRecPtr start_redo_lsn; + XLogRecPtr checkpoint_lsn; + TimeLineID start_tli; + XLogRecPtr stop_cut_lsn; + SCN consistent_scn; + uint32 manifest_crc; + TimestampTz started_at; + TimestampTz stopped_at; +} ClusterBackupStatus; + +typedef struct ClusterBackupWireRequest { + uint32 magic; + uint16 version; + uint16 op; + uint64 request_id; + int32 coordinator_node_id; + bool fast; + bool waitforarchive; + uint16 _pad0; + SCN requested_scn; + char backup_id[CLUSTER_BACKUP_ID_MAX]; + char restore_point_name[CLUSTER_RESTORE_POINT_NAME_MAX]; + uint32 crc; +} ClusterBackupWireRequest; + +typedef struct ClusterBackupWireAck { + uint32 magic; + uint16 version; + uint16 op; + uint16 result; + int32 sender_node_id; + uint16 thread_id; + uint16 _pad0; + uint64 request_id; + XLogRecPtr start_redo_lsn; + XLogRecPtr checkpoint_lsn; + XLogRecPtr stop_cut_lsn; + SCN cut_scn; + TimeLineID timeline; + uint32 crc; +} ClusterBackupWireAck; + +extern void cluster_backup_manifest_init(ClusterBackupManifest *manifest, const char *backup_id); +extern bool cluster_backup_manifest_set_thread(ClusterBackupManifest *manifest, int thread_index, + const ClusterBackupManifestThread *thread); +extern uint32 cluster_backup_manifest_compute_crc(const ClusterBackupManifest *manifest); +extern void cluster_backup_manifest_seal(ClusterBackupManifest *manifest); +extern ClusterBackupManifestReason +cluster_backup_manifest_validate(const ClusterBackupManifest *manifest); +extern const char *cluster_backup_manifest_reason_name(ClusterBackupManifestReason reason); + +extern ClusterRestorePointCutReason +cluster_restore_point_build(ClusterRestorePoint *out, const char *name, const SCN *thread_scn, + const XLogRecPtr *thread_lsn, int max_threads, + bool pending_commits_empty, bool commit_fence_held, uint32 incarnation); +extern 
const char *cluster_restore_point_cut_reason_name(ClusterRestorePointCutReason reason); + +extern ClusterPitrTargetReason cluster_pitr_resolve_scn(const ClusterRestorePoint *points, + int npoints, SCN requested_scn, + SCN backup_consistent_scn, + ClusterRestorePoint *out); +extern const char *cluster_pitr_target_reason_name(ClusterPitrTargetReason reason); + +extern ClusterRestoreCompatibilityReason +cluster_backup_manifest_compatible(const ClusterBackupManifest *manifest, uint32 current_catversion, + uint32 current_storage_id, uint32 expected_node_count); +extern const char *cluster_restore_compat_reason_name(ClusterRestoreCompatibilityReason reason); + +extern void cluster_backup_wire_request_compute_crc(ClusterBackupWireRequest *request); +extern bool cluster_backup_wire_request_valid(const ClusterBackupWireRequest *request); +extern void cluster_backup_wire_ack_compute_crc(ClusterBackupWireAck *ack); +extern bool cluster_backup_wire_ack_valid(const ClusterBackupWireAck *ack); +extern const char *cluster_backup_wire_result_name(ClusterBackupWireResult result); + +#ifndef FRONTEND +extern Size cluster_backup_shmem_size(void); +extern void cluster_backup_shmem_init(void); +extern void cluster_backup_shmem_register(void); +extern void cluster_backup_register_ic_msg_types(void); +extern void cluster_backup_lmon_tick(void); +extern void cluster_backup_get_status(ClusterBackupStatus *out); +extern bool cluster_backup_get_last_manifest(ClusterBackupManifest *out); +extern int cluster_backup_get_restore_points(ClusterRestorePoint *out, int max_points); +#endif + +#endif /* CLUSTER_BACKUP_H */ diff --git a/src/include/cluster/cluster_guc.h b/src/include/cluster/cluster_guc.h index 03ea74a49ea..a648ce1d906 100644 --- a/src/include/cluster/cluster_guc.h +++ b/src/include/cluster/cluster_guc.h @@ -497,6 +497,24 @@ extern int cluster_boc_sweep_interval_ms; */ extern bool cluster_enabled; +/* spec-6.5: cluster-aware backup / restore / PITR target configuration. 
*/ +#define CLUSTER_RECOVERY_TARGET_ACTION_PAUSE 0 +#define CLUSTER_RECOVERY_TARGET_ACTION_PROMOTE 1 +#define CLUSTER_RECOVERY_TARGET_ACTION_SHUTDOWN 2 + +#define CLUSTER_BACKUP_MANIFEST_CHECKSUM_OFF 0 +#define CLUSTER_BACKUP_MANIFEST_CHECKSUM_CRC32C 1 + +extern char *cluster_recovery_target_scn; +extern char *cluster_recovery_target_cluster_time; +extern char *cluster_recovery_target_name; +extern int cluster_recovery_target_action; +extern bool cluster_enable_pitr_restore_points; +extern int cluster_pitr_restore_point_interval_ms; +extern int cluster_backup_wal_retention; +extern int cluster_backup_parallel_channels; +extern int cluster_backup_manifest_checksums; + /* spec-3.12 D5: own-instance undo/TT-slot retention horizon gate (default on). */ extern bool cluster_undo_retention_horizon_enabled; diff --git a/src/include/cluster/cluster_ic_envelope.h b/src/include/cluster/cluster_ic_envelope.h index 1f13f04a38a..3ce935da887 100644 --- a/src/include/cluster/cluster_ic_envelope.h +++ b/src/include/cluster/cluster_ic_envelope.h @@ -230,11 +230,18 @@ typedef enum ClusterICMsgType { * survivors (ClusterNodeRemoveAnnouncePayload: coordinator + target + remove_epoch + * removal_event_id). Survivors drop their refs to the removed node + reply * REMOVE_CLEANUP_ACK. */ - PGRAC_IC_MSG_REMOVE_CLEANUP_ACK = 32 /* PGRAC: spec-5.18 D10 — survivor -> removal + PGRAC_IC_MSG_REMOVE_CLEANUP_ACK = 32, /* PGRAC: spec-5.18 D10 — survivor -> removal * coordinator (ClusterNodeRemoveCleanupAckPayload): "I dropped all refs to the removed * node + accepted the permanent remaster"; sets the survivor's bit in the coordinator's * cleanup ACK barrier. */ - /* values 33..255 available for future sub-spec; never reuse 0..32 */ + PGRAC_IC_MSG_BACKUP_REQUEST = 33, /* PGRAC: spec-6.5 D1/D4 — backup coordinator -> + * peers (ClusterBackupWireRequest): START / STOP / ABORT / RESTORE_POINT request. 
+ * LMON-mediated; peer LMON executes the local native backup/restore-point leg and + * replies with BACKUP_ACK. */ + PGRAC_IC_MSG_BACKUP_ACK = 34 /* PGRAC: spec-6.5 D1/D4 — peer -> backup + * coordinator (ClusterBackupWireAck): local thread REDO/checkpoint/stop-cut + * metadata or fail-closed NAK reason. */ + /* values 35..255 available for future sub-spec; never reuse 0..34 */ } ClusterICMsgType; diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h index d10d6c58047..54f9d65c14d 100644 --- a/src/include/storage/lwlock.h +++ b/src/include/storage/lwlock.h @@ -249,6 +249,8 @@ typedef enum BuiltinTrancheIds { LWTRANCHE_CLUSTER_CLEAN_LEAVE, /* spec-5.18: guards the permanent-removal ClusterNodeRemoveState shmem block. */ LWTRANCHE_CLUSTER_NODE_REMOVE, + /* spec-6.5: guards the cluster backup / restore / PITR shmem block. */ + LWTRANCHE_CLUSTER_BACKUP, /* * PGRAC (stage 1.14 Sprint A): dedicated tranche for * ClusterStatsSharedState lwlock — same pattern as LMON / LCK / DIAG. diff --git a/src/test/cluster_tap/t/006_errcodes.pl b/src/test/cluster_tap/t/006_errcodes.pl index 3a6999c2e97..a83e9ef874e 100644 --- a/src/test/cluster_tap/t/006_errcodes.pl +++ b/src/test/cluster_tap/t/006_errcodes.pl @@ -1,8 +1,7 @@ #------------------------------------------------------------------------- # # 006_errcodes.pl -# End-to-end regression for the 45 cluster SQLSTATE error codes -# registered in stage 0.12. +# End-to-end regression for cluster SQLSTATE error codes. 
# # Cluster errcodes are registered in src/backend/utils/errcodes.txt # and become available to plpgsql via PG's auto-generated @@ -103,6 +102,8 @@ sub raise_unknown "cluster_lms_queue_full -> 53R01"); is(raise_and_get_sqlstate('cluster_reconfig_in_progress'), '53R60', "cluster_reconfig_in_progress -> 53R60"); +is(raise_and_get_sqlstate('cluster_backup_incomplete'), '53RAD', + "cluster_backup_incomplete -> 53RAD"); is(raise_and_get_sqlstate('cluster_shared_storage_failed'), '58R01', "cluster_shared_storage_failed -> 58R01"); diff --git a/src/test/cluster_tap/t/007_guc.pl b/src/test/cluster_tap/t/007_guc.pl index 7724eb21843..787953a368c 100644 --- a/src/test/cluster_tap/t/007_guc.pl +++ b/src/test/cluster_tap/t/007_guc.pl @@ -179,5 +179,28 @@ qr/999 is outside the valid range for parameter "cluster.node_id"/, 'startup log contains GUC out-of-range WARNING for cluster.node_id'); +# ---------- +# spec-6.5 cluster backup / PITR GUCs. +# ---------- +is($node->safe_psql('postgres', + q{SELECT setting || '|' || vartype || '|' || context + FROM pg_settings WHERE name = 'cluster.recovery_target_scn'}), + '|string|postmaster', + 'cluster.recovery_target_scn default and context'); +is($node->safe_psql('postgres', + q{SELECT setting || '|' || vartype || '|' || context + FROM pg_settings WHERE name = 'cluster.recovery_target_action'}), + 'pause|enum|postmaster', + 'cluster.recovery_target_action default and context'); +is($node->safe_psql('postgres', + q{SELECT setting || '|' || vartype || '|' || context + FROM pg_settings WHERE name = 'cluster.backup_manifest_checksums'}), + 'crc32c|enum|sighup', + 'cluster.backup_manifest_checksums is mandatory crc32c'); +is($node->safe_psql('postgres', + q{SELECT setting || '|' || vartype || '|' || context + FROM pg_settings WHERE name = 'cluster.backup_parallel_channels'}), + '1|integer|sighup', + 'cluster.backup_parallel_channels default and context'); done_testing(); diff --git a/src/test/cluster_tap/t/020_shmem_registry.pl 
b/src/test/cluster_tap/t/020_shmem_registry.pl index d40aa6578f3..4e6c119e29f 100644 --- a/src/test/cluster_tap/t/020_shmem_registry.pl +++ b/src/test/cluster_tap/t/020_shmem_registry.pl @@ -93,9 +93,11 @@ # "cr counters"). # spec-5.18 D2: +1 "pgrac cluster node_remove" (permanent-removal driver state; # always registered; sorts between "multixact overlay" and "pcm grd"). -my $expected_region_count = $has_visibility_inject ? '68' : '67'; +# spec-6.5: +1 "pgrac cluster backup" (backup / restore / PITR state; sorts +# between "advisory" and "cf stats"). +my $expected_region_count = $has_visibility_inject ? '69' : '68'; my $expected_regions = - 'pgrac block recovery,pgrac cluster advisory,pgrac cluster cf stats,pgrac cluster clean_leave,pgrac cluster conf,pgrac cluster control,pgrac cluster cr admit stats,pgrac cluster cr coordinator,pgrac cluster cr counters,pgrac cluster cr pool,pgrac cluster cr relgen,pgrac cluster cr tuple stats,pgrac cluster cssd,pgrac cluster diag,pgrac cluster dl,pgrac cluster durable tt counters,pgrac cluster epoch,pgrac cluster fence,pgrac cluster gcs,pgrac cluster gcs block,pgrac cluster gcs block dedup,pgrac cluster ges,pgrac cluster ges dedup,pgrac cluster ges reply wait,pgrac cluster grd,pgrac cluster grd outbound,pgrac cluster grd pending,pgrac cluster grd work queue,pgrac cluster hw,pgrac cluster ir,pgrac cluster ko,pgrac cluster lck,pgrac cluster lmd,pgrac cluster lmd graph,pgrac cluster lmd probe,pgrac cluster lmon,pgrac cluster lms,pgrac cluster lock-path counters,pgrac cluster multixact overlay,pgrac cluster node_remove,pgrac cluster pcm grd,pgrac cluster qvotec,pgrac cluster reconfig,pgrac cluster resolver cache,pgrac cluster scn,pgrac cluster sequence,pgrac cluster sinval ack outbound,pgrac cluster sinval ack wait,pgrac cluster sinval inbound,pgrac cluster sinval outbound,pgrac cluster smgr,pgrac cluster startup phase,pgrac cluster stats,pgrac cluster subtrans state,pgrac cluster ts,pgrac cluster tt local seq,pgrac cluster tt slot 
allocator,pgrac cluster tt status hint outbound,pgrac cluster tt status overlay,pgrac cluster tx enqueue,pgrac cluster undo cleaner,pgrac cluster undo record cursor'; + 'pgrac block recovery,pgrac cluster advisory,pgrac cluster backup,pgrac cluster cf stats,pgrac cluster clean_leave,pgrac cluster conf,pgrac cluster control,pgrac cluster cr admit stats,pgrac cluster cr coordinator,pgrac cluster cr counters,pgrac cluster cr pool,pgrac cluster cr relgen,pgrac cluster cr tuple stats,pgrac cluster cssd,pgrac cluster diag,pgrac cluster dl,pgrac cluster durable tt counters,pgrac cluster epoch,pgrac cluster fence,pgrac cluster gcs,pgrac cluster gcs block,pgrac cluster gcs block dedup,pgrac cluster ges,pgrac cluster ges dedup,pgrac cluster ges reply wait,pgrac cluster grd,pgrac cluster grd outbound,pgrac cluster grd pending,pgrac cluster grd work queue,pgrac cluster hw,pgrac cluster ir,pgrac cluster ko,pgrac cluster lck,pgrac cluster lmd,pgrac cluster lmd graph,pgrac cluster lmd probe,pgrac cluster lmon,pgrac cluster lms,pgrac cluster lock-path counters,pgrac cluster multixact overlay,pgrac cluster node_remove,pgrac cluster pcm grd,pgrac cluster qvotec,pgrac cluster reconfig,pgrac cluster resolver cache,pgrac cluster scn,pgrac cluster sequence,pgrac cluster sinval ack outbound,pgrac cluster sinval ack wait,pgrac cluster sinval inbound,pgrac cluster sinval outbound,pgrac cluster smgr,pgrac cluster startup phase,pgrac cluster stats,pgrac cluster subtrans state,pgrac cluster ts,pgrac cluster tt local seq,pgrac cluster tt slot allocator,pgrac cluster tt status hint outbound,pgrac cluster tt status overlay,pgrac cluster tx enqueue,pgrac cluster undo cleaner,pgrac cluster undo record cursor'; $expected_regions .= ',pgrac cluster visibility inject' if $has_visibility_inject; # spec-4.12 D7: cooperative write-fence region; always registered. 
Sorts after diff --git a/src/test/cluster_tap/t/332_cluster_backup_pitr.pl b/src/test/cluster_tap/t/332_cluster_backup_pitr.pl new file mode 100644 index 00000000000..51b0c2cd577 --- /dev/null +++ b/src/test/cluster_tap/t/332_cluster_backup_pitr.pl @@ -0,0 +1,167 @@ +#!/usr/bin/env perl +#------------------------------------------------------------------------- +# +# 332_cluster_backup_pitr.pl +# spec-6.5 -- cluster-aware backup / restore / PITR SQL surface. +# +# IDENTIFICATION +# src/test/cluster_tap/t/332_cluster_backup_pitr.pl +# +# Author: SqlRush +# +# Portions Copyright (c) 2026, pgrac contributors +# +#------------------------------------------------------------------------- + +use strict; +use warnings; + +use FindBin; +use IO::Socket::INET; +use lib "$FindBin::RealBin/../lib"; + +use PgracClusterNode; +use PostgreSQL::Test::Utils; +use Test::More; + +my $next_high_port = $ENV{PGRAC_BACKUP_TAP_PORT_BASE} // 60432; + +sub next_free_high_port +{ + for (1 .. 256) + { + my $port = $next_high_port++; + my $sock = IO::Socket::INET->new( + Listen => 5, + LocalAddr => '127.0.0.1', + LocalPort => $port, + Proto => 'tcp', + ReuseAddr => 1); + if ($sock) + { + close $sock; + return $port; + } + } + die "could not find a free high TCP port for cluster backup TAP"; +} + +my $node = PgracClusterNode->new('cluster_backup_single', + port => next_free_high_port()); +$node->init(allows_streaming => 1); +$node->append_conf('postgresql.conf', + "cluster.enabled = on\n" + . "cluster.node_id = 0\n" + . "cluster.allow_single_node = on\n" + . 
"wal_level = replica\n"); +$node->start; + +is($node->safe_psql('postgres', + q{SELECT count(*) FROM pg_stat_cluster_backup}), + '1', + 'L1 backup state view is present'); +is($node->safe_psql('postgres', + q{SELECT backup_parallel_channels || ',' || backup_wal_retention || ',' || + CASE WHEN restore_points_enabled THEN 't' ELSE 'f' END || ',' || + restore_point_interval_ms + FROM pg_stat_cluster_backup}), + '1,0,f,0', + 'L1 backup state view exposes backup/PITR GUC readers'); +is($node->safe_psql('postgres', + q{SELECT target_type || ',' || target_action || ',' || + CASE WHEN reachable THEN 't' ELSE 'f' END || ',' || reason + FROM pg_cluster_pitr_status}), + 'latest,pause,t,ok', + 'L2 default PITR target status is latest/pause/ok'); + +my ($backup_ret, $backup_out, $backup_err) = $node->psql('postgres', + "\\set VERBOSITY verbose\nSELECT * FROM pg_cluster_backup_start('b332', true)"); +isnt($backup_ret, 0, + 'L3 cluster backup start fails closed until physical capture lands'); +like($backup_err, qr/0A000|feature_not_supported/, + 'L3 cluster backup start reports feature_not_supported'); +like($backup_err, qr/physical backup capture|durable WAL pinning|restore integration/, + 'L3 cluster backup start names the missing substrate'); + +is($node->safe_psql('postgres', + q{SELECT CASE WHEN in_progress THEN 't' ELSE 'f' END + FROM pg_stat_cluster_backup}), + 'f', + 'L4 rejected cluster backup does not leave in-progress state'); +is($node->safe_psql('postgres', + q{SELECT count(*) FROM pg_cluster_backup_history}), + '0', + 'L4 rejected cluster backup does not publish a manifest'); + +my ($rp_ret, $rp_out, $rp_err) = $node->psql('postgres', + "\\set VERBOSITY verbose\nSELECT * FROM pg_cluster_create_restore_point('rp332')"); +isnt($rp_ret, 0, + 'L5 cluster restore point fails closed until commit-drain lands'); +like($rp_err, qr/0A000|feature_not_supported/, + 'L5 cluster restore point reports feature_not_supported'); +like($rp_err, qr/restore-point commit-drain barrier/, 
+ 'L5 cluster restore point names the missing barrier'); + +is($node->safe_psql('postgres', + q{SELECT count(*) FROM pg_cluster_restore_points}), + '0', + 'L6 rejected restore point is not retained'); + +$node->stop; + +my $peer_node = PgracClusterNode->new('cluster_backup_peers', + port => next_free_high_port()); +my $peer_ic0 = next_free_high_port(); +my $peer_ic1 = next_free_high_port(); +$peer_node->init(allows_streaming => 1); +$peer_node->append_conf('postgresql.conf', + "cluster.enabled = on\n" + . "cluster.node_id = 0\n" + . "cluster.allow_single_node = on\n" + . "wal_level = replica\n"); +PostgreSQL::Test::Utils::append_to_file($peer_node->data_dir . '/pgrac.conf', <start; + +my ($ret, $out, $err) = $peer_node->psql('postgres', + "\\set VERBOSITY verbose\nSELECT * FROM pg_cluster_backup_start('partial', true)"); +isnt($ret, 0, 'L8 declared-peer backup remains fail-closed without capture substrate'); +like($err, qr/0A000|feature_not_supported/, + 'L8 declared-peer backup reports feature_not_supported'); +is($peer_node->safe_psql('postgres', + q{SELECT CASE WHEN in_progress THEN 't' ELSE 'f' END + FROM pg_stat_cluster_backup}), + 'f', + 'L9 failed peer backup did not leave in-progress state'); + +$peer_node->stop; + +my $bad_target_node = PgracClusterNode->new('cluster_backup_bad_target', + port => next_free_high_port()); +$bad_target_node->init(allows_streaming => 1); +$bad_target_node->append_conf('postgresql.conf', + "cluster.enabled = on\n" + . "cluster.node_id = 0\n" + . "cluster.allow_single_node = on\n" + . "wal_level = replica\n" + . 
"cluster.recovery_target_scn = '0'\n"); +$bad_target_node->start; + +is($bad_target_node->safe_psql('postgres', + q{SELECT target_type || ',' || target_action || ',' || + CASE WHEN reachable THEN 't' ELSE 'f' END || ',' || reason + FROM pg_cluster_pitr_status}), + 'scn,pause,f,invalid_target', + 'L10 invalid SCN PITR target fails closed'); + +$bad_target_node->stop; + +done_testing(); diff --git a/src/test/cluster_unit/Makefile b/src/test/cluster_unit/Makefile index 28d409d5699..6ee1bcb0a98 100644 --- a/src/test/cluster_unit/Makefile +++ b/src/test/cluster_unit/Makefile @@ -47,6 +47,7 @@ TESTS = test_cluster_basic test_cluster_version test_cluster_backend_types \ test_cluster_retention test_cluster_undo_cleaner test_cluster_visibility_variants test_cluster_tt_2pc \ test_cluster_stage3_acceptance test_cluster_undo_buf test_cluster_undo_extent \ test_cluster_wal_thread test_cluster_wal_state test_cluster_recovery_plan \ + test_cluster_backup \ test_cluster_recovery_worker test_cluster_recovery_merge test_cluster_remote_xact \ test_cluster_block_apply test_cluster_thread_apply test_cluster_thread_replay \ test_cluster_thread_driver test_cluster_thread_orchestrator \ @@ -144,13 +145,23 @@ test_cluster_membership: test_cluster_membership.c unit_test.h $(CLUSTER_VERSION $(CLUSTER_VERSION_O) $(CLUSTER_MEMBERSHIP_O) \ $(top_builddir)/src/port/libpgport_srv.a -o $@ +# spec-6.5 D0/D1: test_cluster_backup links only the dependency-light +# manifest/PITR helper object. Runtime SQL/shmem code stays in +# cluster_backup.o and is intentionally not pulled into this pure test. +CLUSTER_BACKUP_MANIFEST_O = $(top_builddir)/src/backend/cluster/cluster_backup_manifest.o +test_cluster_backup: test_cluster_backup.c unit_test.h $(CLUSTER_VERSION_O) \ + $(CLUSTER_BACKUP_MANIFEST_O) + $(CC) $(CFLAGS) $(CPPFLAGS) $< \ + $(CLUSTER_VERSION_O) $(CLUSTER_BACKUP_MANIFEST_O) \ + $(top_builddir)/src/port/libpgport_srv.a -o $@ + # Most tests link only cluster_version.o. 
test_cluster_guc / # test_cluster_shmem / test_cluster_signal / test_cluster_views / # test_cluster_gviews / test_cluster_ic / test_cluster_conf have # separate rules because they also link additional cluster_*.o # objects (the test files stub the PG backend symbols those # objects reference). -SIMPLE_TESTS = $(filter-out test_cluster_guc test_cluster_shmem test_cluster_signal test_cluster_views test_cluster_gviews test_cluster_ic test_cluster_conf test_cluster_ic_mock test_cluster_inject test_cluster_pgstat test_cluster_debug test_cluster_shared_fs test_cluster_shared_fs_sharedfs test_cluster_smgr test_cluster_startup_phase test_cluster_lmon test_cluster_lck test_cluster_diag test_cluster_stats test_cluster_cssd test_cluster_qvotec test_cluster_voting_disk_io test_cluster_quorum_decision test_cluster_scn test_cluster_epoch test_cluster_fence test_cluster_reconfig test_cluster_ges test_cluster_grd test_cluster_grd_starvation test_cluster_lmd test_cluster_lmd_graph test_cluster_lmd_wait_state test_cluster_cancel_token test_cluster_lmd_probe_collector test_cluster_lock_acquire test_cluster_advisory test_cluster_retention test_cluster_visibility_variants test_cluster_tt_2pc test_cluster_stage3_acceptance test_cluster_undo_buf test_cluster_block_apply test_cluster_thread_apply test_cluster_thread_replay test_cluster_thread_driver test_cluster_thread_orchestrator test_cluster_write_fence test_cluster_stage4_acceptance test_cluster_stage5_5_cr_acceptance test_cluster_stage5_integrated_acceptance test_cluster_ges_mode test_cluster_sequence test_cluster_hw test_cluster_dl test_cluster_extend_gate test_cluster_ir test_cluster_ts test_cluster_ko test_cluster_hw_snapshot test_cluster_cf_authority test_cluster_cf_storage test_cluster_cf_enqueue test_cluster_cf_phase2 test_cluster_cf_stats test_cluster_hang test_cluster_hang_resolve test_cluster_touched_peers test_cluster_clean_leave test_cluster_membership test_cluster_node_remove test_cluster_resolver_cache,$(TESTS)) 
+SIMPLE_TESTS = $(filter-out test_cluster_guc test_cluster_shmem test_cluster_signal test_cluster_views test_cluster_gviews test_cluster_ic test_cluster_conf test_cluster_ic_mock test_cluster_inject test_cluster_pgstat test_cluster_debug test_cluster_shared_fs test_cluster_shared_fs_sharedfs test_cluster_smgr test_cluster_startup_phase test_cluster_lmon test_cluster_lck test_cluster_diag test_cluster_stats test_cluster_cssd test_cluster_qvotec test_cluster_voting_disk_io test_cluster_quorum_decision test_cluster_scn test_cluster_epoch test_cluster_fence test_cluster_reconfig test_cluster_ges test_cluster_grd test_cluster_grd_starvation test_cluster_lmd test_cluster_lmd_graph test_cluster_lmd_wait_state test_cluster_cancel_token test_cluster_lmd_probe_collector test_cluster_lock_acquire test_cluster_advisory test_cluster_retention test_cluster_visibility_variants test_cluster_tt_2pc test_cluster_stage3_acceptance test_cluster_undo_buf test_cluster_block_apply test_cluster_thread_apply test_cluster_thread_replay test_cluster_thread_driver test_cluster_thread_orchestrator test_cluster_write_fence test_cluster_stage4_acceptance test_cluster_stage5_5_cr_acceptance test_cluster_stage5_integrated_acceptance test_cluster_ges_mode test_cluster_sequence test_cluster_hw test_cluster_dl test_cluster_extend_gate test_cluster_ir test_cluster_ts test_cluster_ko test_cluster_hw_snapshot test_cluster_cf_authority test_cluster_cf_storage test_cluster_cf_enqueue test_cluster_cf_phase2 test_cluster_cf_stats test_cluster_hang test_cluster_hang_resolve test_cluster_touched_peers test_cluster_clean_leave test_cluster_membership test_cluster_node_remove test_cluster_resolver_cache test_cluster_backup,$(TESTS)) # spec-2.4 D16: test_cluster_epoch links cluster_epoch.o standalone. 
# cluster_epoch.c references ShmemInitStruct + cluster_shmem_register_region diff --git a/src/test/cluster_unit/test_cluster_backup.c b/src/test/cluster_unit/test_cluster_backup.c new file mode 100644 index 00000000000..23776220d90 --- /dev/null +++ b/src/test/cluster_unit/test_cluster_backup.c @@ -0,0 +1,351 @@ +/*------------------------------------------------------------------------- + * + * test_cluster_backup.c + * spec-6.5 unit tests for cluster backup / restore / PITR helpers. + * + * Portions Copyright (c) 1996-2024, PostgreSQL Global Development Group + * Portions Copyright (c) 1994, Regents of the University of California + * Portions Copyright (c) 2026, pgrac contributors + * + * Author: SqlRush + * + * Spec: spec-6.5-cluster-aware-backup-restore-pitr.md + * + * IDENTIFICATION + * src/test/cluster_unit/test_cluster_backup.c + * + * NOTES + * pgrac-original file. + * + *------------------------------------------------------------------------- + */ +#include "postgres.h" + +#include + +#include "catalog/catversion.h" +#include "cluster/cluster_backup.h" + +#undef printf +#undef fprintf +#undef snprintf + +#include "unit_test.h" + +UT_DEFINE_GLOBALS(); + +void +ExceptionalCondition(const char *conditionName, const char *fileName, int lineNumber) +{ + printf("# Assert failed: %s at %s:%d\n", conditionName, fileName, lineNumber); + abort(); +} + +int +scn_time_cmp(SCN a, SCN b) +{ + uint64 alocal = scn_local(a); + uint64 blocal = scn_local(b); + + if (alocal < blocal) + return -1; + if (alocal > blocal) + return 1; + return 0; +} + +static SCN +test_scn(uint64 local) +{ + return scn_encode(0, local); +} + +static void +fill_valid_manifest(ClusterBackupManifest *m) +{ + ClusterBackupManifestThread thread; + + cluster_backup_manifest_init(m, "b1"); + m->consistent_scn = test_scn(10); + m->scn_durable_peak = test_scn(12); + m->timeline = 1; + m->catversion = CATALOG_VERSION_NO; + m->backend_storage_id = 3; + m->node_count = 1; + m->control_included = true; + + 
memset(&thread, 0, sizeof(thread)); + thread.thread_id = 1; + thread.node_id = 0; + thread.start_redo_lsn = 10; + thread.checkpoint_lsn = 20; + thread.start_tli = 1; + thread.stop_cut_lsn = 40; + thread.wal_included = true; + thread.undo_included = true; + thread.tt_included = true; + UT_ASSERT(cluster_backup_manifest_set_thread(m, 0, &thread)); + cluster_backup_manifest_seal(m); +} + +UT_TEST(test_manifest_validates_complete_single_thread) +{ + ClusterBackupManifest m; + + fill_valid_manifest(&m); + UT_ASSERT_EQ(cluster_backup_manifest_validate(&m), CLUSTER_BACKUP_MANIFEST_OK); + UT_ASSERT_STR_EQ(cluster_backup_manifest_reason_name(CLUSTER_BACKUP_MANIFEST_OK), "ok"); + UT_ASSERT_NE(cluster_backup_manifest_compute_crc(&m), 0); +} + +UT_TEST(test_manifest_rejects_missing_control_wal_undo_tt) +{ + ClusterBackupManifest m; + + fill_valid_manifest(&m); + m.control_included = false; + cluster_backup_manifest_seal(&m); + UT_ASSERT_EQ(cluster_backup_manifest_validate(&m), CLUSTER_BACKUP_MANIFEST_MISSING_CONTROL); + + fill_valid_manifest(&m); + m.threads[0].wal_included = false; + cluster_backup_manifest_seal(&m); + UT_ASSERT_EQ(cluster_backup_manifest_validate(&m), CLUSTER_BACKUP_MANIFEST_MISSING_WAL); + + fill_valid_manifest(&m); + m.threads[0].undo_included = false; + cluster_backup_manifest_seal(&m); + UT_ASSERT_EQ(cluster_backup_manifest_validate(&m), CLUSTER_BACKUP_MANIFEST_MISSING_UNDO); + + fill_valid_manifest(&m); + m.threads[0].tt_included = false; + cluster_backup_manifest_seal(&m); + UT_ASSERT_EQ(cluster_backup_manifest_validate(&m), CLUSTER_BACKUP_MANIFEST_MISSING_TT); +} + +UT_TEST(test_manifest_rejects_bad_scn_lsn_count_and_crc) +{ + ClusterBackupManifest m; + + fill_valid_manifest(&m); + m.scn_durable_peak = test_scn(9); + cluster_backup_manifest_seal(&m); + UT_ASSERT_EQ(cluster_backup_manifest_validate(&m), CLUSTER_BACKUP_MANIFEST_BAD_SCN_PEAK); + + fill_valid_manifest(&m); + m.threads[0].stop_cut_lsn = 9; + cluster_backup_manifest_seal(&m); + 
UT_ASSERT_EQ(cluster_backup_manifest_validate(&m), CLUSTER_BACKUP_MANIFEST_BAD_LSN_RANGE); + + fill_valid_manifest(&m); + m.thread_count = 2; + cluster_backup_manifest_seal(&m); + UT_ASSERT_EQ(cluster_backup_manifest_validate(&m), CLUSTER_BACKUP_MANIFEST_MISSING_THREAD); + + fill_valid_manifest(&m); + m.manifest_crc++; + UT_ASSERT_EQ(cluster_backup_manifest_validate(&m), CLUSTER_BACKUP_MANIFEST_BAD_CRC); +} + +UT_TEST(test_manifest_set_thread_is_bounds_defensive) +{ + ClusterBackupManifest m; + ClusterBackupManifestThread thread; + + cluster_backup_manifest_init(&m, "bounds"); + memset(&thread, 0, sizeof(thread)); + thread.thread_id = 1; + UT_ASSERT(!cluster_backup_manifest_set_thread(NULL, 0, &thread)); + UT_ASSERT(!cluster_backup_manifest_set_thread(&m, -1, &thread)); + UT_ASSERT(!cluster_backup_manifest_set_thread(&m, CLUSTER_MAX_NODES, &thread)); + thread.thread_id = 0; + UT_ASSERT(!cluster_backup_manifest_set_thread(&m, 0, &thread)); + thread.thread_id = CLUSTER_MAX_NODES + 1; + UT_ASSERT(!cluster_backup_manifest_set_thread(&m, 0, &thread)); +} + +UT_TEST(test_restore_point_cut_requires_drain_and_fence) +{ + SCN scns[CLUSTER_MAX_NODES]; + XLogRecPtr lsns[CLUSTER_MAX_NODES]; + ClusterRestorePoint point; + + memset(scns, 0, sizeof(scns)); + memset(lsns, 0, sizeof(lsns)); + scns[0] = test_scn(20); + lsns[0] = 500; + + UT_ASSERT_EQ( + cluster_restore_point_build(&point, "rp", scns, lsns, CLUSTER_MAX_NODES, false, true, 0), + CLUSTER_RESTORE_POINT_CUT_PENDING_COMMITS); + UT_ASSERT_EQ( + cluster_restore_point_build(&point, "rp", scns, lsns, CLUSTER_MAX_NODES, true, false, 0), + CLUSTER_RESTORE_POINT_CUT_NO_FENCE); +} + +UT_TEST(test_restore_point_cut_records_all_threads) +{ + SCN scns[CLUSTER_MAX_NODES]; + XLogRecPtr lsns[CLUSTER_MAX_NODES]; + ClusterRestorePoint point; + + memset(scns, 0, sizeof(scns)); + memset(lsns, 0, sizeof(lsns)); + scns[0] = test_scn(20); + lsns[0] = 500; + scns[2] = test_scn(30); + lsns[2] = 700; + + UT_ASSERT_EQ( + 
cluster_restore_point_build(&point, "rp", scns, lsns, CLUSTER_MAX_NODES, true, true, 9), + CLUSTER_RESTORE_POINT_CUT_OK); + UT_ASSERT_EQ(point.thread_count, 2); + UT_ASSERT_EQ(point.cut_scn, test_scn(30)); + UT_ASSERT_EQ(point.incarnation, 9); + UT_ASSERT_EQ(point.cut_lsn[2], 700); +} + +UT_TEST(test_restore_point_cut_rejects_partial_thread) +{ + SCN scns[CLUSTER_MAX_NODES]; + XLogRecPtr lsns[CLUSTER_MAX_NODES]; + ClusterRestorePoint point; + + memset(scns, 0, sizeof(scns)); + memset(lsns, 0, sizeof(lsns)); + scns[0] = test_scn(20); + UT_ASSERT_EQ( + cluster_restore_point_build(&point, "rp", scns, lsns, CLUSTER_MAX_NODES, true, true, 0), + CLUSTER_RESTORE_POINT_CUT_BAD_THREAD); +} + +UT_TEST(test_pitr_resolves_latest_reachable_restore_point) +{ + ClusterRestorePoint points[3]; + ClusterRestorePoint chosen; + + memset(points, 0, sizeof(points)); + points[0].present = true; + points[0].cut_scn = test_scn(20); + points[0].thread_count = 1; + strlcpy(points[0].name, "a", sizeof(points[0].name)); + points[1].present = true; + points[1].cut_scn = test_scn(30); + points[1].thread_count = 1; + strlcpy(points[1].name, "b", sizeof(points[1].name)); + points[2].present = true; + points[2].cut_scn = test_scn(50); + points[2].thread_count = 1; + strlcpy(points[2].name, "c", sizeof(points[2].name)); + + UT_ASSERT_EQ(cluster_pitr_resolve_scn(points, 3, test_scn(35), test_scn(10), &chosen), + CLUSTER_PITR_TARGET_OK); + UT_ASSERT_EQ(chosen.cut_scn, test_scn(30)); + UT_ASSERT_STR_EQ(chosen.name, "b"); +} + +UT_TEST(test_pitr_fail_closed_reasons) +{ + ClusterRestorePoint point; + + memset(&point, 0, sizeof(point)); + point.present = true; + point.cut_scn = test_scn(20); + point.thread_count = 0; + UT_ASSERT_EQ(cluster_pitr_resolve_scn(&point, 1, test_scn(20), test_scn(10), NULL), + CLUSTER_PITR_TARGET_MISSING_THREAD); + UT_ASSERT_EQ(cluster_pitr_resolve_scn(NULL, 0, test_scn(20), test_scn(10), NULL), + CLUSTER_PITR_TARGET_NO_RESTORE_POINT); + 
UT_ASSERT_EQ(cluster_pitr_resolve_scn(&point, 1, test_scn(5), test_scn(10), NULL), + CLUSTER_PITR_TARGET_BEFORE_BACKUP); +} + +UT_TEST(test_restore_compatibility_rejects_mismatches) +{ + ClusterBackupManifest m; + + fill_valid_manifest(&m); + UT_ASSERT_EQ(cluster_backup_manifest_compatible(&m, CATALOG_VERSION_NO, 3, 1), + CLUSTER_RESTORE_COMPAT_OK); + UT_ASSERT_EQ(cluster_backup_manifest_compatible(&m, CATALOG_VERSION_NO + 1, 3, 1), + CLUSTER_RESTORE_COMPAT_CATVERSION); + UT_ASSERT_EQ(cluster_backup_manifest_compatible(&m, CATALOG_VERSION_NO, 4, 1), + CLUSTER_RESTORE_COMPAT_STORAGE); + UT_ASSERT_EQ(cluster_backup_manifest_compatible(&m, CATALOG_VERSION_NO, 3, 2), + CLUSTER_RESTORE_COMPAT_TOPOLOGY); + m.manifest_crc++; + UT_ASSERT_EQ(cluster_backup_manifest_compatible(&m, CATALOG_VERSION_NO, 3, 1), + CLUSTER_RESTORE_COMPAT_MANIFEST); +} + +UT_TEST(test_backup_wire_request_crc_and_bounds) +{ + ClusterBackupWireRequest req; + + memset(&req, 0, sizeof(req)); + req.magic = CLUSTER_BACKUP_IC_MAGIC; + req.version = CLUSTER_BACKUP_IC_VERSION; + req.op = CLUSTER_BACKUP_WIRE_OP_START; + req.request_id = 42; + req.coordinator_node_id = 0; + strlcpy(req.backup_id, "b-wire", sizeof(req.backup_id)); + cluster_backup_wire_request_compute_crc(&req); + UT_ASSERT(cluster_backup_wire_request_valid(&req)); + + req.request_id = 43; + UT_ASSERT(!cluster_backup_wire_request_valid(&req)); + req.request_id = 42; + cluster_backup_wire_request_compute_crc(&req); + req.backup_id[CLUSTER_BACKUP_ID_MAX - 1] = 'x'; + UT_ASSERT(!cluster_backup_wire_request_valid(&req)); +} + +UT_TEST(test_backup_wire_ack_fail_closed_validation) +{ + ClusterBackupWireAck ack; + + memset(&ack, 0, sizeof(ack)); + ack.magic = CLUSTER_BACKUP_IC_MAGIC; + ack.version = CLUSTER_BACKUP_IC_VERSION; + ack.op = CLUSTER_BACKUP_WIRE_OP_STOP; + ack.result = CLUSTER_BACKUP_WIRE_RESULT_OK; + ack.sender_node_id = 1; + ack.thread_id = 2; + ack.request_id = 99; + ack.stop_cut_lsn = 500; + ack.cut_scn = test_scn(30); + 
cluster_backup_wire_ack_compute_crc(&ack); + UT_ASSERT(cluster_backup_wire_ack_valid(&ack)); + + ack.stop_cut_lsn = InvalidXLogRecPtr; + cluster_backup_wire_ack_compute_crc(&ack); + UT_ASSERT(!cluster_backup_wire_ack_valid(&ack)); + + ack.result = CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR; + ack.thread_id = 0; + ack.cut_scn = InvalidScn; + cluster_backup_wire_ack_compute_crc(&ack); + UT_ASSERT(cluster_backup_wire_ack_valid(&ack)); + UT_ASSERT_STR_EQ(cluster_backup_wire_result_name(CLUSTER_BACKUP_WIRE_RESULT_EXECUTOR_ERROR), + "executor_error"); +} + +int +main(void) +{ + UT_PLAN(12); + UT_RUN(test_manifest_validates_complete_single_thread); + UT_RUN(test_manifest_rejects_missing_control_wal_undo_tt); + UT_RUN(test_manifest_rejects_bad_scn_lsn_count_and_crc); + UT_RUN(test_manifest_set_thread_is_bounds_defensive); + UT_RUN(test_restore_point_cut_requires_drain_and_fence); + UT_RUN(test_restore_point_cut_records_all_threads); + UT_RUN(test_restore_point_cut_rejects_partial_thread); + UT_RUN(test_pitr_resolves_latest_reachable_restore_point); + UT_RUN(test_pitr_fail_closed_reasons); + UT_RUN(test_restore_compatibility_rejects_mismatches); + UT_RUN(test_backup_wire_request_crc_and_bounds); + UT_RUN(test_backup_wire_ack_fail_closed_validation); + UT_DONE(); + return ut_failed_count == 0 ? 0 : 1; +} diff --git a/src/test/cluster_unit/test_cluster_errcodes.c b/src/test/cluster_unit/test_cluster_errcodes.c index f160adb1635..021eb5c5fc6 100644 --- a/src/test/cluster_unit/test_cluster_errcodes.c +++ b/src/test/cluster_unit/test_cluster_errcodes.c @@ -1,7 +1,7 @@ /*------------------------------------------------------------------------- * * test_cluster_errcodes.c - * Compile-time invariants for the 45 cluster SQLSTATE error codes + * Compile-time invariants for the cluster SQLSTATE error codes * registered in stage 0.12. 
* * All ERRCODE_CLUSTER_* macros are generated automatically by PG's @@ -14,7 +14,7 @@ * - Each ERRCODE_CLUSTER_* macro encodes the exact SQLSTATE string * via MAKE_SQLSTATE() (proves the .txt -> .h pipeline produced * correct values). - * - All 45 codes use the 'R' subclass character (pgrac namespace + * - All checked codes use the 'R' subclass character (pgrac namespace * discipline; design doc §2.3). * - The Class 58 pgrac block is dense from 58R01..58R12 (the * largest pgrac sub-class, anchors the count proof). @@ -118,7 +118,8 @@ UT_TEST(test_class_40_first_last) UT_TEST(test_class_53_first_last) { UT_ASSERT_EQ(ERRCODE_CLUSTER_LMS_QUEUE_FULL, MAKE_SQLSTATE('5', '3', 'R', '0', '1')); - UT_ASSERT_EQ(ERRCODE_CLUSTER_RECONFIG_IN_PROGRESS, MAKE_SQLSTATE('5', '3', 'R', '6', '0')); + UT_ASSERT_EQ(ERRCODE_CLUSTER_RESTORE_POINT_DRAIN_TIMEOUT, + MAKE_SQLSTATE('5', '3', 'R', 'A', 'F')); } UT_TEST(test_class_55_first_last) @@ -175,6 +176,16 @@ UT_TEST(test_class_58_complete) UT_ASSERT_EQ(ERRCODE_CLUSTER_RECOVERY_FAILED, MAKE_SQLSTATE('5', '8', 'R', '1', '2')); } +UT_TEST(test_class_53_backup_band) +{ + UT_ASSERT_EQ(ERRCODE_CLUSTER_BACKUP_IN_PROGRESS, MAKE_SQLSTATE('5', '3', 'R', 'A', 'B')); + UT_ASSERT_EQ(ERRCODE_CLUSTER_PITR_TARGET_UNREACHABLE, MAKE_SQLSTATE('5', '3', 'R', 'A', 'C')); + UT_ASSERT_EQ(ERRCODE_CLUSTER_BACKUP_INCOMPLETE, MAKE_SQLSTATE('5', '3', 'R', 'A', 'D')); + UT_ASSERT_EQ(ERRCODE_CLUSTER_RESTORE_INCOMPATIBLE, MAKE_SQLSTATE('5', '3', 'R', 'A', 'E')); + UT_ASSERT_EQ(ERRCODE_CLUSTER_RESTORE_POINT_DRAIN_TIMEOUT, + MAKE_SQLSTATE('5', '3', 'R', 'A', 'F')); +} + /* ---------- * All 45 cluster errcodes use 'R' as their subclass character @@ -190,6 +201,7 @@ UT_TEST(test_all_use_r_subclass) UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_LMS_QUEUE_FULL, 3), 'R'); UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_PCM_STATE_INVALID, 3), 'R'); UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_RECONFIG_IN_PROGRESS, 3), 'R'); + 
UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_RESTORE_POINT_DRAIN_TIMEOUT, 3), 'R'); UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_SHARED_STORAGE_FAILED, 3), 'R'); UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_SNAPSHOT_TOO_OLD, 3), 'R'); UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_ASSERTION_FAILURE, 3), 'R'); @@ -229,9 +241,9 @@ UT_TEST(test_per_class_anchors) UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_PROTOCOL_VERSION_MISMATCH, 5), '5'); /* Class 40 has 4 entries: 40R01..40R04 */ UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_PI_INVALIDATED_RETRY, 5), '4'); - /* Class 53 spans base 53R01..53R07 plus quorum/fence/reconfig ranges up to 53R60. */ - UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_RECONFIG_IN_PROGRESS, 4), '6'); - UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_RECONFIG_IN_PROGRESS, 5), '0'); + /* Class 53 spans base 53R01..53R07 plus later pgrac bands up to 53RAF. */ + UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_RESTORE_POINT_DRAIN_TIMEOUT, 4), 'A'); + UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_RESTORE_POINT_DRAIN_TIMEOUT, 5), 'F'); /* Class 55 has 6 entries: 55R01..55R06 */ UT_ASSERT_EQ(sqlstate_char(ERRCODE_CLUSTER_BLOCK_MISSING_TEMPORARY, 5), '6'); /* Class 57 keeps operator-intervention cluster codes 57R02..57R06. 
*/ @@ -246,7 +258,7 @@ UT_TEST(test_per_class_anchors) int main(void) { - UT_PLAN(12); + UT_PLAN(13); UT_RUN(test_class_08_first_last); UT_RUN(test_class_40_first_last); UT_RUN(test_class_53_first_last); @@ -256,6 +268,7 @@ main(void) UT_RUN(test_class_72_first_last); UT_RUN(test_class_xx_first_last); UT_RUN(test_class_58_complete); + UT_RUN(test_class_53_backup_band); UT_RUN(test_all_use_r_subclass); UT_RUN(test_no_overlap_with_pg_native); UT_RUN(test_per_class_anchors); diff --git a/src/test/cluster_unit/test_cluster_lmon.c b/src/test/cluster_unit/test_cluster_lmon.c index 3b36fae5b25..d33e7af507b 100644 --- a/src/test/cluster_unit/test_cluster_lmon.c +++ b/src/test/cluster_unit/test_cluster_lmon.c @@ -427,6 +427,18 @@ void cluster_node_remove_register_ic_msg_types(void) {} +/* spec-6.5 D1/D4 stubs: cluster_lmon registers and ticks the backup + * coordinator/peer ACK path, but this standalone unit binary intentionally + * does not link cluster_backup.o or backend backup symbols. */ +void cluster_backup_register_ic_msg_types(void); +void +cluster_backup_register_ic_msg_types(void) +{} +void cluster_backup_lmon_tick(void); +void +cluster_backup_lmon_tick(void) +{} + /* spec-2.2 D5 LMON drive references cluster_conf_lookup_node + cluster_node_id. */ const struct ClusterNodeInfo * cluster_conf_lookup_node(int32 node_id pg_attribute_unused()) diff --git a/src/test/cluster_unit/test_cluster_shmem.c b/src/test/cluster_unit/test_cluster_shmem.c index a927deaa75f..d375949d0a8 100644 --- a/src/test/cluster_unit/test_cluster_shmem.c +++ b/src/test/cluster_unit/test_cluster_shmem.c @@ -762,6 +762,12 @@ void cluster_node_remove_shmem_register(void) {} +/* spec-6.5 stub: cluster backup / restore / PITR shmem region. */ +void cluster_backup_shmem_register(void); +void +cluster_backup_shmem_register(void) +{} + /* spec-4.12 D7 stub: cluster write-fence token region. 
*/ void cluster_write_fence_shmem_register(void); void diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 00df5b92860..3db2e5ca7e9 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -1281,6 +1281,7 @@ drop table cchild; -- -- Check that ruleutils are working -- +-- PGRAC: expected ruleutils output includes cluster backup/PITR catalog views. -- temporarily disable fancy output, so view changes create less diff noise \a\t SELECT viewname, definition FROM pg_views @@ -1313,6 +1314,16 @@ pg_backend_memory_contexts| SELECT name, free_chunks, used_bytes FROM pg_get_backend_memory_contexts() pg_get_backend_memory_contexts(name, ident, parent, level, total_bytes, total_nblocks, free_bytes, free_chunks, used_bytes); +pg_cluster_backup_history| SELECT backup_id, + consistent_scn, + scn_durable_peak, + timeline, + catversion, + storage_id, + node_count, + thread_count, + manifest_crc + FROM cluster_get_backup_history() cluster_get_backup_history(backup_id, consistent_scn, scn_durable_peak, timeline, catversion, storage_id, node_count, thread_count, manifest_crc); pg_cluster_clean_leave_state| SELECT phase, leaving_node_id, leave_epoch, @@ -1431,6 +1442,13 @@ pg_cluster_nodes| SELECT node_id, region, is_self FROM cluster_get_nodes() cluster_get_nodes(node_id, hostname, interconnect_addr, public_addr, role, region, is_self); +pg_cluster_pitr_status| SELECT target_type, + target_action, + reachable, + reason, + resolved_scn, + restore_point_name + FROM cluster_get_pitr_status() cluster_get_pitr_status(target_type, target_action, reachable, reason, resolved_scn, restore_point_name); pg_cluster_quorum_state| SELECT in_quorum, quorum_size, disks_ok, @@ -1450,6 +1468,12 @@ pg_cluster_reconfig_state| SELECT event_id, cssd_dead_generation, reconfig_kind FROM cluster_get_reconfig_state() cluster_get_reconfig_state(event_id, coordinator_node_id, old_epoch, new_epoch, dead_bitmap, applied_at, 
observer_role, event_seq, cssd_dead_generation, reconfig_kind); +pg_cluster_restore_points| SELECT restore_point_name, + cut_scn, + thread_count, + incarnation, + created_at + FROM cluster_get_restore_points() cluster_get_restore_points(restore_point_name, cut_scn, thread_count, incarnation, created_at); pg_cluster_shmem| SELECT name, size_bytes, lwlock_count, @@ -1981,6 +2005,21 @@ pg_stat_bgwriter| SELECT pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync, pg_stat_get_buf_alloc() AS buffers_alloc, pg_stat_get_bgwriter_stat_reset_time() AS stats_reset; +pg_stat_cluster_backup| SELECT in_progress, + backup_id, + coordinator_node_id, + start_redo_lsn, + checkpoint_lsn, + stop_cut_lsn, + consistent_scn, + manifest_crc, + started_at, + stopped_at, + backup_parallel_channels, + backup_wal_retention, + restore_points_enabled, + restore_point_interval_ms + FROM cluster_get_backup_state() cluster_get_backup_state(in_progress, backup_id, coordinator_node_id, start_redo_lsn, checkpoint_lsn, stop_cut_lsn, consistent_scn, manifest_crc, started_at, stopped_at, backup_parallel_channels, backup_wal_retention, restore_points_enabled, restore_point_interval_ms); pg_stat_cluster_counters| SELECT name, value FROM cluster_get_pgstat_counters() cluster_get_pgstat_counters(name, value); diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql index 8b7e255dcd2..f013d076283 100644 --- a/src/test/regress/sql/rules.sql +++ b/src/test/regress/sql/rules.sql @@ -772,6 +772,7 @@ drop table cchild; -- Check that ruleutils are working -- +-- PGRAC: expected ruleutils output includes cluster backup/PITR catalog views. -- temporarily disable fancy output, so view changes create less diff noise \a\t