How to Enable Replication for PostgreSQL on Ubuntu

data replication, postgres replication, postgresql replication

PostgreSQL replication is a method of copying and maintaining database objects, such as schemas and tables, in multiple database systems distributed across different locations. This article will guide you through the steps to set up and enable data replication for PostgreSQL on Ubuntu.

Types of PostgreSQL Data Replication

PostgreSQL data replication refers to the process of copying data from one PostgreSQL database to another, and keeping the data in these databases synchronized. This practice is often used in database management to enhance data security, increase availability, and distribute the workload across multiple systems.

At a basic level, data replication is like making a copy of a book, and then ensuring that any changes to the original book are also made to the copy. In the context of databases, replication involves copying the entire database (full replication) or parts of the database (partial replication) to another system. Any changes made to the data in the original database are then mirrored in the replicated database.

There are two main forms of PostgreSQL replication: physical replication and logical replication.

Streaming Replication

Physical replication, often referred to as streaming replication, duplicates the exact physical files of the database from one server to another. It’s a binary copy of the database cluster’s files, providing a hot standby system. Physical replication is straightforward and efficient, as it duplicates data exactly as is, but it lacks flexibility because you can’t replicate a subset of the database or transform data during the replication process.

See also  Postgres GROUP BY Month, Year, Day by Example

Logical Replication

Logical replication, on the other hand, is more flexible. It allows you to replicate specific tables and even transform data while replicating. This is achieved by sending the changes of the data (the write-ahead log or WAL) in a logical format that includes the data and the changes, rather than in a binary format.

PostgreSQL data replication is an essential aspect of maintaining and managing databases. It helps in disaster recovery, load balancing, and it can also be a strategic part of migrating data across different systems. Understanding how it works is crucial for anyone managing PostgreSQL databases, whether using physical or logical replication.

How To Set Up PostgreSQL Replication

Prerequisites

  • Two Ubuntu 20.04 servers: A primary server and a standby server.
  • PostgreSQL installed on both servers.
  • Sudo or root access to both servers.

Step 1: Configuring the Primary Server

1.1 Editing the PostgreSQL Configuration File

Open the PostgreSQL configuration file located at /etc/postgresql/{version}/main/postgresql.conf in a text editor:

sudo nano /etc/postgresql/{version}/main/postgresql.conf

Replace {version} with your PostgreSQL version.

Modify the following parameters:

  • wal_level – set this to replica.
  • max_wal_senders – set this to the number of standby servers that you have.
  • wal_keep_segments – set this to a higher value like 64. This represents the number of past log file segments kept in the pg_xlog directory.
wal_level = replica
max_wal_senders = 3
wal_keep_segments = 64

1.2 Editing the pg_hba.conf File

In the same directory, edit the pg_hba.conf file:

sudo nano /etc/postgresql/{version}/main/pg_hba.conf

Add the following line at the end of the file. Replace {standby-ip-address} with the IP address of the standby server:

host replication all {standby-ip-address}/32 md5

1.3 Restart PostgreSQL

After editing the configuration files, restart the PostgreSQL service:

sudo systemctl restart postgresql

Step 2: Configuring the Standby Server

2.1 Stop the PostgreSQL Service

Before we can configure the standby server, we must stop the PostgreSQL service:

sudo systemctl stop postgresql

2.2 Copying Data from the Primary Server

We will use the pg_basebackup tool to copy data from the primary server. This tool makes a binary copy of the database files while the system is running.

sudo -u postgres pg_basebackup -h {primary-ip-address} -D /var/lib/postgresql/{version}/main -U replication -v -P --wal-method=stream

Replace {primary-ip-address} with the IP address of the primary server and {version} with your PostgreSQL version.

See also  How to Enable PostgreSQL Performance Logging

2.3 Creating the Recovery Configuration File

The recovery configuration file tells the standby server to run in standby mode and fetch missing WAL files from the primary server.

Create a file named standby.signal in the PostgreSQL data directory:

sudo touch /var/lib/postgresql/{version}/main/standby.signal

Then, create a postgresql.auto.conf file:

sudo nano /var/lib/postgresql/{version}/main/postgresql.auto.conf

Add the following lines:

primary_conninfo = 'host={primary-ip-address} user=replication password={your-password} application_name={standby-hostname}'

Replace {primary-ip-address} with the IP address of the primary server, {your-password} with the password you want to use, and {standby-hostname} with the hostname of the standby server.

2.4 Start the PostgreSQL Service

Finally, start the PostgreSQL service:

sudo systemctl start postgresql

For subsequent configuration updates to configuration, you can reload without restarting PostgreSQL service.

Leave a Comment