Geo-Replication in Apache Pulsar, Part 1: Concepts and Features

Exadra37 · 7 July 2021 21:59

https://www.splunk.com/en_us/blog/it/geo-replication-in-apache-pulsar-part-1-concepts-and-features.html

Geo-replication is a typical mechanism used to provide disaster recovery. A lot of data systems claim to support geo-replication. However, these systems generally only replicate to 2 data centers and have severe limitations when replicating to more than two.

The geo-replication mechanisms used in different data systems can be put into two categories, synchronous geo-replication and asynchronous geo-replication. Apache Pulsar supports both geo-replication strategies.

In this example, assume there are 3 data centers: us-west, us-central and us-east, while a client is issuing a write request to us-central. In the synchronous geo-replication case, when the client issues a write request to us-central, the data written to us-central will be replicated to the other two data centers, both us-west and us-east.
The write request is typically only acknowledged to the client when the majority of the data centers have issued a confirmation that the write has been persisted. In this case, at least 2 data centers have to confirm that this write request has been persisted. This mechanism is called “sync geo-replication” because the data is synchronously replicated to multiple data centers and the client has to wait for an acknowledgement from the other data centers.

In contrast, with asynchronous geo-replication the client doesn’t have to wait for a response from the other data centers. The client receives a response immediately after us-central successfully persists the data. The data is then replicated from us-central to the other two data centers, us-west and us-east, in an asynchronous fashion.