Lag is a slowdown in data replication between the active repository and the standby repository. If the lag exceeds the user-specified lag threshold, TVI alert 4604002 is triggered.
The sync.data.lagging.threshold property in the sync.properties file specifies the lag threshold.
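As a minimal sketch of how such a threshold might be read and applied, the Python below parses key=value lines from properties-style text and compares a measured lag with the configured value. The sample value 100, its unit, and the helper names load_properties and exceeds_threshold are hypothetical; the default and unit of sync.data.lagging.threshold are not stated here.

```python
def load_properties(text):
    """Parse Java-style key=value lines, skipping blank lines and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def exceeds_threshold(lag, props):
    """Return True when the measured lag is above the configured threshold."""
    threshold = int(props["sync.data.lagging.threshold"])
    return lag > threshold


# Hypothetical file content: the real default value and unit are assumptions.
SAMPLE = """
# sync.properties (illustrative)
sync.data.lagging.threshold=100
"""

props = load_properties(SAMPLE)
print(exceeds_threshold(150, props))  # lag above the sample threshold
print(exceeds_threshold(50, props))   # lag below the sample threshold
```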
Postgres records transactions using write-ahead logging (WAL). Replication sends these WAL records from the active repository to the standby repository.
The sync.log reports the following when the synchronization service is slow:
- sending_lag: WALs that have been generated but not yet sent to the standby servers.
- receiving_lag: WALs that have been sent over the network but not yet written on the standby.
- write_lag: WALs that have been written but not yet moved to permanent storage. If Postgres crashes, these changes are lost.
- replaying_lag: WALs that have been moved to permanent storage but not yet replayed.
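The four stages above resemble the WAL positions that Postgres 10 and later expose through pg_current_wal_lsn() on the primary and the sent_lsn, write_lsn, flush_lsn, and replay_lsn columns of the pg_stat_replication view. The following sketch, which assumes that correspondence, converts LSN strings to byte positions and takes the difference between consecutive stages. How the synchronization service actually computes the sync.log values, and in what unit, is not stated here; the function names and sample LSNs are illustrative only.

```python
def lsn_to_int(lsn):
    """Convert a Postgres LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_stages(current, sent, written, flushed, replayed):
    """Return the byte gap between each pair of consecutive WAL positions.

    The mapping of gaps to the sync.log names is an assumption.
    """
    positions = [lsn_to_int(x) for x in (current, sent, written, flushed, replayed)]
    names = ("sending_lag", "receiving_lag", "write_lag", "replaying_lag")
    return {name: positions[i] - positions[i + 1] for i, name in enumerate(names)}


# Made-up LSN values, ordered from the primary's current position
# down to the position the standby has replayed.
stages = lag_stages("0/5000", "0/4000", "0/3000", "0/2000", "0/1000")
for name, gap in stages.items():
    print(name, gap)
```

A large gap at one stage with small gaps elsewhere indicates where replication is held up.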
The following are possible causes of the slowdown, based on which value sync.log reports:
- sending_lag:
  - Active repository performance issue (such as a heavy load)
  - Low network throughput between the active and standby repositories
  - Standby server offline for a long period before it was started
- receiving_lag:
  - Low network throughput between the active and standby repositories
  - Standby repository performance issue (such as a heavy load)
- write_lag and replaying_lag:
  - Standby performance issues, such as over-utilized storage, a stuck recovery process, or a heavy load on the standby
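The mapping above can be kept as a small lookup for triage. The sketch below picks the stage with the largest reported lag and returns the possible causes listed for it. The names POSSIBLE_CAUSES and triage, the sample numbers, and the idea of comparing the four values directly are assumptions for illustration, not part of the synchronization service.

```python
STANDBY_STORAGE = [
    "Standby performance issues (over-utilized storage, "
    "stuck recovery process, or heavy load on the standby)",
]

POSSIBLE_CAUSES = {
    "sending_lag": [
        "Active repository performance issue (such as a heavy load)",
        "Low network throughput between the active and standby repositories",
        "Standby server offline for a long period before it was started",
    ],
    "receiving_lag": [
        "Low network throughput between the active and standby repositories",
        "Standby repository performance issue (such as a heavy load)",
    ],
    "write_lag": STANDBY_STORAGE,
    "replaying_lag": STANDBY_STORAGE,
}


def triage(lags):
    """Return the stage with the largest lag and its possible causes."""
    worst = max(lags, key=lags.get)
    return worst, POSSIBLE_CAUSES[worst]


# Made-up lag values for illustration.
stage, causes = triage(
    {"sending_lag": 3, "receiving_lag": 40, "write_lag": 0, "replaying_lag": 1}
)
print(stage)
for cause in causes:
    print(" -", cause)
```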