VPG Troubleshooting Best Practices

Modified on Wed, 10 Dec, 2025 at 12:02 PM

Summary

This article provides best practices and recommended procedures for troubleshooting Virtual Protection Groups (VPGs) in Zerto. These guidelines help administrators quickly identify configuration issues, infrastructure problems, and replication bottlenecks that commonly impact VPG performance and stability.

Symptoms

Administrators may observe one or more of the following:

VPG in Warning, Error, or Needs Configuration state
Missing or delayed checkpoints
Frequent bitmap syncs
Journal overflow alerts
Network disconnections or VRA communication errors
Slow or stalled replication

Resolution

Follow the steps in the procedure below to diagnose and resolve VPG-related issues.

Procedure

1. Initial Assessment

1.1 Review VPG Status
Navigate to Zerto UI → VPGs and review the state of the affected VPG.
Look for:

Warnings: Often caused by configuration issues such as journal size or RPO thresholds.
Errors: Typically indicate replication failures, VRA disconnections, or storage issues.
Needs Configuration: Caused by missing or changed resources (e.g., datastore changes).

1.2 Review Alerts
Go to Monitoring → Alerts and filter for the VPG.
Pay attention to alerts such as:

Bitmap sync required
Journal overflow
Network disconnected
VRA communication failed
Checkpoint creation delayed

2. Validate Infrastructure Health

2.1 Verify VRA Health
Check both source and target VRAs. Validate:

Status is Connected
No CPU or memory saturation
Access to required datastores
Stable L2/L3 network connectivity

2.2 Confirm Host and Datastore Availability
Check for:

Hosts in maintenance mode
Datastores that are full or nearly out of capacity
High storage latency
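
The capacity check above can be sketched as a small script. This is an illustrative example only: the Datastore record, the low_space_datastores helper, and the 15% free-space threshold are all hypothetical and not part of Zerto or vSphere. The capacity figures would need to come from your own inventory tooling.

```python
from dataclasses import dataclass


@dataclass
class Datastore:
    """Hypothetical record; populate it from your own inventory export."""
    name: str
    capacity_gb: float
    free_gb: float


def low_space_datastores(datastores, min_free_pct=15.0):
    """Return (name, free_pct) for datastores below min_free_pct free space.

    The 15% default is an assumed example threshold, not a vendor value.
    """
    flagged = []
    for ds in datastores:
        if ds.capacity_gb <= 0:
            continue  # skip records without a usable capacity figure
        free_pct = 100.0 * ds.free_gb / ds.capacity_gb
        if free_pct < min_free_pct:
            flagged.append((ds.name, round(free_pct, 1)))
    return flagged
```

For example, passing a datastore with 1000 GB capacity and 100 GB free returns it with a free percentage of 10.0, while one with 500 GB free is not flagged.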

2.3 Ensure Network Stability
Verify:

No packet loss between VRAs
No recent VLAN or switching changes
Firewalls allow the required ports (see the Open Firewall Ports article)
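
As a quick way to confirm that a firewall is not blocking a required port, a TCP connection test can be run from a machine on the source network toward the target side. The sketch below uses only the Python standard library. The function names are illustrative, and the host and port list are placeholders that must be taken from the firewall ports documentation for your Zerto version. A successful TCP connect shows only that the port is reachable; it does not measure packet loss.

```python
import socket


def check_tcp_port(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Return a dict mapping each port to its reachability result."""
    return {port: check_tcp_port(host, port, timeout) for port in ports}
```

Calling check_ports with the peer's address and the documented port list returns a dictionary such as {port: True, ...}; any False entry is a candidate for a firewall or routing review.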

3. Review VPG Configuration

3.1 Storage Settings
Confirm:

Target datastore is online
Journal datastore has sufficient free space
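
To judge whether journal free space is "sufficient", a rough back-of-the-envelope estimate can help. The sketch below assumes that journal growth is approximately the average change rate multiplied by the journal history window, plus a safety margin. This is a simplification and not Zerto's sizing method: it ignores compression, peaks in change rate, and any product-specific overhead. The function names and the 1.3 headroom factor are hypothetical.

```python
def estimate_journal_gb(change_rate_mb_per_s, history_hours, headroom=1.3):
    """Rough journal size estimate in GB.

    Assumes size = average change rate x history window x headroom.
    The headroom factor of 1.3 is an arbitrary example margin.
    """
    total_mb = change_rate_mb_per_s * history_hours * 3600
    return total_mb / 1024 * headroom


def journal_fits(free_gb, change_rate_mb_per_s, history_hours, headroom=1.3):
    """Return True if the estimated journal size fits in free_gb."""
    needed = estimate_journal_gb(change_rate_mb_per_s, history_hours, headroom)
    return needed <= free_gb
```

For instance, a sustained change rate of 1 MB/s over a 24-hour history with no headroom works out to 86,400 MB, or about 84.4 GB. Treat the result as a sanity check only and compare it against the journal usage actually reported for the VPG.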

4. Review Zerto Analytics (If Available)

Use Zerto Analytics → VPG Performance to identify:

Change rate trends
RPO violations
Latency spikes
Network throughput limitations
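
If RPO samples are exported for offline review, contiguous violation windows can be extracted with a short script. The sketch below is illustrative and does not use any Zerto API: it assumes a hypothetical list of (timestamp, rpo_seconds) pairs sorted by time, and the function name and threshold are placeholders.

```python
def rpo_violation_windows(samples, rpo_threshold_s):
    """Return (start, end, peak_rpo) for each contiguous run above the threshold.

    samples: list of (timestamp, rpo_seconds) pairs sorted by timestamp.
    """
    windows = []
    start = end = None
    peak = 0
    for ts, rpo in samples:
        if rpo > rpo_threshold_s:
            if start is None:
                start, peak = ts, rpo  # a new violation window begins
            else:
                peak = max(peak, rpo)
            end = ts
        elif start is not None:
            windows.append((start, end, peak))  # window closed by a healthy sample
            start = None
    if start is not None:
        windows.append((start, end, peak))  # window still open at end of data
    return windows
```

Correlating the start of each window with change rate, latency, and throughput trends from the same period helps narrow down whether the cause is workload, storage, or network.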

5. When to Engage Support

Contact Support if:

VPG errors persist after completing all of the troubleshooting steps above
Block transmission failures repeat
Checkpoint generation stops
VRAs crash or restart repeatedly
A VPG cannot enter or exit test or failover mode

Provide the following:

ZVM logs
Source and target VRA logs
Screenshots of VPG errors
Relevant vSphere host/datastore events

Additional Information

Proactive monitoring, correct journal sizing, stable infrastructure, and regular validation of VPG configuration significantly reduce replication issues and improve long-term VPG resilience.