VPG Troubleshooting Best Practices

Modified on Wed, 10 Dec at 12:02 PM

Summary

This article provides best practices and recommended procedures for troubleshooting Virtual Protection Groups (VPGs) in Zerto. These guidelines help administrators quickly identify configuration issues, infrastructure problems, and replication bottlenecks that commonly impact VPG performance and stability.


Symptoms

Administrators may observe one or more of the following:

  • VPG in Warning, Error, or Needs Configuration state

  • Missing or delayed checkpoints

  • Frequent bitmap syncs

  • Journal overflow alerts

  • Network disconnections or VRA communication errors

  • Slow or stalled replication


Resolution

Follow the steps in the procedure below to diagnose and resolve VPG-related issues.


Procedure

1. Initial Assessment

1.1 Review VPG Status
Navigate to Zerto UI → VPGs and review the state of the affected VPG.
Look for:

  • Warnings: Often caused by configuration issues such as journal size or RPO thresholds.

  • Errors: Typically indicate replication failures, VRA disconnections, or storage issues.

  • Needs Configuration: Caused by missing or changed resources (e.g., datastore changes).

1.2 Review Alerts
Go to Monitoring → Alerts and filter by the affected VPG.
Pay attention to alerts such as:

  • Bitmap sync required

  • Journal overflow

  • Network disconnected

  • VRA communication failed

  • Checkpoint creation delayed
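
When many alerts are listed, counting them by type can show which problem dominates for a VPG. The following is a minimal sketch only: it assumes the alerts have already been exported into a list of dictionaries with hypothetical "vpg" and "description" keys, and it does not call any Zerto API. The keywords are taken from the alert list above.

```python
from collections import Counter

# Keywords from the alert list above. The alert dictionary layout
# ("vpg", "description") is a hypothetical export format, not a Zerto schema.
ALERT_KEYWORDS = (
    "bitmap sync",
    "journal overflow",
    "network disconnected",
    "vra communication failed",
    "checkpoint creation delayed",
)


def summarize_alerts(alerts, vpg_name):
    """Count the alerts of one VPG, grouped by the keyword they match."""
    counts = Counter()
    for alert in alerts:
        if alert.get("vpg") != vpg_name:
            continue
        text = alert.get("description", "").lower()
        for keyword in ALERT_KEYWORDS:
            if keyword in text:
                counts[keyword] += 1
    return counts


# Made-up sample data for illustration only.
sample_alerts = [
    {"vpg": "App-VPG", "description": "Bitmap sync required"},
    {"vpg": "App-VPG", "description": "Journal overflow on target"},
    {"vpg": "App-VPG", "description": "Bitmap sync required"},
    {"vpg": "DB-VPG", "description": "Network disconnected"},
]
summary = summarize_alerts(sample_alerts, "App-VPG")
print(summary.most_common())
```

A repeated keyword, such as several bitmap sync alerts in a short period, is usually a better starting point for the checks in the next steps than a single isolated alert.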


2. Validate Infrastructure Health

2.1 Verify VRA Health
Check both source and target VRAs. Validate:

  • Status is Connected

  • No CPU or memory saturation

  • Access to required datastores

  • Stable L2/L3 network connectivity

2.2 Confirm Host and Datastore Availability
Check for:

  • Hosts in maintenance mode

  • Datastores out of capacity

  • High storage latency

2.3 Ensure Network Stability
Verify:

  • No packet loss between VRAs

  • No recent VLAN or switching changes

  • Firewalls allow the required Zerto ports (consult the port list in the Zerto documentation for your version)
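
A quick TCP probe can help confirm whether a firewall is blocking a port. The sketch below uses only the Python standard library. The helper names are hypothetical, and the port numbers must come from the Zerto documentation for your version; none are assumed here. Note that the probe tests reachability only from the machine it runs on, which may differ from the path the VRAs use.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to True (reachable) or False (blocked or closed)."""
    return {port: port_is_reachable(host, port, timeout) for port in ports}


# Self-contained demo: listen on a free local port, then probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
print(check_ports("127.0.0.1", [demo_port]))
listener.close()
```

In practice, replace the demo with a call such as check_ports(peer_address, required_ports), where both values are supplied from your own environment and the documented port list.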


3. Review VPG Configuration

3.1 Storage Settings
Confirm:

  • Target datastore is online

  • Journal datastore has sufficient free space
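
To judge whether free space is "sufficient", a back-of-the-envelope estimate can help. The sketch below assumes that journal usage is roughly the average write rate multiplied by the journal history window, with a safety factor for bursts. This is a simplification, not Zerto's sizing method: it ignores compression, journal size limits configured on the VPG, and other internals. The function names and the default safety factor are hypothetical.

```python
def estimated_journal_gb(change_rate_mb_s, history_hours):
    """Rough journal footprint in GB: average write rate x history window."""
    seconds = history_hours * 3600
    return change_rate_mb_s * seconds / 1024  # MB -> GB


def has_headroom(free_gb, change_rate_mb_s, history_hours, safety_factor=1.5):
    """True if free space covers the estimate plus a margin for bursts."""
    needed_gb = estimated_journal_gb(change_rate_mb_s, history_hours) * safety_factor
    return free_gb >= needed_gb


# Example: 2 MB/s average change rate, 24 hours of journal history.
print(estimated_journal_gb(2, 24))   # 168.75
print(has_headroom(200, 2, 24))      # False (needs about 253 GB with margin)
print(has_headroom(300, 2, 24))      # True
```

Use the change rate observed for the protected VMs rather than a guess, and treat the result as a starting point for review, not as a guaranteed size.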


4. Review Zerto Analytics (If Available)

Use Zerto Analytics → VPG Performance to identify:

  • Change rate trends

  • RPO violations

  • Latency spikes

  • Network throughput limitations
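
If RPO samples are exported for offline review, grouping consecutive samples above the configured threshold makes sustained violations easier to spot than isolated spikes. The following is a minimal sketch, assuming the data is a list of (timestamp, RPO in seconds) pairs; this layout and the function name are hypothetical and do not reflect a Zerto Analytics export format.

```python
def rpo_violations(samples, threshold_seconds):
    """Return (start, end, peak_rpo) for each run of samples above the threshold."""
    runs = []
    current = None  # [start, end, peak] of the run in progress
    for timestamp, rpo in samples:
        if rpo > threshold_seconds:
            if current is None:
                current = [timestamp, timestamp, rpo]
            else:
                current[1] = timestamp
                current[2] = max(current[2], rpo)
        elif current is not None:
            runs.append(tuple(current))
            current = None
    if current is not None:
        runs.append(tuple(current))
    return runs


# Made-up samples: (seconds since start, RPO in seconds), threshold 300 s.
samples = [(0, 8), (60, 12), (120, 340), (180, 410), (240, 15), (300, 305)]
print(rpo_violations(samples, 300))
# [(120, 180, 410), (300, 300, 305)]
```

Comparing the start times of long runs with change rate peaks or latency spikes in the same period can indicate whether the cause is workload, storage, or network related.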


5. When to Engage Support

Contact Support if:

  • VPG errors persist after all troubleshooting steps

  • Block transmission failures repeat

  • Checkpoint generation stops

  • VRAs crash or restart repeatedly

  • A VPG will not enter/exit test or failover mode

Provide the following:

  • ZVM logs

  • Source and target VRA logs

  • Screenshots of VPG errors

  • Relevant vSphere host/datastore events


Additional Information

Proactive monitoring, correct journal sizing, stable infrastructure, and regular validation of VPG configuration significantly reduce replication issues and improve long-term VPG resilience.
