When an LTR backup runs for a very long time after its scheduled time, as seen below:
Start backup dated 08-26
00000000,0f9dec48,71409034,pgId=743f4339-68d7-4f17-86cb-6f16360f0ce2; backupSetId=f32b8f47-de1e-45c2-8d0a-4443f3714c09,20-08-26 20:06:44.20,I,344,BackupManager,BackupProtectionGroupImpl,Backup initiated successfully 5 volumes will be backed up,
00000000,00000000,,pgId=743f4339-68d7-4f17-86cb-6f16360f0ce2; backupSetId=f32b8f47-de1e-45c2-8d0a-4443f3714c09; volumeImageId=2020-08-26-18-00-02_90ef7202-3137-466f-a720-97d17f053fe9,20-08-26 22:27:22.96,I,85,LongRunningOperation,LogUpdatedProgress,"No progress on volume. Volume Progress 76%. VolumeImageId=2020-08-26-18-00-02_90ef7202-3137-466f-a720-97d17f053fe9, Current Progress in Bytes: 10504896512/13780779008Zeros Skipped in Bytes: 73138176"
ZVM will trigger an alert indicating that the backup job didn't complete as seen below:
Turn on alert for backup 08-26
22:32:52.91,I,92,EventPostingController,PrepareAllSystemEventsLocal,"At 8/26/2020 10:32:52 PM alarm 'The Retention process for VPG ....... is missing. Task was scheduled to run on 8/26/2020
The alert is turned off (cleared) when the next successful backup completes, as seen below:
Start next backup, dated 08-27
00000000,467736b4,29534603,pgId=743f4339-68d7-4f17-86cb-6f16360f0ce2; backupSetId=ff626a45-db1b-45d1-a8e6-9cde8bad45e7,20-08-27 14:01:05.32,I,58,BackupManager,BackupProtectionGroupImpl,Backup initiated successfully 5 volumes will be backed up,
End backup dated 08-27
00000000,00000000,,pgId=743f4339-68d7-4f17-86cb-6f16360f0ce2; backupSetId=ff626a45-db1b-45d1-a8e6-9cde8bad45e7,20-08-28 02:45:48.20,I,20,BackupManager,EndBackupCommandTask,"Backup Summary: backupSetId ff626a45-db1b-45d1-a8e6-9cde8bad45e7, command task completion code: Success, description: Retention process successful. Policy: Daily, Weekly. . 25.7 GB data was processed out of total VPG size: 1,590.0 GB. Duration: 12:44:42.",
Turn off alert for backup 08-26
02:45:55.05,I,195,EventPostingController,PrepareAllSystemEventsLocal,"At 8/28/2020 2:45:55 AM alarm 'The Retention process for VPG ..... is missing. Task was scheduled to run on 8/26/2020 (per xxx).' was turned off, and lasted for 1693.03594726 minutes.
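The reported alert duration can be cross-checked against the turn-on and turn-off timestamps in the alarm messages. The following is a minimal sketch, not a Zerto tool: the function name and the timestamp format string are assumptions based only on how the timestamps appear in the log lines above.

```python
from datetime import datetime

# Assumed format of the timestamps inside the alarm text, e.g. "8/26/2020 10:32:52 PM".
ALERT_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def alert_duration_minutes(turned_on: str, turned_off: str) -> float:
    """Return the number of minutes between the alarm turn-on and turn-off times."""
    on = datetime.strptime(turned_on, ALERT_FORMAT)
    off = datetime.strptime(turned_off, ALERT_FORMAT)
    return (off - on).total_seconds() / 60.0


# Timestamps copied from the "turned on" and "turned off" alarm messages.
duration = alert_duration_minutes("8/26/2020 10:32:52 PM", "8/28/2020 2:45:55 AM")
print(f"Alert lasted about {duration:.2f} minutes")
```

Because the alarm text only carries whole seconds, the result agrees with the logged value of 1693.03594726 minutes only to within a fraction of a minute.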
The customer will receive the following alert about an incomplete retention set:
At 8/26/2020 10:32:52 PM alarm 'The Retention process for VPG ....... is missing. Task was scheduled to run on 8/26/2020 (per xxx).' was turned on.
A rare race condition exists between updating the BackupProcessRunner state and importing the backup into the catalog: if the alert check runs between these two steps, it concludes that the backup has already been fully handled, but it does not find the backup in the ZVM DB (catalog).
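The timing window described above can be illustrated with a deliberately simplified sketch. Everything in it is hypothetical: the function name, the "finished" state value, the set used as a catalog, and the backup set ID are stand-ins for illustration, since the actual ZVM check logic is not described here.

```python
def missing_retention_alert(runner_state: str, backup_set_id: str, catalog: set) -> bool:
    """Hypothetical alert check: the backup looks finished but is not in the catalog."""
    return runner_state == "finished" and backup_set_id not in catalog


catalog = set()                            # stand-in for the ZVM DB (catalog)
backup_set_id = "example-backup-set"       # hypothetical ID

# Step 1: the runner state is updated first.
runner_state = "finished"

# An alert check that happens to run in this window sees a finished backup
# that has not been imported yet, so it reports the retention set as missing.
alert_in_window = missing_retention_alert(runner_state, backup_set_id, catalog)

# Step 2: the backup is imported into the catalog.
catalog.add(backup_set_id)

# The same check after the import no longer reports a problem.
alert_after_import = missing_retention_alert(runner_state, backup_set_id, catalog)

print(alert_in_window, alert_after_import)
```

The sketch only shows why the order of the two steps matters; it does not represent how ZVM schedules or implements the check.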
This issue will be fixed in ZVM 9.0. In most instances, a complete backup will occur after the alert about a missing retention set for a VPG is received. In a ZVM 8.5 environment, a successful backup can be validated by checking "Manage Retention Sets" in the ZVM GUI for the latest Retention Sets for the VPG.
See the ZVM 8.5 Administration Guide for more details about "Manage Retention Sets".
For backup validation in ZVM 8.0 or lower, perform a dry-run restore as follows:
1) In the Zerto User Interface select Restore > VM/VPG.
2) The Restore wizard is displayed.
3) Select the VPG for which the missing-retention-set alert was sent.
4) Verify that the "Point In Time" timestamp of the latest available Retention Set is after the alert timestamp for that VPG. If it is, a successful backup took place after the alert message.
5) If the Retention Set timestamp is before the alert message timestamp, contact Zerto Support and open a case for further investigation of the missing backup.
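The comparison in steps 4 and 5 comes down to checking which timestamp is later. A minimal sketch follows, assuming both timestamps have been read manually from the Restore wizard and from the alert message. The function name and the "Point In Time" value are hypothetical, and the sketch does not query ZVM.

```python
from datetime import datetime


def backup_completed_after_alert(latest_point_in_time: datetime, alert_time: datetime) -> bool:
    """Return True if the latest Retention Set is newer than the alert."""
    return latest_point_in_time > alert_time


# Alert time taken from the "turned on" alarm message.
alert_time = datetime(2020, 8, 26, 22, 32, 52)

# Hypothetical "Point In Time" value read from the Restore wizard.
latest_point_in_time = datetime(2020, 8, 27, 14, 1, 5)

if backup_completed_after_alert(latest_point_in_time, alert_time):
    print("A successful backup took place after the alert.")
else:
    print("No newer Retention Set found: open a case with Zerto Support.")
```

Both values should be compared in the same time zone, because the wizard and the alert may not display times identically.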