Hi. We have a 2012 R2 Hyper-V cluster that is backed up by DPM 2012 R2 UR6. We originally set up the protection groups for host-level backups (using the Hyper-V snapshots, VSS null provider) and that worked fine.
Due to problems with the cluster nodes, we were forced to reinstall them. After reinstalling, we reinstalled the DPM agent, updated it, configured it to connect to the right DPM server (setdpmserver.exe) and reattached it in DPM.
Since then, host-level protection has been failing for most VMs; it only occasionally works for some of them. We've tried removing all the protection groups, removing the nodes (agents) from DPM, adding them back and reconfiguring the protection, but it still fails for most VMs (I'd say roughly 1 in 10 works).
The error is always the same:
Affected area: \Online\VM01
Occurred since: 10. 10. 2015 19:52:59
Description: The replica of Microsoft Hyper-V \Online\VM01 on VM01 Resources.hvclus.domain.local is inconsistent with the protected data source. All protection activities for data source will fail until the replica is synchronized with consistency check. You can recover data from existing recovery points, but new recovery points cannot be created until the replica is consistent.
For SharePoint farm, recovery points will continue getting created with the databases that are consistent. To backup inconsistent databases, run a consistency check on the farm. (ID 3106)
An unexpected error occurred while the job was running. (ID 104 Details: The RPC server is unavailable (0x800706BA))
Recommended action: Retry the operation.
Synchronize with consistency check.
I've observed that when I run the consistency check (or configure new protection), the VHDX snapshots are created (including the _autorecovery one) and DPM appears to run the backup for about 2 minutes, but then it fails with the above error and 0 MB transferred. The .avhdx files are removed automatically.
Connectivity with the hosts (nodes) is fine. I've also tried moving the VMs to different cluster nodes and changing CSV ownership to different nodes, but nothing helps.
There's little information in the nodes' event logs:
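Since the alert reports "The RPC server is unavailable (0x800706BA)", a basic TCP reachability test from the DPM server to each node is one way to make "connectivity is fine" concrete. The following is only a minimal sketch: the node names are hypothetical placeholders, and port 135 (the RPC endpoint mapper) is the only port checked. A successful connection there does not prove that the dynamic RPC ports negotiated afterwards are reachable, so a pass narrows the problem down rather than ruling out the network.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Hypothetical node names; replace with the real cluster nodes.
NODES = ["hvnode1.domain.local", "hvnode2.domain.local"]

# 135 is the RPC endpoint mapper. Any further ports are environment specific
# and would have to be added here as an assumption about the setup.
PORTS = [135]

if __name__ == "__main__":
    for node in NODES:
        for port in PORTS:
            state = "open" if tcp_port_open(node, port) else "unreachable"
            print(f"{node}:{port} {state}")
```

Run from the DPM server, this prints one line per node and port. Running the same check in the other direction (from a node towards the DPM server) would be needed as well, because the agent and the server talk to each other in both directions.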
- Event ID 10170, source Hyper-V-VMMS: Requester reported unsuccessful backup for the virtual machine 'VM01'. (Virtual machine ID xxx)
- followed by event ID 16010, source Hyper-V-VMMS: The operation failed.
- followed by the two events for the disk merge (start and finish)
Can anyone help?