I've seen this error come up several times since upgrading DPM and setting up the VMM integration.
Affected area: \Backup Using Child Partition Snapshot\SERVER1
Occurred since: 2/10/2013 7:06:27 PM
Description: Recovery point creation jobs for Microsoft Hyper-V \Backup Using Child Partition Snapshot\SERVER1 on HOST.DOMAIN.com have been failing. The number of failed recovery point creation jobs = 1. If the data source protected has some dependent data sources (like a SharePoint Farm), then click on the Error Details to view the list of dependent data sources for which recovery point creation failed. (ID 3114)

DPM was unable to establish a connection with the Virtual Machine Manager (VMM) server. Server name: VMMSERVER.DOMAIN.com.
Exception Message: Type: System.TimeoutException, Message: This request operation sent to net.tcp://DPMSERVER.DOMAIN.com:6070/VmmHelperService/TcpEndpoint did not receive a reply within the configured timeout (00:01:00). The time allotted to this operation may have been a portion of a longer timeout. This may be because the service is still processing the operation or because the service was unable to send a reply message. Please consider increasing the operation timeout (by casting the channel/proxy to IContextChannel and setting the OperationTimeout property) and ensure that the service is able to connect to the client. (ID 33400)

More information

Recommended action:
1) Verify that VMM Console features are installed on the DPM Server.
2) Verify that the VMM server is online. If you need to reconfigure the VMM server in DPM, use the Set-DPMGlobalProperty cmdlet as follows: Set-DPMGlobalProperty -DpmServerName <DPMServerName> -KnownVMMServers <VMMServerName>.
3) Verify that the VMM server has been configured to accept requests from this DPM server.

Create a recovery point...

Resolution: To dismiss the alert, click below Inactivate
When this occurs, all of my VM backups, across multiple hosts, seem to fail with the same error. However, later jobs for the same VMs appear to complete successfully. I just checked one of the VMs that was affected by this error over the weekend, and its latest recovery point is from earlier this morning.
The first time this happened was right after the upgrade, and I found that the DPMVMMHelperService was not running. Since then, whenever I see this error I check the service and have always found it running. I don't see anything related in the event logs on the VMM server. On the DPM server, the "VM Manager" log has several entries about a "Refresh Performance Data" job that failed to complete, but that error is logged periodically throughout the day and does not line up with the times of the backup failures.
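Since the service always appears to be running after the fact, a periodic probe of the endpoint named in the error (port 6070 on the DPM server) might help pin down when it stops answering. Below is a minimal sketch in Python, assuming Python is available on a machine that can reach the DPM server. The function names probe and poll are made up for illustration and are not part of DPM or VMM. A successful TCP connect only shows that the port is accepting connections, not that the VMM helper service would reply within its 00:01:00 timeout.

```python
import socket
import time
from datetime import datetime


def probe(host, port, timeout=5.0):
    """Try one TCP connect; return (reachable, seconds_taken)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        reachable = False
    return reachable, time.monotonic() - start


def poll(host, port, attempts, interval=60.0, timeout=5.0):
    """Probe repeatedly and return a list of timestamped log lines."""
    lines = []
    for i in range(attempts):
        ok, took = probe(host, port, timeout)
        stamp = datetime.now().isoformat(timespec="seconds")
        status = "OK" if ok else "FAIL"
        lines.append(f"{stamp} {host}:{port} {status} {took:.2f}s")
        if i < attempts - 1:
            time.sleep(interval)  # wait before the next attempt
    return lines
```

Calling poll("DPMSERVER.DOMAIN.com", 6070, attempts=60) would probe once a minute for about an hour. Comparing any FAIL timestamps against the times of the failed recovery point jobs could show whether the endpoint itself drops out or whether the timeout happens further along, between the helper service and the VMM server.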
What would cause the DPM server to run into this issue intermittently?