A few days after upgrading from vCloud Director 9.1 to 9.5, I encountered a strange issue. Launching a vApp or VM, or reconfiguring existing VMs or vApps, was no longer possible. In the logs I could see the error message “Total available resources could not be determined for reservation” along with a long Java stack trace.
After testing and debugging for some time I was able to find more error messages in the logs which pointed to a database problem.
Caused by: org.postgresql.util.PSQLException: ERROR: update or delete on table “cluster_compute_resource_inv” violates foreign key constraint “fk_cc_dr_v_h_r_i2clu_com_re_in” on table “ccr_drs_vm_host_rule_inv”
There were more of these SQL error messages with different foreign key constraints, but all of the tables involved had names ending with the suffix “_inv”. So I looked at these tables in the database and did some research on them. Unfortunately, there isn’t a lot of information out there about the database structure of vCD. I only found a few blog posts (for example, this one), but nothing specific to my error.
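To get an overview of the tables in question, something like the following could be used. It is only a sketch: the database name and user (both `vcloud` here) are assumed defaults, and `list_inv_tables` is a hypothetical helper name.

```shell
# Sketch: list all tables whose name ends in "_inv".
# DB_NAME and DB_USER are assumptions; adjust them to your installation.
DB_NAME="${DB_NAME:-vcloud}"
DB_USER="${DB_USER:-vcloud}"

list_inv_tables() {
  # -A: unaligned output, -t: rows only (no header or footer)
  psql -U "$DB_USER" -d "$DB_NAME" -At -c \
    "SELECT table_name FROM information_schema.tables
     WHERE table_type = 'BASE TABLE' AND right(table_name, 4) = '_inv'
     ORDER BY table_name;"
}
```

Calling `list_inv_tables` on the database host prints one table name per line.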
What is the problem?
Apparently, vCloud Director periodically collects information about the vSphere environment via the vCenter connection configured for vCD. This information about the virtual infrastructure is stored and referenced in these tables, among others.
Since I didn’t really get any further at this point, I consulted VMware GSS, and they confirmed my observations. We suspected that something went wrong during the vCD 9.5 upgrade. Another possible explanation is that the cells weren’t stopped during a vCenter upgrade.
To be more concrete: vCloud Director “observes” the vSphere environment so that it can react to events at the vCenter, host and VM level. Under rare circumstances, the inventory information stored in the vCloud Director database can become incorrect as a result. This leads to these foreign key errors in the backend and to the error message “Total available resources could not be determined for reservation” in the frontend and logs.
How to solve this “Total available resources could not be determined for reservation” error?
And that’s the easy part. We just need to delete all inventory entries in the vCloud Director database so that vCD can gather fresh information directly from the vSphere environment.
! Warning !
Manipulating the vCloud Director database is not something you should do in production. I highly recommend consulting VMware GSS before starting, and making a backup of the database in any case!
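A backup could be taken with `pg_dump`, for example. The following is a minimal sketch and not a tested procedure: the database name, user and backup directory are assumptions, and `backup_file` and `backup_db` are hypothetical helper names. If the database runs on a separate host, `pg_dump` additionally needs `-h`.

```shell
# Sketch: dump the vCD database before touching it.
# DB_NAME, DB_USER and BACKUP_DIR are assumptions; adjust them to your installation.
DB_NAME="${DB_NAME:-vcloud}"
DB_USER="${DB_USER:-vcloud}"
BACKUP_DIR="${BACKUP_DIR:-/tmp}"

# Build a timestamped file name such as /tmp/vcloud-20190101-120000.dump
backup_file() {
  printf '%s/%s-%s.dump\n' "$BACKUP_DIR" "$DB_NAME" "$(date +%Y%m%d-%H%M%S)"
}

# Write a custom-format dump (-Fc), which can be restored with pg_restore.
backup_db() {
  local target
  target="$(backup_file)"
  pg_dump -U "$DB_USER" -Fc -f "$target" "$DB_NAME" && echo "Backup written to $target"
}
```

Running `backup_db` then writes the dump and prints its location.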
First, stop the vCD services on all cells:
service vmware-vcd stop
After that, execute the following queries against the vCloud Director database:
delete from task;
update jobs set status = 3 where status = 1;
update last_jobs set status = 3 where status = 1;
delete from busy_object;
delete from ccr_drs_host_group_host_inv;
delete from ccr_drs_host_group_inv;
delete from ccr_drs_rule_inv;
delete from ccr_drs_vm_group_inv;
delete from ccr_drs_vm_group_vm_inv;
delete from ccr_drs_vm_host_rule_inv;
delete from compute_resource_inv;
delete from custom_field_manager_inv;
delete from cluster_compute_resource_inv;
delete from datacenter_inv;
delete from datacenter_network_inv;
delete from datastore_inv;
delete from dv_portgroup_inv;
delete from dv_switch_inv;
delete from folder_inv;
delete from managed_server_inv;
delete from managed_server_datastore_inv;
delete from managed_server_network_inv;
delete from network_inv;
delete from resource_pool_inv;
delete from storage_pod_inv;
delete from task_inv;
delete from task_activity_queue;
delete from activity;
delete from activity_parameters;
delete from failed_cells;
delete from lock_handle;
delete from vm_inv;
delete from property_map;
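One possible way to run these statements is to save them to a file and execute it with `psql` in a single transaction, so that nothing is committed if one statement fails. This is a sketch under assumptions: the database name and user are assumed defaults, and `run_sql_file` and the file name in the usage note are hypothetical.

```shell
# Sketch: run the cleanup statements from a file in one transaction.
# DB_NAME and DB_USER are assumptions; adjust them to your installation.
DB_NAME="${DB_NAME:-vcloud}"
DB_USER="${DB_USER:-vcloud}"

run_sql_file() {
  local sql_file="$1"
  # ON_ERROR_STOP=1 aborts at the first failing statement;
  # --single-transaction wraps the file in BEGIN/COMMIT, so a failure commits nothing.
  psql -U "$DB_USER" -d "$DB_NAME" -v ON_ERROR_STOP=1 --single-transaction -f "$sql_file"
}
```

With the statements saved as `vcd-inventory-cleanup.sql`, the call would be `run_sql_file vcd-inventory-cleanup.sql`.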
Finally, start the vCD services again on each cell and wait until the vCenter inventory collection has run.