The Overland Street Data Center suffered a power outage on the morning of Saturday, May 10. The outage affected not only primary power but also all backup systems. The Overland Street Data Center houses our research high-performance computing clusters, all research data storage devices, and network servers, as well as a number of virtual machines, websites, and applications.
Power was restored Saturday night, and RCS and Network Engineering are continuing to work to bring the network and all servers back online. We anticipate that system access will be restored on Monday, May 12.
- May 11 7:00PM: Full restoration of network connectivity is still pending. Several network switches and ports are not performing, and unfortunately these devices affect the full availability of the HPC and data storage systems.
- May 11 10:30PM: The majority of network issues have been resolved. RCSM remains offline while the 2 remaining nodes near resolution. Argos HPC restorations are in progress. Updated timelines for the resolution of both services will be posted Monday morning.
- May 12 9:00AM: RCS and Data Center support services are onsite working to bring RCSM and the virtualization cluster back online.
- May 12 12:00PM: RCSM remains unavailable due to a networking issue with one node. The virtualization environment has been restored, and reconfigurations for HPC are in progress.
Further updates will be posted as they become available.