Overland Street Power Outage May 2025

Tue, 05/13/2025 - 18:30

The Overland Street Data Center suffered a power outage on the morning of Saturday, May 10. The outage affected not only primary power but also all backup systems. The data center houses our research high-performance computing clusters, all research data storage devices, and network servers, as well as a number of virtual machines, websites, and applications.

Power was restored Saturday night, and RCS and Network Engineering are continuing to work to bring the network and all servers back online. We anticipate that system access will be restored on Monday, May 12.

  • May 11 7:00PM: Full restoration of network connectivity is still pending. Several network switches and ports are not performing correctly, and unfortunately these devices affect the full availability of the HPC and data storage systems.
  • May 11 10:30PM: The majority of network issues have been resolved. RCSM remains offline while the two remaining nodes near resolution. Argos HPC restorations are in progress. Updated timelines for the resolution of both services will be posted Monday morning.
  • May 12 9:00AM: RCS and Data Center support services are onsite working to bring RCSM and the virtualization cluster back online. 
  • May 12 12:00PM: RCSM remains unavailable due to a networking issue with one node. The virtualization environment has been restored, and reconfigurations for HPC are in progress.
  • May 12 5:45PM: 
    • Aether High Performance Computing has been restored and is now available. 
    • Argos High Performance Computing has been restored and is now available. 
  • May 13 11:15AM:
    • RCSM (Mediaflux): RCS and the vendor staff have brought RCSM back online and are working to resolve external connection issues preventing user access.
  • May 13 6:30PM:
    • RCSM: Access to RCSM has been restored. The platform is now available through all connection methods.

If you encounter any further issues, please report them by submitting a ticket to RCS via the RCS Support Form.