(10:35) Around 9:15am this morning, a network card failed in the CIS switch that connects our primary firewall's external network interface. This necessitated someone going onsite to manually force our firewalls to fail over. Service was restored around 10:30am. We are in the process of checking various services to make sure that everything has recovered properly. In the meantime, please email firstname.lastname@example.org if you notice any issues that have not been resolved.
UPDATE (11:01) The faulty hardware is scheduled to be replaced tomorrow morning @ 7am. This change should not affect the department as it will be associated with the non-active firewall.
UPDATE (11:12) We just cleared a backlog of email heading to our list server.
UPDATE (11:46) There was a miscommunication between us and CIS. Apparently the failed hardware affects more than just one network interface; it is an entire blade in one of the chassis. Given this, CIS will be replacing the hardware within the next hour. This should not cause any issues, although it does affect the network interfaces on a number of our GPFS nodes, so there might be a blip as GPFS recovers.
UPDATE (12:28) Bad news. The replacement hardware CIS thought they had on hand is also bad. They are going to open a case with Cisco to get a replacement, which should happen some time tomorrow. In the meantime, we will be missing five storage nodes from our GPFS file system, so performance will no doubt be affected somewhat. We will update this post when we have more details.
UPDATE (12:52) It seems that printing was somehow caught up in this outage as well. We have restarted the print services and they seem to be online again.
UPDATE (13:22) At this point, we've gone through all the service alerts resulting from this morning's failure. We are still running in a degraded mode, so file system performance will be diminished, but we believe everything else is operational. If you notice any continuing issues, please email email@example.com.
UPDATE (14:52) It looks like the replacement hardware won't arrive until Tuesday, meaning it might be Wednesday before it can be installed. CIS has kindly offered to pull a working module from somewhere else and swap it in tomorrow @ 6am. Unless something changes, this will be the plan, so expect degraded file system performance until early tomorrow morning.