Technical Staff Blog

Category archives: Service Outages

Tstaff announcement about complete service outages

Last update on .

Unscheduled Sunlab outage - resolved

This morning, September 28th, electricians came to replace some of the lights in the Sunlab. In the process, for reasons that remain unclear to us, they threw some breakers, which cut power to every computer in the room.

We've brought all the computers back online. However, anyone logged in at the ...

Last update on .

We are aware of ongoing issues with the CS department website. The problem began shortly after midnight when both our main web servers got into an irregular state and had to be rebooted. Now parts of the website are back, but the home page itself (https://cs.brown.edu/) refuses to load properly. Possibly other ...

Last update on .

Two GPFS NFS servers failed simultaneously at around 1pm today, causing file services to be unavailable for about 40 minutes.  The servers, crows and runts, serve different cluster groups; crows serves the grid and runts serves the internal department network.  Runts' failure was a kernel lock-up which prevented the normal automatic failover behavior.  The technical ...

Last update on .

There was a broken water pipe on the 3rd floor near the network closet.  One of the network switches, cit-cs-as.net.brown, was affected and is offline.  The following CIT rooms are affected by this switch outage: 115, 121, 132, and 134.

We are investigating further and will update this post as we learn more.

Update ...

Last update on .

There was a short network outage at 5pm today, affecting some of our user-managed machines in Room 310 and the department's Print Host.  While we were disabling a set of ports on the switch, a port we were not supposed to have access to was also disabled, which severed that switch's communications with the ...

Last update on .

At around 11:30am today, our local DNS database became corrupted, causing domain name resolution to fail for local hosts.  This caused a series of cascading failures which rendered most local services unusable.  We restored the database and restarted our name service shortly after noon, and all services should be back to normal.  We are ...

Last update on .

The CS Department mailing list server was down for much of today after an update caused a local misconfiguration.  An update to our Sympa software, which was automatically installed overnight, broke a local configuration script and caused the service to be unresponsive from around 12:30am this morning until about 7pm this evening.  No messages ...

Last update on .

The system database was down this morning between 12:25am and 1:40am.  The pgpool proxy server failed on host mallo, which is the backup server but was acting at the time as the primary.  Failing over to the usual primary server, carvel, restored services.  During this period services which rely upon the database ...