Technical Staff Blog

Author archives: John Bazik

The filesystem and all services have been restored and are back to normal.  Our efforts to repair file corruption in the file system were not entirely successful.  Following instructions from IBM, we made four attempts to execute their file repair program, and it failed or hung each time.  However, each run did make progress, and ...

A virtual machine server crashed this morning at around 5am, affecting 49 CS Department servers, including list, web, and other important services.  The technical staff spent much of today locating and fixing problems that arose from this event.  All or most services have been restored, but problem mail was interrupted mid-day today, and remains broken ...

During the final weeks of the semester, the technical staff encountered issues with routine maintenance of our GPFS filesystem.  IBM suggested that there was corruption in the filesystem, and recommended that we take it offline to repair it.  We postponed that work until after commencement, to avoid disrupting the final weeks of classes, a conference ...

Shortly after 10am this morning, most CS services, including the filesystem, webserver, and list server, were affected by the failure of several of the University's virtual machine hosts.  Services were down for about half an hour.  At this time, the cause of the failure is unknown, but we are investigating.  All services were restored ...

Many services were affected by a GPFS filesystem issue this morning.  An NFS server in our GPFS cluster failed, disrupting filesystem access generally, along with the services that rely on it.  Systems were affected from at least 10am, and possibly much earlier.  The filesystem was restored to normal operation before 2pm.  A VMware host server also ...

A number of services were intermittently unavailable today.  These include VPN, the list server and the website.  The cause of the problems is not yet known and we continue to investigate.  There remain lingering problems - intermittent authentication failures on the website in particular - and we have posted a notice on the system status page.

The Technical Staff has several new members.  Paul Vars joined tstaff back in June as Senior Hardware Technician.  Shaun Wallace became Senior Systems Programmer last month, and David Serpa takes on the role of Lead Systems Administrator today.  With David's arrival, the technical staff has a full roster for the first time since 2016 ...

On the morning of Wednesday, January 17th, CIS will change the VLAN numbers of seven of the department's virtual network segments, affecting up to 422 systems, mostly desktop machines.  This is a simple bookkeeping change, and users who are up very early that morning should notice only a brief network outage.  The change is necessary for a ...