Technical Staff Blog

Author archives: John Bazik



All CS Department shared filesystems have now been migrated to CIS's Dell/EMC Isilon file servers.  The final three shares, migrated today, were /data, /research and the CS website.  The old shares remain available, read-only, under the root path /gpfs-old for those who need them.  Eventually the old file servers will be taken ...


The /course fileshare, which is home to essential data and infrastructure for many CS courses, was relocated today from the CS department's GPFS filesystem to CIS's Isilon filesystem.  The migration was implemented by changing the remote filesystem mounts on all end-user systems, and by making the old fileshare read-only for those who ...


The filesystem and all services have been restored and are back to normal.  Our efforts to repair file corruption in the filesystem were not entirely successful.  Following instructions from IBM, we made four attempts to run their file repair program, and it failed or hung each time.  However, each run did make progress, and ...


A virtual machine server crashed this morning around 5am, affecting 49 CS Department servers, including list, web, and other important services.  The technical staff spent much of today locating and fixing problems that arose from this event.  All or most services have been restored, but mail was interrupted mid-day today, and remains broken ...


During the final weeks of the semester, the technical staff encountered issues with routine maintenance of our GPFS filesystem.  IBM suggested that there was corruption in the filesystem, and recommended that we take it offline to repair it.  We postponed that work until after commencement, to avoid disrupting the final weeks of classes, a conference ...


Shortly after 10am this morning, most CS services, including the filesystem, webserver, and list server, were affected by the failure of several of the University's virtual machine hosts.  Services were down for about half an hour.  At this time, the cause of the failure is unknown, but we are investigating.  All services were restored ...


Many services were affected by a GPFS filesystem issue this morning.  An NFS server in our GPFS cluster failed, affecting filesystem access generally, along with the services that rely on it.  Systems were affected by 10am at the latest, and possibly much earlier.  The filesystem was restored to normal operation before 2pm.  A VMware host server also ...


A number of services were intermittently unavailable today, including VPN, the list server and the website.  The cause of the problems is not yet known and we continue to investigate.  Some problems linger, in particular intermittent authentication failures on the website, and we have posted a notice on the system status page.