Maintenance

isginf needs to perform maintenance on servers for various reasons, sometimes even urgently:

  • operating system updates
  • applying security fixes
  • changes to server room or virtualization infrastructure

We try to do maintenance only when necessary and keep downtime to a minimum.

The Standards for Responsibilities and System Maintenance requires us to perform updates on a regular basis and to apply security fixes almost immediately.

Server Owner

Each server has an owner who is the primary contact for isginf for maintenance and other issues. If the owner is unavailable, we fall back to the ITC of the group.

Policies

For each server we negotiate a policy for downtime and maintenance. The two general policies are outlined below.

Anytime

This is the preferred policy. isginf simply performs the necessary maintenance and power-cycles the system whenever needed.

At the server owner's request, an automated mail notification can be sent to the owner and additional addresses on start-up.

Coordination with Owner Required

For systems where there is no other way, we coordinate the maintenance tasks with the server owner or ITC. Reasons why this may be the only option include:

  • The server provides a critical service and unannounced downtime is unacceptable.
  • Updates are likely to break something and require staging and testing.
  • The server runs long-running jobs that cannot be interrupted.

We try to avoid this option as it is work-intensive and takes too long in the case of security fixes.

Page URL: https://isg.inf.ethz.ch/bin/view/Main/ServicesServersMaintenance
2024-12-21
© 2024 Eidgenössische Technische Hochschule Zürich