Dear SWITCHengines user
tl;dr: We are upgrading SWITCHengines on 17 August (Lausanne) and 24 August (Zurich). Your running virtual machines will continue to run, but they may briefly lose network connectivity. No action is required on your side.
As we move towards making SWITCHengines a complete SWITCH product, we are constantly working on the infrastructure and the software. As some of you know, SWITCHengines is built on the open source “OpenStack” software, which is continuously developed and updated. We follow its development closely and are now ready to upgrade from the “Icehouse” release, which we currently run, to the “Juno” release.
There are numerous improvements in this release - most of them are under the hood and you won’t see any difference. They will allow us to improve the quality and performance of the service and your overall experience as a user.
SWITCHengines runs from two regions (datacenters) - one in Lausanne, one in Zurich. We will upgrade them at different times.
The Lausanne region will be upgraded on Monday, 17 August 2015, and the Zurich region one week later, on Monday, 24 August 2015.
During the upgrade, the API and the user interface at https://engines.switch.ch will be unavailable for some time (that means that you will not be able to start/stop virtual machines). The virtual machines will continue to run without interruption! Each region will, however, experience short network disconnects on its upgrade day, as we have to reboot the server that handles the network connections. These disconnects will occur over the course of the day, and we expect each one to last a few minutes.
If you have services running that cannot be offline for a few minutes, we advise you to move them to the other region for the scheduled date.
We will first upgrade the so-called “control plane” (all the services that allow the cloud to work and you to create new machines). The control plane lives on two virtual machines (called controller and network) that run on a mirrored pair of physical servers. As a first step, we will switch the (also virtual) disks of these virtual machines from LVM to Ceph. This gives us more flexibility when working with those machines in the future. To do that, we have to suspend the virtual machines, copy their disks to the new storage and then reboot them from the new disks. This causes the first service outage: while the controller is being moved, you will not be able to work with your machines, and while network is being moved, network connectivity will be lost for a few minutes.
We will then snapshot the two control plane machines (as a form of backup) and perform the actual software upgrade on both of them (installing all the new software, adapting the configuration etc.). Next, the underlying databases have to be updated. The databases hold information about you, your projects, your virtual machines, virtual disks, virtual networks, firewalls and so on. Once this step has completed, both machines will be rebooted. The reboot of the network node will again cause a short interruption in connectivity. When these steps are done, the whole control plane is on the new release.
We will then upgrade the physical servers that run your virtual machines. Before upgrading a server, we move its running virtual machines to another server (this process is transparent to you and causes no downtime for your virtual machine). We then perform the upgrade, reboot the server and continue with the next one. This process will take a couple of hours, but there will be no further interruptions of service.
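For the technically curious, the server-by-server procedure can be sketched roughly as follows. This is a simplified simulation for illustration only, not our actual tooling: the function rolling_upgrade, the host names and the rule of moving each virtual machine to the least loaded other server are all assumptions made for this example.

```python
from typing import Dict, List


def rolling_upgrade(hosts: Dict[str, List[str]]) -> List[str]:
    """Simulate upgrading servers one at a time; return the upgrade order.

    `hosts` maps a (hypothetical) server name to the VMs running on it.
    The dict is modified in place to reflect where the VMs end up.
    """
    if len(hosts) < 2:
        raise ValueError("need at least two servers to empty one of them")
    upgraded: List[str] = []
    for host in list(hosts):
        # Move every running VM to the least loaded other server
        # (an assumed placement rule, chosen only for this sketch).
        while hosts[host]:
            vm = hosts[host].pop()
            target = min(
                (h for h in hosts if h != host),
                key=lambda h: len(hosts[h]),
            )
            hosts[target].append(vm)
        # The server is now empty; the real upgrade and reboot
        # would happen at this point.
        upgraded.append(host)
    return upgraded


# Hypothetical example data.
hosts = {"compute1": ["vm-a", "vm-b"], "compute2": ["vm-c"], "compute3": []}
order = rolling_upgrade(hosts)
```

The point of the sketch is that a server is only upgraded once it no longer hosts any virtual machine, and that every virtual machine is still running somewhere at each step.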
If you have any questions, feel free to write to us at firstname.lastname@example.org or visit our user forum at https://forum.cloud.switch.ch
The SWITCHengines team