I assume many installations need to move to VIOS 3.1 by October 2020 because support for VIOS 2.x ends. Because I did not find any upgrade procedure for “dummies”, I decided to write one, which I hope people will find useful. I cover not only the upgrade process itself but also share some advice on how to do it safely and how to recover in case of trouble. Upgrading the Virtual I/O Server to 3.1 is a completely different process than before. For instance, you are going to need an extra disk to perform the operation. Interested?
There is not much documentation on this topic. Definitely the best is from Nigel Griffiths, but Nigel is Nigel… He assumes the entire VIOS community knows AIX and all the terminology around it, while for most IBM i administrators the VIOS is apparently a “black box”. Therefore, this post is mainly for less advanced VIOS users. If you use a Shared Storage Pool (the cluster within VIOS), have configured another Volume Group, or do hundreds of AIX installations, you are already an advanced VIOS administrator and can skip it.
Initially I had no intention of writing about the process, because the idea was to go to version 3.1 with new POWER9 systems and let VIOS 2.x die with the POWER8 machines. Secondly, there is really no reason to go to VIOS 3.1 when the main workload runs on IBM i (none of the new features bring any value to IBM i). Even Nigel Griffiths said at the first VIOS 3.1 presentation (Rome 2018; he gets all releases before they are generally available): do not do it. Apparently, IBM has changed its mind and shortened the support for VIOS 2.x, which puts many customers in an uncomfortable situation.
Is the upgrade to version 3.1 difficult? No, if everything goes according to plan and the automation works. If something unpredictable happens, you might be in trouble.
Did I have problems? Yes, I did. 40% of my upgrades failed: the process crashed and I was forced to recover the configuration manually.
A few personal pieces of advice:
Before the upgrade, double/triple check that you have a current VIOS backup and that you know how to use it. This applies especially to the IBM i community. Officially the VIOS can be restored in the following ways:
- From the DVD-RAM (really, is anyone doing that?).
- Using the HMC command line (have you ever tried it?). Remember that there are very tight dependencies between the HMC release and the VIOS versions it is able to restore.
- Create a mksysb image and use NIM (AIX) to restore it. Super easy and fast, but an AIX LPAR is required.
- Install the VIOS from scratch, apply fixes, and restore the configuration.
- FlashCopy on the storage array, if your VIOS is installed on an external array and the FlashCopy feature is available. You may take the FlashCopy, but again, be sure that you know how to recover from it afterwards.
Don’t do it by yourself if:
- You don’t know VIOS commands
- You have never touched the VIOS command line
- You have never installed the VIOS from scratch
- You don’t know how to configure TCP/IP on the VIOS
- All configuration was done only through the HMC enhanced interface (look to ad2)
- You have some internal VIOS modifications/scripts done by someone else, and you are not sure how they work
- You don’t know how to connect to the VIOS through the console
Again, you are going to need this knowledge if something bad happens. If everything goes well, this is a super easy process.
Why is the VIOS 3.1 upgrade so different?
This is not the same process we have known for the last ~10 years (the updateios command). It is a jump from AIX 6.1 to 7.2, and it is not the same operation IBM i people know from upgrading “their OS”. It is basically a scratch installation of AIX 7.2 (VIOS 3.1) followed by a restore of the configuration. BUT not everything will be restored. Any modifications, ssh private/public keys, scripts, automation, folders with data, crontab entries, extra file systems, and the virtual media repository WILL NOT be restored. You need to back them up prior to the upgrade and restore/configure them yourself afterwards.
Another side effect of the upgrade might be the renaming of logical devices. For instance, an Ethernet adapter or Shared Ethernet Adapter with the logical name ent5 might become ent7, or a fibre adapter port fcs3 might become fcs5. It is almost impossible (and definitely not recommended) to modify these names on AIX.
Do you have to be afraid that the SEA will be improperly configured or that a virtual fibre adapter will be mapped to another physical port after the upgrade? No, don’t be afraid. The restore does not use logical names; it is linked to the physical locations.
The VIOS 2.2.6.x upgrade to 3.1 can be compared to a “side by side” POWER system upgrade, when you put a new server next to the current one and the data is migrated piece by piece. Therefore, you are going to need an extra disk/LUN to perform it. Yes, an extra physical disk is required, because the process installs a brand new VIOS instance and copies the configuration to it. After the upgrade, the old disk can be destroyed. If you don’t have an extra disk, either stop the mirroring on the internal storage or borrow a disk from someone :). If the VIOS runs from external storage, I hope you can find at least 30GB.
“I don’t know VIOS commands, but I have a dual VIOS configuration with full redundancy, so I will give it a try.” I suggest you think again. If you run IBM i LPARs, the tape library is very likely not connected through both VIOSes. If something goes wrong, the production LPARs will survive, but your backup/restore functionality might be affected. Since you won’t be able to recover by yourself, you will have to ask for help. Do you accept the risk?
Still sure that you have it under control? If you are afraid, you can contact me and I can help you. Otherwise, let’s go to the next steps:
Before the upgrade to 3.1
- I definitely recommend installing one of the latest VIOS releases first, 220.127.116.11 or above (be aware that some of the latest releases contain a bug, see post). It provides the new viosupgrade command, which automates the process (if you are lucky). Install this release shortly before the upgrade. It adds 30 minutes to 1 hour of extra work, but it definitely pays off. Personally, I recommend the 18.104.22.168 release.
- If for any reason you decide to use release 22.214.171.124 (which should be bug-free) and still want to go to VIOS 126.96.36.199, you hit another problem: some filesets in version 2 will be newer than in version 188.8.131.52, and the upgrade will not be possible. You can read how to solve it at the end of the post.
- If you have just upgraded to 184.108.40.206, do not apply cfgrules after the upgrade. I do not have good experience with it: it modified some parameters which are incompatible with release 3.1.
- Download the VIOS 3.1 installation media from My Entitled Systems Support. I recommend using the 220.127.116.11 images and installing the 18.104.22.168 fix afterwards. Release 22.214.171.124 contains a bug (read here).
- Provision an extra disk to the VIOS which will be upgraded. The minimum is 30GB, but I recommend around 70GB.
- Copy all custom scripts to a remote location.
- Download the fix pack to 126.96.36.199 and all eFixes (see HIPER APAR page)
At this point I assume you already run at least release 188.8.131.52 and are about to upgrade to 3.1 in a few moments.
I recommend the following:
- Fail over the Shared Ethernet Adapter (SEA) manually to the other VIOS. See the best practice section in the previous post.
- Make a backup of the VIOS configuration and download it to your workstation (or any other remote location). The configuration backup can be made with the command:
viosbr -backup -file /home/padmin/vios31Priorupgrade
- Copy all crucial configuration to a notepad, using commands like:
lsnports, lsdev -dev ent*, lsdev -dev ent* -vpd, lsmap -all -npiv, lsmap -all, lsdev -dev entx -attr, ifconfig -a, lspath.
- Custom entries added to the crontab: crontab -l.
- Or you can use this cool script my colleague found, savevio. It lists all configuration into text files stored in the /home/padmin/saveit folder.
- If you use the script, copy (ftp) it to the VIOS, make it executable with chmod +x /home/padmin/savevio.sh, and run it with ksh savevio.sh in the root shell (oem_setup_env).
- Copy (ftp) the 3.1 ISO images to the VIOS.
- Create the VIOS 3.1 image from the ISO images:
$ viosupgrade -I iso/VIO311_DVD1.iso:iso/VIO311_DVD2.iso -w vio31install
Here iso/VIO311_DVD1.iso:iso/VIO311_DVD2.iso is the location of the copied ISO images and vio31install is the folder where the VIOS installation image will be created. This image is a mksysb image of a “clean VIOS 3.1”. What is a mksysb image? It is basically a binary image which AIX (NIM) can use to install another operating system. There is no equivalent in the IBM i world. The command should create a file named in the format iosmksysb_xxxxx.mksysb
- Copy the created mksysb image from the /home/padmin/vio31install directory to an offsite location. You can reuse it on another VIOS later.
- Verify that the file systems have some free space with the df command. If some file systems are close to 100%, increase them.
- Verify that an extra physical disk is available: lspv
- If you have any Volume Group besides rootvg, delete it.
- In the root shell (oem_setup_env), run the chdef command and check that the io_dma attribute in the new_default column equals 256. Otherwise you may hit the APAR.
- If the io_dma parameter must be changed, do the following:
# /usr/sbin/chdef -c adapter -s pciex -t df1000e21410f10 -a io_dma=256
# /usr/sbin/chdef -c adapter -s pciex -t df1060e21410100 -a io_dma=256
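The free-space check from the list above can be made less error-prone with a small filter. This is only a sketch: the function name full_filesystems is made up for this post, it assumes a shell with awk available (on the VIOS that means the root shell, not the restricted padmin shell), and it assumes the usage percentage is the first column of the df output that ends in "%".

```shell
# Print every file system whose usage is at or above a threshold (default 90%).
# Hypothetical helper, not part of VIOS. Reads df output on stdin.
full_filesystems() {
    threshold="${1:-90}"
    awk -v limit="$threshold" '
        NR == 1 { next }                      # skip the df header line
        {
            # the first field of the form "NN%" is taken as the space usage
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[0-9]+%$/) {
                    pct = $i
                    sub(/%/, "", pct)
                    if (pct + 0 >= limit + 0) print $NF, $i
                    break
                }
            }
        }'
}
```

Typical use would be df | full_filesystems 90, which prints the mount point and usage of each file system that needs to be increased before the upgrade, and nothing at all if there is enough room.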
From this moment on, make sure that no one performs any VIOS changes. No new LPARs, no changes to the virtual adapters, no SEA changes. Again, no changes! Otherwise you will be screwed.
So, everything is saved and checked. If you are lucky this will be easy… :)
- Log in to the VIOS through the HMC terminal.
- Check the empty disk again with lspv. If the empty disk is hdisk1, run the command below:
viosupgrade -l -i vio31install/iosmksysb_49402 -a hdisk1
At the very beginning the procedure also creates a viosbr backup, which it will automatically use for the restore, and you should see something like this:
The upgrade may take 20-40 min; the speed depends mainly on the disk speed.
If you are lucky, the configuration is restored, the VIOS IP address becomes active, and the restore procedure initiates one more restart of the VIOS.
If the restore works, there will be a lot of information and the console stops at the login prompt. Remember, this is a scratch VIOS installation: the first login is possible only from the console, where you must set the password and accept the licenses.
You should see a summary of which devices have been changed. For instance, ent8 became ent11. The detailed restore information can be checked with the command:
viosupgrade -l -q
For NPIV mappings, I’ve noticed that this process may take 5-10 min after the VIOS is started. So do not panic at the beginning; give it more time.
If everything was properly restored, all mappings should work. The SEA should operate (remember that I recommended disabling it before the upgrade, so if you did, it will still be disabled). Now you can recover/restore all your custom configuration, like ssh keys, virtual images, nmon setup, etc.
You can also remove the old Volume Group from the configuration and remove the extra disks. In order to list the Volume Groups, do:
You should see the VG old_rootvg assigned to the physical disks. You can remove the VG:
# alt_disk_install -X old_rootvg
Now you can remove the disk from the configuration. If old_rootvg was assigned to hdisk0, do:
$ rmdev -dev hdisk0
If the disk was successfully removed from the configuration, you can remove it from the storage array.
If the restore process fails…
First check the log with the command:
viosupgrade -l -q
You should see something like:
Welcome to viosupgrade tool.
Getting status of node(s):
Please see the vioupgrade status:
Wed Jul 8 12:10:42 2020|STARTED
Wed Jul 8 12:15:21 2020|TRIGGERED
Wed Jul 8 05:29:25 2020|RESTORE
Wed Jul 8 05:29:34 2020|FAILED
Please see the viosbr restore status:
Viosbr restore timestamp:
Wed Jul 8 05:29:24 CDT 2020
License acceptance is successful
Restoring the backup..
Restore summary on localhost:
Backedup Devices that are unable to restore/change
DEPLOYED or CHANGED devices:
Dev name during BACKUP Dev name after RESTORE
Some error messages may contain invalid information
for the Virtual I/O Server environment.
0590-105 The execution of a command failed
ERROR- 'ArtexRules' are failed to restore
Errors have been detected during restoration.
Details are in the log file : /home/ios/logs/restore_trace.out.
So you can track down what went wrong in the log file /home/ios/logs/restore_trace.out, but it might be difficult to find.
Anyway, very likely TCP/IP is not running, so the very first thing to do is to configure it: find the network adapter and configure the IP address.
If TCP/IP is active, you can transfer (ftp) the configuration backup that I advised you to move offsite (the one created with the viosbr command) back to the VIOS and restore it manually:
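If you save the status report to a file, the final state can be pulled out with a one-liner instead of reading the whole output. The sketch below is based only on the sample shown above, where each status line has the form "timestamp|STATE"; the function name last_upgrade_state is hypothetical, and it needs a shell with awk (the root shell on the VIOS, or your workstation).

```shell
# Print the state from the last "timestamp|STATE" line of a saved status report.
# Hypothetical helper. Reads the report on stdin.
last_upgrade_state() {
    awk -F'|' '
        NF == 2 {
            s = $2
            gsub(/[ \t\r]+$/, "", s)          # drop trailing whitespace
            if (s ~ /^[A-Z_]+$/) state = s    # remember the most recent state
        }
        END { if (state != "") print state }'
}
```

For the report above, last_upgrade_state < status.txt would print FAILED, which is the cue to continue with the manual recovery.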
viosbr -restore -file /home/padmin/vios31Priorupgrade -skipcluster
The automatic process failed for me for multiple reasons, but running the restore manually always recovered the configuration. Read the restore log carefully: if there is a problem in the configuration, some configuration or mappings may not be restored.
Package on VIOS 2.x is newer than on VIOS 184.108.40.206. The upgrade process will not start.
This is a very rare situation, but it may happen if you are already on VIOS 220.127.116.11 and, for whatever reason, you are not going to upgrade to the latest 18.104.22.168 release but only to 22.214.171.124.
When the viosupgrade command is executed, you may see an error saying that some package is newer and the installation cannot continue. In such a case, you need to remove the newer package. In my scenario a LAN package had to be removed:
# /usr/sbin/emgr -P
PACKAGE INSTALLER LABEL
======================================================== =========== =====
devices.vdevice.IBM.l-lan.rte installp IJ25397sFa
Remove the package:
# /usr/sbin/emgr -r -L IJ25397sFa
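If several eFixes are installed, it can be handy to look up the label for one package rather than scanning the listing by eye. The sketch below assumes the three-column layout (PACKAGE, INSTALLER, LABEL) exactly as in the sample above; the function name efix_labels_for is made up for illustration.

```shell
# Print the eFix label(s) listed for one package in saved "emgr -P" output.
# Hypothetical helper. Reads the listing on stdin.
efix_labels_for() {
    pkg="$1"
    awk -v pkg="$pkg" '$1 == pkg && $2 == "installp" { print $3 }'
}
```

In the root shell, /usr/sbin/emgr -P | efix_labels_for devices.vdevice.IBM.l-lan.rte would print IJ25397sFa for the listing above, which is the label passed to emgr -r -L.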