HyperV – Ramblings of an IT Admin

Installing AWX on AlmaLinux 9

I ran into some issues installing AWX on AlmaLinux 9 on Proxmox (I had the same issues with Alma 8.7). This also applies to RockyLinux 9.

I was installing AWX via Rancher following https://github.com/ansible/awx-operator#basic-install. I made it all the way to the section where you create the awx-demo.yaml, add it to your kustomization.yaml and build via kustomize build . | kubectl apply -f -. From there I was receiving errors such as “unable to determine if virtual resource”,”gvk”:”apps/v1″ and the build would ultimately fail out.

In order to make it past that error I found a found a few posts which suggested changing the CPU type from “Default (kvm64)” to Host. This sets the VM to match the CPU of the host.

***If you are running HyperV, there is a similar option, see the final post in this Google Group conversation: https://groups.google.com/g/awx-project/c/4tmP0TlRODU.***

After resetting the CPU type, rebooting the vm and re-running the kustomize build, I was able to make it quite a bit further. The logs looked like there were no issues, then towards the end the script once again failed. This time I was seeing the following error: “awx unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1:”. The Pod itself was also down with a CrashLoopBackOff error. From there I found the following link which was able to get me past all of my installation issues: https://stackoverflow.com/questions/62442679/could-not-get-apiversions-from-kubernetes-unable-to-retrieve-the-complete-list

I ran: kubectl api-resources which listed the resources and metrics.k8s.io/v1beta1 was in fact down.

Next I ran: kubectl delete apiservice/v1beta1.metrics.k8s.io

From there I re-ran the kustomize build command and awx installation completed successfully after the installation. I did have to open the firewall ports in Alma to allow my browser to access AWX.

Steps to Install AWX:

#Install Rancher
curl -sfL https://get.k3s.io | sh -

#Install Kustomize
curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh"  | bash

#Move Kustomize binary
mv kustomize /usr/local/bin/

#Goto AWX Readme and follow along from there:
# https://github.com/ansible/awx-operator#basic-install

Feel free to contact me if you have any comments or questions

Veeam Backup Failing (VSS_WS_FAILED_AT_PREPARE_SNAPSHOT) (Resolved)

Veeam Backup Failing (‘VSS_WS_FAILED_AT_PREPARE_SNAPSHOT’)

I had a Veeam backup job that was failing with: Retrying snapshot creation attempt (Writer ‘Microsoft Hyper-V VSS Writer’ is failed at ‘VSS_WS_FAILED_AT_PREPARE_SNAPSHOT’. The writer experienced a non-transient error. If the backup process is retried, the error is likely to reoccur. –tr:Failed to verify writers state. –tr:Failed to perform pre-backup tasks.)

Researching this error online was telling me the issue was on the host, but I wasn’t believing that as all of my other vm’s were backing up without issue daily.

To play it safe I checked the host by running: vssadmin list writers

I received the following error on the host:

microsoft hyper-v vss writer non-retryable error

Looking further on the host’s event logs for the error I saw this:

At this point I was still convinced the host wasn’t at fault due to the fact all other vm’s still backed up fine, so I logged onto the vm in question and ran: vssadmin list writers

I received the following on the vm:

sqlserverwriter non-retryable error

Looking into the event viewer I saw:

Researching these errors online I found several solution saying to delete the old backup software. This server used to use another backup solution prior to Veeam called Altaro, which I was pretty sure I had removed a long time ago. I checked add/remove programs and verified Altaro wasn’t listed. I even checked the vss writers for any other backup software listed and found nothing. Running out of ideas, I checked Windows backup to make sure it wasn’t running and no backup jobs were listed. I then looked into Task Scheduler and found a few manual backup jobs listed. I disabled and deleted these jobs. I then restarted the SQL VSS writer service, restarted SQL VSS service, verified it showed no errors after re-running vssasdmin list writers. I then retried the Veeam backup again and it failed out once again.

Re-running vss list writer I received the same error. I was now convinced this was tied to the old task scheduled backups I had removed.

Next, I tried: vssadmin delete shadows /all

After running that command, I received:

Error: Snapshots were found, but they were outside of your allowed context. Try
removing them with the backup application which created them.

After much more research, I found an outside the box way of deleting the snapshots from another site.

How to Fix “outside of your allowed context” Errors

In order to get rid of these kinds of shadows we need to apply a “trick”. Basically the VSS diff area storage is where VSS keeps these shadows “alive”.

By seriously cutting this limit to the bare minimum we invoke a mechanism in VSS itself that causes it to dump all shadows.

So we proceed by telling VSS to cut the limit down to 401 MB. For some reason the user interface will claim the bottom is 300MB but on several versions of Windows it refuses and reports:

Error: Specified number is invalid

The command that works uses 401MB and is (adapt it to your drive letter as needed):

vssadmin resize shadowstorage /for=D: /on=D: /maxsize=401MB

*****I ran this against the C: and D: drive of my VM*****

Then once you get “success” you can increase the limit once again to the recommended “unbounded” setting, or an actual limit value if you are using shadow copies for other purposes:

vssadmin resize shadowstorage /for=d: /on=D: /maxsize=unbounded

*****I ran this against the C: and D: drive of my VM*****

Then, vssadmin happily reports:

Successfully resized the shadow copy storage association

and a quick check using

vssadmin list shadows

reveals all VSS shadow copies are now gone!

I then re-ran Veeam the Veeam backup job against the VM and it ran successfully!

Move KVM VM vm to HyperV

Found the following online on Novell’s site and it worked perfectly (link below)

Copy the disk to a pc with virtualbox installed the convert with:

c:\Program Files\Oracle\VirtualBox>VBoxManage convertfromraw c:\be\disk0-test.ra

w test.vhd –format VHD

Then copy the vhd to hyperv and do it

1. Boot the target machine into rescue mode, using appropriate boot media

2. Run the following command to determine which devices are /, /boot and swap:

fdisk -l

3. Using that information mount the appropriate devices:

mount /dev/%root device% /mnt

mount /dev/%boot device% /mnt/boot

mount –rbind /proc /mnt/proc

mount –rbind /sys /mnt/sys

mount –rbind /dev /mnt/dev

4. Change the system root to the newly selected location:

chroot /mnt

5. Modify the /etc/fstab file to make sure the correct devices are being used. For example the root, boot, and swap devices might be listed with these device names:

/dev/sda1

/dev/sda2

/dev/sda3

According to the fdisk -l output, however, the devices should be listed as follows:

/dev/cciss/c0d0p1

/dev/cciss/c0d0p2

/dev/cciss/c0d0p3

6. Modify the /boot/grub/menu.lst file and replace the boot partition information with the correct device id. For example, the boot partition may be listed as this device:

/dev/sda2

Where it should be this device:

/dev/cciss/c0d0p2

7. Make sure that the /var/tmp directory exists and then run the following command (note: the /var/tmp directory may need to be manually created first):

mkinitrd

8. Reboot the target machine

From <https://www.novell.com/support/kb/doc.php?id=7009643>