Ansible – List all powered on VMs to CSV

Sometimes we need to audit our VMware environment, and it is nice to have Ansible gather this information into a format that is easy to import. This can take an hours-long job down to seconds. I initially had an Ansible playbook that would list all VMs, including templates and powered-down or paused VMs. That was nice, but I only really needed the powered-on VMs.

This grew out of running the playbook and manually scraping the output for what I needed, which still took some time. In the second iteration I ran the playbook, used “tee” to write the output to a file, and then ran a series of 12 sed statements against that file to extract the information I needed. That was great and it took less time, but I wanted to get it down to a one-liner.

My third iteration is where I am today. It is a one-liner that isn’t pretty. I was able to join several sed statements into one, but I still had to run the last 4 sed statements separately because, when I joined them to the first, the output wasn’t what I expected.

This is what I am using now (I will break it down after):

ansible-playbook VMWARE_list_all_powered-on_vms.yml --ask-vault-pass | sed -e '/"msg"/,$!d; /"msg"/d; / ____________/,$d; s/        {//g; s/        },//g; s/            "guest_name": "//g; s/            "ip_address": "//g; s/"//g'  > list.csv && sed -i '1d' list.csv && sed -i '/        }/,$d' list.csv && sed -i 'N;s/\n//' list.csv && sed -i '/^[[:space:]]*$/d' list.csv && cat list.csv

I know the above code is not too appealing to the eye (feel free to message me if you have any suggestions). The first command runs the Ansible playbook “VMWARE_list_all_powered-on_vms.yml”. Since I don’t want to store passwords in plain text, in this example I’m using Ansible Vault (there are better options out there). I pipe the output of Ansible through 5 sed statements. The first sed statement takes the Ansible output (which contains Ansible cowsay… which makes Ansible output fun) and does the following:

  1. Strip the first several lines of Ansible output down to the “msg”: [ line
  2. Remove the msg line
  3. Remove the trailing Ansible output (PLAY RECAP)
  4. Strip the opening “{” of each entry
  5. Strip the closing “},” of each entry
  6. Remove everything before and including “guest_name”: “
  7. Remove everything before and including “ip_address”: “
  8. Remove all quotes and output to list.csv (I’m not finished yet)

Now I start separate sed statements, because when I combined them into one statement the format wasn’t what I expected:

  1. Remove the first line of output
  2. Remove everything after and including “}”
  3. Join the VM name and IP address lines together
  4. Remove all blank or whitespace-only lines
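
A possible alternative to the chained sed statements is a single awk pass. The sketch below is illustrative only and is not the one-liner described above: the function name msg_to_csv is made up, and it assumes the debug output prints one key per line with each “guest_name” line appearing before its “ip_address” line. A VM name containing a double quote would break the field splitting.

```shell
# Hypothetical helper: reads ansible-playbook output on stdin and prints
# one "NAME, IP" line per VM. Assumes one key per line and that
# "guest_name" comes before "ip_address" within each entry.
msg_to_csv() {
  awk -F'"' '
    /"guest_name":/ { name = $4 }             # remember the VM name
    /"ip_address":/ { print name ", " $4 }    # emit the pair once the IP line arrives
  '
}
```

Under those assumptions it could be used as: ansible-playbook VMWARE_list_all_powered-on_vms.yml --ask-vault-pass | msg_to_csv > list.csv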

Below is an example of my output. Some VMs did not include their IP; this has to do with VMware Tools not being installed or not running on the VM. I will follow the output with the actual Ansible playbook:

COUNT DOOKU, 192.168.77.4
COUNT CHOCULA, 192.168.192.
COUNT VON COUNT - 123 AHHH HA HA, 192.168.3.192
COUNT DRACULA, 192.168.1.81
GREEDO, 192.168.3.3
POE, 192.168.3.8
TARKIN, 192.168.4.4
GENERAL GRIEVOUS, 192.168.3.7
GROGU - ITS BABY YODA FOOL, 192.168.3.159
DEATH STAR, 192.168.144.14
DEATH STAR 2, 192.168.192.15
BB-8, 192.168.1.199
JABBA, 192.168.192.86
FINN, 192.168.144.3.
MANDO,
TRAWN, 192.168.3.8
KIRK, 192.168.3.3
SPOCK, 192.168.3.176
DATA, 192.168.1.86
WORF, 192.168.5.7
PICARD, 192.168.1.178
RIKER, 192.168.19.84
McCOY, 192.168.9.81
La FORGE,
SCOTTY, 192.168.3.55
ARCHER, 192.168.19.99
RON BURGANDY, 192.168.144.3
SULU,
PIKE, 192.168.1.78
T'POL, 192.168.192.24
T-PAIN -LOL, 192.168.1.192
ENTERPRISE, 192.168.1.194
INTREPID, 192.168.3.49
USS VIRGINIA CGN-38, 192.168.8.38
USS LaSALLE AGF-3, 192.168.8.3
MISS PIGGY,
KERMIT THE FROG, 192.168.1.174
GONZO, 192.168.192.5
FOZZIE, 192.168.7.21
ANIMAL, 192.168.3.92
BEAKER, 192.168.192.26
ROWLF, 192.168.3.80
SCOOTER, 192.168.3.36
SAM EAGLE, 192.168.192.1
DR BUNSEN HONEYDEW,
STALER, 192.168.1.155
WALDORF, 192.168.3.58
SWEDISH CHEF - BORK BORK BORK, 192.168.1.3
PIGS IN SPACE, 192.168.1.48
RIZZO THE RAT, 192.168.144.16
FRANK RIZZO - LOL,
OSCAR, 192.168.4.9
BIG BIRD, 192.168.3.30
BERT, 192.168.5.45
ERNIE, 192.168.1.118
GROVER, 192.168.3.55
SNAKE EYES,
COBRA COMANDER, 192.168.3.3.
STORM SHADOW, 192.168.3.36
LADY JAYE, 192.168.3.5
BARONESS, 192.168.3.8
DUKE, 192.168.1.177
DESTRO, 192.168.3.1
SCARLETT, 192.25.160.1
FLINT, 192.168.3.17
HAWK, 192.168.192.1
BIG CHUCK,
LITTLE JOHN, 192.168.144.13
COOL GHOUL, 192.168.3.6
ZARTAN, 192.168.192.7
MEGATRON, 192.168.14.3
STARSCREAM, 192.168.1.185
ICE CREAM, 192.168.33.3
ME GRIMLOCK, 192.168.5.101
JAZ, 192.168.3.54
OPTIMUS PRIME, 192.168.3.53
IRONHIDE, 192.168.1.116
SOUNDWAVE - THE BEST, 192.168.1.136
KUP, 192.168.192.8
SLUDGE, 192.168.1.184
LASERBEAK, 192.168.192.14
BUMBLEBEE, 192.168.144.3
GRAPPLE, 192.168.3.1
SMOKESCREEN, 192.168.3.45
RUMBLE, 192.168.14.7
RAVAGE, 192.168.5.99
MAGNUM PI, 192.168.3.30
A-TEAM, 192.168.144.17
MR T - I PITTY THE FOOL, 192.168.6.3.
TRAP - ITS A TRAP, 192.168.1.135
MACGYVER,
THE DUKES OF HAZZARD, 192.168.1.136
BOSS HOG,
TOUR OF DUTY, 192.168.3.56
VOLTRON, 192.168.1.19
TIMMY, 192.168.1.15
JIMMY, 192.168.3.4
MR-HANKEY, 192.168.3.5
CARTMAN, 192.168.5.44
KENNY, 192.168.3.3
STAN, 192.168.19.168
KYLE, 192.168.5.60
TOLKIEN,
CHEF, 192.168.3.99
LIAN-CARTMAN, 192.168.1.3
THE-SCARY-MONSTER, 192.168.1.100
BEBE, 192.168.1.156
SHARON-MARSH, 192.168.1.101
TOWELIE,
LINDA-STOTCH, 192.168.192.3
GARY, 192.168.1.3
MR GARRISON, 192.168.12.3
BONO, 192.168.1.16
WENDY TESTABURGER, 192.168.192.8
ANAKIN, 192.168.3.37
DARTH VADER, 192.168.1.115
LUKE, 192.168.3.38
OBI-WAN, 192.168.1.49
HAN SOLO, 192.168.35.22
SHEEV, 192.168.3.33
LEA, 192.168.1.117
YODA, 192.168.192.168
CHEWBACA, 192.168.5.41
BOBA FETT, 192.168.192.11
JENGO-FETT, 192.168.3.58
R2-D2, 192.168.144.11
C-3PO, 192.168.22.45
STORM TROOPER, 192.168.4.3
SNOW TROOPER, 192.168.69.101
CLONE TROOPER,
SUPER TROOPERS - LOL, 192.168.99.100
REY, 192.168.192.4
LANDO,
PADME, 192.168.88.88
KYLO REN, 192.168.1.194
MACE WINDU - LIVES, 192.168.192.5
QUI-GON JIN,
GIN AND JUICE, 192.168.1.189
ADMIRAL ACKBAR,
DARTH MAUL, 192.168.3.41
AHSOKA TANO, 192.168.77.78
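
Since a missing IP usually points at VMware Tools, it can be handy to pull just those rows out of the CSV. A minimal sketch, assuming the “NAME, IP” layout shown above and VM names that contain no commas; the function name vms_without_ip is hypothetical:

```shell
# Hypothetical helper: print the names of VMs whose IP column is empty.
# The first argument is the CSV file (list.csv in this post).
vms_without_ip() {
  awk -F',' '{
    ip = $2
    gsub(/[ \t]/, "", ip)   # ignore stray whitespace after the comma
    if (ip == "") print $1
  }' "$1"
}
```

Running vms_without_ip list.csv against the output above would list entries such as MANDO and La FORGE.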

Here is the actual Ansible Playbook “VMWARE_list_all_powered-on_vms.yml”


---
- hosts: localhost
  vars:
    vcenter_hostname: vcenter.domain.local
    vcenter_user: ansibleuser@DOMAIN.LOCAL
    vcenter_pass: !vault |
          $ANSIBLE_VAULT;1.1;AES256
    
    esxhost: 192.168.1.101
    name: "{{ vm_name }}"
    notes: Ansible Test
    dumpfacts: False

  tasks:
  - name: Gather all VMs information
    vmware_vm_info:
      hostname: '{{ vcenter_hostname }}'
      username: '{{ vcenter_user }}'
      password: '{{ vcenter_pass }}'
      validate_certs: no
    register: all_vm_info
    delegate_to: localhost


  - name: Gather a list of all powered on VMs
    set_fact:
      on_vm: "{{ all_vm_info.virtual_machines | json_query(query) }}"
    vars:
      query: "[?power_state=='poweredOn']"
    register: jsoncontent

  - name: Gather a list of all powered on VM names
    debug: msg="{{ on_vm | json_query(jmesquery) }}"
    vars:
      jmesquery: "[*].{guest_name: guest_name, ip_address: ip_address}"


WatchGuard Dimension Server on Proxmox VE 6.4-8

One of my friends needed a WatchGuard Dimension server set up, and they were using Proxmox as the host. I figured it *should* be easy. I initially downloaded the Dimension .ova, scp’d it over to Proxmox, unpacked it, and imported the .ovf. The import worked, but the VM would not boot. Next I found Marcus Eaton’s blog article on installing WatchGuard Dimension on Proxmox VE. I initially ran into problems converting the disks, as my Dimension VM drives were stored on an LVM-thin volume. This is how I got mine to work:

1.) On Proxmox, I created a directory under /root: mkdir /root/staging

2.) I scp’d WG Dimension’s VMware .ova file to /root/staging

3.) In /root/staging, I unpacked the .ova: tar xvf ./watchguard-dimension_2_2.ova

4.) I created a new VM in Proxmox, chose “Do not use any media”, and left the default guest OS type of Linux, 5.x - 2.6 Kernel. Under the “System” tab, I left the defaults. On the “Hard Disk” tab, I selected SATA for Bus/Device and set the drive size to 160 GB. Under the “CPU” tab, I selected 2 sockets and 2 cores. Under the “Memory” tab, I entered 4096 (4 GB). Under “Network”, I changed the model to E1000 and confirmed the settings. When the VM was finished creating, I edited the VM hardware and added a 2nd hard drive (SATA), also 160 GB, so the VM now had (2) 160 GB hard drives. I left the VM powered down and returned to my SSH session on Proxmox.

5.) From the /root/staging directory I ran the following (the commands can take some time to run):

A.) qemu-img convert -f vmdk -O raw watchguard-dimension_2_2_signed-disk1.vmdk /dev/mapper/pve-vm--100--disk--0

B.) qemu-img convert -f vmdk -O raw watchguard-dimension_2_2_signed-disk2.vmdk /dev/mapper/pve-vm--100--disk--1

Once the conversions completed, I was able to power on the Dimension VM and run through its configuration.
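
The target paths in step 5 follow device-mapper’s naming rule: the volume group and logical volume names are joined with a single hyphen, and any hyphen inside either name is doubled. A small sketch of that rule follows. The helper name lv_mapper_path is hypothetical, and the volume group “pve” and VM ID 100 are simply the values from my setup:

```shell
# Hypothetical helper: build the /dev/mapper path for an LVM volume.
# Device-mapper doubles every hyphen inside the VG and LV names and
# joins the two names with a single hyphen.
lv_mapper_path() {
  vg=$(printf '%s' "$1" | sed 's/-/--/g')
  lv=$(printf '%s' "$2" | sed 's/-/--/g')
  printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}
```

For example, lv_mapper_path pve vm-100-disk-0 prints /dev/mapper/pve-vm--100--disk--0, which is the target used for the first disk above.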

Migrating VMs from Xen to VMware

Two years ago I was tasked with migrating some VMs off Xen to VMware. These were my notes:

1.) Take a SNAPSHOT!!!!

2.) Uninstall Citrix via Add/Remove Programs (don’t restart)

3.) Manually run uninstaller.exe from C:\Program Files (x86)\Citrix\XenTools (don’t restart)

4.) Device Manager: uninstall devices that use a Citrix driver (don’t reboot; may have to uninstall twice)

5.) Device Manager (show hidden devices): look for Citrix drivers and uninstall any that are shown

6.) Restart the machine and take another snapshot (just in case)

7.) Open Device Manager and double-check for Xen drivers (there shouldn’t be any)

8.) Open the registry editor (regedit) and navigate to:

HKLM\SYSTEM\CurrentControlSet\Services\

Delete all keys that begin with “XEN”, and repeat this for every other control set you may have, for example:

HKLM\SYSTEM\ControlSet001\Services\
HKLM\SYSTEM\ControlSet002\Services\

Now navigate to:

HKLM\SYSTEM\CurrentControlSet\Control\Class\

and delete the “UpperFilters” value found under the following two keys:

{4D36E96A-E325-11CE-BFC1-08002BE10318}
{4D36E97D-E325-11CE-BFC1-08002BE10318}

Repeat this for every other control set you may have, for example:

HKLM\SYSTEM\ControlSet001\Control\Class\{4D36E96A-E325-11CE-BFC1-08002BE10318}
HKLM\SYSTEM\ControlSet001\Control\Class\{4D36E97D-E325-11CE-BFC1-08002BE10318}
HKLM\SYSTEM\ControlSet002\Control\Class\{4D36E96A-E325-11CE-BFC1-08002BE10318}
HKLM\SYSTEM\ControlSet002\Control\Class\{4D36E97D-E325-11CE-BFC1-08002BE10318}

9.) Go to C:\Windows\System32 and delete all Xen drivers

10.) Reboot and make sure there is no BSOD

11.) Run VMware Converter

vCenter 6.5 Appliance (VCSA) password recovery procedure failing

The other week I was resetting the root password on a few vCenter appliances. 2 out of the 3 appliances went well. The last one was not so easy. At first I figured maybe I had fat-fingered the password, but after a few retries with the same results I looked elsewhere.

A normal password recovery consists of:

  1. Restart your vCenter appliance and wait for the Photon OS splash screen during boot
  2. Hit the letter “E” on the keyboard to edit the GRUB menu
  3. Add the following to the end of the line that starts with linux: rw init=/bin/bash
  4. Hit the F10 key on your keyboard to boot
  5. At the root prompt, enter passwd (hit Enter) and set your new password (twice)
  6. Run: umount /
  7. Run: reboot -f

This is exactly what I did, and the new password still would not work after the reboot. I would enter root as the username, and when I entered the password I would see “account locked after x retries”. I then tried running pam_tally --reset --user root directly after resetting my root password during recovery (between steps 5 and 6), but I still had issues.

The final workaround was to try: pam_tally2 --reset --user root

To recap:

  1. Restart your vCenter appliance and wait for the Photon OS splash screen during boot
  2. Hit the letter “E” on the keyboard to edit the GRUB menu
  3. Add the following to the end of the line that starts with linux: rw init=/bin/bash
  4. Hit the F10 key on your keyboard to boot
  5. At the root prompt, enter passwd (hit Enter) and set your new password (twice)
  6. Run: pam_tally2 --reset --user root
  7. Run: umount /
  8. Run: reboot -f

Veeam Backup Failing (VSS_WS_FAILED_AT_PREPARE_SNAPSHOT) (Resolved)

I had a Veeam backup job that was failing with: Retrying snapshot creation attempt (Writer ‘Microsoft Hyper-V VSS Writer’ is failed at ‘VSS_WS_FAILED_AT_PREPARE_SNAPSHOT’. The writer experienced a non-transient error. If the backup process is retried, the error is likely to reoccur. --tr:Failed to verify writers state. --tr:Failed to perform pre-backup tasks.)

Researching this error online suggested the issue was on the host, but I didn’t believe that, as all of my other VMs were backing up daily without issue.

To play it safe I checked the host by running: vssadmin list writers

I received the following error on the host:

microsoft hyper-v vss writer non-retryable error

Looking further on the host’s event logs for the error I saw this:

At this point I was still convinced the host wasn’t at fault, since all the other VMs still backed up fine, so I logged onto the VM in question and ran: vssadmin list writers

I received the following on the VM:

sqlserverwriter non-retryable error

Looking into the event viewer I saw:

Researching these errors online, I found several solutions saying to delete the old backup software. Prior to Veeam, this server used another backup solution called Altaro, which I was pretty sure I had removed a long time ago. I checked Add/Remove Programs and verified Altaro wasn’t listed. I even checked the VSS writers for any other backup software and found nothing. Running out of ideas, I checked Windows Backup to make sure it wasn’t running and that no backup jobs were listed. I then looked in Task Scheduler and found a few manual backup jobs. I disabled and deleted these jobs, restarted the SQL VSS writer service, and verified it showed no errors after re-running vssadmin list writers. I then retried the Veeam backup, and it failed once again.

Re-running vssadmin list writers, I received the same error. I was now convinced this was tied to the old scheduled backup tasks I had removed.

Next, I tried: vssadmin delete shadows /all

After running that command, I received:

Error: Snapshots were found, but they were outside of your allowed context.  Try
 removing them with the backup application which created them.

After much more research, I found an outside-the-box way of deleting the snapshots on another site:

How to Fix “outside of your allowed context” Errors

In order to get rid of these kinds of shadows we need to apply a “trick”. Basically the VSS diff area storage is where VSS keeps these shadows “alive”.

By seriously cutting this limit to the bare minimum we invoke a mechanism in VSS itself that causes it to dump all shadows.

So we proceed by telling VSS to cut the limit down to 401 MB. For some reason the user interface will claim the minimum is 300 MB, but on several versions of Windows it refuses that value and reports:

Error: Specified number is invalid

The command that works uses 401MB and is (adapt it to your drive letter as needed):

vssadmin resize shadowstorage /for=D: /on=D: /maxsize=401MB  

*****I ran this against the C: and D: drive of my VM*****

Then once you get “success” you can increase the limit once again to the recommended “unbounded” setting, or an actual limit value if you are using shadow copies for other purposes:

vssadmin resize shadowstorage /for=d: /on=D: /maxsize=unbounded

*****I ran this against the C: and D: drive of my VM*****

Then, vssadmin happily reports:

Successfully resized the shadow copy storage association

and a quick check using

vssadmin list shadows

reveals all VSS shadow copies are now gone!

I then re-ran the Veeam backup job against the VM and it ran successfully!

Dell ESXi tweaks for EqualLogic SAN

Change the login timeout to 60
Turn off Delayed ACK

Run the following:
ethtool --pause vmnic0 autoneg off tx on rx on
ethtool --pause vmnic1 autoneg off tx on rx on
ethtool --pause vmnic2 autoneg off tx on rx on
ethtool --pause vmnic3 autoneg off tx on rx on

Then paste the above into: /etc/rc.local.d/local.sh

Dell ESXi script for ESXi hosts using EqualLogic:

esxcli storage nmp satp set --default-psp=VMW_PSP_RR --satp=VMW_SATP_EQL ; for i in `esxcli storage nmp device list | grep EQLOGIC|awk '{print $7}'|sed 's/(//g'|sed 's/)//g'` ; do esxcli storage nmp device set -d $i --psp=VMW_PSP_RR ; esxcli storage nmp psp roundrobin deviceconfig set -d $i -I 3 -t iops ; done
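
The part of that one-liner that is easiest to get wrong is the device ID extraction, so here it is pulled out on its own. This is a sketch only: the function name eql_device_ids is hypothetical, and it assumes the “Device Display Name” line has the device ID in parentheses as its seventh whitespace-separated field, which is what the awk '{print $7}' in the script relies on.

```shell
# Hypothetical helper: read "esxcli storage nmp device list" output on stdin
# and print the device ID of every EQLOGIC volume, without the parentheses.
eql_device_ids() {
  grep EQLOGIC | awk '{print $7}' | sed 's/[()]//g'
}
```

Running esxcli storage nmp device list | eql_device_ids on a host first is a low-risk way to confirm which devices the loop would touch before changing any path policy.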

VMware: How to add an RDP rule to the ESXi firewall via CLI

Sometimes I need to SSH to an ESXi host and tunnel RDP to an internal host across the SSH session.

Backup:
cp /etc/vmware/firewall/service.xml /etc/vmware/firewall/service.xml.bak

chmod 644 /etc/vmware/firewall/service.xml

chmod +t /etc/vmware/firewall/service.xml

Open service.xml in vi and add the following towards the bottom (below service id 0037):

<!-- MY RDP -->
<service id='0038'>
<id>myrdp</id>
<rule>
<direction>outbound</direction>
<protocol>tcp</protocol>
<porttype>dst</porttype>
<port>3389</port>
</rule>
<enabled>false</enabled>
<required>false</required>
</service>

chmod 444 /etc/vmware/firewall/service.xml

esxcli network firewall refresh

esxcli network firewall ruleset list

esxcli network firewall ruleset set -e true -r myrdp

Now feel free to RDP via the SSH tunnel. I usually disable the rule afterwards via:

esxcli network firewall ruleset set -e false -r myrdp
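
For context, the tunnel itself is ordinary SSH local port forwarding. The sketch below only builds the command line so it can be inspected before running. The function name, the host names and the local port 13389 are placeholders, and it assumes the ESXi host’s SSH service permits TCP forwarding.

```shell
# Hypothetical helper: print an ssh command that forwards a local port
# through the ESXi host to port 3389 on an internal machine.
#   first argument  = local port to listen on
#   second argument = internal RDP host
#   third argument  = ESXi host to jump through
rdp_tunnel_cmd() {
  printf 'ssh -N -L %s:%s:3389 root@%s\n' "$1" "$2" "$3"
}
```

For example, rdp_tunnel_cmd 13389 10.0.0.25 esxi01.example.local prints ssh -N -L 13389:10.0.0.25:3389 root@esxi01.example.local. With that tunnel up, the RDP client would connect to localhost:13389.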

How to power off an unresponsive VM via CLI

From: https://www.vladan.fr/esxi-5-unresponsive-vm-h/

Step 1 – connect via SSH (using PuTTY, for example) and enter esxtop.

Enter “esxtop”, then press “c” for the CPU resource screen and Shift + V to display VMs only.

Step 2 – changing the display and locating the LWID number

Press “f” to change the display fields and press “c” in order to show the LWID (Leader World Id) and press ENTER.

Step 3 – invoking “k” (kill) with the LWID number

Now that the LWID column is displayed, you can identify the VM you are interested in by its LWID number.

You can press “k” and enter the LWID number of the VM you want to stop. Note that this is a hard stop, so the next time the VM boots you’ll probably see the guest’s improper-shutdown recovery screen (depending on your guest OS, of course).

If this method doesn’t work, and you can’t vMotion the VM elsewhere and no other option works either, there might be a hardware problem with the host, which can lead to a PSOD.
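
esxtop is interactive. A non-interactive route, which is not part of the article referenced above, is esxcli vm process list followed by esxcli vm process kill --type=soft --world-id=<id>. The sketch below covers only the lookup step. The function name world_id_for is hypothetical, the VM names and IDs in any sample are made up, and it assumes each VM’s block starts with an unindented name line followed by an indented “World ID:” line.

```shell
# Hypothetical helper: read "esxcli vm process list" output on stdin and
# print the World ID of the VM whose unindented name line matches the
# first argument exactly.
world_id_for() {
  awk -v vm="$1" '
    /^[^ \t]/ { current = $0 }                          # unindented line: VM name
    /^[ \t]*World ID:/ && current == vm { print $3 }    # ID line inside that block
  '
}
```

Usage would look like esxcli vm process list | world_id_for "MY-VM". If a soft kill has no effect, the same esxcli command also accepts the hard and force types.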