Wednesday, July 3, 2013
I have a stack of oldish Dell PowerEdge R805 servers that have been running ESX and then ESXi 5.0 on a standard hard drive configuration. Now that they'll no longer be in production use, I was giving them a once-over and saw they had an internal 1GB SD card that was disabled in the BIOS.
Hmmm.
I decided to turn it on and install ESXi 5.1 on the SD card. I'm pretty sure the version of ESXi preinstalled was the OEM embedded Dell 3.0 version of ESXi. The 5.1 installer throws quite the fit when checking what's already on the card, but it looks like it installs and runs just fine. Mostly python and libparted complaints; see screenshot.
"The selected storage device contains ESX(i). The version could not be determined. Only installation is possible." Fine by me.
Friday, June 14, 2013
root cause analysis log hunting
I was asked to unravel an all-too-common filled-datastore incident and the local IT staff's response to it, a response that ultimately made the company pull a critical server from backup.
My problem? "Task: Delete file" from the datastore browser.
What file?
Looks like the verbose and trivia[l] vpxa.log would have this information... but the logs I have don't go quite far enough back. vpxa is CHATTY. Search the log for nfc.
Now, I was almost lucky in this case because the datastore was only used by two hosts. In a larger environment, which ESXi host's log would have this info? Hopefully whichever host the guest was associated with, but again... unraveling which host a guest was on at a particular point could be time-consuming.
Hopefully I have better luck next time and the client asks for this analysis closer to when it happened!
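For next time, the hunt itself is quick once the logs exist. A sketch of what that grep looks like, using an invented sample line (real vpxa.log NFC delete entries will look different, and vary by build):

```shell
# Build a scratch log with a made-up NFC delete entry for illustration;
# do not take the message format as gospel for any real vpxa build.
log=$(mktemp)
cat > "$log" <<'EOF'
verbose vpxa[12345] [VpxaNfc] op=deleteFile file=[datastore1] vm01/vm01.vmdk
verbose vpxa[12345] [VpxaHalCnx] heartbeat ok
EOF
# The hunt: narrow to NFC traffic, then to delete operations.
hits=$(grep -i nfc "$log" | grep -i delete)
echo "$hits"
rm -f "$log"
```

On a real host you'd run the same two greps against /var/log/vpxa.log and its rotated copies.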
Saturday, April 27, 2013
5.1u1 rant. Too much fail in readme
5.1U1 might not work if you're attached to lots of AD groups, so use Windows passthrough
Oh, sorry. I'll do that now... Oh wait.
I can't log in with Windows passthrough because of SSO config issues. NetBIOS? REALLY?
Or maybe my login issue is that SSO seems to be hanging onto only the UUID and not the username, so when my account was deleted I got locked out.
Or I use the vCSA and it doesn't allow Windows passthrough (<--see comment for KB fix).
Or I (some would say correctly) set up an SSO load-balanced HA configuration, and it doesn't allow Windows passthrough either. ARG.
I've been reading the upgrade readmes from VMware for years now and none of them seem to have as much fail as this one. No Windows Auth anymore for SQL Server? What?
That's ok I guess, since I couldn't install the schema in the first place.
So many of the issues, workarounds, or otherwise identified problems have to do with special characters and non-English strings. Why is there no standard lib for handling those crazy double quotes in passwords? Really!?!
Who the hell is coding this? I've been to VMware HQ...which building basement houses those monkeys?
Enjoy. It would make for a good drinking game whenever SSO, special characters or database issues are mentioned. Please, not all three at once...you'll get alcohol poisoning.
Monday, January 28, 2013
Can't even install vSphere DP? This is not a good sign
Downloaded the current 5.1.1 (build 5.1.56.179) of the vDP 0.5TB OVA.
From the C# Windows client, I was able to get through the OVF wizard; it fails on creation of disk 1 of 4:
Unable to access file:
<path>\vSphere_Data_Protection_-_0.5TB-disk1.vmdk.
Unable to connect to the remote server
On the web client, after RDP'ing to a Windows host that had access on port 9443, installing Flash, and installing the client integration plugin (arg), the wizard wouldn't let me past Select Storage. I hit Next, it went back to Select Storage. I hit Setup Network, it went back to Select Storage.
I think the issue here is that the 0.5TB appliance is actually more like 900GB when thick-provisioned, and the largest datastore I have in this environment is 700GB (and yes, it's empty for this purpose).
I am choosing thin provision during the wizard but obviously this isn't good enough for vDP.
From the horror stories in the community forums I'm just going to say screw it.
This is a big issue when I move my production vCenter to 5.1. I'm only backing up 5 VMs with vDR in prod (why? because it's unreliable) but at least I can install (and re-install and re-install) it.
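The size mismatch above is easy to sanity-check before even launching the wizard. A back-of-envelope sketch, with the sizes (in GB) assumed from the numbers in this post:

```shell
# Assumed figures from the post: ~900G thick-provisioned for the 0.5TB
# appliance, 700G free on the biggest available datastore.
APPLIANCE_THICK_GB=900
DATASTORE_FREE_GB=700
if [ "$DATASTORE_FREE_GB" -lt "$APPLIANCE_THICK_GB" ]; then
  verdict="too small: need ${APPLIANCE_THICK_GB}G thick, have ${DATASTORE_FREE_GB}G"
else
  verdict="fits"
fi
echo "$verdict"
```

The wizard offering thin provisioning apparently doesn't change what the deployment actually requires here, which is the whole complaint.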
Thursday, January 3, 2013
vCSA update 5.1.0a to 5.1.0b
Just a few hiccups when upgrading to 5.1.0b... Nothing like the koolaid I had to mix up (here and here) to get from 5.0U1a to 5.0U1b to 5.1a.
I read the release notes and rebooted the VM at the end of the update.
Then nothing worked.
WTF?
I logged into the console of the VM and saw that it had reverted to the temporary IP that I gave the appliance when I initially made the upgrade from 5.0 to 5.1. This may be due to the vApp options -> Properties still having the temp IP... I just changed it to the correct IP in /etc/sysconfig/network/ifcfg-eth0 (then ifdown eth0 / ifup eth0).
This worked for the main services but not, it seems, for SSO. I can't log in to the web client; the error message clearly shows that it's trying to talk on the temp IP. I expect another reboot (or actually a shutdown/startup so that I can change the vApp settings) needs to happen for vmware-sso to start listening on the right address (restarting vmware-sso did nothing).
/etc/vmware-sso/ls_url.txt shows the correct IP (lookup service)
/etc/vmware-vpx/vpxd.cfg did not. I restarted vmware-vpxd and still couldn't log in... a restart of the vmware-client service did the trick (and while it's called vmware-client, it did not quit active sessions with the C# client).
So I'm not sure if my editing of the vpxd file and the restart of vmware-client was needed or just the restart of vmware-client... at least this post will point someone in the right direction if they have the same issue.
The other fun thing was that the ESXi hosts disconnected from vCenter and needed to be manually reconnected. Not sure if this was because of the IP address change or something else.
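If you hit the same thing, one way to find every config file still holding the temporary address is a recursive grep. A sketch against a scratch tree (the IP and file contents are invented; on a real vCSA you'd point it at /etc/vmware-sso and /etc/vmware-vpx):

```shell
OLD_IP="10.0.0.99"                 # placeholder for the temporary IP
demo=$(mktemp -d)                  # scratch tree standing in for /etc
mkdir -p "$demo/vmware-vpx"
# Invented file contents, just so the grep has something to find.
printf 'serverIp = %s\n' "$OLD_IP" > "$demo/vmware-vpx/vpxd.cfg"
# List every file under the tree still referencing the old address.
stale=$(grep -rl "$OLD_IP" "$demo")
echo "$stale"
rm -rf "$demo"
```

Each hit is a candidate for editing and a service restart, which would have saved me the guess-and-check above.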
Thursday, December 13, 2012
ESXi 3.5 build 207095 to 5.0 current build steps
A client has the free version of ESXi 3.5 using local storage and wanted to move up to a current version of ESXi. I wasn't able to find a one-stop shop that said how to get from A -> E, and for a while there I thought I could at least skip 4.0... nope. If you're not using local storage, please do yourself a favor and just do a fresh install.
It's possible that there's a slightly more efficient way to do this, but this was the way I was guaranteed not to lose the local datastore data. vmware.com was giving me some grief when I was trying to finish this post, so I don't have links up to some of the packages. You should be able to search around and find them.
Things you'll need:
- vMA (current version is fine)
- vSphere Client 4.0 with Host Update Utility (here VMware-viclient-all-4.0.0-208111.exe)
- vSphere 4.0 update 1 upgrade package (here ESXi-4.0.0-1.9.208167-upgrade-release.zip)
- vSphere 4.0 to 4.1 upgrade package (here upgrade-from-esxi4.0-to-4.1-update03-800380.zip)
- vSphere 5.0 iso (standard one here VMware-VMvisor-Installer-5.0.0.update02-914586.x86_64.iso)
- vSphere 5.0 current build (here) if needed
Things you might want:
- vSphere 4.0 upgrade guide (here)
- vSphere 4.1 upgrade guide (here)
Ready?
- SCP the 4.1 patch to your vMA
- SCP the 5.0 patch to the ESXi datastore *
- Log in to vMA
- Make a backup copy of your host config via vMA
vicfg-cfgbackup --server <ESXi-host-ip> --portnumber <port_number> --protocol <protocol_type> --username username --password <password> -s <backup-filename>
- Check that there's a scratch partition set up on your host via vMA
vicfg-advcfg --server <ESXi-host-ip> --username root -g ScratchConfig.ConfiguredScratchLocation
if not found:
vifs --server <ESXi-host-ip> --username root --mkdir "[datastore1] .locker-esxtest"
vicfg-advcfg --server <> --username root -s /vmfs/volumes/datastore1/.locker-esxtest ScratchConfig.ConfiguredScratchLocation
- Shut down VMs and put the host into maintenance mode
vicfg-hostops --server <ESXi-host-ip> --operation enter
- Start up the Host Update Utility, add your host and patch with the 4.0u1 upgrade package.
-- Yeah. 4.0u1. Not the current version of 4.0; that gave back an error that it wasn't compatible.
- Wait a bit. I'm not sure what files are created when, but there was a directory called /vmupgrade that was created and then deleted while the HUU stayed at 3%.
- One or two reboots. My test ESXi system rebooted at 3% and 34%; prod ESXi only did the reboot at 34%.
- ESXi reboots to 4.0. HUU will eventually say that the upgrade failed when installing packages after the initial base reboot. Do not fret... keep moving.
- You can log in via the vSphere Client & test that everything still looks good.
- Log into vMA and run vihostupdate with the 4.1 upgrade against the host.
vihostupdate --server <ESXi-host-ip> -i -b <zip bundle>
- Reboot again.
- Test again after starting up into 4.1.
- Reboot the system and boot to the 5.0 CD.
- The installer will see your current install and happily upgrade.
- And another reboot. I hope your server doesn't have so much memory that the boot-time memory check takes forever..
- Back to vMA and run the esxcli command to patch
esxcli software profile update -d <depot url or offline bundle zip file> -p <profile_name>
- Reboot
- Yes, you are done. Luckily, ESXi patches are cumulative, so once you're at a major revision you only need to grab the most recent patch.
* If, like me, you've drawn a blank on how to get ssh enabled on ESXi 3.5:
From console:
- Alt-F1, type "unsupported"
- enter root pw
- vi /etc/inetd.conf -> uncomment the #ssh line
- ps | grep inetd
- kill -HUP <processID>
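The footnote's console steps boil down to one edit and one signal. A sketch (the demo edits a scratch stand-in for /etc/inetd.conf, and the sample ssh line is invented; the busybox tools on a real 3.5 console may not support the same sed options):

```shell
conf=$(mktemp)                         # scratch stand-in for /etc/inetd.conf
# Invented sample of a commented-out ssh service line.
echo '#ssh stream tcp nowait root /sbin/dropbearmulti dropbear -i' > "$conf"
sed -i 's/^#ssh/ssh/' "$conf"          # "unhash" the ssh entry
enabled=$(grep '^ssh' "$conf")
echo "$enabled"
rm -f "$conf"
# On the real host, finish with: ps | grep inetd, then kill -HUP <processID>
```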
Tuesday, November 20, 2012
vCSA SSO Secret Active Directory Handshake
---
The test connection button was working for me with "reuse session" but I was getting "LDAP: error code 8 - 00002028: LdapErr: DSID-0C0901FC, comment: The server requires binds to turn on integrity checking if SSL\TLS are not already active on the connection" and "Failed to serialize response" when I actually said to use that configuration.
Error messages were in /var/log/vmware/sso/ssoAdminServer.log
---
vCSA and working for me:
Primary server URL:
ldaps://FQDN:636
Base DN for users:
<Only the domain, no other DN/OUs> DC=ad,DC=company,DC=com
Domain name:
ad.company.com
Base DN for groups:
<same as users>
cert (*.cer) created for me from the FQDN AD/GC server listed in Primary server URL.
(VMware KB pointing to MS KB)
Auth type:
Password
Created a "service account" user to connect.
The client needs the full ad.company.com\user format, not the shorthand domain, to get in.
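One note on the Base DN fields above: for a base DN that is "only the domain," it's just the AD domain name rewritten as DC components. A quick sketch of the mapping (plain string manipulation, nothing vCSA-specific):

```shell
# Turn "ad.company.com" into "DC=ad,DC=company,DC=com":
# prefix the first label with DC=, then replace every dot with ,DC=
domain="ad.company.com"
base_dn=$(printf '%s\n' "$domain" | sed 's/^/DC=/; s/\./,DC=/g')
echo "$base_dn"   # DC=ad,DC=company,DC=com
```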