Showing posts with label vCSA. Show all posts

Wednesday, October 30, 2013

vCSA 5.1u1b Login Timeouts

Had an interesting thing happen in an environment. I'm not sure which knob actually did the fix, so I thought I would point out the problem and all three modifications.
  • vCSA pointing directly to a Windows 2008 R2 AD DC/GC.
  • SysAdmins add a Windows 2012 AD DC/GC into the domain.
  • Timeouts on vSphere logins suddenly start: "The command has timed out as the remote server is taking too long to respond."
Frequently with the C# client but also with the web client. There are so many chatty logs that damned if I was going to be able to figure out what was going on by log checking. Trial and error was the quickest way to a solution.

Fix 1) We have a main *.domain.com and a *.ad.domain.com. Turns out that while my primary DNS servers and the VM's hostname at hostname.domain.com were all well and good, some part of LDAP (per /var/log/ldapmessages) was trying to contact hostname.ad.domain.com, which did not have an A record. I created the A record in the subdomain.
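A quick way to catch this kind of missing A record is to try resolving every name the appliance might be asked for. This is only a rough sketch: the function name and the injectable `resolve` parameter are my own, and the hostnames in the comment are the placeholders from this post.

```python
import socket

def unresolved_names(names, resolve=socket.gethostbyname):
    """Return the names that fail forward DNS resolution."""
    missing = []
    for name in names:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing

# Example call (placeholder names, substitute your own):
# unresolved_names(["hostname.domain.com", "hostname.ad.domain.com"])
```

Anything that comes back in the list needs a record created before you start chasing timeouts elsewhere.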

Fix 2) Swapped URL from port 636 (LDAPS) to 3269 (Secure Global Catalog). Same 2008 AD DC.

Fix 3) Bumped client timeout from default 30 seconds to 60 seconds.

Now, if you look at the readme for 5.1u1b it states pretty close to the top of the page that they've fixed timeout issues. Har. It was probably the LDAPS to SGC port change that fixed the issue, so if you are reading this I would start there.
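For reference, the port swap in Fix 2 is just a different number on the same server URL. A small sketch of the standard Active Directory ports follows; the helper name and the example hostname are made up for illustration.

```python
# Standard Active Directory LDAP ports, keyed by (secure, global_catalog).
AD_PORTS = {
    (False, False): 389,   # LDAP
    (True, False): 636,    # LDAPS
    (False, True): 3268,   # Global Catalog
    (True, True): 3269,    # Global Catalog over SSL
}

def identity_source_url(host, secure=True, global_catalog=False):
    """Build the server URL string for an identity source."""
    scheme = "ldaps" if secure else "ldap"
    return f"{scheme}://{host}:{AD_PORTS[(secure, global_catalog)]}"
```

With a hypothetical DC named dc01.ad.domain.com, `identity_source_url("dc01.ad.domain.com", global_catalog=True)` gives the Secure Global Catalog form I ended up on.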

Also, while you're looking through the readme files, look at 5.1u1c. How many items are fixed? 1.
How many known issues are there? 99. How many of those known issues are listed as new? 2/99.  Make your own choice if you want to update to 5.1u1c or just wait for an upgrade that actually does something.

Wednesday, October 2, 2013

Update to latest 5.1 build on vCSA

Thanks to Virtual Aspects for a smooth second attempt at updating to the latest build of the 5.1 vCSA. My search term of vami-sfcb was surprisingly good. After watching vami-sfcb fumble in the logs repeatedly I was just trying to find out what the service did... but there was the fix.

Sunday, August 25, 2013

vCSA 5.1u1b saves the day

Windows shop install of a "simple install" vCenter. Piece of cake and the local IT staff should be able to manage it in the long term. Except they don't put the new vCenter server into DNS and I don't realize it. >_<

"Error 29155.Identity source discovery error" but install of SSO continues so I think it's something we can work with. Turns out, not so much but it was a pair of bad decisions by me that turned into a 5 hour headache and wanting to do bodily harm to the manager who decided that SSO should be required for all vCenter deployments.

Hindsight #1: It's a VM. Use snapshots before mucking with things or installing. "Nah, it'll be fine, simple install...right?"

Hindsight #2 and a bit: Do not uninstall SSO. Do not try to re-install everything from the "simple vCenter install" and expect it to work. Clean system or nothing.

vCSA 5.1u1b (which the local IT staff won't know what to do with) was downloaded, installed, configured in about an hour. Setting the appliance for AD auth automatically set the AD domain in SSO (although with insecure settings) but it worked and I didn't have to muck with it. Saved the day.

So thank you VMware for investing time into the vCSA and screw you for bad SSO error messages (or just SSO in general) and uninstall procedures that don't clean the system correctly. You cost me an entire Saturday afternoon and into the evening. Should I manually check DNS for the new vC server that I was handed? Yes. Should the SSO installer have continued upon error? NO.

Thursday, January 3, 2013

vCSA update 5.1.0a to 5.1.0b

Just a few hiccups when upgrading to 5.1.0b... Nothing like the koolaid I had to mix up (here and here) to get from 5.0U1a to 5.0U1b to 5.1a.
I read the release notes and rebooted the VM at the end of the update.

Then nothing worked.
WTF?

I logged into the console of the VM and saw that it had reverted to the temporary IP that I gave the appliance when I initially made the upgrade from 5.0 to 5.1. This may be due to the vApp options -> Properties still having the temp IP... I just changed it to the correct IP in /etc/sysconfig/network/ifcfg-eth0 (then ifdown eth0 / ifup eth0).

This worked for the main services but not, it seems, for SSO. I couldn't log in to the web client, and the error message clearly showed that it was trying to talk on the temp IP. I expect another reboot (or actually a shutdown/startup so that I can change the vApp settings) needs to happen for vmware-sso to start listening on the right address (restarting vmware-sso did nothing).

/etc/vmware-sso/ls_url.txt shows the correct IP (lookup service)
/etc/vmware-vpx/vpxd.cfg did not, so I edited it. I restarted vmware-vpxd and still couldn't log in... a restart of the vmware-client service did the trick (and while it's called vmware-client, it did not kill active sessions with the C# client).

So I'm not sure if my editing of the vpxd file and the restart of vmware-client was needed or just the restart of vmware-client... at least this post will point someone in the right direction if they have the same issue.
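If you hit the same thing, it may be quicker to hunt for the stale temp IP across the config files in one go rather than opening them one at a time. A minimal sketch, assuming you know the temp IP; the function name is my own and the file list is just the files mentioned in this post.

```python
import re
from pathlib import Path

def lines_mentioning_ip(ip, paths):
    """Map each readable file to the line numbers that contain the exact IP."""
    # Lookarounds keep 10.0.0.5 from matching inside 10.0.0.50 or 110.0.0.5.
    pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?!\d)")
    hits = {}
    for p in paths:
        try:
            text = Path(p).read_text(errors="replace")
        except OSError:
            continue  # missing or unreadable file: skip it
        found = [n for n, line in enumerate(text.splitlines(), 1)
                 if pattern.search(line)]
        if found:
            hits[str(p)] = found
    return hits

# Example call on the appliance (the IP is a placeholder):
# lines_mentioning_ip("10.0.0.5", [
#     "/etc/sysconfig/network/ifcfg-eth0",
#     "/etc/vmware-sso/ls_url.txt",
#     "/etc/vmware-vpx/vpxd.cfg",
# ])
```

It only reports where the old address shows up. Which service needs a restart afterwards is still trial and error.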

The other fun thing was that the ESXi hosts disconnected from vCenter and needed to be manually connected back. Not sure if this was because of the IP address change or something else.


Tuesday, November 20, 2012

vCSA SSO Secret Active Directory Handshake

---
The test connection button was working for me with "reuse session" but I was getting "LDAP: error code 8 - 00002028: LdapErr: DSID-0C0901FC, comment: The server requires binds to turn on integrity checking if SSL\TLS are not already active on the connection" and "Failed to serialize response" when I actually said to use that configuration.

Error messages were in /var/log/vmware/sso/ssoAdminServer.log 

---
vCSA settings that are working for me:

Primary server URL:
ldaps://FQDN:636
Base DN for users:
<Only the domain, no other DN/OUs> DC=ad,DC=company,DC=com
Domain name:
ad.company.com
Base DN for groups:
<same as users>

cert (*.cer) created for me from the FQDN AD/GC server listed in Primary server URL.
(VMware KB pointing to MS KB)

Auth type:
Password
Created a "service account" user to connect.


The client needs the full ad.company.com\user, not shorthand domain to get in.

Saturday, November 10, 2012

vCSA upgrade from 5.0u1b to 5.1a Part II

First time running VMware's vCenter Server Appliance upgrade, it failed miserably because my vCSA was an rsyslog destination (initial failure here).

So I waited a bit to try again, and hey look, there's an "a" update to vCenter 5.1 for upgrade bugs!

Here are my basic steps so that I didn't fail again in the upgrade:
  • Stop rsyslog from hosts
  • rm -rf on the contents of /opt/vmware/var/log/remote 
    • Obviously you could copy them to another system if you needed to keep them around, lucky for me this is a dev environment.
  • TAKE SNAPSHOT NOW
  • Export your permission set unless you just let anyone get into your vCSA..
  • Grab new appliance ova, install and set for another IP in the same subnet as the 5.0 appliance 
  • Start up and go through the key exchange under the "upgrade" choice on the 5.1 appliance
  • Because of my previous dumping of tasks/events/history the upgrade took just a few minutes
  • The source 5.0 appliance is powered down and the 5.1 appliance takes over the IP of the original (the root password was duplicated as well, that was nice). I think it rebooted as well. 
  • I needed to reauth to the AD domain (removed old computer account, add this one)
  • My AD permissions were not copied over in the transfer, blank slate with only root access to anything. 
  • I needed to poke Active Directory LDAP settings  to be able to get SSO working and then I could import the custom permission sets and AD groups back into the appliance. 
Whew. Better than last time? Maybe. The fact that the vCSA is *still* a linux VM without NTP running but connected to AD is a big miff for me. I'll save that for the other post I'm dreaming up..

root ssh on vCSA (whoops!)






I'm going to make a post with some hardening and vmware-linux-best-practices-that-vmware-has-not-done of VMware's vCenter Server Appliance (5.1), but only after I stop locking myself out...

If you turn off root ssh before creating another real user in the appliance, do not be alarmed. You can head into the admin web interface and change the setting on the Admin tab (Toggle SSH setting). The Administrator SSH logon enabled will change from no to yes (assuming you haven't locked yourself out of the web admin as well). :)

Oh, and the new skitch sucks.

Friday, November 9, 2012

vCSA 5.1 SSO LDAP issues

Working with the vSphere vCenter Server Appliance 5.1 (build 880472, the "a" edition) I could not get straight LDAP (not LDAPS) to work for Single Sign On. Was not happening. Not with Anonymous, not with username, not with reuse session. Anonymous was just broken, but it looks like this is by design with current AD and is not recommended (by anyone, anywhere, but if you search for it you can find out how to do it).
Error message is [LDAP: error code 8 - 00002028: LdapErr: DSID-0C0901FC, comment: The server requires binds to turn on integrity checking if SSL\TLS are not already active on the connection, data 0, v1db1].
VMware says "Hey crazy! Get your SSL certs in order!" (Drink the kb here).
I said "That's great VMware, what are my other options?" Turns out I answered my own question year(s) ago!  (Old Koolaid here)

That's right, just turn off the LDAP signing requirement:
Computer configuration - Policies - Windows Settings - Security Settings - Local Policies/Security Options - Domain Controller: LDAP server signing requirements (None/Require signing/Undefined[which is the same as None])  -- Change to None

Not as secure, but I could import my users/groups into the new appliance (luckily had a backup) without making other changes (create cert template, export, blah blah) that my local domain admins didn't want to do on a Friday afternoon. 

See updated secret sso cert handshake post for secure settings that worked for me

Straight (insecure) LDAP settings, top to bottom: server URL, base DN, domain name, authentication type.
ldap://<fqdn of an active directory server>:389
CN=Users,DC=ad,DC=<part2>,DC=<part3>
full domain name
reuse session

The default "Users" OU is listed as a CN. If you point to another root OU in your domain, you would use OU=<blah> instead of CN. You can also screw the CN/OU and pull everything from the domain as well by putting in just the DC parts.
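The DC parts of the base DN are just the domain name split on dots, so they can be derived rather than typed by hand. A small sketch; the function name and the optional `container` argument are mine.

```python
def base_dn(domain, container=None):
    """Turn a DNS domain name into an LDAP base DN, optionally under a CN=/OU= container."""
    parts = [f"DC={label}" for label in domain.strip(".").split(".") if label]
    if container:
        parts.insert(0, container)
    return ",".join(parts)
```

`base_dn("ad.company.com")` pulls everything from the domain, while `base_dn("ad.company.com", "CN=Users")` limits it to the default Users container. For your own root OU you would pass something like `"OU=<blah>"`.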




Tuesday, September 18, 2012

vCSA upgrade from 5.0u1b to 5.1

First try on upgrading to the new 5.1 appliance and a big fail with a filled destination 5.1 vCSA root partition. (( Upgrade that worked here ))

If you have configured remote syslog on your source appliance, this routes to a /new directory on the destination appliance during the upgrade (not /storage/log). Unlucky for us, VMware chose not to give this /new directory its own mount point.

My workaround is to create a new 20G drive and mount it at /new before starting.
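Before kicking off the upgrade it is worth confirming that the directory really is its own mount point and has room, since a plain directory would still fill the root partition. A rough sketch using only the Python standard library; the function name is made up.

```python
import os
import shutil

def mount_report(path):
    """Report whether path is a mount point and how much space is free under it."""
    return {
        "is_mount": os.path.ismount(path),
        "free_gib": shutil.disk_usage(path).free / 2**30,
    }

# On the destination appliance you would call mount_report("/new")
# and expect is_mount to be True with roughly 20 GiB free.
```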

Oh, and take a snapshot BEFORE importing/swapping keys between source and destination vCSAs..

I'll update if I come across any other gotchas.


I've pinged VMware's kb with this, hopefully they'll get something up quickly. As for me...to last night's backup I go. Good thing this isn't production. 

Thursday, September 6, 2012

vCSA DB2 shenanigans before upgrade

My initial post on upgrading the vCenter appliance to 5.0u1b build 804277 gave some good tips that I found around and figured out myself.
But I'll tell you, dear reader, that the upgrade process kept failing for me. Too much db2 export which I narrowed down to tasks & events.

I had set tasks and events to be kept for 180 days (this might be the default). I bumped this down to 30 days hoping that the rows would be deleted and then my upgrade would work again. This didn't seem to happen. So this post is about clearing your embedded db2 database of history stats, tasks and events before upgrade. Note that it doesn't look like anything in the db2 cli is case sensitive.

I should mention that I am *not* a DBA and that this could hose your installation. It will most definitely hose your stats, events and tasks, so this process might not be a good idea in your production environment. The vCSA is a VM: Use snapshots to your advantage here.
  • SSH into the appliance as root. Turn off vcenter services:
# service vmware-vpxd stop
  • Change your login to db2inst1, get into the db2 cli, connect to the vCenter database:
# su - db2inst1
~> db2
db2 => connect to VCDB
  • I couldn't remember if I had bumped the transaction logsize on this appliance, so I did the command anyway: 
db2 => update db configuration FOR VCDB USING logprimary 16 logsecond 112 logfilsiz 8192
  • Delete contents of the history tables:
TRUNCATE TABLE vc.vpx_hist_stat1 IMMEDIATE
TRUNCATE TABLE vc.vpx_hist_stat2 IMMEDIATE
TRUNCATE TABLE vc.vpx_hist_stat3 IMMEDIATE
TRUNCATE TABLE vc.vpx_hist_stat4 IMMEDIATE
TRUNCATE TABLE vc.vpx_sample_time1 IMMEDIATE
TRUNCATE TABLE vc.vpx_sample_time2 IMMEDIATE
TRUNCATE TABLE vc.vpx_sample_time3 IMMEDIATE
TRUNCATE TABLE vc.vpx_sample_time4 IMMEDIATE
  •  Deleting the contents of the tasks & events is not as straightforward. There are foreign keys associated with these tables that need to be removed before you can delete the rows and then re-create the foreign keys. 
    • See how many rows of data: 
select count(*) from vc.vpx_event
select count(*) from vc.vpx_event_arg
select count(*) from vc.vpx_task
    • Remove foreign keys: 
alter table vc.VPX_EVENT_ARG drop constraint FK_VPX_EVENT_ARG_REF_EVENT
alter table vc.VPX_EVENT_ARG drop constraint FK_VPX_EVENT_ARG_REF_ENTITY
alter table vc.VPX_ENTITY_LAST_EVENT drop constraint FK_VPX_LAST_EVENT_EVENT
alter table vc.VPX_EVENT drop constraint FK_VPX_CHANGE_TAG
alter table vc.VPX_EVENT drop constraint FK_VPX_EVENT_REF_COMPUTERES
alter table vc.VPX_TASK drop constraint FK_PARENT_TASK_REF
alter table vc.VPX_TASK drop constraint FK_VPX_TASK_CHANGE_TAG
alter table vc.VPX_TASK drop constraint FK_VPX_TASK_REF_ENTITY
    • Delete data: 
truncate table vc.VPX_TASK immediate
truncate table vc.VPX_ENTITY_LAST_EVENT immediate
truncate table vc.VPX_EVENT immediate
truncate table vc.VPX_EVENT_ARG immediate
    • Re-add foreign keys: 
alter table vc.VPX_EVENT_ARG add constraint FK_VPX_EVENT_ARG_REF_EVENT foreign key(EVENT_ID) references vc.VPX_EVENT (EVENT_ID) on delete cascade
alter table vc.VPX_EVENT_ARG add constraint FK_VPX_EVENT_ARG_REF_ENTITY foreign key (OBJ_TYPE) references vc.VPX_OBJECT_TYPE (ID)
alter table vc.VPX_ENTITY_LAST_EVENT add constraint FK_VPX_LAST_EVENT_EVENT foreign key(LAST_EVENT_ID) references vc.VPX_EVENT (EVENT_ID) on delete cascade
alter table vc.VPX_EVENT add constraint FK_VPX_CHANGE_TAG foreign key(CHANGE_TAG_ID) references vc.VPX_CHANGE_TAG(CHANGE_TAG_ID) on delete set null
alter table vc.VPX_EVENT add constraint FK_VPX_EVENT_REF_COMPUTERES foreign key(COMPUTERESOURCE_TYPE) references vc.VPX_OBJECT_TYPE(ID)
alter table vc.VPX_TASK add constraint FK_PARENT_TASK_REF foreign key(PARENT_TASK_ID) references vc.VPX_TASK(TASK_ID)
alter table vc.VPX_TASK add constraint FK_VPX_TASK_CHANGE_TAG foreign key(CHANGE_TAG_ID) references vc.VPX_CHANGE_TAG(CHANGE_TAG_ID) on delete set null
alter table vc.VPX_TASK add constraint FK_VPX_TASK_REF_ENTITY foreign key(ENTITY_TYPE) references vc.VPX_OBJECT_TYPE(ID)
    •  Your tables are now empty:
select count(*) from vc.vpx_event
select count(*) from vc.vpx_event_arg
select count(*) from vc.vpx_task
  •  quit from db2 cli, exit from db2inst1 user, restart vcenter services to see if everything powers back up: 
=> quit
~> exit
# service vmware-vpxd start
  •  Log into the vSphere Client to check that everything looks good. 
  • Try your upgrade again. For me, just truncating the history stats still failed after exporting the data for ~18 hours. Yeah. 18 hours and then I had to revert to snap. Dumping the events/tasks was the only way this was ever going to work. 
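The eight history-table statements above follow an obvious pattern, so if you would rather paste them than retype them, they can be generated. This sketch only builds the text of the statements for the tables named in this post; it does not connect to DB2, and the names `HISTORY_TABLES` and `truncate_statements` are my own.

```python
# vc.vpx_hist_stat1..4 followed by vc.vpx_sample_time1..4, as listed above.
HISTORY_TABLES = [
    f"vc.vpx_{kind}{n}"
    for kind in ("hist_stat", "sample_time")
    for n in range(1, 5)
]

def truncate_statements(tables=HISTORY_TABLES):
    """Return one TRUNCATE statement per table, ready to paste into the db2 cli."""
    return [f"TRUNCATE TABLE {table} IMMEDIATE" for table in tables]

# print("\n".join(truncate_statements())) shows the full list.
```

The tasks and events tables are left out on purpose, since those need the foreign keys dropped and re-created by hand as described.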
Thanks to vfrank, juanma and IBM's db2 documentation.

Saturday, July 28, 2012

vCSA upgrade tips & recommendations

Upgrading the vCenter Server Appliance looks simple enough. If you run into problems, maybe these tips can help:

  • Create the 20G (yes 20G) disk described in the kb article
  • Take a snapshot first (shut down services)
  • Really. TAKE A SNAPSHOT FIRST
  • If the upgrade does not go smoothly the first time, do not try again with a half-broken install. Revert to snap.
  • If your upgrade breaks during the export, check your free inodes (df -i)
  • Increase the number of inodes on the /storage/db/export partition (mkfs -t ext2 -i 8192 /dev/sdc1 )
  • Problems mounting /dev/sdc1 (doesn't like ext2)? tune2fs -O has_journal /dev/sdc1
  • tail -f /opt/vmware/var/log/vami/updatecli.log during the upgrade. This file is created a minute or two after you kick off the upgrade. 
  • There will be a long (potentially hours long) pause in the log while the database export is happening. Top will show db2bp, kjournald and db2sysc in use. Do not kill any of these processes. 
  • You didn't take my advice and create a snapshot? You needed to use a restored backup? You might need to remove the /etc/udev/rules.d/70-persistent-net.rules mac addresses if the appliance says it can't bring up eth0. 
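The inode check from the list above can also be scripted if you want to watch it during the export. A minimal sketch using `os.statvfs` (Unix only); the function name is mine, and the path in the comment is the export partition mentioned above.

```python
import os

def inode_usage_percent(path):
    """Return the percentage of inodes in use on the filesystem holding path."""
    st = os.statvfs(path)
    if st.f_files == 0:
        return None  # filesystem does not report inode counts
    used = st.f_files - st.f_ffree
    return 100.0 * used / st.f_files

# On the appliance: inode_usage_percent("/storage/db/export")
```

If the number creeps toward 100 during the export, that is the same problem `df -i` would show.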

Thursday, March 29, 2012

DB2 to PostgreSQL vCSA change

The vCenter Server Appliance in the not-released-yet 5.0.1 (5u1) version will be switching from embedded DB2 to PostgreSQL. ARG. Drink up
Is this because the transaction log settings and otherwise working with DB2 to truncate existing tables have been an issue? I sure can't find any detailed information online about the inner workings of the DB2 db. Hopefully smoother sailing than I've had so far.


Tuesday, October 18, 2011

vSphere 5 vCenter Appliance first take

UTC. Do not change the timezone on the appliance or the embedded DB2 database will not start. 

Also watch for the DB2 logsize: http://kb.vmware.com/kb/1005259
And the Likewise AD auth needs a domain admin account, but it won't put the hostname in correctly if it's longer than 15 chars. 
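The 15 character limit matches the NetBIOS computer name length, so it is easy to check a planned hostname before joining the domain. A tiny sketch; the function name and the example names in the note below are invented.

```python
NETBIOS_MAX_LEN = 15  # maximum length of a NetBIOS computer name

def short_name_fits(fqdn):
    """True if the host part (before the first dot) is 15 characters or fewer."""
    short_name = fqdn.split(".", 1)[0]
    return len(short_name) <= NETBIOS_MAX_LEN
```

For example, a hypothetical `vcsa01.ad.company.com` passes, while `vcenter-appliance-01.ad.company.com` does not.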

Bugs to work out? Nahhhhh..

Other than those...few..gaping holes, I think VMware's got something going for it.