My ramblings about all things technical


VMware vSphere: Manage for Performance Course Experience

Last week I was fortunate enough to attend the VMware vSphere: Manage for Performance course. I did the lab related to this course at VMworld Europe last year, and in my now increasing preparations for the VCAP-DCA exam I knew I needed to strengthen my troubleshooting skills and, more importantly, fine-tune my ESXTOP/RESXTOP skills. Quite a few people commented that they really liked my VMware vSphere: Manage and Design for Security Course Experience posting, so I thought I would try to do the same for this one for anyone interested in or thinking about booking the course.

  1. Day one covered the first three and a half modules: Course Introduction, Performance in a Virtualized Environment, Virtual Machine Monitor and part of CPU Performance. It introduced all the monitoring tools you can use, starting with the performance graphs in vCenter and ESXTOP, how to utilise these tools to work out possible problems and which values are good indicators of problems. Probably the thing I enjoyed most about this course was that there were loads of labs, so you learn how to do it all yourself rather than learning it off a PowerPoint slide or your course guide. If you are a regular reader of this blog then you’ll know I’ve been trying to perfect my knowledge of ESXTOP/RESXTOP and wrote a blog posting about it too, “Understanding and using ESXTOP/RESXTOP”. That learning was a great stepping stone for the skills covered in the course, and for a few parts the links and resources in my posting gave me an even deeper knowledge of ESXTOP. The Virtual Machine Monitor module covered software and hardware virtualisation techniques, which I knew fairly well from studying for my VCP exams, and the labs for it were really great in showing how using these varying techniques can help the performance of your virtual machines/environment. Next we got into a bit of the CPU Performance module, which introduced the CPU scheduler, CPU cache contention and NUMA. As with the hardware/software virtualisation techniques, I had a good bit of knowledge about the CPU scheduler and NUMA from my VCP studies, but it was a great refresher on NUMA particularly and allowed me to better understand how it works and how the misallocation of resources across NUMA nodes can impact your virtual machines.
Frank Denneman has done two brilliant postings, Sizing VMs and NUMA Nodes and ESX 4.1 NUMA Scheduling, which cover pretty much everything you need to know about this feature and how to use it correctly in your environment.
  2. On day two we finished off CPU Performance by learning how to use ESXTOP and the performance metrics in vCenter to find and recognise possible CPU problems and how to fix them. Next we covered Memory Performance, which was fairly straightforward in my opinion but did give great recommendations on how to utilise your memory effectively, how ballooning and memory swapping work and what an increase in these values means for the performance of your environment. Yet again Frank Denneman has covered these topics brilliantly in two blog postings, Memory reclamation, when and how? and Disable ballooning?, which I’ll personally be rereading so as to better my understanding of how they can help/impact my virtual machines. Next we did the Network Performance module, which covered all the varying network card options you can select, what each allows you to do, what additional features each one gives and how these features work. This was also a refresher for me due to my VCP studies, but it did seem to alert a lot of the people on the course with me to the benefits of upgrading all your virtual machines to hardware version 7 and changing their network cards to VMXNET3. VMware have a great KB article on this, Choosing a network adapter for your virtual machine. The rest of the module was yet again teaching you how to find and troubleshoot possible network problems using the performance charts and ESXTOP.
  3. Day three finished off the last three modules: Storage Performance, Virtual Machine Performance and Application Performance. Storage Performance was good, and it was very interesting to hear how many people don’t use thin provisioning due to their belief that it impacts performance in certain ways. I’m not going to get into it on here, and I agree it does in certain instances, but as I said to the people on the course with me, I would recommend reading the VMware white paper on it first and making your own decisions from there. There are also loads of top blog postings on the subject, so I would also recommend reading a few of those (Duncan Epping’s and Eric Gray’s in particular). The last two modules, Virtual Machine Performance and Application Performance, were essentially just applying what you learnt for CPU, memory and network to your virtual machines, along with what to consider when virtualising differing applications.
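
For anyone wanting to practise the ESXTOP side outside of the interactive screens, here is a minimal sketch of scanning ESXTOP batch-mode output (captured with something like `esxtop -b`) for counters that peak above a limit. The column naming in the sample, the host and VM names and the 10% limit are all assumptions for illustration; they are not values from the course, so check your own capture and pick limits that suit your environment.

```python
import csv
import io

# Illustrative limit only; not a figure taken from the course material.
READY_LIMIT_PCT = 10.0

# Hypothetical sample modelled on ESXTOP batch-mode CSV output.
# The host (esx01) and VM names (web01, db01) are made up.
SAMPLE = "\n".join([
    r'"(PDH-CSV 4.0) (UTC)(0)","\\esx01\Group Cpu(100:web01)\% Ready","\\esx01\Group Cpu(101:db01)\% Ready"',
    '"01/01/2011 10:00:00","2.5","14.2"',
    '"01/01/2011 10:00:05","3.1","18.9"',
])


def flag_high_values(csv_text, metric_suffix, limit):
    """Return {column name: peak value} for every column whose name ends
    with metric_suffix and whose peak sample is above limit."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    peaks = {}
    for row in reader:
        for name, raw in zip(header, row):
            if not name.endswith(metric_suffix):
                continue
            try:
                value = float(raw)
            except ValueError:
                continue  # skip blank or non-numeric samples
            if value > peaks.get(name, float("-inf")):
                peaks[name] = value
    return {name: peak for name, peak in peaks.items() if peak > limit}


if __name__ == "__main__":
    for column, peak in flag_high_values(SAMPLE, "% Ready", READY_LIMIT_PCT).items():
        print(f"{column} peaked at {peak}")
```

The same function can be pointed at other counters by changing the suffix, which is handy once you know which columns in your own capture relate to ballooning or swapping.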

Funnily enough, whilst I was on the course the latest release of the vSphere performance troubleshooting guide for 4.1 came out, which is perfect post-course reading material for me. Duncan Epping’s posting alerted me to the release, so it is only right to point to his posting here.

Well, that’s a high-level review of what I learnt and what was covered in the course. As with any course, what you get out of it is very dependent on your knowledge of the product/s, and even though I have a fairly good amount of knowledge of the product and its features I did still learn a fair amount, and it was a really great refresher on certain features in preparation for sitting my VCAP-DCA exam.



VMs can’t ping while on Distributed Virtual Switch VLANs

This blog posting has been sitting in my drafts for a few weeks now, as I’ve been battling and troubleshooting for ages so that I could give the solution to all the Distributed Virtual Switching problems we’ve been having. Thankfully I believe I finally can, although I’m amazed that this may be the only blog out there with a solution. I put this down mainly to Distributed Virtual Switches only being available in the Enterprise Plus edition of vSphere, so not many people either feel the need to get this edition or have companies that see the need to buy it. Thankfully I work for EMC and was therefore able to procure myself a licence key for this edition, and so set myself on the way to many eventual problems.

As I said in my open question on the VMware Communities, the machines always seemed to drop off the network at differing times and showed no kind of pattern. Later on I noticed the machines were for some reason losing their ARP tables. The solution I found is one I am still unable to find a VMware article about.

It all came down to a difference in ESX versions and virtual hardware. Not 3.5 and vSphere (I’m not that thick…often) but the build versions. It seems that ESX servers installed with builds pre update 1 and ESX servers with update 1 installed lose connectivity between themselves. So, for instance, when I had five servers on a host with the ESX 4.0.0, 175625 build (pre update 1) and five on a host with the ESX 4.0.0, 208167 build (update 1a), all ten servers would initially communicate fine with no problems. Over time, though, all the machines on the pre update 1 host would lose connectivity to the machines on the same host, to the machines on the update 1a host and to the outside world (i.e. they lose all connectivity). The five servers on the update 1a host wouldn’t lose connectivity to each other or to the outside world (although if the DNS server they are using is on the pre update 1 host then obviously DNS will be lost).
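
If you want to check your own environment for the same kind of mismatch, here is a minimal sketch that takes a mapping of host names to build numbers and flags any host still on an older build. The two build numbers are the ones from my environment above; the host names are made up, and how you collect the build numbers in the first place is left out, so treat it as an illustration rather than a finished tool.

```python
# Build numbers taken from the post: 175625 is pre update 1,
# 208167 is update 1a.
UPDATE_1A_BUILD = 208167


def find_outdated_hosts(host_builds, minimum_build=UPDATE_1A_BUILD):
    """Return the sorted names of hosts whose build is below minimum_build."""
    return sorted(
        host for host, build in host_builds.items() if build < minimum_build
    )


def has_mixed_builds(host_builds):
    """Return True when the hosts are not all running the same build."""
    return len(set(host_builds.values())) > 1


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    hosts = {"esx-old-01": 175625, "esx-new-01": 208167}
    if has_mixed_builds(hosts):
        print("Mixed builds found, upgrade:", find_outdated_hosts(hosts))
```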

So the steps I followed to fix the problem were:

  • Firstly, upgrade the hosts to the latest version. This can be done with VMware Update Manager if you have it set up in your environment, or the way I did it, with esxupdate. Now I know loads of you who have been in the virtualisation field for a while will know this tool well, as it was the only tool you could use pre ESX 3.5 to update your machines, and I’m still puzzled why the vSphere Host Update Utility cannot patch or upgrade ESX 4.0 hosts. I was going to write up the steps I use, but David Davis @davidmdavis of TrainSignal fame has written up a great step-by-step guide on how to do this if you’re not familiar.
  • Once this is done you will then need to upgrade the virtual machine hardware to version 7. Scott Lowe has done a brilliantly detailed posting on how to do this and the changes you need to make to allow you to use the latest networking capabilities. Now I know a bunch of you will think that you don’t need to update your ESX hosts to the latest version to be able to upgrade virtual machine hardware, but, due partly I believe to the problems I was experiencing, when I tried I got a very vague error. Only once I had migrated the machine to the latest host would it let me upgrade the virtual hardware. My colleague Simon Phillips noticed that this virtual hardware upgrade was a difference between the machines that worked and the ones that didn’t, so credit is due to him for spotting this and finding Scott’s posting on how to upgrade the virtual hardware.
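
To see which machines still need the second step, here is a minimal sketch that reads the hardware version out of the text of a .vmx file using its `virtualHW.version` entry. The sample contents and the VM name are made up, and this only reports the version; the upgrade itself is still done through the client as described in Scott’s posting.

```python
import re

# Matches a line such as: virtualHW.version = "7"
_HW_VERSION = re.compile(
    r'^\s*virtualHW\.version\s*=\s*"(\d+)"\s*$',
    re.MULTILINE | re.IGNORECASE,
)


def hardware_version(vmx_text):
    """Return the virtual hardware version as an int, or None if absent."""
    match = _HW_VERSION.search(vmx_text)
    return int(match.group(1)) if match else None


def needs_upgrade(vmx_text, target=7):
    """Return True when the .vmx text reports a version below target."""
    version = hardware_version(vmx_text)
    return version is not None and version < target


if __name__ == "__main__":
    # Hypothetical .vmx contents, for illustration only.
    sample_vmx = 'config.version = "8"\nvirtualHW.version = "4"\ndisplayName = "web01"\n'
    print(hardware_version(sample_vmx), needs_upgrade(sample_vmx))
```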

After these changes the machines all communicated without any problems and, almost a week in, haven’t shown any of the problems we were experiencing.

Funnily enough, while building up this blog posting I came across a load of really interesting articles from fellow virtualisation professionals, and I was going to do a wrap-up of it all with thoughts on whether to put your vCenter machine on standard or distributed switches and whether you should make it a virtual machine or not. But as of this morning Richard Brambley @rbrambley did a great one himself on the vCenter side, so definitely have a read, as well as of these articles, which all surround the same topics and cover the problems and opinions some of the top people have come across.

Sadly, after finding these solutions we’re now having to migrate all our machines back to standard switches due to our vCenter server having database problems and needing a rebuild. I still think I would like to try to use Distributed Virtual Switches again in the future, but unless you have an enormous environment where you need DVSs, I feel standard switches are more than adequate and at the moment less of a pain.

Also a big thanks to Simon Phillips for all his help in this, Gabrie van Zanten for chatting through loads of it with me on gchat, all the guys on Twitter who replied to me with ideas, the people who replied to my VMware Communities question and the VMware helpdesk guy I caught unawares with all my questions when he called me about my vCenter problems.

I’m always open for a chat/troubleshoot if you’re having the same problems so either leave a comment below or add me on twitter at @greggrobertson5.

Gregg Robertson