TheSaffaGeek

My ramblings about all things technical



VCDX Spotlight: Sachin Bhowan

Name: Sachin Bhowan

Twitter Handle: @sbhowan

Current Employer: VMXperts

VCDX #: 38

How did you get into using VMware?

My first use of VMware was back in 2004 when we were investigating bare metal recoveries for Tivoli Storage Manager (TSM) and were experimenting with WinPE and GSX. I was looking for a way to standardize the hardware platform for recovery, as at the time there was no bare metal recovery option for TSM. It was after that that I started exploring and testing the ESX platform for server workloads and, with the onset of Version 3, as the cliché goes, the rest became history!

What made you decide to do the VCDX?

An interesting question in that there are two reasons for this; the main reason was for the challenge to prove to myself that the solutions I was actively “preaching” and delivering on were on par and on the right track. The second reason was simply being at the right place at the right time as we were having a meeting with our VMware Partner Manager and our CEO was complaining about the lack of elitism within the VMware certification portfolio and then they announced the VCDX program. I was then “volunteered” to make this happen!

How long did it take you to complete the whole VCDX journey?

I started this journey in September 2008 and tried to track and get as much information as possible, however being in the geo that I was, the most helpful information I got was from the active blogs run by Duncan Epping, Rick Scherer, VCDX001, and twitter tips, as exam information and details were only given out after successful registrations. The process was also impacted by the fact that I had to take all my exams in Europe or the US as there was no authorized testing centre in South Africa at that time. Therefore I had to wait for exams to open as well as align them with the major VMware events to reduce costs. This also forced me to be prepared at very short notice when I was given the availability of the exams! That said I completed my defence in February 2010 giving a total time of about 18 months.

What advice would you give to people thinking of pursuing the VCDX accreditation?

My advice is to be passionate about what you are doing. If you feel that doing the research and working out the solution interdependencies and limitations is mundane and tedious, then this might not be the right track for you. Working towards the VCDX in my experience involved a lot of patience, dedication, passion and, not least, discipline, as it means working studiously and diligently on an everyday basis because not all solution requirements are the same. It also means investing a lot of your time over and above your daily responsibilities and duties (work and family included); so commit for the long haul.

If you could do the whole VCDX journey again what would you do differently?

I do not think I would change much aside from the fact that with exams now available locally it would have meant a shorter timespan to get the accreditation.

Life after the VCDX?  How did your company respond?  Was it worth it?

Life for me has pretty much been the same since I achieved the VCDX accreditation, simply because I chose for it to be that way: there were some tempting offers, but I opted out for personal reasons. However, I am now responsible for heading up a new business called VMXperts, which is a subsidiary of my former company Aptronics. (You can guess what platform this company’s focus will be.) As for the response from my company, everyone was simply elated, and it also helped that I was the very first person in Africa to achieve this. This made their investment in the VCDX program a worthy one!



VCDX Spotlight: Wade Holmes

Name: Wade Holmes

Twitter Handle: @wholmes

Blog URL: www.vwade.com

Current Employer: VMware

VCDX #: 15

How did you get into using VMware?

The year was 2004. I was an IT Specialist working in IBM’s Business Continuity and Resiliency Services, and became aware of customers utilizing VMware for backup and recovery of their datacenters. I went to my manager at the time and told him about this trend, and that I was interested in becoming a VMware SME for IBM BCRS. I started working with ESX 2.0 and VirtualCenter 1.0, attended VMware training, and in 2005 became a VMware Certified Professional. During this time I spearheaded the creation and rollout of IBM BCRS’s first VMware based warm-site disaster recovery offering across the US, reducing the RTO of numerous Fortune 500 clients. And so began the journey towards VMware excellence!

What made you decide to do the VCDX?

It was early 2008, and the VCDX certification was announced. The rigor of the requirements immediately attracted me to pursuing the certification. I knew this was a certification that could help further my career as an IT professional.

How long did it take you to complete the whole VCDX journey?

I completed the VCDX in the summer of 2009, after taking the beta exams and defending during the first publicly available defense. Below is the path I took before defending.

VCP on VI3
Enterprise Exam (beta)
Design Exam (beta)

I had no idea what to expect when coming to defend, and was extremely nervous. I spent countless hours preparing, reviewing my design, making sure I knew the ins and outs, and could justify every granular detail I documented. Luckily, that was exactly the approach necessary for me to be successful. I can’t describe how happy I was when I got a phone call that I passed and was a VCDX! (Yes, back then I was actually contacted by phone to be informed I passed.) In becoming VCDX #15, I was the first non-VMware employee worldwide to achieve the certification (as I worked for a partner at the time).

What advice would you give to people thinking of pursuing the VCDX accreditation?

Dive in head first to master your craft. Understand not just the what, but more importantly the why of architecture and design. Use the plethora of resources available to you online to become familiar with the format. Sign up for a VCDX Bootcamp to help prepare.

If you could do the whole VCDX journey again what would you do differently?

Nothing except more sleep the night before the defense (if you can sleep).

Life after the VCDX?  How did your company respond?  Was it worth it?

I believe I have a unique perspective on the VCDX program, having completed the VCDX program in 2009 as the first non-VMware VCDX worldwide, and then joining VMware and participating in the ongoing development of the program as a panelist. As an outsider looking in, the VCDX program was a goal that drove me to work on my craft, and become a better architect.

Since joining VMware, my participation in the VCDX program has only helped to hone my skills as a virtualization and cloud architect. It has forced me to sharpen my understanding of enterprise architecture principles, principles that aid me greatly in my day-to-day role dealing with virtualization and cloud solutions. I will be forever grateful to the VCDX program for providing a vehicle that forced me to push myself, and for helping me take my career to another level.



VCDX Spotlight: Randy Stanley

Name: Randy Stanley

Twitter Handle: @randystanley

Blog URL: http://www.randystanley.com

Current Employer: IT Partners

VCDX #: 94

How did you get into using VMware?

In 2003 I was working for a small software development company managing their business applications and supporting their software development team. Initially we began utilizing VMware GSX Server for those simple use cases, trying to consolidate and save on our hardware spend wherever we could. In support of the software development team we also deployed ESX in a lab environment for testing and development purposes only. It was a fairly common introduction and use case early on in the adoption of VMware solutions. Plus, vMotion was the coolest freakin’ thing I had ever seen.

It wasn’t until I re-entered the consulting field in 2007 that I really started to dive deep into the VMware products and they have been an integral part of every solution we sell and deploy. It was this exposure to the VMware technology that really allowed me to develop my abilities and deepen my experience. I also should say that a large draw for me was the large, friendly and helpful community that supported and shared knowledge around the VMware products; easily the best community with which to be associated.

What made you decide to do the VCDX?

For me the decision was twofold: first, because I’ve had the great fortune of working with one of the best consultants I know in Doug Baer, VCDX #19, and second, for the sheer challenge of obtaining the certification. A natural, underlying part of the equation has always been my love of the technology and interest in understanding how it works at its core. In my current line of work, utilizing the skills and knowledge measured by the VCDX certification is highly relevant and in many ways a validation of those abilities.

How long did it take you to complete the whole VCDX journey?

It’s hard to say exactly when the journey started, as I had wanted to go after it for the last couple of years, but it seemed so far off and I never really got going. In May 2011, I started and then stopped my journey with a failed attempt on the required VCAP-DCA exam which in combination with a heavy load of customer commitments limited my ability to focus on it. Since I wasn’t accustomed to failing an exam, the DCA failure caught me off guard and I needed to regroup. It was then about 6 months later over the 2011 Thanksgiving (US) holiday that I had a little heart-to-heart with myself and decided regardless of the time, effort or success, I was going to go after the VCDX4 before it was updated to version 5. I was leaving too many good designs on the table which I had worked on with vSphere 4 to not try to at least defend one of them. That’s when my real, 6-month journey toward VCDX began. This involved the DCD4 exam in December, the DCA4 exam in January, the VCP5 upgrade and the DCD5 beta in February, the VCDX4 Design and application in March and then the VCDX4 Defense in May. Approximately 6-months start to finish, but ultimately the journey never ends or at least I hope it doesn’t.

What advice would you give to people thinking of pursuing the VCDX accreditation?

My advice to those interested in the VCDX would be to dedicate themselves to the investment of time and resources necessary in the effort. This may mean the setup of a home lab, the time to read product guides, the repetition of product implementation and design, and/or the review of countless blogs and knowledge base articles. But beyond having a sound technical and architectural knowledge it will also require comfort in the spotlight, an ability to present from a white board, a quickness to think on your feet, an ability to envision the big picture design, and an openness to feedback, critique and improvement. With all that said, bottom line for anyone seriously considering it, I would say go for it. You’ll never know what could have been if you don’t try. I believe many will be surprised by what they can accomplish when they focus on a goal like the VCDX.

If you could do the whole VCDX journey again what would you do differently?

I probably would have started it earlier. Overall I felt the execution was successful once I got going, but for me it was just the issue of starting and sticking with it. Beyond that I don’t think I would have changed much.

Life after the VCDX?  How did your company respond?  Was it worth it?

In my consulting position, the certifications are very much a part of the role and needed by the company to market, sell and deliver the solutions that we focus on. The certification definitely brought some recognition and accolades. It also provided some instant credibility amongst those in our community. For the most part, I do believe it was worth it mainly because of the challenge it provided to me and the opportunity to do what I love most which is work with the technology, understand the architecture of the products, solve the business problems of my customers, and participate in a community that is passionate about all these same things.



VCDX Spotlight: Rick Scherer

Name: Rick Scherer

Twitter Handle: @rick_vmwaretips

Blog URL: http://www.vmwaretips.com

Current Employer: EMC

VCDX #: 21

How did you get into using VMware?

I first heard of and started using VMware Workstation in late 1999. I was a UNIX Administrator who was forced to live in a corporate Windows world. Workstation allowed me to have the best of both worlds. After thorough use and testing of the GSX and ESX products, by 2003 I was able to convince my then-employer’s management that virtualization was a must for our datacenter. The rest is history.

What made you decide to do the VCDX?

Since day one I’ve been convinced that virtualization was a huge benefit for organizations large and small. Since obtaining my first VCP (VCP2 #7315) in 2006 I’ve been following the work of the education team closely. When I was invited to be a beta participant in the then newly created VCDX program I saw it as an opportunity to validate my dedication, knowledge and experience. By making a candidate jump through as many hurdles as the VCDX program does, it really shows that VMware users are dedicated to the cause. This is a great way to give VMware additional validation in the industry.

How long did it take you to complete the whole VCDX journey?

From start to finish, an extremely long year. It was extremely rewarding being part of the beta program process, though. Being able to assist in shaping the program and how the Design and Administration exams were written was really fun.

What advice would you give to people thinking of pursuing the VCDX accreditation?

Know your stuff, inside and out. The VCDX is really a mixture of everything, knowing not only how to fully design and architect a virtualized infrastructure (not only from a VMware perspective but the associated compute, network and storage), but also how to implement that design, manage that design, upgrade that design and operate that design.

Get as much hands on as possible. Learn as much as possible about how compute, network and storage relate to virtualization. Learn how applications relate to virtual machines. Know your design. That’s probably the most important thing, for your defense do not design your dream architecture, keep it simple and keep it to something you’ve done before. Know it inside and out, know what failed and how you fixed it. Don’t say you designed a specific thing to meet Best Practices, know why it’s the best practice.

Also, if you’re married… get your spouse’s buy-in on the journey as well. You’re going to spend a lot of time away from them while you’re on the journey and you’ll need more support than you’ve ever needed before.

If you could do the whole VCDX journey again what would you do differently?

No regrets, I loved every single part of the process. I wish I didn’t rush through it as fast as I possibly did, but I was so excited to be part of something new, something fresh and something fun! How awesome is it to be VCDX #21! :)

Life after the VCDX?  How did your company respond?  Was it worth it?

I think obtaining the VCDX certification opened a lot of new doors and opportunities. Since obtaining my VCDX I’ve joined EMC as part of their vSpecialist organization, here I’m able to put my knowledge and experience directly to use as I evangelize companies about all of the amazing benefits of virtualization, application modernization, end-user computing and now cloud computing.



VMware vSphere Data Protection

vSphere Data Protection (VDP) is a robust, simple-to-deploy, disk-based backup and recovery solution. VDP is fully integrated with VMware vCenter Server and enables centralized and efficient management of backup jobs while storing backups in de-duplicated destination storage.

Benefits:

•VDP leverages the VMware vStorage APIs for Data Protection (VADP), which include Changed Block Tracking (CBT), along with the EMC Avamar variable-length segment de-duplication engine to optimize backup and recovery times. Initial backups take a fair amount of time, but subsequent backups can take as little as a few minutes depending on the number of changes that have occurred since the last backup.

•Backup agents are not needed as VDP leverages VADP. VMs are backed up to disk-based storage (.vmdk files attached to the VDP virtual appliance).

•De-duplication occurs not only within each VM, but across all backup jobs and all VMs being backed up by the VDP appliance.

•A VM that utilizes an agent for backup and recovery requires the VM to be in a powered on state. With VDP, that is not the case – backups and recoveries can be performed regardless of the VM’s power state.

•There is no need to install backup management software on an administrator’s workstation. Configuration and management of VDP is web browser based. Currently supported browsers: IE 7, 8 on Windows. Firefox 3.6 and higher on Windows or Linux. Adobe Flash is required.

•Restores can be of an entire VM or of individual files and folders/directories. The file-level restore user interface (UI) is web based, simple, and intuitive, meaning end-users can perform self-service file-level restores (administrator permissions required).

•Deployment, configuration and management of VDP is done via a web browser based graphical user interface (GUI). The majority of configuration tasks are completed using intuitive wizard-driven workflows.
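The cross-VM de-duplication described in the benefits above can be illustrated with a toy content-addressed store. This is only a sketch: `DedupStore`, its methods, and the tiny fixed `CHUNK_SIZE` are hypothetical, and fixed-size chunks are used here for simplicity where the Avamar engine uses variable-length segments.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and deliberately tiny; real engines use variable-length segments


class DedupStore:
    """Toy content-addressed store shared by every VM and job that backs up to it."""

    def __init__(self):
        self.chunks = {}   # digest -> chunk bytes, each unique chunk stored once
        self.backups = {}  # (vm_name, job) -> ordered list of chunk digests

    def backup(self, vm_name, job, data):
        digests = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Keep the chunk only if no earlier backup (of any VM) already holds it.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.backups[(vm_name, job)] = digests

    def restore(self, vm_name, job):
        # Rebuild the image by following the stored digests in order.
        return b"".join(self.chunks[d] for d in self.backups[(vm_name, job)])


store = DedupStore()
store.backup("vm-a", "nightly", b"AAAABBBBCCCC")
store.backup("vm-b", "nightly", b"AAAABBBBDDDD")
unique_chunks = len(store.chunks)  # the two VMs share their first two chunks
```

Both images restore in full even though the chunks they have in common are stored only once.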

vSphere Data Protection Key Components

VDP VM Appliance

•VDP is a virtual machine appliance deployed from a .ova (open virtual appliance or application) file.

vSphere Infrastructure

•The vStorage APIs for Data Protection (VADP) are utilized by VDP. This includes the Changed Block Tracking (CBT) mechanism. CBT tracks the changes made to a VM at the block level and provides this information to VDP so that only changed blocks are backed up. This significantly reduces storage consumption and speeds up backup and recovery times with VDP.

•VMware Tools on Windows contains Volume Shadow Copy Service (VSS) components to assist with guest OS and application quiescing when backing up Windows VMs. More details on VSS can be found here: http://technet.microsoft.com/en-us/library/ee923636(v=WS.10).aspx
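The idea behind Changed Block Tracking can be sketched in a few lines. This is not the VADP API; `incremental_backup` and the list-of-blocks disk model are hypothetical, and the set of changed block indexes is assumed to be supplied by the tracking mechanism.

```python
def incremental_backup(disk, previous_backup, changed_blocks):
    """Return a new backup image, copying only the blocks reported as changed."""
    new_backup = list(previous_backup)  # start from the last backup
    for index in changed_blocks:
        new_backup[index] = disk[index]  # only changed blocks are read from the disk
    return new_backup


disk_now = ["a", "B", "c", "D"]      # current disk contents, one entry per block
last_backup = ["a", "b", "c", "d"]   # what the previous backup captured
changed = {1, 3}                     # block indexes reported as changed since then

result = incremental_backup(disk_now, last_backup, changed)
blocks_copied = len(changed)         # 2 of 4 blocks were read, not the whole disk
```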

VDP Architecture

•The appliance is deployed by default with 4 vCPUs and 4 GB RAM.

•Available in three sizes: 0.5 TB, 1 TB, and 2 TB – these are usable destination datastore sizes. The actual amount of disk space (thick provisioned) consumed by the appliance is 850 GB (3 .vmdk files), 1600 GB (7 .vmdk files), and 3100 GB (13 .vmdk files) respectively. Thin provisioning can be used, but the administrator should closely monitor disk consumption. It is important to note that once the VDP appliance is deployed, the size cannot be changed.

•The VDP appliance guest OS is SuSE Linux 11.

•vCenter Server 5.1 is required to use VDP. VDP can back up VMs on hosts running vSphere 4.0 and higher.

•VDP management is done via the vSphere Web Client. There is no plug-in for the vCenter Server “thick” client.

 



vSphere 5.1 Announced with Enhanced vSphere Replication

vSphere Replication

vSphere Replication (VR) is the industry’s first and only genuinely hypervisor-level replication engine.

It is a feature first introduced with Site Recovery Manager 5.0 to allow for the vSphere platform to protect virtual machines natively by copying their disk files to another location where they are ready to be recovered.

VR is a software based replication engine that works at the host level rather than the array level.

Identical hardware is not required between sites, and in fact customers can run their VMs on any type of storage they choose at their site – even local storage on the vSphere hosts, and VR will still work.

It provides simple and cost-efficient replication of applications to a failover site.

VR is a component delivered with vSphere editions of Essentials Plus and above, and also comes bundled with Site Recovery Manager. This offers protection and simple recoverability to the vast majority of VMware customers without extra cost.

•With VR, a virtual machine is replicated by components of the hypervisor, removing any dependency on the underlying storage, and without the need for storage-level replication.

•VMs can be replicated between *any* type of storage platform: Replicate between VMFS and NFS, from iSCSI to local disk. Because VR works above the storage layer it can replicate independently of the file systems. (It will not, however, work with physical RDMs.)

•Replication is controlled as a property of the VM itself and its VMDKs, eliminating the need to configure storage any differently or to impose constraints on storage layout or management. If the VM is changed or migrated then the policy for replication will follow the VM.

•VR creates a “shadow VM” at the recovery side, then populates the VM’s data through replication of changed data.

•While VR can be deployed through the “thick client” all management and interaction with VR is done strictly through the vCenter 5.1 web interface.

•Only vSphere 5.0 and 5.1 will work for vSphere Replication as the VR Agent is a component of the vSphere 5.x hypervisor.

•vSphere Replication cannot co-exist with the vSphere Replication pieces originally shipped with SRM 5.0. If an existing SRM 5.0 vSphere Replication environment is in place it will need to be uninstalled and replaced with the standalone vSphere Replication from vSphere 5.1.

•While both Storage DRS and Storage vMotion are supported, they introduce certain scenarios to be aware of:

•While Storage vMotion of a VR protected VM can be done by an administrator, on vSphere 5.0 this may create a “full sync” scenario in which a VM must be completely resynchronized between source and destination, possibly violating the configured recovery point objective for that VM.

•Storage DRS compounds this problem by automating storage vMotion, and thereby may potentially cause the protected virtual machines to create continual full sync scenarios, driving up I/O on the storage, thereby creating cyclical storage DRS events. Because of this it is unsupported with 5.0.

•Storage vMotion and SDRS are only able to be run on the *protected* VM and can not execute against the *replica* of the VM.

•When using vSphere Replication with Site Recovery Manager, storage vMotion and storage DRS are *not supported*

•Neither of these scenarios is true with vSphere 5.1 as the persistent state file that contains current replication data is migrated along with the rest of the VM, which did not occur in vSphere 5.0.

vSphere Replication is not “new” as it has more than a year-long track record of success with Site Recovery Manager.

VR is a non-disruptive technology: It does not use vSphere file-system snapshots nor impact the execution of the VM in any abnormal way.

Since VR tracks changes at a sub-VM level, but above the file system, it is completely transparent to the VM unless Microsoft Volume Shadow Copy Service (VSS) is being used to make the VM quiescent. Even then VR uses fully standard VSS calls to the Microsoft operating system.

•Virtual machines can be replicated irrespective of underlying storage type: local disk, SAN, NFS, and VSA can all be used

•Enables replication between heterogeneous datastores

•Replication is managed as a property of a virtual machine

•Efficient replication minimizes impact on VM workloads

vSphere Replication Use Cases

Protecting VMs within a site, between sites, or to and from remote and branch offices.

Can use dissimilar storage, low cost NAS Appliances, even independent vSphere hosts with only local disk.

VR Deployment

VR is deployed as a virtual appliance in the standard OVF format.

The OVF contains all the necessary components for VR.

•What used to be both the “VRMS and VRS” in the SRM 5.0 implementation of VR are included in the “VR Appliance” now

•This allows a single appliance to act in both a VR management capacity and as the recipient of changed blocks

•Scaling sites is an easy task: simply deploy another VR Appliance at the target site and it will contain the necessary pieces to either pair and manage replication for a site or simply receive changed blocks as per the VRS

vSphere Replication Limitations

vSphere Replication is targeted at replicating the virtual disks of powered on virtual machines only. It is based on a disk filter that tracks the changes passing through it, therefore static images cannot be tracked.

Powered-off or suspended VMs will not be replicated. The assumption is that if the VM is important enough for protection, it is powered on.

That also means non-disk devices attached to a VM (ISOs, floppy images, etc.) are not replicated. Also, any disks, ISOs, or configuration files not associated with a VM will not be replicated.

Moreover, files that are not required for the VM to restart (e.g., vswp files or log files) are not replicated by VR.

Since VR works above the disk itself at the virtual device layer, it can be completely independent of specifics about the VMDK it is replicating. VR can replicate to a different format than its primary disk – i.e. you can replicate a thick provisioned disk to be a thin provisioned replica.

VM snapshots in and of themselves are not replicated but instead are collapsed during replication. A VM with snapshots may be configured for protection by VR (and you can take and revert snapshots), but the remote state for such VMs will be “flat” without any snapshots. Snapshots are aggregated into a single VMDK at the recovery location.

Note: Reverting from a snapshot may cause a full sync!

VMs can be replicated with a recovery point objective (RPO) of at least 15 minutes and at most 24 hours. This means that, even at the tightest setting, a recovery of replicated VMs can lose up to 15 minutes’ worth of recent data.

How it works

Fundamentally VR is a handful of virtual appliances that allow the vSphere kernel to identify and replicate changed blocks between sites. The configuration and deployment is a handful of simple steps.

Once the administrator has deployed the components it is a matter of pairing a source and destination.

Lastly, configuration of an individual VM for protection tells VR to start replicating its changes, and where to put them at the recovery location.

Only replicates changed blocks

On an ongoing basis, after the first sync, VR will only ship changed blocks.

Within the RPO defined by the administrator, VR tracks which blocks are being dirtied and will create a “lightweight delta” (LWD) bundle of data to be transferred to the remote site.

Pointers to changed blocks are kept in both a memory bitmap and a “persistent state file” (PSF) located in the directory of a VM. Memory contents are always current; the PSF represents the LWD currently being shipped. After an LWD is shipped and completely acknowledged, the memory bitmap is copied to the PSF and the memory bitmap is restarted for the next LWD.
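The hand-off between the memory bitmap and the PSF can be modelled as follows. This is a simplified sketch of the behaviour described above, not VR's implementation: `ChangeTracker` and its method names are hypothetical, and Python sets stand in for the bitmap and the on-disk file.

```python
class ChangeTracker:
    """Toy model of the memory bitmap and persistent state file (PSF) hand-off."""

    def __init__(self):
        self.memory_bitmap = set()  # blocks dirtied since the last bundle was cut (always current)
        self.psf = set()            # blocks belonging to the LWD currently being shipped

    def record_write(self, block):
        # Every guest write marks its block as dirty in memory.
        self.memory_bitmap.add(block)

    def on_lwd_acknowledged(self):
        """The previous LWD was fully acknowledged: cut the next one."""
        self.psf = set(self.memory_bitmap)  # copy the memory bitmap to the PSF
        self.memory_bitmap = set()          # restart tracking for the following LWD
        return sorted(self.psf)             # block list for the LWD about to ship


tracker = ChangeTracker()
for block in (7, 3, 7):
    tracker.record_write(block)

next_lwd = tracker.on_lwd_acknowledged()  # blocks 3 and 7 go into the next bundle
tracker.record_write(9)                   # lands in memory only; the PSF is unchanged
```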

VR will use the defined RPO to determine how often to create an LWD. Time must be allowed to create the block bundle, transfer it, and successfully complete writing the entire bundle to ensure that the RPO is not violated. In order to do this, VR will track the length of the previous 15 transfers to create an estimate of how long it will take to complete the transaction of the subsequent LWD.

For example, if a bundle takes 1 minute to create, 8 minutes to transfer, and 1 minute to write, then by the time the data is successfully written the replica is already 10 minutes behind the original VM. With, for example, a 1 hour RPO set for a VM, the next transfer would need to start within the next 40 minutes at the latest. This presumes 10 minute old data plus the next 10 minute transfer = 20 minutes gone out of the 1 hour RPO, ensuring the data at the recovery site is never older than the RPO defined.

If a transfer of an LWD takes more than half the time of the RPO it is very likely that the RPO will be violated based on the incremental “catch up” to the RPO period, and it will be flagged as a potential RPO violation.
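The arithmetic in the example above can be written out as a short sketch. The function names are hypothetical, and taking the mean of the tracked transfer lengths is an assumption; the text only says that the previous 15 transfers are used to form an estimate.

```python
from collections import deque
from statistics import mean

HISTORY_LENGTH = 15  # the text says the previous 15 transfers are tracked


def estimate_duration(history):
    """Estimate the next transfer's length in minutes (using the mean is an assumption)."""
    return mean(history)


def latest_next_start(rpo_min, estimated_duration_min):
    """Minutes from now by which the next transfer must start to stay within the RPO."""
    # The replica is already one transfer-length old when it lands,
    # and the next bundle needs roughly the same time again to arrive.
    return rpo_min - 2 * estimated_duration_min


def is_potential_rpo_violation(rpo_min, estimated_duration_min):
    """Flag transfers that take more than half the RPO."""
    return estimated_duration_min > rpo_min / 2


history = deque(maxlen=HISTORY_LENGTH)  # older entries drop off automatically
history.append(1 + 8 + 1)               # create + transfer + write, in minutes

estimate = estimate_duration(history)
deadline = latest_next_start(60, estimate)  # 60 - (10 + 10) = 40 minutes
```

With a 1 hour RPO, any estimate above 30 minutes would be flagged by `is_potential_rpo_violation`.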

VR will create a per-host replication schedule by taking into account *all* the VMs being replicated from that particular host. This allows it to do host-wide scheduling for each replicated VMDK and allows transfers to take place according to variables such as length of transfer, size of LWD, etc. and gives the scheduler flexibility to send data when appropriate.

The scheduler will execute each time an event occurs that alters replication patterns, such as a power task on a replicated VM, changes to RPOs or a full sync, or an HA event such as a host crash.

Only the most-recent transfer information is persisted. If hostd crashes, or the VM is migrated, or reconfigured, the historic transfer state is lost, and must be re-accumulated for the scheduler to be most effective.

It is important to note that VR is *not* using vSphere based snapshots to create redo logs of the primary VMDK. The VMDK is not interrupted in any fashion at all, and there is no snapshot created.

It also does NOT use “CBT” or “Changed Block Tracking”, another feature of the vSphere Platform. The vSCSI filter of VR is completely independent of CBT by design. This allows CBT to remain untouched for other tools such as VADP and backup software. If CBT were to be used it would reset the changed block tracking epoch, breaking backups and other uses of CBT.

VR is 100% isolated from snapshots and CBT.

Recovering a VM with a few clicks

A VM can be recovered only if the original is either not powered on or not reachable by the recovery vCenter Server. This is to avoid having duplicate VMs running at the same time.

For further safety, the VM is booted with no networks connected to help avoid duplicate VMs colliding.

Once the recovery is processed, you cannot reconnect and re-enable replication of that VM. You must restart protection all over again. You may, however, use the old VMDK that might remain at either site as a seed to begin replication again.

Four steps for full recovery

As long as the replication has completed at least once a VM can be recovered quickly and easily directly from the vCenter Web Client.

From the Replication location in the Web Client, choose a VM that has been replicated, right-click and choose to recover.

Choosing a target folder and resource (cluster, host, or resource pool) will then instantiate the replicated VM, create and register the VMX, attach the VMDK, and power on the VM if chosen.

This cannot be automated, and can only be done for a single VM at a time.



vSphere 5.1 Announced with Site Recovery Manager 5.1

With the announcement of vSphere 5.1 comes the announcement of Site Recovery Manager 5.1. Below are some of the new features and enhancements coming with SRM 5.1.

Application Quiescence for vSphere Replication

The new VR has improved VSS integration and doesn’t merely request OS quiescence, but flushes app/db writers if present.

This is due to better handling of VSS through the VMware Tools present in vSphere 5.1 and requires no work to configure – merely select the quiescing method and VR will handle it.

If VR is asked to use VSS, it will synchronize its creation of the lightweight delta with the request to flush writers and quiesce the application and operating system. This ensures full app consistency for backups.

vSphere Replication is presented with the quiescent and consistent volume produced by the OS flushing the VSS writers, and that consistent volume is used to create the LWD for replication.

If for some reason VSS cannot quiesce correctly or flush the writers, VR will continue irrespective of the failure, create an OS consistent LWD bundle at the VM level, and generate a warning that VSS consistency could not be created.
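The fallback behaviour just described, attempting application quiescence and continuing with a warning when it fails, might be modelled like this. All names here (`QuiesceError`, `create_lwd`, the callables passed in) are hypothetical stand-ins, not part of any VMware interface.

```python
class QuiesceError(Exception):
    """Raised by the hypothetical quiesce step when VSS cannot flush its writers."""


def create_lwd(quiesce_app, build_bundle):
    """Build an LWD, falling back to OS-level consistency if quiescing fails."""
    warnings = []
    try:
        quiesce_app()                 # ask VSS to flush writers and quiesce the application
        consistency = "application"
    except QuiesceError as err:
        # Replication carries on regardless; only the consistency level drops.
        warnings.append(f"VSS consistency could not be created: {err}")
        consistency = "os"
    return build_bundle(), consistency, warnings


def failing_quiesce():
    raise QuiesceError("writer timeout")


bundle, consistency, warnings = create_lwd(failing_quiesce, lambda: ["block-3", "block-7"])
```

The bundle is still produced when quiescing fails; the caller learns about the reduced consistency level from the returned warning.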

All Paths Down Improvements

The way vSphere 5 handles hosts with devices in an “All Paths Down” state has been improved to ensure that the host does not get stuck in a loop attempting I/O on unavailable devices.

APD states often occur during disaster scenarios, so it is important for SRM that the platform does not delay recovery.

SRM now checks for a datastore’s accessibility flag before deciding whether or not to attempt to use that datastore. A datastore may become inaccessible because of various reasons, one of which is APD.

The changes in how vSphere handles these devices enable SRM to differentiate APD from other types of inaccessible states such as Permanent Device Loss (PDL).

If SRM sees a datastore in an APD condition, it will stop immediately and try again later, since APD conditions are supposed to be transient, rather than time out trying to access a missing device.
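
As a rough illustration of that decision, here is a small Python sketch. The state names and the `next_action` function are hypothetical, and the handling of PDL (skipping the datastore) is my assumption, as the text above only describes what happens for APD.

```python
from enum import Enum


class DatastoreState(Enum):
    ACCESSIBLE = "accessible"
    APD = "all paths down"
    PDL = "permanent device loss"


def next_action(state: DatastoreState) -> str:
    """Decide what to do with a datastore based on its accessibility."""
    if state is DatastoreState.ACCESSIBLE:
        return "use"
    if state is DatastoreState.APD:
        # APD is expected to be transient: stop now and retry later
        # instead of waiting for a long timeout.
        return "retry_later"
    # Assumption: a permanently lost device is not retried.
    return "skip"
```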

SRM also has been improved to use a new unmount command to gracefully remove datastores from the primary protected site during the execution of a recovery plan. Since SRM needs to break replication and unmount the datastore from the protected environment the new method allows for a graceful dismount and generation of an APD situation rather than an abrupt removal of the datastore.

During a disaster recovery, however, in some cases hosts are inaccessible via network to gracefully unmount datastores, and in the past the isolated hosts could panic if their storage was removed abruptly by SRM.

With vSphere 5.1 there are new improvements to the hosts and storage stacks that allow them to remain operative even through an unplanned APD state.

Forced Failover

Forced failover was introduced in SRM 5.0.1 for recovery plans using array based replication protection groups. With SRM 5.1 forced failover is now fully supported for all protection group types.

In some cases SRM will be unable to handle storage failure scenarios at the protection site. Perhaps the devices have entered an APD or PDL state, or perhaps storage controllers are unavailable, or for many other reasons. Perhaps the original SAN is reduced to a puddle of molten slag.

In these cases, SRM can enter a state where it waits for responses from the storage for an untenable amount of time. For instance, timeouts have been seen to last as long as 8 hours while waiting for responses from ‘misbehaving’ storage at the protected site.

Forced failover handles these scenarios. If storage is in a known inconsistent state, a user may choose to run a recovery plan failover in “forced failover” mode. Alternately, if a recovery plan is failing and timing out due to unresponsive protected site storage, the administrator could cancel the running recovery plan and launch it again in forced failover mode.

Forced failover will run *only* recovery-side operations of the recovery plan. It will not attempt any protected site operations such as storage unmounts or VM shutdowns. During a forced failover execution of a recovery plan any responses generated by the protected site are completely ignored.
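
To picture what "recovery-side only" means, here is a minimal Python sketch. It is not how SRM is implemented; the `Step` type, the step names and the `steps_to_run` function are invented for the example.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    name: str
    site: str  # "protected" or "recovery"


def steps_to_run(plan: List[Step], forced: bool) -> List[Step]:
    """Return the steps that would execute for a recovery plan run."""
    if not forced:
        return list(plan)
    # Forced failover: drop every protected-site operation.
    return [step for step in plan if step.site == "recovery"]


# Hypothetical recovery plan
plan = [
    Step("Shut down VMs", "protected"),
    Step("Unmount datastores", "protected"),
    Step("Promote replica storage", "recovery"),
    Step("Power on VMs", "recovery"),
]
```

With `forced=True` only the last two steps remain, which matches the behaviour described above: no storage unmounts or VM shutdowns are attempted at the protected site.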

Array-based replication forced failover worked with SRM 5.0.1, and after extensive testing has now been introduced to work with vSphere Replication as well.

Failback supported with both Array and vSphere Replication

SRM 5.1 now includes vSphere Replication in the “automated failback” workflow!

With SRM 5, VMware introduced the “Reprotect” and failback workflows. These allowed storage replication to be reversed automatically and protection of VMs to be configured automatically from the “failed over” site back to the “primary site”, so that a failover could then be run to move the environment back to the original site.

Taken together as “automated failback” this feature was well received by those using array-based replication, but was unavailable for use with vSphere Replication.

With SRM 5.1 users can now do automated reprotects and run failback workflows for recovery plans with any type of protection group, both VR and ABR inclusive.

After running a *planned failover only*, the SRM user can now reprotect back to the primary environment:

Planned failover shuts down production VMs at the protected site cleanly, and disables their use via GUI. This ensures the VM is a static object and not powered on or running, which is why we have the requirement for planned migration to fully automate the process.

The “Reprotect” button when used with VR will now issue a request to the VR Appliance (VRMS in SRM 5.0 terminology) to configure replication in the opposite direction.

When this takes place, VR will reuse the same settings that were configured for initial replication from the primary site (RPO, which directory, quiescence values, etc.) and will use the old production VMDK as the seed target automatically.

VR now begins to replicate back to the primary disk file originally used as the production VM before failover.
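
Conceptually, a reprotect keeps the replication settings and swaps the direction. The Python sketch below shows that idea only; the `ReplicationConfig` fields, the example paths and the `reprotect` function are hypothetical and are not part of any SRM or VR API.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ReplicationConfig:
    source_site: str
    target_site: str
    rpo_minutes: int
    quiesce: str
    seed_vmdk: Optional[str] = None


def reprotect(cfg: ReplicationConfig, old_production_vmdk: str) -> ReplicationConfig:
    """Reverse direction, keep RPO and quiescing, seed from the old disk."""
    return replace(
        cfg,
        source_site=cfg.target_site,
        target_site=cfg.source_site,
        seed_vmdk=old_production_vmdk,
    )


original = ReplicationConfig("primary", "recovery", rpo_minutes=15, quiesce="vss")
reversed_cfg = reprotect(original, "[ds01] app01/app01.vmdk")
```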

If things have gone wrong at the primary site and an automatic reprotect is not possible due to missing or bad data at the original site, VR can be manually configured, and when the “Reprotect” is issued SRM will automatically use the manually configured VR settings to update the protection group.

Once the reprotect is complete a failback is simply the process of running the recovery plan that was used to failover initially.

vSphere Essentials Plus Support

SRM 5.1 is now supported with vSphere Essentials Plus, enabling smaller companies to move towards reliable disaster recovery protection for their sites.

•vCenter version 5.1 is the only version that will work with SRM 5.1. Lower versions of vSphere/VI are supported, but vCenter must be up to date.

•At time of shipping, only vSphere 4.x and 5.x are supported.

•ONLY ESXi 5.0 and 5.1 will work for vSphere Replication as the VR Agent is a component of the ESXi 5.x hypervisor.

•Neither Storage DRS nor Storage vMotion is supported with SRM 5.1, although they will work in some scenarios even though unsupported.

•While Storage vMotion with array-replicated protected VMs can be done by an administrator, they must then ensure that the target datastore is replicated and that the virtual machine is once again configured for protection. Because this is a very manual process it is not officially supported.

•Storage DRS compounds this problem by automating Storage vMotion, and thereby will cause the VMDK of the protected virtual machines to migrate to potentially unprotected storage. Because of this it is unsupported with SRM 5.

•Storage vMotion and Storage DRS are not supported at all with SRM 5 using vSphere Replication, as migration of a VMDK will cause the migrated VM to reconfigure itself for protection, potentially putting it in violation of its recovery point objective.
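
The version rules in the bullets above can be written down as a quick check. This is just a sketch in Python built from those bullets; the function name and the way versions are passed as strings are my own choices.

```python
def srm51_supported(vcenter: str, esxi: str, uses_vr: bool) -> bool:
    """Check a vCenter/ESXi pairing against the SRM 5.1 notes above."""
    if vcenter != "5.1":
        # vCenter 5.1 is the only version that works with SRM 5.1.
        return False
    if uses_vr:
        # The VR Agent is a component of the ESXi 5.x hypervisor.
        return esxi in ("5.0", "5.1")
    # Otherwise only vSphere 4.x and 5.x hosts are supported.
    return esxi.split(".")[0] in ("4", "5")
```

For example, `srm51_supported("5.1", "4.1", uses_vr=False)` is true, while the same host with `uses_vr=True` is not.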

 



vSphere 5.1 Announced with Distributed Switch Enhancements

With the release of vSphere 5.1, VMware brings a number of powerful new features and enhancements to the networking capabilities in the vSphere platform. These new features enable customers to manage their virtual switch infrastructure with greater efficiency and confidence. The new capabilities can be categorized into three main areas: operational improvements, monitoring and troubleshooting enhancements, and improved scalability and extensibility of the VMware vSphere Distributed Switch (VDS) platform. Following are some of the key features:

1) Network Health Check support – helps detect misconfigurations across physical and virtual switches

2) Configuration Backup and Restore – Allows vSphere admins to store the VDS configuration as well as recover the network from old configurations

3) Rollback and recovery – Addresses the challenges that customers faced when a management network failure caused the hosts to disconnect from vCenter Server

4) Port Mirroring enhancements – New troubleshooting capabilities are introduced by supporting RSPAN and ERSPAN

5) Netdump – Provides ESXi hosts without disks (stateless/Auto Deploy) the ability to core dump over the network

6) Improved scaling numbers

Network Health Check

Network Health Check helps detect common configuration errors such as mismatched VLAN, MTU and teaming configurations.

This tool is very helpful in organizations where the network administrators own the physical network switches and the vSphere administrators own the vSphere hosts. In such organizations vSphere admins can pass the network-related warnings on to the network admins and help identify issues quickly.
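
To make the kind of mismatch concrete, here is a small Python sketch that compares a virtual switch's settings with those of the physical switch port it uplinks to. It is not how the Health Check feature works internally; the dictionary layout and the `find_mismatches` function are invented for illustration, and teaming checks are left out to keep it short.

```python
from typing import Dict, List


def find_mismatches(vds: Dict, physical: Dict) -> List[str]:
    """Report VLAN and MTU differences between virtual and physical config."""
    issues = []
    # VLANs used on the VDS that the physical port does not carry.
    missing_vlans = sorted(set(vds["vlans"]) - set(physical["vlans"]))
    if missing_vlans:
        issues.append(f"VLANs missing on physical switch: {missing_vlans}")
    if vds["mtu"] != physical["mtu"]:
        issues.append(
            f"MTU mismatch: VDS {vds['mtu']} vs physical {physical['mtu']}"
        )
    return issues


# Hypothetical example values
warnings = find_mismatches(
    {"vlans": [10, 20], "mtu": 9000},
    {"vlans": [10], "mtu": 1500},
)
```

A list of warnings like this is what a vSphere admin could hand over to the network team.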

Configuration Backup and Restore

VDS configuration is managed through vCenter Server and all the virtual network configuration details are stored in the vCenter database. Previously, in case of database corruption or database loss, customers were not able to recover their network configurations and had to rebuild the virtual networking configuration from scratch. Also, there was no easy way to replicate the virtual network configuration in another environment or go back to the last working configuration after any accidental changes to virtual networking settings.

All of the above concerns are addressed through the VDS configuration backup and restore feature.

Backup a VDS Configuration

image

Restore a Port Group Configuration

image

Rollback and Recovery

The management network is configured on every host and is used to communicate with vCenter Server as well as to interact with other hosts during vSphere HA configuration. This is critical when it comes to centrally managing hosts through vCenter Server. If the management network on the host goes down or there is a misconfiguration, vCenter Server can’t connect to the host and thus can’t centrally manage resources.

If there is any issue with the management network, the hosts can’t reach vCenter Server, and vCenter Server in turn can’t make any changes to the network and push them to the hosts.

In such a situation, the only way for the customer to recover is to go to the individual hosts and build a standard switch with the proper management network configuration. Once all the hosts have their management networks attached to a standard switch, vCenter Server can manage the hosts and reconfigure the VDS.

With the Rollback and Recovery option customers don’t have to go down the standard switch route to recover from a management network failure.

The Automatic Rollback and Recovery feature addresses the concerns that customers have regarding the use of the management network on a VDS. First, the automatic rollback feature detects any configuration changes on the management network and, if the host can’t reach vCenter Server, doesn’t allow the changes to take effect. Second, customers also have the option to reconfigure the management network of the VDS per host through the DCUI. Customers have to connect to each host and can then change the management network parameters of the VDS through the DCUI.
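
The automatic rollback idea boils down to "only keep a change if vCenter Server is still reachable afterwards". Here is a minimal Python sketch of that rule; the function name, the settings dictionary and the reachability callback are all hypothetical and stand in for whatever the host really does.

```python
from typing import Callable, Dict


def apply_mgmt_change(
    current: Dict[str, str],
    proposed: Dict[str, str],
    can_reach_vcenter: Callable[[Dict[str, str]], bool],
) -> Dict[str, str]:
    """Return the management network settings that end up in effect."""
    if can_reach_vcenter(proposed):
        # The host can still talk to vCenter Server: keep the change.
        return proposed
    # Connectivity would be lost: roll back to the previous settings.
    return current
```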

LACP

Link Aggregation Control Protocol (LACP) is a standards-based link aggregation method to control the bundling of several physical network links together to form a logical channel for increased bandwidth and redundancy purposes. LACP allows a network device to negotiate an automatic bundling of links by sending LACP packets to the peer. As part of the vSphere 5.1 release, VMware now supports this standards-based link aggregation protocol.


SR-IOV

Single Root I/O Virtualization (SR-IOV) is a standard that allows one PCI Express (PCIe) adapter to be presented as multiple separate logical devices to the VMs. The hypervisor manages the physical function (PF) while the virtual functions (VFs) are exposed to the VMs. In the hypervisor, SR-IOV capable network devices offer the benefits of direct I/O, which include reduced latency and reduced host CPU utilization. The VMware vSphere ESXi platform’s VMDirectPath (pass-through) functionality provides similar benefits to the customer, but requires a physical adapter per VM. With SR-IOV the pass-through functionality can be provided from a single adapter to multiple VMs through VFs.

BPDU Filter

BPDUs are data messages or packets that are exchanged across switches to detect loops in a network. These packets are part of the Spanning Tree Protocol (STP) and are used to discover the network topology. The VMware virtual switches (VDS and VSS) do not support STP and thus do not participate in BPDU exchange across external physical access switches over the uplinks.

The BPDU filter feature available in this release allows customers to filter the BPDU packets that are generated by virtual machines and thus prevents a Denial of Service situation. This feature is available on VMware vSphere Standard and Distributed switches, and can be enabled by changing the advanced “Net” settings on the ESXi host.
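
For readers wondering how a BPDU can be told apart from normal traffic: standard STP BPDUs are sent to the reserved multicast MAC address 01:80:C2:00:00:00. The Python sketch below filters on that destination address. It only illustrates the idea and is an assumption about the matching rule, not a description of how ESXi implements the filter; the function names are made up.

```python
from typing import List

# Reserved destination MAC address used by IEEE 802.1D STP BPDUs.
STP_BPDU_DEST_MAC = bytes.fromhex("0180c2000000")


def is_bpdu(frame: bytes) -> bool:
    """True if the Ethernet frame is addressed to the STP multicast MAC."""
    # The destination MAC is the first six bytes of an Ethernet frame.
    return len(frame) >= 6 and frame[:6] == STP_BPDU_DEST_MAC


def filter_vm_frames(frames: List[bytes]) -> List[bytes]:
    """Drop BPDUs coming from a VM and pass everything else through."""
    return [frame for frame in frames if not is_bpdu(frame)]
```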

Port Mirroring and NetFlow Enhancements

To address the network administrator’s need for visibility into virtual infrastructure traffic, VMware introduced port mirroring and NetFlow features as part of the vSphere 5.0 release. These features provide necessary and familiar tools to network administrators that help them in monitoring and troubleshooting tasks. In vSphere 5.1, the port-mirroring feature is enhanced through the additional support for RSPAN and ERSPAN capability.

IPFIX, or NetFlow version 10, is an advanced and flexible protocol that allows customers to define the NetFlow records that can be collected at the VDS and sent across to a collector tool. Following are some key attributes of the protocol:

  • Customers can use templates to define the records
  • Template descriptions are communicated by the VDS to the collector engine
  • Can report IPv6, MPLS and VXLAN flows
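
The template idea is easier to see with a tiny example. In the Python sketch below a template is simply an ordered list of field names, and a record is built by picking those fields out of a flow. The field names follow the IPFIX information element naming, but the flow data and the `export_record` function are invented for the example and say nothing about how the VDS builds its records.

```python
from typing import Dict, List

# A template describes which fields every exported record will carry.
template: List[str] = [
    "sourceIPv4Address",
    "destinationIPv4Address",
    "octetDeltaCount",
]


def export_record(template: List[str], flow: Dict[str, object]) -> Dict[str, object]:
    """Build one record containing only the fields named in the template."""
    return {field: flow[field] for field in template}


# Hypothetical flow observed on a dvPort
flow = {
    "sourceIPv4Address": "10.0.0.5",
    "destinationIPv4Address": "10.0.1.9",
    "octetDeltaCount": 4096,
    "packetDeltaCount": 12,
}
record = export_record(template, flow)
```

Because the collector receives the template description first, it knows how to interpret every record that follows.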

VDS Management Plane Scalability

Following are the scalability numbers for the VDS management plane:

  • Static dvPortgroups go up from 5K to 10K
  • Number of dvPorts goes up from 20K to 60K
  • Number of hosts per VDS goes up from 350 to 500
  • Number of VDS supported on a vCenter Server goes up from 32 to 128
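
To put those numbers in perspective, the growth factors can be worked out with a few lines of Python. The figures come straight from the list above; only the variable names are mine.

```python
# (old limit, new limit) taken from the list above
limits = {
    "static dvPortgroups": (5_000, 10_000),
    "dvPorts": (20_000, 60_000),
    "hosts per VDS": (350, 500),
    "VDS per vCenter Server": (32, 128),
}

# Growth factor of each limit, rounded to two decimals.
growth = {name: round(new / old, 2) for name, (old, new) in limits.items()}

for name, factor in growth.items():
    print(f"{name}: {factor}x")
```

So the limits grow by 2x, 3x, roughly 1.43x and 4x respectively.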

Netdump

Netdump is a vSphere ESXi platform debug feature that helps dump the VMkernel core dump to a server on the network. In vSphere 5.1, netdump support is extended to ESXi hosts without local disks, also termed stateless ESXi or Auto Deploy environments.

In vSphere 5.0, enabling netdump on an ESXi host with the management network configured on a VDS was not allowed. In vSphere 5.1, this limitation has been removed. Users can now configure netdump on ESXi hosts using a management network on a VDS.



vSphere 5.1 Announced with Enhanced vSphere Web Client

Another of the new features of vSphere 5.1 is the Enhanced vSphere Web Client. The Web Client was already part of vSphere 5 but it is now the primary client for administrators in vSphere 5.1. Some facts:

Enhanced vSphere Web Client:

The NEW virtual infrastructure client

  • Primary client for vSphere administrators in vSphere 5.1
  • Matched functionality to legacy vSphere Client
  • Additional vCenter 5.1 functionality, only available through the vSphere Web Client

Browser based

  • Internet Explorer / Firefox / Chrome fully supported (rumours are Chrome is the fastest)
  • Others (Safari, etc.) are possible (but without VM console access)

vSphere Web Client – Installation

Installer located on ISO image

Install on vCenter Server or separate server (recommended)

Login using

  • https://<FQDN or IP Address>:9443/vsphere-client/
  • Install Client Integration Plugin for console access

image

  • vSphere Web Client included with vCenter Server Appliance
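
If you want to script the login URL shown above, for instance to generate bookmarks for several vCenter Servers, a one-line helper is enough. This is a trivial Python sketch; the function name and the example hostname are placeholders.

```python
def web_client_url(host: str, port: int = 9443) -> str:
    """Build the vSphere Web Client URL for a given FQDN or IP address."""
    return f"https://{host}:{port}/vsphere-client/"


# Hypothetical hostname
url = web_client_url("vcenter01.example.com")
```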

vSphere Web Client – Object Navigator

Breaks the traditional hierarchy view of an object

  • Objects linked and displayed by relationships

image

Conventional top level hierarchy view maintained on HOME screen and links to object navigator

  • Allows an admin to view objects by solutions
  • But maintains global perspectives

image

  • Allows an admin to jump to the crucial element faster via object relationships and object search
  • Reduces client clutter and repetitive information by simplifying display of objects
  • Displayed objects are all that is communicated between server and browser

image

vSphere Web Client Interface

The new interface has the look and feel of vCloud Director but with loads of new features, and it moves to the same layout that vCenter Operations Manager, for example, already has.

image

vSphere Web Client – Plugins

Plugins are now server based

•Recreated in FLEX

•HTML Plugins (temporary work around)

VMware Plugins (90 Days post GA)

•vSphere Update Manager (VUM)

•vCenter Site Recovery Manager (SRM)

•vShield Manager

All VMware Solutions will integrate as they get updated

Third Party Plugins

•EMC, NetApp, HP, Dell etc

Centralised Log Browser

Proven framework to provide rich troubleshooting tools

vSphere Web Client plugin

Takes snapshot of specified host / vCenter logs

Provides rich user interface to review log data

  • search
  • filter by name / event / keyword
  • compare multiple logs
  • highlight key words

image

Simplifies Troubleshooting

The new vSphere Web Client looks to be a great replacement for the viclient, and with SRM and other tools tipped to integrate, it should give every vSphere administrator an easier way to manage and administer their environments along with all the stats and tools needed.

There are going to be a whole bunch of web-based tutorials for people to learn how to use the new vSphere Web Client at:

http://blogs.vmware.com/vsphere/vcenter-server/

I’m really looking forward to learning how it all works and being able to integrate all the new and existing plugins into it.

Gregg

Note: Screenshots thanks to VMware.



VCDX Spotlight: Tom Ralph

Name: Tom Ralph

Twitter Handle: @tomralph

Blog URL: http://www.virtualserverguy.com

Current Employer: VMware, Inc

VCDX #: 51

How did you get into using VMware?

I first started by purchasing an IBM xSeries 1U rack mount server off eBay to try out this new product from VMware. I fired up the server in my home office, installed ESX 2.53, and started to learn about virtualization. After the first 20 minutes, I could see that VMware and server virtualization was the future.

What made you decide to do the VCDX?

As soon as I learned of the VCDX certification, I made it a goal to achieve it and to get a number under 100.

How long did it take you to complete the whole VCDX journey?

I had already had my VCP since 2007. I first took my VMware Design Exam in March of 2010, with the Administration exam shortly after that. I then defended my design in August of 2010. I then paced around VMworld awaiting my results, which finally came 2 weeks after the show ended.

What advice would you give to people thinking of pursuing the VCDX accreditation?

If you want it, go for it! I learned more about technology, enterprise architecture, and process than I ever thought I would have. During your defence know when to say ‘I do not know’, it is a hard skill to master but a critical one. Know the smallest details of your design and know them through and through.

If you could do the whole VCDX journey again what would you do differently?

When I first attempted the VCDX certification, I was newly married to a wonderful woman that allowed me to focus 100% on the process. Now we have a 1-year-old child, I am not able to devote the time needed. I would take more time to thoroughly understand and complete my design.

Life after the VCDX?  How did your company respond?  Was it worth it?

My previous company did not know what to make of the certification or what it really meant. It wasn’t long after I got the VCDX certification that I made the choice to leave that company and move to VMware. From there, my career has blossomed and continues to do so.