TheSaffaGeek

My ramblings about all things technical



VCDX Spotlight: Kenny Garreau

Name: Kenny Garreau

Twitter Handle: @kennega

Blog URL: http://dudewheresmycloud.com

Current Employer : Lumenate

VCDX #: 115

How did you get into using VMware?

My first exposure to VMware came when I was starting as a System Admin for a financial services company. I invested a lot of personal time learning the technology, and was eventually given the task of re-architecting our virtual infrastructure. This was a formative time for learning and putting into practice VMware, networking and storage design skills before I dove into the consulting arena.

What made you decide to do the VCDX?

I spent a couple of years consulting before I felt I had enough customer presentation and design experience to suitably defend a design. The design I submitted for my VCDX application was my first design at my second consulting job, and I remember thinking “Wow, this would be a great candidate for a VCDX defense.” It turns out that VMware and the panellists agreed.

How long did it take you to complete the whole VCDX journey?

I began by completing my VCAP-DCD and VCAP-DCA at the end of October 2012. I submitted my initial design in early December of 2012 to defend at PEX. I didn’t pass, so I took a couple of months off to recharge. I went back at it for VMworld 2013, and passed. So about 10 months.

What advice would you give to people thinking of pursuing the VCDX accreditation?

Understand it’s not about the size or the complexity of the design you’re submitting to defend. It’s about your skills in designing around customer requirements and constraints, and mitigating risk to the customer and the project. Know your design, and recognize that if you include something in your design, justify it and know it. Finally, make sure your significant other knows and understands the journey you’re about to undertake. You’ll need their support, but it’s equally important to make time for them as well.

If you could do the whole VCDX journey again what would you do differently?

I would engage my fellow applicants earlier – they will be much more critical of your design going through the process than someone outside of the VCDX program. I’d try to complete my design a couple of weeks ahead of the deadline and run through a mock defense. It will help you identify weak points in your presentation, both technically and grammatically. You can then improve your design for its final submission and review by the panel.

Life after the VCDX?  How did your company respond?  Was it worth it?

Spend some time decompressing; you are going to need it! I had an overwhelming response from my co-workers, but the community response was what inspired me. Those who have been through the program realize the time and effort that goes into the entire process. To be counted among many of the very best names in datacenter and virtualization design is a humbling honour.



VCAP-CID Objective 1.2 – Identify and Categorize Business Requirements

Knowledge

 Identify discovery questions for a conceptual design (number of users, number of VMs, capacity, etc.)

  • These are the questions you are going to ask during the design workshop for the design/project. Make sure the applicable project participants/stakeholders can join the workshops (either one big workshop where people come and go at certain points, or multiple workshops where you speak to each business unit/team). For the stakeholder meetings/design workshops I personally like to try to bring in the following people. This varies depending on the project and what has been chosen, but nine times out of ten these are the people you want to speak to:
      • Virtualisation administrators (if applicable. If not already present then future administrators of the solution)
      • Server Hardware Administrators
      • Backup Administrators
      • Storage Administrators
      • Desktop/OS Administrators
      • Network Administrators
      • Application Administrators (these are very important as their applications may have very specific requirements)
      • Security Officer
      • Project Sponsors
      • End users/help desk personnel (I find this helpful for finding out what support desk tickets/problems the company is currently facing and whether these will impact the project in any way. These discussions are easy to have in the hallway or over a coffee, and they have alerted me to unknown risks that would have severely impacted the design and delivery)


Identify the effect of product architecture, capabilities, and constraints on a conceptual design.

  • I may be looking at this the wrong way, but I think this is actually about how specific products’ architecture, capabilities and constraints aren’t applicable in a conceptual design, as for a conceptual design you are only creating a “napkin” design diagram of how the whole environment is going to be delivered.

Skills and Abilities

Relate business and technical requirements to a conceptual design.

  • One of the VMware service delivery kits available to VMware partners gives a great breakdown of what requirements are and what business and technical requirements are:
    • Requirement – Documented statement that depicts the requisite attributes, characteristics, or qualities of the system
    • Business requirements – Describes what must be achieved for the system to provide value
      • System must provide self-service capability
      • System must provide x% availability
      • System must provide optimal scalability and elasticity
    • Technical requirements – Describes the properties of a system which allow it to fulfill the business requirements
      • System requires a Web portal where users can log in securely and deploy virtual machines based on defined policies
      • System must have fully redundant components throughout entire stack (host, network, storage)
      • System leverages virtualization technology and associated features
  • As mentioned, these requirements will be gleaned from the design workshops/stakeholder meetings and then put into the conceptual design. This is where you would work out if the customer requires a private, hybrid, public or even community cloud deployment. For example, if the customer requires certain data to remain in a country for regulatory reasons, then in the conceptual design you know compute resources, networking and connectivity between that country and the primary site need to be available. The speeds, number of hosts, make of hosts and amount of memory and vCPU are not in the conceptual design, as this is the “napkin” design just covering the concept of how it will all work, and they may actually change once you get to the logical and physical designs.
Number | Requirement
R001 | Virtualise the existing 6000 UK servers as virtual machines, with no degradation in performance when compared to current physical workloads
R002 | Provide an infrastructure that can deliver 99.7% availability or better
R003 | The overall anticipated cost of ownership should be reduced after deployment
R004 | Users should experience as close to zero performance impact as possible when migrating from the physical infrastructure to the virtual infrastructure
R005 | Design must maintain simplicity where possible to allow existing operations teams to manage the new environments
R006 | Granular access control rights must be implemented throughout the infrastructure to ensure the highest levels of security
R007 | Design should be resilient and provide the highest levels of availability where possible whilst keeping costs to a minimum
R008 | The design must incorporate DR and BC practices to ensure no loss of data
R009 | Management components must be secured with the highest level of security
R010 | Design must take into account VMware best practices for all components in the design as well as vendor best practices where applicable
  • For Technical Requirements a great way of doing it is to break them down into sections like:
    • Virtual Datacentre Requirements – eg: Allocation model virtual datacenters reserve 75% of CPU and memory
    • Availability Requirements – eg: VMware vCloud Director (clustering, load balancing)
    • Network Requirements – eg: Organizations have the ability to provision vApp networks
    • Storage Requirements – eg: Different tiers of storage resources must be available to the customer (Tier 1 = Gold, Tier 2 = Silver, Tier 3 = Bronze)
    • Catalogue Requirements – eg: Catalog items are stored on a dedicated virtual datacenter and dedicated storage
    • SLA Requirements – eg: SLA Requirement #1 – Networking 100%
    • Security Requirements – eg: Organizations are isolated from each other
    • Management Requirements – eg: Only technical staff uses remote console access
    • Metering Requirements – eg: Metering solution must monitor vApp power states for PAYG
    • Compliance Requirements – eg: Solution must comply with PCI standards
    • Tenant Requirements – eg: Customer requires the ability to fence off vApp deployments
  • To make sure you are doing the design in a VCDX-like manner which should push you to do it at a very high level, don’t forget to refine the customer-specific technical requirements and validate that they are specific, measurable, accurate, realistic, and testable (SMART).
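One way to check that a requirement such as R002 (99.7% availability) is measurable and testable is to convert the percentage into an allowed-downtime budget that can be compared against outage records. Below is a minimal sketch of that arithmetic. The function name, the 365-day year and the 24x7 service window are my own assumptions for illustration and are not taken from any VMware material.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year and a 24x7 service window


def allowed_downtime_hours(availability_pct: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime an availability target permits in a period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return period_hours * (1 - availability_pct / 100)


if __name__ == "__main__":
    yearly = allowed_downtime_hours(99.7)
    monthly = allowed_downtime_hours(99.7, period_hours=30 * 24)
    print(f"99.7% allows about {yearly:.1f} h per year, {monthly:.2f} h per 30-day month")
```

Under these assumptions 99.7% works out to roughly 26.3 hours of downtime a year, which gives the customer a concrete figure to agree to rather than a bare percentage.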

Gather customer inventory data.

  • This is what is going to be on the new vCloud system, whether it is existing workloads or new workloads. A good way of getting this, if the customer allows it, is to run a VMware Capacity Planner collection on the existing workloads that are going to be migrated in, so you know sizes, I/O and current state analysis values. Capacity Planner can only be run by VMware partners, so if this isn’t possible for you then manual collection and recording is going to be required. Another method is VMware vCloud Planner, which is also only available to VMware partners, so getting a VMware partner in to do this for you prior to the project running may be a good idea.
  • Also, knowing what the customer already has can help you understand possible future constraints, for example that all their current servers are IBM and so this is likely to be the server platform for this design.
  • There may also be a requirement to use existing legacy physical kit already present in the datacentre which needs to be recorded and fully understood so that the risks and constraints of using this infrastructure are fully understood. For example if you are using legacy network switches which can’t do stretched VLANs this will impact your design substantially if you have two sites and a requirement for the Management cluster to be failed over/migrated in the event of a disaster.
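If you have to fall back on manual collection, even a simple spreadsheet export can be totalled up to give a first view of the workloads being migrated. The sketch below assumes a hypothetical CSV layout (the column names `name`, `vcpus`, `memory_gb`, `storage_gb` and `peak_iops` are mine, not a Capacity Planner export format) and the sample rows are made-up illustration data.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class InventoryTotals:
    servers: int = 0
    vcpus: int = 0
    memory_gb: float = 0.0
    storage_gb: float = 0.0
    peak_iops: int = 0  # sum of per-server peaks, i.e. a worst case where all peaks coincide


def summarise_inventory(csv_text: str) -> InventoryTotals:
    """Total up a manually collected server inventory held as CSV text."""
    totals = InventoryTotals()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals.servers += 1
        totals.vcpus += int(row["vcpus"])
        totals.memory_gb += float(row["memory_gb"])
        totals.storage_gb += float(row["storage_gb"])
        totals.peak_iops += int(row["peak_iops"])
    return totals


# Hypothetical sample data, for illustration only.
SAMPLE = """name,vcpus,memory_gb,storage_gb,peak_iops
web01,2,8,100,150
db01,8,64,500,2200
app01,4,16,200,400
"""

if __name__ == "__main__":
    print(summarise_inventory(SAMPLE))
```

Totals like these are only a starting point for the current state analysis. They say nothing about when the peaks occur, which is the kind of detail a proper collection tool would give you.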

Determine customer business goals.

  • Plainly, this is what the customer is looking to gain from the deployment of this solution. At the end of the project, what do they hope to achieve? These goals are sometimes not as clear as you may hope, as people have different ideas of what they want the solution to achieve. As the architect you will need to take all these business requirements, set expectations if they are unrealistic for reasons like cost or pre-selected hardware, then define them and get sign-off from the customer that they agree to them before any additional work is done. This is very important: if these aren’t defined and agreed to by the customer, scope creep can happen, which could cause the project to fail.

Identify requirements, constraints, risks, and assumptions.

  • I’m not going to go into great depth here as I think the definitions of each will give you a good idea of what each is. During the design workshops/stakeholder meetings these are worked out, recorded and agreed to by the customer. Always remember that for any design you need to collect all of these and then look at it in a holistic manner and understand the impacts of each decision.
    • Requirements – Documented statement that depicts the requisite attributes, characteristics, or qualities of the system. See above portions around Business and Technical requirements plus the examples.
    • Constraints – Requirements that restrict the amount of freedom in developing the design
      • Hardware which already exists and must be used (for example, host or storage array)
      • Physical limitations (distance between sites, datacenter space)
      • Cost $$$
    • Risks – Potential issues that may negatively impact the reliability of the design
      • Lack of redundancy for specific hardware component
      • Support staff has not had any training
    • Assumptions – Suppositions made during the design process regarding the expected usage and implementation of a system
      • Provides a sounding board for design decisions which must be validated
      • Hardware required is installed before vCloud implementation
      • Network bandwidth is not a limiting factor for external end users
      • Appropriate training is provided to existing technical staff
    • I like to highlight assumptions and risks to the customer right away, as you normally don’t want any assumptions if possible. The assumptions you do record in your design should already have been realistically clarified, so that they are only there to ensure that if something the customer promised isn’t in place, you can refer them to the assumptions they signed off.
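The four categories above can be kept in one register so that nothing goes into the design without customer sign-off. This is only a sketch of how I might track it. The class and function names and the `C001`/`RK001`/`A001` identifiers are hypothetical, while the statements themselves are taken from the examples above.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    REQUIREMENT = "requirement"
    CONSTRAINT = "constraint"
    RISK = "risk"
    ASSUMPTION = "assumption"


@dataclass
class DesignItem:
    item_id: str            # hypothetical numbering scheme
    kind: Kind
    statement: str
    signed_off: bool = False


def awaiting_sign_off(items: list[DesignItem]) -> list[str]:
    """Return the IDs of items the customer has not yet agreed to."""
    return [item.item_id for item in items if not item.signed_off]


def by_kind(items: list[DesignItem], kind: Kind) -> list[DesignItem]:
    """Return only the items of one category, e.g. all assumptions."""
    return [item for item in items if item.kind is kind]


REGISTER = [
    DesignItem("R002", Kind.REQUIREMENT, "Provide 99.7% availability or better", signed_off=True),
    DesignItem("C001", Kind.CONSTRAINT, "Hardware which already exists must be used"),
    DesignItem("RK001", Kind.RISK, "Support staff has not had any training"),
    DesignItem("A001", Kind.ASSUMPTION, "Hardware required is installed before vCloud implementation"),
]

if __name__ == "__main__":
    print("Awaiting sign-off:", awaiting_sign_off(REGISTER))
    print("Assumptions:", [item.statement for item in by_kind(REGISTER, Kind.ASSUMPTION)])
```

Keeping everything in one list makes it easier to look at the design holistically, and filtering on the sign-off flag shows at a glance what still needs to be agreed with the customer.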

Given customer requirements and product capabilities, determine the impact to a conceptual design.

  • This I think is covered above in places but is also something you can only really learn from actually doing a design and understanding how requirements shape a design and what impacts each of them have. On a conceptual design it isn’t as much of an impact as in a logical and physical design but limitations like keeping workloads in specific geographies and the capability of vCloud stretched clusters between the two locations for example are something that will impact the conceptual design. I would also read the Service definitions listed below in the recommended tools from the blueprint and the implementation examples from the vCAT.

Tools

If you feel I have missed something or am wrong on something then please do comment, as I don’t claim to be the best, am always learning, and welcome constructive criticism and feedback.

Gregg



VCAP-CID Objective 1.1 – Create a Conceptual Design Based on Business Requirements

Because of an imminent customer engagement I am due to be working on, I have been refining my vCloud skills and dusting away the cobwebs. One of these tasks was to book and sit the VCP5-IaaS so that it forced me to learn the basics again and be sure I had a solid base knowledge with no gaps. My experience of the exam and the resources I used for it are mentioned in my VCP5-IaaS Exam Experience blog posting. I have now been using the VCAP-CID blueprint as a structure for perfecting my vCloud design skills, so I thought I would slowly post up each objective for my own benefit and hopefully also to help other people looking to take the VCAP-CID. I will be consolidating all the objectives on my blog page here

Skills and Abilities

  • Distinguish between virtualization, automation and cloud computing.

    • This could be defined in a number of ways (I’m more than happy to be corrected here) but the way I piece it all together is:
      • Virtualization is what VMware has been doing for years with vSphere and its complementing technologies. This is nothing new to anyone preparing for this exam and if it is then I hate to tell you this but this exam isn’t for you.
      • Automation ties perfectly into the NIST definition of on-demand self-service, which is: Unilaterally provision computing, as needed, automatically without requiring human interaction
        • This can be done through multiple technologies and mechanisms like VMware’s vCenter Orchestrator, vCAC, vFabric Application Director and third party tools like Puppet, Razor and IBM’s Virtualization Automation solution. Without true automation you can’t have a Cloud.
      • Cloud computing is perfectly defined by the industry recognised NIST cloud requirements which are:
        • On-demand self-service: Unilaterally provision computing, as needed, automatically without requiring human interaction
        • Broad network access: Capabilities are available over the network and accessed through standard mechanisms
        • Resource pooling: The provider’s computing resources are pooled with virtual resources dynamically assigned and re-assigned according to consumer demand.
        • Rapid elasticity: Capabilities can be rapidly and elastically provisioned, in some cases automatically, to quickly scale out and be rapidly released to quickly scale in.
        • Measured service: Cloud systems automatically control and optimize resource use by leveraging a metering capability. Resource usage can be monitored, controlled, and reported providing transparency of the utilized service.
      • VMware’s IaaS definition, from which they define the VMware vCloud blueprint, is:
        • A cloud must be built on a pooled, virtual infrastructure. Pools include not only CPU and memory resources but also storage, networking, and associated services.
        • The cloud should provide application mobility between clouds, allowing the consumer to enter and leave the cloud easily with existing workloads. The ability to use existing consumer tools to migrate workloads to or from the cloud is highly desirable. Mobility of workloads between clouds requires cross-cloud resource management.
        • The cloud should be open and interoperable, allowing the consumption of cloud resources over open, Internet-standard protocols. Access to cloud resources does not require any other specific network protocols or clients.
        • Cloud consumers should pay only for resources they consume or commit to consuming.
        • The cloud should be a secure, trusted location for running cloud consumer workloads.
        • Cloud consumers should have the option and the ability to protect their cloud-based workloads from data loss.
        • Cloud consumers are not responsible for the maintenance of any part of the shared infrastructure and do not need to interact with the cloud provider to maintain the infrastructure. They are not responsible for storage and network maintenance, ongoing cloud infrastructure patches, or business continuity activities. The cloud should be available to run high-availability workloads, and any faults occurring in the cloud infrastructure should be transparent to cloud consumers as a result of built-in availability, scalability, security, and performance guarantees.
  • Distinguish between private, public, hybrid and community cloud computing.

    • These are defined perfectly in the vCAT 3.1 introduction document as:
      • Private cloud: A private vCloud (also known as an internal vCloud) operates on private networks, where resources are accessible behind the firewall by a single company. In many cases, all the tenants share one legal entity. For example, a university might offer IaaS to its medical and business schools, or a company might do the same for various groups or business units. The private vCloud can be managed by the enterprise and hosted on premise or operated on a dedicated infrastructure provided by a vCloud service provider or systems integrator. In any case, a private vCloud must conform to the organizational security constraints.
      • Public cloud: A public vCloud offers IT resources as a service through external service providers and is shared across multiple organizations or the Internet. This can be viewed as a vCloud infrastructure that is operated by one organization for use by multiple, legally separated organizations. A public vCloud is provisioned for open access and might be owned, managed, and operated by one or more entities. A public vCloud provider might also support a private, community, or hybrid vCloud.
      • Hybrid cloud: A hybrid vCloud combines the benefits of the private and the public vCloud, with flexibility and choice of deployment methods. A hybrid vCloud consists of multiple, linked vCloud infrastructures. These distinct vCloud infrastructures can be private, community, or public, but they must meet a set of requirements defined by the providers and agreed to by the consumers. Connecting these vCloud instances requires data and application mobility as well as management. When load-balancing between vCloud instances (cloud bursting), use a consistent monitoring and management approach when migrating an application or data workload.
      • Community cloud: A Community vCloud is a specific public vCloud use case where the cloud is shared, and typically owned, by a group of organizations with a common set of requirements. In many cases, the organizations also include some level of legal separation. Community vCloud resources are shared, with some parts under central control and other parts with defined autonomy. A vCloud built for government, education, or healthcare might be an example of a community vCloud. A community vCloud can be offered by a traditional service provider, by a member of the community, or by a third-party vendor and hosted on one or more sites. It can be placed on-premise at one or more of the organizations’ sites, off-premise at a vCloud provider site, or both on- and off-premise.

 

  • Analyze a customer use case to determine how cloud computing can satisfy customer requirements.

    • For this I would recommend you read the Service Definitions document from the vCAT, as it covers all the definitions and how they map to and fulfil customer requirements. The VMware vCloud Implementation Examples document, also from the vCAT, shows how varying implementations can benefit businesses in differing ways.

 

  • Given a customer use case, determine the appropriate cloud computing model.

    • This is one I feel you can only do once you have a firm understanding of the capabilities of all the different Cloud offerings, how each of them meets varying requirements, and what constraints/disadvantages each has.
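As a memory aid for the four deployment models quoted above, the defining trait of each can be reduced to a yes/no question. The sketch below is a deliberate over-simplification of my own, not something from the vCAT: the field names, the function and the order of the checks are all assumptions, and a real use case needs the full set of requirements and constraints weighed up.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    single_company: bool         # resources accessed behind the firewall by one company
    shared_by_community: bool    # group of organizations with a common set of requirements
    links_multiple_clouds: bool  # multiple, linked vCloud infrastructures


def suggest_model(use_case: UseCase) -> str:
    """Suggest a starting-point deployment model from three yes/no traits."""
    if use_case.links_multiple_clouds:
        return "hybrid"
    if use_case.shared_by_community:
        return "community"
    if use_case.single_company:
        return "private"
    return "public"


if __name__ == "__main__":
    # A university offering IaaS to its own schools, as in the private vCloud definition.
    university = UseCase(single_company=True, shared_by_community=False,
                         links_multiple_clouds=False)
    print(suggest_model(university))
```

The value of writing it down like this is mostly in seeing where it breaks. Most real customers answer "yes" to more than one question, which is exactly why you need to understand the constraints of each offering.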



VCAP5-DCD Retake

This Monday I re-sat my VCAP5-DCD exam after having marginally failed it the first time in January this year. I wrote a fairly extensive blog posting about my opinions of the exam and the additional resources I planned to use. I would recommend reading that posting first if you haven’t, as I still maintain that 95% of what I said in there is true about the exam. This time I thankfully passed, and with a not-bad score of 333 too.

 

Resources used:

 

For this attempt I did use a fair portion more resources and actually think I studied more this time than I did for my first attempt. I thought I would list the resources I used or re-used for this attempt, and I am planning on adding them to my VCAP5-DCA & DCD Study Resources page if they aren’t mentioned on there already:

 

– I read the official VMware book Building a Virtual Datacenter to try to help me get the holistic view and mentality you have to maintain during the build of a virtual datacenter, and how every decision can have an impact on another portion of your environment and design. The book was really good and I would recommend it, but I have to admit I did skip certain portions as I had already covered them in books that handled them much better and in more depth.

 

– I bought the Kindle version of the new VMware vSphere Design book from Forbes Guthrie and Scott Lowe. I bought the Kindle version as the paperback wasn’t out in Europe for a while and my timeframes for studying were very tight. The book is utterly brilliant and covers both vSphere 5 and 5.1, and I would HIGHLY recommend it for the exam and for anyone who works with VMware.

 

– As I stated I would, I read the VMware Press book Managing and Optimizing VMware vSphere Deployments by Harley Stagner and Sean Crookston, which helped me gain more knowledge around all the portions of a design and the links between each component in the design. The main piece from this book that I really liked was the operational portions, as you can’t do a design without having the end goal and plan of it being able to run for a long time after you have left (if you are a consultant like I am).

 

– The main thing I really focused on was going through the whole vSphere Design workshop course notes, lab guides and answers to the lab guides, making sure I understood every single portion and why certain decisions were made by VMware in the completed designs of the labs. If you haven’t been on the course I would beg management to put you on it, as it covers every portion you need to know for the exam and gives some great tips for the exam (no, I can’t tell you what these are).

 

Exam experience:

I was more nervous for this attempt than for my first, as I really wanted to pass this time. With a five-week-old little one my studying schedule took a knock, and I actually postponed the exam by two weeks from its initial date because I had not got through the portions I wanted to before the attempt.

 

Once I got into the exam and started making my way through the questions, I became more and more confident with each question I felt I had got correct or very close to correct. I also think I managed my time a bit better this time and wasn’t as overwhelmed by what they were asking of me. Before the exam starts they tell you how many Visio-style questions you are going to get, so I wrote down the numbers (1-6 for me) and marked them off after each one so that I knew how my time management was going. I did have two drag-and-drop questions in my last three questions, which used up my time and meant I only had around 8 minutes left by the time I completed the last question. The result came up very quickly and I was in shock that it stated congratulations. I actually started feeling dizzy, having not been able to eat much before the exam due to feeling sick from nerves and not having drunk much as I knew I couldn’t afford toilet breaks.

 

Tips:

 

For this attempt I came across and learnt a few tips for the exam which helped me with the Visio-style questions and allowed me to be sure portions were connected correctly.

 

– There is a scissors icon beside the bin in the bottom right-hand corner that allows you to cut a connector/connection you have made in error without moving loads of portions across the page by trying to drag the connection to the bin. I made this drag-and-drop mistake a few times in my first attempt and it really hurt me, as it moved portions off the screen and meant I had to redo pieces.

– Make sure connections have stuck to boxes by carefully trying to move the box and seeing if the connector follows. This is related to the tip above and is a good way to make sure you have connected the boxes correctly. Also make sure you connect the correct portions together: once or twice I noticed I hadn’t clicked the correct piece, so the pieces I meant to connect were actually not connected. Be careful where you click.

– Do practice designs at home in Paint, Visio or even Word so that you can visualise how you would do different Visio-style design scenarios. Then when you are in the exam and maybe see one of them, you know what your final design should look like.

 

Conclusion/what’s next:

 

So now that I have both my VCAP5-DCA and DCD I can start designing my VCDX infrastructure and submit the design for defence for the VCDX5 accreditation. I still need to do some soul searching and decide when I want to submit as it’s a serious amount of work to complete all the required documents and my planned design is only about 60% where I want it to be before submitting it so I’m estimating around 40 hours of work to get it all ready which isn’t easy to find with a 5 week old, a full time job as a consultant and my sanity maintained. I will most likely slowly start building my design and documents and submit for PEX early next year although I may be drawn to do it sooner or later.

 

For those looking to do either of the exams, I would recommend starting right away and booking a date so that you are pushed to get through everything. The exams are very challenging, but there are amazing resources out there which will help you gain the knowledge to pass, and with loads of lab time and practice you can pass them. Good luck to all those who are preparing or looking to do the exams, and hopefully my resources page and this blog help you.

 

Gregg



Safe and Legit Storage Design Completed

Below are my thoughts, the additional questions I felt needed to be asked/things to be clarified, and the design decisions, justifications and impacts of those decisions for the Safe and Legit storage design. If you missed the posting where I detailed the mock scenario you can read it here 

 

Note: This is a learning exercise for me so if you feel I’ve missed something or made a wrong decision then please write it in the comments and I’m more than happy (it was one of the main reasons I’m looking to do this series of postings) to discuss and I’ll amend the design accordingly if it makes sense and hopefully I along with other people reading these postings will learn from it and become better.

 

Additional Questions

As I said there probably would be, there are additional questions. Something I feel is really important when doing real-world designs is trying to think of as many questions as possible around a customer’s requirements, so that you can ensure you have their requirements recorded correctly and that they aren’t vague. The additional questions and the answers to them are listed below:

 

Q: Is there any capability of utilising the existing storage in the privately owned UK DC?

 

A: Due to the consolidation and migration of the other UK DCs and the current workloads in the privately owned DC, a new SAN is a better option: the existing SAN is 3 years old now, so it is more cost effective to purchase a new one. Also, due to the probable need for auto-tiered storage to meet the customer’s requirements, a new SAN with these capabilities is needed.

 

Q: Is there no way a minimal planned outage/downtime can be organised for the migration of the workloads due to the likely higher cost of equipment to ensure this near-zero downtime?

 

A: The customer would prefer to try to keep to the near-zero downtime, so it is agreed that after the conceptual design of the storage and the remaining components in the whole design, further meetings can be held to discuss a balance between cost and the desire for near-zero downtime.

 

Q: With the leasing out of the private level 4 suites in the future, will there be a requirement to manage/host other companies’ processes and data within the infrastructure being designed?

 

A: No, there is currently no plan to do this due to security concerns and the number of compliance regulations Safe and Legit need to maintain and fulfil. There is however a possibility of internal consumption, with other departments being charged for usage of the DC’s resources.

 

Q: What other questions do you feel should be asked?

Additional Functional Requirements

-5K 3rd party users will need to be able to gain access into the environment without any impact during the migration and consolidation

-Rented DC’s kit needs to be fully migrated to the privately owned datacenter before Q1 2015 to ensure the contracts don’t need to be renewed

Constraints

Below are the constraints I felt were detailed in the scenario. These will possibly change as I go further through all the other sections but so far these are the ones I felt were applicable:

– Usage of EMC kit

– Usage of Cisco kit

– Usage of the privately owned DC’s physical infrastructure for the consolidation of all three UK DC’s.

Assumptions

Below are the assumptions I felt had to be made. These will possibly change as I go further through all the other sections. Normally I try to keep assumptions as minimal as possible, but for a project of this size it would be extremely difficult not to have any, as you do have to trust that certain things are in place:

– There is sufficient bandwidth between the UK DC’s to allow migration of the existing workloads with as little impact to the workloads as possible

– All required upstream dependencies will be present during the implementation phase.

– There is sufficient bandwidth into and out of the privately owned DC to support the bandwidth requirements of all three DC’s workloads

– All VLANs and subnets required will be configured before implementation.

– Storage will be provisioned and presented to the VMware ESXi hosts accordingly.

– Power and cooling in the privately owned DC can cope with the addition of the physical infrastructure required for the virtual infrastructure while, for a certain amount of time, older physical machines are still running alongside it

– Safe and Legit have the existing internal skillset to support the physical and virtual infrastructure being deployed.

– There are adequate licences for the OS and applications required for the build

 

Risks

– The ability of ensuring near-zero downtime during the migration of workloads to the privately owned DC may be at risk due to budget constraints impacting the procurement of the required infrastructure to ensure zero downtime

Storage Array

Design Choice
– EMC FC SAN with two 8Gb storage processors (SPs)

Justification
– EMC, due to the constraint of having to use EMC storage because of previous usage
– EMC VNX 5700 with Auto-Tiering enabled
– 8Gb to ensure high transmission speeds to the storage; 16Gb is too high and expensive for this design

Design Impacts
– Switches will need to be capable of 8Gb connectivity
– FC cabling needs to be capable of transmitting at 8Gb speeds
– HBAs on the ESXi hosts need to be capable of 8Gb speeds
   

Number of LUNs and LUN sizes

Design Choice
– 400 x 1TB LUNs will be used

Justification
– Each VM will be provisioned with an average of 50GB of disk
– With around 15 VMs per LUN plus 20% headroom for swap and snapshots: 15 x 50GB / 0.8 = 937.5GB, which fits within a 1TB LUN
– 6000 total VMs / 15 VMs per LUN = 400 LUNs

Design Impacts
– Tiered storage will be used with auto-tiering enabled to balance storage costs with VM performance requirements
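The LUN sizing arithmetic above can be re-run for different VM counts or headroom values with a few lines of Python. This is only a sketch of the same calculation: the function name `lun_plan` and its parameter names are made up for illustration, and treating 1TB as 1024GB is my assumption rather than something stated in the scenario.

```python
import math

def lun_plan(total_vms, avg_vm_disk_gb, vms_per_lun, headroom_fraction):
    """Return (GB needed per LUN, number of LUNs) for the given inputs."""
    # VM disk per LUN, grossed up so that headroom_fraction of the LUN
    # stays free for swap files and snapshots (the "/ 0.8" step above).
    gb_per_lun = vms_per_lun * avg_vm_disk_gb / (1 - headroom_fraction)
    # Round up so that a partly filled last LUN is still counted.
    lun_count = math.ceil(total_vms / vms_per_lun)
    return gb_per_lun, lun_count

# Figures from the design: 6000 VMs, 50GB average, 15 VMs per LUN, 20% headroom.
gb_per_lun, lun_count = lun_plan(6000, 50, 15, 0.20)
fits_in_1tb = gb_per_lun <= 1024  # assumption: 1TB taken as 1024GB
print(round(gb_per_lun, 1), lun_count, fits_in_1tb)
```

Changing the VMs-per-LUN value shows quickly how sensitive the LUN count and the required LUN size are to that one choice.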
   

Storage load balancing and availability

Design Choice
– EMC PowerPath/VE multipathing plug-in (MPP) will be used.

Justification
– EMC PowerPath/VE leverages the vSphere Pluggable Storage Architecture (PSA), providing performance and load-balancing benefits over the VMware native multipathing plug-in (NMP).

Design Impacts
– Requires additional cost for PowerPath licenses.
   

VMware vSphere VMFS or RDM

Design Choice
– VMFS will be used as the standard unless there is a specific need for raw device mapping. This will be decided on a case-by-case basis

Justification
– VMFS is a clustered file system specifically engineered for storing virtual machines.

Design Impacts
– The datastores must be created with the VMware vSphere Client to ensure correct disk alignment
   

Host Zoning

Design Choice
– Single-initiator zoning will be used. Each host will have two paths to the storage ports across separate fabrics.

Justification
– This keeps to EMC best practices and ensures there is no single point of failure, with multiple paths to targets across multiple fabrics

Design Impacts
– Zones will need to be created for each initiator by the storage team
   

LUN Presentation

Design Choice
– LUNs will be masked consistently across all hosts in a cluster.

Justification
– This allows virtual machines to be run on any host in the cluster and ensures both HA and DRS optimisation

Design Impacts
– The storage team will need to control and deploy this, as the masking is done on the storage array
   

Thick or Thin disks

Design Choice
– Thin provisioning will be used as the standard unless there is a specific need for thick provisioned disks. This will be decided on a case-by-case basis

Justification
– The rate of change for a system volume is low, while data volumes tend to have a variable rate of change.

Design Impacts
– Alarms will need to be configured so that, if disks approach an out-of-space condition, there is ample time to provision more storage
   

Virtual Machine I/O Priority

Design Choice
– Storage I/O Control will not be used

Justification
– The storage utilises Auto-Tiering/FAST, which balances at the block level and is therefore a better way of balancing
– VMware SRM is likely to be used, and SDRS and SIOC are not supported with it

Design Impacts
– FAST/Auto-Tiering will need to be configured correctly by the storage vendor
   

Storage Profiles

Design Choice
– Storage Profiles will not be configured

Justification
– Storage will be managed by the storage team

Design Impacts
– The storage team will need to configure storage as the virtual infrastructure requires
   

Describe and diagram the logical design

Attribute: Specification
Storage Type: Fibre Channel
Number of Storage Processors: 2 (to ensure redundancy)
Number of Fibre Channel Switches (if any): 2 (to ensure redundancy)
Number of ports per host per switch: 1
Total number of LUNs: 400 (as mentioned above)
LUN Sizes: 1TB (as mentioned above)
VMFS datastores per LUN: 1

[Image: logical design diagram]

Describe and diagram the physical design

Array vendor and model: EMC VNX 5700
Type of array: Active-Active
VMware ESXi host multipathing policy: PowerPath/VE MPP
Min/Max speed rating of storage switch ports: 2Gb/8Gb

I’m looking for the correct EMC diagrams to create the physical design diagram, so I will update this posting this week with the diagram, promise :)

Well, that’s my attempt at the storage design portion of Safe and Legit. Hopefully people will agree with most of the decisions I’ve made, if not all of them. I have to admit it took me most of my Sunday just to do this piece and think of all the impacts and, as stated, there may be additional constraints and risks further down the line.

 

Gregg



Safe and Legit Storage Design

In my previous posting I created a fictitious company that requires you, as the VMware Architect, to design them a vSphere 5.0 environment that meets all their requirements whilst keeping within their constraints and mitigating risks. I didn’t list the constraints or the risks, as I felt learning how to define these is very important in preparation for the VCAP5-DCD and for vSphere designs in real-life practice.

The first portion of the design I’m hoping to create (and get everyone’s opinions, participation and comments on) is the storage design. Below are the portions I will be trying to fill out for the Safe and Legit scenario. Hopefully people who also want to learn and participate will fill out each of the sections with their own design decisions, and then we can compare and learn together and from each other.

Storage Array

Design Choice
Justification
Design Impacts

Number of LUNs and LUN sizes

Design Choice
Justification
Design Impacts

Storage load balancing and availability

Design Choice
Justification
Design Impacts

VMware vSphere VMFS or RDM

Design Choice
Justification
Design Impacts

Host zoning

Design Choice
Justification
Design Impacts

LUN Presentation

Design Choice
Justification
Design Impacts

Thick or Thin disks

Design Choice
Justification
Design Impacts

Virtual Machine I/O Priority

Design Choice
Justification
Design Impacts

Storage Profiles

Design Choice
Justification
Design Impacts

Describe and diagram the logical design

Attribute Specification
Storage Type
Number of Storage Processors
Number of Fibre Channel Switches (if any)
Number of ports per host per switch
Total number of LUNs
LUN Sizes
VMFS datastores per LUN

Describe and diagram the physical design

Array vendor and model
Type of array
VMware ESXi host multipathing policy
Min/Max speed rating of storage switch ports

Loads of bits to decide and design. I’m hoping to have my storage design decisions, and what I thought were the constraints and risks for the design, up by the end of the week, and if not then by next week Monday at the latest, in my next posting. Happy designing ;)

Gregg



VCAP5-DCD Design Practice

As some people may know, I am currently preparing to re-take my VCAP5-DCD, and I have reached the point in my preparations where I am doing mock designs and going through the labs from the VMware Design Workshop. So I thought I would follow the same idea and create a mock customer design scenario, and also put down the same vein of questions I am being asked in the design workshop labs. Hopefully, if people are interested, they can use it to write down their design choices, the justifications for these choices and the impacts these choices have on the rest of the design, and everyone will learn from this. Below is a company profile that I made up; I also used some ideas from a scenario that Matt Mould, one of my Xtravirt colleagues, sent me a few months back:

Company Profile
•    Safe & Legit are a global trading company – they specialise in ground defence equipment
•    13,000 physical servers across 9 sites.
o    6k  UK (3 sites)
o    2k  CN (3 sites)
o    5k  US (3 sites)
•    There are two level 4 DC’s per country (for info on DC levels see http://en.wikipedia.org/wiki/Data_center)
•    DC’s are linked by an MPLS cloud from BT, Verizon, Colt and NTT (contracts end Q1 2015)
•    One DC per country is privately owned and Safe & Legit want to retain the real estate, but make room to lease out sought after level 4 private suites, thus providing a new revenue stream, and hopefully make their own DC’s cost neutral in doing so. Therefore they are looking to virtualise as much of their physical estate as possible into vSphere 5.0
•    The remaining DC’s are rented from BT, Verizon and NTT (contracts end Q1 2015). The CFO has voiced his desire to cut the cost of these rentals and would ideally like not to have to renew the contracts.
•    ERP is centralised in the UK
•    Each country has locally hosted Print, Domain, UC & Messaging
•    Collaboration is centralised, again in the UK
•    Typical/normal file sharing is not permitted, all ‘matter’ is recorded and audited in Safe & Legit’s collaboration system
•    With the exception of ERP, all systems must move to a shared or distributed model. This is following a series of natural disasters in the US and China, that could have been avoided by having a DR and BC plan in place.
•    All communication end points are encrypted, but new legislation is relaxing where encryption is required. This is achievable following an ERP upgrade that separates out sensitive and non-sensitive data.
•    There are up to 5,000 3rd party users that own a license to trade under Safe & Legit LLC. Licensees are dropping as the competition develop newer, faster and cheaper ways to deliver access to their trading systems, while Safe & Legit still require you to purchase expensive fixed private comms to deliver their trading apps. They do not want these 3rd party users to be impacted at all during the migrations, and want a near-zero RTO and RPO.

•   The UK site has been chosen as the first site to be migrated. Due to Safe and Legit’s work on ground defence equipment, they have not authorised the running of a capacity planner collection, as they don’t want their data to leave the premises. They have, however, calculated that for each site to be virtualised the environment must be able to meet the following values:

-The 6k physical servers in the UK comprise 2000 Linux servers and 4000 Windows servers

-On average each Windows server is provisioned with a 20GB boot disk (average used is 15GB) and a 50GB data disk (average used is 30GB)

– Each Linux server is configured with 60GB total storage (average used is 30GB)

– Safe and Legit expect a 10 percent annual server growth over the next three years

-Safe and Legit have a long standing vendor relationship with EMC and Cisco and so have requested the usage of their equipment due to this relationship and in house knowledge of the administration of these vendor products

-They have created the following two tables from internal analysis and monitoring:

CPU Resource Requirement
Metric: Amount
Avg # of CPUs per physical server: 4
Avg CPU MHz: 3,400 MHz
Avg normalised CPU MHz: 1,240
Avg CPU utilisation per physical system: 5% (170 MHz)
Avg peak utilisation per physical system: 8% (272 MHz)
Total CPU resources required for 1k VMs at peak: 272,000 MHz

RAM Resource Requirement
Metric: Amount
Avg amount of RAM per physical system: 4096MB
Avg memory utilisation: 30% (1228.8MB)
Avg peak memory utilisation: 80% (3276.8MB)
Total RAM required for 1k VMs at peak before memory sharing: 3,276,800MB
Anticipated memory sharing benefit when virtualised: 50%
Total RAM required for 1k VMs at peak with memory sharing: 1,638,400MB
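The totals in the two tables follow directly from the per-server averages, so they can be reproduced and scaled with a short Python sketch. The function name `peak_totals` and its parameters are made up for illustration, and the 6000-VM line assumes the UK estate scales linearly from the per-1k figures, which the scenario does not state.

```python
def peak_totals(vm_count, cpu_mhz, peak_cpu_pct, ram_mb, peak_ram_pct, sharing_pct):
    """Return (peak CPU MHz, peak RAM MB, peak RAM MB after memory sharing)."""
    # Multiply before dividing so the whole-number percentages stay exact.
    total_cpu_mhz = vm_count * cpu_mhz * peak_cpu_pct / 100
    total_ram_mb = vm_count * ram_mb * peak_ram_pct / 100
    ram_after_sharing_mb = total_ram_mb * (100 - sharing_pct) / 100
    return total_cpu_mhz, total_ram_mb, ram_after_sharing_mb

# Table values: 3,400 MHz at 8% peak, 4096MB at 80% peak, 50% memory sharing.
per_1k = peak_totals(1000, 3400, 8, 4096, 80, 50)
# Assumption: the 6000 UK servers behave like six lots of the 1k figures.
uk_estate = peak_totals(6000, 3400, 8, 4096, 80, 50)
print(per_1k)
print(uk_estate)
```

The first result should match the table rows for 1k VMs; the second gives a rough starting point for the UK host sizing before growth is added.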

Business Requirements

From workshops and SME meetings the following requirements were collected

Number Requirement
R001 Virtualise the existing 6000 UK servers as virtual machines, with no degradation in performance when compared to current physical workloads
R002 To provide an infrastructure that can provide 99.7% availability or better
R003 The overall anticipated cost of ownership should be reduced after deployment
R004 Users to experience as close to zero performance impact as possible when migrating from the physical infrastructure to the virtual infrastructure
R005 Design must maintain simplicity where possible to allow existing operations teams to manage the new environments
R006 Granular access control rights must be implemented throughout the infrastructure to ensure the highest levels of security
R007 Design should be resilient and provide the highest levels of availability where possible whilst keeping costs to a minimum
R008 The design must incorporate DR and BC practices to ensure no data is lost
R009 Management components must be secured with the highest level of security
R010 Design must take into account VMware best practices for all components in the design as well as vendor best practices where applicable
R011 Any others you think I have missed from the scenario
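To make R002 a bit more tangible, an availability percentage can be turned into an allowed-downtime budget. The sketch below assumes a 365-day year and that availability is measured over the whole year, neither of which is stated in the scenario; the function name `allowed_downtime_hours` is made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, measured 24x7

def allowed_downtime_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Return the hours of downtime permitted within the period."""
    # Whatever is not "available" is the downtime budget.
    return period_hours * (100 - availability_pct) / 100

r002_budget = allowed_downtime_hours(99.7)
print(round(r002_budget, 2), "hours per year")
```

Under these assumptions 99.7% works out to roughly 26 hours of downtime a year, which is a useful number to put in front of the customer when confirming the requirement.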

Additional Functional Requirements (From Storage Design posting)

-5K 3rd party users will need to be able to gain access into the environment without any impact during the migration and consolidation

-Rented DC’s kit needs to be fully migrated to the privately owned datacenter before Q1 2015 to ensure the contracts don’t need to be renewed

Constraints and Risks

You tell me in the comments :)

Constraints from Storage Design posting:

– Usage of EMC kit

– Usage of Cisco kit

– Usage of the privately owned DC’s physical infrastructure for the consolidation of all three UK DC’s.

Risks from Storage Design posting:

– The ability of ensuring near-zero downtime during the migration of workloads to the privately owned DC may be at risk due to budget constraints impacting the procurement of the required infrastructure to ensure zero downtime

Additional Questions (from Storage Design posting)

Something I feel is really important when doing real-world designs is thinking of as many questions as possible around a customer’s requirements, so that you can ensure their requirements are recorded correctly and aren’t vague. The additional questions and the answers to them are listed below:

Q: Is there any capability of utilising the existing storage in the privately owned UK DC?

A: With the consolidation and migration of the other UK DCs on top of the current workloads in the privately owned DC, a new SAN is the better option: the existing SAN is now 3 years old, so it is more cost effective to purchase a new one. Also, due to the probable need for auto-tiered storage to meet the customer’s requirements, a new SAN with these capabilities is needed.

Q: Given the likely higher cost of the equipment needed to ensure near-zero downtime, is there no way a minimal planned outage/downtime can be organised for the migration of the workloads?

A: The customer would prefer to try to keep to near-zero downtime, so it is agreed that after the conceptual design of the storage and the remaining components in the whole design, further meetings can be held to discuss a balance between cost and the desire for near-zero downtime.

Q: With the leasing out of the private level 4 suites in the future, will there be a requirement to manage/host other companies’ processes and data within the infrastructure being designed?

A: No, there is currently no plan to do this due to security concerns and the number of compliance regulations Safe and Legit need to maintain and fulfil. There is, however, a possibility of internal consumption, with other departments being charged for their usage of the DC’s resources.

Summary

So that is the company profile and my idea around it. I obviously created 90% of the above from my head, so there will be additional questions around it, but I think this gives a really solid amount of information for people to start thinking. I’m going to do the first posting around Storage Design for Safe and Legit quite soon and will put up the questions and components you normally have to think of, but if people want to think about what they would choose beforehand, then hopefully we can get a good discussion going around it.

As I add each section to the design I am hoping to keep updating this posting and then once complete making it all linked on a single page on my blog

Gregg



VCAP5-DCD : My Experiences

I thought I would put out a posting about my experience of the VCAP5-DCD exam I sat yesterday, what I felt helped me in my preparations, and what I plan to use to better my knowledge for my resit.

Yep, I am going to need to re-sit the exam as unfortunately I just failed it. I do feel that what I studied was extremely helpful, as without having done it I wouldn’t have been close, so that is very positive, and now I have a great idea of what I need to do in preparation for my retry.

 

The Resources I used this time

The resources I used for yesterday’s attempt at the exam were quite extensive to say the least, but I am learning design almost from the ground up, as I have only been doing enterprise-level designs for the past year, having previously been a VMware Administrator. The resources I used are on my page here, but I wanted to list out the exact ones, what I felt they helped me with and why I think they are essential for the exam:

I know this is going to be a strange one, but having studied for my VCAP5-DCA prior to doing this exam really did help me in my preparations. It helped me learn the new technologies, how to physically create them, and the level of detail logical and physical designs need in order to allow the VMware administrator (if this is a different person) to build the solution.

The VMware vSphere: Design Workshop [V5.0] was extremely beneficial and really gives you a great idea of what doing designs for a living is like, but also of how there are many different options for each solution. Unfortunately, for the VCAP5-DCD exam there is only one way of doing something and that is the VMware recommended way, and this is my first BIG piece of advice before doing the exam: make sure you learn the VMware way of doing design, as in the exam the way you think it should be done, or have done it in the past, may not be the VMware recommended way of doing it and is therefore incorrect. Also, the course is only three days, so I would HIGHLY recommend trying to do all the lab work from the course at home. Then make sure you go to your transcript under VMware learning, click next steps under the course name and download the completed design scenarios that you followed during the course, so you can learn how VMware would have built it.

The next piece of material I used was the VMware vSphere Design book from Scott Lowe, Forbes Guthrie and Maish Saidel-Keesing. The book was amazing and I would recommend it to no end to anyone doing the exam, and anyone doing VMware designs in general, as they cover everything and it is extensive to say the least. I read the version 4 edition, as the version 5 edition is meant to be out within the next few months, and it gave really great coverage of all the components, as 85% of vSphere 5 is the same as vSphere 4 and most of the concepts are exactly the same.

The vSphere 5 Clustering Tech Deepdive book by Frank Denneman and Duncan Epping was amazing in giving me a deep understanding of the vSphere 5 cluster, its components and technologies, and the advanced settings you can create and use for certain scenarios. This book is an absolute must for the exam and covers parts I haven’t seen mentioned anywhere else. My recommendation is to read, understand and be able to apply EVERYTHING in this book prior to your exam.

As I mentioned, I did my VCAP5-DCA prior to attempting this exam, and therefore I used resources for that exam like the VMware vSphere 5 Training Trainsignal videos by Elias Khnaser and David Davis and all the VMware vSphere 4 VCAP Training Package videos David did for the VCAP4-DCA exam. These helped me build a solid understanding prior to the DCD exam, as I believe you can’t design something if you don’t know how it works and how each part integrates.

Talking of Trainsignal videos, a MASSIVE resource I used for the DCD was Scott Lowe’s Designing VMware Infrastructure Trainsignal set of videos. These were amazing, and Scott gives some brilliant descriptions and examples of what Risks, Assumptions, Requirements and Constraints are and how to apply them. I personally battled with differentiating between Functional and Non-Functional requirements, and Scott’s videos helped with this, as did an article that Victor Forde sent me when I asked if anyone could try to help me clear up the definitive differences, and Bas Raayman did a great posting asking these questions here. The videos don’t just cover the terminology but cover every facet of designing a virtual infrastructure and how they are holistically interconnected. I plan to re-watch a few of these videos, including the second-to-last one where Scott brings all the pieces together to create a final design, as I think this is very important for the exam and for real-world designing.

The APAC vBrownbags were another resource I used extensively, and they helped loads in my preparations and understanding of certain things. The content covered in a number of the sessions was amazing; I took down loads of notes during them and made sure I watched them whenever I could, including at the gym.

 

The DRBC Design – Disaster Recovery and Business Continuity Fundamentals course was another online course I did in my prep to fully understand DR and BC concepts but also how certain decisions impact how things are done. The course is free so I would highly recommend it.

The resources I will be using and re-using next time

The above resources were really great, and all the notes I created from them will be used extensively again to try to get everything into my mind.

The official VMware book Building a Virtual Datacenter is one I am planning to read, with the aim of getting myself into the VMware mind-set of designing and learning what the recommendations are for every component. The book was given to me a while back, so I am planning to start reading through it very soon.

Harley Stagner and Sean Crookston’s VMware press book Managing and Optimizing VMware vSphere Deployments is another book I am planning to read prior to my re-take as they have covered how to take your existing knowledge of all the components and apply it to a design as well as having done a mock design which I’m hoping I will learn loads from.

As I mentioned above, I attended the VMware design workshop course, so I am planning on going through all the course notes and the lab work and actually trying to create every portion, as I don’t think there was nearly enough time in the workshop to complete all the lab work. Also, as I highlighted above, I was fortunate to notice (no one tells you these are available if you did the course) that the completed designs from the lab work have been done for you by VMware. You can therefore use these to see how VMware recommend doing them, and thereby hopefully I will learn the VMware way of designing every portion.

I also plan to do some mock designs of my own and then try to apply VMware recommendations (notice I never say best practices, as supposedly there are none, but for the exam there have to be, as only one answer is correct) and hopefully learn how to apply these for the Visio-like questions.

Talking of the Visio-like questions, I am planning on trying to create my own mock questions using these kinds of ideas, so that I know how to create all portions super fast, as the time frames in the exam are very tight.

Conclusion

I felt the exam is passable, which is fairly comforting for me; the exam reminded me a lot of the Microsoft design exams I did for my MCSEs, but on steroids. As for when I am going to re-try the exam, that is still something I need to work out, as I was hoping to also get my VCP5-IaaS, and thereby my VCP5-Cloud, before the VCP5-Cloud exam is released and the upgrade path is gone. A lot of people said that if you have been doing design for years then don’t really bother studying and just go do the exam, but I disagree massively with this: if you have been doing designs for years you know there are many, many ways of building a solution, but in the exam there is only the VMware way, so experience may work against you, as maybe your way isn’t the VMware recommended way of doing it. Good luck to anyone doing the exam. I hope my thoughts above haven’t stressed you out and maybe help you study areas I missed or didn’t know I would need to, so that you pass the exam.

 

Gregg