TheSaffaGeek

My ramblings about all things technical



Safe and Legit Storage Design Completed

Below are my thoughts, the additional questions I felt needed to be asked or clarified, and the design decisions, justifications and impacts for the Safe and Legit storage design. If you missed the posting where I detailed the mock scenario, you can read it here.

 

Note: This is a learning exercise for me, so if you feel I’ve missed something or made a wrong decision then please write it in the comments. I’m more than happy to discuss it (that is one of the main reasons I’m doing this series of postings) and I’ll amend the design accordingly if it makes sense. Hopefully I, along with other people reading these postings, will learn from it and become better.

 

Additional Questions

As I said there probably would be, there are some additional questions. Something I feel is really important when doing real world designs is trying to think of as many questions as possible around a customer’s requirements, so that you can ensure you have their requirements recorded correctly and that they aren’t vague. The additional questions and the answers to them are listed below:

 

Q: Is there any capability of utilising the existing storage in the privately owned UK DC?

 

A: Due to the consolidation and migration of the other UK DCs and the current workloads in the privately owned DC, a new SAN is the better option: the existing SAN is three years old now, so it is more cost effective to purchase a new one. Also, due to the probable need for auto-tiered storage to meet the customer’s requirements, a new SAN with these capabilities is needed.

 

Q: Is there no way a minimal planned outage/downtime can be organised for the migration of the workloads, given the likely higher cost of the equipment needed to ensure near-zero downtime?

 

A: The customer would prefer to try to keep to near-zero downtime, so it is agreed that after the conceptual design of the storage and the remaining components of the whole design, further meetings can be held to discuss a balance between cost and the desire for near-zero downtime.

 

Q: With the leasing out of the private level 4 suites in the future, will there be a requirement to manage/host other companies’ processes and data within the infrastructure being designed?

 

A: No, there is currently no plan to do this due to security concerns and the number of compliance regulations Safe and Legit need to maintain and fulfil. There is however a possibility of internal consumption, with other departments being charged for their usage of the DC’s resources.

 

Q: What other questions do you feel should be asked?

Additional Functional Requirements

- 5K 3rd party users will need to be able to access the environment without any impact during the migration and consolidation

- The rented DCs’ kit needs to be fully migrated to the privately owned datacenter before Q1 2015 to ensure the contracts don’t need to be renewed

Constraints

Below are the constraints I felt were detailed in the scenario. These will possibly change as I go further through all the other sections but so far these are the ones I felt were applicable:

- Usage of EMC kit

- Usage of Cisco kit

- Usage of the privately owned DC’s physical infrastructure for the consolidation of all three UK DCs.

Assumptions

Below are the assumptions I felt had to be made. These will possibly change as I go further through all the other sections. Normally I try to keep these as minimal as possible, but for a project of this size it would be extremely difficult not to have any, as you do have to trust that certain things are in place:

- There is sufficient bandwidth between the UK DCs to allow migration of the existing workloads with as little impact to the workloads as possible

- All required upstream dependencies will be present during the implementation phase.

- There is sufficient bandwidth into and out of the privately owned DC to support the bandwidth requirements of all three DCs’ workloads

- All VLANs and subnets required will be configured before implementation.

- Storage will be provisioned and presented to the VMware ESX™ hosts accordingly.

- Power and cooling in the privately owned DC are able to handle the addition of the physical infrastructure required for the virtual infrastructure, whilst older physical machines keep running alongside it for a certain amount of time

- Safe and Legit have the existing internal skillset to support the physical and virtual infrastructure being deployed.

- There are adequate licences for the OS and applications required for the build

 

Risks

- Near-zero downtime during the migration of workloads to the privately owned DC may be at risk, as budget constraints could impact the procurement of the infrastructure required to ensure it

Storage Array

Design Choice: EMC FC SAN with two 8Gb SPs

Justification:
- EMC due to the constraint of having to use EMC storage because of previous usage
- EMC VNX 5700 with Auto-Tiering enabled
- 8Gb to ensure high transmission speeds to the storage; 16Gb is too high and expensive for this design

Design Impacts:
- Switches will need to be capable of 8Gb connectivity
- FC cabling needs to be capable of transmitting at 8Gb speeds
- HBAs on the ESXi hosts need to be capable of 8Gb speeds
   

Number of LUNs and LUN sizes

Design Choice: 400 x 1TB LUNs will be used

Justification:
- Each VM will be provisioned with an average of 50GB of disk
- With around 15 VMs per LUN and 20% of each LUN kept free for swap and snapshots: 15 x 50GB / 0.8 = 937.5GB, which fits in a 1TB LUN
- 6000 total VMs / 15 VMs per LUN = 400 LUNs

Design Impacts: Tiered storage will be used with auto-tiering enabled to balance storage costs with VM performance requirements
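
To sanity check the LUN numbers above, here is a minimal Python sketch of the arithmetic. The function name lun_plan and its parameters are my own for illustration, and I've assumed that a 1TB LUN means 1024GB. The inputs (6000 VMs, 50GB average disk, 15 VMs per LUN, 20% kept free) come from the justification above.

```python
import math

def lun_plan(total_vms, avg_vm_disk_gb, vms_per_lun, free_fraction, lun_size_gb=1024):
    """Return (space needed per LUN in GB, number of LUNs, whether it fits the LUN size)."""
    # Disk consumed by the VMs placed on a single LUN
    vm_disk_gb = vms_per_lun * avg_vm_disk_gb
    # Scale up so that free_fraction of the LUN stays free for swap and snapshots
    needed_gb = vm_disk_gb / (1 - free_fraction)
    # Round up so no VM is left without a LUN
    lun_count = math.ceil(total_vms / vms_per_lun)
    return needed_gb, lun_count, needed_gb <= lun_size_gb

# Assumption: lun_size_gb defaults to 1024GB for a "1TB" LUN
needed_gb, lun_count, fits = lun_plan(6000, 50, 15, 0.20)
print(f"{needed_gb:.1f}GB needed per LUN, {lun_count} LUNs, fits in 1TB: {fits}")
```

Changing the VMs per LUN or the free space percentage shows quickly how sensitive the LUN count is to those two choices, which is handy if the average VM size turns out to be different once real workloads are measured.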
   

Storage load balancing and availability

Design Choice: EMC PowerPath/VE multipathing plug-in (MPP) will be used.

Justification: EMC PowerPath/VE leverages the vSphere Pluggable Storage Architecture (PSA), providing performance and load-balancing benefits over the VMware native multipathing plug-in (NMP).

Design Impacts: Requires additional cost for PowerPath licenses.
   

VMware vSphere VMFS or RDM

Design Choice: VMFS will be used as the standard unless there is a specific need for raw device mapping (RDM). This will be decided on a case-by-case basis.

Justification: VMFS is a clustered file system specifically engineered for storing virtual machines.

Design Impacts: The datastores must be created with the VMware vSphere Client to ensure correct disk alignment.
   

Host Zoning

Design Choice: Single-initiator zoning will be used. Each host will have two paths to the storage ports across separate fabrics.

Justification: This keeps to EMC best practices and ensures no single point of failure, with multiple paths to targets across multiple fabrics.

Design Impacts: Zones will need to be created for each portion by the storage team.
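
To make the single-initiator idea concrete, here is a rough Python sketch that builds one zone per host HBA, with each zone holding that one initiator plus the array ports on the same fabric. Everything in it is hypothetical: the host names, the fabric names, the WWPNs and the zone naming scheme are made up for illustration, and real zones would be created on the FC switches by the storage team rather than in Python.

```python
def single_initiator_zones(host_hbas, array_ports):
    """Build one zone per host HBA: that single initiator plus the array ports on its fabric."""
    zones = []
    for host, hbas in sorted(host_hbas.items()):
        for fabric, initiator_wwpn in sorted(hbas.items()):
            zones.append({
                "name": f"{host}_{fabric}",          # hypothetical naming scheme
                "fabric": fabric,
                "initiator": initiator_wwpn,         # exactly one initiator per zone
                "targets": list(array_ports[fabric]),
            })
    return zones

# Made-up WWPNs: one HBA port per host per fabric, one port per SP on each fabric
host_hbas = {
    "esx01": {"fabric_a": "10:00:00:00:00:00:01:0a", "fabric_b": "10:00:00:00:00:00:01:0b"},
    "esx02": {"fabric_a": "10:00:00:00:00:00:02:0a", "fabric_b": "10:00:00:00:00:00:02:0b"},
}
array_ports = {
    "fabric_a": ["50:06:00:00:00:00:0a:00", "50:06:00:00:00:00:0b:00"],
    "fabric_b": ["50:06:00:00:00:00:0a:01", "50:06:00:00:00:00:0b:01"],
}

zones = single_initiator_zones(host_hbas, array_ports)
for zone in zones:
    print(zone["name"], zone["initiator"], zone["targets"])
```

Each host ends up with one zone on each fabric, so losing a switch, an HBA or an SP still leaves a working path to the storage.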
   

LUN Presentation

Design Choice: LUNs will be masked consistently across all hosts in a cluster.

Justification: This allows virtual machines to be run on any host in the cluster and ensures both HA and DRS optimisation.

Design Impacts: The storage team will need to control and deploy this, as the masking is done on the storage array.
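
A quick way to picture what consistent masking means is a check that every host in a cluster can see the same set of LUNs. The sketch below is only an illustration in Python: the function name missing_luns, the host names and the LUN IDs are my own invention, and in practice the input would have to come from the array or from each host's storage inventory.

```python
def missing_luns(cluster_masking):
    """Return {host: sorted LUN IDs the host cannot see but another host in the cluster can}."""
    # Union of every LUN presented to any host in the cluster
    all_luns = set()
    for luns in cluster_masking.values():
        all_luns |= set(luns)
    # Any host that sees fewer LUNs than the union is masked inconsistently
    gaps = {}
    for host, luns in cluster_masking.items():
        missing = all_luns - set(luns)
        if missing:
            gaps[host] = sorted(missing)
    return gaps

# Hypothetical cluster where esx03 was not given LUN 2
cluster_masking = {
    "esx01": [0, 1, 2, 3],
    "esx02": [0, 1, 2, 3],
    "esx03": [0, 1, 3],
}
print(missing_luns(cluster_masking))
```

An empty result means any VM can run on any host in the cluster, which is what HA and DRS rely on.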
   

Thick or Thin disks

Design Choice: Thin provisioning will be used as the standard unless there is a specific need for thick provisioned disks. This will be decided on a case-by-case basis.

Justification: The rate of change for a system volume is low, while data volumes tend to have a variable rate of change.

Design Impacts: Alarms will need to be configured to ensure that if disks approach an out-of-space condition there is ample time to provision more storage.
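
As a rough illustration of "ample time", the alarm can be thought of in terms of days until the datastore fills up rather than just a percentage used. The Python sketch below is my own simplification and not how vCenter alarms are actually defined: the function name, the growth rate input, the lead time and the "twice the lead time" warning level are all assumptions.

```python
def datastore_alarm(capacity_gb, used_gb, growth_gb_per_day, lead_time_days):
    """Classify a thin-provisioned datastore as 'ok', 'warning' or 'critical'."""
    free_gb = capacity_gb - used_gb
    if free_gb <= 0:
        return "critical"  # already out of space
    if growth_gb_per_day <= 0:
        return "ok"  # not growing, so there is no projected fill date
    days_until_full = free_gb / growth_gb_per_day
    # Critical: it would fill before new storage could be provisioned
    if days_until_full <= lead_time_days:
        return "critical"
    # Warning: assumed threshold of twice the provisioning lead time
    if days_until_full <= 2 * lead_time_days:
        return "warning"
    return "ok"

# Hypothetical 1TB datastore growing by 10GB a day, with 14 days needed to provision more storage
print(datastore_alarm(1024, 900, 10, 14))
```

The point is simply that the alarm threshold should be tied to how long the storage team needs to provision more capacity.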
   

Virtual Machine I/O Priority

Design Choice: Storage I/O Control will not be used

Justification:
- The storage utilises Auto-Tiering/FAST, which works at the block level and is therefore a better way of balancing
- VMware SRM is likely to be used, and SDRS and SIOC are not supported with it

Design Impacts: FAST/Auto-Tiering will need to be configured correctly by the storage vendor
   

Storage Profiles

Design Choice: Storage Profiles will not be configured

Justification: Storage will be managed by the storage team

Design Impacts: The storage team will need to configure storage as the virtual infrastructure requires
   

Describe and diagram the logical design

Attribute: Specification
Storage Type: Fibre Channel
Number of Storage Processors: 2 to ensure redundancy
Number of Fibre Channel Switches (if any): 2 to ensure redundancy
Number of ports per host per switch: 1
Total number of LUNs: 400 (as mentioned above)
LUN Sizes: 1TB (as mentioned above)
VMFS datastores per LUN: 1

[Logical design diagram]

Describe and diagram the physical design

Array vendor and model: EMC VNX 5700
Type of array: Active-Active
VMware ESXi host multipathing policy: PowerPath/VE MPP
Min/Max speed rating of storage switch ports: 2Gb/8Gb

I’m looking for the correct EMC diagrams to create the physical design diagram, so I will update this posting this week with the diagram, promise :)

Well, that’s my attempt at the storage design portion of Safe and Legit. Hopefully people will agree with most of the decisions I’ve made, if not all of them. I have to admit it took me most of my Sunday just to do this piece and think of all the impacts, and as stated there may be additional constraints and risks further down the line.

 

Gregg



All things virtual 13

 

Yes I decided to get rid of the roman numerals and go with the old trusty numbers from now on as I think it looks better and it’s more user friendly for the five people who read these posts :)

It’s been two weeks since the last All things virtual posting due to work constraints and my studying for my MCITP: Enterprise Administrator exam. Unfortunately the exam was cancelled unbeknownst to me, so I’ve had to reschedule for a few weeks’ time (I know loads of people are going to think I just failed and don’t want to say it, but I’d honestly say it if I had). Anyhow, in the two weeks since the last edition there have been loads of really top class postings and information coming up in the virtualisation arena.

Firstly, as I said in my posting yesterday, the second vBeers is happening TOMORROW, July the 1st. I was fortunate enough to have made it to the first one and it was awesome to meet and chat to loads of the guys I follow and chat to via Twitter, their blogs (my blogroll holds my favourites), the VMware Community forums or the VMware community roundtables. If you’re near the London area tomorrow evening I’d highly recommend going along.

Next is a brilliant posting by Duncan Epping of Yellow Bricks all about troubleshooting and recognising whether a VM is swapping and, if so, how to work it out, as it isn’t as simple as checking whether the SWCUR value in esxtop is showing values. As I stated in my blog posting a few weeks back, I’m learning to use esxtop and to better my skills with it so that I can manage my environments and spot these kinds of things.

Eric Sloof blogged all about the release of the Maximum vSphere book. The book was written by Simon Seagrave of Techhead fame and Eric Siebert of vsphere-land.com fame. Simon wrote the chapters on ‘Performance in vSphere’ and ‘Building Your Own vSphere Lab’ and Eric wrote all the remaining chapters. John Troyer has also written the foreword for the book. I haven’t personally read Eric’s VMware® VI3 Implementation and Administration book, but these guys are top of the industry and their blogs are some of the best out there, so you know the content is going to be amazing. Hopefully I can get my hands on a copy of this once it’s released.

As I blogged almost a month ago now, the latest versions of vCenter and vSphere, Update 2, have been released. Chad Sakac of Virtual Geek fame posted a brilliant write up all about the release and has added some very helpful fixes for issues that may arise from updating to Update 2. I’ve managed to update most of my home test environment to Update 2 but unfortunately haven’t had the time to fully play with/break it yet. Talking of new versions, William Lam of Virtually Ghetto fame has posted a very interesting posting all about the possible imminent release of vSphere 4.1. If rumours are true then this release will be the non-COS release. Kind of crazy to put an update out and then release a new version in my opinion, especially for all of us that have to keep environments up to date whilst not breaking anything in the process. Jason Boche of Boche.net did a nice little posting all about how a simple Google search gives plenty of proof that the COS is going away. Duncan Epping also posted that DRS sub clusters are supposedly due in the next version.

A fair number of the guys I’m friendly with and/or follow on Twitter were fortunate enough to have been invited to take the VCAP-DCA beta exams over a week ago now. Jason Boche, William Lam and Chris Dearden are a few that I noticed who blogged about it, and from their comments and rants it sounds as if the exam is going to be a real test: to pass it you are going to need to have used, played with, configured and fully understood all the technologies and features that the vSphere family of products have to offer. This means I’m probably going to end up spending innumerable hours playing around with my lab (which I kind of do out of nerdy fun already anyway), but it also means that people can’t just learn answers to questions from cheat sites and post 500 out of 500 scores even though they misspell VMware, and it will hopefully help me to increase my skills and knowledge, which is what all exams/certifications should do for you.

One of the biggest banes of any VMware administrator’s life is the managing and controlling of snapshots, especially if you allow them to be created by the users of the VMs as I have to in our environment. I’ve posted before all about the great ways I use to ease the management and monitoring of VMware snapshots. Last week Mike Bean posted a brilliant guest posting in the VMware communities blogs all about VMware snapshots: what they are meant for, what they are not meant for, and how they are created and maintained. I’ve saved this one to my favourites as it’s got all the reasons you need to explain to a user why they can’t have five snapshots on one VM and keep them for months on end.

Duncan Epping posted all about the new SIOC (Storage I/O Control) feature, due to be released most likely in the next version of vSphere. I had seen this video before the posting, as it was obviously all over Twitter very quickly, and I’m really excited and pleased that this feature is coming.

Last but not least, a big congratulations to Simon Long on his announcement that he is joining VMware as a Senior Consultant. Wow, if memory serves me right that takes him from being made redundant and looking for a role to being a VMware employee in 12 months!! Congrats Simon!

Gregg Robertson

 

 
