This blog post continues on from my previous post, vSphere Sizing Formula – CPU & RAM. Once again, I want to share my methodology for sizing the requirements of a storage design.
I’m not going to go into specific storage protocols as this can be influenced by a multitude of design decisions.
Step 1 – Datastore Size
There is no one-size-fits-all answer to this, but let me give you my considerations.
Restore Time Objective
Your RTO should help define your maximum VM size. Let me explain what I mean by walking through an example.
Your RTO is one hour. You have VMs which are 2TB in size because they have multiple disk partitions in the guest operating system. You currently back up using LTO-4 tapes, which produce a restore rate of 120 MB/s.
Restore Rate (MB/s) x 60 Seconds x 60 Minutes = Restore Amount (MB)
120 x 60 x 60 = 432,000 MB
Can the Restore Amount cover the VM size within the Restore Time Objective? Yes/No
432,000 MB (422GB) is less than the 2TB VM = No, the Restore Time Objective has been violated
Knowing this, we would dictate that the VM be split into at least 5 VMs to meet the required RTO.
This then determines our maximum VM size to be 422GB to meet the RTO.
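As a rough sketch, the RTO check above can be written out in Python. The function and variable names are purely illustrative, and the conversion assumes 1GB = 1,024MB, which is what gives the 422GB figure.

```python
import math

MB_PER_GB = 1024  # assumption: binary units, so 432,000 MB is roughly 422GB


def restore_amount_mb(restore_rate_mb_s: float, rto_seconds: int) -> float:
    """Amount of data that can be restored inside the RTO window, in MB."""
    return restore_rate_mb_s * rto_seconds


restore_rate = 120       # MB/s, the LTO-4 figure used in the example
rto_seconds = 60 * 60    # an RTO of one hour
vm_size_gb = 2 * 1024    # the 2TB VM from the example

amount_mb = restore_amount_mb(restore_rate, rto_seconds)
max_vm_size_gb = amount_mb / MB_PER_GB

# The RTO is met only if the whole VM fits inside the restore amount.
meets_rto = vm_size_gb <= max_vm_size_gb

# Number of smaller VMs the workload must be split into to meet the RTO.
vms_needed = math.ceil(vm_size_gb / max_vm_size_gb)

print(f"Restore amount: {amount_mb:,.0f} MB ({max_vm_size_gb:.0f}GB)")
print(f"Meets RTO: {meets_rto}, VMs needed: {vms_needed}")
```

Swapping in your own restore rate and RTO gives you the maximum VM size for your environment.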
VM Size
Now that we know our RTO and our maximum VM size of 422GB, we can move on to the required space for our VM. In this example the swap file is 8GB.
Maximum VM Size – Swap File = Actual Maximum VM Size
422GB – 8GB = 414GB
Buffer Space
Buffer space is the amount of space we need on the datastore for items such as:
- Log Files
- Snapshots
- Storage vMotion (temporary move space)
The rule of thumb for buffer space is 25%.
Queue Depth
Jason Boche (@jasonboche) wrote an excellent article on a factor which can affect performance, called VAAI and the Unlimited VM’s per Datastore Urban Myth.
Queue depth for HBAs defaults to 32; please check with your vendor whether this should be altered for a vSphere environment.
Queue depth for the Software iSCSI Adapter defaults to 128; again, check with your vendor whether this should be altered in a vSphere environment.
In this example we have 500 VMs, each with 9 active IOs, spread across 12 hosts.
Active IO Per VM x Number Of VMs = Overall Active IO
9 x 500 = 4500
Overall Active IO / Number Of Hosts = Average Active IO Per Host
4500 / 12 = 375
(Average Active IO Per Host / Queue Depth) – Growth (Spike) % = VMs Per Datastore
(375 / 32) – 50% = 6
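The queue depth arithmetic above can be sketched in Python as follows. The function name and parameters are hypothetical, and the growth percentage is applied as a reduction of the raw result, which is how the worked example arrives at 6.

```python
def vms_per_datastore(active_io_per_vm: int, vm_count: int, host_count: int,
                      queue_depth: int, growth_pct: float) -> int:
    """Rule-of-thumb VMs per datastore, following the formulas above."""
    overall_active_io = active_io_per_vm * vm_count
    avg_active_io_per_host = overall_active_io / host_count
    raw = avg_active_io_per_host / queue_depth
    # Hold back headroom for growth and spikes, then round to a whole VM.
    return round(raw * (1 - growth_pct / 100))


result = vms_per_datastore(active_io_per_vm=9, vm_count=500, host_count=12,
                           queue_depth=32, growth_pct=50)
print(f"VMs per datastore: {result}")
```

Changing `queue_depth` to 128 shows how the result moves with the Software iSCSI Adapter default.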
Datastore Size
(VMs Per Datastore x Actual Maximum VM Size) + Buffer Space = Datastore Size
(6 x 414GB) + 25% = 3,105GB (approximately 3TB)
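Pulling the Step 1 figures together, a minimal Python sketch of the datastore size calculation might look like this. The names are illustrative, and the buffer is assumed to be added on top of the VM space, as in the worked example.

```python
def datastore_size_gb(vms_per_datastore: int, max_vm_size_gb: float,
                      swap_file_gb: float, buffer_pct: float) -> float:
    """Datastore size in GB: VM space plus a percentage buffer on top."""
    actual_max_vm_size_gb = max_vm_size_gb - swap_file_gb
    vm_space_gb = vms_per_datastore * actual_max_vm_size_gb
    return vm_space_gb * (100 + buffer_pct) / 100


size_gb = datastore_size_gb(vms_per_datastore=6, max_vm_size_gb=422,
                            swap_file_gb=8, buffer_pct=25)
size_tb = size_gb / 1024  # assumption: 1TB = 1,024GB

print(f"Datastore size: {size_gb:,.0f}GB (about {size_tb:.0f}TB)")
```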
Step 2 – Performance Data Collection
I tend to look at performance first and capacity second. I have left capacity out of this blog post, as it should be straightforward to work out from the RAID type you have chosen to use.
For the data collection we are going to make the following assumptions.
Disk | IOPS |
7.2K SATA/NearLine SAS | 75 |
10K SAS | 125 |
15K SAS | 150 |
SSD | 2,500 |
Table To Show RAID Penalty
RAID Type | Write Penalty |
0 | 1 |
1 | 2 |
5 | 4 |
6 | 6 |
10 | 2 |
Read/Write Collection
This means obtaining the key metrics from your physical systems, application owners, or current storage area network, or perhaps from perfmon or VMware Capacity Planner.
Read MB/Sec + Write MB/Sec = Overall MB/Sec
83 MB/Sec + 12 MB/Sec = 95 MB/Sec
Read MB/Sec / Overall MB/Sec = Read Percentage
83 MB/Sec / 95 MB/Sec = 87%
Write MB/Sec / Overall MB/Sec = Write Percentage
12 MB/Sec / 95 MB/Sec = 13%
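The read/write split can be sketched in Python like this. The function name is illustrative, and the write percentage is taken as the remainder so that the two rounded figures always add up to 100%.

```python
def read_write_split(read_mb_s: float, write_mb_s: float) -> tuple[float, int, int]:
    """Return overall MB/s plus read and write percentages (whole numbers)."""
    overall_mb_s = read_mb_s + write_mb_s
    read_pct = round(read_mb_s / overall_mb_s * 100)
    write_pct = 100 - read_pct  # remainder, so the split sums to 100%
    return overall_mb_s, read_pct, write_pct


overall, read_pct, write_pct = read_write_split(read_mb_s=83, write_mb_s=12)
print(f"Overall: {overall} MB/s, read {read_pct}%, write {write_pct}%")
```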
Front End IOPS Collection
Front End IOPS are the I/O transfers per second that your collection tool will see. In this case we will use 1,231 IOPS.
Back End IOPS Collection
Back End IOPS is the performance that your SAN/NAS needs to deliver once the RAID write penalty is taken into account.
Read % + (RAID Penalty x Write %) = RAID Penalty Percentage
RAID 1: 87% + (2 x 13%) = 113%
RAID 5: 87% + (4 x 13%) = 139%
RAID 6: 87% + (6 x 13%) = 165%
RAID 10: 87% + (2 x 13%) = 113%
RAID Penalty Percentage x Front End IOPS = Back End IOPS
RAID 1: 113% x 1,231 = 1,391 IOPS
RAID 5: 139% x 1,231 = 1,711 IOPS
RAID 6: 165% x 1,231 = 2,031 IOPS
RAID 10: 113% x 1,231 = 1,391 IOPS
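For anyone who wants to plug in their own numbers, here is a small Python sketch of the back end IOPS calculation. The dictionary and function names are illustrative; the write penalties come from the RAID penalty table, and the read/write split uses the rounded 87%/13% from the example.

```python
# Write penalty per RAID type, taken from the RAID penalty table.
RAID_WRITE_PENALTY = {"1": 2, "5": 4, "6": 6, "10": 2}


def back_end_iops(front_end_iops: int, read_pct: int, write_pct: int,
                  write_penalty: int) -> int:
    """Front end IOPS scaled by the RAID penalty percentage."""
    raid_penalty_pct = read_pct + write_penalty * write_pct
    return round(front_end_iops * raid_penalty_pct / 100)


front_end = 1231
results = {
    raid: back_end_iops(front_end, read_pct=87, write_pct=13, write_penalty=penalty)
    for raid, penalty in RAID_WRITE_PENALTY.items()
}

for raid, iops in results.items():
    print(f"RAID {raid}: {iops:,} IOPS")
```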
Step 3 – Target Disk Requirements
This is the process of determining which disk types meet your performance requirements.
Back End IOPS / Disk IOPS = Number Required Hard Drives (rounded up to a whole drive)
Table To Show Hard Drives Per RAID Type
RAID Type | Back End IOPS | Disk IOPS | Number Required Hard Drives |
10 | 1,391 | 75 IOPS (7.2K SATA) | 19 |
10 | 1,391 | 125 IOPS (10K SAS) | 12 |
10 | 1,391 | 150 IOPS (15K SAS) | 10 |
1 | 1,391 | 2,500 IOPS (SSD) | 2 (to meet RAID 1 disk requirements) |
5 | 1,711 | 75 IOPS (7.2K SATA) | 23 |
5 | 1,711 | 125 IOPS (10K SAS) | 14 |
5 | 1,711 | 150 IOPS (15K SAS) | 12 |
5 | 1,711 | 2,500 IOPS (SSD) | 3 (to meet RAID 5 disk requirements) |
6 | 2,031 | 75 IOPS (7.2K SATA) | 28 |
6 | 2,031 | 125 IOPS (10K SAS) | 17 |
6 | 2,031 | 150 IOPS (15K SAS) | 14 |
6 | 2,031 | 2,500 IOPS (SSD) | 4 (to meet RAID 6 disk requirements) |
Note: RAID 1 has only been included for SSD, as a mirrored pair of any of the other disk types cannot deliver the required IOPS.
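The table above can be reproduced with a short Python sketch. The names are illustrative; the minimum disk counts for RAID 1, 5 and 6 match the notes in the table, and the minimum of 4 for RAID 10 is an added assumption that does not change any of the results here.

```python
import math

# Minimum disks per RAID type; the RAID 10 value is an assumption.
RAID_MIN_DISKS = {"1": 2, "5": 3, "6": 4, "10": 4}


def drives_required(back_end_iops: int, disk_iops: int, raid_type: str) -> int:
    """Drives needed for performance, never fewer than the RAID minimum."""
    by_performance = math.ceil(back_end_iops / disk_iops)
    return max(by_performance, RAID_MIN_DISKS[raid_type])


print(drives_required(1391, 75, "10"))   # 7.2K SATA in RAID 10
print(drives_required(1711, 150, "5"))   # 15K SAS in RAID 5
print(drives_required(2031, 2500, "6"))  # SSD in RAID 6
```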
Step 4 – Growth Requirements
This is the amount of performance or capacity increase that is required to meet business or application objectives. In this case we will use 50%.
RAID Type | Back End IOPS | Growth % | Required IOPS | Disk IOPS | Number Required Hard Drives |
10 | 1,391 | 50% | 2,087 | 75 IOPS (7.2K SATA) | 28 |
10 | 1,391 | 50% | 2,087 | 125 IOPS (10K SAS) | 17 |
10 | 1,391 | 50% | 2,087 | 150 IOPS (15K SAS) | 14 |
1 | 1,391 | 50% | 2,087 | 2,500 IOPS (SSD) | 2 (to meet RAID 1 disk requirements) |
5 | 1,711 | 50% | 2,567 | 75 IOPS (7.2K SATA) | 35 |
5 | 1,711 | 50% | 2,567 | 125 IOPS (10K SAS) | 21 |
5 | 1,711 | 50% | 2,567 | 150 IOPS (15K SAS) | 18 |
5 | 1,711 | 50% | 2,567 | 2,500 IOPS (SSD) | 3 (to meet RAID 5 disk requirements) |
6 | 2,031 | 50% | 3,047 | 75 IOPS (7.2K SATA) | 41 |
6 | 2,031 | 50% | 3,047 | 125 IOPS (10K SAS) | 25 |
6 | 2,031 | 50% | 3,047 | 150 IOPS (15K SAS) | 21 |
6 | 2,031 | 50% | 3,047 | 2,500 IOPS (SSD) | 4 (to meet RAID 6 disk requirements) |
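As a sketch of how the growth table is built, the Python below applies the growth percentage first and then sizes the drives. The names are illustrative, required IOPS is rounded up, and the RAID 10 minimum of 4 disks is an assumption that is not in the table.

```python
import math

# Minimum disks per RAID type; the RAID 10 value is an assumption.
RAID_MIN_DISKS = {"1": 2, "5": 3, "6": 4, "10": 4}


def required_iops(back_end_iops: int, growth_pct: int) -> int:
    """Back end IOPS plus growth, rounded up to a whole IOPS."""
    return math.ceil(back_end_iops * (100 + growth_pct) / 100)


def drives_with_growth(back_end_iops: int, growth_pct: int,
                       disk_iops: int, raid_type: str) -> int:
    """Drives needed after growth, never fewer than the RAID minimum."""
    by_performance = math.ceil(required_iops(back_end_iops, growth_pct) / disk_iops)
    return max(by_performance, RAID_MIN_DISKS[raid_type])


print(required_iops(1391, 50))
print(drives_with_growth(1391, 50, 75, "10"))
print(drives_with_growth(2031, 50, 2500, "6"))
```

Setting `growth_pct` to 0 gives the same drive counts as Step 3.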
Considerations
Most storage vendors will have some kind of caching. You can use this to decrease the number of disks required, or you can treat it as an added performance bonus.
Latency hasn’t been taken into account. The rule of thumb is that fewer hops and less distance travelled should mean lower latency, e.g. DAS is faster than SAN.
Hi Creg,
I appreciate this article on calculating datastore size, as it seems to be a clear way of understanding the size of a datastore. However, I have a query on Step 1.
Restore Time Objective
Question: will we know the size of one VM before determining the size of the datastore, or will we first determine the size of the VM?
Can you also clarify why you chose the RTO for determining the size of the VM?
Regards
Raj