Comments
RAID 1 = Mirror
RAID 1 and 0 = stripe AND mirror. This is sometimes called RAID10, RAID1+0, RAID01 or RAID0+1, which are all different varieties of a nested RAID level where disks are striped and mirrored at the same time without parity.
Depending on the controller settings, it either stripes mirrored pairs or it mirrors striped arrays.
So, with 6 disks:
Stripe ((Mirror 1+a) (Mirror 2+b) (Mirror 3+c))
or
Mirror ((Stripe 1 2 3) (Stripe a b c))
<edit>modified for better semantics</edit>
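To see one practical consequence of the two nestings, here is a toy Python sketch (mine, not from the thread, and controllers vary in the details) of what a single-disk rebuild has to read in each case:

    # Sketch: what a rebuild touches in each 6-disk nesting above.
    # Disk names 1,2,3,a,b,c follow the post; real controllers may differ.

    MIRRORED_PAIRS = [("1", "a"), ("2", "b"), ("3", "c")]   # Stripe ((Mirror 1+a) ...)
    STRIPED_SETS   = [("1", "2", "3"), ("a", "b", "c")]     # Mirror ((Stripe 1 2 3) ...)

    def rebuild_sources_raid10(failed_disk):
        """Stripe of mirrors: only the mirror partner is read to rebuild."""
        for d1, d2 in MIRRORED_PAIRS:
            if failed_disk in (d1, d2):
                return [d2 if failed_disk == d1 else d1]

    def rebuild_sources_raid01(failed_disk):
        """Mirror of stripes: the broken set is typically resynced from the whole surviving set."""
        for i, striped_set in enumerate(STRIPED_SETS):
            if failed_disk in striped_set:
                return list(STRIPED_SETS[1 - i])

    print(rebuild_sources_raid10("2"))  # ['b']            - one disk read
    print(rebuild_sources_raid01("2"))  # ['a', 'b', 'c']  - three disks read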
Nested RAID levels
Another question which arises is: how are you going to select the stripe size to balance the load across the 3 disk pairs?
To DenSter: RAID10 is always a stripe over mirrored pairs, regardless of the controller. A mirror of striped sets is called RAID01.
Regards,
Slawek
So RAID 10 of 6 disks is
Stripe ((Mirror 1+a) (Mirror 2+b) (Mirror 3+c))
And RAID 10 of 8 disks would be
Stripe ((Mirror 1+a) (Mirror 2+b) (Mirror 3+c) (Mirror 4+d))
As for the stripe size, wouldn't it be at the bit or byte level?
How much of a performance gain do you get for RAID 10 of 8 disks compared to RAID 10 of 4 disks?
Yes, if I understood your syntax correctly.
In an n-drive RAID0 array, any block written to disk is split into n parts and written in parallel - each part to its own disk.
In RAID10, instead of a single disk you have a pair of mirrored disks, and striping works the same way.
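As a rough illustration of that write path - simplified, since real controllers stripe in fixed-size stripe units rather than splitting every block into exactly n parts - consider this sketch of mine:

    # Simplified write path: a block is split into per-stripe chunks;
    # in RAID10 each chunk also lands on both disks of a mirrored pair.

    def raid0_write(block: bytes, n: int):
        """Split a block into n chunks, one per drive, written in parallel."""
        size = -(-len(block) // n)                      # ceiling division
        return {f"disk{d}": block[d * size:(d + 1) * size] for d in range(n)}

    def raid10_write(block: bytes, pairs: int):
        """Same striping, but every chunk goes to both disks of a mirrored pair."""
        chunks = raid0_write(block, pairs)
        return {f"pair{p} ({side})": data
                for p, (_, data) in enumerate(chunks.items())
                for side in ("primary", "mirror")}

    print(raid0_write(b"ABCDEFGH", 4))   # 4 drives, 2 bytes each
    print(raid10_write(b"ABCDEFGH", 4))  # 4 mirrored pairs = 8 disks, same split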
In theory you should get a 50% write-speed improvement using 8 instead of 4 disks. My practice comes quite close to that theory.
IMHO if you have 8 disks it will be better to configure two disks in RAID1, or four in RAID10, and use this drive exclusively for a single log file and nothing else - literally nothing else. My point is that this prevents your log disks from doing any seeks, which should give you a real performance boost when committing transactions.
The remaining 6 (or 4) disks you may configure as two arrays - one for the data files and one for the tempdb data and log files, for example.
Regards,
Slawek
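One way to read that recommendation as a concrete 8-disk split - the disk labels and the exact tempdb placement below are my illustrative assumptions, not something Slawek prescribed:

    # Hypothetical 8-disk layout following the advice above (labels d1..d8 are mine).
    layout = {
        "log":    {"level": "RAID1",  "disks": ["d1", "d2"],             "holds": "main DB transaction log only"},
        "data":   {"level": "RAID10", "disks": ["d3", "d4", "d5", "d6"], "holds": "main DB data files"},
        "tempdb": {"level": "RAID1",  "disks": ["d7", "d8"],             "holds": "tempdb data + log"},
    }
    for name, arr in layout.items():
        print(f"{name:6} {arr['level']:6} {len(arr['disks'])} disks -> {arr['holds']}")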
My point was not "what it is called"; my point was to explain a nested RAID level with striping and mirroring to NavStudent. I said "1 + 0" to avoid going into semantics about what it is called, but apparently I failed.
I know that for some hardware purists there is a big difference between 1+0 and 0+1, and I understand the difference, but I have heard people who call themselves hardware specialists explain it both ways, so I am not trying to make any claim one way or the other.
Depending on the controller, it either stripes mirrored pairs or it mirrors striped arrays.
Looks like we didn't understand each other.
My point was that the RAID level implementation in a controller cannot be a free interpretation of whether it is a stripe over mirrored pairs or a mirror of stripes, because:
1. the definitions of RAID10 (or RAID1+0) and RAID01 (or RAID0+1) are quite precise, and
2. there are significant differences in performance and protection levels (the sketch below quantifies the protection side).
It is not about being a hardware or technical-language purist, because those differences are not only in the name.
So, with 6 disks:
Stripe ((Mirror 1+a) (Mirror 2+b) (Mirror 3+c))
or
Mirror ((Stripe 1 2 3) (Stripe a b c))
Regards,
Slawek
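Here is a small sketch quantifying the protection difference for the 6-disk example; the enumeration is mine, not from the thread:

    # Count which two-disk failures each 6-disk nesting survives.
    from itertools import combinations

    DISKS = ["1", "2", "3", "a", "b", "c"]
    MIRRORED_PAIRS = [{"1", "a"}, {"2", "b"}, {"3", "c"}]      # RAID10 grouping
    STRIPED_SETS   = [{"1", "2", "3"}, {"a", "b", "c"}]        # RAID01 grouping

    def raid10_survives(failed):
        # Data is lost only if BOTH disks of some mirrored pair fail.
        return not any(pair <= failed for pair in MIRRORED_PAIRS)

    def raid01_survives(failed):
        # At least one striped set must remain fully intact.
        return any(not (s & failed) for s in STRIPED_SETS)

    fails = list(combinations(DISKS, 2))
    print(sum(raid10_survives(set(f)) for f in fails), "of", len(fails))  # 12 of 15
    print(sum(raid01_survives(set(f)) for f in fails), "of", len(fails))  # 6 of 15

So of the 15 possible two-disk failures, the stripe of mirrors survives 12, while the mirror of stripes survives only 6.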
Again, I never gave it any name; I just wanted to explain the difference. Because it has been explained to me both ways, and both variants have been called both names, I did not use any names.
So, I edited my original post, just to make it perfectly clear what I meant.
Dear Slawek, I have 6 HDDs and need to set up Proxmox VE for a few VMs, and I have 2 questions about your explanation.
1. You wrote that 8 disks give a ~50% write-speed improvement compared with 4 disks, but then you recommend taking some of those 8 disks away for another RAID array. That will reduce the speed improvement for the data array, won't it? And do you mean that the "real performance boost with transactions committing" will outweigh this reduction, i.e. make it tolerable?
2. The rest of the disks - 6 or 4 devices - you wrote to "configure as two arrays". Do you mean literally 2 arrays (RAID1 each)? If 2 arrays, why? Why are two RAID1 arrays preferable to one RAID10, which provides the speed improvement?
Thank you!
Keep in mind that these older discussions were concerned with configuring the old-style "spinning" disks. They are less relevant to modern SSD drive systems, although RAID can still play a role today.
In the world of spinning disks there is a delay caused by mechanics: a seek delay, while the heads adjust their position, and a rotational delay, while the heads wait for the rotating platter to reach the "correct" position to read or write data.
If you have one large array partitioned into several logical disks, the different workloads hit the same physical disks, forcing them to constantly change tracks and wait for the correct sectors.
Why different workloads? The transaction log in SQL Server is written sequentially. Every transaction is written to disk on commit, bypassing any hardware disk cache; a transaction is not committed until the OS confirms that the data has hit the drive.
Data file(s), on the other hand, are read as required and usually written periodically, at every checkpoint; but when SQL Server is short of memory for reading new data, older pages can potentially be written out at any time. So when log writes and data reads/writes hit the same physical disks, they force the heads to constantly change position - which costs time, a lot of time.
If you take a pair of disks away from a larger array, you'll make that array a bit slower. But then you "sort" the disk operations: the sequential transaction-log writes go to separate disks, which rarely need to move their heads. Eliminating disk seeks and rotational waits gives a much bigger write-performance boost than the extra disks added to a RAID array.
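A back-of-envelope model of that argument; the figures below are typical 7200 rpm numbers I've assumed, not measurements from this thread:

    # Rough cost of one log write on a dedicated disk vs. a shared disk
    # that must seek back from the data files first.

    SEEK_MS       = 8.5                    # assumed average seek time
    ROTATION_MS   = 60_000 / 7200 / 2      # average rotational latency = half a turn
    TRANSFER_MBPS = 150                    # assumed sustained sequential transfer

    def io_time_ms(kb, random_access):
        """Time for one I/O: mechanical delays apply only when the heads must move."""
        mechanical = (SEEK_MS + ROTATION_MS) if random_access else 0.0
        return mechanical + kb / 1024 / TRANSFER_MBPS * 1000

    log_write = io_time_ms(60, random_access=False)  # sequential log write
    mixed     = io_time_ms(60, random_access=True)   # same write after a data-file seek
    print(f"dedicated log disk: {log_write:.2f} ms per write")  # ~0.39 ms
    print(f"shared disk:        {mixed:.2f} ms per write")      # ~13 ms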
That's in the world of spinning disks...
As @bbrown mentioned - with SSDs in place, all of the above no longer matters that much. It still matters a little, as SSDs are still better at sequential operations than random ones, but overall SSD speed makes these considerations far less relevant.
Dear Slawek, thank you for the clear explanation! Yes, we still use HDD disks, so it all matters to us.
Could you also give a link (or links) to practical articles or videos about this multi-RAID method, if you have any? Or perhaps any extra explanation in another topic?
For example, we have a Zimbra mail server, and I'm not sure it even allows specifying a separate database location during installation. So it's not clear whether we should treat the whole server as "data" or as "database" with regard to the multi-RAID method.
I mean, it's all clear in your explanation, but no doubt we'll have extra questions when we set up our specific platform, so extra links would be very useful if you have them.