RAID 0

RAID 0 is data striping, also known as stripe storage. It combines multiple hard disks into one larger logical volume, which significantly improves disk performance and throughput. RAID 0 has no redundancy or error-repair capability, but it is low in cost and requires at least two disks, so it is generally used only where data security requirements are not high. RAID 0 splits data into consecutive stripe units and reads and writes them on multiple disks in parallel, which makes it the fastest of all RAID levels. In theory, a RAID 0 array of N disks reads and writes N times as fast as a single disk. However, because RAID 0 keeps no redundant copy, the failure of any one physical disk makes all data in the array unusable. Strictly speaking, it is therefore not a RAID structure in the narrow sense.

(1) RAID 0 Simple architecture

The simple architecture connects N identical hard disks, either in hardware through a smart disk controller or in software through a disk driver in the operating system, to form a single logical drive with N times the capacity of one disk. Data is written to the disks one after another: when the space on one disk is exhausted, writing continues automatically on the next disk. The advantage is increased capacity. The speed is the same as that of a single disk. If any disk fails, the entire volume is damaged, so the reliability is 1/N of that of a single hard disk.
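The fill-one-disk-then-the-next behaviour described above can be sketched as a simple address mapping. This is a minimal illustration, not a real driver: the function name `locate_spanned` and the disk sizes are hypothetical.

```python
def locate_spanned(logical_block, disk_sizes):
    """Map a logical block number to (disk index, block on that disk)
    for a volume that fills each disk completely before using the next."""
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")
    remaining = logical_block
    for disk, size in enumerate(disk_sizes):
        if remaining < size:
            return disk, remaining
        remaining -= size  # skip past this disk entirely
    raise ValueError("logical block is beyond the end of the volume")


# Three hypothetical disks of 100 blocks each.
sizes = [100, 100, 100]
print(locate_spanned(99, sizes))   # (0, 99)
print(locate_spanned(100, sizes))  # (1, 0)
print(locate_spanned(250, sizes))  # (2, 50)
```

Because consecutive blocks land on the same disk, a sequential transfer is served by one disk at a time, which is why this layout adds capacity but not speed.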

(2) Another architecture of RAID 0

The other architecture uses N hard disks and a suitably chosen stripe size to create a stripe set. Ideally each hard disk has its own dedicated disk controller, so that data can be read from or written to all N disks at the same time, which increases speed by up to N times and improves system performance.
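As a rough sketch of how a stripe set spreads data, the mapping below places consecutive stripe units on consecutive disks in round-robin order. The function name `locate_striped` and the parameter `stripe_unit_blocks` are assumptions chosen for illustration; real controllers differ in detail.

```python
def locate_striped(logical_block, n_disks, stripe_unit_blocks):
    """Map a logical block number to (disk index, block on that disk)
    for round-robin striping across n_disks."""
    unit = logical_block // stripe_unit_blocks    # stripe unit number overall
    offset = logical_block % stripe_unit_blocks   # position inside that unit
    disk = unit % n_disks                         # units rotate over the disks
    row = unit // n_disks                         # stripe (row) on each disk
    return disk, row * stripe_unit_blocks + offset


# Four hypothetical disks, stripe unit of 2 blocks.
for block in (0, 2, 8, 11):
    print(block, locate_striped(block, n_disks=4, stripe_unit_blocks=2))
# 0 (0, 0)
# 2 (1, 0)
# 8 (0, 2)
# 11 (1, 3)
```

A transfer that covers at least N stripe units touches every disk, so all N disks can work at the same time. Choosing the stripe size is a trade-off between spreading one large request over many disks and letting many small requests proceed independently.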

RAID 1

RAID 1 is mirror storage. The data on one disk is mirrored to another disk, which ensures system reliability, and because reads can be served from either copy, RAID 1 can also improve read performance. RAID 1 has the highest cost per unit of capacity of any RAID level, but it provides high data security and availability. When a disk fails, the system automatically switches to the mirror disk for reads and writes, with no need to reconstruct the lost data.

RAID 1 has the following characteristics:

(1) Each disk in RAID 1 has a corresponding mirror disk, and data is mirrored at all times. The system can read data from either disk of a mirrored pair.

(2) The usable space is only half of the total disk capacity, so the system cost is relatively high.

(3) As long as at least one disk in every mirrored pair still works, the system keeps running normally, even when half of the hard disks have failed.

(4) An array with a failed disk is no longer reliable, and the failed disk must be replaced promptly. Otherwise, if the remaining disk of that pair also fails, the entire system becomes unusable.

(5) After a new disk is installed, synchronizing the original data onto it takes a long time. External access to the data is not interrupted, but overall system performance drops during synchronization.

(6) RAID 1 places a considerable load on the disk controller. Using multiple disk controllers can improve data security and availability. RAID 1 offers high data redundancy and is easy to repair, but disk utilization is only 50%. When the original disk is busy, data can be read directly from the mirror.
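The behaviour in points (1) to (5) can be illustrated with a toy model of one mirrored pair. The class `MirrorPair` and its method names are hypothetical, and the disks are plain Python lists; this is a sketch of the idea, not of any real controller.

```python
class MirrorPair:
    """Toy model of a RAID 1 pair: every write goes to both disks,
    and a read can be served by any disk that has not failed."""

    def __init__(self, blocks):
        self.disks = [[None] * blocks, [None] * blocks]
        self.failed = [False, False]

    def write(self, block, data):
        alive = [i for i in (0, 1) if not self.failed[i]]
        if not alive:
            raise OSError("both mirrors have failed")
        for i in alive:
            self.disks[i][block] = data

    def read(self, block):
        for i in (0, 1):
            if not self.failed[i]:
                return self.disks[i][block]
        raise OSError("both mirrors have failed")

    def fail(self, i):
        self.failed[i] = True

    def replace(self, i):
        """Swap in a blank disk and resynchronize it from the survivor."""
        source = 1 - i
        if self.failed[source]:
            raise OSError("no surviving mirror to copy from")
        self.disks[i] = list(self.disks[source])  # full copy: the slow step
        self.failed[i] = False


pair = MirrorPair(blocks=4)
pair.write(0, "a")
pair.fail(0)            # one disk fails; the pair keeps working
pair.write(1, "b")
print(pair.read(0))     # a
pair.replace(0)         # resynchronize onto the replacement disk
pair.fail(1)
print(pair.read(1))     # b
```

The full copy inside `replace` corresponds to the long synchronization mentioned in point (5).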

RAID 2

RAID 2 is also known as Hamming code check stripe storage. Data is striped across different hard disks with a stripe unit of one bit or byte, and a Hamming code provides error checking and recovery. The code requires multiple disks to store the check and recovery information, which makes RAID 2 more complex to implement, so it is rarely used in commercial environments.
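To give a feel for the check information involved, here is a minimal sketch of a Hamming(7,4) code, in which four data bits are protected by three check bits. The choice of the (7,4) code is an assumption made for brevity; in a RAID 2 setting each of the seven bit positions would sit on its own disk.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit codeword laid out as
    positions 1..7 = p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data bits, error position). The position is 1-based,
    and 0 means no single-bit error was detected."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3
    if position:
        c[position - 1] ^= 1   # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], position


word = hamming74_encode([1, 0, 1, 1])
print(word)                      # [0, 1, 1, 0, 0, 1, 1]
word[4] ^= 1                     # corrupt position 5
print(hamming74_correct(word))   # ([1, 0, 1, 1], 5)
```

Three check bits for every four data bits shows why the scheme needs several extra disks.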

RAID 3

RAID 3 is also known as parity (XOR) stripe storage with a shared parity disk, and its stripe unit is the byte. RAID 3 dedicates one hard disk to the parity information and stripes the data across the remaining hard disks. It transfers data in parallel like RAID 0, but it is not as fast as RAID 0. If a physical data disk is damaged, replacing it is enough: the RAID controller rebuilds its contents on the new disk from the remaining data disks and the parity disk. If the physical parity disk is damaged, the data remains readable but is unprotected until the parity disk is replaced and the parity recomputed. Protecting data with a single parity disk is less safe than mirroring, but disk utilization improves greatly, to (n-1)/n. RAID 3 provides a good transfer rate for large amounts of sequential data, but for random access the parity disk becomes a bottleneck for write operations and reduces read and write speed.
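A small sketch may make byte-level striping with a dedicated parity disk more concrete. The helper names `raid3_write` and `raid3_rebuild` are hypothetical, and each disk is modelled as a `bytearray`.

```python
from functools import reduce


def raid3_write(data, n_data_disks):
    """Interleave bytes over the data disks and compute one XOR parity
    byte per row. Returns (list of data disks, parity disk)."""
    pad = (-len(data)) % n_data_disks
    padded = data + bytes(pad)             # zero-pad the last row
    disks = [bytearray() for _ in range(n_data_disks)]
    parity = bytearray()
    for start in range(0, len(padded), n_data_disks):
        row = padded[start:start + n_data_disks]
        for disk, byte in zip(disks, row):
            disk.append(byte)
        parity.append(reduce(lambda a, b: a ^ b, row))
    return disks, parity


def raid3_rebuild(disks, parity, lost):
    """Recompute the contents of data disk `lost` from the parity disk
    and the surviving data disks."""
    rebuilt = bytearray(parity)
    for index, disk in enumerate(disks):
        if index != lost:
            for i, byte in enumerate(disk):
                rebuilt[i] ^= byte
    return rebuilt


disks, parity = raid3_write(b"hello raid", n_data_disks=3)
original = bytes(disks[1])
disks[1] = bytearray(len(original))        # simulate a replaced, blank disk
print(raid3_rebuild(disks, parity, lost=1) == original)  # True
```

Every row spans all data disks, so any transfer, however small, involves the whole array and the parity disk.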

RAID 4

RAID 4 is also parity (XOR) stripe storage with a shared parity disk, but its stripe unit is the block or record. RAID 4 dedicates one disk to parity, and every write operation must access that parity disk, so it becomes the bottleneck for writes. For this reason RAID 4 is rarely used in commercial environments.
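The bottleneck can be seen by counting which disks a series of single-block writes touches. The sketch below assumes, purely for illustration, that the last disk holds parity and that blocks are spread round-robin over the data disks; `raid4_write_load` is a hypothetical name.

```python
from collections import Counter


def raid4_write_load(logical_blocks, n_disks):
    """Count writes per disk for single-block writes on an array whose
    last disk is the dedicated parity disk."""
    n_data = n_disks - 1
    parity_disk = n_disks - 1
    load = Counter()
    for block in logical_blocks:
        load[block % n_data] += 1   # the data disk holding this block
        load[parity_disk] += 1      # parity is updated on every write
    return load


load = raid4_write_load(range(12), n_disks=4)
print(sorted(load.items()))   # [(0, 4), (1, 4), (2, 4), (3, 12)]
```

With three data disks, the parity disk absorbs three times as many writes as any single data disk, and the imbalance grows with the number of data disks.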

RAID 5

RAID 5 is parity (XOR) stripe storage with distributed parity, and its stripe unit is the block. RAID 5 has no dedicated parity disk; it stores data and parity information across all disks. Read and write operations can proceed on several disks of the array at the same time, which provides higher data throughput. RAID 5 is best suited to small data blocks and random reads and writes. The most important difference from RAID 3 is that every RAID 3 transfer involves all disks of the array, whereas most RAID 5 transfers touch only one disk and can run in parallel. RAID 5 has a "write penalty": each write operation generates four actual I/O operations, two reads (the old data and the old parity) and two writes (the new data and the new parity).
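The four I/O operations of a small write can be made explicit with a read-modify-write sketch. The class `CountingDisk` and the function `small_write` are hypothetical, and each block is modelled as a small integer so that XOR can be applied directly.

```python
class CountingDisk:
    """A toy disk that counts how many reads and writes it serves."""

    def __init__(self, blocks):
        self.blocks = [0] * blocks
        self.reads = 0
        self.writes = 0

    def read(self, index):
        self.reads += 1
        return self.blocks[index]

    def write(self, index, value):
        self.writes += 1
        self.blocks[index] = value


def small_write(data_disk, parity_disk, index, new_data):
    """Update one data block and keep the stripe's parity consistent."""
    old_data = data_disk.read(index)         # I/O 1: read old data
    old_parity = parity_disk.read(index)     # I/O 2: read old parity
    # Remove the old data from the parity, then fold in the new data.
    new_parity = old_parity ^ old_data ^ new_data
    data_disk.write(index, new_data)         # I/O 3: write new data
    parity_disk.write(index, new_parity)     # I/O 4: write new parity


d0, d1, p = CountingDisk(1), CountingDisk(1), CountingDisk(1)
d0.blocks[0], d1.blocks[0] = 0b1010, 0b0110
p.blocks[0] = d0.blocks[0] ^ d1.blocks[0]

small_write(d0, p, 0, 0b1111)
print(p.blocks[0] == d0.blocks[0] ^ d1.blocks[0])   # True
print(d0.reads + d0.writes + p.reads + p.writes)    # 4
```

Note that the other data disk (`d1`) is never touched, which is what lets independent small writes run in parallel on different disks.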

RAID 5 distributes the parity blocks across all disks. A placement algorithm determines the location of the parity block for any stripe, so that reads and writes of parity are balanced across all disks of the array, which removes the bottleneck of a dedicated parity disk. RAID 5 has high read efficiency, average write efficiency, and good efficiency for block-level collective access. RAID 5 improves system reliability, but it does not solve the parallelism of data transmission, and the controller is also quite difficult to design. RAID 5 redundancy requires an array of at least three disks, not counting a hot spare. RAID 5 can be implemented in disk array controller hardware or by certain network operating system software. Disk utilization is (n-1)/n.
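One possible placement rule is sketched below: parity starts on the last disk and moves one disk to the left with each stripe. This particular rotation is an assumption for illustration; actual implementations use several different layouts. The names `parity_disk`, `layout` and `usable_fraction` are hypothetical.

```python
def parity_disk(stripe, n_disks):
    """Disk that holds the parity block of the given stripe."""
    return (n_disks - 1 - stripe) % n_disks


def layout(n_disks, n_stripes):
    """Return one row of labels per stripe: Dk for data, Pk for parity."""
    rows = []
    next_block = 0
    for stripe in range(n_stripes):
        p = parity_disk(stripe, n_disks)
        row = []
        for disk in range(n_disks):
            if disk == p:
                row.append(f"P{stripe}")
            else:
                row.append(f"D{next_block}")
                next_block += 1
        rows.append(row)
    return rows


def usable_fraction(n_disks):
    """Share of raw capacity left for data: (n-1)/n."""
    return (n_disks - 1) / n_disks


for row in layout(n_disks=4, n_stripes=4):
    print(" ".join(f"{label:>3}" for label in row))
#  D0  D1  D2  P0
#  D3  D4  P1  D5
#  D6  P2  D7  D8
#  P3  D9 D10 D11

print(usable_fraction(4))   # 0.75
```

Over any n consecutive stripes each disk holds exactly one parity block, so parity updates are spread evenly.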

For recovery, suppose block A0 in the figure below must be restored: B0, C0, D0 and the parity block of stripe 0 are XORed together to recompute A0. For the same reason, when two disks fail at once, the data of the entire array is lost.
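The recovery of A0 can be checked in a few lines. The block contents are made-up sample values, and `xor_blocks` is a hypothetical helper.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Sample contents for stripe 0 (illustrative values only).
a0, b0, c0, d0 = b"AAAA", b"BBBB", b"CCCC", b"DDDD"
p0 = xor_blocks(a0, b0, c0, d0)            # parity block of stripe 0

# One disk lost: the survivors plus parity give back A0.
print(xor_blocks(b0, c0, d0, p0) == a0)    # True

# Two disks lost: the survivors only yield A0 XOR B0, not either block.
print(xor_blocks(c0, d0, p0) == xor_blocks(a0, b0))   # True
```

With two blocks missing, a single parity equation has two unknowns, which is why a second disk failure cannot be repaired.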

RAID 6

RAID 6 is parity (XOR) stripe storage with two sets of distributed parity, and its stripe unit is the block. Compared with RAID 5, RAID 6 adds a second independent parity block. The two parity blocks are computed with different algorithms, so data reliability is very high: even if two disks fail at the same time, the data remains usable. However, RAID 6 needs more disk space for parity information and has a larger "write penalty" than RAID 5, so its write performance is very poor. Poor performance and a complex implementation mean that RAID 6 is rarely used in practice.
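The text does not say which two algorithms are used. One widely used combination, assumed here for illustration, is a plain XOR parity P plus a Reed-Solomon style parity Q computed in the finite field GF(2^8). The sketch works on one byte per disk; all function names are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with reducing polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result


def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    # a**255 == 1 for every non-zero a, so a**254 is its inverse.
    return gf_pow(a, 254)


def pq_parity(data):
    """P is the XOR of all data bytes; Q weights disk i by 2**i."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(gf_pow(2, i), d)
    return p, q


def recover_two(data, p, q, x, y):
    """Recover the bytes of failed data disks x and y. Entries of `data`
    at x and y are ignored, so they may be None."""
    pxy, qxy = p, q
    for i, d in enumerate(data):
        if i not in (x, y):
            pxy ^= d                        # leaves Dx ^ Dy
            qxy ^= gf_mul(gf_pow(2, i), d)  # leaves g^x*Dx ^ g^y*Dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    dx = gf_mul(qxy ^ gf_mul(gy, pxy), gf_inv(gx ^ gy))
    return dx, pxy ^ dx


data = [0x12, 0x34, 0x56, 0x78]
p, q = pq_parity(data)
damaged = [0x12, None, 0x56, None]          # disks 1 and 3 have failed
print(recover_two(damaged, p, q, 1, 3) == (0x34, 0x78))   # True
```

Two independent equations (P and Q) allow two unknowns to be solved, at the cost of extra parity space and extra work on every write.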

RAID 7

This is a completely new RAID architecture. It has its own real-time operating system and storage management software tools, so it can run completely independently of the host without consuming host CPU resources. RAID 7 can be regarded as a small storage computer, which sets it apart from the other RAID architectures.

In theory, RAID 7 is by far the highest-performance RAID architecture, because it is organized very differently from the earlier levels (see the figure). In the earlier levels, each single hard disk is one "column" of the array. In RAID 7, several hard disks together form one "column", and each of them has its own channel. The array can therefore be viewed as hard disks attached to a main channel, but subdivided more finely than in the earlier levels. The advantage is that data in a given area can be located quickly for reading or writing, and access is not restricted to one part of the data area at a time by the limits of a single hard disk. In effect, what used to be a single hard disk is divided into multiple independent hard disks, each with its own read/write channel.