CN103761195B

CN103761195B - Storage method utilizing distributed data encoding

Info

Publication number: CN103761195B
Application number: CN201410009331.1A
Authority: CN
Inventors: 王欢
Original assignee: Inspur Electronic Information Industry Co Ltd
Current assignee: Inspur Electronic Information Industry Co Ltd
Priority date: 2014-01-09
Filing date: 2014-01-09
Publication date: 2017-05-10
Anticipated expiration: 2034-01-09
Also published as: CN103761195A

Abstract

The invention provides a storage method utilizing distributed data encoding. In the method, after a client process receives data from a cache, it performs an encoding calculation to generate check data blocks; after encoding is complete, the client sends the data slices to all servers; each time a server returns write success, the client updates a counter; according to the configuration parameters of the system, once the configured level at which the data can be recovered is reached, server nodes whose writes failed are no longer rewritten. Compared with the prior art, the storage method implements a disaster recovery scheme using matrix operations, tolerates failures of any multiple storage nodes or disks, and greatly lowers the CPU (central processing unit) occupancy of conventional erasure-code or matrix operations; the storage method is highly practical and easy to popularize.

Description

A storage method utilizing distributed data encoding

Technical field

The present invention relates to the technical field of computer data storage, and specifically to a storage method utilizing distributed data encoding.

Background art

Conventional distributed file systems, whether commercial or open source, essentially all rely on multiple replicas to keep data safe. Commercial examples include IBM's GPFS, BlueWhale's BWFS, and Panasas's PanFS; open-source examples include Lustre, GlusterFS, HDFS, and Ceph. Typically, replication is used between nodes to provide fault redundancy at the node level, and RAID5 or RAID6 is used within a node to provide fault redundancy at the disk level. Replication is usually implemented by writing the same data several times, so write performance drops linearly as the number of replicas increases. For RAID operations across disks, both initialization and rebuild or recovery are highly time-consuming, which is a recognized difficulty in the industry. Despite all of this inefficiency accepted for the sake of safety, such systems can only tolerate the failure of one or two nodes and one disk, which falls far short of the 99.9999% level required by industry.

In the current era of big data, data is an enterprise's most important asset, and the security of that data is often the lifeblood of a company. With the arrival of the mobile Internet, the daily growth of data from massive numbers of users is staggering, and for a certain period of time all of this data must be stored with high security. The two most important factors affecting data storage are, first, security and, second, performance.

In the prior art, the replication used in traditional distributed file systems is prone to split-brain and inconsistency problems. Meanwhile, traditional disk-level RAID5 or RAID6 schemes greatly increase an enterprise's costs, and their rebuild and recovery processes consume considerable power and waste resources. On this basis, a new storage method with high availability and high performance is provided.

Summary of the invention

The technical task of the present invention is to address the deficiencies of the prior art by providing a storage method utilizing distributed data encoding.

The technical solution of the present invention is realized as follows: a storage method utilizing distributed data encoding, whose specific storage process is:

Step 1: After the client process receives data from the cache, it performs an encoding calculation to generate check data blocks. The check data blocks ensure that the data remains usable when 3 nodes or disks fail. That is, an N+M scheme is used, where N is the number of data blocks and M is the number of check blocks; N is a natural number greater than or equal to 1, and M is a natural number greater than or equal to 3;

Step 2: After encoding is complete, the client sends the data slices to all servers. Each time a server returns write success, the client updates a counter. According to the configuration parameters of the system, once the configured level at which the data can be recovered is reached, the client no longer rewrites to server nodes whose writes failed.

In step 1, the real-time encoding calculation performed by the client uses a 2+2 grouped parallel high-speed encoding scheme, specifically: every 3 data blocks form one group; the two groups are each encoded separately to form their own check block, and at the same time all 6 data blocks are encoded as a whole. This produces 3 check blocks in total, giving a redundancy of 3.

The detailed process of step 2 is: When the client receives a read request, it sends read instructions to all storage server nodes and waits for the data returned by each storage server. After the client has performed a consistency check on the returned data and incremented the counter, check blocks that return first are cached; the number of check blocks cached is determined by configuration. If all normal data blocks subsequently return, the data blocks are returned to the calling user and the cached check blocks are discarded;

If, during a normal read, a data block cannot be read or is found to be damaged according to its flag bit, the cached check block is used to fill the gap, decoding is performed according to the 2+2 grouping, and the recovered data is then returned to the calling user in the upper layer.

In the normal read flow, read requests are sent preferentially to the nodes holding normal data blocks.

A repair process exists in the cluster. This process continuously checks all data. When it finds data loss or a node failure, it re-reads the data, decodes to recover the data, and then rewrites the data to the storage servers. The repair process keeps a detailed record of the repaired files and their layout; when a new node is put back into the cluster, the repair process transfers the data to it.

Compared with the prior art, the beneficial effects of the present invention are:

The storage method utilizing distributed data encoding of the present invention replaces the reliability guarantees of traditional replication and, at the same time, replaces disk-level RAID schemes. It solves the split-brain and low-performance problems of replication, as well as the time-consuming and costly rebuild and recovery of RAID5 or RAID6 schemes. It implements a disaster recovery scheme using matrix operations, tolerates the failure of any number of storage nodes or disks, and greatly reduces the CPU usage of traditional erasure codes or matrix operations. It is highly practical and easy to popularize.

Description of the drawings

Figure 1 is the file write flowchart of the present invention.

Figure 2 is a schematic diagram of the encoding algorithm of the present invention.

Figure 3 is the file read flowchart of the present invention.

Figure 4 is the file repair-read flowchart of the present invention.

Figure 5 is the file repair flowchart of the present invention.

Detailed description of the embodiments

The storage method utilizing distributed data encoding of the present invention is described in detail below with reference to the accompanying drawings.

As shown in Figures 1 to 5, the present invention provides a storage method utilizing distributed data encoding, whose specific storage process is:

As shown in Figure 1, after the client process receives data from the cache, it performs an encoding calculation to generate check data blocks. Traditional computation methods use Cauchy coding matrices and generalized circular matrices, and the matrix operations often consume a great deal of CPU, so performance cannot be improved. Considering that the overwhelming majority of failures in practical applications involve one node or two nodes, this scheme meets practical needs by tolerating the failure of 3 nodes or disks. In this technical solution this is called the N+M scheme, where N is the number of data blocks and M is the number of check blocks; here M=3. By comparison, a replication approach would require 4 copies to exist simultaneously, at which point the performance of replication drops substantially.
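To put the comparison between N+M encoding and replication in numbers, the following minimal sketch computes the raw storage consumed per byte of user data. The function names are hypothetical, and N=6 is an assumption taken from the 6-block grouping in this description; only M=3 and the 4-copy comparison come from the text.

```python
def erasure_overhead(n_data: int, m_check: int) -> float:
    """Raw bytes stored per byte of user data under an N+M scheme."""
    return (n_data + m_check) / n_data


def replica_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when keeping full copies."""
    return float(copies)


# Assumed example: N=6 data blocks, M=3 check blocks, versus 4 full copies.
n, m = 6, 3
print("N+M overhead:", erasure_overhead(n, m))
print("4-copy overhead:", replica_overhead(m + 1))
```

Under these assumptions the encoded layout stores 1.5 bytes per user byte, while four copies store 4 bytes per user byte.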

Open-source schemes such as HDFS and TFS also provide fault-tolerance modes based on Cauchy matrices or generalized circular matrices, but these run on the storage nodes and cannot guarantee data safety in real time. Schemes that encode asynchronously have poor reliability and low security, and most of them retain replicas, so write performance cannot be guaranteed at all. The present invention performs the encoding calculation in real time on the client and further optimizes performance, far surpassing distributed systems such as HDFS and TFS in both security and performance.

The invention also adopts a 2+2 grouped parallel high-speed encoding scheme. As shown in Figure 2, every 3 data blocks form one group; the two groups are each encoded separately to form their own check block, and at the same time all 6 data blocks are encoded as a whole. This yields 3 check blocks in total, so the redundancy also reaches 3, equivalent to a 4-copy scheme. The fastest way to generate a single check block is linear encoding, whose performance is significantly better than encoding with generalized circular matrices or Cauchy matrices.
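The grouped encoding can be sketched as follows. This is a minimal illustration, not the patented encoder: the text says only that the encoding is linear, so the sketch assumes byte-wise XOR for the two group check blocks and a GF(2^8)-weighted sum for the overall check block (chosen so that the overall block is not simply the XOR of the two group blocks). All function names and the coefficients 1 to 6 are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reduction polynomial 0x11D (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def encode_stripe(data_blocks):
    """Return (group-1 check, group-2 check, overall check) for 6 data blocks."""
    assert len(data_blocks) == 6
    p1 = xor_blocks(data_blocks[:3])   # check block of the first group
    p2 = xor_blocks(data_blocks[3:])   # check block of the second group
    coeffs = [1, 2, 3, 4, 5, 6]        # hypothetical distinct coefficients
    scaled = [bytes(gf_mul(c, x) for x in blk)
              for c, blk in zip(coeffs, data_blocks)]
    overall = xor_blocks(scaled)       # overall check block over all 6 blocks
    return p1, p2, overall


def recover_in_group(group_check, surviving_blocks):
    """Rebuild the single missing block of a group from its check block."""
    return xor_blocks([group_check] + list(surviving_blocks))


data = [bytes([i] * 4) for i in range(1, 7)]
p1, p2, overall = encode_stripe(data)
rebuilt = recover_in_group(p1, [data[0], data[2]])
print(rebuilt == data[1])
```

With a single lost block, only the check block of the affected group and the two surviving blocks of that group are read, which is the common case the description relies on.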

In practical applications, the failures that occur are most likely to involve one or two nodes, in which case recovering the data often requires only one check block; this reduces the amount of check data transferred and speeds up decoding during recovery. When one node is damaged in the first group and 2 nodes are damaged in the second group, the third check block (from the overall encoding) is needed in addition to the second group's check block. However, the probability of this case is very low.

After encoding is complete, the client sends the data slices to all servers. Each time a server returns write success, the client updates a counter. According to the configuration parameters of the system, once the configured level at which the data can be recovered is reached, the client no longer rewrites to server nodes whose writes failed. This also improves write performance; the general idea is to keep the write operations of each task to a minimum.
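The counter-based write path might look like the sketch below. It is an illustration under stated assumptions: servers are modeled as plain callables that raise OSError on failure, and `min_acks` and `max_retries` are hypothetical names for the configuration parameters, which the text does not name.

```python
def write_stripe(pieces, servers, min_acks, max_retries=1):
    """Send one piece to each server and count successful writes.

    Returns (recoverable, ack_count, failed_indexes). Once ack_count reaches
    min_acks, nodes whose writes failed are not written to again.
    """
    acked = [False] * len(servers)
    ack_count = 0
    for _attempt in range(max_retries + 1):
        for i, (send, piece) in enumerate(zip(servers, pieces)):
            if acked[i]:
                continue
            try:
                send(piece)
            except OSError:
                continue          # write failed; the counter is not updated
            acked[i] = True
            ack_count += 1        # counter maintained on the client
        if ack_count >= min_acks:
            break                 # data is recoverable: skip rewriting
    failed = [i for i, ok in enumerate(acked) if not ok]
    return ack_count >= min_acks, ack_count, failed


def healthy_node(piece):
    pass  # stand-in for a successful remote write


def down_node(piece):
    raise OSError("node unavailable")


ok, count, failed = write_stripe([b"x"] * 9, [healthy_node] * 8 + [down_node],
                                 min_acks=6)
print(ok, count, failed)
```

Retries happen only while the counter is still below the configured threshold, so a stripe that is already recoverable never waits on a failed node.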

Normal file read flow: As shown in Figure 3, when the client receives a read request, it sends read instructions to all storage server nodes and waits for the data returned by each storage server. After the client has performed a consistency check on the returned data and incremented the counter, check blocks that return first are cached; the number of check blocks cached is determined by configuration. If all normal data blocks subsequently return, the data blocks are returned to the calling user and the cached check blocks are discarded.

In the normal read flow, read requests are sent preferentially to the nodes holding normal data blocks, which avoids decoding calculations on the client.

File repair-read flow: As shown in Figure 4, if during a normal read a data block cannot be read or is found to be damaged according to its flag bit, the cached check block is used to fill the gap, decoding is performed according to the 2+2 grouping method, and the recovered data is then returned to the calling user in the upper layer.
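The two read flows can be sketched for a single group as follows. This is a simplified illustration: it assumes one group of 3 data blocks protected by one XOR check block, and the response dictionary with keys `d0`, `d1`, `d2`, and `p` is a hypothetical stand-in for the server replies, where `None` marks a block that could not be read or was flagged as damaged.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def read_group(responses):
    """Return the group's data, decoding from the cached check block if needed."""
    data = [responses.get(key) for key in ("d0", "d1", "d2")]
    cached_check = responses.get("p")      # check block cached on arrival
    missing = [i for i, blk in enumerate(data) if blk is None]
    if not missing:
        # Normal read: all data blocks arrived, the cached check block is dropped.
        return b"".join(data)
    if len(missing) == 1 and cached_check is not None:
        # Repair read: fill the gap from the check block and the survivors.
        survivors = [blk for blk in data if blk is not None]
        data[missing[0]] = xor_blocks([cached_check] + survivors)
        return b"".join(data)
    raise IOError("more check blocks are needed than this sketch handles")


blocks = [b"ab", b"cd", b"ef"]
check = xor_blocks(blocks)
print(read_group({"d0": b"ab", "d1": None, "d2": b"ef", "p": check}))
```

Losses that exceed one block per group would need the other check blocks, which this sketch deliberately leaves out.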

File repair flow: As shown in Figure 5, to ensure the security of the data, a repair process exists in the cluster. It continuously checks all data. When it finds data loss or a node failure, it re-reads the data, decodes to recover the data, and then rewrites the data to the storage servers. The repair process keeps a detailed record of the repaired files and their layout; when a new node is put back into the cluster, the repair process transfers the data to it.
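One pass of such a repair process could be sketched as below. The in-memory `stripes` dictionary, the single XOR check block per stripe, and the log tuple format are all assumptions made for illustration; the text does not describe the on-disk layout or the record format.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def repair_pass(stripes, repair_log):
    """Scan every stripe once, rebuild single lost blocks, and record the result."""
    for name, stripe in stripes.items():
        blocks = stripe["data"]
        lost = [i for i, blk in enumerate(blocks) if blk is None]
        if not lost:
            continue                       # stripe is healthy
        if len(lost) == 1 and stripe["check"] is not None:
            survivors = [blk for blk in blocks if blk is not None]
            # Decode the lost block and write it back into the stripe.
            blocks[lost[0]] = xor_blocks([stripe["check"]] + survivors)
            repair_log.append((name, lost[0], "repaired"))
        else:
            repair_log.append((name, None, "needs more check blocks"))
    return repair_log


original = [b"ab", b"cd", b"ef"]
stripes = {"file-1": {"data": [b"ab", None, b"ef"],
                      "check": xor_blocks(original)}}
print(repair_pass(stripes, []))
```

A real repair process would run this scan continuously and use the log to decide what to transfer when a replacement node joins the cluster.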

The present invention greatly improves space utilization and alleviates the low-performance problem caused by replication, while also reducing the cost of disk-level RAID.

The foregoing is merely an embodiment of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A storage method utilizing distributed data encoding, characterized in that its specific storage process is:

Step 1: After the client process receives data from the cache, it performs an encoding calculation to generate check data blocks. The check data blocks ensure that the data remains usable when 3 nodes or disks fail. The numbers of data blocks and check blocks follow an N+M scheme, where N is the number of data blocks and M is the number of check blocks; N is a natural number greater than or equal to 1, and M is a natural number greater than or equal to 3;

Step 2: After encoding is complete, the client sends the data slices to all servers. Each time a server returns write success, the client updates a counter. According to the configuration parameters of the system, once the configured level at which the data can be recovered is reached, the client no longer rewrites to server nodes whose writes failed;

In step 1, the real-time encoding calculation performed by the client uses a 2+2 grouped parallel high-speed encoding scheme, specifically: every 3 data blocks form one group; the two groups are each encoded separately to form their own check block, and at the same time all 6 data blocks are encoded as a whole. This produces 3 check blocks in total, giving a redundancy of 3;

The detailed process of step 2 is: When the client receives a read request, it sends read instructions to all storage server nodes and waits for the data returned by each storage server. After the client has performed a consistency check on the returned data and incremented the counter, check blocks that return first are cached; the number of check blocks cached is determined by configuration. If all normal data blocks subsequently return, the data blocks are returned to the calling user and the cached check blocks are discarded;

If, during a normal read, a data block cannot be read or is found to be damaged according to its flag bit, the cached check block is used to fill the gap, decoding is performed according to the 2+2 grouping, and the recovered data is then returned to the calling user in the upper layer.

2. The storage method utilizing distributed data encoding according to claim 1, characterized in that: in said normal read flow, read requests are sent preferentially to the nodes holding normal data blocks.

3. The storage method utilizing distributed data encoding according to claim 2, characterized in that: the method is applied in a cluster in which a repair process exists. The repair process continuously checks all data. When it finds data loss or a node failure, it re-reads the data, decodes to recover the data, and then rewrites the data to the storage servers. The repair process keeps a detailed record of the repaired files and their layout; when a new node is put back into the cluster, the repair process transfers the data to it.