CN103761195A - Storage method utilizing distributed data encoding - Google Patents

Info

Publication number
CN103761195A
Authority
CN
China
Prior art keywords
data
client
coding
server
buffer memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410009331.1A
Other languages
Chinese (zh)
Other versions
CN103761195B (en)
Inventor
王欢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN201410009331.1A priority Critical patent/CN103761195B/en
Publication of CN103761195A publication Critical patent/CN103761195A/en
Application granted granted Critical
Publication of CN103761195B publication Critical patent/CN103761195B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a storage method using distributed data encoding. After a client process receives data from the cache, it performs an encoding calculation to generate parity blocks. Once encoding is complete, the client sends the data fragments to all servers and maintains a counter that is updated each time a server reports a successful write. According to the system's configuration parameters, once the configured level at which the data can be recovered has been reached, writes to server nodes where the write failed are not retried. Compared with the prior art, the method implements a disaster recovery scheme using matrix operations, tolerates the failure of any several storage nodes or disks, and greatly reduces the CPU usage of conventional erasure-code or matrix computation. It is highly practical and easy to adopt.

Description

A storage method using distributed data encoding
Technical Field
The present invention relates to the technical field of computer data storage, and specifically to a storage method using distributed data encoding.
Background
Today's conventional distributed file systems, whether commercial or open source, almost all rely on multiple replicas to keep data safe. Commercial examples include IBM's GPFS, Blue Whale's BWFS and Panasas's PanFS; open-source examples include Lustre, GlusterFS, HDFS and Ceph. These systems generally keep multiple replicas across nodes to provide redundancy against node failure, and use RAID5 or RAID6 on each node to provide redundancy against disk failure. Replication is usually implemented by writing the same data several times, so write performance falls linearly as the number of replicas grows. For RAID across disks, both initialization and rebuild are very time-consuming, which is a long-standing difficulty in the industry. Despite this inefficiency, such schemes can only tolerate the failure of one or two nodes and one disk, which falls far short of the 99.9999% requirement of industrial use.
In today's big-data era, data is an enterprise's most important asset, and its security is often the lifeblood of a company. With the arrival of the mobile Internet, the data generated by massive numbers of users grows at an astonishing daily rate, and for a given period all of that data must be stored with high security. The two factors that matter most for data storage are security and performance.
In the prior art, the replica approach in conventional distributed file systems is prone to split-brain and inconsistency problems. Conventional disk-level RAID5 or RAID6 schemes also raise an enterprise's costs considerably, and rebuild and recovery consume a great deal of power, wasting resources. On this basis, a new, highly available, high-performance storage method is provided.
Summary of the Invention
The technical task of the present invention is to remedy the deficiencies of the prior art by providing a storage method using distributed data encoding.
The technical solution of the present invention is implemented as follows. The storage process of this storage method using distributed data encoding is:
Step 1. After the client process receives data from the cache, it performs an encoding calculation to generate parity blocks. These parity blocks guarantee that the system keeps operating when 3 nodes or disks fail. An N+M scheme is used, where N is the number of data blocks and M is the number of parity blocks; N is a natural number greater than or equal to 1, and M is a natural number greater than or equal to 3.
Step 2. After encoding is complete, the client sends the data fragments to all servers. Each time a server reports a successful write, the client updates a counter that it maintains. According to the system's configuration parameters, once the configured level at which the data can be recovered has been reached, the write is not retried on server nodes where it failed.
In step 1, the client's real-time encoding calculation uses a 2+2 grouped parallel high-speed encoding scheme, as follows: the data blocks are divided into groups of 3; each of the two groups is encoded to form its own parity block, and at the same time all 6 data blocks are encoded together as a whole. This produces 3 parity blocks, giving a redundancy of 3.
The detailed process of step 2 is as follows: when the client receives a read request, it sends read instructions to all storage server nodes and waits to receive the data that each storage server returns. After the client has run a consistency check on the returned data and updated the counter, it caches the parity blocks that arrive first; the number cached is set by configuration. If all the regular data blocks subsequently arrive, the data blocks are returned to the calling user and the cached parity blocks are discarded.
If, during a normal read, a data block cannot be read or is found to be damaged according to its marker bit, the cached parity blocks are used to fill the gap, a decoding calculation is performed according to the 2+2 grouping, and the recovered data is then returned to the calling upper-layer user.
In the normal read flow, read requests are sent preferentially to the nodes holding regular data blocks.
A repair process runs in the cluster. This process continuously checks all data. When it finds data loss or a node failure, it re-reads the data, decodes to recover it, and writes it to a storage server again. The repair process keeps a detailed record of the repaired files and their layout, and when a new node is put back into the cluster, the repair process transfers the data to it.
Compared with the prior art, the present invention has the following beneficial effects:
The storage method using distributed data encoding of the present invention replaces the conventional replica approach as the means of guaranteeing reliability, and also replaces existing disk-level RAID schemes. It solves the split-brain problem and low performance of replicas, as well as the long duration and high cost of rebuild and recovery under RAID5 or RAID6. It uses matrix operations to implement a disaster recovery scheme that tolerates the failure of any several storage nodes or disks, and it greatly reduces the CPU usage of conventional erasure-code or matrix computation. It is practical and easy to adopt.
Brief Description of the Drawings
Figure 1 is the file-write flowchart of the present invention.
Figure 2 is a schematic diagram of the encoding algorithm of the present invention.
Figure 3 is the file-read flowchart of the present invention.
Figure 4 is the repair-read flowchart of the present invention.
Figure 5 is the file-repair flowchart of the present invention.
Detailed Description
The storage method using distributed data encoding of the present invention is described in detail below with reference to the accompanying drawings.
As shown in Figures 1 to 5, the present invention provides a storage method using distributed data encoding, whose storage process is as follows:
As shown in Figure 1, after the client process receives data from the cache, it performs an encoding calculation to generate parity blocks. Conventional computation methods use a Cauchy encoding matrix or a Vandermonde matrix and carry out matrix operations, which often consume a great deal of CPU and do not improve performance. In practice, the overwhelming majority of failures involve one or two nodes, so a scheme that tolerates the failure of 3 nodes or disks meets real-world needs. This technical solution refers to the arrangement as the N+M scheme, where N is the number of data blocks and M is the number of parity blocks; here M = 3. By comparison, achieving the same tolerance with replicas would require 4 copies to exist at the same time, which would greatly reduce performance.
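The storage cost behind the comparison with 4 copies can be made concrete with a little arithmetic. The following is a minimal sketch, assuming N = 6 data blocks (the figure used later in this description) and M = 3; the function names are illustrative and do not come from the patent.

```python
from fractions import Fraction


def replication_overhead(copies: int) -> Fraction:
    """Raw bytes stored per byte of user data when keeping full copies."""
    return Fraction(copies, 1)


def erasure_overhead(n_data: int, m_parity: int) -> Fraction:
    """Raw bytes stored per byte of user data with N data + M parity blocks."""
    return Fraction(n_data + m_parity, n_data)


# Assumed example values: 4 full copies versus a 6+3 layout.
four_copies = replication_overhead(4)
six_plus_three = erasure_overhead(6, 3)
```

Under these assumptions the replica layout stores four times the user data, while the 6+3 layout stores one and a half times.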
Open-source systems such as HDFS and TFS also offer fault-tolerance modes based on Cauchy or Vandermonde matrices, but all of them perform the encoding on the storage nodes, which cannot guarantee data safety in real time; encoding carried out asynchronously is unreliable and insecure. Most of these schemes also retain replicas, so write performance cannot be guaranteed at all. The present invention has the client perform the encoding calculation in real time and further optimizes performance, so that in both security and performance it far exceeds the approach of distributed systems such as HDFS and TFS.
The invention also adopts a 2+2 grouped parallel high-speed encoding scheme. As shown in Figure 2, the data blocks are divided into groups of 3; each of the two groups is encoded to form its own parity block, and at the same time all 6 data blocks are encoded together as a whole. In total this produces 3 parity blocks and a redundancy of 3, equivalent to a 4-copy scheme. The fastest way to produce a single parity block is linear encoding, whose performance is far better than encoding with a Vandermonde or Cauchy matrix.
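The grouped layout can be sketched as follows. This is only an illustration under stated assumptions: the patent does not give the coefficients of its linear code, so the sketch uses plain XOR for the two group parity blocks and, as an assumed choice, a weighted sum over GF(2^8) with reduction polynomial 0x11D for the block computed over all 6 data blocks. All function names are hypothetical, and the sketch makes no claim about which failure patterns are recoverable.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reduction polynomial 0x11D (assumed)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def global_parity(blocks):
    """Weighted sum over all blocks; block i is scaled by 2**i in GF(2^8)."""
    out = bytearray(len(blocks[0]))
    coeff = 1
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= gf_mul(coeff, byte)
        coeff = gf_mul(coeff, 2)
    return bytes(out)


def encode_grouped(data_blocks):
    """Return (parity of group 1, parity of group 2, parity over all 6 blocks)."""
    if len(data_blocks) != 6 or len({len(b) for b in data_blocks}) != 1:
        raise ValueError("expected 6 data blocks of equal size")
    group1, group2 = data_blocks[:3], data_blocks[3:]
    return xor_blocks(group1), xor_blocks(group2), global_parity(data_blocks)
```

A weighted sum is used for the third block because a plain XOR over all 6 blocks would simply equal the XOR of the two group parity blocks and would add no new information.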
In practice, the most likely failure is the loss of one or two nodes, and recovering the data then often needs only a single parity block, which reduces the volume of parity data transferred and speeds up decoding. When one node in the first group and two nodes in the second group are damaged, two parity blocks of the third group and one parity block of the second group are needed, but this situation is very unlikely.
After encoding is complete, the client sends the data fragments to all servers. Each time a server reports a successful write, the client updates a counter that it maintains. According to the system's configuration parameters, once the configured level at which the data can be recovered has been reached, the write is not retried on server nodes where it failed. This also improves write performance; the general idea is to keep the write operation for each task as small as possible.
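The counter logic described above might look like the following minimal sketch. The class name, its fields and the `required_acks` threshold are assumptions made for illustration; the patent only says that the threshold comes from the system's configuration parameters.

```python
from dataclasses import dataclass, field


@dataclass
class WriteTracker:
    """Client-side bookkeeping for one encoded write (hypothetical)."""

    total_blocks: int    # N + M blocks sent out
    required_acks: int   # configured level at which the data is recoverable
    acked: set = field(default_factory=set)
    failed: set = field(default_factory=set)

    def on_result(self, server_id: str, ok: bool) -> None:
        """Record the outcome reported for one server."""
        if ok:
            self.acked.add(server_id)
            self.failed.discard(server_id)
        else:
            self.failed.add(server_id)

    def recoverable(self) -> bool:
        """True once enough servers have confirmed their block."""
        return len(self.acked) >= self.required_acks

    def servers_to_retry(self) -> list:
        """Failed servers are retried only while the data is not yet recoverable."""
        return [] if self.recoverable() else sorted(self.failed)
```

With 9 blocks and an assumed threshold of 7, for example, two failed servers would not be retried once the other seven have confirmed.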
Normal file read flow: as shown in Figure 3, when the client receives a read request, it sends read instructions to all storage server nodes and waits to receive the data that each storage server returns. After the client has run a consistency check on the returned data and updated the counter, it caches the parity blocks that arrive first; the number cached is set by configuration. If all the regular data blocks subsequently arrive, the data blocks are returned to the calling user and the cached parity blocks are discarded.
In the normal read flow, read requests are sent preferentially to the nodes holding regular data blocks, which avoids decoding calculations on the client.
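The normal read flow can be sketched as a loop over responses in arrival order. This is a simplified illustration: the tuple format, the function name and the `parity_cache_limit` parameter are assumptions, and the consistency check mentioned in the text is left out.

```python
def collect_read(responses, n_data, parity_cache_limit):
    """Process read responses in arrival order.

    responses: iterable of (kind, index, payload), kind is "data" or "parity".
    Returns (data blocks by index, cached parity by index, complete flag).
    """
    data, parity_cache = {}, {}
    for kind, index, payload in responses:
        if kind == "data":
            data[index] = payload
        elif len(parity_cache) < parity_cache_limit:
            # Keep only the first few parity blocks, as set by configuration.
            parity_cache[index] = payload
        if len(data) == n_data:
            # Every data block arrived intact: cached parity is no longer needed.
            parity_cache.clear()
            return data, parity_cache, True
    return data, parity_cache, False
```

If the loop ends without the complete flag set, the cached parity blocks are still available for the repair-read flow.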
Repair-read flow: as shown in Figure 4, if during a normal read a data block cannot be read or is found to be damaged according to its marker bit, the cached parity blocks are used to fill the gap, a decoding calculation is performed according to the 2+2 grouping method, and the recovered data is then returned to the calling upper-layer user.
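For the common case of one lost block per group, the repair-read step can be sketched with XOR alone. The group layout, the function names and the use of XOR for the group parity are assumptions of this sketch; cases that need the parity block computed over all 6 data blocks are deliberately left out and raise an error.

```python
from functools import reduce

# Assumed layout: two groups of three data blocks each.
GROUPS = {0: (0, 1, 2), 1: (3, 4, 5)}


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def repair_read(data, group_parity):
    """Rebuild at most one missing data block per group.

    data: dict index -> bytes for blocks that were read intact.
    group_parity: dict group -> bytes holding the cached group parity block.
    """
    restored = dict(data)
    for group, members in GROUPS.items():
        missing = [i for i in members if i not in restored]
        if not missing:
            continue
        if len(missing) > 1 or group not in group_parity:
            raise ValueError(f"group {group} cannot be decoded from its own parity")
        survivors = [restored[i] for i in members if i in restored]
        restored[missing[0]] = xor_blocks(survivors + [group_parity[group]])
    return [restored[i] for i in sorted(restored)]
```

Only the two surviving blocks of the affected group and one parity block are involved in the decode, which matches the point made earlier that a single parity block is often enough.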
File repair flow: as shown in Figure 5, to guarantee data security a repair process runs in the cluster and continuously checks all data. When it finds data loss or a node failure, it re-reads the data, decodes to recover it, and writes it to a storage server again. The repair process keeps a detailed record of the repaired files and their layout, and when a new node is put back into the cluster, the repair process transfers the data to it.
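One pass of such a repair process could be sketched as below. Every name here is hypothetical: the health check, decoder, target selection and writer are passed in as callables because the patent does not describe their interfaces, and the reading of "put back into the cluster" as a node rejoining is an assumption.

```python
def scan_and_repair(layout, is_healthy, decode_block, pick_target, write_block):
    """Check every block once and rewrite the ones that are lost.

    layout: dict file name -> dict block id -> node currently holding the block.
    Returns a log of repaired blocks and where they were placed.
    """
    repair_log = []
    for file_name, placements in layout.items():
        for block_id, node in list(placements.items()):
            if is_healthy(node, file_name, block_id):
                continue
            payload = decode_block(file_name, block_id)   # rebuild from surviving blocks
            target = pick_target(file_name, block_id)     # choose a healthy server
            write_block(target, file_name, block_id, payload)
            placements[block_id] = target
            repair_log.append(
                {"file": file_name, "block": block_id, "home": node, "now_on": target}
            )
    return repair_log


def blocks_to_return(repair_log, rejoined_node):
    """List the blocks to transfer when a node is put back into the cluster."""
    return [
        (entry["file"], entry["block"], entry["now_on"])
        for entry in repair_log
        if entry["home"] == rejoined_node
    ]
```

Keeping the log separate from the layout is what lets the process decide later which blocks to move when a node comes back.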
The present invention can greatly improve space utilization, reduce the low-performance problem caused by the replica approach, and at the same time reduce the cost of disk-level RAID.
The above are only embodiments of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (5)

1. A storage method using distributed data encoding, characterized in that its storage process is:
Step 1. After the client process receives data from the cache, it performs an encoding calculation to generate parity blocks. These parity blocks guarantee that the system keeps operating when 3 nodes or disks fail. An N+M scheme is used, where N is the number of data blocks and M is the number of parity blocks; N is a natural number greater than or equal to 1, and M is a natural number greater than or equal to 3;
Step 2. After encoding is complete, the client sends the data fragments to all servers. Each time a server reports a successful write, the client updates a counter that it maintains. According to the system's configuration parameters, once the configured level at which the data can be recovered has been reached, the write is not retried on server nodes where it failed.
2. The storage method using distributed data encoding according to claim 1, characterized in that: in step 1, the client's real-time encoding calculation uses a 2+2 grouped parallel high-speed encoding scheme, as follows: the data blocks are divided into groups of 3; each of the two groups is encoded to form its own parity block, and at the same time all 6 data blocks are encoded together as a whole, producing 3 parity blocks and a redundancy of 3.
3. The storage method using distributed data encoding according to claim 2, characterized in that the detailed process of step 2 is: when the client receives a read request, it sends read instructions to all storage server nodes and waits to receive the data that each storage server returns; after the client has run a consistency check on the returned data and updated the counter, it caches the parity blocks that arrive first, the number cached being set by configuration; if all the regular data blocks subsequently arrive, the data blocks are returned to the calling user and the cached parity blocks are discarded;
if, during a normal read, a data block cannot be read or is found to be damaged according to its marker bit, the cached parity blocks are used to fill the gap, a decoding calculation is performed according to the 2+2 grouping, and the recovered data is then returned to the calling upper-layer user.
4. The storage method using distributed data encoding according to claim 3, characterized in that: in the normal read flow, read requests are sent preferentially to the nodes holding regular data blocks.
5. The storage method using distributed data encoding according to any one of claims 1 to 4, characterized in that: a repair process runs in the cluster and continuously checks all data; when it finds data loss or a node failure, it re-reads the data, decodes to recover it, and writes it to a storage server again; the repair process keeps a detailed record of the repaired files and their layout, and when a new node is put back into the cluster, the repair process transfers the data to it.
CN201410009331.1A 2014-01-09 2014-01-09 Storage method utilizing distributed data encoding Active CN103761195B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410009331.1A CN103761195B (en) 2014-01-09 2014-01-09 Storage method utilizing distributed data encoding

Publications (2)

Publication Number Publication Date
CN103761195A true CN103761195A (en) 2014-04-30
CN103761195B CN103761195B (en) 2017-05-10

Family

ID=50528437

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410009331.1A Active CN103761195B (en) 2014-01-09 2014-01-09 Storage method utilizing distributed data encoding

Country Status (1)

Country Link
CN (1) CN103761195B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104731676A (en) * 2015-03-24 2015-06-24 浪潮集团有限公司 Method for accelerating data recovery of cluster system
CN105791353A (en) * 2014-12-23 2016-07-20 深圳市腾讯计算机系统有限公司 Distributed data storage method and system based on erasure code
CN105930103A (en) * 2016-05-10 2016-09-07 南京大学 Distributed storage CEPH based erasure correction code overwriting method
WO2017041233A1 (en) * 2015-09-08 2017-03-16 广东超算数据安全技术有限公司 Encoding and storage node repairing method for functional-repair regenerating code
CN107615248A (en) * 2015-06-17 2018-01-19 华为技术有限公司 Distributed data storage method, control device and system
CN109976663A (en) * 2017-12-27 2019-07-05 浙江宇视科技有限公司 Distributed storage response method and system
CN110059068A (en) * 2019-04-11 2019-07-26 厦门网宿有限公司 Data verification method and data verification system in a kind of distributed memory system
CN111610938A (en) * 2020-05-29 2020-09-01 宁波富万信息科技有限公司 Distributed data code storage method, electronic device and computer readable storage medium
CN112732164A (en) * 2019-10-28 2021-04-30 北京白山耘科技有限公司 Cross-node data group management method, device and medium
WO2023184921A1 (en) * 2022-03-30 2023-10-05 苏州浪潮智能科技有限公司 Raid encoding and decoding method and apparatus, and device and non-volatile readable storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106227617A (en) * 2016-07-15 2016-12-14 乐视控股(北京)有限公司 Self-repair method and storage system based on correcting and eleting codes algorithm

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1530836A (en) * 2003-03-17 2004-09-22 株式会社瑞萨科技 Nonvolatile memory device and data processing system
US20090172464A1 (en) * 2007-12-30 2009-07-02 Agere Systems Inc. Method and apparatus for repairing uncorrectable drive errors in an integrated network attached storage device
CN101488104A (en) * 2009-02-26 2009-07-22 北京世纪互联宽带数据中心有限公司 System and method for implementing high-efficiency security memory
US7681104B1 (en) * 2004-08-09 2010-03-16 Bakbone Software, Inc. Method for erasure coding data across a plurality of data stores in a network
US7823009B1 (en) * 2001-02-16 2010-10-26 Parallels Holdings, Ltd. Fault tolerant distributed storage for cloud computing
CN102272731A (en) * 2008-11-10 2011-12-07 弗森-艾奥公司 Apparatus, system, and method for predicting failures in solid-state storage
US20120192037A1 (en) * 2007-07-18 2012-07-26 Panasas, Inc. Data storage systems and methods having block group error correction for repairing unrecoverable read errors
US20130173956A1 (en) * 2011-12-30 2013-07-04 Streamscale, Inc. Using parity data for concurrent data authentication, correction, compression, and encryption

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105791353A (en) * 2014-12-23 2016-07-20 深圳市腾讯计算机系统有限公司 Distributed data storage method and system based on erasure code
CN105791353B (en) * 2014-12-23 2020-03-17 深圳市腾讯计算机系统有限公司 Distributed data storage method and system based on erasure codes
CN104731676A (en) * 2015-03-24 2015-06-24 浪潮集团有限公司 Method for accelerating data recovery of cluster system
CN107615248B (en) * 2015-06-17 2019-12-13 华为技术有限公司 Distributed data storage method, control equipment and system
CN107615248A (en) * 2015-06-17 2018-01-19 华为技术有限公司 Distributed data storage method, control device and system
WO2017041233A1 (en) * 2015-09-08 2017-03-16 广东超算数据安全技术有限公司 Encoding and storage node repairing method for functional-repair regenerating code
CN105930103B (en) * 2016-05-10 2019-04-16 南京大学 A kind of correcting and eleting codes covering write method of distributed storage CEPH
CN105930103A (en) * 2016-05-10 2016-09-07 南京大学 Distributed storage CEPH based erasure correction code overwriting method
CN109976663A (en) * 2017-12-27 2019-07-05 浙江宇视科技有限公司 Distributed storage response method and system
CN109976663B (en) * 2017-12-27 2021-12-28 浙江宇视科技有限公司 Distributed storage response method and system
CN110059068A (en) * 2019-04-11 2019-07-26 厦门网宿有限公司 Data verification method and data verification system in a kind of distributed memory system
CN112732164A (en) * 2019-10-28 2021-04-30 北京白山耘科技有限公司 Cross-node data group management method, device and medium
CN111610938A (en) * 2020-05-29 2020-09-01 宁波富万信息科技有限公司 Distributed data code storage method, electronic device and computer readable storage medium
WO2023184921A1 (en) * 2022-03-30 2023-10-05 苏州浪潮智能科技有限公司 Raid encoding and decoding method and apparatus, and device and non-volatile readable storage medium

Also Published As

Publication number Publication date
CN103761195B (en) 2017-05-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant