CN103761195B - Storage method utilizing distributed data encoding - Google Patents
- Publication number
- CN103761195B (application CN201410009331.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- block
- client
- storage method
- server
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The invention provides a storage method utilizing distributed data encoding. After a client process receives data from the cache, it performs an encoding calculation to generate check data blocks. Once encoding is complete, the client sends the data pieces to all servers and maintains a counter that is updated each time a server reports a successful write. According to the system's configuration parameters, once the configured level at which the data can be recovered has been reached, the client no longer rewrites to server nodes whose writes failed. Compared with the prior art, the method implements disaster recovery using matrix operations, tolerates the failure of multiple arbitrary storage nodes or disks, and greatly reduces the CPU usage associated with conventional erasure-code or matrix computation; it is practical and easy to deploy.
Description
Technical field
The present invention relates to the technical field of computer data storage, and specifically to a storage method utilizing distributed data encoding.
Background
Traditional distributed file systems, whether commercial or open source, mostly rely on multiple replicas to keep data safe. Commercial examples include IBM's GPFS, Blue Whale's BWFS, and Panasas's PanFS; open-source examples include Lustre, GlusterFS, HDFS, and Ceph. These systems generally replicate data between nodes to provide redundancy against node failure, and use RAID5 or RAID6 within a node to provide redundancy against disk failure. Replication is typically implemented by writing the data more than once, so write performance degrades linearly as the number of replicas grows. RAID across disks is time-consuming both at initialization and during rebuild and recovery, which is a well-known difficulty in the industry. Despite paying these efficiency costs for safety, such systems can only tolerate the failure of one or two nodes and one disk, which falls far short of the 99.9999% required in industry.
In today's era of big data, data is an enterprise's most important asset, and data security is often a company's lifeline. With the arrival of the mobile Internet, the data generated daily by massive numbers of users grows at an astonishing rate, and for a certain period all of it must be stored with high security. The two main factors affecting data storage are security and performance.
In the prior art, replication in traditional distributed file systems easily leads to split-brain and inconsistency problems. Traditional disk-level RAID5 or RAID6 schemes also greatly increase enterprise costs and consume considerable power during rebuild and recovery, wasting resources. For these reasons, a new storage method with high availability and high performance is provided.
Summary of the invention
The technical task of the present invention is to address the deficiencies of the prior art by providing a storage method utilizing distributed data encoding.
The technical solution of the present invention is realized as follows. In a storage method utilizing distributed data encoding, the storage process is:
Step 1: After the client process receives data from the cache, it performs an encoding calculation to generate check data blocks. The check data blocks guarantee that data remains usable when 3 nodes or disks fail. An N+M scheme is used, where N is the number of data blocks and M is the number of check blocks; N is a natural number greater than or equal to 1, and M is a natural number greater than or equal to 3.
Step 2: After encoding is complete, the client sends the data pieces to all servers. Each time a server returns a successful write, the client updates a counter that it maintains. According to the system's configuration parameters, once the configured level at which the data can be recovered is reached, server nodes whose writes failed are no longer rewritten.
In step 1, the client's real-time encoding uses a 2+2 grouped parallel high-speed encoding scheme, as follows: every 3 data blocks form one group; two groups are each encoded separately to form their own check block data, while all 6 data blocks are also encoded as a whole. This produces 3 check blocks, giving a redundancy of 3.
Step 2 proceeds as follows. When the client receives a read request, it sends a read instruction to all storage server nodes and waits for the data returned by each storage server. After the client has checked the returned data for consistency and updated its counter, it caches the check blocks that return first; the number cached is determined by configuration. If all data blocks subsequently return normally, the data blocks are returned to the calling user and the cached check blocks are discarded.
If, during a normal read, a data block cannot be read or its marker bit shows that it is damaged, the cached check blocks are substituted for it, decoding is performed according to the 2+2 grouping, and the recovered data is then returned to the upper-layer caller.
In the normal read flow, read requests are sent preferentially to the nodes holding normal data blocks.
A repair process runs in the cluster and continuously inspects all data. When it finds data loss or a node failure, it re-reads the data, decodes it to recover the lost data, and writes the result back to the storage servers. The repair process keeps a detailed record of the repaired files and their layout; when a new node is added back into the cluster, the repair process transfers the data onto it.
Compared with the prior art, the present invention has the following beneficial effects:
The storage method utilizing distributed data encoding replaces replication as the means of ensuring reliability and also replaces disk-level RAID. It addresses the split-brain and low-performance problems of replication, as well as the long rebuild and recovery times and high cost of RAID5 or RAID6 schemes. Disaster recovery is implemented with matrix operations, the failure of multiple arbitrary storage nodes or disks is tolerated, and the CPU usage of traditional erasure-code or matrix computation is greatly reduced. The method is practical and easy to deploy.
Description of the drawings
Figure 1 is a flowchart of writing a file in the present invention.
Figure 2 is a schematic diagram of the encoding algorithm of the present invention.
Figure 3 is a flowchart of reading a file in the present invention.
Figure 4 is a flowchart of the repair read of a file in the present invention.
Figure 5 is a flowchart of repairing a file in the present invention.
Detailed description
The storage method utilizing distributed data encoding of the present invention is described in detail below with reference to the accompanying drawings.
As shown in Figures 1 to 5, the present invention provides a storage method utilizing distributed data encoding. The storage process is as follows:
As shown in Figure 1, after the client process receives data from the cache, it performs an encoding calculation to generate check data blocks. Traditional methods use Cauchy or Vandermonde encoding matrices, but the matrix operations often consume a great deal of CPU and performance cannot be improved. In practice, the overwhelming majority of incidents involve the failure of one or two nodes, so a scheme that tolerates the failure of 3 nodes or disks meets real-world needs. This technical solution calls it the N+M scheme, where N is the number of data blocks and M is the number of check blocks; here M=3. By comparison, a replication scheme would need 4 simultaneous copies, and replication performance at that point drops substantially.
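To make the comparison concrete, the raw storage consumed per byte of user data can be written down for both approaches. The value N=6 is taken from the 6-data-block example used elsewhere in the description; the overhead expressions are ordinary arithmetic added here for illustration, not formulas stated in the patent, and r denotes an assumed replica count.

```latex
\text{overhead}_{N+M} = \frac{N+M}{N},
\qquad
\text{overhead}_{\text{replica}} = r

\text{With } N = 6,\ M = 3:\quad \frac{6+3}{6} = 1.5
\quad\text{versus}\quad r = 4 \text{ for four copies.}
```

Both configurations are intended to survive 3 failures, but the encoded layout stores 1.5 bytes per user byte where four-way replication stores 4.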
Open-source systems such as HDFS and TFS also offer fault-tolerance modes based on Cauchy or Vandermonde matrices, but the encoding runs on the storage nodes and cannot guarantee data safety in real time; running it asynchronously makes it less reliable and less secure. Most such schemes also retain replicas, so write performance cannot be guaranteed. The present invention has the client perform the encoding in real time and further optimizes performance, so that in both security and performance it far exceeds the approach of distributed systems such as HDFS and TFS.
The invention also adopts a 2+2 grouped parallel high-speed encoding scheme. As shown in Figure 2, every 3 data blocks form one group; two groups are each encoded separately to form their own check block data, while all 6 data blocks are also encoded as a whole. In total this generates 3 check blocks, so the redundancy also reaches 3, equivalent to a 4-replica scheme. The fastest way to generate a single check block is linear coding, whose performance is significantly better than encoding with a Vandermonde or Cauchy matrix.
In practice, the failure of one or two nodes is by far the most likely case, and recovering the data then often needs only one check block, which reduces the amount of check data transferred and speeds up decoding. If one node is damaged in the first group and 2 nodes are damaged in the second group, the two check blocks of the third group and the check block of the second group are needed. However, this case occurs very rarely.
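As an illustration of why a per-group check block makes the common single-failure case cheap, the following is a minimal sketch that assumes the linear code for each group is a plain byte-wise XOR. The patent does not specify the exact linear code, so the XOR choice, the function names `xor_blocks`, `encode_groups`, and `recover_one`, and the 4-byte block size are all assumptions made for illustration. The check block computed over all 6 data blocks is deliberately left out.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_groups(data_blocks, group_size=3):
    """Split the data blocks into groups and compute one XOR check block per group."""
    groups = [data_blocks[i:i + group_size]
              for i in range(0, len(data_blocks), group_size)]
    return groups, [xor_blocks(group) for group in groups]


def recover_one(group, check_block, lost_index):
    """Rebuild one lost block from the survivors of its group plus the group check block."""
    survivors = [block for i, block in enumerate(group) if i != lost_index]
    return xor_blocks(survivors + [check_block])


# Six illustrative 4-byte data blocks, split into two groups of three.
data = [bytes([value] * 4) for value in range(1, 7)]
groups, checks = encode_groups(data)

# Pretend the last block of the second group was lost and rebuild it.
rebuilt = recover_one(groups[1], checks[1], lost_index=2)
```

Rebuilding the lost block touches only the two surviving blocks of its own group and one check block, rather than the whole stripe.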
After encoding is complete, the client sends the data pieces to all servers. Each time a server returns a successful write, the client updates the counter it maintains. According to the system's configuration parameters, once the configured level at which the data can be recovered is reached, server nodes whose writes failed are no longer rewritten. This also improves write performance; the general idea is to complete the task with as few write operations as possible.
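The counter-based write path can be sketched as follows. This is a simplified, sequential illustration under stated assumptions: `send` is a hypothetical callback standing in for the network write, `required_ok` stands in for the configured recoverability level, and `max_retries` is an invented parameter. The patent describes sending to all servers and does not specify a retry count.

```python
def write_stripe(pieces, send, required_ok, max_retries=2):
    """Send each piece to its server; stop retrying failures once enough writes succeeded.

    pieces:      dict mapping server name -> bytes to store there
    send:        hypothetical callable (server, payload) -> bool, True on success
    required_ok: number of successful writes at which the data counts as recoverable
    """
    ok_count = 0                 # the client-side success counter
    pending = list(pieces)       # servers that still need a (re)write
    for _attempt in range(1 + max_retries):
        failed = []
        for server in pending:
            if send(server, pieces[server]):
                ok_count += 1
            else:
                failed.append(server)
        pending = failed
        # Enough pieces are stored: do not rewrite to the failed servers.
        if not pending or ok_count >= required_ok:
            break
    return {"ok": ok_count, "skipped": pending, "recoverable": ok_count >= required_ok}


# Illustrative run: 9 pieces (6 data + 3 check), two servers always fail.
calls = []

def flaky_send(server, payload):
    calls.append(server)
    return server not in ("s7", "s8")

pieces = {f"s{i}": b"piece" for i in range(9)}
result = write_stripe(pieces, flaky_send, required_ok=7)
```

With `required_ok=7` the two failed servers are never retried, which is the saving the paragraph above describes; raising the threshold to 9 would make the client spend its retries on them.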
Normal file read flow: As shown in Figure 3, when the client receives a read request, it sends a read instruction to all storage server nodes and waits for the data returned by each storage server. After the client has checked the returned data for consistency and updated its counter, it caches the check blocks that return first; the number cached is determined by configuration. If all data blocks subsequently return normally, the data blocks are returned to the calling user and the cached check blocks are discarded.
In the normal read flow, read requests are sent preferentially to the nodes holding normal data blocks, which avoids decoding on the client.
File repair read flow: As shown in Figure 4, if during a normal read a data block cannot be read or its marker bit shows that it is damaged, the cached check blocks are substituted for it, decoding is performed according to the 2+2 grouping method, and the recovered data is then returned to the upper-layer caller.
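The two read paths can be sketched together as follows. This is a minimal illustration, not the patent's implementation: the `(kind, index, payload)` response tuples, the `max_cached_checks` parameter, and the `decode` callback are hypothetical, and the demo decoder assumes a single 3-block group protected by one XOR check block.

```python
def read_stripe(responses, n_data, max_cached_checks, decode):
    """Collect responses in arrival order; decode only if a data block is missing.

    responses: iterable of (kind, index, payload); kind is "data" or "check",
               and payload is None when the block is unreadable or marked damaged.
    """
    data = {}
    cached_checks = {}
    for kind, index, payload in responses:
        if payload is None:
            continue                                  # unreadable or damaged block
        if kind == "data":
            data[index] = payload
        elif len(cached_checks) < max_cached_checks:
            cached_checks[index] = payload            # cache early check blocks
    if len(data) == n_data:
        # Normal read: every data block arrived, cached check blocks are dropped.
        return [data[i] for i in range(n_data)]
    # Repair read: fill the gap from the cached check blocks.
    return decode(data, cached_checks)


def xor_decode(data, checks):
    """Hypothetical decoder for one 3-block group with a single XOR check block."""
    missing = [i for i in range(3) if i not in data][0]
    rebuilt = checks[0]
    for block in data.values():
        rebuilt = bytes(a ^ b for a, b in zip(rebuilt, block))
    data[missing] = rebuilt
    return [data[i] for i in range(3)]


blocks = [b"\x01\x01", b"\x02\x02", b"\x04\x04"]
check = b"\x07\x07"                                   # XOR of the three blocks

normal = read_stripe(
    [("check", 0, check), ("data", 0, blocks[0]),
     ("data", 1, blocks[1]), ("data", 2, blocks[2])],
    n_data=3, max_cached_checks=1, decode=xor_decode)

degraded = read_stripe(
    [("check", 0, check), ("data", 0, blocks[0]),
     ("data", 1, None), ("data", 2, blocks[2])],
    n_data=3, max_cached_checks=1, decode=xor_decode)
```

Caching the early check block means the degraded read needs no second round trip: the block is already on hand when a data block turns out to be missing.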
File repair flow: As shown in Figure 5, to ensure data security, a repair process runs in the cluster and continuously inspects all data. When it finds data loss or a node failure, it re-reads the data, decodes it to recover the lost data, and writes the result back to the storage servers. The repair process keeps a detailed record of the repaired files and their layout; when a new node is added back into the cluster, the repair process transfers the data onto it.
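One pass of such a repair process might look like the sketch below. All names are hypothetical: `is_healthy`, `rebuild`, and `write_back` stand in for the health check, the re-read-and-decode step, and the write to a storage server, none of which the patent specifies in detail.

```python
def repair_pass(stripes, is_healthy, rebuild, write_back):
    """Scan every block once, rebuild unhealthy ones, and log the new layout.

    stripes:    dict mapping stripe id -> {block index: node name}
    is_healthy: hypothetical callable (node, stripe_id, block_index) -> bool
    rebuild:    hypothetical callable (stripe_id, block_index) -> bytes,
                re-reading the surviving blocks and decoding the lost one
    write_back: hypothetical callable (stripe_id, block_index, block) -> node name
    """
    log = []
    for stripe_id, placement in stripes.items():
        for block_index, node in placement.items():
            if is_healthy(node, stripe_id, block_index):
                continue
            block = rebuild(stripe_id, block_index)
            target = write_back(stripe_id, block_index, block)
            placement[block_index] = target           # record the new layout
            log.append((stripe_id, block_index, target))
    return log


# Illustrative run: node "n2" has failed and its block is rewritten to "n4".
stripes = {"file1": {0: "n1", 1: "n2", 2: "n3"}}
written = []

def fake_write_back(stripe_id, block_index, block):
    written.append((stripe_id, block_index, block))
    return "n4"

log = repair_pass(
    stripes,
    is_healthy=lambda node, stripe_id, block_index: node != "n2",
    rebuild=lambda stripe_id, block_index: b"rebuilt",
    write_back=fake_write_back)
```

The returned log plays the role of the detailed record of repaired files and layout; a real system would run such passes continuously and persist the log.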
The present invention greatly improves space utilization, alleviates the low performance caused by replication, and reduces the cost of disk-level RAID.
The foregoing describes only embodiments of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (3)
1. A storage method utilizing distributed data encoding, characterized in that the storage process is:
Step 1: after the client process receives data from the cache, it performs an encoding calculation to generate check data blocks; the check data blocks guarantee that data remains usable when 3 nodes or disks fail; the numbers of data blocks and check blocks follow an N+M scheme, where N is the number of data blocks and M is the number of check blocks, N being a natural number greater than or equal to 1 and M being a natural number greater than or equal to 3;
Step 2: after encoding is complete, the client sends the data pieces to all servers; each time a server returns a successful write, the client updates a counter that it maintains; according to the system's configuration parameters, once the configured level at which the data can be recovered is reached, server nodes whose writes failed are no longer rewritten;
in step 1, the client's real-time encoding uses a 2+2 grouped parallel high-speed encoding scheme, specifically: every 3 data blocks form one group; two groups are each encoded separately to form their own check block data, while all 6 data blocks are also encoded as a whole; this produces 3 check blocks, giving a redundancy of 3;
step 2 proceeds as follows: when the client receives a read request, it sends a read instruction to all storage server nodes and waits for the data returned by each storage server; after the client has checked the returned data for consistency and updated its counter, it caches the check blocks that return first, the number cached being determined by configuration; if all data blocks subsequently return normally, the data blocks are returned to the calling user and the cached check blocks are discarded;
if, during a normal read, a data block cannot be read or its marker bit shows that it is damaged, the cached check blocks are substituted for it, decoding is performed according to the 2+2 grouping, and the recovered data is then returned to the upper-layer caller.
2. The storage method utilizing distributed data encoding according to claim 1, characterized in that: in the normal read flow, read requests are sent preferentially to the nodes holding normal data blocks.
3. The storage method utilizing distributed data encoding according to claim 2, characterized in that: the method is applied in a cluster in which a repair process runs; the repair process continuously inspects all data; when it finds data loss or a node failure, it re-reads the data, decodes it to recover the lost data, and writes the result back to the storage servers; the repair process keeps a detailed record of the repaired files and their layout, and when a new node is added back into the cluster, the repair process transfers the data onto it.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410009331.1A CN103761195B (en) | 2014-01-09 | 2014-01-09 | Storage method utilizing distributed data encoding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410009331.1A CN103761195B (en) | 2014-01-09 | 2014-01-09 | Storage method utilizing distributed data encoding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103761195A CN103761195A (en) | 2014-04-30 |
CN103761195B true CN103761195B (en) | 2017-05-10 |
Family
ID=50528437
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410009331.1A Active CN103761195B (en) | 2014-01-09 | 2014-01-09 | Storage method utilizing distributed data encoding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103761195B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106227617A (en) * | 2016-07-15 | 2016-12-14 | 乐视控股(北京)有限公司 | Self-repair method and storage system based on erasure code algorithm |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105791353B (en) * | 2014-12-23 | 2020-03-17 | 深圳市腾讯计算机系统有限公司 | Distributed data storage method and system based on erasure codes |
CN104731676A (en) * | 2015-03-24 | 2015-06-24 | 浪潮集团有限公司 | Method for accelerating data recovery of cluster system |
CN107615248B (en) * | 2015-06-17 | 2019-12-13 | 华为技术有限公司 | Distributed data storage method, control equipment and system |
WO2017041233A1 (en) * | 2015-09-08 | 2017-03-16 | 广东超算数据安全技术有限公司 | Encoding and storage node repairing method for functional-repair regenerating code |
CN105930103B (en) * | 2016-05-10 | 2019-04-16 | 南京大学 | A kind of correcting and eleting codes covering write method of distributed storage CEPH |
CN109976663B (en) * | 2017-12-27 | 2021-12-28 | 浙江宇视科技有限公司 | Distributed storage response method and system |
CN110059068B (en) * | 2019-04-11 | 2021-04-02 | 厦门网宿有限公司 | Data verification method and data verification system in distributed storage system |
CN112732164A (en) * | 2019-10-28 | 2021-04-30 | 北京白山耘科技有限公司 | Cross-node data group management method, device and medium |
CN111610938B (en) * | 2020-05-29 | 2022-07-05 | 广东奥飞数据科技股份有限公司 | Distributed data code storage method, electronic device and computer readable storage medium |
CN114415983B (en) * | 2022-03-30 | 2022-06-07 | 苏州浪潮智能科技有限公司 | RAID encoding and decoding method, device, equipment and readable storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1530836A (en) * | 2003-03-17 | 2004-09-22 | 株式会社瑞萨科技 | Nonvolatile memory device and data processing system |
CN101488104A (en) * | 2009-02-26 | 2009-07-22 | 北京世纪互联宽带数据中心有限公司 | System and method for implementing high-efficiency security memory |
US7681104B1 (en) * | 2004-08-09 | 2010-03-16 | Bakbone Software, Inc. | Method for erasure coding data across a plurality of data stores in a network |
US7823009B1 (en) * | 2001-02-16 | 2010-10-26 | Parallels Holdings, Ltd. | Fault tolerant distributed storage for cloud computing |
CN102272731A (en) * | 2008-11-10 | 2011-12-07 | 弗森-艾奥公司 | Apparatus, system, and method for predicting failures in solid-state storage |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090055682A1 (en) * | 2007-07-18 | 2009-02-26 | Panasas Inc. | Data storage systems and methods having block group error correction for repairing unrecoverable read errors |
US8001417B2 (en) * | 2007-12-30 | 2011-08-16 | Agere Systems Inc. | Method and apparatus for repairing uncorrectable drive errors in an integrated network attached storage device |
US8914706B2 (en) * | 2011-12-30 | 2014-12-16 | Streamscale, Inc. | Using parity data for concurrent data authentication, correction, compression, and encryption |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |