CN100388237C

CN100388237C - Data reconstitution method based on lightweight computing

Info

Publication number: CN100388237C
Application number: CNB2004100838706A
Authority: CN
Inventors: 尹杰; 雷迎春; 张松
Original assignee: BEIJING VEGA GRID TECHNOLOGY Co Ltd
Current assignee: BEIJING VEGA GRID TECHNOLOGY Co Ltd
Priority date: 2004-10-20
Filing date: 2004-10-20
Publication date: 2008-05-14
Anticipated expiration: 2024-10-20
Also published as: CN1763726A

Abstract

The present invention relates to a data reconstitution method based on lightweight computing, in the technical field of storage management for streaming media applications. In a streaming media server, when the storage changes (disks are added or removed), the distribution of media data across the storage becomes unbalanced. The invention redistributes the media data over the new storage at very small computation and data-movement cost, so that the storage capacity and bandwidth of the system are fully utilized. After a storage change, the method proceeds as follows: the serial number of each data block is circularly shifted right within the significant bits of its media file; the number of bit positions shifted is determined by the storage change operations the media file has experienced; the shifted value is used as the function value and the current number of disks as the divisor; and the division-remainder hash method is used to select the data blocks to be stored on a new disk (storage increase) or the new storage locations of the data blocks on a deleted disk (storage reduction).

Description

Data reconstitution method based on lightweight computing

Technical field

The invention belongs to the technical field of storage management in streaming media applications, and in particular relates to a data reconstitution method based on lightweight computing.

Background technology

Streaming media applications are built on IP networks and deliver content over them. A system is divided into a server side and a client side. The server side consists of a cluster of servers with a certain computing and storage capacity, and each server responsible for storage holds a group of disks. A client may be an ordinary PC or a combination of a TV and a set-top box. Media files flow from the server side to the client over the IP network and are played by player software on the PC or set-top box. Compared with traditional text and picture information, the video and audio files in streaming media applications require very large storage space and high playback rates, so a system uses multiple disks to store media information. In addition, to increase the number of concurrent accesses to the same media file, interleaving is usually adopted: a media file is divided sequentially into data blocks, and these blocks are stored dispersed across multiple disks. See document [1]: Halvorsen, Carsten Griwodz, Ketil Lund, Vera Goebel, Thomas Plagemann, Jonathan Walpole. Storage Systems Support for Multimedia Applications. IEEE Distributed Systems Online, 1541-4922, Vol. 5, No. 2, February 2004.

There are usually two ways to order these data blocks across multiple disks: sequential storage and random storage. In sequential storage, the data blocks of a media file are stored on the disks in round-robin fashion, and the system can read them from the disks in order. In random storage, the data blocks of a media file are placed on the disks randomly, and blocks are read through a directory system. With either layout, when the storage changes, that is, when the system adds new disks or deletes some existing disks, the distribution of the existing media data blocks must be adjusted in order to make full use of storage space and storage bandwidth.

There are three existing solutions. The first is complete reorganization for sequential storage, document [2]: S. Ghandeharizadeh and D. Kim. On-line reorganization of data in scalable continuous media servers. Proc. 7th International Conference on Database and Expert Systems Applications, September 1996. Its drawback is that too many data blocks must be moved after a storage change operation, so the adjustment cost is very high.

The second is reorganization based on random storage, document [3]: A. Goel, C. Shahabi, S.-Y. Yao, and R. Zimmerman. SCADDAR: An efficient randomized technique to reorganize continuous media blocks. Proc. International Conference on Data Engineering, 2002. The authors later improved this method in document [3]: Shu-Yuen Didi Yao, Cyrus Shahabi, Per-Ake Larson. Hash-Based Labeling Techniques for Storage Scaling. To appear in The VLDB Journal. This random-storage approach still has problems. Its results depend directly on the choice of pseudo-random number generator and are therefore uncertain, so the load balance across disks is not very good. The computational complexity of adjusting a data block also depends on the number of storage changes; although the authors addressed this in document [3], the improved method still relies on probing, and its computational complexity still depends on the number of probes.

The third is a row-permutated data reorganization algorithm proposed for growing server-less VOD systems, document [4]: T. K. Ho and Jack Y. B. Lee. A Row-Permutated Data Reorganization Algorithm for Growing Server-less Video-on-Demand Systems. Proc. International Symposium on Cluster Computing and the Grid (CCGrid) 2003, Tokyo, Japan, May 12-15, 2003. Because this method is designed for server-less streaming media systems, it cannot be applied to general streaming media applications.

In summary, the existing solutions to the problem of data distribution after a storage change all have shortcomings, mainly in three respects: a large amount of data moved, high computational complexity of adjustment, and poor load balance.

Summary of the invention

The object of the present invention is to provide a data reconstitution method based on lightweight computing that, through simple calculation and the movement of very little data, ensures good load balance across the storage, so that the storage and bandwidth resources of the system are fully utilized.

In a streaming media server, when the storage changes (storage is increased or reduced), the distribution of media data across the storage becomes unbalanced. The present invention redistributes the media data over the new storage at very small computation and movement cost, so that the storage capacity and bandwidth resources of the system are fully utilized. After a storage change, the data reconstitution method is as follows: the serial number of each data block is circularly shifted right within the significant bits of its media file; the number of bit positions shifted is determined by the storage change operations the media file has experienced; the shifted value is used as the function value and the current number of disks as the divisor; and the division-remainder hash method is used to select the data blocks to be stored on a new disk (storage increase) or the new storage locations of the data blocks on a deleted disk (storage reduction).

In the data reconstitution method based on lightweight computing of the present invention, when the storage changes, a value is obtained for each data block on a disk by a lightweight calculation and used as the function value; with the current number of disks as the divisor, the division-remainder hash method is used to select the data blocks to be stored on a new disk or the new storage locations of the data blocks on a deleted disk.

The present invention is a data reconstitution method based on lightweight computing, implemented as follows:

Each media file is represented by a triple (F, B, RECORD), where F is the identification number of the media file, B is the significant bits of the media file, and RECORD is the storage change record of the media file, which records the storage change operations experienced during the life cycle of the media file. Each media file is divided into multiple data blocks of a fixed size, and each data block is represented by a pair (F, SID), where F is the identification number of the media file to which the data block belongs and SID is the serial number of the data block within the media file;
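The two tuples above can be sketched as small Python records. This is only an illustration under stated assumptions: the class and field names are hypothetical, B is read here as the number of significant bits of the block serial numbers, and RECORD is encoded as a list of "add" / "remove" strings, an encoding the description does not specify.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MediaFile:
    """The (F, B, RECORD) triple; field names are illustrative only."""
    file_id: int   # F: identification number of the media file
    sig_bits: int  # B: assumed to be the bit width of block serial numbers
    # RECORD: assumed encoding, one "add" or "remove" entry per storage change
    record: List[str] = field(default_factory=list)

    def increase_count(self) -> int:
        """N: the number of storage increases recorded for this file."""
        return sum(1 for op in self.record if op == "add")


@dataclass(frozen=True)
class Block:
    """The (F, SID) pair identifying one data block."""
    file_id: int  # F: owning media file
    sid: int      # SID: serial number of the block within the file


movie = MediaFile(file_id=7, sig_bits=4)
movie.record.extend(["add", "remove", "add"])
first_block = Block(file_id=movie.file_id, sid=0)
```

Keeping only a per-file record, rather than a per-block location table, is consistent with the stated goal of not needing a directory system.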

When the storage changes, data reconstitution is carried out disk by disk, and falls into two cases: storage increase and storage reduction;

When new disks are added to the disk group, the disk numbers of the new disks continue upward from the existing disk numbers, and the RECORD of every file is updated. For a data block on a disk, the number N of storage increases that the media file has experienced is first obtained from the RECORD of the media file to which the block belongs. The SID of the data block is then circularly shifted right by N bits within the significant bits B of the media file, and the resulting value is denoted PID. Whether the data block is to be transferred to a new disk is then decided by D = (PID + F) mod M, where M is the number of disks after the storage change. If D is not the number of a newly added disk, the data block is not moved; only if D is the number of a newly added disk is the data block transferred to the disk numbered D;
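The storage-increase rule can be sketched in a few lines of Python. The function and parameter names are hypothetical. The sketch assumes that disks are numbered from 0 (so that D = (PID + F) mod M is itself a disk number), that B is the bit width of SID, and that a shift count of B or more wraps around modulo B; the description does not state any of these explicitly.

```python
from typing import Optional


def rotate_right(value: int, shift: int, bits: int) -> int:
    """Circular right shift of `value` within its low `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    shift %= bits  # assumption: shift counts wrap around modulo B
    return ((value >> shift) | (value << (bits - shift))) & mask


def target_on_increase(sid: int, file_id: int, sig_bits: int,
                       n_increases: int, new_disk_count: int,
                       first_new_disk: int) -> Optional[int]:
    """Return the new disk number for a block, or None if it stays put.

    new_disk_count is M (disks after the change); first_new_disk is the
    lowest number assigned to a newly added disk.
    """
    pid = rotate_right(sid, n_increases, sig_bits)   # PID
    d = (pid + file_id) % new_disk_count             # D = (PID + F) mod M
    return d if d >= first_new_disk else None


# Growing from 4 to 5 disks (new disk is number 4), file F = 0, B = 4, N = 1.
stays = target_on_increase(sid=1, file_id=0, sig_bits=4,
                           n_increases=1, new_disk_count=5, first_new_disk=4)
moves = target_on_increase(sid=8, file_id=0, sig_bits=4,
                           n_increases=1, new_disk_count=5, first_new_disk=4)
```

Here SID 1 (binary 0001) rotates to 1000 = 8, and 8 mod 5 = 3 is an existing disk, so the block stays; SID 8 (binary 1000) rotates to 0100 = 4, and 4 mod 5 = 4 is the new disk, so the block moves.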

When disks are deleted from the existing disk group, they are deleted in descending order of disk number, and the RECORD of every file is updated. For a data block on a deleted disk, the number N of storage increases that the media file has experienced is first obtained from the RECORD of the media file to which the block belongs. The SID of the data block is then circularly shifted right by N bits within the significant bits B of the media file, and the resulting value is denoted PID. The number of the disk to which the data block is transferred is calculated by D = (PID + F) mod M, where M is the number of disks in the disk group after the deletion and D is the number of the target disk to which the data block is transferred.
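The storage-reduction rule differs from the increase rule only in that every block on a deleted disk receives a target. A minimal sketch follows, with hypothetical names and the same assumptions as before: 0-based disk numbers, B as the bit width of SID, and shift counts taken modulo B.

```python
def rotate_right(value: int, shift: int, bits: int) -> int:
    """Circular right shift of `value` within its low `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    shift %= bits  # assumption: shift counts wrap around modulo B
    return ((value >> shift) | (value << (bits - shift))) & mask


def target_on_removal(sid: int, file_id: int, sig_bits: int,
                      n_increases: int, remaining_disks: int) -> int:
    """Disk number D that receives a block evicted from a deleted disk."""
    pid = rotate_right(sid, n_increases, sig_bits)  # PID
    return (pid + file_id) % remaining_disks        # D = (PID + F) mod M


# Shrinking to 4 disks: block SID = 6 of file F = 3, with B = 4 and N = 2.
target = target_on_removal(sid=6, file_id=3, sig_bits=4,
                           n_increases=2, remaining_disks=4)
```

In this example SID 6 (binary 0110) rotates right by two positions to 1001 = 9, and (9 + 3) mod 4 = 0, so the block is transferred to disk 0.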

Description of drawings

Fig. 1 is a flowchart of the data reconstitution method based on lightweight computing of the present invention.

Embodiment

In summary, the concrete steps of the data reconstitution method based on lightweight computing of the present invention are as follows (see Fig. 1).

Step S1.1: read the storage change type S (storage increase or storage reduction) and update the storage change record of every media file;

Step S1.2: if S is an increase, the data blocks on the original disk group must be adjusted: the data to be moved is identified and moved to the new disks; go to S1.3. If S is a reduction, all data on the deleted disks must be transferred to the disks that are not deleted; go to S1.12;

Step S1.3: from the existing disks, select a disk that has not yet been adjusted as the adjustment object;

Step S1.4: on this disk, find a data block that has not yet been adjusted as the adjustment object;

Step S1.5: read the number N of storage increases experienced by the media file to which the data block belongs, read the file identification number F and the significant bits B, and read the serial number I of the data block and the number of disks M after the storage change;

Step S1.6: circularly shift I right by N bits within the B bits, and denote the resulting value PID;

Step S1.7: calculate (PID + F) mod M. If the result is the number of a newly added disk, the data block must be moved; go to S1.8. Otherwise the data block does not need to be moved; go to S1.9;

Step S1.8: record (PID + F) mod M as the number of the target disk to which this data block is transferred;

Step S1.9: if the disk still has data blocks that have not been adjusted, go to S1.4; otherwise go to S1.10;

Step S1.10: move all data blocks that need to be moved to their new storage locations;

Step S1.11: if there are other disks that have not been adjusted, go to S1.3; otherwise go to S1.19;

Step S1.12: from the deleted disks, select a disk that has not yet been adjusted as the adjustment object;

Step S1.13: on this disk, find a data block that has not yet been adjusted as the adjustment object;

Step S1.14: read the number N of storage increases experienced by the media file to which the data block belongs, read the file identification number F and the significant bits B, and read the serial number I of the data block and the number of disks M after the storage change;

Step S1.15: circularly shift I right by N bits within the B bits, denote the resulting value PID, and record the disk numbered (PID + F) mod M as the new location to which this data block is transferred;

Step S1.16: if the disk still has data blocks that have not been adjusted, go to S1.13; otherwise go to S1.17;

Step S1.17: move all data on the disk to its new storage locations;

Step S1.18: if there are other deleted disks that have not been adjusted, go to S1.12; otherwise go to S1.19;

Step S1.19: the data distribution adjustment after this storage change is complete.
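Steps S1.1 to S1.19 can be sketched end to end over an in-memory model. Everything structural here is an assumption made for illustration: the name `reorganize`, disks modelled as a dict from 0-based disk number to a list of (F, SID) pairs, per-file metadata held as a dict with keys "B" and "N", and shift counts taken modulo B. Step numbers in the comments map the code back to the text.

```python
from typing import Dict, List, Tuple

Block = Tuple[int, int]  # (F, SID)


def rotate_right(value: int, shift: int, bits: int) -> int:
    """Circular right shift of `value` within its low `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    shift %= bits  # assumption: shift counts wrap around modulo B
    return ((value >> shift) | (value << (bits - shift))) & mask


def reorganize(disks: Dict[int, List[Block]], files: Dict[int, dict],
               change: str, count: int) -> None:
    """Apply one storage change in place; change is "add" or "remove"."""
    old_m = len(disks)
    if change == "add":
        new_m = old_m + count
        for meta in files.values():           # S1.1: update every record
            meta["N"] += 1
        for d in range(old_m, new_m):         # new numbers continue upward
            disks[d] = []
        for disk_id in range(old_m):          # S1.3: each existing disk
            stay, moves = [], []
            for fid, sid in disks[disk_id]:   # S1.4: each block on it
                meta = files[fid]             # S1.5: read N, F, B, I, M
                pid = rotate_right(sid, meta["N"], meta["B"])  # S1.6
                target = (pid + fid) % new_m  # S1.7
                if target >= old_m:           # target is a new disk
                    moves.append((target, (fid, sid)))         # S1.8
                else:
                    stay.append((fid, sid))
            disks[disk_id] = stay
            for target, block in moves:       # S1.10: perform the moves
                disks[target].append(block)
    elif change == "remove":
        new_m = old_m - count
        # S1.1: N counts increases only, so it is unchanged by a removal.
        for disk_id in range(old_m - 1, new_m - 1, -1):  # S1.12: high to low
            for fid, sid in disks.pop(disk_id):          # S1.13
                meta = files[fid]                        # S1.14
                pid = rotate_right(sid, meta["N"], meta["B"])  # S1.15
                disks[(pid + fid) % new_m].append((fid, sid))  # S1.17
    else:
        raise ValueError("change must be 'add' or 'remove'")


# One file (F = 1, B = 4, 16 blocks) initially placed on 4 disks.
files = {1: {"B": 4, "N": 0}}
disks = {d: [(1, s) for s in range(16) if (s + 1) % 4 == d] for d in range(4)}
reorganize(disks, files, "add", 1)     # grow to 5 disks
reorganize(disks, files, "remove", 1)  # shrink back to 4 disks
```

The sketch batches the moves per source disk, as in S1.10, rather than moving each block as soon as its target is computed.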

The implementation of the present invention depends on the following three preconditions:

First, in the streaming media application system, the storage medium used in a storage server is a disk group made up of identical disks, so that every disk has the same storage space and bandwidth;

Second, all media files in the system are divided into data blocks of the same size;

Third, in the initial state, the storage location in the disk group of each data block of a media file is determined by D₀ = (SID + F) mod M₀, where M₀ is the initial number of disks in the disk group, D₀ is the number of the disk on which the data block is initially stored, and F is the identification number of the file.
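The third precondition amounts to a round-robin layout whose starting disk is offset by the file identification number. A small sketch, with a hypothetical function name and assuming 0-based disk and block numbers:

```python
def initial_disk(sid: int, file_id: int, initial_disk_count: int) -> int:
    """D0 = (SID + F) mod M0: the disk that initially stores a block."""
    return (sid + file_id) % initial_disk_count


# First eight blocks of file F = 2 on an initial group of M0 = 4 disks.
layout = [initial_disk(sid, file_id=2, initial_disk_count=4)
          for sid in range(8)]
# layout == [2, 3, 0, 1, 2, 3, 0, 1]
```

Consecutive blocks land on consecutive disks, and different files start on different disks because of the F offset.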

Comparing the present invention with existing data reorganization methods, it clearly has the following advantages:

1. After a storage change, the computational complexity of adjusting one data block is 1, and is not affected by the number of storage changes experienced by the file to which the data block belongs.

2. After a storage change, the number of data blocks moved to adjust the data block distribution is close to the theoretical minimum, so the adjustment cost is very small.

3. After the data blocks have been adjusted in the new storage, all disks in the storage have good load balance.

4. The data reconstitution method places very low memory demands on the system and needs no directory system support.
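The second advantage can be checked numerically for a simple case. The sketch below is not part of the described method; it is an illustrative experiment under stated assumptions: one file whose serial numbers cover the full range 0 to 2^B - 1, a single disk added as disk number M - 1, 0-based disk numbers, and hypothetical function names. It counts the fraction of blocks that the rule D = (PID + F) mod M sends to the new disk, to be compared with 1/M, the share a perfectly balanced new disk would hold.

```python
def rotate_right(value: int, shift: int, bits: int) -> int:
    """Circular right shift of `value` within its low `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    shift %= bits  # assumption: shift counts wrap around modulo B
    return ((value >> shift) | (value << (bits - shift))) & mask


def moved_fraction(file_id: int, sig_bits: int, n_increases: int,
                   new_disk_count: int) -> float:
    """Fraction of a file's 2**B blocks that move when one disk is added."""
    total = 1 << sig_bits
    new_disk = new_disk_count - 1  # the single new disk gets the top number
    moved = sum(
        1 for sid in range(total)
        if (rotate_right(sid, n_increases, sig_bits) + file_id)
        % new_disk_count == new_disk
    )
    return moved / total


# File F = 3 with B = 10 (1024 blocks), first increase, growing to 5 disks.
fraction = moved_fraction(file_id=3, sig_bits=10, n_increases=1,
                          new_disk_count=5)
ideal = 1 / 5
```

Because the circular shift is a permutation of the serial numbers 0 to 2^B - 1, the moved fraction in this setting differs from 1/M by at most 1/2^B. The experiment says nothing about how evenly the moved blocks are drawn from the old disks.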

Claims

1. A data reconstitution method based on lightweight computing, characterized in that, when the storage changes, a value is obtained for each data block on a disk by a lightweight calculation and used as the function value, and, with the current number of disks as the divisor, the division-remainder hash method is used to select the data blocks to be stored on a new disk or the new storage locations of the data blocks on a deleted disk;

Wherein the concrete steps of obtaining a value by a lightweight calculation and using it as the function value are as follows:

Step S1.3: from the existing disks, select a disk that has not yet been adjusted as the adjustment object;

Step S1.4: on this disk, find a data block that has not yet been adjusted as the adjustment object;

Step S1.5: read the number N of storage increases experienced by the media file to which the data block belongs, read the identification number F of the media file and the significant bits B of the media file, and read the serial number I of the data block and the number of disks M after the storage change;

Step S1.10: move all data blocks that need to be moved to their new storage locations;

Step S1.12: from the deleted disks, select a disk that has not yet been adjusted as the adjustment object;

Step S1.13: on this disk, find a data block that has not yet been adjusted as the adjustment object;

Step S1.17: move all data on the disk to its new storage locations;

Step S1.19: the data distribution adjustment after this storage change is complete;

The implementation of this method depends on the following three preconditions:

Third, in the initial state, the storage location in the disk group of each data block of a media file is determined by D₀ = (SID + F) mod N₀, where N₀ is the initial number of disks in the disk group, D₀ is the number of the disk on which the data block is initially stored, and F is the identification number of the file.

2. The data reconstitution method based on lightweight computing according to claim 1, characterized in that each media file is represented by a triple (F, B, RECORD), where F is the identification number of the media file, B is the significant bits of the media file, and RECORD is the storage change record of the media file, which records the storage change operations experienced during the life cycle of the media file; each media file is divided into multiple data blocks of a fixed size, and each data block is represented by a pair (F, SID), where F is the identification number of the media file to which the data block belongs and SID is the serial number of the data block within the media file;

When new disks are added to the disk group, the disk numbers of the new disks continue upward from the existing disk numbers, and the RECORD of every file is updated; for a data block on a disk, the number N of storage increases that the media file has experienced is first obtained from the RECORD of the media file to which the block belongs; the SID of the data block is then circularly shifted right by N bits within the significant bits B of the media file, and the resulting value is denoted PID; whether the data block is to be transferred to a new disk is then decided by D = (PID + F) mod M, where M is the number of disks after the storage change; if D is not the number of a newly added disk, the data block is not moved; only if D is the number of a newly added disk is the data block transferred to the disk numbered D;

When disks are deleted from the existing disk group, they are deleted in descending order of disk number, and the RECORD of every file is updated; for a data block on a deleted disk, the number N of storage increases that the media file has experienced is first obtained from the RECORD of the media file to which the block belongs; the SID of the data block is then circularly shifted right by N bits within the significant bits B of the media file, and the resulting value is denoted PID; the number of the disk to which the data block is transferred is calculated by D = (PID + F) mod M, where M is the number of disks in the disk group after the deletion and D is the number of the target disk to which the data block is transferred.