CN102662639A - Mapreduce-based multi-GPU (Graphic Processing Unit) cooperative computing method - Google Patents

Info

Publication number
CN102662639A
Authority
CN
China
Prior art keywords: gpu, task, mapreduce, cpu, computing
Legal status
Pending
Application number
CN2012101028344A
Other languages
Chinese (zh)
Inventor
吕相文
袁家斌
曾青华
Current Assignee
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Application filed by Nanjing University of Aeronautics and Astronautics filed Critical Nanjing University of Aeronautics and Astronautics
Priority to CN2012101028344A priority Critical patent/CN102662639A/en
Publication of CN102662639A publication Critical patent/CN102662639A/en

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a MapReduce-based multi-GPU (Graphics Processing Unit) cooperative computing method, belonging to the field of computer software applications. In contrast to the single-layer parallel architectures of conventional high-performance GPU computing and of MapReduce parallel computing, the programming model adopts a two-layer GPU-plus-MapReduce parallel architecture. By combining the MapReduce programming model used for concurrent computation in cloud computing with the structural characteristics of a heterogeneous GPU-plus-CPU (Central Processing Unit) system, it helps developers simplify the program model and reuse existing concurrent code, thereby reducing programming complexity, providing a degree of fault tolerance, and reducing dependence on specific hardware. With the proposed method, the dual GPU-plus-MapReduce concurrency model can be used on a cloud computing platform or an ordinary distributed computing system to process MapReduce tasks concurrently on multiple GPU cards.

Description

A multi-GPU cooperative computing method based on MapReduce
Technical field
The present invention relates to a multi-GPU cooperative computing method based on MapReduce, belonging to the field of computer software applications.
Background technology
In recent years, driven by advances in hardware technology, the computing power and programmability of the graphics processing unit (GPU) have developed rapidly. With its highly parallel architecture, the GPU is no longer confined to everyday graphics tasks and has entered the broader field of general-purpose computing on GPUs (GPGPU). Because a GPU offers a high-performance multiprocessor array together with a high-bandwidth, latency-hiding video memory system, it has an advantage over a traditional CPU in applications that repeat operations over large data sets and are memory-access intensive. Meanwhile, everyday data processing and program execution must still be done by the CPU; from the user's point of view, if large programs or big computations occupy too much CPU time, the computer feels very slow and system performance drops. For ordinary users the GPU is used mainly for games or graphics and is otherwise idle, so using the GPU in moderation brings clear benefits: it reduces CPU occupancy, and it puts to work a GPU that is idle far more often than the CPU.
Another focus of attention in high-performance parallel computing is the MapReduce framework for massive data processing. Through inexpensive clusters of commodity computers, it provides the large-scale data processing capability that formerly required expensive large servers, and it is better than the latter in stability and scalability. The MapReduce model is now applied to astronomical data processing, analysis of massive case records, virus-signature storage, web search services, and similar areas, addressing the contradiction between the explosive growth of data and the insufficient storage and computing capacity of individual computers.
To date, research in both directions remains limited in some respects, for example using a single computer's GPU to accelerate algorithms and programs, or performing distributed computation on a cluster built from the GPUs of many computers. Undeniably, considerable progress has been made in both areas, but shortcomings remain. Facing ever larger games and programs, single-machine GPU acceleration cannot bring much change, and the contradiction between the growth of massive data and the computing power of a single computer cannot be resolved. An ordinary distributed GPU cluster is quite good in raw computing power, but once a node fails or another problem occurs, the performance of the whole cluster is greatly affected. Moreover, the MapReduce model requires frequent CPU computation during Map and Reduce operations, sometimes driving CPU usage to one hundred percent, so it is also necessary to involve the GPU to balance the computing load of the system.
Summary of the invention
Because the MapReduce model requires frequent CPU computation during Map and Reduce operations, CPU usage can reach one hundred percent under a large number of parallel tasks. The GPU has greater data width and parallel computing capability than the CPU, so using the GPU in moderation both reduces CPU occupancy and lets the GPU's participation balance the computing load of the system.
The object of the invention is to combine the complementary advantages of GPU technology and MapReduce technology: on the basis of MapReduce parallel programming, use the participation of the GPU to balance the computing load of the system, and finally provide a method that supports large-scale distributed parallel computation through a programming model integrating GPU computing with MapReduce.
The multi-GPU cooperative computing method based on MapReduce of the present invention comprises the following steps:
1) First, the client sends a task request to the management stage.
2) Then, the name node NameNode in the management stage manages the namespace of the file system, the cluster configuration information of the computing stage, and information such as the locations of storage blocks; the job tracker JobTracker starts and schedules computing tasks and tracks the execution status of tasks and the state of the computing stage.
3) In the computing stage:
1. After the data node DataNode receives a read/write request from the name node NameNode, it calls the CPU to read and scan the massive data and partition it horizontally into M fixed-size data subsets (splits), where M is a natural number whose size is usually determined by the number of computing nodes in the system and the scale of the data;
2. The TaskTracker of an idle CPU requests a task from the JobTracker and, after receiving a response, formats the M data subsets and further parses them into a batch of key/value pairs <key1, value1>;
3. The TaskTracker of an idle GPU requests a task from the JobTracker and, after receiving a response, creates a Map task for each input split; it takes each record <key1, value1> of the split as input, scans it, formats it for the GPU-specific algorithm, and uses the GPU's CUDA library to implement a local combiner (Combiner), producing and outputting intermediate key/value pairs <key2, value2>;
4. The intermediate key/value pairs produced by the Map function are divided into R different partitions by the partition function hash(key) mod R, where R is a natural number less than M. The GPU then sorts the intermediate results by key2 and aggregates the value2 data with identical key2 values into a new list, forming pairs <key2, list(value2)>, where list(value2) is the array of value2 values sharing the same key2. Each of the R partitions is assigned to a designated Reduce task;
5. A workstation that has been assigned a Reduce task calls an idle CPU, whose TaskTracker reads the data <key2, list(value2)> submitted by the Map function. After traversing the sorted intermediate data, the CPU's TaskTracker passes each partition to the TaskTracker of an idle GPU, which formats it and performs the processing operation using GPU concurrency, yielding the multiple outputs of the Reduce tasks; a merge operation is then started to obtain the final output value;
6. The GPU's TaskTracker returns the final result to the CPU-calling part. At this point, one MapReduce process is complete.
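Abstracting away the cluster and the GPU, the data flow of steps 1–6 can be sketched in a single process. This is a minimal illustration only: word counting stands in for the patent's GPU kernels, and plain Python functions stand in for the CUDA combiner, sort, and reduce; all function names here are assumptions for exposition, not part of the patent.

```python
from collections import defaultdict

def split_input(text, m):
    """Step 1: horizontally partition the input into M fixed-size splits."""
    words = text.split()
    size = max(1, len(words) // m)
    return [words[i:i + size] for i in range(0, len(words), size)]

def map_with_combiner(split):
    """Step 3: one Map task per split; the local combiner pre-aggregates
    the intermediate <key2, value2> pairs before they leave the Map task."""
    counts = defaultdict(int)
    for word in split:          # scan each record of the split in turn
        counts[word] += 1       # combiner: merge identical keys locally
    return list(counts.items())

def partition(pairs, r):
    """Step 4: hash(key) mod R assigns each pair to one of R partitions,
    grouping values by key into <key2, list(value2)>."""
    parts = [defaultdict(list) for _ in range(r)]
    for key2, value2 in pairs:
        parts[hash(key2) % r][key2].append(value2)
    return parts

def reduce_partition(grouped):
    """Step 5: one Reduce task per partition sums each value list."""
    return {key2: sum(values) for key2, values in grouped.items()}

def mapreduce(text, m=4, r=2):
    intermediate = []
    for split in split_input(text, m):
        intermediate.extend(map_with_combiner(split))
    result = {}                 # step 6: merge the Reduce outputs
    for grouped in partition(intermediate, r):
        result.update(reduce_partition(grouped))
    return result

print(mapreduce("a b a c b a", m=2))  # {'a': 3, 'b': 2, 'c': 1} (key order may vary)
```

With m=2 the first split contains "a" twice, so the combiner already emits ("a", 2) before partitioning, illustrating why the local combiner reduces the volume of intermediate data shipped to the Reduce side.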
The beneficial effects of the invention are as follows:
The programming model of the invention combines the advantages of GPU general-purpose computing and the MapReduce model; it studies and realizes a complete high-performance parallel computing system in which software based on the MapReduce parallel computing model cooperates with GPU hardware to process large-scale data.
Nowadays, in high-performance computing, the independent use of GPU technology and of cloud computing technology is relatively mature, with many concrete applications. A generic architecture is realized in which the GPU assists CPU computation, exploiting the data structure of massive data so that the data and the parallel parts of the program are stored and run on the GPU. This architecture forms an abstraction layer between the graphics hardware and the application program; it is highly general, minimizes communication between GPU and CPU, and improves the computing power and performance of the whole system.
By fusing these two major technologies, parallel computation can be carried out on two levels, further improving efficiency. Based on research on the MapReduce distributed computing framework, an improvement to GPU-based distributed computing is proposed; through this improvement, existing computing equipment can reach a higher degree of parallel speedup.
Description of drawings
Fig. 1 is the hardware topology of the GPGPU-based MapReduce parallel computing model;
Fig. 2 is the logical block diagram of CPU-based MapReduce parallel computing;
Fig. 3 is the logical block diagram of the GPGPU-based MapReduce parallel computing model;
Fig. 4 is the TaskTracker computing-task module diagram of the GPGPU-based MapReduce parallel computing model.
Embodiment
The programming flow of the MapReduce framework after adding the GPU concurrency technique of the invention is described in detail below with reference to the drawings and an embodiment:
The model hardware topology shown in Fig. 1 is a visual description of the actual hardware platform, composed mainly of commodity computers, a 100 Mbit Ethernet switch, and the links between them. Other mature MapReduce deployments may replace the commodity computers with rack-mounted servers, or extract the file-system module onto dedicated storage equipment so that the computing cluster is used only for computation, among other parallel-computing platform patterns.
Fig. 2 shows the Hadoop platform, a realization of the CPU-based MapReduce parallel computing model. Hadoop is not merely a distributed file system for storage, but a framework designed to execute distributed applications on large clusters built from commodity hardware. Hadoop comprises two parts: the Hadoop Distributed File System (HDFS) and an implementation of MapReduce. The goal of Hadoop is to provide a framework for developing distributed applications; therefore we will improve Hadoop's MapReduce module to realize our GPGPU-based MapReduce parallel computing model.
The Hadoop platform is logically divided into three layers, which communicate with one another over the TCP/IP protocol. The first layer is the client layer: the client sends computing-task requests to the management node, and the management node returns results to the client when the task is complete. The second layer is the management-node layer, with two parts: the name node NameNode is the supervisor of the model system, mainly managing the namespace of the file system, the cluster configuration information of the computing nodes, and information such as the locations of storage blocks; the job tracker JobTracker starts and schedules computing tasks and can track task execution and the state of the computing nodes in real time. The third layer is the computing-node layer, also in two parts: the data node DataNode handles read/write requests from the name node NameNode, and can also create, delete, and replicate data blocks; the task tracker TaskTracker requests tasks from the JobTracker and, once a task is obtained, starts the computing task on the computing node.
Fig. 3 gives the logical block diagram of our GPGPU-based MapReduce parallel computing model. As can be seen from the figure, the first and second layers are identical to the CPU-based model; the main difference is that in the third (computing-node) layer the TaskTracker is split into two parts, a CPU-calling module and a GPU-computing module; this modular design is explained in detail in Fig. 4.
Although the NameNode and JobTracker modules in the management node (Master) could also adopt GPU computation to balance the performance of the model system, we first improve the TaskTracker computing-task module, which carries the greatest load.
As explained in Fig. 4, the TaskTracker module is decomposed into a CPU-calling part and a GPU-computing part. The GPU cannot replace the non-computational functions of the CPU: accessing the hard disk, obtaining physical addresses, reading and writing files, and so on are still done by the CPU, while the heavy computation is handed to the GPU. First, the CPU-calling part reads the data file from disk or memory according to the options and partitions it into blocks; it then invokes the GPU-computing part, hands each data block to one of several Map operations on the GPU, and starts these Map tasks on the GPU. When the Map tasks on the GPU complete, they produce many intermediate key/value pairs, and the GPU then starts a sort operation that sorts these intermediate pairs by key. Next, the CPU-calling part takes over, re-partitions the key-sorted intermediate pairs, hands each partition to one of several Reduce operations in the GPU-computing part, and starts these Reduce tasks on the GPU. Finally, the multiple outputs of the Reduce tasks are obtained, a merge operation is started to obtain the final output value, and the result is returned to the CPU-calling part; at this point the MapReduce computation is preliminarily complete.
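The division of labour that Fig. 4 describes — the CPU-calling part doing disk access and blocking, the GPU-computing part doing only pure computation — can be mimicked in miniature, with a thread pool standing in for the GPU's many concurrent Map tasks. Everything below (the function names, the square-and-bucket workload, the thread pool itself) is an illustrative assumption, not the patent's CUDA implementation.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def cpu_read_and_split(path, block_size=4):
    """CPU-calling part: file I/O and partitioning into blocks stay on the CPU."""
    with open(path) as f:
        numbers = [int(tok) for tok in f.read().split()]
    return [numbers[i:i + block_size] for i in range(0, len(numbers), block_size)]

def gpu_map_task(block):
    """Stand-in for one GPU Map task: pure computation, no I/O.
    Emits intermediate <key2, value2> pairs (bucket, square)."""
    return [(x % 3, x * x) for x in block]

def run(path):
    blocks = cpu_read_and_split(path)                  # CPU: read + partition
    with ThreadPoolExecutor() as pool:                 # "GPU": Map tasks in parallel
        mapped = list(pool.map(gpu_map_task, blocks))
    totals = {}
    for pairs in mapped:                               # CPU collects results back
        for key2, value2 in pairs:
            totals[key2] = totals.get(key2, 0) + value2
    return totals

# demo: write a small data file, run the pipeline, clean up
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("1 2 3 4 5 6")
    path = f.name
try:
    print(run(path))  # → {1: 17, 2: 29, 0: 45}
finally:
    os.remove(path)
```

The point of the sketch is the boundary: everything that touches the file system lives in `cpu_read_and_split`, while `gpu_map_task` receives plain in-memory blocks, mirroring the patent's rule that disk access and file I/O remain with the CPU-calling part.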
The concrete steps of the multi-GPU cooperative computing method based on MapReduce of the present invention are as follows:
1. After the data node DataNode receives a read/write request from the NameNode, the MapReduce library partitions the large data set horizontally into M fixed-size data subsets (splits); this work is handled by the CPU. M is a natural number whose size is usually determined by the number of computing nodes in the system and the scale of the data.
2. The M data subsets are formatted and further parsed into a batch of key/value pairs <key1, value1>. The exact data format, key, and value can be chosen according to the characteristics of the concrete data set; for example, in a transaction database <key1, value1> can be set to <Tid, list>, where Tid is the transaction identifier and list is the item list of that transaction. Because this part includes operations such as disk access, obtaining physical addresses, and file I/O, it is done by the CPU, which then calls the relevant GPU Map function to perform the computation.
3. The task of the Map function is to create a Map task for each input split; it takes each record <key1, value1> of the split as input, scans it, formats it for the GPU algorithm, and uses the GPU algorithm to implement a local combiner (Combiner), producing and outputting intermediate key/value pairs <key2, value2>. In the transaction-database example, the pair can be defined as <itemsets, sup>, where itemsets is a candidate k-itemset and sup is its support count in the data subset.
4. The intermediate pairs produced by the Map function are divided into R different partitions by the partition function hash(key) mod R, where R is a natural number less than M. The GPU then sorts the intermediate results by key2 and aggregates the value2 data with identical key2 values into a new list, forming pairs <key2, list(value2)>, where list(value2) is the array of value2 values sharing the same key2. Each of the R partitions is assigned to a designated Reduce task.
5. A workstation assigned a Reduce task calls the CPU to read the <key2, list(value2)> pairs submitted by the Map function. Traversing the sorted intermediate data, the CPU passes each partition to the GPU, which formats it and performs the processing operation using GPU concurrency, yielding the multiple outputs of the Reduce tasks; a merge operation is then started to obtain the final output value.
6. The GPU returns the final result to the CPU-calling part; at this point, one MapReduce process is complete.
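For the transaction-database example in the steps above — records formatted as <Tid, list> and Map emitting <itemsets, sup> pairs for candidate k-itemsets — the counting logic can be sketched as follows. Generating all k-subsets of each transaction is one simple candidate-generation choice; the patent does not fix a specific itemset algorithm, so treat these details as illustrative.

```python
from collections import defaultdict
from itertools import combinations

def map_candidates(split, k):
    """Map + local combiner: emit <itemsets, sup> pairs, where sup is the
    support count of each candidate k-itemset within this split."""
    support = defaultdict(int)
    for tid, items in split:                        # each record is <Tid, list>
        for cand in combinations(sorted(items), k): # candidate k-itemsets
            support[cand] += 1                      # combiner merges within the split
    return dict(support)

def reduce_support(partials):
    """Reduce: sum the per-split support counts of each candidate itemset."""
    total = defaultdict(int)
    for part in partials:
        for itemset, sup in part.items():
            total[itemset] += sup
    return dict(total)

transactions = [("t1", ["milk", "bread"]),
                ("t2", ["milk", "eggs"]),
                ("t3", ["milk", "bread", "eggs"])]
splits = [transactions[:2], transactions[2:]]       # two data subsets (splits)
partials = [map_candidates(s, k=2) for s in splits]
print(reduce_support(partials))
```

Sorting each transaction's items before forming combinations canonicalizes the itemset keys, so ("bread", "milk") from different splits lands on the same key and the Reduce step sums its supports correctly.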

Claims (1)

1. A multi-GPU cooperative computing method based on MapReduce, characterized by comprising the following steps:
1) first, a client sends a task request to a management stage;
2) then, the name node NameNode in the management stage manages the namespace of the file system, the cluster configuration information of the computing stage, and the location information of storage blocks; the job tracker JobTracker starts and schedules computing tasks and tracks the execution status of tasks and the state of the computing stage;
3) in the computing stage:
1. after the data node DataNode receives a read/write request from the name node NameNode, it calls the CPU to read and scan the massive data and partition it horizontally into M fixed-size data subsets (splits), M being a natural number whose size is determined by the number of computing nodes in the system and the scale of the data;
2. the TaskTracker of an idle CPU requests a task from the JobTracker and, after receiving a response, formats the M data subsets and further parses them into a batch of key/value pairs <key1, value1>;
3. the TaskTracker of an idle GPU requests a task from the JobTracker and, after receiving a response, creates a Map task for each input split; it takes each record <key1, value1> of the split as input, scans it, formats it for the special algorithm run on the GPU, and uses the GPU's CUDA library to implement a local combiner (Combiner), producing and outputting intermediate key/value pairs <key2, value2>;
4. the intermediate key/value pairs produced by the Map function are divided into R different partitions by the partition function hash(key) mod R, R being a natural number less than M; the GPU then sorts the intermediate results by key2 and aggregates the value2 data with identical key2 values into a new list, forming pairs <key2, list(value2)>, list(value2) being the array of value2 values sharing the same key2; each of the R partitions is assigned to a designated Reduce task;
5. a workstation assigned a Reduce task calls an idle CPU, whose TaskTracker reads the data <key2, list(value2)> submitted by the Map function; after traversing the sorted intermediate data, the CPU's TaskTracker passes each partition to the TaskTracker of an idle GPU, which formats it and performs the processing operation using GPU concurrency, yielding the multiple outputs of the Reduce tasks; a merge operation is started to obtain the final output value;
6. the GPU's TaskTracker returns the final result to the CPU-calling part.
CN2012101028344A 2012-04-10 2012-04-10 Mapreduce-based multi-GPU (Graphic Processing Unit) cooperative computing method Pending CN102662639A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012101028344A CN102662639A (en) 2012-04-10 2012-04-10 Mapreduce-based multi-GPU (Graphic Processing Unit) cooperative computing method


Publications (1)

Publication Number Publication Date
CN102662639A true CN102662639A (en) 2012-09-12

Family

ID=46772140

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012101028344A Pending CN102662639A (en) 2012-04-10 2012-04-10 Mapreduce-based multi-GPU (Graphic Processing Unit) cooperative computing method

Country Status (1)

Country Link
CN (1) CN102662639A (en)

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020320A (en) * 2013-01-11 2013-04-03 西安交通大学 Method for reusing and optimizing video-memory-level data of GPU (graphic processing unit) on basis of dynamic search during running of GPU
CN103279330A (en) * 2013-05-14 2013-09-04 江苏名通信息科技有限公司 MapReduce multiple programming model based on virtual machine GPU computation
CN103336959A (en) * 2013-07-19 2013-10-02 西安电子科技大学 Vehicle detection method based on GPU (ground power unit) multi-core parallel acceleration
CN103684754A (en) * 2013-12-03 2014-03-26 中国电子科技集团公司第三十研究所 WPA shared key cracking system based on GPU cluster
CN103699656A (en) * 2013-12-27 2014-04-02 同济大学 GPU-based mass-multimedia-data-oriented MapReduce platform
WO2014108768A1 (en) * 2013-01-11 2014-07-17 International Business Machines Corporation Computing regression models
CN103955400A (en) * 2014-04-17 2014-07-30 国网宁夏电力公司 Online checking method of parallel computing in electrical power system
CN104133661A (en) * 2014-07-30 2014-11-05 西安电子科技大学 Multi-core parallel hash partitioning optimizing method based on column storage
WO2014206233A1 (en) * 2013-06-25 2014-12-31 华为技术有限公司 Data processing method and device
CN104270437A (en) * 2014-09-25 2015-01-07 中国科学院大学 Mass data processing and visualizing system and method of distributed mixed architecture
CN104536937A (en) * 2014-12-30 2015-04-22 深圳先进技术研究院 Big data appliance realizing method based on CPU-GPU heterogeneous cluster
CN104731569A (en) * 2013-12-23 2015-06-24 华为技术有限公司 Data processing method and relevant equipment
CN104965689A (en) * 2015-05-22 2015-10-07 浪潮电子信息产业股份有限公司 Hybrid parallel computing method and device for CPUs/GPUs
CN104978228A (en) * 2014-04-09 2015-10-14 腾讯科技(深圳)有限公司 Scheduling method and scheduling device of distributed computing system
CN105094981A (en) * 2014-05-23 2015-11-25 华为技术有限公司 Method and device for processing data
CN105574649A (en) * 2015-12-10 2016-05-11 西安交通大学 Taxpayer tax evasion suspicion group detection method based on multi-stage MapReduce model
CN105608046A (en) * 2015-12-17 2016-05-25 南京航空航天大学 Multi-core processor architecture based on MapReduce programming model
CN105677486A (en) * 2016-01-08 2016-06-15 上海交通大学 Data parallel processing method and system
CN105740332A (en) * 2016-01-22 2016-07-06 北京京东尚科信息技术有限公司 Data sorting method and device
CN105786938A (en) * 2014-12-26 2016-07-20 华为技术有限公司 Big data processing method and apparatus
WO2017113277A1 (en) * 2015-12-31 2017-07-06 华为技术有限公司 Data processing method, device, and system
CN107391250A (en) * 2017-08-11 2017-11-24 成都优易数据有限公司 A kind of controller of raising Mapreduce task Shuffle performances
CN107515860A (en) * 2017-08-07 2017-12-26 中译语通科技(青岛)有限公司 A kind of machine translation method based on neuron
CN107729138A (en) * 2017-09-14 2018-02-23 北京天耀宏图科技有限公司 A kind of analysis method and device of high-performance distributed Vector spatial data
CN107885599A (en) * 2016-09-30 2018-04-06 达索系统公司 Method, program and the system of 3D scenes are simulated with one group of computing resource run parallel
CN109388428A (en) * 2017-08-11 2019-02-26 华为技术有限公司 Figure layer traversal method, control device and data processing system
CN109743453A (en) * 2018-12-29 2019-05-10 出门问问信息科技有限公司 A kind of multi-screen display method and device
CN109992575A (en) * 2019-02-12 2019-07-09 哈尔滨学院 The distributed memory system of big data
CN109992372A (en) * 2017-12-29 2019-07-09 中国移动通信集团陕西有限公司 A kind of data processing method and device based on mapping reduction
CN110187970A (en) * 2019-05-30 2019-08-30 北京理工大学 A kind of distributed big data parallel calculating method based on Hadoop MapReduce
CN110222105A (en) * 2019-05-14 2019-09-10 联动优势科技有限公司 Data summarization processing method and processing device
CN112307008A (en) * 2020-12-14 2021-02-02 湖南蚁坊软件股份有限公司 Druid compaction method
CN112444851A (en) * 2019-08-30 2021-03-05 中国石油化工股份有限公司 Reverse time migration imaging method based on MapReduce parallel framework and storage medium
CN116389485A (en) * 2023-06-05 2023-07-04 上海朗力半导体有限公司 Data center system and method, map node, reduce node, equipment and chip

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090158248A1 (en) * 2007-12-17 2009-06-18 Linderman Michael D Compiler and Runtime for Heterogeneous Multiprocessor Systems
CN102262557A (en) * 2010-05-25 2011-11-30 运软网络科技(上海)有限公司 Method for constructing virtual machine monitor by bus architecture and performance service framework


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Bingsheng He et al.: "Mars: A MapReduce Framework on Graphics Processors", Proc. 17th International Conference on Parallel Architectures and Compilation Techniques (PACT) *
Jeffrey Dean et al.: "MapReduce: Simplified Data Processing on Large Clusters", OSDI *

Cited By (60)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9159028B2 (en) 2013-01-11 2015-10-13 International Business Machines Corporation Computing regression models
CN103020320B (en) * 2013-01-11 2016-01-13 西安交通大学 GPU video memory DBMS multiplex optimization method during a kind of operation based on News Search
US9152921B2 (en) 2013-01-11 2015-10-06 International Business Machines Corporation Computing regression models
CN103020320A (en) * 2013-01-11 2013-04-03 西安交通大学 Method for reusing and optimizing video-memory-level data of GPU (graphic processing unit) on basis of dynamic search during running of GPU
WO2014108768A1 (en) * 2013-01-11 2014-07-17 International Business Machines Corporation Computing regression models
CN104937544A (en) * 2013-01-11 2015-09-23 国际商业机器公司 Computing regression models
CN104937544B (en) * 2013-01-11 2017-06-13 国际商业机器公司 Method, computer-readable medium and computer system for calculating task result
CN103279330A (en) * 2013-05-14 2013-09-04 江苏名通信息科技有限公司 MapReduce multiple programming model based on virtual machine GPU computation
WO2014206233A1 (en) * 2013-06-25 2014-12-31 华为技术有限公司 Data processing method and device
CN103336959B (en) * 2013-07-19 2016-09-28 西安电子科技大学 A kind of vehicle checking method accelerated based on GPU multi-core parallel concurrent
CN103336959A (en) * 2013-07-19 2013-10-02 西安电子科技大学 Vehicle detection method based on GPU (ground power unit) multi-core parallel acceleration
CN103684754B (en) * 2013-12-03 2016-11-23 中国电子科技集团公司第三十研究所 A kind of WPA shared key based on GPU cluster cracks system
CN103684754A (en) * 2013-12-03 2014-03-26 中国电子科技集团公司第三十研究所 WPA shared key cracking system based on GPU cluster
WO2015096649A1 (en) * 2013-12-23 2015-07-02 华为技术有限公司 Data processing method and related device
CN104731569A (en) * 2013-12-23 2015-06-24 华为技术有限公司 Data processing method and relevant equipment
CN104731569B (en) * 2013-12-23 2018-04-10 华为技术有限公司 A kind of data processing method and relevant device
CN103699656A (en) * 2013-12-27 2014-04-02 同济大学 GPU-based mass-multimedia-data-oriented MapReduce platform
CN104978228B (en) * 2014-04-09 2019-08-30 腾讯科技(深圳)有限公司 A kind of dispatching method and device of distributed computing system
CN104978228A (en) * 2014-04-09 2015-10-14 腾讯科技(深圳)有限公司 Scheduling method and scheduling device of distributed computing system
CN103955400A (en) * 2014-04-17 2014-07-30 国网宁夏电力公司 Online checking method of parallel computing in electrical power system
WO2015176689A1 (en) * 2014-05-23 2015-11-26 华为技术有限公司 Data processing method and device
CN105094981A (en) * 2014-05-23 2015-11-25 华为技术有限公司 Method and device for processing data
CN105094981B (en) * 2014-05-23 2019-02-12 华为技术有限公司 A kind of method and device of data processing
CN104133661B (en) * 2014-07-30 2017-01-18 西安电子科技大学 Multi-core parallel hash partitioning optimizing method based on column storage
CN104133661A (en) * 2014-07-30 2014-11-05 西安电子科技大学 Multi-core parallel hash partitioning optimizing method based on column storage
CN104270437B (en) * 2014-09-25 2017-08-25 中国科学院大学 Mass data processing and visualization system and method with distributed hybrid architecture
CN104270437A (en) * 2014-09-25 2015-01-07 中国科学院大学 Mass data processing and visualizing system and method of distributed mixed architecture
US10691669B2 (en) 2014-12-26 2020-06-23 Huawei Technologies Co., Ltd. Big-data processing method and apparatus
CN105786938A (en) * 2014-12-26 2016-07-20 华为技术有限公司 Big data processing method and apparatus
CN104536937B (en) * 2014-12-30 2017-10-31 深圳先进技术研究院 Big data all-in-one machine implementation method based on CPU-GPU heterogeneous clusters
CN104536937A (en) * 2014-12-30 2015-04-22 深圳先进技术研究院 Big data appliance realizing method based on CPU-GPU heterogeneous cluster
CN104965689A (en) * 2015-05-22 2015-10-07 浪潮电子信息产业股份有限公司 Hybrid parallel computing method and device for CPUs/GPUs
CN105574649B (en) * 2015-12-10 2021-05-28 西安交通大学 Taxpayer tax evasion suspicion group detection method based on multi-stage MapReduce model
CN105574649A (en) * 2015-12-10 2016-05-11 西安交通大学 Taxpayer tax evasion suspicion group detection method based on multi-stage MapReduce model
CN105608046A (en) * 2015-12-17 2016-05-25 南京航空航天大学 Multi-core processor architecture based on MapReduce programming model
WO2017113277A1 (en) * 2015-12-31 2017-07-06 华为技术有限公司 Data processing method, device, and system
US10599436B2 (en) 2015-12-31 2020-03-24 Huawei Technologies Co., Ltd. Data processing method and apparatus, and system
CN105677486B (en) * 2016-01-08 2019-03-22 上海交通大学 Data parallel processing method and system
CN105677486A (en) * 2016-01-08 2016-06-15 上海交通大学 Data parallel processing method and system
CN105740332A (en) * 2016-01-22 2016-07-06 北京京东尚科信息技术有限公司 Data sorting method and device
CN107885599B (en) * 2016-09-30 2023-07-28 达索系统公司 Method, program and system for simulating 3D scene by using a group of parallel running computing resources
CN107885599A (en) * 2016-09-30 2018-04-06 达索系统公司 Method, program and system for simulating a 3D scene with a set of computing resources running in parallel
CN107515860A (en) * 2017-08-07 2017-12-26 中译语通科技(青岛)有限公司 Machine translation method based on neurons
CN109388428A (en) * 2017-08-11 2019-02-26 华为技术有限公司 Figure layer traversal method, control device and data processing system
CN109388428B (en) * 2017-08-11 2021-05-04 华为技术有限公司 Layer traversal method, control device and data processing system
CN107391250A (en) * 2017-08-11 2017-11-24 成都优易数据有限公司 Controller for improving MapReduce task Shuffle performance
CN107729138B (en) * 2017-09-14 2020-11-20 北京天耀宏图科技有限公司 Method and device for analyzing high-performance distributed vector space data
CN107729138A (en) * 2017-09-14 2018-02-23 北京天耀宏图科技有限公司 Analysis method and device for high-performance distributed vector spatial data
CN109992372A (en) * 2017-12-29 2019-07-09 中国移动通信集团陕西有限公司 Data processing method and device based on map-reduce
CN109743453A (en) * 2018-12-29 2019-05-10 出门问问信息科技有限公司 Multi-screen display method and device
CN109992575A (en) * 2019-02-12 2019-07-09 哈尔滨学院 The distributed memory system of big data
CN109992575B (en) * 2019-02-12 2020-02-14 哈尔滨学院 Distributed storage system for big data
CN110222105B (en) * 2019-05-14 2021-06-29 联动优势科技有限公司 Data summarization processing method and device
CN110222105A (en) * 2019-05-14 2019-09-10 联动优势科技有限公司 Data summarization processing method and processing device
CN110187970A (en) * 2019-05-30 2019-08-30 北京理工大学 Distributed big data parallel computing method based on Hadoop MapReduce
CN112444851A (en) * 2019-08-30 2021-03-05 中国石油化工股份有限公司 Reverse time migration imaging method based on MapReduce parallel framework and storage medium
CN112307008A (en) * 2020-12-14 2021-02-02 湖南蚁坊软件股份有限公司 Druid compaction method
CN112307008B (en) * 2020-12-14 2023-12-08 湖南蚁坊软件股份有限公司 Druid compaction method
CN116389485A (en) * 2023-06-05 2023-07-04 上海朗力半导体有限公司 Data center system and method, map node, reduce node, equipment and chip
CN116389485B (en) * 2023-06-05 2023-08-15 上海朗力半导体有限公司 Data center system and method, map node, reduce node, equipment and chip

Similar Documents

Publication | Publication Date | Title
CN102662639A (en) Mapreduce-based multi-GPU (Graphic Processing Unit) cooperative computing method
US10769148B1 (en) Relocating data sharing operations for query processing
CN102663117B (en) OLAP (On Line Analytical Processing) inquiry processing method facing database and Hadoop mixing platform
Ji et al. Big data processing: Big challenges and opportunities
CN102567495B (en) Mass information storage system and implementation method
CN107515878B (en) Data index management method and device
CN102609446B (en) Distributed Bloom filter system and application method thereof
CN108280522A (en) Plug-in distributed machine learning computing framework and data processing method thereof
CN103678520A (en) Multi-dimensional interval query method and system based on cloud computing
Wang et al. Research and implementation on spatial data storage and operation based on Hadoop platform
CN109933631A (en) Distributed parallel database system and data processing method based on Infiniband network
CN104375824A (en) Data processing method
CN107220310A (en) Database data management system, method and device
CN110147377B (en) General query method based on secondary index under large-scale spatial data environment
CN102937964B (en) Intelligent data service method based on distributed system
Ngu et al. B+-tree construction on massive data with Hadoop
CN106569896B (en) Data distribution and parallel processing method and system
CN104731925A (en) MapReduce-based FP-Growth load balance parallel computing method
CN104111924A (en) Database system
CN106055678A (en) Hadoop-based panoramic big data distributed storage method
Koh et al. MapReduce skyline query processing with partitioning and distributed dominance tests
CN110677461A (en) Graph computation method based on key-value pair storage
US10642520B1 (en) Memory optimized data shuffle
CN104063501A (en) Replica balancing method based on HDFS
CN107908713B (en) Distributed dynamic cuckoo filter system based on Redis cluster and filtering method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
EE01 Entry into force of recordation of patent licensing contract
    Application publication date: 20120912
    Assignee: Jiangsu Wisedu Information Technology Co., Ltd.
    Assignor: Nanjing University of Aeronautics and Astronautics
    Contract record no.: 2013320000314
    Denomination of invention: Mapreduce-based multi-GPU (Graphic Processing Unit) cooperative computing method
    License type: Exclusive License
    Record date: 20130410
LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication
EC01 Cancellation of recordation of patent licensing contract
    Application publication date: 20120912
    Assignee: JIANGSU WISEDU EDUCATION INFORMATION TECHNOLOGY CO., LTD.
    Assignor: Nanjing University of Aeronautics and Astronautics
    Contract record no.: 2013320000314
    Date of cancellation: 20150421
LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model