CN103399758A - Hardware accelerating method, device and system - Google Patents

Info

Publication number
CN103399758A
Authority
CN
China
Prior art keywords
business
configuration file
performance number
request amount
service request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011104594236A
Other languages
Chinese (zh)
Other versions
CN103399758B (en
Inventor
周丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Huawei Technology Co Ltd
Original Assignee
Huawei Symantec Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Symantec Technologies Co Ltd filed Critical Huawei Symantec Technologies Co Ltd
Priority to CN201110459423.6A priority Critical patent/CN103399758B/en
Publication of CN103399758A publication Critical patent/CN103399758A/en
Application granted granted Critical
Publication of CN103399758B publication Critical patent/CN103399758B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention provides a hardware acceleration method, device and system. The device comprises a service monitoring unit, a configuration loading unit and a configuration file storage area. The service monitoring unit obtains the service request amount of a data processing service, compares the service request amount with a request amount threshold, and obtains the performance value of the data processing service that matches the service request amount. The configuration file storage area stores FPGA configuration files. The configuration loading unit obtains, according to the obtained performance value, the FPGA configuration file that matches that performance value and loads it, thereby implementing the corresponding hardware acceleration. The hardware acceleration device automatically adapts to different performance combination requirements and lowers the production cost of hardware acceleration devices.

Description

Hardware acceleration method, device and system
Technical field
The present invention relates to storage technology, and in particular to a hardware acceleration method, device and system.
Background
Data processing today involves many computation-intensive operations. In storage systems, for example, data deduplication ("dedup" for short) and redundant data compression ("compression" for short) are both commonly used and effective data reduction techniques. Both involve a large amount of computation-intensive processing, such as data chunking, computing hash values of data blocks and comparing hash values. These operations are computationally expensive, consume considerable processor resources and may degrade the performance of other services. To reduce the dependence of such processing on the processor, hardware acceleration devices are now mainly used to assist the processor. A prior-art hardware acceleration device may be, for example, a hardware accelerator card built around a field-programmable gate array (FPGA) chip, which accelerates deduplication and compression in hardware.
However, the inventor found through research that current FPGA-based accelerators have the following technical defect: the FPGA uses a single fixed configuration file and can implement only the logic functions corresponding to that file, so it supports only the one performance combination of functions such as deduplication and compression that the file defines. For example, suppose the processing resources of the FPGA comprise 1000 logic blocks and, according to the configuration file, the FPGA allocates 200 of them to deduplication and 800 to compression. This is a performance combination with a deduplication-to-compression ratio of 1:4. Here, a performance combination is the ratio of the accelerator card's processing resources occupied by each function.
In practice, however, different users and different application environments may require different performance combinations. For example, a user may have more data to deduplicate, so that deduplication needs more resources than compression (for instance, 800 logic blocks for deduplication and 200 for compression). The prior-art accelerator card design clearly cannot meet this need. Different user needs could be met by producing accelerator cards for multiple performance combinations (each card still using one fixed configuration file and supporting only one performance combination), but this inevitably increases the production and management cost of hardware acceleration devices. Moreover, when the user's application environment and performance combination requirements change, the user has to buy a new accelerator card that matches the changed requirements.
Summary of the invention
A first aspect of the present invention provides a hardware acceleration device, so that a single hardware acceleration device automatically adapts to different performance combination requirements and the production cost of hardware acceleration devices is reduced.
Another aspect of the present invention provides a hardware acceleration method, so that a single hardware acceleration device automatically adapts to different performance combination requirements and the production cost of hardware acceleration devices is reduced.
A further aspect of the present invention provides a hardware acceleration system, so that a single hardware acceleration device automatically adapts to different performance combination requirements and the production cost of hardware acceleration devices is reduced.
The hardware acceleration device provided by the present invention comprises a service monitoring unit, a configuration loading unit, a field-programmable gate array (FPGA) and a configuration file storage area.
The service monitoring unit is configured to obtain the service request amount of each of at least two data processing services, compare each service request amount with the preset request amount threshold of the corresponding data processing service, and obtain the performance value of the data processing service that matches the service request amount.
The configuration file storage area is configured to store FPGA configuration files. The FPGA configuration files include configuration files corresponding to different data processing services, and the configuration files of each data processing service include configuration files corresponding to different performance values of that service.
The configuration loading unit is configured to obtain, according to the obtained performance value of the data processing service that matches the service request amount, the FPGA configuration file that matches that performance value, and to load the matching FPGA configuration file so as to implement the hardware acceleration corresponding to that FPGA configuration file.
The hardware acceleration method provided by the present invention comprises:
obtaining the service request amount of a data processing service, comparing the service request amount with the preset request amount threshold of the corresponding data processing service, and obtaining the performance value of the data processing service that matches the service request amount;
obtaining, according to the obtained performance value of the data processing service that matches the service request amount, the FPGA configuration file that matches that performance value, and loading the matching FPGA configuration file so as to implement the hardware acceleration corresponding to that FPGA configuration file. The FPGA configuration files include configuration files corresponding to different data processing services, and the configuration files of each data processing service include configuration files corresponding to different performance values of that service.
The hardware acceleration system provided by the present invention comprises a field-programmable gate array (FPGA) and the hardware acceleration device of the present invention.
The technical effect of the hardware acceleration device of the present invention is as follows. The device provides a service monitoring unit, a configuration loading unit and related components. The service monitoring unit obtains the performance value corresponding to the obtained request amount and instructs the configuration loading unit to load the FPGA configuration file corresponding to that performance value, so that hardware acceleration with the performance combination determined by the performance values is achieved. Because the service monitoring unit monitors service requests in real time and the corresponding configuration file is loaded in real time, the problem that a hardware acceleration device cannot meet users' different performance combination requirements is solved: a single hardware acceleration device automatically adapts to different performance combination requirements, and the production cost of hardware acceleration devices is reduced.
The technical effect of the hardware acceleration method of the present invention is as follows. The performance value corresponding to the obtained request amount is obtained and the FPGA configuration file corresponding to that performance value is loaded, so that hardware acceleration with the performance combination determined by the performance values is achieved. Because service requests are monitored in real time and the corresponding configuration file is loaded in real time, the problem that a hardware acceleration device cannot meet users' different performance combination requirements is solved: a single hardware acceleration device automatically adapts to different performance combination requirements, and the production cost of hardware acceleration devices is reduced.
The technical effect of the hardware acceleration system of the present invention is as follows. The performance value corresponding to the obtained request amount is obtained and the FPGA configuration file corresponding to that performance value is loaded, so that hardware acceleration with the performance combination determined by the performance values is achieved. Because service requests are monitored in real time and the corresponding configuration file is loaded in real time, the problem that a hardware acceleration device cannot meet users' different performance combination requirements is solved: a single hardware acceleration device automatically adapts to different performance combination requirements, and the production cost of hardware acceleration devices is reduced.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of one embodiment of the hardware acceleration device of the present invention;
Fig. 2 is a schematic structural diagram of another embodiment of the hardware acceleration device of the present invention;
Fig. 3 is a schematic diagram of the hierarchical configuration file design in another embodiment of the hardware acceleration device of the present invention;
Fig. 4 is a schematic structural diagram of another embodiment of the hardware acceleration device of the present invention;
Fig. 5 is an application schematic diagram of yet another embodiment of the hardware acceleration device of the present invention;
Fig. 6 is a schematic flowchart of an embodiment of the hardware acceleration method of the present invention;
Fig. 7 is a schematic diagram of an application scenario of an embodiment of the hardware acceleration device of the present invention;
Fig. 8 is a schematic diagram of another application scenario of an embodiment of the hardware acceleration device of the present invention.
Detailed description of the embodiments
Embodiment 1
Fig. 1 is a schematic structural diagram of one embodiment of the hardware acceleration device of the present invention. The hardware acceleration device of this embodiment is a hardware accelerator card built around an FPGA.
As shown in Fig. 1, the hardware acceleration device of this embodiment may comprise a service monitoring unit 11, a configuration loading unit 12 and a configuration file storage area 13, where:
The configuration file storage area 13 is configured to store multiple FPGA configuration files. The FPGA configuration files include configuration files corresponding to different data processing services, and the configuration files of each data processing service include configuration files corresponding to different performance values of that service.
The performance value of a data processing service means, for example, that 100 MB of raw data can be processed in 1 second, or that 200 MB of raw data can be processed in 1 second. An FPGA configuration file corresponding to a performance value is a file containing FPGA configuration data such that, if the FPGA is configured according to that data, the data processing service reaches that performance value, for example so that the compression service processes 100 MB of raw data per second. To achieve a specific performance, the FPGA has to allocate its own processing resources. For example, suppose one compression channel occupies 100 units of logic resources and delivers only 20 MBps. If 100 of the FPGA's 1000 units of logic resources are initially used for compression, compression reaches only 20 MBps. Under the configuration file corresponding to the performance value above, the FPGA would allocate 500 units of logic resources to compression, giving 5 compression channels and a processing performance of 100 MBps.
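The resource arithmetic in this example can be written out as a minimal sketch. The constants are the figures quoted in the paragraph; the function names `channels_needed` and `logic_resources_needed` are hypothetical and do not come from the patent.

```python
import math

# Figures from the example in the text; a real FPGA design has its own numbers.
RESOURCES_PER_CHANNEL = 100   # logic resource units used by one compression channel
MBPS_PER_CHANNEL = 20         # throughput of one compression channel, in MBps
TOTAL_RESOURCES = 1000        # logic resource units available on the FPGA


def channels_needed(target_mbps: int) -> int:
    """Number of compression channels needed to reach at least target_mbps."""
    return math.ceil(target_mbps / MBPS_PER_CHANNEL)


def logic_resources_needed(target_mbps: int) -> int:
    """Logic resource units the FPGA must allocate for the target throughput."""
    needed = channels_needed(target_mbps) * RESOURCES_PER_CHANNEL
    if needed > TOTAL_RESOURCES:
        raise ValueError("target exceeds the FPGA's logic resources")
    return needed


# 100 MBps requires 5 channels, i.e. 500 of the 1000 resource units.
print(channels_needed(100), logic_resources_needed(100))
```

Rounding up with `math.ceil` is an assumption of this sketch: a target that is not a multiple of 20 MBps is served by the next whole channel.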
The multiple FPGA configuration files include configuration files corresponding to different data processing services, and the configuration files of each data processing service include configuration files corresponding to different performance values of that service. That is, the configuration files may correspond to several different data processing services, for example a configuration file for the deduplication service and a configuration file for the compression service. For one and the same data processing service there are also files for different performance values of that service. For example, there are several configuration files for the deduplication service, including a configuration file for a first performance value (deduplicating 100 MB in 1 second), a configuration file for a second performance value (deduplicating 200 MB in 1 second), and so on.
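One way to picture this two-level organization is a nested mapping from service to performance value to configuration file. The sketch below is illustrative only: the service names, the `.bit` file names and the function `find_config` are assumptions and are not specified by the patent.

```python
from typing import Dict

# Hypothetical index of the configuration file storage area:
# service name -> performance value (MBps) -> configuration file name.
CONFIG_STORE: Dict[str, Dict[int, str]] = {
    "deduplication": {100: "dedup_100.bit", 200: "dedup_200.bit"},
    "compression": {100: "comp_100.bit", 200: "comp_200.bit"},
}


def find_config(service: str, performance_mbps: int) -> str:
    """Return the configuration file stored for a service at a performance value."""
    try:
        return CONFIG_STORE[service][performance_mbps]
    except KeyError:
        raise LookupError(
            f"no configuration file for {service} at {performance_mbps} MBps"
        ) from None


print(find_config("deduplication", 200))
```

A nested dictionary keeps the lookup by service and by performance value separate, which mirrors the two levels the text describes.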
The service monitoring unit 11 is configured to obtain the service request amount of a data processing service, compare the service request amount with the preset request amount threshold of the corresponding data processing service, and obtain the performance value of the data processing service that matches the service request amount.
Optionally, the FPGA hardware accelerator card may process at least two data processing services at the same time, such as deduplication, compression, data chunking and hashing. In other words, the FPGA hardware accelerator card provides deduplication, compression, chunking, hashing and similar functions.
Specifically, the service monitoring unit 11 obtains the service request amount of a data processing service. The service request amount is the data volume of the deduplication, compression or other requests. For example, if 10 compression requests are received in 1 second and together they ask for 50 MB of data to be compressed, the request amount of the compression service is 50 MB. The service monitoring unit 11 compares the service request amount with the preset request amount threshold of the corresponding data processing service. The threshold may be set in advance in the service monitoring unit 11 and may be a single value or a range. For example, if the preset compression request amount threshold is 80 MB, the request amount of 50 MB is compared with 80 MB; if the threshold is a range such as 60MB-80MB, the request amount of 50 MB is compared with that range.
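The comparison described above can be sketched as follows, assuming the threshold is given either as a single number or as a (low, high) pair in MB. The function names and the three-way return value are illustrative choices, and the 5 MB per-request size is assumed (the text only gives the 50 MB total).

```python
from typing import Sequence, Tuple, Union

# A threshold is either a single number or a (low, high) range, both in MB.
Threshold = Union[float, Tuple[float, float]]


def service_request_amount(request_sizes_mb: Sequence[float]) -> float:
    """Total data volume (MB) of the requests seen in one sampling interval."""
    return sum(request_sizes_mb)


def compare_with_threshold(amount_mb: float, threshold: Threshold) -> int:
    """Return -1 if the amount is below the threshold, 1 if above, otherwise 0."""
    if isinstance(threshold, tuple):
        low, high = threshold
    else:
        low = high = threshold
    if amount_mb < low:
        return -1
    if amount_mb > high:
        return 1
    return 0


# Ten compression requests in one second, assumed to be 5 MB each (50 MB total).
amount = service_request_amount([5] * 10)
print(amount, compare_with_threshold(amount, 80), compare_with_threshold(amount, (60, 80)))
```

With the figures from the text, 50 MB falls below both the 80 MB threshold and the 60MB-80MB range, so both comparisons report "below".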
Specifically, through this comparison the service monitoring unit 11 obtains the performance value of the data processing service corresponding to the service request amount. For example, suppose the compression service initially uses the configuration file corresponding to the 80 MB performance value. Real-time monitoring and analysis of the compression request amount shows that it has dropped to 50 MB, below the configured request amount threshold. This indicates that the current configuration provides more performance than needed, so the processing performance of this service should be lowered to reduce power consumption. The lowered performance value closest to the service request amount is taken as the performance value corresponding to the service request amount. At this point the service monitoring unit 11 has determined the performance value of the data processing service that corresponds to the service request amount.
Note that, in the embodiments of the present invention, the performance value "closest" to the service request amount includes a performance value "identical" to the service request amount.
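A minimal sketch of the "closest performance value" selection is shown below. The patent only says "closest" (including "identical"); breaking ties in favor of the larger value is an assumption made here, and the function name is hypothetical.

```python
from typing import Iterable


def closest_performance_value(request_mb: float, available: Iterable[float]) -> float:
    """Pick the available performance value nearest to the request amount.

    An identical value counts as closest. On a tie the larger value wins,
    which is an assumption of this sketch, not something the patent states.
    """
    candidates = list(available)
    if not candidates:
        raise ValueError("no performance values available")
    # Sort key: distance first, then prefer the larger value on equal distance.
    return min(candidates, key=lambda v: (abs(v - request_mb), -v))


print(closest_performance_value(50, [20, 50, 80]))
print(closest_performance_value(60, [40, 80]))
```

A real design might instead choose the smallest value that still covers the request amount, so that the configured performance never falls below demand; the text does not settle this point.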
The configuration loading unit 12 is configured to obtain, according to the obtained performance value of the data processing service that matches the service request amount, the FPGA configuration file that matches that performance value, and to load the FPGA configuration file corresponding to that performance value so as to implement the hardware acceleration corresponding to that FPGA configuration file.
Specifically, after the service monitoring unit 11 has determined the performance value of the data processing service corresponding to the service request amount, the configuration loading unit 12 obtains the configuration file to be loaded according to this performance value and loads it, for example the configuration file for the deduplication service at 100 MB per second mentioned above. Loading a configuration file implements hardware acceleration at the service performance value that the file corresponds to. For example, suppose the FPGA hardware accelerator card has to provide both compression and deduplication and has loaded the configuration file for compression at 100 MBps and the configuration file for deduplication at 300 MBps. The FPGA then sets up its logic functions according to these configuration files and reaches the stated performance for each service. That is, once configured, the FPGA hardware accelerator card provides compression at 100 MBps and deduplication at 300 MBps, and when the card is used to accelerate data processing, the performance combination of compression to deduplication is 1:3.
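The 1:3 performance combination in this example can be derived from the loaded performance values as in the sketch below. The function name and the dictionary layout are assumptions for illustration; the performance values are read as MBps.

```python
from functools import reduce
from math import gcd
from typing import Dict


def performance_combination(loaded_mbps: Dict[str, int]) -> Dict[str, int]:
    """Reduce per-service performance values to the smallest integer ratio."""
    if not loaded_mbps or any(v <= 0 for v in loaded_mbps.values()):
        raise ValueError("need at least one positive performance value")
    divisor = reduce(gcd, loaded_mbps.values())
    return {service: v // divisor for service, v in loaded_mbps.items()}


# Performance values of the configuration files loaded in the example above.
loaded = {"compression": 100, "deduplication": 300}
print(performance_combination(loaded))
```

Dividing by the greatest common divisor gives the ratio in lowest terms, here 1 for compression and 3 for deduplication.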
The above analysis shows that the service monitoring unit 11 monitors the service request amount of each data processing service in real time and determines the service performance value corresponding to that request amount, and that the configuration loading unit 12 loads the configuration file corresponding to that performance value, so that the functions of the FPGA hardware accelerator card are adjusted in real time. For example, when a change in the user's application environment changes the user's required performance combination across services, the service monitoring unit 11 of the FPGA hardware accelerator card detects the change in the user's service requests in real time, and the card reconfigures its own logic functions as described above to deliver hardware acceleration with the performance combination the user now requires. The card thus adapts automatically to changes in performance combination requirements and adjusts smoothly without interrupting service. Adapting automatically in this way can also reduce the power consumption of the FPGA hardware accelerator card. Because a single kind of hardware accelerator card provides these functions, production and management costs are lower than with the multiple kinds of accelerator card in the prior art.
The hardware acceleration device of the embodiments of the present invention can be applied not only to data reduction in storage systems but also to other computation-intensive operations in storage systems, such as data XOR and data encryption, and to non-storage systems that need hardware acceleration of multiple algorithms. In practice there are usually two types of FPGA hardware accelerator card: those whose FPGA supports partial reconfiguration and those whose FPGA does not. The following two embodiments describe the structure and working principle of these two types of FPGA hardware acceleration device in detail.
Embodiment 2
Fig. 2 is a schematic structural diagram of another embodiment of the hardware acceleration device of the present invention. This embodiment describes the structure and working principle of an FPGA hardware acceleration device that supports partial reconfiguration, both of which are shown in Fig. 2. An FPGA that supports partial reconfiguration is one in which only part of the configuration loaded into the FPGA can be reconfigured. For example, if the FPGA holds a configuration file for service A and a configuration file for service B, the configuration file for service B alone can be changed.
First, the configuration files stored in the configuration file storage area 13 of this hardware acceleration device are described. The storage area 13 holds multiple configuration files corresponding to different performance values of different data processing services, so several configuration files are available for loading, whereas a prior-art hardware acceleration device can use only one fixed configuration file.
Specifically, refer to Fig. 3, which is a schematic diagram of the hierarchical configuration file design in another embodiment of the hardware acceleration device of the present invention. The files in the configuration file storage area are clearly divided by functional module, for example into five functional modules A, B, C, D and E. Here, for example, A is the compression module that handles the compression service, B is the deduplication module that handles the deduplication service, C is the chunking module that handles the data chunking service, and so on. The functional modules are not directly coupled; they are independent of one another. Each functional module in turn contains several configuration files corresponding to different processing performance values. For example, the deduplication module B contains four configuration files B1, B2, B3 and B4, each corresponding to a different deduplication performance value: B1 corresponds to deduplicating 100 MB per second, B2 to 200 MB per second, B3 to 300 MB per second, and so on. After loading one of these configuration files, the FPGA accelerator card has the corresponding processing performance. Which functional modules are provided, and which performance values each module's configuration files cover, can be decided by the manufacturer of the hardware acceleration device according to actual user demand and is not limited here.
In the embodiment corresponding to Fig. 3, each configuration file loaded by the configuration loading unit corresponds to one performance value, and each performance value corresponds to one data processing service. Accordingly, when the configuration loading unit loads the matching FPGA configuration file, it separately loads, for each data processing service, the sub-configuration file that corresponds to that service and matches the obtained performance value of that service.
Referring to Fig. 2, the FPGA configuration file storage area in Fig. 2 holds three functional modules A, B and C, and each functional module contains configuration files for three performance values. Note that A1, A2, B1, B2 and so on are in fact partial configuration units of a complete FPGA configuration file. For example, if the FPGA needs to load A2 and C3, the combination of A2 and C3 is equivalent to one complete FPGA configuration file: configuration loading is finished only after the FPGA has loaded both A2 and C3, at which point it has the performance combination determined by A2 and C3, and A2 or C3 alone is only one part of that complete configuration file. For simplicity, however, the embodiments of the present invention use the term "configuration file" throughout: A2 and C3 are each called a configuration file, and so is their combination. In addition, the files for different performance values within each functional module are arranged in ascending levels. For example, A1 corresponds to 100 MBps for function A, A2 to 300 MBps and A3 to 500 MBps; that is, in this embodiment the performance value increases with the sequence number.
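The relationship between the partial configuration units and a complete configuration can be sketched as below. The module table follows the A1/A2/A3 figures given in the text; the B and C performance values, and the function names `compose` and `reconfigure`, are placeholders invented for this sketch. The sketch only models which file is selected per module and does not touch real FPGA tooling.

```python
from typing import Dict, Iterable

# Performance value (MBps) per sub-configuration file. The A values follow the
# text; the B and C values are placeholders invented for this sketch.
MODULES: Dict[str, Dict[str, int]] = {
    "A": {"A1": 100, "A2": 300, "A3": 500},
    "B": {"B1": 100, "B2": 200, "B3": 300},
    "C": {"C1": 100, "C2": 200, "C3": 300},
}


def _module_of(name: str) -> str:
    """Return the module letter of a file name, checking that the file exists."""
    module = name[:1]
    if name not in MODULES.get(module, {}):
        raise KeyError(f"unknown configuration file: {name}")
    return module


def compose(files: Iterable[str]) -> Dict[str, str]:
    """Combine one sub-configuration file per module into a complete configuration."""
    config: Dict[str, str] = {}
    for name in files:
        module = _module_of(name)
        if module in config:
            raise ValueError(f"module {module} selected twice")
        config[module] = name
    return config


def reconfigure(current: Dict[str, str], new_file: str) -> Dict[str, str]:
    """Partial reconfiguration: swap one module's file and leave the others alone."""
    updated = dict(current)
    updated[_module_of(new_file)] = new_file
    return updated


full = compose(["A2", "C3"])
print(full, reconfigure(full, "C1"))
```

Here `compose(["A2", "C3"])` stands for the complete configuration made of A2 and C3, and `reconfigure` replaces only module C while A2 stays loaded, which is the behavior the text attributes to a partially reconfigurable FPGA.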
In the embodiment corresponding to Fig. 2, the configuration file loaded by the configuration loading unit is a combined file corresponding to at least two performance values, each of which corresponds to one data processing service. Accordingly, when the configuration loading unit loads the matching FPGA configuration file, it loads the combined file that corresponds to the data processing services and matches the obtained performance values of those services.
In addition, the "fixed function" and "basic function framework" in Fig. 2 are basic functional configurations needed for FPGA configuration and logic functions and are not described further here. In this embodiment, the locations of the service monitoring unit, the configuration loading unit and the configuration file storage area can be chosen flexibly. In Fig. 2, for example, the service monitoring unit 11 and the configuration loading unit 12 are both placed on the FPGA. An FPGA hardware acceleration device usually comprises an FPGA hardware accelerator card and an accelerator card driver/management unit, and the card contains the FPGA. The service monitoring unit and the configuration loading unit may be placed on the FPGA, in an area of the FPGA hardware accelerator card outside the FPGA, or in the accelerator card driver/management unit. The configuration file storage area may be placed in an area of the FPGA hardware accelerator card outside the FPGA or on another storage device such as a disk, as long as the FPGA can access the storage area. The placement of the functional units of the hardware acceleration device is thus flexible and not strictly limited, as long as the adaptive adjustment of the performance combination described in the present invention can be achieved.
Hardware acceleration with the FPGA hardware acceleration device of this embodiment is configured as follows. Refer to Fig. 4, which is a schematic structural diagram of another embodiment of the hardware acceleration device of the present invention. The service monitoring unit 11 of the hardware acceleration device may comprise a comparison subunit 111, a first processing subunit 112 and a second processing subunit 113. The comparison subunit 111 obtains the service request amount received by the hardware acceleration device and compares it with the preset request amount threshold. The request amount thresholds of different data processing services may differ. For the compression service, for example, the threshold may be preset to 80 MB, or to a range such as 60MB-80MB.
The first processing subunit 112 is used when the service request amount is higher than the request amount threshold, which indicates that the performance of this service needs to be raised; it takes the raised performance value closest to the service request amount as the performance value matching the service request amount. The second processing subunit 113 is used when the service request amount is lower than the request amount threshold, which indicates that the performance of this service needs to be lowered; it takes the lowered performance value closest to the service request amount as the performance value corresponding to the service request amount. Specifically, the raised or lowered performance value can be realized in two ways. Take raising as an example. Suppose the FPGA hardware accelerator currently uses the configuration file for the 50 MBps performance value of the compression service, the request amount comparison determines that the compression service should be raised to 150 MBps, and the configuration files of the compression service cover the 50 MBps, 100 MBps, and 150 MBps performance values. The configuration file for 150 MBps can directly replace the configuration file for 50 MBps (equivalent to directly loading a higher-performance configuration file). Alternatively, the configuration file for 100 MBps can be added on top of the 50 MBps configuration (equivalent to adding a service channel for the compression service). Lowering works in the same two ways: either directly adopt the performance value of the next lower level, or reduce the number of service channels.
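The two ways of reaching a new performance value can be sketched as follows, using the 50/100/150 MBps figures from the text. The helper names `closest_level` and `adjust_by_channels` are hypothetical, and modelling a service channel as one extra loaded configuration file is an assumption made for illustration.

```python
from typing import List


def closest_level(levels: List[int], request: int, current: int, raise_perf: bool) -> int:
    """Pick the level nearest to the request among levels above (or below) the current one."""
    candidates = [lv for lv in levels if (lv > current if raise_perf else lv < current)]
    if not candidates:
        return current  # already at the highest or lowest available level
    return min(candidates, key=lambda lv: abs(lv - request))


def adjust_by_channels(levels: List[int], loaded: List[int], target: int) -> List[int]:
    """Second way: keep the loaded files and add or remove one file (one channel)."""
    diff = target - sum(loaded)
    if diff == 0:
        return list(loaded)
    if diff > 0 and diff in levels:
        return loaded + [diff]           # add a channel of the missing size
    if diff < 0 and -diff in loaded:
        remaining = list(loaded)
        remaining.remove(-diff)          # drop one channel of the surplus size
        return remaining
    raise ValueError("no single configuration file covers the difference")


levels = [50, 100, 150]
target = closest_level(levels, request=150, current=50, raise_perf=True)
print([target])                                   # first way: replace 50 with [150]
print(adjust_by_channels(levels, [50], target))   # second way: [50, 100]
```

Both results deliver 150 MBps in total; the first swaps one file for another, while the second leaves the running 50 MBps configuration in place.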
It should be noted that, when performance is adjusted according to the request amount comparison, the performance value is normally raised or lowered one level at a time. After each level change, it is judged whether the new performance matches the request amount; if it does not, the performance is raised or lowered again until it matches. The number of configuration files loaded is not limited, as long as the preset performance can be reached.
In addition, this embodiment generally monitors several kinds of service requests at the same time, so the final performance value for each data processing service is determined by jointly evaluating the comparison results of all services. For example, suppose both the compression service and the chunking service need higher performance: the compression service needs to be raised to 200 MBps and the chunking service to 500 MBps. The capacity limit of the total processing resources of the FPGA hardware accelerator must also be considered. If both services were raised as required, the total of 700 MBps would exceed the accelerator's processing resource capacity of 600 MBps. A priority scheme can be used in this case. Supposing the chunking service is set to high priority, its performance requirement is met first and its performance is raised to 500 MBps; the remaining 100 MBps of system resources is then allocated to the lower-priority compression service, whose 200 MBps requirement cannot be met at this time.
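The priority example can be worked through with a short sketch. The function `allocate` and the numeric priority field are assumptions for illustration; the text only states that the high-priority service is satisfied first and the remainder goes to the low-priority service.

```python
from typing import Dict, List, Tuple


def allocate(requests: List[Tuple[str, int, int]], capacity: int) -> Dict[str, int]:
    """requests holds (service, wanted MBps, priority); a larger priority is served first."""
    remaining = capacity
    granted: Dict[str, int] = {}
    for name, wanted, _priority in sorted(requests, key=lambda r: r[2], reverse=True):
        granted[name] = min(wanted, remaining)  # never exceed what is left
        remaining -= granted[name]
    return granted


result = allocate([("compression", 200, 1), ("chunking", 500, 2)], capacity=600)
print(result)  # {'chunking': 500, 'compression': 100}
```

With a capacity of 600 MBps, the chunking service receives its full 500 MBps and the compression service receives the remaining 100 MBps, matching the figures in the text. A real device would additionally have to map each granted amount to a performance value for which a configuration file exists.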
As the above analysis shows, the service monitoring unit 11 can monitor service requests in real time, and the corresponding configuration files can be loaded in real time to make adjustments. For example, the service monitoring unit 11 monitors the service request amount of a data processing service in real time and determines the service performance value corresponding to that request amount; the configuration loading unit 12 retrieves the configuration file corresponding to that performance value from the configuration file storage area 13 and loads it into the FPGA, thereby adjusting the function of the FPGA hardware accelerator card in real time. For example, when a change in the user's application environment changes the user's requirements for the combination of processing performance across services, the FPGA hardware accelerator card of this embodiment detects the change in the user's service requests in real time through its own service monitoring unit and reconfigures its own logic functions using the method described above. It thus provides hardware acceleration for the user's required performance combination; that is, it adapts automatically to changes in the required performance combination and adjusts smoothly without interrupting service. Adapting automatically to changes in the required performance combination can also reduce the power consumption of the FPGA hardware accelerator card. Because a single type of hardware accelerator card provides these functions, production and management costs are lower than with the multiple types of accelerator cards used in the prior art.
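The hand-off from the monitoring result to the loading step can be sketched as follows. This is an illustration under stated assumptions: the class `ConfigLoadingUnit`, the in-memory dictionary standing in for the configuration file storage area, and the `program_fpga` callback are all hypothetical and do not correspond to any real vendor API. Skipping the reload when the performance value is unchanged is a design choice of the sketch, not something the text requires.

```python
from typing import Callable, Dict, List, Tuple

# (service name, performance value in MBps) -> configuration file contents
ConfigStore = Dict[Tuple[str, int], bytes]


class ConfigLoadingUnit:
    """Fetches the file for a performance value and hands it to a programming callback."""

    def __init__(self, store: ConfigStore, program_fpga: Callable[[bytes], None]) -> None:
        self.store = store
        self.program_fpga = program_fpga   # hypothetical hook, not a real vendor API
        self.loaded: Dict[str, int] = {}   # service -> performance value now active

    def apply(self, service: str, performance: int) -> bool:
        """Load the matching file unless that performance value is already active."""
        if self.loaded.get(service) == performance:
            return False
        self.program_fpga(self.store[(service, performance)])
        self.loaded[service] = performance
        return True


programmed: List[bytes] = []
unit = ConfigLoadingUnit(
    {("compression", 50): b"cfg-50", ("compression", 150): b"cfg-150"},
    programmed.append,
)
unit.apply("compression", 50)
unit.apply("compression", 50)    # no reload: the value is unchanged
unit.apply("compression", 150)
print(programmed)                # [b'cfg-50', b'cfg-150']
```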
In the hardware accelerator of this embodiment, a service monitoring unit and a configuration loading unit are provided. The service monitoring unit obtains the performance value corresponding to the obtained request amount, and the configuration loading unit loads the FPGA configuration file corresponding to that performance value, so that the corresponding hardware acceleration is achieved. This solves the problem that a hardware accelerator cannot meet users' differing performance combination requirements: a single type of hardware accelerator adapts automatically to different performance combination requirements, which reduces the production cost of the hardware accelerator.
Embodiment three
Fig. 5 is an application schematic diagram of yet another embodiment of the hardware accelerator of the present invention. This embodiment describes the structure and principle of an FPGA hardware accelerator that cannot be partially reconfigured; Fig. 5 shows the structure and working principle of this hardware accelerator. An FPGA that cannot be partially reconfigured is one whose configuration file can only be reconfigured as a whole. For example, if the FPGA holds a configuration file covering service A and service B, that file can only be replaced as a whole; it is not possible to replace only a part of it, such as the part corresponding to service A.
The principle of the hardware accelerator of this embodiment is broadly the same as that of the embodiment of Fig. 2 to Fig. 4 (hereinafter "the previous embodiment"), so this embodiment is described only briefly, with emphasis on the differences. As shown in Fig. 5, the main feature of this hardware accelerator is that the structure of the configuration files stored in the configuration file storage area differs from that of the previous embodiment.
Specifically, each configuration file in the previous embodiment corresponds to one performance value of a single data processing service; for example, configuration file A1 corresponds to the 100 MBps performance value of data processing service A (the compression service). In this embodiment, by contrast, each loaded configuration file is a combination file covering at least two data processing services, in which the part for each service corresponds to one performance value of that service. For example, "functional performance combination 2" in Fig. 5 is equivalent to the combination of A2 and C3 in Fig. 2: it includes two functions (that is, data processing services), A and C, with performance A2 for A and performance C3 for C. The configuration loading unit loads "functional performance combination 2" as a whole, rather than loading A2 and C3 separately as in the previous embodiment. For example, if the service request comparison determines that the compression service needs to be configured for 100 MBps and the chunking service for 200 MBps, "functional performance combination 2" can be selected; in this combination, A2 corresponds to the compression service at 100 MBps and C3 corresponds to the chunking service at 200 MBps.
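Selecting a combination file can be sketched as a lookup over the stored combinations. In this illustration only "functional performance combination 2" (compression at 100 MBps, chunking at 200 MBps) comes from the text; the other entry, the dictionary layout, and the function `find_combination` are assumptions.

```python
from typing import Dict, Optional

# Combination name -> performance value (MBps) per service: A = compression, C = chunking.
COMBINATIONS: Dict[str, Dict[str, int]] = {
    "functional performance combination 1": {"A": 50, "C": 100},   # hypothetical entry
    "functional performance combination 2": {"A": 100, "C": 200},  # A2 + C3, from the text
}


def find_combination(required: Dict[str, int]) -> Optional[str]:
    """Return the combination whose per-service performance equals the requirement."""
    for name, performance in COMBINATIONS.items():
        if performance == required:
            return name
    return None  # no stored combination fits; the caller must decide what to do


print(find_combination({"A": 100, "C": 200}))  # functional performance combination 2
```

Because the whole file is loaded at once, every required mix of performance values needs its own stored combination, which is the cost of supporting FPGAs without partial reconfiguration.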
In addition, the configuration file structure of functional performance combinations used in this embodiment is also applicable to the previous embodiment; that is, a partially reconfigurable FPGA hardware accelerator can also use combination configuration files. However, because the FPGA in this embodiment cannot be partially reconfigured, the configuration file arrangement of the previous embodiment cannot be applied to this embodiment.
In the hardware accelerator of this embodiment, a service monitoring unit, a configuration loading unit, and related units are provided. The service monitoring unit obtains the performance value corresponding to the obtained request amount and instructs the configuration loading unit to load the FPGA configuration file corresponding to that performance value, so that hardware acceleration for the performance combination determined by the performance values is achieved. The service monitoring unit can monitor service requests in real time, and the corresponding configuration files can be loaded in real time to make adjustments. This solves the problem that a hardware accelerator cannot meet users' differing performance combination requirements: a single type of hardware accelerator adapts automatically to different performance combination requirements, which reduces the production cost of the hardware accelerator.
Embodiment four
Fig. 6 is a schematic flowchart of an embodiment of the hardware acceleration method of the present invention. As shown in Fig. 6, the method may comprise:
601. Obtain the service request amount of a data processing service, compare the service request amount with the preset request amount threshold corresponding to that data processing service, and obtain the performance value of the data processing service that matches the service request amount.
When the service request amount is compared with the preset request amount threshold corresponding to the data processing service, if the service request amount is higher than the request amount threshold, the raised performance value is adopted as the performance value corresponding to the service request amount; if the service request amount is lower than the request amount threshold, the lowered performance value is adopted as the performance value matching the service request amount.
More specifically, the service request amount is compared with the preset request amount threshold corresponding to the data processing service. If the service request amount is higher than the request amount threshold, the raised performance value closest to the service request amount is adopted as the performance value matching the service request amount; if the service request amount is lower than the request amount threshold, the lowered performance value closest to the service request amount is adopted as the performance value corresponding to the service request amount.
602. According to the obtained performance value of the data processing service that matches the service request amount, obtain the FPGA configuration file matching that performance value and load the matching FPGA configuration file, so as to achieve the hardware acceleration corresponding to that FPGA configuration file.
If the FPGA cannot be partially reconfigured, loading the matching FPGA configuration file comprises: loading the matching FPGA configuration file, where the configuration file is a combination file that covers at least two data processing services and one performance value for each of those services, and the performance values of the data processing services in the combination file correspond to the service request amounts determined by the service monitoring unit.
If the FPGA can be partially reconfigured, loading the matching FPGA configuration file comprises: loading separately the configuration files corresponding to the different data processing services, where each loaded configuration file corresponds to one performance value of one data processing service.
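The two loading branches of step 602 can be sketched together. The function `files_to_load`, the file names ending in `.bit`, and the dictionary layouts are hypothetical; the sketch only decides which files would be loaded and does not program any device.

```python
from typing import Dict, List, Tuple


def files_to_load(
    required: Dict[str, int],
    partial_reconfig: bool,
    single_files: Dict[Tuple[str, int], str],
    combination_files: Dict[str, Dict[str, int]],
) -> List[str]:
    """Return the configuration files to load for the required performance values."""
    if partial_reconfig:
        # One file per service, each for one performance value of that service.
        return [single_files[(svc, required[svc])] for svc in sorted(required)]
    # Otherwise a single combination file must cover every service at once.
    for name, performance in combination_files.items():
        if performance == required:
            return [name]
    raise LookupError("no combination file matches the required performance values")


singles = {("compression", 100): "compression_100.bit", ("chunking", 200): "chunking_200.bit"}
combos = {"combo_100_200.bit": {"compression": 100, "chunking": 200}}
need = {"compression": 100, "chunking": 200}
print(files_to_load(need, True, singles, combos))   # ['chunking_200.bit', 'compression_100.bit']
print(files_to_load(need, False, singles, combos))  # ['combo_100_200.bit']
```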
For example, the method for the present embodiment can be performed for the hardware accelerator of any embodiment of the present invention; Concrete principle can be in conjunction with described referring to device embodiment.
In the hardware acceleration method of this embodiment, the performance value corresponding to the obtained request amount is obtained and the FPGA configuration file corresponding to that performance value is loaded, so that hardware acceleration for the performance combination determined by the performance values is achieved. Service requests can be monitored in real time, and the corresponding configuration files can be loaded in real time to make adjustments. This solves the problem that a hardware accelerator cannot meet users' differing performance combination requirements: a single type of hardware accelerator adapts automatically to different performance combination requirements, which reduces the production cost of the hardware accelerator.
Embodiment five
Fig. 7 is a schematic diagram of an application scenario of a hardware accelerator embodiment of the present invention, and Fig. 8 is a schematic diagram of another application scenario of a hardware accelerator embodiment of the present invention. This embodiment briefly describes scenarios in which the hardware accelerator of any embodiment of the present invention can be applied; actual use is not limited to these two scenarios.
Fig. 7 shows a typical arrangement for deduplication/compression inside a storage device. The workflow is as follows. The application server stores data through the storage service interface provided by the storage device. The storage service interface passes the data to the deduplication/compression application module, which places the data in system memory. The deduplication/compression application module calls the interface provided by the accelerator card driver and commands the hardware accelerator card to perform processing such as chunking, hashing, and compression on the data previously placed in system memory. The deduplication/compression application module then reads the result data produced by the accelerator card, compares it with the information already stored on disk, and writes only the new, non-duplicate data to disk.
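The chunk, hash, compress, and compare steps of this workflow can be illustrated in software. This is only a sketch of the data flow: in the scenario described, the accelerator card performs these operations in hardware. Fixed-size chunks, SHA-256 digests, zlib compression, and a dictionary standing in for the disk are all assumptions made for the example.

```python
import hashlib
import zlib
from typing import Dict, List


def chunk(data: bytes, size: int) -> List[bytes]:
    """Split data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedup_and_compress(data: bytes, store: Dict[str, bytes], size: int = 4096) -> List[str]:
    """Store only blocks whose hash is not yet known; return the block hashes in order."""
    recipe = []
    for block in chunk(data, size):
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:                 # compare with what is already stored
            store[digest] = zlib.compress(block)
        recipe.append(digest)
    return recipe


disk: Dict[str, bytes] = {}
payload = b"abcd" * 2048 + b"wxyz" * 1024      # three 4096-byte blocks, two identical
recipe = dedup_and_compress(payload, disk)
print(len(recipe), len(disk))                   # 3 2
```

The recipe lists three blocks but only two are stored, because the first two blocks are identical. The original data can be rebuilt by decompressing the stored blocks in recipe order.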
Fig. 8 shows a typical arrangement for deduplication/compression during link replication between storage devices. The workflow is as follows. The link replication application in storage system A reads the data to be transmitted from disk into memory and calls the deduplication/compression application module to reduce the data. The deduplication/compression application module calls the accelerator card driver and commands the hardware accelerator card to perform processing such as chunking, hashing, and compression on the data previously placed in system memory. The link replication application module passes the reduced data from memory to the link transmission module, which transmits the reduced data to the remote storage system B. The corresponding link replication application on storage system B stores the reduced data on its own disks.
The hardware accelerator of this embodiment obtains the performance value corresponding to the obtained request amount and loads the FPGA configuration file corresponding to that performance value, so that hardware acceleration for the performance combination determined by the performance values is achieved. Service requests can be monitored in real time, and the corresponding configuration files can be loaded in real time to make adjustments. This solves the problem that a hardware accelerator cannot meet users' differing performance combination requirements: a single type of hardware accelerator adapts automatically to different performance combination requirements, which reduces the production cost of the hardware accelerator.
Embodiment six
An embodiment of the present invention also provides a hardware acceleration system, which comprises a field programmable gate array (FPGA) and the hardware accelerator of any embodiment of the present invention. For the specific working principle, refer to the description of the device and method embodiments; details are not repeated here.
The service monitoring unit, the configuration loading control unit, or the FPGA configuration file storage area of the hardware accelerator is located on the FPGA or outside the FPGA.
A person of ordinary skill in the art will appreciate that all or some of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A hardware accelerator, characterized in that it comprises: a service monitoring unit, a configuration loading unit, and a configuration file storage area;
The service monitoring unit is configured to obtain the service request amount of a data processing service, compare the service request amount with the preset request amount threshold corresponding to the data processing service, and obtain the performance value of the data processing service that matches the service request amount;
The configuration file storage area is configured to store field programmable gate array (FPGA) configuration files, the FPGA configuration files comprising configuration files respectively corresponding to different data processing services, and the configuration files of each data processing service comprising configuration files respectively corresponding to different service performance values;
The configuration loading unit is configured to obtain, according to the obtained performance value of the data processing service that matches the service request amount, the FPGA configuration file matching the performance value of the data processing service, and to load the matching FPGA configuration file, so as to achieve the hardware acceleration corresponding to that FPGA configuration file.
2. The hardware accelerator according to claim 1, characterized in that the service monitoring unit comprises:
A comparison subunit, configured to compare the service request amount with the preset request amount threshold corresponding to the data processing service;
A first processing subunit, configured to, when the service request amount is higher than the request amount threshold, adopt the raised performance value closest to the service request amount as the performance value matching the service request amount;
A second processing subunit, configured to, when the service request amount is lower than the request amount threshold, adopt the lowered performance value closest to the service request amount as the performance value corresponding to the service request amount.
3. The hardware accelerator according to claim 1, characterized in that, when each configuration file is a combination file corresponding to at least two performance values, each of which corresponds to one data processing service, correspondingly:
The loading of the matching FPGA configuration file by the configuration loading unit specifically comprises loading the combination file that corresponds to the data processing services and matches the performance values of the data processing services.
4. The hardware accelerator according to claim 1, characterized in that, when each configuration file corresponds to one performance value, and that performance value corresponds to one data processing service, correspondingly:
The loading of the matching FPGA configuration file by the configuration loading unit specifically comprises loading separately the configuration files that correspond to the data processing services and match the obtained performance values of the data processing services.
5. A hardware acceleration method, characterized in that it comprises:
Obtaining the service request amount of a data processing service, comparing the service request amount with the preset request amount threshold corresponding to the data processing service, and obtaining the performance value of the data processing service that matches the service request amount;
Obtaining, according to the obtained performance value of the data processing service that matches the service request amount, the FPGA configuration file matching the performance value of the data processing service, and loading the matching FPGA configuration file, so as to achieve the hardware acceleration corresponding to that FPGA configuration file; the FPGA configuration files comprise configuration files respectively corresponding to different data processing services, and the configuration files of each data processing service comprise configuration files respectively corresponding to different service performance values.
6. The hardware acceleration method according to claim 5, characterized in that comparing the service request amount with the preset request amount threshold corresponding to the data processing service and obtaining the performance value of the data processing service that matches the service request amount comprises:
Comparing the service request amount with the preset request amount threshold corresponding to the data processing service;
If the service request amount is higher than the request amount threshold, adopting the raised performance value closest to the service request amount as the performance value matching the service request amount;
If the service request amount is lower than the request amount threshold, adopting the lowered performance value closest to the service request amount as the performance value corresponding to the service request amount.
7. The hardware acceleration method according to claim 5, characterized in that, when each configuration file is a combination file corresponding to at least two performance values, each of which corresponds to one data processing service, loading the matching FPGA configuration file comprises:
Loading the combination file that corresponds to the data processing services and matches the obtained performance values of the data processing services.
8. The hardware acceleration method according to claim 5, characterized in that, when each configuration file corresponds to one performance value, and that performance value corresponds to one data processing service, loading the matching FPGA configuration file comprises:
Loading separately the sub-configuration files that correspond to the data processing services and match the obtained performance values of the data processing services.
9. A hardware acceleration system, characterized in that it comprises: a field programmable gate array (FPGA) and the hardware accelerator according to any one of claims 1 to 4.
10. The hardware acceleration system according to claim 9, characterized in that the service monitoring unit, the configuration loading control unit, or the FPGA configuration file storage area of the hardware accelerator is located on the FPGA or outside the FPGA.
CN201110459423.6A 2011-12-31 2011-12-31 Hardware-accelerated methods, devices and systems Active CN103399758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110459423.6A CN103399758B (en) 2011-12-31 2011-12-31 Hardware-accelerated methods, devices and systems

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110459423.6A CN103399758B (en) 2011-12-31 2011-12-31 Hardware-accelerated methods, devices and systems

Publications (2)

Publication Number Publication Date
CN103399758A true CN103399758A (en) 2013-11-20
CN103399758B CN103399758B (en) 2016-11-23

Family

ID=49563392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110459423.6A Active CN103399758B (en) 2011-12-31 2011-12-31 Hardware-accelerated methods, devices and systems

Country Status (1)

Country Link
CN (1) CN103399758B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104899085A (en) * 2015-05-29 2015-09-09 华为技术有限公司 Data processing method and apparatus
CN106777729A (en) * 2016-12-26 2017-05-31 中核控制系统工程有限公司 A kind of algorithms library simulation and verification platform implementation method based on FPGA
US20170300437A1 (en) * 2014-12-31 2017-10-19 Huawei Technologies Co., Ltd. Service acceleration method and apparatus
WO2018086436A1 (en) * 2016-11-09 2018-05-17 华为技术有限公司 Accelerator loading method and system, and accelerator loading apparatus
CN108319563A (en) * 2018-01-08 2018-07-24 华中科技大学 A kind of network function acceleration method and system based on FPGA
CN110334801A (en) * 2019-05-09 2019-10-15 苏州浪潮智能科技有限公司 A kind of hardware-accelerated method, apparatus, equipment and the system of convolutional neural networks
US11221866B2 (en) 2016-11-09 2022-01-11 Huawei Technologies Co., Ltd. Accelerator loading method, system, and apparatus

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050014559A1 (en) * 2003-07-16 2005-01-20 Igt Secured verification of configuration data for field programmable gate array devices
EP1553502A2 (en) * 2003-10-17 2005-07-13 Kabushiki Kaisha Toshiba Reconfigurable signal processing module
CN101286738A (en) * 2008-05-15 2008-10-15 华为技术有限公司 Method, device and system for loading logic files based on equipment information
CN101441574A (en) * 2007-11-20 2009-05-27 中兴通讯股份有限公司 Multiple-FPGA logical loading method in embedded system
CN101452502A (en) * 2008-12-30 2009-06-10 华为技术有限公司 Method for loading on-site programmable gate array FPGA, apparatus and system
CN102147735A (en) * 2010-02-10 2011-08-10 华为技术有限公司 Interface single board and business logic loading method

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170300437A1 (en) * 2014-12-31 2017-10-19 Huawei Technologies Co., Ltd. Service acceleration method and apparatus
US10545896B2 (en) * 2014-12-31 2020-01-28 Huawei Technologies Co., Ltd. Service acceleration method and apparatus
US10432506B2 (en) 2015-05-29 2019-10-01 Huawei Technologies Co., Ltd. Data processing method and apparatus
CN104899085B (en) * 2015-05-29 2018-06-26 华为技术有限公司 A kind of data processing method and device
CN104899085A (en) * 2015-05-29 2015-09-09 华为技术有限公司 Data processing method and apparatus
WO2018086436A1 (en) * 2016-11-09 2018-05-17 华为技术有限公司 Accelerator loading method and system, and accelerator loading apparatus
US11221866B2 (en) 2016-11-09 2022-01-11 Huawei Technologies Co., Ltd. Accelerator loading method, system, and apparatus
US11416267B2 (en) 2016-11-09 2022-08-16 Huawei Technologies Co., Ltd. Dynamic hardware accelerator selection and loading based on acceleration requirements
CN106777729A (en) * 2016-12-26 2017-05-31 中核控制系统工程有限公司 A kind of algorithms library simulation and verification platform implementation method based on FPGA
CN108319563A (en) * 2018-01-08 2018-07-24 华中科技大学 A kind of network function acceleration method and system based on FPGA
CN108319563B (en) * 2018-01-08 2020-01-03 华中科技大学 Network function acceleration method and system based on FPGA
US10678584B2 (en) 2018-01-08 2020-06-09 Huazhong University Of Science And Technology FPGA-based method for network function accelerating and system thereof
CN110334801A (en) * 2019-05-09 2019-10-15 苏州浪潮智能科技有限公司 A kind of hardware-accelerated method, apparatus, equipment and the system of convolutional neural networks

Also Published As

Publication number Publication date
CN103399758B (en) 2016-11-23

Similar Documents

Publication Publication Date Title
CN103399758A (en) Hardware accelerating method, device and system
CN103049220B (en) Storage controlling method, memory control device and solid-state memory system
CN102859499B (en) Computer system and storage controlling method thereof
CN104160384B (en) System and method for dynamic priority control
US7685342B2 (en) Storage control apparatus and method for controlling number of commands executed in storage control apparatus
US11061580B2 (en) Storage device and controllers included in storage device
CN105224424B (en) Backup method and system
CN103136074A (en) Data storage method and data storage system of multiple disk array systems
CN106547612A (en) Multi-task processing method and device
KR101200998B1 (en) Hybrid raid controller having multi pci bus switching
CN101989231A (en) Erasure coded data storage capacity and power management
CN103559072A (en) Method and system for implementing bidirectional auto scaling service of virtual machines
CN105630638A (en) Equipment and method for distributing cache for disk array
CN103888501A (en) Virtual machine migration method and device
CN102402422B (en) Processor assembly and memory sharing method thereof
CN103392165B (en) Storage system
CN103631894A (en) Dynamic copy management method based on HDFS
CN102870374B (en) Load-sharing method and apparatus, and single board
CN104424052A (en) Automatic redundant distributed storage system and method
CN101533336B (en) Redundant array of independent disks memory system and method thereof
WO2013136366A1 (en) Storage apparatus and program update method
CN101827120A (en) Cluster storage method and system
KR101200997B1 (en) Raid controller having multi pci bus switching
CN105243026A (en) Memory access control method and apparatus for terminal device
KR101465447B1 (en) Method for external merge sort, system for external merge sort and distributed processing system for external merge sort

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220927

Address after: No. 1899 Xiyuan Avenue, High-tech Zone (West District), Chengdu, Sichuan 610041

Patentee after: Chengdu Huawei Technologies Co.,Ltd.

Address before: 611731 Qingshui River District, Chengdu High-tech Zone, Sichuan, China

Patentee before: HUAWEI DIGITAL TECHNOLOGIES (CHENG DU) Co.,Ltd.