WO2012149776A1

WO2012149776A1 - Method and apparatus for storing data

Info

Publication number: WO2012149776A1
Application number: PCT/CN2011/080284
Authority: WO
Inventors: 张振龙; 巩玉旺
Original assignee: 华为技术有限公司
Priority date: 2011-09-28
Filing date: 2011-09-28
Publication date: 2012-11-08
Also published as: CN102388374A

Abstract

An embodiment of the present invention provides a method and apparatus for storing data. The method comprises: establishing a hotspot data model based on an original data record; screening for hotspot data from the original data record and/or a new data record according to the hotspot data model; and storing the screened hotspot data into a first storage device. According to the embodiment of the present invention, hotspot data can be screened out at the data record level according to the hotspot data model, and the screened hotspot data is stored into a specific storage device, thus implementing an effective storage policy at the data record level.

Description

Method and device for storing data

Embodiments of the present invention relate to the field of computer technology and, more particularly, to a method and apparatus for storing data.

Background

The emergence of new storage devices has changed the traditional storage architecture and prompted corresponding improvements to databases. For example, the new high-speed storage devices SSD (Solid State Disk) and PCM (Phase Change Memory) read and write faster than ordinary disks but slower than memory, retain data when power is lost, and are often used as a second-level cache for a database. How to identify the hotspot data (Hot Data) that needs to be cached, and how to organize the data on the new high-speed storage device, are important problems that must be solved to implement data storage or caching effectively.

At present, identification and pre-identification of hotspot data at the block level are relatively well developed. In the prior art, hotspot data refers to data that is frequently used while the database of a server is running, and generally refers to data blocks; that is, hotspot data is mainly stored in the form of data blocks. The algorithms for identifying such hotspot data are relatively mature. For example, whether a data block is hotspot data can be determined by counting the number of hits on the data block. This storage method identifies and stores hotspot data at the data block level (that is, the lower layer of the database) and cannot implement an effective storage strategy at the data record level (the upper layer of the database).

Summary of the invention

Embodiments of the present invention provide a method and apparatus for storing data, which can implement an effective storage strategy at the data record level.

In one aspect, a method for storing data is provided, including: establishing a hotspot data model based on an original data record; filtering hotspot data from the original data record or a new data record according to the hotspot data model; and storing the filtered hotspot data into a first storage device.

In another aspect, an apparatus for storing data is provided, including: an establishing module, configured to establish a hotspot data model based on an original data record; a screening module, configured to filter hotspot data from the original data record or a new data record according to the hotspot data model; and a storage module, configured to store the filtered hotspot data into a first storage device.

In the embodiments of the present invention, hotspot data can be filtered at the data record level according to the hotspot data model, and the filtered hotspot data is stored in a specific storage device, thereby implementing an effective storage strategy at the data record level.

Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may obtain other drawings from them without creative effort.

FIG. 1 is a schematic flow chart of a method of storing data according to an embodiment of the present invention.

FIG. 2 is a schematic flow chart of a method of storing data according to another embodiment of the present invention.

FIG. 3 is a schematic flow chart of a process of storing data in accordance with an embodiment of the present invention.

FIG. 4 is a structural schematic diagram of an apparatus for storing data in accordance with one embodiment of the present invention.

FIG. 5 is a structural schematic diagram of an apparatus for storing data according to another embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of the present invention.

It should be understood that the technical solution of the present invention can be applied in various fields that use computers, for example telecommunications, e-commerce, and social platforms, and especially in applications involving a large amount of data.

In practical database applications, queries over large volumes of data are common. For example, data queries account for more than 70% of database usage, and searching a large amount of data on disk is costly, whereas the data that users actually query is usually only about 20% of the data in a data table.

At present, the hotspot data identified by block-level hotspot identification and pre-identification techniques is usually stored in the cache in the form of blocks, so it cannot be optimized for specific scenarios and applications. In addition, although hotspot data related to data records can be identified according to the creation time of the data records (for example, a data block created within a preset time period can be treated as hotspot data and cached), determining hotspot data based only on creation time is not flexible enough, because it relies on a single decision factor. Embodiments of the present invention store or cache hotspot data on the high-speed storage device at the data record level to improve the query efficiency of the database.

FIG. 1 is a schematic flow diagram of a method 100 of storing data in accordance with one embodiment of the present invention. The method 100 of FIG. 1 can be performed by a server.

110. Establish a hotspot data model based on the original data record.

Hotspot data in accordance with embodiments of the present invention refers to data records that are often used in a database. In a relational database, a data record is a set of related information corresponding to a row of information in a data source, which can be a row in a data table, each row including n attributes (fields or data items). The above original data record may be a data record stored on an original storage device (for example, a normal disk).

The hotspot data model according to an embodiment of the present invention may be a function model for identifying hotspot data. For example, the hotspot data model may be generated automatically by an artificial intelligence method (for example, a Bayesian classification algorithm), and the model may be updated according to actual application conditions. The hotspot data model is used to classify data records into hotspot data and non-hotspot data.

120. Filter hotspot data from the original data record or the new data record according to the hotspot data model, or filter hotspot data from both the original data record and the new data record.

According to an embodiment of the present invention, the original data records are input into the hotspot data model to determine whether each of them is hotspot data or non-hotspot data. Further, a newly stored data record can also be input into the hotspot data model to determine whether the new data record is hotspot data.

130. Store the filtered hotspot data into the first storage device.

For example, the first storage device may be a high speed storage device or a storage device that acts as a cache or memory.

In the embodiment of the present invention, hotspot data can be filtered at the data record level according to the hotspot data model, and the filtered hotspot data is stored in a specific storage device, thereby implementing an effective storage strategy at the data record level. In addition, because the identification and pre-identification of hotspot data are performed at the data record level, they are transparent to the application.

According to another embodiment of the present invention, the method further includes: storing the data records in the original data record that are not hotspot data into a second storage device, where the storage rate of the first storage device is higher than the storage rate of the second storage device.

For example, in accordance with an embodiment of the present invention, storing selected hotspot data in a storage device having a higher storage rate (e.g., a high speed storage device, cache, or memory) can significantly improve query efficiency.

According to an embodiment of the present invention, the first storage device is a high-speed storage device, and establishing the hotspot data model in 110 includes: extracting sample data records from the original data record; determining the number of hits of the sample data records; and using the sample data records as the data source for establishing the hotspot data model, and establishing the hotspot data model based on the number of hits.

For example, to reduce the overhead of modeling, a certain number of data records can be randomly extracted as samples from the original data records on the ordinary disk, and the number of times these samples are hit within a preset time can be counted. The sample data records are then divided into hotspot data and non-hotspot data according to the number of hits. The classified hotspot data and non-hotspot data can then be analyzed by an artificial intelligence method to determine how the attribute values of a data record influence its classification as hotspot data, thereby obtaining the hotspot data model.
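The sampling and labeling described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `label_samples`, the in-memory dictionaries, the fixed random seed, and the `hot_fraction` cut-off are all assumptions introduced here.

```python
import random


def label_samples(records, hit_counts, sample_size, hot_fraction, seed=0):
    """Randomly sample records and label the most-hit fraction as hotspot data.

    records:    dict mapping record id -> attribute dict (hypothetical layout)
    hit_counts: dict mapping record id -> hits counted in the preset time
    Returns a list of (attributes, is_hot) training pairs, most-hit first.
    """
    rng = random.Random(seed)
    sample_ids = rng.sample(sorted(records), min(sample_size, len(records)))
    # Rank the sampled records by hit count, most-hit first.
    ranked = sorted(sample_ids, key=lambda rid: hit_counts.get(rid, 0), reverse=True)
    n_hot = int(len(ranked) * hot_fraction)
    hot_ids = set(ranked[:n_hot])
    return [(records[rid], rid in hot_ids) for rid in ranked]


records = {
    168: {"name": "N168", "sex": "F", "age": 18},
    520: {"name": "N520", "sex": "F", "age": 14},
    777: {"name": "N777", "sex": "M", "age": 17},
    1234: {"name": "N1234", "sex": "M", "age": 30},
    5202: {"name": "N5202", "sex": "F", "age": 20},
    9999: {"name": "N9999", "sex": "M", "age": 12},
}
hits = {168: 100, 520: 6, 777: 9, 1234: 50, 5202: 123, 9999: 3}
labeled = label_samples(records, hits, sample_size=6, hot_fraction=0.5)
```

The labeled pairs returned here would then serve as training input for whichever classifier is used to build the hotspot data model.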

According to another embodiment of the present invention, the method further includes: re-executing the process of establishing the hotspot data model if the hotspot data model expires, and updating the hotspot data in the first storage device according to the re-established hotspot data model.

According to the embodiment of the present invention, the expiration of the hotspot data model includes: the lifetime of the hotspot data model exceeds a preset time or the hit rate of the hotspot data in the first storage device is too low.

For example, after the database is running for a period of time, the data records in the database may change. Accordingly, the hotspot data model established based on the original data records will expire. In addition, the hit rate of hotspot data in the high-speed storage device may be too low. In this case, the sample needs to be re-extracted from the changed data record, and a new hotspot data model is established based on the extracted samples, so as to maintain an effective storage strategy and efficient query efficiency.
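The two expiry conditions can be expressed as a small check. The sketch below is only illustrative: the function name `model_expired`, its parameters, and the choice to treat a store with no lookups as healthy are assumptions, and the thresholds are left to the caller.

```python
def model_expired(created_at, now, max_age_seconds, hits, lookups, min_hit_rate):
    """Return True if the model is too old or the fast-store hit rate is too low."""
    too_old = (now - created_at) > max_age_seconds
    # Assumption: with no lookups yet there is no evidence of a poor hit rate.
    hit_rate = hits / lookups if lookups else 1.0
    return too_old or hit_rate < min_hit_rate


# Example: a model created at t=0, checked at t=10 with a 50-second lifetime.
expired = model_expired(created_at=0, now=10, max_age_seconds=50,
                        hits=4, lookups=10, min_hit_rate=0.5)
```

When the check returns True, the server would re-sample the changed data records and rebuild the model as described above.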

FIG. 2 is a schematic flow diagram of a method 200 of storing data in accordance with one embodiment of the present invention. The method 200 of FIG. 2 can be performed by a server. Steps 210, 220, and 230 of FIG. 2 are similar to steps 110, 120, and 130 of FIG. 1 and are not described again in detail.

210. Establish a hotspot data model based on the original data record.

220. Filter hotspot data from the original data record or the new data record according to the hotspot data model, or filter hotspot data from both the original data record and the new data record.

230. Store the filtered hotspot data into the first storage device, and store the data records in the original data record that are not hotspot data into the second storage device, where the storage rate of the first storage device is higher than the storage rate of the second storage device.

240. When a query request is received, optimize the query to generate a corresponding execution plan.

Usually, after receiving a query request, the query optimizer of the server generates and evaluates multiple execution plans and finally selects the lowest-cost (for example, fastest-running or least resource-consuming) execution plan for the query. For example, when performing query optimization, a query can be executed on each of the high-speed storage device and the normal disk, and taking the union of the two result sets can be used as the final execution plan. Embodiments of the present invention are not limited thereto; for example, when a query request is received, an execution plan stored in the cache may be used directly as the final execution plan for the query.

250. Acquire data from the first storage device and the second storage device according to the foregoing execution plan.
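The simplest plan mentioned in step 240, running the query on both devices and combining the results, can be sketched as follows. The two stores are modeled as plain dictionaries keyed by record ID, and `query_both` is a hypothetical helper, not part of any real database API.

```python
def query_both(fast_store, slow_store, predicate):
    """Run the same predicate on both stores and return the combined result set."""
    results = {}
    for store in (slow_store, fast_store):
        for record_id, record in store.items():
            if predicate(record):
                # Keying by record ID keeps the union free of duplicates.
                results[record_id] = record
    return results


fast = {2: {"sex": "F", "age": 17}}
slow = {1: {"sex": "M", "age": 16}, 3: {"sex": "M", "age": 18}}
under_18 = query_both(fast, slow, lambda rec: rec["age"] < 18)
```

A real query optimizer would compare this plan with alternatives by cost; the sketch shows only the union itself.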

In the embodiment of the present invention, hotspot data can be filtered at the data record level according to the hotspot data model, and the filtered hotspot data is stored in a specific storage device, thereby implementing an effective storage strategy at the data record level. In addition, because hotspot data is identified and pre-identified at the data record level, the identification and pre-identification are transparent to the application, and caching the hotspot data in a specific storage device at the upper layer of the database helps to achieve query optimization.

According to another embodiment of the present invention, the method further includes: re-executing the process of establishing the hotspot data model if the hotspot data model expires or the hit rate of the hotspot data in the first storage device is too low, and updating the hotspot data in the first storage device according to the re-established hotspot data model.

Embodiments of the present invention are described in more detail below with reference to specific examples. FIG. 3 is a schematic flow chart of a process of storing data in accordance with an embodiment of the present invention.

As shown in Table 1, the source data table (Table) in the database lists 9999999 data records, each containing four attributes (fields or data items): identification, name, gender, and age. The source data table can be stored on a normal disk. In different applications or with different data table structures, the columns used to decide hotspot data may differ. This example uses the gender and age attributes for the decision. For example, a configurable item can be provided so that, when creating the table, the user can specify at the database application level which columns are used to decide hotspot data. Embodiments of the present invention are not limited thereto; for example, which columns are used to decide hotspot data may also be determined statistically.

310. Extract a sample data record from a data record of a source data table of an original storage device (for example, a normal disk). For example, in the initial stage of establishing a hotspot data model, random sampling statistics are performed on a large number of data records, and a part of the data (for example, 20% of data) in the above source data table is extracted as a sample data record. The sample data can be retained in the original storage device and identified to be distinguished from other data, thereby being logically abstracted into a table (hereinafter referred to as a sample data table). Alternatively, the sample data record in the high speed storage device can be updated every predetermined time (e.g., one day or one week).

320. Determine the number of hits for each sample data record. For example, a statistics column is added to the sample data table to count the number of times each data record is hit, as shown in Table 2.

Table 2

Identification (ID)   Name (Name)   Gender (Sex)   Age (Age)   Hits (CNT)
168                   N168          Female         18          100
520                   N520          Female         14          6
777                   N777          Male           17          9
1234                  N1234         Male           30          50
5202                  N5202         Female         20          123
9999                  N9999         Male           12          3
...                   ...           ...            ...         ...
9874574               ...           ...            ...         ...

330. Use the sample data records as the data source for establishing the hotspot data model, and establish the hotspot data model based on the number of hits. For example, after a preset time (which can be set according to the specific application, such as one day or one week), the sample data records are sorted by their number of hits, and the data records ranked in the top 30 percent by hits are designated as hotspot data. Embodiments of the present invention are not limited thereto, and the percentage may be adjusted as needed. For example, an artificial intelligence method (for example, a Bayesian classification algorithm) can be used for intelligent analysis; the process of intelligent analysis using the Bayesian classification algorithm is also called the learning process or training process of the hotspot data. The specific intelligent analysis process is described in detail later.

340. Classify the data records in the source data table according to the hotspot data model to filter out hotspot data. For example, the data records in the source data table can be used as input to the hotspot data model, and the model outputs a division of these data records into hotspot data and non-hotspot data.

350. Store the filtered hotspot data into the high-speed storage device, and store the non-hotspot data into the original storage device. For example, the filtered hotspot data is stored in the high-speed storage device at the data record level, and the non-hotspot data is stored on the normal disk. As shown in Table 3, data records whose gender is female and whose age is under 20 are identified as hotspot data and stored in the high-speed storage device. As shown in Table 4, non-hotspot data is stored on the normal disk.

Table 3

Identification (ID)   Name (Name)   Gender (Sex)   Age (Age)
2                     N2            Female         17
5                     N5            Female         20
...                   ...           ...            ...
9986454               ...           Female         21

Table 4

Identification (ID)   Name (Name)   Gender (Sex)   Age (Age)
1                     N1            Male           16
3                     N3            Male           18
4                     N4            Male           19
6                     N6            Male           20
...                   ...           ...            ...
9999999               ...           Male           ...

360. Determine whether a new data record is hotspot data or non-hotspot data according to the hotspot data model. After the learning or training of the hotspot data is completed, if a new data record (for example, the data records with identifications 10000000 and 10000001 in Table 5 and Table 6) needs to be stored, the hotspot data model can be used to judge whether the data record is hotspot data or non-hotspot data. If it is hotspot data, it is stored in the high-speed storage device; if it is not, it is stored on the disk. For example, the data record in Table 5 is stored on the normal disk, and the data record in Table 6 is stored in the high-speed storage device.

Table 5

Identification (ID)   Name (Name)   Gender (Sex)   Age (Age)
10000000              N10000000     Female         60

Table 6

Identification (ID)   Name (Name)   Gender (Sex)   Age (Age)
10000001              N10000001     Female         20
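The routing of a record to one device or the other, as in steps 350 and 360, can be sketched as below. The stores are plain dictionaries, `route` is a hypothetical helper, and `stand_in_rule` is only a placeholder for the trained hotspot data model, chosen here so that the two records of Table 5 and Table 6 land where the tables place them.

```python
def route(record, is_hot, fast_store, slow_store):
    """Store a record in the fast store if the model marks it hot, else in the slow store."""
    key = record["id"]
    if is_hot(record):
        fast_store[key] = record
        slow_store.pop(key, None)  # a record lives in exactly one store
    else:
        slow_store[key] = record
        fast_store.pop(key, None)


def stand_in_rule(record):
    # Placeholder for the hotspot data model (an assumption, not the trained model).
    return record["sex"] == "F" and record["age"] <= 20


fast_store, slow_store = {}, {}
route({"id": 10000000, "name": "N10000000", "sex": "F", "age": 60},
      stand_in_rule, fast_store, slow_store)
route({"id": 10000001, "name": "N10000001", "sex": "F", "age": 20},
      stand_in_rule, fast_store, slow_store)
```

Because the classifier is passed in as a function, the same routing code can be reused unchanged after the model is rebuilt.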

370. When a query request is received, optimize the query to generate a corresponding execution plan, and obtain data from the high-speed storage device and the original storage device according to the execution plan. For example, if a query request is received, the query can be optimized at the database query optimizer level to generate a corresponding execution plan (the most straightforward optimization here is to execute the query statement on the high-speed storage device and on the original storage device separately and take the union of the result sets), and the corresponding data is obtained from the high-speed storage device and the original storage device according to the execution plan.

380. If the hotspot data model expires, for example because the hit rate of the hotspot data in the high-speed storage device is too low or the lifetime of the hotspot data model exceeds a preset time, re-execute the process of establishing the hotspot data model and update the hotspot data in the high-speed storage device. For example, after a period of time the hotspot data may change, so that the original hotspot data is no longer hot. At a preset interval (for example, once a day), during a non-busy period, the hit statistics are checked; when, for example, the hit rate of the hotspot data in the high-speed storage device is less than 50%, the process of establishing the hotspot data model is re-executed, and the hotspot data is updated (or refreshed) according to the re-established hotspot data model. For example, the data in the high-speed storage device that matches the re-established hotspot data model remains there and the rest is moved to the normal disk; then the data on the normal disk that matches the hotspot data model is moved to the high-speed storage device, and the rest remains on the normal disk.

The following uses the naive Bayes classification method as an example to describe how the hotspot data model is established. For convenience of description, the following procedure uses only 10 samples and selects the gender and age attributes as the basis for deciding hotspot data. As shown in Table 7, the first and second columns are the gender and age attributes of each sample, and the third column indicates whether the corresponding data record is hotspot data used for training (or learning) (hereinafter referred to as training hotspot data). In addition, the threshold value 20 of the age attribute may be the average of the ages in the data table.

Table 7

Gender (Sex)   Age (Age) < 20?   Whether it is hot data (H)
Female (F)     25                Yes
Female (F)     17                Yes
Male (M)       19                No
Male (M)       30                No
Female (F)     14                Yes
Female (F)     18                Yes
Male (M)       23                No
Female (F)     40                No
Female (F)     17                Yes
Male (M)       30                Yes

The naive Bayes classification formula is $v_{NB} = \arg\max_{v_j \in V} P(v_j) \prod_i P(a_i \mid v_j)$, where $v_{NB}$ is the target value output by the naive Bayes classification method, that is, the class that maximizes the classification function; $v_j \in V = \{\text{Yes}, \text{No}\}$ is the target value of each training sample; and $a_i$ is the value of each attribute used to train on the sample data. The naive Bayes classification formula of this example can be written as follows:

$h = \arg\max_{h_j \in \{\text{Yes}, \text{No}\}} P(h_j)\, P(\text{Sex} \mid h_j)\, P(\text{Age} \mid h_j)$, where $h$ is the class (hotspot data or non-hotspot data) with the larger probability for a data record, and $h_j$ indicates whether each sample data record is hotspot data or non-hotspot data. The parameters of the hotspot data model obtained from Table 7 are as follows: $P(H=\text{Yes}) = 6/10 = 0.6$, $P(H=\text{No}) = 4/10 = 0.4$, $P(\text{Sex}=F \mid H=\text{Yes}) = 5/6$, $P(\text{Sex}=F \mid H=\text{No}) = 1/4$, $P(\text{Sex}=M \mid H=\text{Yes}) = 1/6$, $P(\text{Sex}=M \mid H=\text{No}) = 3/4$, $P(\text{Age}<20 \mid H=\text{Yes}) = 4/6$, $P(\text{Age}<20 \mid H=\text{No}) = 1/4$, $P(\text{Age}\ge 20 \mid H=\text{Yes}) = 2/6$, and $P(\text{Age}\ge 20 \mid H=\text{No}) = 3/4$.

Using these parameters, it can be determined whether a given data record is hotspot data or non-hotspot data. For example, the gender attribute of data record 1 is female and its age attribute is 14. If data record 1 is hotspot data, then $P(H=\text{Yes})\,P(\text{Sex}=F \mid H=\text{Yes})\,P(\text{Age}<20 \mid H=\text{Yes}) = 0.6 \times 5/6 \times 4/6 = 0.3333$; if data record 1 is non-hotspot data, then $P(H=\text{No})\,P(\text{Sex}=F \mid H=\text{No})\,P(\text{Age}<20 \mid H=\text{No}) = 0.4 \times 1/4 \times 1/4 = 0.025$. The larger value, 0.3333, is obtained for $H=\text{Yes}$, so data record 1 is most likely hotspot data.

As another example, the gender attribute of data record 2 is male and its age attribute is 16. If data record 2 is hotspot data, then $P(H=\text{Yes})\,P(\text{Sex}=M \mid H=\text{Yes})\,P(\text{Age}<20 \mid H=\text{Yes}) = 0.6 \times 1/6 \times 4/6 = 0.0667$; if data record 2 is non-hotspot data, then $P(H=\text{No})\,P(\text{Sex}=M \mid H=\text{No})\,P(\text{Age}<20 \mid H=\text{No}) = 0.4 \times 3/4 \times 1/4 = 0.075$. The larger value, 0.075, is obtained for $H=\text{No}$, so data record 2 is most likely non-hotspot data.
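The worked example can be reproduced with a few lines of Python. This is a minimal sketch of naive Bayes over the ten samples of Table 7, using exact fractions and no smoothing; the names `SAMPLES`, `train`, `score`, and `classify` are illustrative and do not come from the source.

```python
from collections import Counter
from fractions import Fraction

AGE_THRESHOLD = 20  # threshold of the age attribute used in Table 7

# The ten training samples of Table 7: (sex, age, is_hot).
SAMPLES = [
    ("F", 25, True), ("F", 17, True), ("M", 19, False), ("M", 30, False),
    ("F", 14, True), ("F", 18, True), ("M", 23, False), ("F", 40, False),
    ("F", 17, True), ("M", 30, True),
]


def train(samples):
    """Count class frequencies and per-class attribute frequencies."""
    class_counts = Counter(hot for _, _, hot in samples)
    sex_counts = Counter((sex, hot) for sex, _, hot in samples)
    age_counts = Counter((age < AGE_THRESHOLD, hot) for _, age, hot in samples)
    return class_counts, sex_counts, age_counts, len(samples)


def score(model, sex, age, hot):
    """P(H=hot) * P(Sex=sex | H=hot) * P(Age bucket | H=hot)."""
    class_counts, sex_counts, age_counts, total = model
    n = class_counts[hot]
    return (Fraction(n, total)
            * Fraction(sex_counts[(sex, hot)], n)
            * Fraction(age_counts[(age < AGE_THRESHOLD, hot)], n))


def classify(model, sex, age):
    """Return True (hotspot) or False (non-hotspot), whichever scores higher."""
    return max((True, False), key=lambda hot: score(model, sex, age, hot))


model = train(SAMPLES)
record_1_is_hot = classify(model, "F", 14)
record_2_is_hot = classify(model, "M", 16)
```

With these samples, `score(model, "F", 14, True)` evaluates to 1/3 and `score(model, "M", 16, False)` to 3/40, matching the values 0.3333 and 0.075 in the text. A production classifier would normally add smoothing so that an attribute value unseen in one class does not force a zero score.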

FIG. 4 is a structural schematic diagram of an apparatus 400 for storing data in accordance with one embodiment of the present invention. The apparatus of FIG. 4 may be a server, including: an establishing module 410, a screening module 420, and a storage module 430.

The establishing module 410 establishes a hotspot data model based on the original data records. The screening module 420 filters hotspot data from the original data record or the new data record according to the hotspot data model, or filters hotspot data from both the original data record and the new data record. The storage module 430 stores the filtered hotspot data into the first storage device.

According to another embodiment of the present invention, the storage module 430 stores the data records in the original data record that are not hotspot data into the second storage device, where the storage rate of the first storage device is higher than the storage rate of the second storage device.

According to another embodiment of the present invention, the establishing module 410 further re-executes the above process of establishing the hotspot data model if the hotspot data model expires or the hit rate of the hotspot data in the first storage device is too low, and the hotspot data in the first storage device is updated according to the re-established hotspot data model.

According to an embodiment of the invention, the first storage device is a high-speed storage device, and the establishing module 410 extracts sample data records from the original data record, determines the number of hits of the sample data records, uses the sample data records as the data source for establishing the hotspot data model, and establishes the hotspot data model based on the number of hits.

For the operations and functions of the units of the apparatus 400, reference may be made to steps 110, 120, and 130 of the method of FIG. 1; to avoid repetition, details are not described here again.

FIG. 5 shows a structural schematic diagram of an apparatus 500 for storing data in accordance with another embodiment of the present invention. The apparatus of FIG. 5 may be a server, including: an establishing module 510, a screening module 520, a storage module 530, an optimization module 540, and an acquisition module 550. The establishing module 510, the screening module 520, and the storage module 530 of the apparatus 500 of FIG. 5 are similar to the establishing module 410, the screening module 420, and the storage module 430 of FIG. 4 and are not described here again.

The optimization module 540, when receiving the query request, optimizes the query to generate a corresponding execution plan. The acquisition module 550 acquires data from the first storage device and the second storage device, respectively, according to the execution plan described above.

According to another embodiment of the present invention, the establishing module 510 further re-executes the above process of establishing the hotspot data model if the hotspot data model expires or the hit rate of the hotspot data in the first storage device is too low, and the hotspot data in the first storage device is updated according to the re-established hotspot data model.

For the operations and functions of the units of the apparatus 500, reference may be made to steps 210, 220, 230, and 240 of the method of FIG. 2; to avoid repetition, details are not described here again.

According to an embodiment of the present invention, pre-identification of hotspot data is performed at the upper layer of the database, which is transparent to the application layer and reduces the complexity of application development. In addition, using the high-speed storage device as a storage device or cache device at the upper layer of the database facilitates the decisions of the query optimizer, and pre-identifying hotspot data for newly generated data records can improve query efficiency.

One of ordinary skill in the art will recognize that the exemplary units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the solution. A person skilled in the art can use different methods to implement the described functions for each particular application, but such implementation should not be considered to be beyond the scope of the present invention.

It will be apparent to those skilled in the art that, for convenience and brevity of description, for the specific operation of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.

In the several embodiments provided herein, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other ways of dividing, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device, or unit, and may be electrical, mechanical, or in another form.

The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solution of the embodiment.

In addition, each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a U disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

The above description is only a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the invention should be determined by the scope of the claims.

Claims

1. A method for storing data, comprising:

establishing a hotspot data model based on an original data record;

screening hotspot data from the original data record or a new data record according to the hotspot data model; and

storing the screened hotspot data in a first storage device.

2. The method according to claim 1, further comprising:

storing, in a second storage device, the data records in the original data record that are not hotspot data, wherein the storage rate of the first storage device is higher than the storage rate of the second storage device.
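
For illustration only, the record-level screening and tiered placement recited in claims 1 and 2 might be sketched as follows. The sketch assumes the hotspot data model can be reduced to a predicate over a record; the names `partition_records` and `is_hot`, the record layout, and the two dictionaries standing in for the first and second storage devices are hypothetical and do not appear in the application.

```python
from typing import Callable, Dict, Iterable, Tuple

Record = Tuple[int, dict]  # (record key, record fields); hypothetical layout


def partition_records(
    records: Iterable[Record],
    is_hot: Callable[[dict], bool],
) -> Tuple[Dict[int, dict], Dict[int, dict]]:
    """Split records into (first_device, second_device) contents."""
    first_device: Dict[int, dict] = {}   # stands in for the faster device
    second_device: Dict[int, dict] = {}  # stands in for the slower device
    for key, fields in records:
        if is_hot(fields):
            first_device[key] = fields   # screened as hotspot data
        else:
            second_device[key] = fields  # everything else
    return first_device, second_device


# Hypothetical model: records whose region is "east" are treated as hot.
records = [
    (1, {"region": "east"}),
    (2, {"region": "west"}),
    (3, {"region": "east"}),
]
fast, slow = partition_records(records, lambda f: f["region"] == "east")
```

The same function could be applied to newly arriving records, since the predicate looks only at the fields of a record and not at its access history.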

3. The method according to claim 2, further comprising:

upon receiving a query request, optimizing the query to generate a corresponding execution plan; and

obtaining data from the first storage device and the second storage device, respectively, according to the execution plan.
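
As a rough illustration of claim 3, the sketch below routes each requested key to the device assumed to hold it and then reads from both devices. The claim does not state how the optimizer builds its plan; the key-lookup routing, the `Plan` layout, and the names `make_plan` and `execute_plan` are assumptions made for this example.

```python
from typing import Dict, List, Tuple

Plan = List[Tuple[str, List[int]]]  # (device name, keys to read there)


def make_plan(keys: List[int], first_device: Dict[int, dict]) -> Plan:
    """Route each requested key to the device assumed to hold it."""
    hot = [k for k in keys if k in first_device]
    cold = [k for k in keys if k not in first_device]
    return [("first", hot), ("second", cold)]


def execute_plan(
    plan: Plan, devices: Dict[str, Dict[int, dict]]
) -> Dict[int, dict]:
    """Read from each device named in the plan and merge the results."""
    result: Dict[int, dict] = {}
    for device_name, keys in plan:
        store = devices[device_name]
        for k in keys:
            if k in store:  # skip keys that exist on neither device
                result[k] = store[k]
    return result


devices = {
    "first": {1: {"v": "a"}},
    "second": {2: {"v": "b"}, 3: {"v": "c"}},
}
plan = make_plan([1, 2], devices["first"])
rows = execute_plan(plan, devices)
```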

4. The method according to any one of claims 1 to 3, further comprising: re-executing the process of establishing a hotspot data model if the hotspot data model expires, and updating the hotspot data in the first storage device according to the re-established hotspot data model.

5. The method according to any one of claims 1 to 4, wherein the expiration of the hotspot data model comprises: the life cycle of the hotspot data model exceeding a preset time, or the hit rate of the hotspot data in the first storage device being too low.
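
The two expiration conditions of claim 5 can be expressed as a small check. This is a sketch under stated assumptions: the claim does not quantify "too low", so the `min_hit_rate` threshold, the `ModelStats` structure, and the function name `model_expired` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    created_at: float  # time the model was built, in seconds
    hits: int = 0      # lookups served by the first storage device
    lookups: int = 0   # all lookups since the model was built


def model_expired(
    stats: ModelStats, now: float, max_age_s: float, min_hit_rate: float
) -> bool:
    """Return True if the model is too old or its hit rate is too low."""
    if now - stats.created_at > max_age_s:
        return True   # life cycle exceeds the preset time
    if stats.lookups == 0:
        return False  # no traffic yet, so no hit rate to judge
    return stats.hits / stats.lookups < min_hit_rate
```

When the check returns true, the model would be rebuilt and the contents of the first storage device refreshed, as recited in claim 4.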

6. The method according to any one of claims 1 to 5, wherein the first storage device is a high-speed storage device, and establishing the hotspot data model based on the original data record comprises:

extracting a sample data record from the original data record;

determining the number of hits of the sample data record; and

using the sample data record as a data source for establishing the hotspot data model, and establishing the hotspot data model based on the number of hits.
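
One possible reading of the three steps of claim 6 is sketched below. The claim does not fix the form of the model; this example assumes a very simple one in which an attribute value is treated as hot when the sampled records carrying it average at least `min_avg_hits` hits. The names `draw_sample` and `build_model`, the chosen attribute, and the threshold are all hypothetical.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Record = Tuple[int, dict]  # (record key, record fields); hypothetical layout


def draw_sample(records: List[Record], size: int, seed: int = 0) -> List[Record]:
    """Draw a reproducible random sample of the original records."""
    rng = random.Random(seed)
    return rng.sample(records, min(size, len(records)))


def build_model(
    sample: List[Record],
    hit_counts: Dict[int, int],
    attribute: str,
    min_avg_hits: float,
) -> Callable[[dict], bool]:
    """Return a predicate that marks records with a 'hot' attribute value."""
    totals: Dict[object, int] = defaultdict(int)
    counts: Dict[object, int] = defaultdict(int)
    for key, fields in sample:
        value = fields[attribute]
        totals[value] += hit_counts.get(key, 0)  # hits of this sample record
        counts[value] += 1
    hot_values = {v for v in totals if totals[v] / counts[v] >= min_avg_hits}
    return lambda fields: fields.get(attribute) in hot_values


records = [
    (1, {"region": "east"}),
    (2, {"region": "west"}),
    (3, {"region": "east"}),
    (4, {"region": "west"}),
]
hit_counts = {1: 40, 2: 1, 3: 60, 4: 3}  # hypothetical per-record hit counts
sample = draw_sample(records, size=4)
is_hot = build_model(sample, hit_counts, "region", min_avg_hits=10)
```

Because the resulting predicate depends only on record fields, it can classify a record that has never been accessed, which is what allows screening of new data records.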

7. A device for storing data, comprising:

an establishing module, configured to establish a hotspot data model based on an original data record;

a screening module, configured to screen hotspot data from the original data record or a new data record according to the hotspot data model; and

a storage module, configured to store the screened hotspot data in a first storage device.

8. The device according to claim 7, wherein the storage module further stores, in a second storage device, the data records in the original data record that are not hotspot data, wherein the storage rate of the first storage device is higher than the storage rate of the second storage device.

9. The device according to claim 8, further comprising:

an optimization module, configured to optimize a query to generate a corresponding execution plan when a query request is received; and

an obtaining module, configured to obtain data from the first storage device and the second storage device, respectively, according to the execution plan.

10. The device according to any one of claims 7 to 9, wherein the establishing module further re-executes the process of establishing a hotspot data model if the hotspot data model expires, and updates the hotspot data in the first storage device according to the re-established hotspot data model.

11. The device according to any one of claims 7 to 10, wherein the expiration of the hotspot data model comprises: the life cycle of the hotspot data model exceeding a preset time, or the hit rate of the hotspot data in the first storage device being too low.

12. The device according to any one of claims 7 to 11, wherein the first storage device is a high-speed storage device, and the establishing module extracts a sample data record from the original data record, determines the number of hits of the sample data record, uses the sample data record as a data source for establishing the hotspot data model, and establishes the hotspot data model based on the number of hits.