WO2011142134A1 - Information search apparatus, information search method, computer program, and data structure - Google Patents
- Publication number
- WO2011142134A1 (PCT/JP2011/002641)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- path
- search
- information
- node
- index
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/93—Document management systems
- G06F16/94—Hypermedia
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
Definitions
- The present invention relates to an information search apparatus, and more particularly to an information search apparatus, an information search system, an information search method, a computer program, and a data structure for searching for a target node in graph structure information represented by nodes and edges connecting the nodes.
- the graph structure information is information that represents the elements constituting the target information as nodes and the relationship between the nodes as edges.
- the information search device described in Patent Document 1 clusters graph structure information into subgraphs, and generates a table with each node as an index and a table with each edge as an index for each subgraph.
- the information search device searches for a subgraph that matches the query graph based on these tables.
- As another such information retrieval apparatus, the one described in Non-Patent Document 1 is also known.
- the information retrieval apparatus described in Non-Patent Document 1 retrieves graph structure information represented by RDF (Resource Description Framework).
- The information retrieval apparatus includes a relational database (hereinafter, a database is also referred to as a DB) construction unit 91, a relational DB 92, and a search unit 93.
- The relational DB 92 stores a class table, a property table, a type table, a resource table, a path table, and a triple table.
- RDF information is expressed by a sentence (hereinafter also referred to as a triple) composed of three elements, a subject, a predicate, and an object.
- the subject represents a resource in the information model
- the predicate represents a property of the resource
- the object represents a resource or a property value.
- resources are represented by nodes
- properties are represented by edges (or arcs).
- This information retrieval apparatus registers information in the relational DB 92 as follows.
- The relational DB construction unit 91 generates a class table, a property table, a type table, a resource table, and a triple table based on the given RDF graph.
- The relational DB construction unit 91 determines a resource to be the root, and generates every sequence of properties (arc path) leading from the determined root resource to the other resources.
- The relational DB construction unit 91 assigns a path ID (pathID) to each generated arc path, and registers the path expression (pathexp) representing each arc path together with its path ID in the path table.
- The path expression representing an arc path is written as a string of property names.
- The search unit 93 searches the relational DB 92 generated as described above by generating an SQL query.
- When searching for information that cannot be specified by a path alone, the search unit 93 performs the search using the triple table.
- Examples of information that cannot be specified by a path alone include resources referenced by a given property and resources that refer to resources having a specific value as a property value.
- However, the information retrieval apparatus described in Non-Patent Document 1 has a problem in that it takes time to search for information that cannot be specified by a path alone.
- In addition, even when the search is for information that can be specified by a path alone, the search time increases as the information model becomes more complicated.
- That is, even for a query that can use the path table, the number of comparisons of the path expressions serving as search keys grows on the order of the number of paths, and the search time increases greatly.
- the present invention has been made to solve the above-described problem, and an object of the present invention is to provide an information search apparatus that can search a target node at high speed even if the graph structure information becomes complicated.
- The information search device of the present invention searches for a target node satisfying a search condition from graph structure information having, as elements, a plurality of nodes and edges connecting the nodes. The device comprises the following units.
- A path field generation unit that, for each node included in the graph structure information, extracts paths, each being a sequence of the elements that can be traced starting from the node, and generates for each node a path field that concatenates the extracted paths.
- An index generation unit that, for each element constituting the graph structure information, generates a posting list, which is a list of information including a node having a path field that includes the element and position information indicating the position where the element appears in that path field, and that generates an index repository associating the element with the posting list.
- A search path generation unit that generates a search path representing the search condition as a sequence of the elements.
- A search unit that retrieves, from the index repository, the nodes having a path field that includes each element included in the search path, and that searches for the target node by extracting, from the retrieved nodes and based on the position information, a node whose path field satisfies the appearance order of the elements in the search path.
- The data structure of the present invention is a data structure for storing graph structure information having, as elements, a plurality of nodes and edges connecting the nodes. It stores each element in association with a posting list generated for that element. Here, the path field of a node is formed by concatenating the paths, each being a sequence of elements that can be traced starting from that node, and the posting list of an element is a list of information consisting of a node whose path field contains the element and position information representing the position where the element appears in that path field.
- The information search system of the present invention comprises a graph structure information storage device that stores graph structure information having, as elements, a plurality of nodes and edges connecting the nodes; a client device that requests a search for a target node satisfying a search condition from the graph structure information; and an information search device that searches for the target node from the graph structure information.
- The information search device of this system comprises the path field generation unit, the index generation unit, the search path generation unit, and the search unit of the information search device of the present invention.
- The computer program of the present invention controls the operation of an information retrieval apparatus that retrieves a target node satisfying a retrieval condition from graph structure information having, as elements, a plurality of nodes and edges connecting the nodes. It causes the apparatus to execute the following processes.
- A path field generation process that, for each node included in the graph structure information, extracts paths, each being a sequence of the elements that can be traced from the node, and generates for each node a path field that concatenates the extracted paths.
- An index generation process that, for each element constituting the graph structure information, generates a posting list, which is a list of information including a node having a path field that includes the element and position information indicating the position where the element appears in that path field, and that generates an index repository associating the element with the posting list.
- A search path generation process that generates a search path representing the search condition as a sequence of the elements.
- A search process that retrieves, from the index repository, the nodes having a path field that includes each element included in the search path, and that extracts from the retrieved nodes, based on the position information, a node whose path field satisfies the appearance order of the elements in the search path.
- The graph structure information storage device stores graph structure information having, as elements, a plurality of nodes and edges connecting the nodes.
- For each node included in the graph structure information, the information search device extracts paths, each being a sequence of the elements that can be traced from the node, and generates for each node a path field that concatenates the extracted paths. For each element constituting the graph structure information, it generates a posting list, which is a list of information including a node having a path field that includes the element and position information indicating the position where the element appears in that path field, and generates an index repository that associates the element with the posting list.
- The client device requests the information search device to search for a target node that satisfies the search condition among the graph structure information.
- The information search device generates a search path representing the search condition as a sequence of the elements, and searches the index repository for nodes having a path field that includes each element included in the search path.
- The target node is then found by extracting, from the retrieved nodes and based on the position information, a node whose path field satisfies the appearance order of the elements in the search path.
- According to the present invention, it is possible to provide an information search apparatus that can search for a target node at high speed even if the graph structure information becomes complicated.
- FIG. 1 shows a hardware configuration of an information search system 1 as a first embodiment of the present invention.
- the information search system 1 includes an information search device 11, a graph structure information storage device 12, and a client device 13. Further, the information search device 11, the graph structure information storage device 12, and the client device 13 are connected to be communicable with each other.
- The information search apparatus 11 is configured by a general-purpose computer including at least a CPU (Central Processing Unit) 1101, a RAM (Random Access Memory) 1102, a ROM (Read Only Memory) 1103, a storage device 1104, and a network interface 1105.
- the graph structure information storage device 12 is configured by a general-purpose computer including at least a CPU 1201, a RAM 1202, a ROM 1203, a storage device 1204, and a network interface 1205.
- the client device 13 is configured by a general-purpose computer including at least a CPU 1301, a RAM 1302, a ROM 1303, a storage device 1304, a network interface 1305, an input device 1306, and an output device 1307.
- the graph structure information storage device 12 stores, in the storage device 1204, graph structure information having a plurality of nodes and edges connecting the nodes as elements.
- An example of the graph structure information stored in the graph structure information storage device 12 is shown in FIG.
- the graph structure information in FIG. 3 includes nodes A, B1, B2, C1, and C2 and edges a, b, c, and d connecting the nodes.
- the graph structure information storage device 12 provides graph structure information to the information search device 11 in response to a request from the information search device 11.
- the client device 13 acquires, via the input device 1306, information representing a search request for a target node that satisfies the search condition among the graph structure information. Then, the client device 13 transmits information representing the search request to the information search device 11. In addition, the client device 13 outputs information representing the search result acquired from the information search device 11 via the output device 1307.
- the information search apparatus 11 includes a path field generation unit 101, an index generation unit 102, an index repository 103, a search unit 104, and a search path generation unit 105.
- the path field generation unit 101, the index generation unit 102, the search unit 104, and the search path generation unit 105 are stored in the storage device 1104 as a computer program, and are realized by the CPU 1101 that reads the program into the RAM 1102 and executes it.
- the index repository 103 is configured by a storage device 1104.
- For each node included in the graph structure information, the path field generation unit 101 extracts from the graph structure information one or more paths, each being a sequence of elements (that is, nodes and edges) that can be traced starting from the node. The path field generation unit 101 then generates, for each node, a path field obtained by concatenating the extracted paths.
- the index generation unit 102 generates an index repository 103 in which each element (that is, each node and each edge) constituting the graph structure information is associated with the posting list.
- the posting list is a list of information that is generated for each element and includes a node having a path field including the element and position information where the element appears in the path field.
- The position information included in the posting list may be, for example, a numerical value representing the order in which the element appears in the path field counted from the top, or any other information that can identify the position where the element appears in the path field.
- the index repository 103 stores an element and a posting list of the element in association with each other.
- the search path generation unit 105 generates a search path representing a search condition as an element string.
- the search unit 104 searches the index repository 103 for a node having a path field including each element included in the search path. Then, the search unit 104 searches for a target node by extracting a node having a path field that satisfies the appearance order of elements in the search path among the searched nodes based on the position information of the posting list.
- the index generation unit 102 reads the graph structure information from the graph structure information storage device 12 (step S1).
- the path field generation unit 101 generates a path field for each node included in the read graph structure information (step S2).
- For example, for the node A of the graph structure information illustrated in FIG., the path field generation unit 101 extracts three paths as sequences of elements that can be traced from the node A: path [A] [a] [B1] [b] [C1], path [A] [a] [B1] [d] [C2], and path [A] [c] [B2]. The path field generation unit 101 then concatenates these three paths and generates [A] [a] [B1] [b] [C1] [A] [a] [B1] [d] [C2] [A] [c] [B2] as the path field of the node A. Similarly, the path field generation unit 101 generates path fields for the nodes B1, B2, C1, and C2.
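The path field construction described for node A can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the adjacency table `GRAPH` is inferred from the three paths listed for node A, the names `extract_paths` and `path_field` are hypothetical, the graph is assumed to be acyclic, and a node with no outgoing edge is assumed to yield a single one-element path.

```python
from typing import Dict, List, Tuple

# Outgoing edges per node as (edge, target) pairs.
# Assumption: this shape is inferred from the example paths of node A.
GRAPH: Dict[str, List[Tuple[str, str]]] = {
    "A": [("a", "B1"), ("c", "B2")],
    "B1": [("b", "C1"), ("d", "C2")],
    "B2": [],
    "C1": [],
    "C2": [],
}


def extract_paths(graph: Dict[str, List[Tuple[str, str]]], start: str) -> List[List[str]]:
    """Return every maximal path (alternating nodes and edges) that starts at `start`."""
    paths: List[List[str]] = []

    def walk(node: str, prefix: List[str]) -> None:
        current = prefix + [node]
        if not graph[node]:  # no outgoing edge: the path ends here
            paths.append(current)
            return
        for edge, target in graph[node]:
            walk(target, current + [edge])

    walk(start, [])
    return paths


def path_field(graph: Dict[str, List[Tuple[str, str]]], start: str) -> List[str]:
    """Concatenate all paths from `start` into one flat list of elements."""
    return [element for path in extract_paths(graph, start) for element in path]


print(path_field(GRAPH, "A"))
```

Keeping the path field as a flat list means that an element's position is simply its 1-based offset in the list, which is the form of position information used in the posting list example.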
- the index generation unit 102 associates each element constituting the graph structure information with the posting list of each element and registers it in the index repository 103 (step S3).
- For example, the index generation unit 102 generates a posting list for the element b constituting the graph structure information shown in FIG.
- The posting list of the element b is "A<4>, B1<2>".
- the index generation unit 102 registers the element b and the posting list of the element b in the index repository 103 in association with each other. Similarly, the index generation unit 102 generates a posting list for the remaining elements and registers the generated posting list in the index repository 103.
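A minimal sketch of this index generation step follows, assuming the path fields have already been built and that positions are 1-based term offsets. The `PATH_FIELDS` table restates the example (the one-element path fields of the leaf nodes are an assumption), and `build_index` is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, List

# Path fields of the example graph. The fields for B2, C1 and C2 are assumed
# to consist of the node alone, since these nodes have no outgoing edge.
PATH_FIELDS: Dict[str, List[str]] = {
    "A": ["A", "a", "B1", "b", "C1", "A", "a", "B1", "d", "C2", "A", "c", "B2"],
    "B1": ["B1", "b", "C1", "B1", "d", "C2"],
    "B2": ["B2"],
    "C1": ["C1"],
    "C2": ["C2"],
}


def build_index(path_fields: Dict[str, List[str]]) -> Dict[str, Dict[str, List[int]]]:
    """Map each element to its posting list: node -> positions in that node's path field."""
    index: Dict[str, Dict[str, List[int]]] = defaultdict(dict)
    for node, field in path_fields.items():
        for position, element in enumerate(field, start=1):
            index[element].setdefault(node, []).append(position)
    return dict(index)


index = build_index(PATH_FIELDS)
print(index["b"])  # {'A': [4], 'B1': [2]}
```

The number of keys in `index` equals the number of distinct elements of the graph, regardless of how many paths exist.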
- the information search apparatus 11 ends the process of generating the index repository 103.
- the client device 13 requests the information search device 11 to search for a target node that satisfies the search condition, and the information search device 11 accepts this (Yes in step S4).
- the client device 13 requests the information search device 11 to “search for a node having the node C1 ahead of the edge b” from the graph structure information shown in FIG.
- the search path generation unit 105 generates a search path representing the accepted search condition (step S5).
- the search path generation unit 105 generates [x] [b] [C1] as a search path for the above example search conditions.
- [x] represents a target node.
- The search unit 104 searches the index repository 103 for nodes having a path field that includes each element included in the search path, and extracts as the target node, from the retrieved nodes, a node whose path field satisfies the appearance order of the elements included in the search path (step S6).
- For example, the search unit 104 retrieves the nodes A and B1 as nodes having a path field that includes the elements [b] and [C1] included in the search path. The search unit 104 then extracts the node B1 from the retrieved nodes A and B1, because its path field satisfies the appearance order of the search path, in which [b] is second and [C1] is third.
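The two-stage search in this example (posting list lookup, then filtering by position) might be sketched as below. The matching rule is an assumption made for illustration: the target `[x]` is taken to be the first element of the search path, and a node matches when some path in its path field starts with the node itself and has the remaining search elements at the corresponding offsets. `INDEX` is an excerpt of the index repository for the example graph, and `search` and `WILDCARD` are hypothetical names.

```python
from typing import Dict, List

WILDCARD = "x"  # stands for the target node [x] in the search path

# Excerpt of the index repository: element -> {node: 1-based positions}.
INDEX: Dict[str, Dict[str, List[int]]] = {
    "A": {"A": [1, 6, 11]},
    "B1": {"A": [3, 8], "B1": [1, 4]},
    "a": {"A": [2, 7]},
    "b": {"A": [4], "B1": [2]},
    "C1": {"A": [5], "B1": [3], "C1": [1]},
}


def search(index: Dict[str, Dict[str, List[int]]], search_path: List[str]) -> List[str]:
    """Return nodes whose path field contains the search path at matching offsets.

    Assumption: search_path[0] is the wildcard that denotes the target node.
    """
    fixed = [(offset, e) for offset, e in enumerate(search_path) if e != WILDCARD]
    postings = [index.get(element, {}) for _, element in fixed]
    if not postings:
        return []
    # Stage 1: nodes whose path field contains every fixed element.
    candidates = set(postings[0]).intersection(*postings[1:])
    hits = []
    for node in sorted(candidates):
        # Stage 2: keep path start positions consistent with every element offset.
        starts = set(index.get(node, {}).get(node, []))
        for (offset, _), posting in zip(fixed, postings):
            starts &= {position - offset for position in posting[node]}
        if starts:
            hits.append(node)
    return hits


print(search(INDEX, ["x", "b", "C1"]))  # ['B1']
```

Node A passes the first stage because its path field contains both b and C1, but it is rejected in the second stage: b appears at position 4, which is not the second term of any path starting at A.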
- the information search system can search a target node at high speed even if the graph structure information becomes complicated.
- This is because the graph structure information is stored in an index repository that uses each element constituting the graph structure information as an index word. Therefore, even if the graph structure information becomes complicated, the number of tuples of the index repository to be searched can be kept to the order of the number of elements constituting the graph structure information. Consequently, when searching for a target node, the number of comparisons between the index words of the index repository and the elements included in the search path is also kept to the order of the number of elements constituting the graph structure information, which realizes a high-speed search.
- the information search system as the first exemplary embodiment of the present invention can reduce the resource consumption of the storage device.
- the reason is that the number of index words, which is a factor for determining the size of the index repository, can be an order of the number of elements constituting the graph structure information.
- The data structure of the index repository in the first embodiment of the present invention is suitable as a data structure for storing graph structure information to be searched for a target node.
- This is because the index repository stores the posting list of each element in association with that element of the graph structure information, so the graph structure information can be stored while suppressing the number of index words. Moreover, with this data structure, the nodes having a path field that includes the terms in the search path are retrieved first, and the target node can then be found at high speed by filtering with the position information.
- FIG. 6 shows the configuration of an information search system 2 as a second embodiment of the present invention.
- the same components as those in the first embodiment of the present invention are denoted by the same reference numerals, and detailed description thereof will be omitted.
- the information search system 2 includes an information search device 21, a knowledge information repository 22, and a client device 13.
- the information search device 21, the knowledge information repository 22, and the client device 13 are connected to be communicable with each other.
- the information search device 21 and the knowledge information repository 22 are configured by general-purpose computers in the same manner as the information search device 11 and the graph structure information storage device 12 in the first embodiment of the present invention.
- the knowledge information repository 22 constitutes an embodiment of the graph structure information storage device of the present invention.
- the knowledge information repository 22 stores a knowledge information model represented by an RDF graph.
- the knowledge information model constitutes an embodiment of the graph structure information in the present invention.
- An example of the knowledge information model stored in the knowledge information repository 22 is shown in FIG.
- resources and literals in RDF constitute an embodiment of the node of the present invention
- properties in RDF constitute an embodiment of the edge of the present invention.
- an ellipse indicates a resource
- a rectangle indicates a resource (literal) that takes a specific value
- an arrow indicates a property that is a relation between resources.
- a character string in the resource indicates a resource ID for identifying the resource.
- the character string on the property indicates a property ID for identifying the type of the property.
- a character string in a literal indicates a specific value (literal value) taken by the literal.
- The resource ID and the property ID may be a URI, a numerical value, a character string, or any other information that can uniquely identify the resource and the property type.
- the resource ID, property ID, and literal value are referred to as a model vocabulary (hereinafter also referred to as a term).
- FIG. 7 is an example in which the employees of an insurance company, their customers, the customers' family members, and information on the insurance they have taken out are represented by an RDF graph.
- The resource b1 is an entity of corporation (company), and has e1 as hasEmployee (employee). Note that "an entity of corporation (company)" means that the type of the resource b1 is "corporation".
- The resource e1 is an entity of employee (employee), has xxx@yyy.zzz.xx (an e-mail address) as contact (contact address), and has c1 as hasClient (customer).
- The resource c1 is an entity of Client (customer), and has p1 as hasFamilyMember (family member).
- The resource p1 is an entity of Person (person), and has a1 and a2 as hasInsurance (subscribed insurance).
- The resource a1 is an entity of Insurance (insurance), has December 31, 2010 as validUntil (expiration date), and has true as inNonrefundable (whether or not it is non-refundable insurance).
- The resource a2 is an entity of Insurance (insurance), has December 31, 2015 as validUntil (expiration date), and has false as inNonrefundable (whether or not it is non-refundable insurance).
- The information search apparatus 21 includes a path field generation unit 201, an index generation unit 202, a tokenize unit 212, an index repository 203, a search unit 204, a search path generation unit 205, an input / output unit 206, and a model data DB 207.
- The path field generation unit 201, the index generation unit 202, the tokenize unit 212, the search unit 204, and the search path generation unit 205 are stored in a computer storage device as a computer program, and are implemented by a CPU that reads the program into a RAM and executes it.
- the index repository 203 and the model data DB 207 are configured by a computer storage device.
- the model data DB 207 constitutes an embodiment of a subgraph storage unit in the present invention.
- The input / output unit 206 is stored in a computer storage device as a computer program, and is configured by a network interface and a CPU that reads the program into a RAM and executes it.
- the path field generation unit 201 generates a path field for each resource constituting the knowledge information model, like the path field generation unit 101.
- Specifically, the path field generation unit 201 extracts one or more paths, each representing a sequence of resources and properties that can be traced starting from each resource. The path field generation unit 201 then represents each extracted path as a suffix path in which resource IDs, property IDs, and literal values are concatenated starting from the starting resource. Further, the path field generation unit 201 generates a path field for each resource by concatenating the suffix paths representing all paths that can be traced from that resource. Finally, the path field generation unit 201 replaces the resource ID of the starting resource with the reserved word "THIS" in the generated path field.
- FIG. 8 is a path field generated for the resource e1 in the knowledge information model illustrated in FIG.
- The path field of the resource e1 is obtained by concatenating the suffix paths of the 10 paths that can be traced starting from the resource e1.
- e1 which is the resource ID of the resource e1 as the starting point is replaced with the reserved word THIS.
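The suffix path and THIS replacement described for the resource e1 can be sketched as follows. This is a hedged illustration rather than the patent's code: the `TRIPLES` list is transcribed from the textual description of the example model, with type names, date strings, and the triple order chosen for illustration, the property name `inNonrefundable` is spelled as in the text, and `suffix_paths` and `path_field` are hypothetical names. The graph is assumed to be acyclic.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)

# Assumption: transcribed from the textual description of the example model.
TRIPLES: List[Triple] = [
    ("b1", "type", "Corporation"),
    ("b1", "hasEmployee", "e1"),
    ("e1", "type", "Employee"),
    ("e1", "hasClient", "c1"),
    ("e1", "contact", "xxx@yyy.zzz.xx"),
    ("c1", "type", "Client"),
    ("c1", "hasFamilyMember", "p1"),
    ("p1", "type", "Person"),
    ("p1", "hasInsurance", "a1"),
    ("p1", "hasInsurance", "a2"),
    ("a1", "type", "Insurance"),
    ("a1", "validUntil", "2010-12-31"),
    ("a1", "inNonrefundable", "true"),
    ("a2", "type", "Insurance"),
    ("a2", "validUntil", "2015-12-31"),
    ("a2", "inNonrefundable", "false"),
]


def suffix_paths(triples: List[Triple], start: str) -> List[List[str]]:
    """Return every maximal path of resource IDs, property IDs and literal values from `start`."""
    outgoing: Dict[str, List[Tuple[str, str]]] = {}
    for subject, prop, obj in triples:
        outgoing.setdefault(subject, []).append((prop, obj))
    paths: List[List[str]] = []

    def walk(node: str, prefix: List[str]) -> None:
        edges = outgoing.get(node, [])
        if not edges:  # literal value or resource without properties
            paths.append(prefix + [node])
            return
        for prop, obj in edges:
            walk(obj, prefix + [node, prop])

    walk(start, [])
    return paths


def path_field(triples: List[Triple], start: str) -> List[str]:
    """Concatenate the suffix paths, replacing the starting resource ID with THIS."""
    terms: List[str] = []
    for path in suffix_paths(triples, start):
        terms.extend(["THIS"] + path[1:])
    return terms


field = path_field(TRIPLES, "e1")
print(len(suffix_paths(TRIPLES, "e1")), field[:8])
```

With this triple order, the term `type` lands at the second and seventh positions of the path field of e1, which is consistent with the posting list example given for the path index.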
- The tokenize unit 212 divides the path field generated by the path field generation unit 201 into vocabulary units of the knowledge information model.
- the tokenize unit 212 constitutes a part of an embodiment of the index generation unit of the present invention.
- the index generation unit 202 acquires a knowledge information model from the knowledge information repository 22. Then, the index generation unit 202 uses the path field generation unit 201 to generate a path field for each resource included in the knowledge information model.
- the index generation unit 202 uses the tokenize unit 212 to divide the generated path field into vocabularies and register them in a path index, a literal property index group, and a metadata index group of an index repository 203 described later.
- the index repository 203 has a path index, a literal property index group, and a metadata index group.
- the path index stores the terms (resource ID, property ID, literal value) constituting the knowledge information model in association with the posting list.
- the posting list is a list of information including a resource ID of a resource having a path field including the term and position information where the term appears in the path field.
- FIG. 9 shows an example of a path index corresponding to the knowledge information model shown in FIG.
- the path index stores, for example, the term type and its posting list in association with each other.
- the posting list of the term type is a list of information including the resources b1, e1, c1, p1, a1, and a2 having the path field including the term type and the position information of the term type in the path field.
- The appearance of the term type in the second term, the seventh term, and so on of the path field of the resource e1 is represented as e1<2,7,...>.
- the position information in the path field may be represented by the number of terms counted from the top in the path field, or may be represented by a character string, a numerical value, a symbol, a reference relationship, or the like, and the term appears in the path field. Any information that can identify the position may be used.
- The path index stores the reserved word THIS as an index word.
- the posting list associated with the reserved word THIS indicates where each resource is located in the path in the path field.
- the reserved word is not limited to “THIS” and may be information that does not overlap with other terms.
- the literal property index group is an index generated for each property ID of a property having a literal (literal property) in the knowledge information model.
- One literal property index stores a literal value and a resource having the property in association with each other.
- FIG. 10 shows an example of a literal property index group corresponding to the knowledge information model shown in FIG.
- FIG. 10 shows a contact index (FIG. 10 (a)), an isNonfundable index (FIG. 10 (b)), and a validUntil index (FIG. 10 (c)), respectively corresponding to the property IDs of three literal properties: contact, inNonrefundable, and validUntil.
- the number of literal property indexes included in the literal property index group is not limited to three, but depends on the number of types of literal properties included in the target knowledge information model.
- the index repository 203 does not need to store these literal property index groups separately in different tables, and may store them in the same table so as to be logically distinguishable.
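A literal property index group of this kind could be sketched as one mapping per property ID, from literal value to the resources that have it. The sketch below is illustrative only: `LITERAL_TRIPLES` restates the literal properties of the example model (the date strings are an assumed format, and `inNonrefundable` is spelled as in the text), and `build_literal_property_indexes` is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (resource ID, literal property ID, literal value); values are illustrative strings.
LITERAL_TRIPLES: List[Tuple[str, str, str]] = [
    ("e1", "contact", "xxx@yyy.zzz.xx"),
    ("a1", "validUntil", "2010-12-31"),
    ("a1", "inNonrefundable", "true"),
    ("a2", "validUntil", "2015-12-31"),
    ("a2", "inNonrefundable", "false"),
]


def build_literal_property_indexes(
    triples: List[Tuple[str, str, str]],
) -> Dict[str, Dict[str, List[str]]]:
    """Build one index per literal property: literal value -> resources having it."""
    indexes: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for resource, prop, value in triples:
        indexes[prop][value].append(resource)
    return {prop: dict(values) for prop, values in indexes.items()}


indexes = build_literal_property_indexes(LITERAL_TRIPLES)
print(indexes["validUntil"])  # {'2010-12-31': ['a1'], '2015-12-31': ['a2']}
```

Holding all per-property indexes in one nested mapping corresponds to the option of storing them in the same table while keeping them logically distinguishable.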
- The metadata index group stores resources and property values in association with each other for properties that have special meanings in the knowledge information model.
- FIG. 11 shows an example of a metadata index corresponding to the knowledge information model shown in FIG.
- the type index in FIG. 11 is a metadata index prepared for the property “type”, a special property that represents the type (also referred to as the class) of each resource.
- the type index stores a resource having a type property and its property value in association with each other.
- the index repository 203 may have an ID index as a metadata index.
- the ID index stores the resource ID and the property ID in association with the terms assigned thereto.
- the index repository 203 does not need to store them separately in different tables, and may store them in the same table so as to be logically distinguishable.
- the index generation unit 202 generates a posting list for each term representing each resource ID, each property ID, and each literal value constituting the knowledge information model represented by RDF. Then, the index generating unit 202 registers each term and the posting list of each term in the index repository 203 in association with each other.
- the model data DB 207 includes a subgraph table as shown in FIG.
- the sub-graph table extracts, for each resource constituting the knowledge information model, a sub-graph from the resource to a predetermined depth and stores it in association with the resource ID.
- each resource ID and a subgraph up to a resource or a literal value adjacent to the resource via one property are stored.
- the representation format of the subgraph stored in the subgraph table may be a character string representation such as N3 (Notation 3), binary data such as a Java (registered trademark) object, serialized binary data, or a compressed form of any of these.
- the representation format of the subgraph may be any format that can reproduce a part of the original knowledge information model.
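- A depth-1 subgraph table can be sketched as follows. The sketch assumes the knowledge information model is available as (subject, property, object) triples and stores each subgraph as a list of triples; the function name and the sample data are hypothetical.

```python
def build_subgraph_table(triples):
    """Return {resource_id: [(subject, property, object), ...]} for depth 1.

    Each resource is associated with the triples that lead from it, via one
    property, to an adjacent resource or literal value.
    """
    table = {}
    for subject, prop, obj in triples:
        table.setdefault(subject, []).append((subject, prop, obj))
    return table

# Hypothetical fragment of a model.
triples = [
    ("e1", "type", "Employee"),
    ("e1", "hasClient", "c1"),
    ("c1", "type", "Person"),
]
subgraph_table = build_subgraph_table(triples)
```

- Because the table is only read when presenting results, each stored list could equally be serialized or compressed, as the text notes.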
- the input / output unit 206 acquires search conditions from the client device 13 and notifies the search unit 204 of the acquired search conditions. Further, the input / output unit 206 outputs the search result notified from the search unit 204 to the client device 13.
- the search unit 204 receives a search condition from the input / output unit 206, and generates a search path and a path query corresponding to the search path using a search path generation unit 205 described later.
- the search unit 204 searches the index repository 203 using the generated path query.
- the search unit 204 acquires a sub graph of the search result resource from the model data DB 207 and outputs it to the input / output unit 206.
- the search path generation unit 205 generates a search path representing a search condition as a resource and property column.
- the search path generation unit 205 extracts a resource that is restricted to have a specific literal value as a property from the search conditions as a restricted resource, and searches a resource and property column that connects the target resource and the restricted resource. Generate as a path.
- depending on the content of the search condition, the search path may be a path in which all resources and properties are specified from beginning to end, or a path in which only some resources or properties are specified and which therefore includes unspecified parts.
- the search path generation unit 205 generates a path query for the path index based on the appearance order of resources and properties in the generated search path. At this time, if there is an unspecified part in the search path, the search path generation unit 205 generates a path query in consideration of the number of elements that can be inserted in the unspecified part.
- the index generation unit 202 acquires a knowledge information model from the knowledge information repository 22 (step S11).
- the path field generation unit 201 generates a path field obtained by concatenating the suffix path for each resource constituting the knowledge information model (step S12).
- the tokenize unit 212 tokenizes each generated path field into terms constituting the knowledge information model (step S13).
- the index generation unit 202 generates a posting list for each term representing each resource, each property, and each literal constituting the knowledge information model. Then, the index generation unit 202 associates each term with its posting list and registers it in the path index (step S14). Furthermore, if the term is a literal property, the index generation unit 202 registers it in the literal property index. Further, if the term has a special property, the index generation unit 202 registers it in the metadata index.
- the index generation unit 202 extracts a subgraph of each resource constituting the knowledge information model and registers it in the model data DB 207 (step S15).
- the information search device 21 ends the index generation process.
- the information search apparatus 21 may execute either one first. If the information search device 21 can execute two processes in parallel, the series of processes in steps S12 to S14 and the process in step S15 may be executed in parallel.
- the input / output unit 206 acquires the search condition for the target resource from the client device 13 (Yes in step S21).
- the search unit 204 extracts a limited resource group that is limited to have a specific literal value as a property and its limit content (step S22).
- the search unit 204 uses the search path generation unit 205 to generate, for each extracted limited resource, a query for searching for the entities of that limited resource based on its limit content (step S23).
- the search unit 204 searches the index repository 203 using the generated query to obtain an entity set of limited resources (step S24).
- the search unit 204 acquires the entity set group of the limited resource group by repeating the query generation (step S23) and the search (step S24) for all the limited resources extracted in step S22.
- the search unit 204 uses the search path generation unit 205 to generate a search path based on the search condition acquired in step S21 and the entity set group of the limited resource searched in step S24. Then, the search unit 204 uses the search path generation unit 205 to generate a path query for searching the path index based on the generated search path (step S25).
- the search unit 204 searches the index repository 203 using a path query, and acquires a resource ID group representing the search result (step S26).
- the search unit 204 acquires from the model data DB 207 the subgraph group of the resource ID group that represents the search result. Then, the search unit 204 combines the subgraph group to generate graph data representing the search result, and outputs the generated graph data to the input / output unit 206. Then, the input / output unit 206 outputs the graph data representing the search result to the client device 13 (step S27).
- the information search device 21 ends the search process.
- the search path generation unit 205 receives a search condition, a limited resource group, and an entity set group of the limited resource from the search unit 204.
- the search path generation unit 205 extracts a target resource desired to be obtained as a search result from the search condition (step S31).
- the search path generation unit 205 performs the following processing for each restricted resource.
- the search path generation unit 205 specifies the combination of properties on the knowledge information model between the target resource and the restricted resource, and generates the property column as a search path (step S32).
- the search path generation unit 205 OR-joins the entity set of the restricted resource and registers it at the position of the restricted resource in the search path (step S33).
- the search path generation unit 205 registers the reserved word THIS indicating the target resource at the position of the target resource in the search path (step S34).
- the search path generation unit 205 generates a phrase query that allows a distance 1 between terms in consideration of an unspecified part (step S35).
- the search path generated in steps S32 to S34 may include an unspecified portion where no resource is specified between properties. For this reason, the search path generation unit 205 generates a phrase query indicating that up to one resource can be inserted in an unspecified part between properties. Note that the number of elements that can be inserted between properties is not limited to one, and is appropriately set according to the contents of the search condition.
- the search path generation unit 205 executes the processing of steps S32 to S35 for each restricted resource.
- the search path generation unit 205 generates a property query group when there is a restriction on the property of the target resource in the search condition (step S36).
- the search path generation unit 205 AND-joins the phrase query group for all restricted resources and the property query group of the target resource, and returns it to the search unit 204 as a path query (step S37).
- an example in which the client device 13 requests the information search device 21 to search the knowledge information model shown in FIG. 7 for “the sales representative in charge of a person who has a family member covered by non-refundable insurance that is valid only until some time within 2010” will be described with reference to FIGS. 14 and 15 again.
- the input / output unit 206 acquires a pseudo SQL sentence (formula 1) representing the above-described search condition from the client device 13 (step S21).
- [Formula 1] Select x; Where x type Employee, x hasClient y, y hasFamilyMember z, z hasInsurance i, i validUntil < 20110101, i isNonrefundable true, i type Insurance;
- the search unit 204 extracts the variable i as a limited resource from Formula 1 (step S22).
- the search unit 204 uses the search path generation unit 205 to generate a query of Formula 2 using a condition that limits the variable i as a query for searching for this limited resource (step S23).
- [Formula 2] Select i; Where i validUntil < 20110101, i isNonrefundable true, i type Insurance;
- the search unit 204 searches the literal property index group and the metadata index group using Formula 2, and obtains the resource ID a1 as the entity of the limited resource (step S24).
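- Step S24 can be sketched as a set intersection over the literal property indexes and the type index. The sketch assumes that the validUntil condition of Formula 2 means "earlier than 20110101", that dates are stored as YYYYMMDD strings, and that each index maps a value to a set of resource IDs; the function name and the sample data are hypothetical.

```python
def find_restricted_entities(literal_indexes, type_index, cutoff="20110101"):
    """Evaluate the three AND-joined conditions of Formula 2 (as assumed above)."""
    # Range condition: scan the validUntil index. YYYYMMDD strings of equal
    # length compare in chronological order, so a plain string comparison works.
    expiring = set()
    for value, resources in literal_indexes["validUntil"].items():
        if value < cutoff:
            expiring |= resources
    # Equality conditions: direct lookups in the other two indexes.
    nonrefundable = literal_indexes["isNonrefundable"].get("true", set())
    insurances = type_index.get("Insurance", set())
    # AND-join: a resource must satisfy every condition.
    return expiring & nonrefundable & insurances

# Hypothetical index contents.
literal_indexes = {
    "validUntil": {"20101231": {"a1"}, "20151231": {"a2"}},
    "isNonrefundable": {"true": {"a1"}, "false": {"a2"}},
}
type_index = {"Insurance": {"a1", "a2"}, "Employee": {"e1"}}
restricted = find_restricted_entities(literal_indexes, type_index)
```

- With this sample data only a1 satisfies all three conditions, which corresponds to the entity obtained in step S24.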
- the search path generation unit 205 extracts x as a target resource from Formula 1 (step S31).
- the search path generation unit 205 generates the following search path between the target resource x and the limited resource i (step S32): [hasClient] [hasFamilyMember] [hasInsurance]
- the search path generation unit 205 appends the entity a1 of the restricted resource to the end of the search path, which is the position of the restricted resource, yielding [hasClient] [hasFamilyMember] [hasInsurance] [a1] (step S33). Further, the search path generation unit 205 adds the reserved word THIS to the head of the search path, which is the position of the target resource, yielding [THIS] [hasClient] [hasFamilyMember] [hasInsurance] [a1] (step S34).
- the search path generation unit 205 generates a phrase query that allows the inter-term distance 1 in consideration of an unspecified part based on the generated search path (step S35).
- This phrase query is expressed as follows, for example.
- FIG. 16 shows a conceptual representation of this phrase query.
- the search path generation unit 205 generates a phrase query in consideration of the order of appearance of terms in the search path and the number of elements that can be inserted into unspecified locations.
- the search path generation unit 205 generates a property query corresponding to x type Employee as a property restriction for the target resource x (step S36).
- the search path generation unit 205 AND-links the phrase query generated in step S35 and the property query generated in step S36, and returns it as a path query (step S37).
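- Steps S32 to S37 of this example can be sketched as the assembly of a small data structure. The dictionary layout used for the phrase query and the property query is a hypothetical representation chosen for this sketch, not the format used by the device, and only a single restricted resource is handled.

```python
THIS = "THIS"  # reserved word marking the target resource

def build_path_query(properties, restricted_entities, target_type=None, slop=1):
    """Assemble a path query for one restricted resource as plain data."""
    # S32: the property column between the target and the restricted resource.
    terms = [[prop] for prop in properties]
    # S33: OR-join the entity set at the restricted resource's position (the end).
    terms.append(sorted(restricted_entities))
    # S34: put the reserved word at the target resource's position (the head).
    terms.insert(0, [THIS])
    # S35: phrase query allowing up to `slop` unspecified elements between terms.
    clauses = [{"phrase": terms, "slop": slop}]
    # S36: property restriction on the target resource, if the condition has one.
    if target_type is not None:
        clauses.append({"property": "type", "value": target_type})
    # S37: AND-join all clauses into the path query.
    return {"and": clauses}

path_query = build_path_query(
    ["hasClient", "hasFamilyMember", "hasInsurance"], {"a1"}, target_type="Employee"
)
```

- Each inner list holds the alternatives allowed at one position, so an entity set with several members would appear as a multi-element list at the end of the phrase.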
- the search unit 204 searches the path index, the literal property index group, and the metadata index using this path query, and obtains e1 as the target resource (step S26).
- (Step S26-1) The search unit 204 searches the path index for resources including the term THIS at the head of the phrase query (b1, e1, c1, p1, a1, and a2 are found).
- (Step S26-2) The search unit 204 searches the path index for the next term, hasClient (b1 and e1 are found).
- (Step S26-3) The search unit 204 merges the results of steps S26-1 and S26-2 in units of resources (b1 and e1 remain).
- (Step S26-4) The search unit 204 searches the path index for the next term, hasFamilyMember (b1, e1, and c1 are found).
- (Step S26-5) The search unit 204 merges the results of steps S26-3 and S26-4 in units of resources (b1 and e1 remain).
- (Step S26-6) The search unit 204 searches the path index for the next term, hasInsurance (b1, e1, c1, and p1 are found).
- (Step S26-7) The search unit 204 merges the results of steps S26-5 and S26-6 in units of resources (b1 and e1 remain).
- (Step S26-8) The search unit 204 searches the path index for the next term, a1 (b1, e1, c1, and p1 are found).
- (Step S26-9) The search unit 204 merges the results of steps S26-7 and S26-8 in units of resources (b1 and e1 remain).
- (Step S26-10) The search unit 204 filters the resources b1 and e1 remaining after the merge based on the position information in their path fields (e1 remains as the final search result).
- the search unit 204 may execute the filtering process of step S26-10 after each of steps S26-1 to S26-9. Further, the search unit 204 may decide whether to execute this filtering process after each of steps S26-1 to S26-9 based on how the number of resources grows after each search or merge.
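- The merge-then-filter procedure of steps S26-1 to S26-10 can be sketched as below. The sketch assumes a path index of the form {term: {resource: [positions]}}, treats each path field as a single simplified path (real path fields concatenate several paths), and interprets "distance 1" as at most one element between consecutive query terms. All names and sample path fields are hypothetical.

```python
def build_index(path_fields):
    """Build {term: {resource: [1-based positions]}} from tokenized path fields."""
    index = {}
    for resource, terms in path_fields.items():
        for position, term in enumerate(terms, start=1):
            index.setdefault(term, {}).setdefault(resource, []).append(position)
    return index

def search_phrase(index, terms, slop=1):
    """Return resources whose path field contains `terms` in order, with at most
    `slop` other elements between consecutive terms."""
    # Merge (S26-1 to S26-9): keep resources whose path field has every term.
    candidates = None
    for term in terms:
        resources = set(index.get(term, {}))
        candidates = resources if candidates is None else candidates & resources
    # Filter (S26-10): check appearance order and gaps using position information.
    hits = set()
    for resource in candidates or set():
        reachable = set(index[terms[0]][resource])
        for term in terms[1:]:
            reachable = {p for p in index[term][resource]
                         if any(1 <= p - q <= slop + 1 for q in reachable)}
            if not reachable:
                break
        if reachable:
            hits.add(resource)
    return hits

# Hypothetical, simplified path fields.
path_fields = {
    "b1": ["THIS", "hasEmployee", "e1", "hasClient", "c1",
           "hasFamilyMember", "p1", "hasInsurance", "a1"],
    "e1": ["THIS", "hasClient", "c1", "hasFamilyMember", "p1", "hasInsurance", "a1"],
    "c1": ["THIS", "hasFamilyMember", "p1", "hasInsurance", "a1"],
}
index = build_index(path_fields)
query = ["THIS", "hasClient", "hasFamilyMember", "hasInsurance", "a1"]
result = search_phrase(index, query, slop=1)
```

- In this sample, c1 drops out during the merge because its path field lacks hasClient, and b1 drops out during filtering because two elements separate THIS from hasClient. Raising `slop` to 2 would let b1 through, which illustrates why the allowed gap has to match the search condition.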
- the search unit 204 that has obtained the resource e1 as a search result acquires the subgraph of the resource e1 from the model data DB 207 and outputs it to the input / output unit 206 (step S27).
- the expression format of the search condition requested by the client device 13 is not limited to a pseudo SQL statement such as Formula 1, and may be, for example, information representing the RDF graph itself or an RDF query language such as SPARQL Protocol and RDF Query Language (SPARQL).
- the information retrieval system can retrieve the target resource at high speed even if the knowledge information model becomes complicated.
- this is because the vocabulary constituting the knowledge information model serves as the index words of the path index, so even if the knowledge information model is complicated, the number of index words, which affects the search speed, can be kept to the order of the vocabulary size of the knowledge information model.
- the information search system according to the second embodiment of the present invention can search at a substantially constant speed, with almost no slowdown, even when the series of resources included in the search request becomes long.
- the data structure of the path index in the second embodiment of the present invention is suitable as a data structure for storing a knowledge information model to be searched for a target resource.
- this is because the path index stores the posting list in association with the terms constituting the knowledge information model, so the knowledge information model can be stored while suppressing the number of index words. In addition, with this data structure, resources having a path field that includes the terms constituting a path query are first searched from the path index and then filtered using the position information, whereby the target resource can be searched at high speed.
- the information retrieval system as the second embodiment of the present invention can reduce the resource consumption for storing the knowledge information model.
- the data structure necessary for storing the knowledge information model is only the model data DB and the index repository. Furthermore, since the subgraph in the model data DB is not used for searching but is used when presenting a search result, it can be stored in a compressed format.
- the path index can suppress the number of index words, which is a factor for determining the index size, to the order of the number of vocabularies instead of the number of paths.
- the literal property index group and the metadata index group are sufficiently smaller than the path index, and the information to be stored is only the term and the resource ID, so the resource consumption is small. Therefore, both the model data DB and the index repository can be reduced to a small scale, and the consumption of resources such as storage devices can be reduced.
- FIG. 17 shows a functional block configuration of an information retrieval system 3 as a third embodiment of the present invention.
- the same components as those of the second embodiment of the present invention are denoted by the same reference numerals and detailed description thereof will be omitted.
- the information search system 3 is different from the information search system 2 according to the second embodiment of the present invention in that the information search system 3 includes an information search device 31 having a path field generation unit 301 instead of the path field generation unit 201.
- the path field generation unit 301 is different from the path field generation unit 201 in that a complete path is used instead of a suffix path as a path that can be traced from each resource.
- the complete path is a path obtained by connecting a path from a root resource to a starting resource to a suffix path from the starting resource.
- FIG. 18 shows the path field of the resource e1.
- 10 paths starting from e1 are represented by a complete path and connected.
- the information search system 3 as the third embodiment of the present invention executes index generation processing, search processing, and path query generation processing. Of these, the index generation process differs from that of the second embodiment.
- the index generation process of the information search system 3 will be described with reference to FIG.
- in the index generation process, the information search system 3 executes step S42, which generates a path field using complete paths, in place of step S12 of the index generation processing of the information search system 2 as the second embodiment of the present invention shown in FIG.
- as an example of step S42, generation of the path field for the resource e1 of the knowledge information model shown in FIG. 7 will be described.
- the path field generation unit 301 extracts ten paths starting from the resource e1. Then, the path field generation unit 301 represents each path as a complete path obtained by concatenating the path from the root resource to the resource e1, [b1] [hasEmployee], with the suffix path from the resource e1, and concatenates these complete paths to form the path field of the resource e1.
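- The construction of a path field from complete paths can be sketched as follows. The sketch assumes that each suffix path is a list of terms beginning with the reserved word THIS in place of the starting resource, which is an assumption of this example rather than something stated here; the function names and the two sample suffix paths are hypothetical (the text speaks of ten paths for e1).

```python
def complete_paths(root_to_start, suffix_paths):
    """Prefix every suffix path with the path from the root to the start resource."""
    return [list(root_to_start) + list(suffix) for suffix in suffix_paths]

def build_path_field(root_to_start, suffix_paths):
    """Concatenate all complete paths into one flat path field."""
    field = []
    for path in complete_paths(root_to_start, suffix_paths):
        field.extend(path)
    return field

# Path from the root resource to e1, as given in the text.
root_to_e1 = ["b1", "hasEmployee"]
# Two hypothetical suffix paths starting from e1.
suffixes = [["THIS", "type", "Employee"], ["THIS", "hasClient", "c1"]]
path_field_e1 = build_path_field(root_to_e1, suffixes)
```

- Because the terms that precede the starting resource are now part of its path field, a phrase query can place THIS at the end of a search path as well as at the head.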
- the information search device 31 executes steps S11 and S13 to S15 in the same manner as the index generation process in the second embodiment of the present invention, and ends the index generation process.
- the information search system as the third embodiment of the present invention can improve the search function for the knowledge information model.
- the reason is that using a path field in which complete paths are concatenated makes it possible to handle a search request that traces the path from the restricted resource to the target resource, such as “an employee working for a company with sales of 500 million yen or more”.
- a functional block configuration of an information search system 4 as a fourth embodiment of the present invention will be described with reference to FIG. 20.
- the same components as those of the second embodiment of the present invention are denoted by the same reference numerals and detailed description thereof will be omitted.
- the information search system 4 is different from the information search system 2 according to the second embodiment of the present invention in that an information search device 41 further including an index update unit 409 is provided.
- the index update unit 409 acquires from the knowledge information repository 22 a resource, a property related to the resource, and a resource or literal that is an object of the property. Then, the index update unit 409 compares the subgraph registered in the model data DB 207, and identifies the changed, added, or deleted resource as a difference resource. Then, the index update unit 409 updates information related to the differential resource among the information stored in the index repository 203 and the model data DB 207.
- the information search system 4 performs an index update process.
- the index update process of the information search system 4 will be described with reference to FIG.
- the index update unit 409 acquires the resource, the property related to the resource, and the resource or literal of the object from the knowledge information repository 22. Then, the index updating unit 409 compares the acquired information with the subgraph already registered in the model data DB 207, and temporarily stores the changed or added difference resource (step S51).
- the index updating unit 409 identifies a resource that is registered in the model data DB 207 but does not exist in the knowledge information repository 22, and temporarily stores it as a difference resource for deletion (step S52).
- the index update unit 409 deletes all the difference resources and their subgraphs from the model data DB 207 (step S53).
- the index update unit 409 deletes information related to all the difference resources from the index repository 203 (step S54). Specifically, the index update unit 409 deletes the tuple corresponding to the difference resource from the path index. In addition, the index update unit 409 deletes the corresponding information from the posting list including information on the difference resource. In addition, the index update unit 409 deletes the tuple related to the difference resource from the literal property index group and the metadata index group.
- the index update unit 409 executes the following processing for each changed or added difference resource.
- the index update unit 409 uses the path field generation unit 201 to generate a path field of this differential resource (step S55).
- the index update unit 409 uses the tokenize unit 212 to tokenize the path field generated in step S55 based on the terms in the model obtained from the knowledge information repository 22 (step S56).
- the index update unit 409 registers information related to the difference resource in the index repository 203 (step S57). Specifically, the index updating unit 409 generates a posting list of the difference resource based on the tokenized path field and registers it in the path index. Also, the index update unit 409 adds information including the difference resource and the position information of the term in the path field to the posting list of each term included in the path field of the difference resource. In addition, if the difference resource is a resource having a literal property or a special property, the index update unit 409 also registers in the literal property index group or the metadata index group.
- the index update unit 409 registers the difference resource and a subgraph having a predetermined depth from the resource in the model data DB 207 (step S58).
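- The difference detection of steps S51 and S52 and the deletion of step S54 can be sketched as below. The sketch assumes that both the knowledge information repository and the model data DB can be viewed as mappings from a resource ID to a set of (property, object) pairs, and that the path index has the form {term: {resource: [positions]}}; these layouts, the function names, and the sample data are hypothetical.

```python
def diff_resources(repository, model_data_db):
    """Return (changed_or_added, deleted) resource ID sets.

    A resource is changed or added when its depth-1 subgraph in the repository
    differs from, or is missing in, the model data DB (S51). It is deleted when
    it is registered in the model data DB but absent from the repository (S52).
    """
    changed_or_added = {r for r, sub in repository.items()
                        if model_data_db.get(r) != sub}
    deleted = set(model_data_db) - set(repository)
    return changed_or_added, deleted

def remove_from_path_index(path_index, resources):
    """Delete every posting that refers to one of `resources` (part of S54)."""
    for term in list(path_index):
        postings = path_index[term]
        for resource in resources:
            postings.pop(resource, None)
        if not postings:
            # Drop index words that no longer occur in any path field.
            del path_index[term]

# Hypothetical current model and previously registered subgraphs.
repository = {
    "e1": {("type", "Employee")},
    "c1": {("type", "Person"), ("contact", "x")},
    "p2": {("type", "Person")},
}
model_data_db = {
    "e1": {("type", "Employee")},
    "c1": {("type", "Person")},
    "a2": {("type", "Insurance")},
}
changed_or_added, deleted = diff_resources(repository, model_data_db)

path_index = {"type": {"e1": [2], "c1": [2]}, "a2": {"a2": [1]}}
remove_from_path_index(path_index, changed_or_added | deleted)
```

- After the stale postings are removed, the changed or added resources would be re-indexed through the usual path field generation and tokenization (steps S55 to S58), so only the difference resources are touched.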
- the information search system can more efficiently perform the index repository update process for searching for the target resource at higher speed from the knowledge information.
- the reason is that by comparing the knowledge information repository and the model data DB, it is possible to specify the changed part of the knowledge information model, and update the index repository and the model data DB only for the specified changed part.
- the number of index words in the index repository and the model data DB is limited to the order of the number of vocabularies in the model, so that the update time for reflecting the changed portion in the index repository and the model data DB can be shortened.
- the information retrieval system has been described as retrieving a target resource from a knowledge information model represented by an RDF graph as graph structure information.
- the present invention is also applicable to a case where a target node is searched from other graph structure information represented by nodes and edges connecting the nodes.
- the operation of the information search apparatus described with reference to each flowchart is stored in a storage device (storage medium) of the information search apparatus as a computer program of the present invention.
- the CPU may read and execute the computer program.
- the present invention is constituted by the code of the computer program or a storage medium.
- when the CPU executes the computer program, the processes of the path field generation unit, the index generation unit, the index update unit, the tokenize unit, the search unit, and the search path generation unit shown in the functional block configuration diagram of each embodiment described above are realized.
- the graph structure storage device and the knowledge information repository may be configured as a local file system by a storage device of a computer that constitutes the information search device.
- the client device may be realized on the same computer by an application stored in a storage device of the computer constituting the information search device.
- the information search apparatus may acquire a search request from the user via the input / output device instead of acquiring the search request from the client apparatus.
- the information search apparatus may present the search result to the user via the input / output device.
- An information search apparatus for searching for a target node satisfying a search condition from graph structure information having a plurality of nodes and edges connecting between the nodes as elements, For each node included in the graph structure information, a path field generation unit that extracts a path that is a sequence of the elements that can be traced from the node and generates a path field that connects the extracted paths for each node.
- a posting list which is a list of information including a node having a path field including the element and position information indicating a position where the element appears in the path field is generated.
- An index generation unit that generates an index repository that associates the element with the posting list;
- a search path generation unit that generates a search path representing the search condition as a column of the elements;
- a node having a path field including each element included in the search path is searched from the index repository, and a node having a path field satisfying the appearance order of elements in the search path among the searched nodes is
- An information retrieval apparatus comprising:
- the path field generation unit generates the path field by representing each path that can be traced from each node as a sequence of elements from a root node in the graph structure information.
- the information search device according to appendix 1 or appendix 2.
- the system further comprises an index update unit that acquires difference information representing an element changed in the graph structure information and updates the index repository by controlling the path field generation unit and the index generation unit based on the difference information.
- the information search device according to any one of appendix 1 to appendix 3, wherein
- Appendix 5 A subgraph storage unit for extracting a subgraph having a predetermined depth starting from each node from the graph structure information, and storing the extracted subgraph;
- a search result presentation unit for presenting a subgraph starting from the target node searched by the search unit;
- a posting list that is a list of information including position information indicating a position where the element appears in the path field; A data structure that stores and associates.
- the position information included in the posting list represents a position in a path field in which each path that can be traced from each node is represented by a row of elements from a root node in the graph structure information and connected.
- a graph structure information storage device storing graph structure information having a plurality of nodes and edges connecting between the nodes as elements;
- a client device that requests a search for a target node that satisfies a search condition from the graph structure information;
- An information retrieval device for retrieving the target node from the graph structure information;
- An information retrieval system comprising The information search device includes: For each node included in the graph structure information, a path that is a sequence of the elements that can be traced from the node is extracted, and a path field that connects the extracted paths is generated for each node.
- a posting list which is a list of information including a node having a path field including the element and position information indicating a position where the element appears in the path field is generated.
- An index generation unit that generates an index repository that associates the element with the posting list;
- a search path generation unit that generates a search path representing the search condition as a column of the elements;
- a node having a path field including each element included in the search path is searched from the index repository, and a node having a path field satisfying the appearance order of elements in the search path among the searched nodes is A search unit for searching for the target node by extracting based on position information;
- An information retrieval system having
- a computer program for controlling the operation of an information retrieval apparatus that retrieves a target node satisfying a retrieval condition from graph structure information having a plurality of nodes and edges connecting the nodes as elements, For each node included in the graph structure information, a path field generation process that extracts a path that is a sequence of the elements that can be traced from the node and generates a path field that connects the extracted paths for each node.
- a posting list that is a list of information including a node having a path field including the element and position information indicating a position where the element appears in the path field is generated.
- An index generation process for generating an index repository in which the element and the posting list are associated with each other;
- a search path generation process for generating a search path representing the search condition as a column of the elements;
- a node having a path field including each element included in the search path is searched from the index repository, and a node having a path field satisfying the appearance order of the elements in the search path among the searched nodes.
- a computer program that causes a computer to execute.
- An information search method in which a graph structure information storage device stores graph structure information whose elements are a plurality of nodes and edges connecting the nodes;
- an information search device extracts, for each node included in the graph structure information, paths, each being a sequence of the elements that can be traced starting from that node, generates, for each node, a path field in which the extracted paths are concatenated, generates, for each element constituting the graph structure information, a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field, and stores the elements and the posting lists in an index repository in association with each other;
- a client device requests the information search device to search the graph structure information for a target node satisfying a search condition; and
- the information search device generates a search path representing the search condition as a sequence of the elements, retrieves from the index repository nodes having a path field that contains every element of the search path, and finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.
- An information search method in which an information search device that searches graph structure information, whose elements are a plurality of nodes and edges connecting the nodes, for a target node satisfying a search condition: extracts, for each node included in the graph structure information, paths, each being a sequence of the elements that can be traced starting from that node; generates, for each node, a path field in which the extracted paths are concatenated; generates, for each element constituting the graph structure information, a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field;
- stores the elements and the posting lists in an index repository in association with each other; generates a search path representing the search condition as a sequence of the elements; retrieves from the index repository nodes having a path field that contains every element of the search path; and finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.
- The information search system according to Supplementary Note 8, wherein, when part of the search path is unspecified, the search unit of the information search device searches for the target node based on the order in which the elements appear in the search path and on the number of elements that can be inserted at the unspecified part.
- The information search method according to Supplementary Note 10 or 11, wherein, when searching for the target node, if part of the search path is unspecified, the information search device searches for the target node based on the order in which the elements appear in the search path and on the number of elements that can be inserted at the unspecified part.
- The present invention can provide an information search device that finds a target node at high speed even when the graph structure information becomes complex, and is suitable as a knowledge information search device that searches a large-scale knowledge information model for a target resource.
Abstract
Description
SELECT r.resourceName
FROM path AS p, resource AS r
WHERE p.pathID = r.pathID
AND p.pathexp = '#title<#paints'
SELECT t1.object
FROM triple AS t1, triple AS t2, triple AS t3, triple AS t4
WHERE t1.predicate = 'paints'
AND t1.subject = t2.subject
AND t2.predicate = 'first'
AND t2.object = 'Picasso'
AND t1.subject = t3.subject
AND t3.predicate = 'last'
AND t3.object = 'Pablo'
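A search path representing the search condition is generated as a sequence of the elements, and nodes having a path field that contains every element of the search path are retrieved from the index repository;
the target node is then found by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.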
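The flow just described (one posting list per element, candidate lookup, then position-based filtering) can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the names `build_index` and `search`, the toy path fields, and the simplification that "appearance order" means strictly increasing positions are all assumptions.

```python
from collections import defaultdict


def build_index(path_fields):
    """Map each element (term) to its posting list: {node: [positions]}."""
    index = defaultdict(dict)
    for node, field in path_fields.items():
        for pos, term in enumerate(field):
            index[term].setdefault(node, []).append(pos)
    return index


def search(index, search_path):
    """Return nodes whose path field contains the search path terms in order."""
    # Step 1: nodes whose path field contains every term of the search path.
    candidates = None
    for term in search_path:
        nodes = set(index.get(term, {}))
        candidates = nodes if candidates is None else candidates & nodes
    # Step 2: keep nodes where the terms occur at increasing positions.
    hits = []
    for node in sorted(candidates or ()):
        prev = -1
        for term in search_path:
            later = [p for p in index[term][node] if p > prev]
            if not later:
                break
            prev = later[0]  # positions are stored in ascending order
        else:
            hits.append(node)
    return hits


# Hypothetical path fields: edge labels "a" and "b", node IDs n1..n4.
path_fields = {
    "n1": ["a", "n2", "b", "n3"],
    "n2": ["b", "n3"],
    "n4": ["b", "n3", "a", "n2"],
}
index = build_index(path_fields)
result = search(index, ["a", "b"])
```

Node n4 contains both terms but in the wrong order, so it survives the lookup step and is removed only by the position check.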
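FIG. 1 shows the hardware configuration of an information search system 1 according to the first embodiment of the present invention. In FIG. 1, the information search system 1 includes an information search device 11, a graph structure information storage device 12, and a client device 13. The information search device 11, the graph structure information storage device 12, and the client device 13 are connected so that they can communicate with one another.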
…list are stored in association with each other.
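This is because the number of index terms, which determines the size of the index repository, need only be on the order of the number of elements constituting the graph structure information.
Next, a second embodiment of the present invention will be described.
Here, the input/output unit 206 first acquires the search condition for the target resource from the client device 13 (Yes in step S21).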
[Expression 1]
Select x; Where x type Employee, x hasClient y, y hasFamilyMember z, z hasInsurance i, i validUntil < 20110101, i isNonrefundable true, i type Insurance;
Next, the search unit 204 extracts the variable i from Expression 1 as the restriction resource (step S22).
[Expression 2]
Select i; Where i validUntil < 20110101, i isNonrefundable true, i type Insurance;
Next, using Expression 2, the search unit 204 searches the literal property index group and the metadata index group and obtains the resource ID a1 as the entity of the restriction resource (step S24).
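The lookup in step S24 can be pictured as an intersection of per-property lookups. The sketch below is only an illustration: the dictionary layout of the literal property and metadata indexes, the function name `find_restriction_resources`, and every stored value are assumptions; only the resource IDs a1, a2, and e1 and the three conditions of Expression 2 come from the text.

```python
# Hypothetical index contents (all values invented for illustration).
literal_property_index = {
    "validUntil": {"a1": 20101231, "a2": 20120331},
    "isNonrefundable": {"a1": True, "a2": True},
}
metadata_index = {
    "type": {"a1": "Insurance", "a2": "Insurance", "e1": "Employee"},
}


def find_restriction_resources(literal_idx, meta_idx):
    """Evaluate: i validUntil < 20110101, i isNonrefundable true, i type Insurance."""
    by_type = {r for r, t in meta_idx["type"].items() if t == "Insurance"}
    by_date = {r for r, v in literal_idx["validUntil"].items() if v < 20110101}
    by_flag = {r for r, v in literal_idx["isNonrefundable"].items() if v is True}
    return sorted(by_type & by_date & by_flag)


restriction_ids = find_restriction_resources(literal_property_index, metadata_index)
```

With these assumed values only a1 satisfies all three conditions, which matches the outcome stated for step S24.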
[hasClient][hasFamilyMember][hasInsurance]
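is generated (step S32).
[hasClient][hasFamilyMember][hasInsurance][a1]
is set as the search path (step S33).
Furthermore, the search path generation unit 205 adds the reserved word THIS to the head of the generated search path, which is the position of the target resource, so that the search path becomes
[THIS][hasClient][hasFamilyMember][hasInsurance][a1]
(step S34).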
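Steps S32 to S34 can be sketched as below. The helper names `predicate_chain` and `build_search_path` are hypothetical, and the sketch assumes that the resource-to-resource triple patterns of Expression 1 form a single chain from the target variable x to the restriction variable i.

```python
def predicate_chain(triples, target_var, restriction_var):
    """Follow the triple patterns from the target variable to the restriction variable."""
    next_hop = {s: (p, o) for s, p, o in triples}
    chain, var = [], target_var
    while var != restriction_var:
        predicate, var = next_hop[var]
        chain.append(predicate)
    return chain


def build_search_path(chain, restriction_id, reserved_word="THIS"):
    path = list(chain)              # step S32: sequence of predicates
    path.append(restriction_id)     # step S33: append the restriction resource ID
    path.insert(0, reserved_word)   # step S34: put THIS at the target position
    return path


# Resource-to-resource patterns taken from Expression 1.
triples = [
    ("x", "hasClient", "y"),
    ("y", "hasFamilyMember", "z"),
    ("z", "hasInsurance", "i"),
]
chain = predicate_chain(triples, "x", "i")
search_path = build_search_path(chain, "a1")
```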
THIS.{0,1}hasClient.{0,1}hasFamilyMember.{0,1}hasInsurance.{0,1}a1
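Here, .{0,1} indicates that at most one other term may be inserted at that position. FIG. 16 shows this phrase query conceptually. In this way, the search path generation unit 205 generates the phrase query taking into account both the order in which terms appear in the search path and the number of elements that can be inserted at unspecified positions.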
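A phrase query with bounded gaps can be illustrated as follows. This is a sketch under assumptions: `to_phrase_query` and `matches` are hypothetical names, and the two path fields are invented layouts in which edge labels and resource IDs alternate, which is one plausible reason for allowing one inserted term between consecutive query terms.

```python
def to_phrase_query(search_path, max_gap=1):
    """Join the terms with a '.{0,max_gap}' gap marker."""
    return (".{0,%d}" % max_gap).join(search_path)


def matches(path_field, search_path, max_gap=1):
    """True if the terms occur in order with at most max_gap terms between neighbours."""
    if not search_path:
        return True

    def extend(start, i):
        if i == len(search_path):
            return True
        stop = min(start + max_gap + 1, len(path_field))
        for pos in range(start, stop):
            if path_field[pos] == search_path[i] and extend(pos + 1, i + 1):
                return True
        return False

    return any(
        path_field[p] == search_path[0] and extend(p + 1, 1)
        for p in range(len(path_field))
    )


search_path = ["THIS", "hasClient", "hasFamilyMember", "hasInsurance", "a1"]
query = to_phrase_query(search_path)

# Hypothetical path fields; "hasEmployee" is an invented edge label.
field_e1 = ["THIS", "hasClient", "c1", "hasFamilyMember", "p1", "hasInsurance", "a1"]
field_b1 = ["THIS", "hasEmployee", "e1", "hasClient", "c1",
            "hasFamilyMember", "p1", "hasInsurance", "a1"]
```

In the assumed field for b1, two terms sit between THIS and hasClient, so the gap limit of one rejects it even though every query term is present.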
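(Step S26-1): The search unit 204 searches the path index for resources containing THIS, the first term of the phrase query (b1, e1, c1, p1, a1, and a2 are found).
(Step S26-2): The search unit 204 searches the path index for the next term, hasClient (b1 and e1 are found).
(Step S26-3): The search unit 204 merges the results of steps S26-1 and S26-2 per resource (b1 and e1 remain).
(Step S26-4): The search unit 204 searches the path index for the next term, hasFamilyMember (b1, e1, and c1 are found).
(Step S26-5): The results of steps S26-3 and S26-4 are merged per resource (b1 and e1 remain).
(Step S26-6): The search unit 204 searches the path index for the next term, hasInsurance (b1, e1, c1, and p1 are found).
(Step S26-7): The search unit 204 merges the results of steps S26-5 and S26-6 per resource (b1 and e1 remain).
(Step S26-8): The search unit 204 searches the path index for the next term, a1 (b1, e1, c1, and p1 are found).
(Step S26-9): The search unit 204 merges the results of steps S26-7 and S26-8 per resource (b1 and e1 remain).
(Step S26-10): The search unit 204 filters the merged resources b1 and e1 based on the position information of their path fields (e1 remains as the final search result).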
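The merge-then-filter sequence of steps S26-1 to S26-10 can be sketched on posting lists as below. The path fields are hypothetical (including the edge label hasEmployee); they were chosen only so that each term lookup returns the resources listed in the steps. The function names are likewise assumptions.

```python
# Hypothetical path fields, chosen to reproduce the per-term results in the steps.
path_fields = {
    "b1": ["THIS", "hasEmployee", "e1", "hasClient", "c1",
           "hasFamilyMember", "p1", "hasInsurance", "a1"],
    "e1": ["THIS", "hasClient", "c1", "hasFamilyMember", "p1", "hasInsurance", "a1"],
    "c1": ["THIS", "hasFamilyMember", "p1", "hasInsurance", "a1"],
    "p1": ["THIS", "hasInsurance", "a1"],
    "a1": ["THIS"],
    "a2": ["THIS"],
}

# Path index: term -> {resource: [positions]}.
postings = {}
for res, field in path_fields.items():
    for pos, term in enumerate(field):
        postings.setdefault(term, {}).setdefault(res, []).append(pos)


def merge_by_resource(postings, terms):
    """Steps S26-1 to S26-9: intersect the posting lists term by term."""
    survivors = set(postings.get(terms[0], {}))
    for term in terms[1:]:
        survivors &= set(postings.get(term, {}))
    return survivors


def position_filter(postings, terms, resources, max_gap=1):
    """Step S26-10: keep resources whose positions fit the phrase query gaps."""
    kept = []
    for res in sorted(resources):
        frontier = set(postings[terms[0]][res])
        for term in terms[1:]:
            frontier = {
                p for p in postings[term][res]
                if any(0 < p - q <= max_gap + 1 for q in frontier)
            }
            if not frontier:
                break
        if frontier:
            kept.append(res)
    return kept


terms = ["THIS", "hasClient", "hasFamilyMember", "hasInsurance", "a1"]
merged = merge_by_resource(postings, terms)
final = position_filter(postings, terms, merged)
```

The set intersection needs only resource IDs; positions are consulted solely for the resources that survive it.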
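Next, a third embodiment of the present invention will be described with reference to the drawings.
This is because using a path field in which complete paths are concatenated makes it possible to handle a search request that looks for a path from a restriction resource to the target resource, such as "employees who work for companies with sales of 500 million yen or more".
Next, a fourth embodiment of the present invention will be described in detail with reference to the drawings.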
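An information search device that searches graph structure information, whose elements are a plurality of nodes and edges connecting the nodes, for a target node satisfying a search condition, the device comprising:
a path field generation unit that, for each node included in the graph structure information, extracts paths, each being a sequence of the elements that can be traced starting from that node, and generates, for each node, a path field in which the extracted paths are concatenated;
an index generation unit that, for each element constituting the graph structure information, generates a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field, and generates an index repository in which the elements are associated with the posting lists;
a search path generation unit that generates a search path representing the search condition as a sequence of the elements; and
a search unit that retrieves, from the index repository, nodes having a path field that contains every element of the search path, and finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.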
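Path field generation, as performed by the path field generation unit, might look like the sketch below. It rests on assumptions: paths are enumerated depth-first up to leaf nodes, cycles are cut by tracking visited nodes, and paths are concatenated without separators. The function names and the edge label hasPet are hypothetical.

```python
def paths_from(node, out_edges):
    """Enumerate maximal paths (edge label, node, edge label, node, ...) from a node."""
    paths = []

    def walk(current, prefix, visited):
        steps = [(p, o) for p, o in out_edges.get(current, []) if o not in visited]
        if not steps:
            if prefix:
                paths.append(prefix)
            return
        for predicate, obj in steps:
            walk(obj, prefix + [predicate, obj], visited | {obj})

    walk(node, [], {node})
    return paths


def path_field(node, out_edges):
    """Concatenate every path traceable from the node into one sequence."""
    field = []
    for path in paths_from(node, out_edges):
        field.extend(path)
    return field


# Hypothetical graph: adjacency list of (edge label, target node).
out_edges = {
    "e1": [("hasClient", "c1")],
    "c1": [("hasFamilyMember", "p1"), ("hasPet", "d1")],
}
field_e1 = path_field("e1", out_edges)
```

Because shared prefixes are repeated in every path, the field can grow quickly on densely connected graphs; the number of distinct terms, however, stays bounded by the number of graph elements.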
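The information search device according to Supplementary Note 1, wherein, when part of the search path is unspecified, the search unit searches for the target node based on the order in which the elements appear in the search path and on the number of elements that can be inserted at the unspecified part.
The information search device according to Supplementary Note 1 or 2, wherein the path field generation unit generates the path field by expressing each path that can be traced starting from each node as a sequence of the elements from the root node of the graph structure information.
The information search device according to any one of Supplementary Notes 1 to 3, further comprising an index update unit that acquires difference information representing elements changed in the graph structure information and updates the index repository by controlling the path field generation unit and the index generation unit based on the difference information.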
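The information search device according to any one of Supplementary Notes 1 to 4, further comprising:
a subgraph storage unit that extracts, from the graph structure information, a subgraph of a predetermined depth starting from each node and stores the extracted subgraph; and
a search result presentation unit that presents the subgraph starting from the target node found by the search unit.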
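Extracting a subgraph of a predetermined depth from a node can be sketched with a breadth-first traversal. The triple representation, the function name `extract_subgraph`, and the sample edges are assumptions made for illustration.

```python
from collections import deque


def extract_subgraph(edges, start, depth):
    """Return the (subject, predicate, object) triples within `depth` hops of `start`."""
    out = {}
    for s, p, o in edges:
        out.setdefault(s, []).append((p, o))

    seen = {start}
    queue = deque([(start, 0)])
    subgraph = []
    while queue:
        node, d = queue.popleft()
        if d == depth:
            continue  # do not expand beyond the requested depth
        for p, o in out.get(node, []):
            subgraph.append((node, p, o))
            if o not in seen:
                seen.add(o)
                queue.append((o, d + 1))
    return subgraph


# Hypothetical edges.
edges = [
    ("e1", "hasClient", "c1"),
    ("c1", "hasFamilyMember", "p1"),
    ("p1", "hasInsurance", "a1"),
]
sub = extract_subgraph(edges, "e1", 2)
```

Storing such subgraphs ahead of time lets a result be presented with its surrounding context without traversing the full graph at query time.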
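A data structure for storing graph structure information whose elements are a plurality of nodes and edges connecting the nodes, the data structure storing, in association with each other:
each of the elements; and
a posting list generated for each element, the posting list being a list of entries each consisting of a node whose path field contains the element, among the path fields of the nodes each formed by concatenating paths that are sequences of the elements traceable starting from that node, and position information indicating where the element appears in that path field.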
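One way to picture this data structure is a mapping from each element to its posting list. The class and method names below (`Posting`, `IndexRepository`, `add`, `posting_list`) are hypothetical, and an in-memory dictionary stands in for whatever storage an actual index repository would use.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Posting:
    node: str              # node whose path field contains the element
    positions: List[int]   # positions of the element within that path field


@dataclass
class IndexRepository:
    entries: Dict[str, List[Posting]] = field(default_factory=dict)

    def add(self, element: str, node: str, position: int) -> None:
        """Record that `element` occurs at `position` in the path field of `node`."""
        plist = self.entries.setdefault(element, [])
        for posting in plist:
            if posting.node == node:
                posting.positions.append(position)
                return
        plist.append(Posting(node, [position]))

    def posting_list(self, element: str) -> List[Posting]:
        return self.entries.get(element, [])


repo = IndexRepository()
repo.add("hasClient", "e1", 1)
repo.add("hasClient", "b1", 3)
repo.add("hasClient", "e1", 7)
```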
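The data structure according to Supplementary Note 6, wherein the position information included in the posting list indicates a position in a path field formed by concatenating the paths that can be traced starting from each node, each path being expressed as a sequence of the elements from the root node of the graph structure information.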
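An information search system comprising:
a graph structure information storage device that stores graph structure information whose elements are a plurality of nodes and edges connecting the nodes;
a client device that requests a search of the graph structure information for a target node satisfying a search condition; and
an information search device that searches the graph structure information for the target node,
wherein the information search device includes:
a path field generation unit that, for each node included in the graph structure information, extracts paths, each being a sequence of the elements that can be traced starting from that node, and generates, for each node, a path field in which the extracted paths are concatenated;
an index generation unit that, for each element constituting the graph structure information, generates a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field, and generates an index repository in which the elements are associated with the posting lists;
a search path generation unit that generates a search path representing the search condition as a sequence of the elements; and
a search unit that retrieves, from the index repository, nodes having a path field that contains every element of the search path, and finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.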
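A computer program for controlling the operation of an information search device that searches graph structure information, whose elements are a plurality of nodes and edges connecting the nodes, for a target node satisfying a search condition, the program causing a computer to execute:
a path field generation process that, for each node included in the graph structure information, extracts paths, each being a sequence of the elements that can be traced starting from that node, and generates, for each node, a path field in which the extracted paths are concatenated;
an index generation process that, for each element constituting the graph structure information, generates a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field, and generates an index repository in which the elements are associated with the posting lists;
a search path generation process that generates a search path representing the search condition as a sequence of the elements; and
a search process that retrieves, from the index repository, nodes having a path field that contains every element of the search path, and finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.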
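An information search method in which:
a graph structure information storage device stores graph structure information whose elements are a plurality of nodes and edges connecting the nodes;
an information search device
extracts, for each node included in the graph structure information, paths, each being a sequence of the elements that can be traced starting from that node,
generates, for each node, a path field in which the extracted paths are concatenated,
generates, for each element constituting the graph structure information, a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field, and
stores the elements and the posting lists in an index repository in association with each other;
a client device
requests the information search device to search the graph structure information for a target node satisfying a search condition; and
the information search device
generates a search path representing the search condition as a sequence of the elements,
retrieves from the index repository nodes having a path field that contains every element of the search path, and
finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.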
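An information search method in which an information search device that searches graph structure information, whose elements are a plurality of nodes and edges connecting the nodes, for a target node satisfying a search condition:
extracts, for each node included in the graph structure information, paths, each being a sequence of the elements that can be traced starting from that node;
generates, for each node, a path field in which the extracted paths are concatenated;
generates, for each element constituting the graph structure information, a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field;
stores the elements and the posting lists in an index repository in association with each other;
generates a search path representing the search condition as a sequence of the elements;
retrieves from the index repository nodes having a path field that contains every element of the search path; and
finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.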
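The information search system according to Supplementary Note 8, wherein, when part of the search path is unspecified, the search unit of the information search device searches for the target node based on the order in which the elements appear in the search path and on the number of elements that can be inserted at the unspecified part.
The computer program according to Supplementary Note 9, wherein, in the search process, when part of the search path is unspecified, the target node is searched for based on the order in which the elements appear in the search path and on the number of elements that can be inserted at the unspecified part.
The information search method according to Supplementary Note 10 or 11, wherein, when searching for the target node, if part of the search path is unspecified, the information search device searches for the target node based on the order in which the elements appear in the search path and on the number of elements that can be inserted at the unspecified part.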
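11, 21, 31, 41 Information search device
12 Graph structure information storage device
13 Client device
22 Knowledge information repository
101, 201, 301 Path field generation unit
102, 202 Index generation unit
103, 203 Index repository
104, 204 Search unit
105, 205 Search path generation unit
206 Input/output unit
207 Model data DB
212 Tokenization unit
409 Index update unit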
Claims (10)
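- An information search device that searches graph structure information, whose elements are a plurality of nodes and edges connecting the nodes, for a target node satisfying a search condition, the device comprising:
a path field generation unit that, for each node included in the graph structure information, extracts paths, each being a sequence of the elements that can be traced starting from that node, and generates, for each node, a path field in which the extracted paths are concatenated;
an index generation unit that, for each element constituting the graph structure information, generates a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field, and generates an index repository in which the elements are associated with the posting lists;
a search path generation unit that generates a search path representing the search condition as a sequence of the elements; and
a search unit that retrieves, from the index repository, nodes having a path field that contains every element of the search path, and finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.
- The information search device according to claim 1, wherein, when part of the search path is unspecified, the search unit searches for the target node based on the order in which the elements appear in the search path and on the number of elements that can be inserted at the unspecified part.
- The information search device according to claim 1 or claim 2, wherein the path field generation unit generates the path field by expressing each path that can be traced starting from each node as a sequence of the elements from the root node of the graph structure information.
- The information search device according to any one of claims 1 to 3, further comprising an index update unit that acquires difference information representing elements changed in the graph structure information and updates the index repository by controlling the path field generation unit and the index generation unit based on the difference information.
- The information search device according to any one of claims 1 to 4, further comprising:
a subgraph storage unit that extracts, from the graph structure information, a subgraph of a predetermined depth starting from each node and stores the extracted subgraph; and
a search result presentation unit that presents the subgraph starting from the target node found by the search unit.
- A data structure for storing graph structure information whose elements are a plurality of nodes and edges connecting the nodes, the data structure storing, in association with each other:
each of the elements; and
a posting list generated for each element, the posting list being a list of entries each consisting of a node whose path field contains the element, among the path fields of the nodes each formed by concatenating paths that are sequences of the elements traceable starting from that node, and position information indicating where the element appears in that path field.
- The data structure according to claim 6, wherein the position information included in the posting list indicates a position in a path field formed by concatenating the paths that can be traced starting from each node, each path being expressed as a sequence of the elements from the root node of the graph structure information.
- An information search system comprising:
a graph structure information storage device that stores graph structure information whose elements are a plurality of nodes and edges connecting the nodes;
a client device that requests a search of the graph structure information for a target node satisfying a search condition; and
an information search device that searches the graph structure information for the target node,
wherein the information search device includes:
a path field generation unit that, for each node included in the graph structure information, extracts paths, each being a sequence of the elements that can be traced starting from that node, and generates, for each node, a path field in which the extracted paths are concatenated;
an index generation unit that, for each element constituting the graph structure information, generates a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field, and generates an index repository in which the elements are associated with the posting lists;
a search path generation unit that generates a search path representing the search condition as a sequence of the elements; and
a search unit that retrieves, from the index repository, nodes having a path field that contains every element of the search path, and finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.
- A computer program for controlling the operation of an information search device that searches graph structure information, whose elements are a plurality of nodes and edges connecting the nodes, for a target node satisfying a search condition, the program causing a computer to execute:
a path field generation process that, for each node included in the graph structure information, extracts paths, each being a sequence of the elements that can be traced starting from that node, and generates, for each node, a path field in which the extracted paths are concatenated;
an index generation process that, for each element constituting the graph structure information, generates a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field, and generates an index repository in which the elements are associated with the posting lists;
a search path generation process that generates a search path representing the search condition as a sequence of the elements; and
a search process that retrieves, from the index repository, nodes having a path field that contains every element of the search path, and finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.
- An information search method in which:
a graph structure information storage device stores graph structure information whose elements are a plurality of nodes and edges connecting the nodes;
an information search device
extracts, for each node included in the graph structure information, paths, each being a sequence of the elements that can be traced starting from that node,
generates, for each node, a path field in which the extracted paths are concatenated,
generates, for each element constituting the graph structure information, a posting list, which is a list of entries each consisting of a node having a path field that contains the element and position information indicating where the element appears in that path field, and
stores the elements and the posting lists in an index repository in association with each other;
a client device
requests the information search device to search the graph structure information for a target node satisfying a search condition; and
the information search device
generates a search path representing the search condition as a sequence of the elements,
retrieves from the index repository nodes having a path field that contains every element of the search path, and
finds the target node by extracting, based on the position information, those retrieved nodes whose path field satisfies the order in which the elements appear in the search path.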
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011538762A JP4947245B2 (ja) | 2010-05-14 | 2011-05-12 | 情報検索装置、情報検索方法、コンピュータ・プログラムおよびデータ構造 |
EP11780393.2A EP2570936A4 (en) | 2010-05-14 | 2011-05-12 | INFORMATION RECOVERING DEVICE, INFORMATION RECOVERING METHOD, COMPUTER PROGRAM, AND DATA STRUCTURE |
CN2011800240419A CN102893281A (zh) | 2010-05-14 | 2011-05-12 | 信息搜索设备、信息搜索方法、计算机程序和数据结构 |
US13/642,890 US9141727B2 (en) | 2010-05-14 | 2011-05-12 | Information search device, information search method, computer program, and data structure |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010111940 | 2010-05-14 | ||
JP2010-111940 | 2010-05-14 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011142134A1 (ja) | 2011-11-17 |
Family
ID=44914194
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/002641 WO2011142134A1 (ja) | 2010-05-14 | 2011-05-12 | 情報検索装置、情報検索方法、コンピュータ・プログラムおよびデータ構造 |
Country Status (5)
Country | Link |
---|---|
US (1) | US9141727B2 (ja) |
EP (1) | EP2570936A4 (ja) |
JP (1) | JP4947245B2 (ja) |
CN (1) | CN102893281A (ja) |
WO (1) | WO2011142134A1 (ja) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20150028934A (ko) * | 2013-09-06 | 2015-03-17 | 삼성전자주식회사 | 데이터 검색 방법 및 장치 |
JP2015531940A (ja) * | 2012-08-31 | 2015-11-05 | フェイスブック,インク. | グラフ照会言語api照会および構文解析 |
JP2016154050A (ja) * | 2013-03-13 | 2016-08-25 | フェイスブック,インク. | 短語のハッシュ |
CN112214645A (zh) * | 2019-07-11 | 2021-01-12 | 杭州海康威视数字技术股份有限公司 | 一种存储轨迹数据的方法及装置 |
JP2021128779A (ja) * | 2020-04-08 | 2021-09-02 | ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド | データ拡張の方法及び装置、機器、記憶媒体 |
JP7077387B1 (ja) | 2020-11-25 | 2022-05-30 | ヤフー株式会社 | 情報処理装置、情報処理方法、及び情報処理プログラム |
JP2022083920A (ja) * | 2020-11-25 | 2022-06-06 | ヤフー株式会社 | 情報処理装置、情報処理方法、及び情報処理プログラム |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5866922B2 (ja) * | 2011-09-22 | 2016-02-24 | 富士ゼロックス株式会社 | 検索装置及びプログラム |
US11487707B2 (en) * | 2012-04-30 | 2022-11-01 | International Business Machines Corporation | Efficient file path indexing for a content repository |
WO2015052690A1 (en) * | 2013-10-10 | 2015-04-16 | Yandex Europe Ag | Methods and systems for indexing references to documents of a database and for locating documents in the database |
KR101678149B1 (ko) | 2016-02-05 | 2016-11-25 | 주식회사 비트나인 | 데이터베이스의 데이터 탐색방법 및 그 장치와 이를 위한 컴퓨터 프로그램 |
US10467229B2 (en) | 2016-09-30 | 2019-11-05 | Microsoft Technology Licensing, Llc. | Query-time analytics on graph queries spanning subgraphs |
US10545945B2 (en) | 2016-10-28 | 2020-01-28 | Microsoft Technology Licensing, Llc | Change monitoring spanning graph queries |
US10445361B2 (en) * | 2016-12-15 | 2019-10-15 | Microsoft Technology Licensing, Llc | Caching of subgraphs and integration of cached subgraphs into graph query results |
US10402403B2 (en) * | 2016-12-15 | 2019-09-03 | Microsoft Technology Licensing, Llc | Utilization of probabilistic characteristics for reduction of graph database traversals |
US10242223B2 (en) | 2017-02-27 | 2019-03-26 | Microsoft Technology Licensing, Llc | Access controlled graph query spanning |
US11100406B2 (en) | 2017-03-29 | 2021-08-24 | Futurewei Technologies, Inc. | Knowledge network platform |
CN108520029A (zh) * | 2018-03-27 | 2018-09-11 | 四川斐讯信息技术有限公司 | 一种基于图片和定位信息进行搜索的方法、服务器及系统 |
KR20210128096A (ko) * | 2020-04-16 | 2021-10-26 | 세종대학교산학협력단 | 사물인터넷 플랫폼 간 연동 방법 및 장치 |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001034619A (ja) * | 1999-07-16 | 2001-02-09 | Fujitsu Ltd | Xmlデータの格納/検索方法およびxmlデータ検索システム |
JP2004118543A (ja) * | 2002-09-26 | 2004-04-15 | Toshiba Corp | 構造化文書検索方法、検索支援方法、検索支援装置および検索支援プログラム |
JP2006313501A (ja) * | 2005-05-09 | 2006-11-16 | Nippon Telegr & Teleph Corp <Ntt> | 共通クエリグラフパターン生成装置、生成方法、生成用プログラム、およびこれらを用いた共通サブグラフ検索装置、検索方法、検索用プログラム |
JP2007140713A (ja) * | 2005-11-15 | 2007-06-07 | Nippon Telegr & Teleph Corp <Ntt> | グラフ検索装置 |
JP2009258749A (ja) | 2009-07-24 | 2009-11-05 | Olympus Corp | 光学フィルタ及び光学機器 |
JP2010111940A (ja) | 2008-10-08 | 2010-05-20 | Jfe Steel Corp | 真空脱ガス装置における複合ランスを用いた加熱・精錬方法 |
Family Cites Families (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4039484B2 (ja) * | 2002-02-28 | 2008-01-30 | インターナショナル・ビジネス・マシーンズ・コーポレーション | XPath評価方法、これを用いたXML文書処理システム及びプログラム |
KR100484138B1 (ko) * | 2002-05-08 | 2005-04-18 | 삼성전자주식회사 | 관계형 데이터베이스에서 정규 경로식 질의를 처리하는xml 인덱싱 방법과 자료구조 |
WO2003107222A1 (en) * | 2002-06-13 | 2003-12-24 | Cerisent Corporation | Parent-child query indexing for xml databases |
AUPS300402A0 (en) * | 2002-06-17 | 2002-07-11 | Canon Kabushiki Kaisha | Indexing and querying structured documents |
US7162485B2 (en) * | 2002-06-19 | 2007-01-09 | Georg Gottlob | Efficient processing of XPath queries |
JP3982623B2 (ja) * | 2003-03-25 | 2007-09-26 | インターナショナル・ビジネス・マシーンズ・コーポレーション | 情報処理装置、データベース検索システム及びプログラム |
US7392239B2 (en) * | 2003-04-14 | 2008-06-24 | International Business Machines Corporation | System and method for querying XML streams |
US7877366B2 (en) * | 2004-03-12 | 2011-01-25 | Oracle International Corporation | Streaming XML data retrieval using XPath |
CN100517318C (zh) | 2004-04-09 | 2009-07-22 | 甲骨文国际公司 | 用于存取xml数据的索引 |
US20050257201A1 (en) * | 2004-05-17 | 2005-11-17 | International Business Machines Corporation | Optimization of XPath expressions for evaluation upon streaming XML data |
US9171100B2 (en) * | 2004-09-22 | 2015-10-27 | Primo M. Pettovello | MTree an XPath multi-axis structure threaded index |
US7685138B2 (en) | 2005-11-08 | 2010-03-23 | International Business Machines Corporation | Virtual cursors for XML joins |
US8949455B2 (en) * | 2005-11-21 | 2015-02-03 | Oracle International Corporation | Path-caching mechanism to improve performance of path-related operations in a repository |
US8015165B2 (en) * | 2005-12-14 | 2011-09-06 | Oracle International Corporation | Efficient path-based operations while searching across versions in a repository |
US7849091B1 (en) * | 2006-01-25 | 2010-12-07 | At&T Intellectual Property Ii, L.P. | Meta-data indexing for XPath location steps |
US8880506B2 (en) * | 2009-10-16 | 2014-11-04 | Oracle International Corporation | Leveraging structured XML index data for evaluating database queries |
JP2008041082A (ja) * | 2006-07-12 | 2008-02-21 | Hitachi Ltd | 処理装置及びプログラム |
US7765215B2 (en) | 2006-08-22 | 2010-07-27 | International Business Machines Corporation | System and method for providing a trustworthy inverted index to enable searching of records |
JP4374014B2 (ja) * | 2006-11-21 | 2009-12-02 | 株式会社日立製作所 | インデクス生成装置及びそのプログラム |
US7496568B2 (en) * | 2006-11-30 | 2009-02-24 | International Business Machines Corporation | Efficient multifaceted search in information retrieval systems |
US8079020B2 (en) * | 2007-03-05 | 2011-12-13 | Microsoft Corporation | Preferential path profiling |
JP2009295013A (ja) * | 2008-06-06 | 2009-12-17 | Hitachi Ltd | データベース管理方法、データベース管理装置およびプログラム |
CN101685444B (zh) | 2008-09-27 | 2012-05-30 | 国际商业机器公司 | 用于实现元数据搜索的系统和方法 |
CN101655862A (zh) * | 2009-08-11 | 2010-02-24 | 华天清 | 信息对象搜索的方法和装置 |
-
2011
- 2011-05-12 EP EP11780393.2A patent/EP2570936A4/en not_active Withdrawn
- 2011-05-12 WO PCT/JP2011/002641 patent/WO2011142134A1/ja active Application Filing
- 2011-05-12 CN CN2011800240419A patent/CN102893281A/zh active Pending
- 2011-05-12 JP JP2011538762A patent/JP4947245B2/ja active Active
- 2011-05-12 US US13/642,890 patent/US9141727B2/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001034619A (ja) * | 1999-07-16 | 2001-02-09 | Fujitsu Ltd | Xmlデータの格納/検索方法およびxmlデータ検索システム |
JP2004118543A (ja) * | 2002-09-26 | 2004-04-15 | Toshiba Corp | 構造化文書検索方法、検索支援方法、検索支援装置および検索支援プログラム |
JP2006313501A (ja) * | 2005-05-09 | 2006-11-16 | Nippon Telegr & Teleph Corp <Ntt> | 共通クエリグラフパターン生成装置、生成方法、生成用プログラム、およびこれらを用いた共通サブグラフ検索装置、検索方法、検索用プログラム |
JP2007140713A (ja) * | 2005-11-15 | 2007-06-07 | Nippon Telegr & Teleph Corp <Ntt> | グラフ検索装置 |
JP2010111940A (ja) | 2008-10-08 | 2010-05-20 | Jfe Steel Corp | 真空脱ガス装置における複合ランスを用いた加熱・精錬方法 |
JP2009258749A (ja) | 2009-07-24 | 2009-11-05 | Olympus Corp | 光学フィルタ及び光学機器 |
Non-Patent Citations (2)
Title |
---|
AKIYOSHI MATONO ET AL.: "A Path-based Relational RDF Database", ADC '05: PROCEEDINGS OF THE 16TH AUSTRALASIAN DATABASE CONFERENCE, 2005, pages 95 - 103, XP058168128 |
See also references of EP2570936A4 |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2015531940A (ja) * | 2012-08-31 | 2015-11-05 | フェイスブック,インク. | グラフ照会言語api照会および構文解析 |
JP2016154050A (ja) * | 2013-03-13 | 2016-08-25 | フェイスブック,インク. | 短語のハッシュ |
US10318652B2 (en) | 2013-03-13 | 2019-06-11 | Facebook, Inc. | Short-term hashes |
KR102104496B1 (ko) | 2013-09-06 | 2020-04-24 | 삼성전자주식회사 | 데이터 검색 방법 및 장치 |
KR20150028934A (ko) * | 2013-09-06 | 2015-03-17 | 삼성전자주식회사 | 데이터 검색 방법 및 장치 |
CN112214645B (zh) * | 2019-07-11 | 2023-09-19 | 杭州海康威视数字技术股份有限公司 | 一种存储轨迹数据的方法及装置 |
CN112214645A (zh) * | 2019-07-11 | 2021-01-12 | 杭州海康威视数字技术股份有限公司 | 一种存储轨迹数据的方法及装置 |
JP2021128779A (ja) * | 2020-04-08 | 2021-09-02 | ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド | データ拡張の方法及び装置、機器、記憶媒体 |
JP7229291B2 (ja) | 2020-04-08 | 2023-02-27 | ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド | データ拡張の方法及び装置、機器、記憶媒体 |
JP2022083919A (ja) * | 2020-11-25 | 2022-06-06 | ヤフー株式会社 | 情報処理装置、情報処理方法、及び情報処理プログラム |
JP2022083920A (ja) * | 2020-11-25 | 2022-06-06 | ヤフー株式会社 | 情報処理装置、情報処理方法、及び情報処理プログラム |
JP7109522B2 (ja) | 2020-11-25 | 2022-07-29 | ヤフー株式会社 | 情報処理装置、情報処理方法、及び情報処理プログラム |
JP7077387B1 (ja) | 2020-11-25 | 2022-05-30 | ヤフー株式会社 | 情報処理装置、情報処理方法、及び情報処理プログラム |
Also Published As
Publication number | Publication date |
---|---|
JP4947245B2 (ja) | 2012-06-06 |
US20130103693A1 (en) | 2013-04-25 |
US9141727B2 (en) | 2015-09-22 |
CN102893281A (zh) | 2013-01-23 |
JPWO2011142134A1 (ja) | 2013-07-22 |
EP2570936A1 (en) | 2013-03-20 |
EP2570936A4 (en) | 2015-08-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4947245B2 (ja) | 情報検索装置、情報検索方法、コンピュータ・プログラムおよびデータ構造 | |
US10659467B1 (en) | Distributed storage and distributed processing query statement reconstruction in accordance with a policy | |
US8768931B2 (en) | Representing and manipulating RDF data in a relational database management system | |
US9197597B2 (en) | RDF object type and reification in the database | |
Das et al. | A Tale of Two Graphs: Property Graphs as RDF in Oracle. | |
JP6720641B2 (ja) | 多言語データティアのデータ制約 | |
US8161371B2 (en) | Method and system for defining a heirarchical structure | |
US8983931B2 (en) | Index-based evaluation of path-based queries | |
US20090187581A1 (en) | Consolidation and association of structured and unstructured data on a computer file system | |
US20120310963A1 (en) | Apparatus and method of searching and visualizing instance path | |
JP5927886B2 (ja) | クエリシステム及びコンピュータプログラム | |
JP5844824B2 (ja) | Sparqlクエリ最適化方法 | |
US8756246B2 (en) | Method and system for caching lexical mappings for RDF data | |
Botoeva et al. | Ontology-based data access–Beyond relational sources | |
US8965910B2 (en) | Apparatus and method of searching for instance path based on ontology schema | |
Groppe et al. | Using an index of precomputed joins in order to speed up SPARQL processing. | |
JP3671765B2 (ja) | 異種情報源問い合わせ変換方法及び装置及び異種情報源問い合わせ変換プログラムを格納した記憶媒体 | |
US10769209B1 (en) | Apparatus and method for template driven data extraction in a semi-structured document database | |
JP5488792B2 (ja) | データベース操作装置、データベース操作方法、及びプログラム | |
US20170235845A1 (en) | Non-unique secondary indexing of semi-structured data in databases | |
JP2024504556A (ja) | データ処理システムによって管理されるデータエンティティにアクセスするためのシステム及び方法 | |
Unbehauen et al. | SPARQL Update queries over R2RML mapped data sources | |
Endres et al. | Index structures for preference database queries | |
JP2016062522A (ja) | データベース管理システム、データベースシステム、データベース管理方法およびデータベース管理プログラム | |
Sima et al. | Keyword query approach over rdf data based on tree template |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| WWE | Wipo information: entry into national phase | Ref document number: 201180024041.9; Country of ref document: CN |
| WWE | Wipo information: entry into national phase | Ref document number: 2011538762; Country of ref document: JP |
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 11780393; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | Wipo information: entry into national phase | Ref document number: 2011780393; Country of ref document: EP |
| WWE | Wipo information: entry into national phase | Ref document number: 9520/CHENP/2012; Country of ref document: IN |
| NENP | Non-entry into the national phase | Ref country code: DE |
| WWE | Wipo information: entry into national phase | Ref document number: 13642890; Country of ref document: US |