WO2017064561A3 - Selection of initial document collection for visual interactive search - Google Patents

Selection of initial document collection for visual interactive search

Info

Publication number
WO2017064561A3
WO2017064561A3 PCT/IB2016/001590 IB2016001590W WO2017064561A3 WO 2017064561 A3 WO2017064561 A3 WO 2017064561A3 IB 2016001590 W IB2016001590 W IB 2016001590W WO 2017064561 A3 WO2017064561 A3 WO 2017064561A3
Authority
WO
WIPO (PCT)
Prior art keywords
documents
initial
user
selection
space
Prior art date
Application number
PCT/IB2016/001590
Other languages
French (fr)
Other versions
WO2017064561A2 (en)
Inventor
Diego Guy M. LEGRAND
Philip M. Long
Nigel Duffy
Original Assignee
Sentient Technologies (Barbados) Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2015-10-15
Filing date
2016-10-17
Publication date
2017-07-06
Application filed by Sentient Technologies (Barbados) Limited
Publication of WO2017064561A2
Publication of WO2017064561A3

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34: Browsing; Visualisation therefor

Abstract

Roughly described, a system for user identification of a desired document. A database identifies a catalog of documents in an embedding space, in which the distance between documents corresponds to a measure of their dissimilarity. The system presents an initial collection of the documents toward the user from an initial candidate space which is part of the embedding space, then in response to iterative user input, refines the candidate space and subsequent collections of documents presented toward the user. The initial collection is determined using a weighted cost-based iterative addition to the initial collection of documents from the initial candidate space, trading off between two sub-objectives of representativeness and diversity.
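
The abstract does not give the cost function, so the following is only a minimal sketch of one way a weighted, cost-based iterative addition could trade representativeness against diversity. Everything named here is an assumption made for illustration, not the claimed method: the function `select_initial_collection`, the `weight` parameter, the use of Euclidean distance, the representativeness cost (mean distance from each candidate document to its nearest selected document), and the diversity reward (distance from a new document to its closest already-selected document).

```python
import math


def select_initial_collection(points, k, weight=0.5):
    """Greedily pick up to k indices from `points` (embedding vectors).

    Hypothetical cost, assumed for illustration only:
      cost(j) = weight * representativeness_cost(j)
                - (1 - weight) * diversity_reward(j)
    """
    n = len(points)
    # Pairwise distances in the embedding space (distance ~ dissimilarity).
    dist = [[math.dist(p, q) for q in points] for p in points]

    selected = []
    # nearest[i]: distance from candidate i to its closest selected document.
    nearest = [math.inf] * n

    for _ in range(min(k, n)):
        best_idx, best_cost = None, math.inf
        for j in range(n):
            if j in selected:
                continue
            # Representativeness: how well the collection would cover the
            # candidate space if document j were added (lower is better).
            rep_cost = sum(min(nearest[i], dist[i][j]) for i in range(n)) / n
            # Diversity: how far j lies from what is already selected.
            div_reward = min((dist[j][s] for s in selected), default=0.0)
            cost = weight * rep_cost - (1.0 - weight) * div_reward
            if cost < best_cost:
                best_idx, best_cost = j, cost
        selected.append(best_idx)
        nearest = [min(nearest[i], dist[i][best_idx]) for i in range(n)]
    return selected


# Two well-separated clusters of toy 2-D "documents".
docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
initial = select_initial_collection(docs, k=2, weight=0.5)
```

With two well-separated clusters and `k=2`, this sketch picks one document from each cluster. Raising `weight` toward 1 favors coverage of the candidate space, and lowering it toward 0 favors spread among the selected documents.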

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201562242258P 2015-10-15 2015-10-15
US62/242,258 2015-10-15

Publications (2)

Publication Number Publication Date
WO2017064561A2 (en) 2017-04-20
WO2017064561A3 (en) 2017-07-06

Family

ID=58517053

Family Applications (1)

Application Number Publication Priority Date Filing Date Title
PCT/IB2016/001590 WO2017064561A2 (en) 2015-10-15 2016-10-17 Selection of initial document collection for visual interactive search

Country Status (1)

Country Link
WO (1) WO2017064561A2 (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6098034A (en) * 1996-03-18 2000-08-01 Expert Ease Development, Ltd. Method for standardizing phrasing in a document
US6286018B1 (en) * 1998-03-18 2001-09-04 Xerox Corporation Method and apparatus for finding a set of documents relevant to a focus set using citation analysis and spreading activation techniques
US6353825B1 (en) * 1999-07-30 2002-03-05 Verizon Laboratories Inc. Method and device for classification using iterative information retrieval techniques
US20020164078A1 (en) * 2001-03-23 2002-11-07 Fujitsu Limited Information retrieving system and method
US20050165600A1 (en) * 2004-01-27 2005-07-28 Kas Kasravi System and method for comparative analysis of textual documents
US20080243842A1 (en) * 2007-03-28 2008-10-02 Xerox Corporation Optimizing the performance of duplicate identification by content
US7814107B1 (en) * 2007-05-25 2010-10-12 Amazon Technologies, Inc. Generating similarity scores for matching non-identical data strings
US20130212090A1 (en) * 2012-02-09 2013-08-15 Stroz Friedberg, LLC Similar document detection and electronic discovery
US8972394B1 (en) * 2009-07-20 2015-03-03 Google Inc. Generating a related set of documents for an initial set of documents

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
STASKO ET AL.: "Jigsaw: supporting investigative analysis through interactive visualization.", 2008, XP031221446, Retrieved from the Internet <URL:http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1014.9375&rep=rep1&type=pdf> [retrieved on 20170419] *

Also Published As

Publication number Publication date
WO2017064561A2 (en) 2017-04-20

Similar Documents

Publication Title
GB2544660A (en) Visual interactive search
JP2017503273A5 (en)
WO2012142553A3 (en) Identifying query formulation suggestions for low-match queries
WO2015138497A3 (en) Systems and methods for rapid data analysis
WO2008060919A3 (en) Image recognition system for use in analysing images of objects and applications thereof
WO2017098332A3 (en) Method and system for inputting information
JP2016508264A5 (en)
WO2016050347A3 (en) Audio identification device, audio identification method and audio identification system
WO2013177213A3 (en) Enabling natural language processing
WO2010110880A3 (en) Shape based picture search
WO2014182606A8 (en) Approximate privacy indexing for search queries on online social networks
WO2018014109A8 (en) System and method for analyzing and searching for features associated with objects
WO2009099798A3 (en) System and method for utilizing tiles in a search results page
WO2014111944A8 (en) Systems and methods for identifying explosives
WO2013071026A3 (en) Performing deduplication on product information search results
WO2016202214A3 (en) Method and device for displaying keyword
MX2019000101A (en) Collecting user information from computer systems.
WO2014164688A3 (en) Identification of concepts and associated processing
EP4300501A3 (en) Methods of sequencing data read realignment
WO2017173104A8 (en) Semantic search systems and methods for a distributed data system
JO3514B1 (en) System and method for accessing images with a captured query image
WO2017064561A3 (en) Selection of initial document collection for visual interactive search
TW201613366A (en) TV program based shopping guide system and TV program based shopping guide method thereof
WO2016085527A8 (en) Method and system for storage retrieval
WO2014026062A3 (en) Computerized system for delivering reasonably priced access to content to remotely located users at prices varying in time per user behavior and with automated access to outside websites matching a user's inquiry or interest

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 16855013
    Country of ref document: EP
    Kind code of ref document: A2

ENP Entry into the national phase
    Ref document number: 2017545913
    Country of ref document: JP
    Kind code of ref document: A

NENP Non-entry into the national phase
    Ref country code: DE

122 EP: PCT application non-entry in European phase
    Ref document number: 16855013
    Country of ref document: EP
    Kind code of ref document: A2