CA2706773C - Using reputation measures to improve search relevance - Google Patents
Using reputation measures to improve search relevance
- Publication number
- CA2706773C
- Authority
- CA
- Canada
- Prior art keywords
- relevancy
- user
- item
- computer
- search
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3325—Reformulation based on results of preceding query
- G06F16/3326—Reformulation based on results of preceding query using relevance feedback from the user, e.g. relevance feedback on documents, documents sets, document terms or passages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24578—Query processing with adaptation to user needs using ranking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3334—Selection or weighting of terms from queries, including natural language queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06316—Sequencing of tasks or work
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
Abstract
A system and method for determining relevancy for dynamic data sets is disclosed. A specific embodiment for use in an internet marketplace is presented wherein the relevancy for a descriptive factor associated with an item is increased when a user selects that item. To prevent abuse of the relevancy determination system, various embodiments incorporate abuse prevention measures. In one embodiment, a user's selection of the user's own items does not affect the relevancy system. In one embodiment, only a first selection of a particular item by a user will affect the relevancy system and any additional selections of that item will have no effect. In another embodiment, the size of the changes made to the relevancy system due to the selections of a particular user is correlated to that user's reputation score.
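The abuse prevention measures summarized above may be illustrated with a minimal Python sketch. It combines, purely for illustration, three measures that the abstract presents as separate embodiments. The function name `selection_weight`, the item fields `id` and `seller_id`, and the assumption that reputation scores are normalized to the range 0 to 1 are all hypothetical and do not come from the disclosure.

```python
def selection_weight(user_id, item, reputation, already_counted):
    """Return how strongly one selection should move a relevancy factor.

    All names and the 0..1 reputation scale are illustrative assumptions.
    """
    # A user's selection of the user's own item has no effect.
    if item["seller_id"] == user_id:
        return 0.0
    # Only the first selection of a given item by a given user counts.
    key = (user_id, item["id"])
    if key in already_counted:
        return 0.0
    already_counted.add(key)
    # The size of the change is correlated to the user's reputation score.
    # Unknown users default to 0.0 here, which is an assumption.
    return max(0.0, min(1.0, reputation.get(user_id, 0.0)))


item = {"id": 7, "seller_id": "alice"}
reputation = {"bob": 0.8, "carol": 0.2}
seen = set()
weights = [
    selection_weight("alice", item, reputation, seen),  # own item
    selection_weight("bob", item, reputation, seen),    # first selection
    selection_weight("bob", item, reputation, seen),    # repeat selection
    selection_weight("carol", item, reputation, seen),  # lower reputation
]
```

In this sketch the returned weight would be multiplied into whatever adjustment the relevancy system applies, so that a repeat selection or a self-selection leaves the system unchanged.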
Description
USING REPUTATION MEASURES TO IMPROVE SEARCH RELEVANCE
TECHNICAL FIELD
The present invention relates to data retrieval. In particular, but not by way of limitation, the present invention discloses techniques for scoring the relevancy of items located in a computer search.
BACKGROUND
Computers are now used to store massive amounts of information. In order to locate particular information of interest, powerful and intuitive search mechanisms have been created.
For example, the World Wide Web portion of the Internet has grown exponentially since the late 1980's when the World Wide Web was first introduced. Early in the history of the World Wide Web, directories of web sites were used to guide users to web sites of interest. One of the most famous early web site directories was "Jerry's Guide to the World Wide Web" which was later renamed "Yahoo!". However, the rapid real-time growth of the Internet quickly made World Wide Web directories unmanageable and prone to being out of date.
Internet search engines such as Lycos, Alta Vista, and Google became the new method of finding web sites on the Internet. Internet search engines allow a user to enter a few keywords related to the topic of interest and return a large set of search results that contain the keywords entered by the user.
Internet search engines operate by "crawling" the World Wide Web to learn about new web pages and then create a searchable index of all the web pages that were visited. When a user enters a set of keywords, the search engine returns a set of web pages that contain the keywords entered by the user.
However, most queries entered by search engine users will map to thousands or even hundreds of thousands of results that contain the matching keywords. This information overload is not desired by the user. Thus, the real key to building a very good search engine is to sort the results by some type of relevancy measure. In this manner, the user of an Internet search engine may quickly find desired content.
BRIEF DESCRIPTION OF THE DRAWINGS
In the drawings, which are not necessarily drawn to scale, like numerals describe substantially similar components throughout the several views. Like numerals having different letter suffixes represent different instances of substantially similar components. The drawings illustrate generally, by way of example, but not by way of limitation, various embodiments discussed in the present document.
Figure 1 illustrates a diagrammatic representation of a machine in the example form of a computer system within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed.
Figure 2 illustrates a high level flow chart describing how one embodiment can modify a relevancy adjustment factor in response to user selections.
Figure 3 illustrates some database tables that may be used in various embodiments of the invention.
Figure 4 illustrates a high level flow chart describing how the relevancy adjustment factors created in the system of Figure 2 may be used to adjust relevancy scores for items in a search result set.
Figure 5 illustrates the relevancy adjustment factor system disclosed in Figure 2 with an added step to prevent abuse by aggressive users that click on their own items.
Figure 6 illustrates a relevancy score adjustment system of Figure 5 wherein a reputation score associated with each user is used to make adjustments that are correlated to the reputation score.
DETAILED DESCRIPTION
The following detailed description includes references to the accompanying drawings, which form a part of the detailed description. The drawings show illustrations in accordance with example embodiments. These embodiments, which are also referred to herein as "examples," are described in enough detail to enable those skilled in the art to practice the invention. It will be apparent to one skilled in the art that specific details in the example embodiments are not required in order to practice the present invention.
Although the example embodiments are mainly disclosed with reference to internet marketplace systems, the teachings can be used with other types of systems that incorporate a search engine. For example, social networking web sites or media presentation web sites may incorporate the teachings of the present invention. The example embodiments may be combined, other embodiments may be utilized, or structural, logical and electrical changes may be made without departing from the scope of what is claimed. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope is defined by the appended claims and their equivalents.
In this document, the terms "a" or "an" are used, as is common in patent documents, to include one or more than one. In this document, the term "or" is used to refer to a nonexclusive or, such that "A or B" includes "A but not B," "B but not A," and "A and B," unless otherwise indicated. Furthermore, all publications, patents, and patent documents referred to in this document are incorporated by reference herein in their entirety, as though individually incorporated by reference. In the event of inconsistent usages between this document and those documents so incorporated by reference, the usage in the incorporated reference(s) should be considered supplementary to that of this document; for irreconcilable inconsistencies, the usage in this document controls.
Computer Systems
Figure 1 illustrates a diagrammatic representation of a machine in the example form of a computer system 100 within which a set of instructions 124, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed. In alternative embodiments, the machine may operate as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a client machine in a server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a network server, a network router, a network switch, a network bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated in Figure 1, the term "machine" shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.
The example computer system 100 illustrated in Figure 1 includes a processor 102 (e.g., a central processing unit (CPU), a graphics processing unit (GPU) or both), a main memory 104, and a static memory 106, which may communicate with each other via a bus 108. The computer system 100 may further include a video display adapter 110 that drives a video display system 115 such as a Liquid Crystal Display (LCD) or a Cathode Ray Tube (CRT). The example computer system 100 also includes an alphanumeric input device 112 (e.g., a keyboard), a cursor control device 114 (e.g., a mouse or trackball), a disk drive unit 116, a signal generation device 118 (e.g., a speaker), and a network interface device 120. Note that various embodiments of a computer system will not always include all of these peripheral devices.
The disk drive unit 116 includes a machine-readable medium 122 on which is stored one or more sets of computer instructions and data structures (e.g., instructions 124 also known as 'software') embodying or utilized by any one or more of the methodologies or functions described herein. The instructions 124 may also reside, completely or at least partially, within the main memory 104 and/or within the processor 102 during execution thereof by the computer system 100, the main memory 104 and the processor 102 also constituting machine-readable media.
The instructions 124 for operating computer system 100 may be transmitted or received over a network 126 via the network interface device utilizing any one of a number of well-known transfer protocols such as the File Transfer Protocol (FTP).
While the machine-readable medium 122 is shown in an example embodiment to be a single medium, the term "machine-readable medium" should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term "machine-readable medium" shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies described herein, or that is capable of storing, encoding or carrying data structures utilized by or associated with such a set of instructions. The term "machine-readable medium" shall accordingly be taken to include, but not be limited to, solid-state memories, optical media, flash memory, magnetic media, and carrier wave signals.
For the purposes of this specification, the term "module" includes an identifiable portion of computer code, computational or executable instructions, data, or computational object to achieve a particular function, operation, processing, or procedure. A module need not be implemented in software; a module may be implemented in software, hardware/circuitry, or a combination of software and hardware.
Search Engines
Search engines are computer programs that are designed to allow a computer user to search a particular domain of information. A search engine typically allows a computer user to enter a set of search keywords and then the search engine generates a set of search results from the search domain that contain the user-specified keywords.
Very popular forms of search engine are the World Wide Web search engines that are available on the global Internet. A World Wide Web search engine allows a web user to enter a set of search keywords and then the World Wide Web search engine returns a search result set of World Wide Web pages that contain the user-specified search keywords.
World Wide Web search engines typically operate by having an automated program that learns about new web pages (commonly known as a "web crawler") visit World Wide Web pages to continually learn about the content that is available on the World Wide Web. The information obtained by the automated web crawler program is used to create a searchable index of all the web pages that were visited by the automated web crawler program. That searchable index is used by the Internet search engine to generate search results.
Determining Relevancy for Search Engines
When a user enters a set of keywords into a search engine, the search engine returns a set of results that contain the keywords entered by the user.
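The searchable index and keyword lookup described above may be sketched as a simple inverted index. This is a minimal illustration in Python and not the implementation of any particular search engine; the function names `build_index` and `search` and the sample pages are hypothetical, and the page text stands in for content that a web crawler would have fetched.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every keyword in the query."""
    keywords = query.lower().split()
    if not keywords:
        return set()
    # Start from the pages matching the first keyword, then intersect.
    results = set(index.get(keywords[0], set()))
    for word in keywords[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages standing in for what a crawler might have visited.
pages = {
    "example.com/a": "brown leather jacket",
    "example.com/b": "leather sofa",
    "example.com/c": "denim jacket",
}
index = build_index(pages)
matches = search(index, "leather jacket")
```

The result is an unordered set, which is the limitation the following discussion of relevancy addresses: every page containing the keywords is returned, with nothing to say which one the user most likely wants.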
For [...] A massive information overload is not the result desired by an internet [...] Many Internet search engines have implemented various versions of such [...]
One disadvantage of such relevancy systems is that once the system used to calculate the relevancy score becomes widely known, that relevancy system becomes subject to abuse by people wishing to artificially raise the profile of their web sites. For example, if a particular commercial web site wishes to generate a lot of traffic for its web site, then that web site may create many external web sites that link to the main commercial web site. In this manner, the relevancy scoring system of the Internet search engine may be tricked into ranking that commercial site highly despite the fact that the same entity created all those multiple links to the main web site.
Relevancy for Dynamic Data Sets
As set forth in the previous section, one possible method for determining the relevancy of items in a data set is to use some known indicator as to the relative popularity of the items within that data set. Within the context of searching Internet web sites, the relative popularity of a particular web site may be inferred from how many other web sites link to that particular web site. Thus, the number of links to a particular web site may be used as part of a relevancy score within an Internet search engine.
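As a minimal sketch of the link-count indicator just described, the following Python counts how many distinct other sites link to each site and orders a result set accordingly. The function names and site names are hypothetical, and a real relevancy score would combine this count with other signals rather than use it alone.

```python
from collections import Counter


def inbound_link_counts(links):
    """Count how many distinct other sites link to each target site."""
    # Drop self-links and repeated links from the same source.
    distinct = {(src, dst) for src, dst in links if src != dst}
    return Counter(dst for _, dst in distinct)


def rank_by_links(results, counts):
    """Order sites from most-linked to least-linked, ties broken by name."""
    return sorted(results, key=lambda site: (-counts[site], site))


# Hypothetical (source, target) link pairs.
links = [
    ("a.example", "shop.example"),
    ("b.example", "shop.example"),
    ("a.example", "blog.example"),
    ("a.example", "shop.example"),     # repeated link, counted once
    ("shop.example", "shop.example"),  # self-link, ignored
]
counts = inbound_link_counts(links)
ranked = rank_by_links(["blog.example", "shop.example", "new.example"], counts)
```

A site with no inbound links yet, such as `new.example` here, sorts last, which illustrates why a static link count is a poor fit for items that appear and disappear quickly.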
However, in a very dynamic data domain wherein the items in the domain are changing all of the time, such relatively static indications of popularity are not very useful for determining relevancy. For example, in an online marketplace that continually presents new items available for sale, the set of items in the search domain (the items currently available for sale) is continually changing as items are sold and new items are offered for sale.
Thus the items available for sale represent a "dynamic data domain". Any links to web pages associated with the items for sale are relatively useless since the items will often be sold before many links are ever made to the web pages associated with items for sale. Thus, other methods are needed to create relevancy scores for a dynamic data domain.
Since the items in a dynamic data domain are continually changing, any information connected directly to a specific item in the dynamic domain (such as a link to a web page for an item currently in the data domain) is not useful for generally determining relevancy since that specific item may soon be gone.
Instead, factors that describe a particular popular item in the data domain (and thus can also be used to describe other similar items) are useful for relevancy.
Furthermore, since the goal is to determine the intent of a user that is searching a dynamic data domain, any measure of relevancy would ideally be correlated to the actual search request made by that user. To meet these goals, a system has been devised that generates relevancy rankings based upon user item selections that are made after a set of search result items has been presented to the user in response to the user's search request.
In the disclosed dynamic data domain relevancy system, the system responds to a user search request with a set of items that fulfil the requirements of the user's search request. The items are normally then displayed in list form with limited additional information. The user may then select any item presented in the search results in order to obtain more information on that selected item. This selection of an item in the search results by the user acts as a popularity vote for that particular item within the context of the user's original search. Note that, for the selection to be useful for future relevancy determinations, some descriptive factor of the selected item must be abstracted out from the user's selection. In this manner, when a relevancy determination must be performed for the same search request but that selected item no longer exists, similar items may be identified in the dynamic data domain with a high relevancy using the descriptive factor that was abstracted out of the selected item.
In one embodiment, the descriptive factors that may be used to identify similar relevant items in the future are additional words in a descriptive field for an item that were not part of the user's original search query keywords. Thus, if a user enters a particular search query and then selects a set of items that all contain a description field with a particular word that was not in the original search query, then having that particular word may raise the relevancy of items for that particular search query. Similarly, if a set of items in the query results all share a particular word but none of those items was selected by the user then having that particular word may lower the relevancy score for items for that search query. Note that other factors may be used and this is just one example of a descriptive factor that may be used to identify similar items in the future.
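The word-based descriptive factor described above may be illustrated with the following Python sketch. It is a simplified reading of the embodiment under stated assumptions: the adjustment size `STEP`, the function names, and the in-memory `factors` table are hypothetical, and a word is lowered only when it appears on unselected items and on no selected item in the same result set.

```python
from collections import defaultdict

STEP = 1.0  # assumed size of one adjustment; the text gives no value


def extra_words(query, description):
    """Words in an item's description that were not in the search query."""
    return set(description.lower().split()) - set(query.lower().split())


def record_selections(factors, query, results, selected_ids):
    """Update per-query word factors from one search session.

    results maps item id -> description; selected_ids lists the items the
    user selected. Each selection raises the factor of that item's extra
    words; words seen only on unselected items are lowered once.
    """
    key = query.lower()
    clicked_words = set()
    for item_id in selected_ids:
        words = extra_words(query, results[item_id])
        clicked_words |= words
        for word in words:
            factors[key][word] += STEP
    ignored_words = set()
    for item_id, description in results.items():
        if item_id not in selected_ids:
            ignored_words |= extra_words(query, description)
    for word in ignored_words - clicked_words:
        factors[key][word] -= STEP


def adjusted_score(factors, query, description, base_score=0.0):
    """Base score plus the learned factors of the item's extra words."""
    learned = factors[query.lower()]
    extras = extra_words(query, description)
    return base_score + sum(learned.get(word, 0.0) for word in extras)


factors = defaultdict(lambda: defaultdict(float))
results = {
    1: "leather jacket black",
    2: "leather jacket brown",
    3: "jacket hanger",
}
record_selections(factors, "jacket", results, selected_ids=[1, 2])
new_item_score = adjusted_score(factors, "jacket", "leather jacket red")
```

Because the factors are keyed by query and word rather than by item, a newly listed item that was never selected can still score well for the same query once the selected items have left the data domain.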
Relevancy for Dynamic Data Sets
As set forth in the previous section, one possible method for determining the relevancy of items in a data set is to use some known indicator as to the relative popularity of the items within that data set. Within the context of searching Internet web sites, the relative popularity of a particular web site may be inferred from how many other web sites link to that particular web site.
Thus, the number of links to a particular web site may be used as part of a relevancy score within an Internet search engine.
However, in a very dynamic data domain wherein the items in the domain are changing all of the time, such relatively static indications of popularity are not very useful for determining relevancy. For example, in an online marketplace that continually presents new items available for sale, the set of items in the search domain (the items currently available for sale) is continually changing as items are sold and new items are offered for sale.
Thus the items available for sale represent a "dynamic data domain". Any links to web pages associated with the items for sale are relatively useless since the items will often be sold before many links are ever made to the web pages associated with items for sale. Thus, other methods are needed to create relevancy scores for a dynamic data domain.
Since the items in a dynamic data domain are continually changing, any information connected directly to a specific item in the dynamic domain (such as a link to a web page for an item currently in the data domain) is not useful for generally determining relevancy since that specific item may soon be gone.
Instead, factors that describe a particular popular item in the data domain (and thus can also be used to describe other similar items) are useful for relevancy.
Furthermore, since the goal is to determine the intent of a user that is searching a dynamic data domain, any measure of relevancy would ideally be correlated to the actual search request made by that user. To meet these goals, a system has been devised that generates relevancy rankings based upon user item selections that are made after a set of search result items has been presented to the user in response to the user's search request.
In the disclosed dynamic data domain relevancy system, the system responds to a user search request with a set of items that fulfil the requirements of the user's search request. The items are normally then displayed in list form with limited additional information. The user may then select any item presented in the search results in order to obtain more information on that selected item. This selection of an item in the search results by the user acts as a popularity vote for that particular item within the context of the user's original search. Note that, to be useful for future relevancy determinations, some descriptive factor of the selected item must be abstracted out from the user's selection. In this manner, when a relevancy determination must be performed for the same search request but that selected item no longer exists, similar items may be identified in the dynamic data domain with a high relevancy using the descriptive factor that was abstracted out of the selected item.
In one embodiment, the descriptive factors that may be used to identify similar relevant items in the future are additional words in a descriptive field for an item that were not part of the user's original search query keywords. Thus, if a user enters a particular search query and then selects a set of items that all contain a description field with a particular word that was not in the original search query, then having that particular word may raise the relevancy of items for that particular search query. Similarly, if a set of items in the query results all share a particular word but none of those items was selected by the user then having that particular word may lower the relevancy score for items for that search query. Note that other factors may be used and this is just one example of a descriptive factor that may be used to identify similar items in the future.
For example, a user that wishes to purchase a portable digital music player may enter the search query "ipod nano" in a search engine for an online marketplace. In response to the "ipod nano" search query, the system may present the items whose description fields are listed in the left column of Table 1. The center column contains the various words from the description field after removing the original search query ("ipod nano") and common stop words (and, or, in, the, for, etc.).
Table 1: Possible search query results for "ipod nano" search query

| Results to Query: "ipod nano" | Extra terms | Item selected? |
| --- | --- | --- |
| Sealed 4Gb Ipod Nano | Sealed, 4Gb | Yes |
| iPod Nano black | black | No |
| iPod Nano leather skin | Leather, skin | No |
| iPod nano FM transmitter | FM, transmitter | No |
| iPod Nano sealed | sealed | Yes |
| scratched white iPod Nano | Scratched, white | No |
| iPod Nano transmitter for car | Transmitter, car | No |
| New ipod Nano 8GB black | 8GB, black | Yes |
| new 4GB white ipod nano | New, 4GB, white | Yes |
| New leather ipod nano case | New, leather, case | No |

A user that is interested in purchasing a new iPod Nano device may click on the entries for "sealed 4Gb Ipod Nano", "iPod Nano sealed", "New ipod Nano 8GB black", and "new 4GB white ipod nano". Thus, in future searches for "ipod nano", the items that include the extra terms that were selected by the user should receive an increased relevancy score. One method of performing this is to assign a relevancy adjustment factor to each possible extra word. That relevancy adjustment factor associated with an extra word will adjust the relevancy score for items that have a description with that extra word. When a user selects an item, the relevancy adjustment factors for the extra words associated with that selected item will be increased. Thus, the extra terms associated with the four items selected (sealed, 4Gb, sealed, new, 8GB, black, new, 4GB) should have their relevancy adjustment factors increased. Note that the extra words may be listed more than once since those terms existed in more than one item selected by the user.
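The extra-term bookkeeping in this example can be sketched in Python as follows. This is a minimal illustration rather than the disclosed implementation: the function names `extra_terms` and `record_selection`, the exact stop-word list, the lower-casing of terms, the starting value of 1.0, and the fixed increment of 0.1 are all assumptions made for the example.

```python
from collections import defaultdict

# Assumed stop-word list; the description only gives "and, or, in, the, for, etc."
STOP_WORDS = {"and", "or", "in", "the", "for"}


def extra_terms(description, query):
    """Return description words that are neither query keywords nor stop words."""
    query_words = set(query.lower().split())
    return [
        word
        for word in description.lower().split()
        if word not in query_words and word not in STOP_WORDS
    ]


def record_selection(factors, description, query, step=0.1):
    """Raise the adjustment factor of every extra term of a selected item."""
    for term in extra_terms(description, query):
        factors[term] += step  # hypothetical fixed increment


# Factors start at an assumed neutral value of 1.0.
factors = defaultdict(lambda: 1.0)
selected = [
    "Sealed 4Gb Ipod Nano",
    "iPod Nano sealed",
    "New ipod Nano 8GB black",
    "new 4GB white ipod nano",
]
for description in selected:
    record_selection(factors, description, "ipod nano")
```

A term that occurs in more than one selected item, such as "sealed", is incremented once per selection, so it ends up with a larger factor than a term that was selected only once.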
As a corollary, extra words from the descriptions of items that were not selected may have their relevancy adjustment factors decreased. In one embodiment, such terms must not be in any selected item and must appear in more than one non-selected item. In such an embodiment, the terms "transmitter" and "leather" may have their relevancy adjustment factors decreased.
Determining Relevancy Adjustment Factors
Figure 2 illustrates a high-level flow chart describing how one possible embodiment could operate to modify the relevancy adjustment factors of extra words in the description. The initial relevancy adjustment factor for extra words can be set to a neutral value such as one ("1"). Figure 3 illustrates some database tables that may be used in various embodiments of the invention.
Referring to the top of Figure 2, the system first receives a search query at stage 210. Next, at stage 220, the system creates a set of search results that fulfil the requirements of the user's search query from stage 210. Note that this set of search results may be sorted by relevancy as will be described later.
At stage 230, the system displays a portion of the results to the user. In an internet marketplace embodiment, the results may comprise a set of items available for sale. At stage 240, the user may view another portion of the search results, select an item to view in greater detail, or leave this set of results.
If the user decides to view another portion of the search results then the system selects another portion of the search results to display and returns to stage 230 to display those results.
If the user decides to view an item from the search results in greater detail, the system proceeds to stage 250. Since the user selected the item, this item is deemed to be relevant to people who entered the particular search query that was entered back at stage 210. Thus, the system will increase the relevancy adjustment factor of descriptive factors related to this selected item for this particular search query.
As set forth earlier, one embodiment uses the additional words in a descriptive field for the item that were not part of the search query as a descriptive factor that can be used to identify similar items in the future.
Thus, at stage 250, the system identifies words from a description field that were not part of the search query (if any) and adds these additional descriptive words to a database table 320 associated with a search query entry in a table of popular search queries 310, if these additional words are not already in the additional descriptive words database table 320.
Next, at stage 255, the system increases the relevancy adjustment factor for the additional descriptive words that were identified in the previous stage.
The relevancy adjustment factor may be kept in the same database table 320 as the additional descriptive words. Note that the relevancy adjustment factor for each word is maintained on a per-search-query basis since the relevancy of a descriptive word will vary heavily depending on the item. For example, the word "Persian" may be very relevant for rugs but completely irrelevant for iPods.
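One way to picture the per-query storage is a mapping keyed first by search query and then by descriptive word, loosely mirroring tables 310 and 320. The sketch below is an assumption about layout, not the schema shown in Figure 3; the class name `FactorStore`, its method names, and the increment size are hypothetical.

```python
class FactorStore:
    """Per-query relevancy adjustment factors (hypothetical layout)."""

    NEUTRAL = 1.0  # neutral starting factor

    def __init__(self):
        # query -> {descriptive word -> adjustment factor}
        self._by_query = {}

    def get(self, query, word):
        """Return the factor for a word under a query, or neutral if unseen."""
        return self._by_query.get(query, {}).get(word, self.NEUTRAL)

    def increase(self, query, word, amount=0.1):
        """Add the word to the query's table if needed, then raise its factor."""
        words = self._by_query.setdefault(query, {})
        words[word] = words.get(word, self.NEUTRAL) + amount


store = FactorStore()
store.increase("rug", "persian")
# The same word is left untouched for an unrelated query.
print(store.get("rug", "persian"), store.get("ipod nano", "persian"))
```

Keying on the query first keeps a word such as "persian" from carrying its rug-related weight into searches for unrelated items.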
After modifying the relevancy adjustment factor of the additional words for the selected item, the system displays the selected item to the user in greater detail at stage 260. Additional processing will depend on user input at stage 270.
If the user requests to see the next or previous item then the system will obtain the information associated with that item and return to stage 250 to handle the appropriate relevancy adjustment factor modifications and display of that item.
If the user decides to return to the list view of the search results then the system returns to stage 230 to display the search results in the list view.
If the user decides to leave this particular search query at stage 270 (or leaves this search query at earlier stage 240) then the system may determine if any relevancy adjustment factor decreases should be made. At stage 280, the system first determines if at least one item was viewed. If no item was viewed then no relevancy adjustment factor changes are made, since there is insufficient information on whether the user was really interested or uninterested in the displayed items. If at least one item was viewed then the system may proceed to stage 290 to possibly reduce one or more relevancy adjustment factors associated with items that were not selected. The system will identify common additional descriptive words that exist in the non-selected items. In one embodiment, the system requires that a descriptive word not be in any of the selected items and be in at least two items that were presented to the user but not selected by the user before reducing the relevancy adjustment factor of that descriptive word. Descriptive words that pass this test may have their relevancy adjustment factors reduced. Note that not all relevancy system embodiments will implement the relevancy adjustment factor reduction system disclosed with reference to stages 280 and 290.
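The test applied at stages 280 and 290 can be sketched as a small Python function. This is an illustrative reading of the embodiment described above, not the disclosed code: the function name `words_to_reduce`, the `min_items` parameter, and the representation of each item as a set of lower-cased extra words are assumptions. The sample data is taken from Table 1.

```python
from collections import Counter


def words_to_reduce(selected_terms, unselected_terms, min_items=2):
    """Pick extra words whose factors may be reduced when a user leaves a query.

    selected_terms / unselected_terms: one set of extra words per item.
    """
    if not selected_terms:
        # No item was viewed: too little information, so change nothing.
        return set()
    seen_in_selected = set().union(*selected_terms)
    # Count how many non-selected items contain each word.
    counts = Counter(word for terms in unselected_terms for word in set(terms))
    return {
        word
        for word, n in counts.items()
        if n >= min_items and word not in seen_in_selected
    }


# Extra terms from Table 1, lower-cased.
selected = [{"sealed", "4gb"}, {"sealed"}, {"8gb", "black"}, {"new", "4gb", "white"}]
unselected = [
    {"black"},
    {"leather", "skin"},
    {"fm", "transmitter"},
    {"scratched", "white"},
    {"transmitter", "car"},
    {"new", "leather", "case"},
]
print(sorted(words_to_reduce(selected, unselected)))
```

On the Table 1 data only "leather" and "transmitter" qualify, which agrees with the two terms named in the example above. How far each factor is then reduced is left open here, as it is in the description.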
Using Relevancy Adjustment Factors
Figure 4 illustrates a high-level flow chart describing how one possible embodiment could use the relevancy adjustment factors created in the system of Figure 2 to adjust relevancy scores for items in a search result set. Note that the system illustrated in Figure 4 could be used within stage 220 of the system in Figure 2.
Initially, a search query is received at stage 410. Then, at stage 420, the system searches the item database to generate an initial set of results that fulfil the requirements of the search query entered at stage 410.
After obtaining the initial search result, the search results must be sorted by relevancy. To achieve this goal, the system retrieves the relevancy adjustment factors for the additional descriptive words in the items in the initial result at stage 430.
Next, at stage 440, the relevancy adjustment factors are applied to the initial relevancy scores given to each item in the initial search result.
In one embodiment, the relevancy adjustment factor may be multiplied against an initial relevancy score given to an item in a set of search query results to adjust the relevancy score of the item. Table 2 lists one possible set of relevancy adjustment factors for such an embodiment wherein some extra words associated with an "ipod nano" search query are listed. The relevancy adjustment factors for the extra words may be normalized to stay within a defined range. For example, the set of relevancy adjustment factors have been normalized to stay within the range of zero to two.
Table 2: "ipod nano" search query relevancy adjustment factor

| Extra Description Words | Relevancy adjustment factor |
| --- | --- |
| Sealed | 1.5 |
| black | 0.8 |
| leather | 0.4 |
| transmitter | 0.32 |
| white | 0.74 |
| 4GB | 0.9 |
| case | 0.37 |
| 2GB | 0.6 |
| 8GB | 1.2 |
| new | 1.3 |

To apply the relevancy adjustment factors given in Table 2, the relevancy adjustment factors are multiplied against an initial relevancy score given to an item if that item has the associated extra word in its description. Thus, referring to Table 2, highly relevant additional descriptive words such as "sealed", "8GB", and "new" will increase the relevancy score of items in a result set for an "ipod nano" search query that contain them. Similarly, largely irrelevant additional descriptive words such as "leather", "transmitter", or "case" will reduce the relevancy score of the items that contain them. Many other methods of using the relevancy adjustment factor to modify an initial relevancy score may be used.
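The multiplicative embodiment can be illustrated with the Table 2 values. The sketch below is only one reading of the text: the function name `adjust_multiplicative`, the lower-casing of words, the treatment of unlisted words as neutral (factor 1.0), and the initial score of 1.0 are assumptions.

```python
# Relevancy adjustment factors from Table 2, keyed by lower-cased word.
TABLE_2 = {
    "sealed": 1.5, "black": 0.8, "leather": 0.4, "transmitter": 0.32,
    "white": 0.74, "4gb": 0.9, "case": 0.37, "2gb": 0.6, "8gb": 1.2, "new": 1.3,
}


def adjust_multiplicative(initial_score, extra_words, factors=TABLE_2):
    """Multiply the initial score by the factor of each extra word present."""
    score = initial_score
    for word in extra_words:
        score *= factors.get(word.lower(), 1.0)  # unlisted words stay neutral
    return score


player = adjust_multiplicative(1.0, ["new", "8GB", "black"])
case = adjust_multiplicative(1.0, ["new", "leather", "case"])
print(round(player, 3), round(case, 3))
```

With an assumed initial score of 1.0, the new 8GB player ends up at roughly 1.25 (1.3 × 1.2 × 0.8), while the new leather case falls to roughly 0.19 (1.3 × 0.4 × 0.37), so the player would be ranked well above the accessory.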
In an alternate embodiment, the relevancy adjustment factor may be added to an initial relevancy score for an item to adjust the item's relevancy score. Table 3 lists one possible set of relevancy adjustment factors for such an embodiment wherein some extra words associated with an "ipod nano" search query are listed. The relevancy adjustment factors for the extra words may be normalized to stay within a defined range such as -100 to 100.
Table 3: "ipod nano" search query relevancy adjustment factor

| Extra Description Words | Relevancy adjustment factor |
| --- | --- |
| Sealed | 73 |
| black | -4 |
| leather | -70 |
| transmitter | -83 |
| white | 2 |
| case | -80 |
| new | 82 |

Note that in Table 3, the highly desirable terms ("sealed", "8GB", and "new") have large positive relevancy adjustment factors. Similarly, the undesirable terms ("transmitter", "leather", and "case") have large negative scores. The remaining neutral terms will have relatively little effect on the relevancy score.
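The additive embodiment differs only in how the factors are combined. A minimal sketch using the Table 3 values follows; the function name `adjust_additive`, the lower-casing of words, the treatment of unlisted words as contributing zero, and the initial score of 100 are assumptions made for illustration.

```python
# Relevancy adjustment factors from Table 3, keyed by lower-cased word.
TABLE_3 = {
    "sealed": 73, "black": -4, "leather": -70, "transmitter": -83,
    "white": 2, "case": -80, "new": 82,
}


def adjust_additive(initial_score, extra_words, factors=TABLE_3):
    """Add the factor of each extra word present to the initial score."""
    return initial_score + sum(factors.get(w.lower(), 0) for w in extra_words)


sealed_player = adjust_additive(100, ["sealed"])
leather_case = adjust_additive(100, ["new", "leather", "case"])
print(sealed_player, leather_case)
```

Starting from the same assumed initial score of 100, the sealed player rises to 173 while the new leather case drops to 32 (100 + 82 - 70 - 80).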
After adjusting an initial set of relevancy scores, the items are then ordered according to the adjusted relevancy score at stage 450. The relevancy sorted set of items may then be presented to the user. Since the result set has been sorted so that items similar to those selected in earlier searches with the same query are placed at the top, the user should be able to find a desired item quickly.
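The ordering step at stage 450 amounts to a descending sort on the adjusted score. In the sketch below the function name `rank_items` and the sample scores are hypothetical; the scores stand in for whatever an earlier adjustment stage produced.

```python
def rank_items(items, adjusted_score):
    """Order items by adjusted relevancy score, highest first."""
    return sorted(items, key=adjusted_score, reverse=True)


# Hypothetical adjusted scores produced by an earlier stage.
scores = {
    "iPod Nano sealed": 1.5,
    "New leather ipod nano case": 0.19,
    "New ipod Nano 8GB black": 1.25,
}
ranked = rank_items(list(scores), scores.get)
```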
Preventing Abuse of a Relevancy System for Dynamic Data Sets
As set forth in the discussion on Internet search engines that rely upon hyperlinks to a web site as a measure of that web site's popularity, such Internet search engines can be abused by people that create thousands of unrelated web sites that link to a specific web site. This multitude of links to a specific web site will create a false appearance of popularity of that specific web site.
Similar methods of abuse may be attempted on the dynamic data set relevancy system disclosed in the previous sections.
For example, very aggressive sellers on an internet marketplace may attempt to create automated programs that repeatedly select the items that such aggressive sellers have posted for sale on the internet marketplace. In this manner, such aggressive sellers may be attempting to make the items that such aggressive sellers post onto the internet marketplace look popular such that those items will receive an increased relevancy score.
To prevent such abuse, a set of various different restraints may be imposed on the relevancy scoring system to stop users from abusing the relevancy scoring system. A first restraint that may be implemented for preventing such abuse may be directed to prevent the exact scenario described in the previous paragraph. Figure 5 illustrates the relevancy adjustment factor system disclosed in Figure 2 but with an added step to prevent abuse by aggressive sellers that click on their own items posted for sale.
Referring to Figure 5, stage 545 has been added after a user selects an item for viewing in greater detail. At stage 545, the system determines if the selected item is an item that was posted by this particular user or if this user has already viewed this particular item. If either case is true, then the system skips the relevancy adjustment factor modification stages 550 and 555 and instead goes directly to stage 560 where the system displays the item to the user. In this manner, the system prevents a user from repeatedly selecting his own item.
Furthermore, stage 545 prevents a user from creating a second account and then repeatedly selecting his own item from that second account.
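The check performed at stage 545 can be sketched as a predicate that decides whether a selection is allowed to influence the relevancy adjustment factors. The function name `counts_toward_relevancy`, the dictionary keys `"id"` and `"seller_id"`, and the use of an in-memory set of (user, item) pairs are assumptions for the example; a real system would presumably persist this information.

```python
def counts_toward_relevancy(user_id, item, already_viewed):
    """Return True if this selection may adjust factors (stage 545 check).

    item: dict with hypothetical keys "id" and "seller_id".
    already_viewed: set of (user_id, item_id) pairs seen so far.
    """
    key = (user_id, item["id"])
    if item["seller_id"] == user_id or key in already_viewed:
        return False  # skip the adjustment stages; the item is still displayed
    already_viewed.add(key)
    return True


viewed = set()
item = {"id": 42, "seller_id": "alice"}
own_item = counts_toward_relevancy("alice", item, viewed)    # seller's own item
first_view = counts_toward_relevancy("bob", item, viewed)    # first view by bob
repeat_view = counts_toward_relevancy("bob", item, viewed)   # repeat view by bob
```

Only the first view by a user other than the seller is counted; in every case the item is still shown to the user.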
Preventing Abuse of a Relevancy System with User Reputation Scores
In internet marketplace systems, it is common to have a reputation score for buyers and sellers that participate in the internet marketplace such that people have some sort of measure as to whether the other party in a potential transaction should be trusted or not. These reputation scores are generally created by having users provide feedback on the other party in a transaction on the internet marketplace after that transaction is completed (or is otherwise ended). In one embodiment of the disclosed system, such a user reputation score has been incorporated into the relevancy system. Incorporating user reputation scores into a relevancy system improves the results of the relevancy system and reduces the possibility of abuse of the relevancy system.
Figure 6 illustrates a relevancy score adjustment system for dynamic data sets wherein a reputation score associated with each user has been incorporated into the relevancy system. The system of Figure 6 is the same as the system of Figure 5 except that the user's reputation is taken in consideration when making changes to the relevancy adjustment factors. Specifically, stage 655 has been changed to indicate that the system increases the relevancy adjustment factor by an amount correlated to the user's reputation score.
Similarly, stage 690 has been changed to indicate that the system reduces the relevancy adjustment factor by an amount correlated to the user's reputation score.
Incorporating user reputation scores into the relevancy system provides a number of significant advantages to the relevancy system. One advantage is that changes made to the relevancy adjustment factors may be made in a manner that is correlated to the user's skill. An experienced user will have a higher reputation score such that selections by that experienced user will change the relevancy system more than selections by a novice user.
Another advantage is that incorporating user reputation scores into the relevancy system can be used to prevent abuse of the relevancy system.
Specifically, an aggressive seller may attempt to thwart the restriction set forth in stage 645 that only allows one selection of an item by a particular user to adjust that item by creating a large number of new accounts and selecting the user's item from each of those new accounts. By setting the reputation score of new accounts to be zero or another low value, the selections made by such new accounts will have no or very little effect on the relevancy system. Thus, the creation of a large number of new accounts cannot be used to abuse the relevancy system.
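The effect of weighting updates by reputation can be illustrated as follows. The description only says the change is "correlated to" the reputation score; the linear scaling, the 0 to 100 reputation range, the base step of 0.1, and the function name `increase_factor` below are all assumptions chosen for the sketch.

```python
def increase_factor(factor, reputation, base_step=0.1, max_reputation=100):
    """Raise a factor by an amount proportional to the user's reputation."""
    # Clamp the reputation to [0, max_reputation], then scale to [0, 1].
    weight = max(0, min(reputation, max_reputation)) / max_reputation
    return factor + base_step * weight


factor = 1.0
for _ in range(1000):  # a thousand brand-new accounts, each with reputation 0
    factor = increase_factor(factor, reputation=0)
sybil_result = factor

trusted_result = increase_factor(1.0, reputation=80)
```

Under these assumptions a thousand selections from zero-reputation accounts leave the factor at its starting value, whereas a single selection from a user with a reputation of 80 moves it by 0.08.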
In one embodiment, the users may have different reputation scores for different categories of products available at an internet marketplace. Thus, a person may have a high reputation for buying and selling electronics but only a very novice reputation for buying and selling housewares. In such an embodiment the system would identify the category of product searched and use the user's reputation in that category when making changes to relevancy adjustment factors. In this manner, the system factors in a person's specific skill set such that their selections in their categories of high reputation will have significant effects on the relevancy system but their selections in other areas will not have significant effects on the relevancy system. Note that this will require a user to participate in a number of successful transactions before that user's selections have a significant effect on the relevancy. This helps prevent a person from attempting to create many accounts that participate in one transaction each and then using those many accounts to abuse the relevancy system.
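A per-category lookup of this kind might look like the sketch below. The function name `reputation_for`, the category labels, the sample scores, and the default of zero for categories in which the user has no history are assumptions.

```python
def reputation_for(user_reputations, category, default=0):
    """Look up a user's reputation in the searched product category."""
    return user_reputations.get(category, default)


# Hypothetical user: experienced in electronics, a novice in housewares.
user = {"electronics": 95, "housewares": 3}
electronics_rep = reputation_for(user, "electronics")
toys_rep = reputation_for(user, "toys")  # no history in this category
```

The value returned for the searched category would then take the place of a single global reputation score when a relevancy adjustment factor is changed.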
Integration with other Relevancy Systems
The relevancy system for dynamic data sets that has been disclosed may be integrated with other relevancy systems that are based on other factors.
For example, an alternate system may use the reputation of sellers when determining relevancy such that sellers with higher reputations receive higher relevancy scores than sellers with low reputations. In such a system, buyers will be presented with more reliable sellers at the top of the search results. The presented relevancy system for dynamic data sets could be combined with such a system (or multiple other relevancy systems) such that a combined relevance score is used to present search results.
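The description does not say how the scores from several relevancy systems are combined. One simple possibility, shown purely as an assumption, is a weighted sum; the function name `combined_score`, the system names, and the weights are hypothetical.

```python
def combined_score(scores, weights):
    """Weighted sum of scores from several relevancy systems (assumed scheme)."""
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical per-system scores for one item and hypothetical weights.
scores = {"selection": 1.2, "seller_reputation": 0.9}
weights = {"selection": 0.7, "seller_reputation": 0.3}
total = combined_score(scores, weights)
```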
Although the relevancy system has largely been disclosed with reference to an internet marketplace embodiment, it must be stressed that the relevancy system can be used in many other embodiments. In other embodiments, the user reputation score may be replaced with another similar measure of a user's experience with a system. For example, in an embodiment for a message posting board the user reputation score may be replaced with a number of postings made by that user. Furthermore, the invention has been described with a descriptive factor consisting of other words in an item's description field, but any other descriptive factor that can be used to identify similar items in the future can be used.
The preceding description is intended to be illustrative, and not restrictive. For example, the above-described embodiments (or one or more aspects thereof) may be used in combination with each other. Other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of the claims should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. In the appended claims, the terms "including" and "in which" are used as the plain-English equivalents of the respective terms "comprising" and "wherein." Also, in the following claims, the terms "including" and "comprising" are open-ended, that is, a system, device, article, or process that includes elements in addition to those listed after such a term in a claim is still deemed to fall within the scope of that claim. Moreover, in the following claims, the terms "first," "second," and "third," etc. are used merely as labels, and are not intended to impose numerical requirements on their objects.
While embodiments of the invention have been described in the detailed description, the scope of the claims should not be limited by the preferred embodiments set forth in the examples, but should be given the broadest interpretation consistent with the description as a whole.
As a corollary, extra words from item descriptions that were not selected may have their relevancy adjustment scores decreased. In one embodiment, such terms must not be in any selected item and must appear in more than one non-selected item. In such an embodiment, the terms "transmitter" and "leather"
may have their adjustment relevancy scores decreased.
Determining Relevancy Adjustment Factors Figure 2 illustrates a high level flow chart describing how one possible embodiment could operate to modify the relevancy adjustment factors of extra words in the description. The initial relevancy adjustment factor for extra words can be set to a neutral value such as one ("1"). Figure 3 illustrates some database tables that may be used in various embodiments of the invention.
Referring to the top of Figure 2, the system first receives a search query at stage 210. Next, at stage 220, the system creates a set of search results fulfil the requirements of the user's search query from stage 210. Note that this set of search results may be sorted by relevancy as will be described later.
At stage 230, the system displays a portion of results to the user. In an internet marketplace embodiment, the results may comprise a set of items available for sale. At stage 240, view another portion of the search results, the user may select items to view in greater detail, or leave this set of results.
If the user decides to view another portion of the search results then the system selects another portion of the search results to display and returns to stage 230 to display those results.
If the user decides to view an item from the search results in greater detail, the system proceeds to stage 250. Since the user selected the item, this item is deemed to be relevant to people who entered the particular search query that was entered back at stage 210. Thus, the system will increase the relevancy adjustment factor of descriptive factors related to this selected item for this particular search query.
As set forth earlier, one embodiment uses the additional words in a descriptive field for the item that were not part of the search query as a descriptive factor that can be used to identify similar items in the future.
Thus, at stage 250, the system identifies words from a description field that were not part of search query (if any) and adds these additional descriptive words to a database table 320 associated with a search query entry in a table of popular search queries 310 if these additional words are not already in the additional descriptive words database table 320.
Next, at stage 255, the system increases the relevancy adjustment factor for the additional descriptive words that were identified in the previous stage.
The relevancy adjustment factor may be kept in the same database table 320 as the additional descriptive words. Not that relevancy adjustment factor for each word is done on per search query basis since relevancy of a descriptive word will vary heavily depending on the item. For example, the word "Persian" may be very relevant for rugs but completely irrelevant for iPods.
After modifying the relevancy adjustment factor of the additional words for the selected item, the system displays the selected item to the user in greater detail at stage 260. Additional processing will depend on user input at stage 270.
If the user requests to see the next or previous item then the system will obtain the information associated with that item and return to step 250 to handle the appropriate relevancy adjustment factor modifications and display of that item.
If the user decides to return to the list view of the search results then the system returns to stage 230 to display the search results in the list view.
If the user decides to leave this particular search query at stage 270 (or leaves this search query at earlier stage 240) then the system may determine if any relevancy adjustment factor decreases should be made. At stage 280, the system first determines if at least one item was viewed. If no item was viewed then no relevancy adjustment factor changes may be made since there is insufficient information on whether the user was really interested or disinterested in the displayed items. If at least one item was viewed then the system may proceed to step 290 to possibly reduce one or more relevancy adjustment factors associated with items that were not selected. The system will identify common additional descriptive words that exist in the non selected items. In one embodiment, the system requires that a descriptive word not be in any of the selected items and be in at least two items that were presented to the user but not selected by the user before reducing the relevancy adjustment factor of that descriptive word. Descriptive words that pass this test may have their relevancy adjustment factors reduced. Note that not all relevancy system embodiments will implement the relevancy adjustment factor reduction system disclosed with reference to stages 680 and 690.
Using Relevancy Adjustment Factors Figure 4 illustrates a high level flow chart describing how one possible embodiment could use the relevancy adjustment factors created in the system of Figure 2 to adjust relevancy scores for items in a search result set. Note that the system illustrated in Figure 4 could be used within stage 220 of the system in Figure 2.
Initially, a search query is received at step 410. Then at stage 420, the system then searches the item database to generate an initial set of results that fulfil the requirements of the search query entered at stage 410.
After obtaining the initial search result, the search results must be sorted by relevancy. To achieve this goal, the system retrieves the relevancy adjustment factors for the additional descriptive words in the items in the initial result at stage 430.
Next, at stage 440, the relevancy adjustment factors are applied to relevancy adjustment scores given to each item in the initial search result.
In one embodiment, the relevancy adjustment factor may be multiplied against an initial relevancy score given to an item in a set of search query results to adjust the relevancy score of the item. Table 2 lists one possible set of relevancy adjustment factors for such an embodiment wherein some extra words associated with an "ipod nano" search query are listed. The relevancy adjustment factors for the extra words may be normalized to stay within a defined range. For example, the set of relevancy adjustment factors have been normalized to stay within the range of zero to two.
Table 2: "ipod nano" search query relevancy adjustment factor Extra Description Words Relevancy adjustment factor Sealed 1.5 black 0.8 leather 0.4 transmitter 0.32 white 0.74 4GB 0.9 case 0.37 2GB 0.6 8GB 1.2 new 1.3 To apply the relevancy adjustment factors given in Table 2 the relevancy adjustment factors are multiplied against an initial relevancy score given to an item if that item has the associated extra word in its description. Thus, referring to Table 2, items in a result set for an "ipod nano" search query with highly relevant additional descriptive words such as "sealed", "8GB",, and "new" will increase the relevancy score for those items. Similarly, items in a result set for an "ipod nano" search query with largely irrelevant additional descriptive words such as "leather", "transmitter", or "case" will reduce the relevancy score for those items. Many other method of using the relevancy adjustment factor to modify an initial relevancy score may be used.
In an alternate embodiment, the relevancy adjustment factor may be added to an initial relevancy score for an item to adjust the item's relevancy score. Table 3 lists one possible set of relevancy adjustment factors for such an embodiment wherein some extra words associated with an "ipod nano" search query are listed. The relevancy adjustment factors for the extra words may be normalized to stay within a defined range such as -100 to 100.
Table 3: "ipod nano" search query relevancy adjustment factors

Extra Description Word | Relevancy adjustment factor |
---|---|
sealed | 73 |
black | -4 |
leather | -70 |
transmitter | -83 |
white | 2 |
case | -80 |
new | 82 |

Note that in Table 3, the highly desirable terms ("sealed", "8GB", and "new") have large positive relevancy adjustment factors. Similarly, the undesirable terms ("transmitter", "leather", and "case") have large negative scores. The remaining neutral terms will have relatively little effect on the relevancy score.
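The additive embodiment can be sketched the same way with the Table 3 values. The name `adjust_additive`, the whitespace tokenization, and the summing of all matching factors are illustrative assumptions rather than details given in the description.

```python
# Relevancy adjustment factors from Table 3 for the "ipod nano" query.
TABLE_3_FACTORS = {
    "sealed": 73, "black": -4, "leather": -70, "transmitter": -83,
    "white": 2, "case": -80, "new": 82,
}


def adjust_additive(initial_score, description, factors):
    """Add the factor of each extra word present to the initial score.

    Assumption: each distinct word counts once, and words without a stored
    factor contribute nothing.
    """
    words = set(description.lower().split())
    return initial_score + sum(factors.get(word, 0) for word in words)


print(adjust_additive(100, "ipod nano new sealed", TABLE_3_FACTORS))    # 255
print(adjust_additive(100, "ipod nano leather case", TABLE_3_FACTORS))  # -50
print(adjust_additive(100, "ipod nano black", TABLE_3_FACTORS))         # 96
```

With an initial score of 100, the desirable words lift the item to 255, the undesirable words push it to -50, and a neutral word such as "black" barely moves it.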
After adjusting an initial set of relevancy scores, the items are then ordered according to the adjusted relevancy score at stage 450. The relevancy sorted set of items may then be presented to the user. Since the result set has been sorted with items similar to previously selected items from earlier searches with the same query placed at the top, the user should be able to find a desired item quickly.
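Stages 410 through 450 can be put together in a short Python sketch. Everything here is a simplified assumption for illustration: the `Item` class, the in-memory item list standing in for the item database, the all-words matching rule in `search`, and the per-query dictionary `factors_by_query` are hypothetical and are not taken from the description.

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: int
    description: str
    initial_score: float = 1.0


def search(query, items):
    """Stage 420: keep items whose description contains every query word."""
    terms = query.lower().split()
    return [item for item in items
            if all(term in item.description.lower().split() for term in terms)]


def rank(query, items, factors_by_query):
    """Stages 420 to 450: search, fetch factors, adjust scores, sort."""
    results = search(query, items)                      # stage 420
    factors = factors_by_query.get(query.lower(), {})   # stage 430
    query_terms = set(query.lower().split())
    scored = []
    for item in results:
        score = item.initial_score
        extra_words = set(item.description.lower().split()) - query_terms
        for word in extra_words:                        # stage 440
            score *= factors.get(word, 1.0)
        scored.append((score, item))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stage 450
    return [item for _, item in scored]


catalog = [
    Item(1, "ipod nano leather case"),
    Item(2, "ipod nano 8GB new"),
    Item(3, "ipod shuffle new"),
]
factors = {"ipod nano": {"leather": 0.4, "case": 0.37, "8gb": 1.2, "new": 1.3}}
ordered = rank("ipod nano", catalog, factors)
```

In this sketch the "8GB new" listing is placed ahead of the "leather case" listing, and the "ipod shuffle" listing is excluded at stage 420 because it does not match the query.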
Preventing Abuse of a Relevancy System for Dynamic Data Sets
As set forth in the discussion on Internet search engines that rely upon hyperlinks to a web site as a measure of that web site's popularity, such Internet search engines can be abused by people that create thousands of unrelated web sites that link to a specific web site. This multitude of links to a specific web site will create a false appearance of popularity of that specific web site.
Similar methods of abuse may be attempted on the dynamic data set relevancy system disclosed in the previous sections.
For example, very aggressive sellers on an internet marketplace may attempt to create automated programs that repeatedly select the items that those sellers have posted for sale on the internet marketplace. In this manner, such sellers may attempt to make the items that they post onto the internet marketplace look popular so that those items receive an increased relevancy score.
To prevent such abuse, a set of restraints may be imposed on the relevancy scoring system. A first restraint may be directed to preventing the exact scenario described in the previous paragraph. Figure 5 illustrates the relevancy adjustment factor system disclosed in Figure 2 but with an added step to prevent abuse by aggressive sellers that click on their own items posted for sale.
Referring to Figure 5, stage 545 has been added after a user selects an item for viewing in greater detail. At stage 545, the system determines if the selected item is an item that was posted by this particular user or if this user has already viewed this particular item. If either case is true, then the system skips the relevancy adjustment factor modification stages 550 and 555 and instead goes directly to stage 560 where the system displays the item to the user. In this manner, the system prevents a user from repeatedly selecting his own item.
Furthermore, stage 545 prevents a user from creating a second account and then repeatedly selecting his own item from that second account.
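The check performed at stage 545 can be sketched as a small guard function. The dictionary keys `seller_id` and `item_id`, the function name `should_adjust`, and the in-memory `viewed` set are hypothetical names chosen for illustration; a real system would presumably keep the view history in persistent storage.

```python
def should_adjust(user_id, item, viewed):
    """Stage 545 sketch: decide whether a selection may change the factors.

    Returns False when the user posted the item or has already viewed it,
    in which case stages 550 and 555 are skipped and the item is simply
    displayed (stage 560).
    """
    if item["seller_id"] == user_id:           # the user's own listing
        return False
    key = (user_id, item["item_id"])
    if key in viewed:                          # repeat view by the same user
        return False
    viewed.add(key)                            # remember the first view
    return True


viewed = set()
listing = {"item_id": 7, "seller_id": "alice"}
print(should_adjust("alice", listing, viewed))  # False: own item
print(should_adjust("bob", listing, viewed))    # True: first view
print(should_adjust("bob", listing, viewed))    # False: already viewed
```

Because each user and item pair is counted only once, a second account can contribute at most one adjustment for a given item.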
Preventing Abuse of a Relevancy System with User Reputation Scores
In internet marketplace systems, it is common to have a reputation score for buyers and sellers that participate in the internet marketplace such that people have some sort of measure as to whether the other party in a potential transaction should be trusted or not. These reputation scores are generally created by having users provide feedback on the other party in a transaction on the internet marketplace after that transaction is completed (or is otherwise ended). In one embodiment of the disclosed system, such a user reputation score has been incorporated into the relevancy system. Incorporating user reputation scores into a relevancy system improves the results of the relevancy system and reduces the possibility of abuse of the relevancy system.
Figure 6 illustrates a relevancy score adjustment system for dynamic data sets wherein a reputation score associated with each user has been incorporated into the relevancy system. The system of Figure 6 is the same as the system of Figure 5 except that the user's reputation is taken into consideration when making changes to the relevancy adjustment factors. Specifically, stage 655 has been changed to indicate that the system increases the relevancy adjustment factor by an amount correlated to the user's reputation score.
Similarly, stage 690 has been changed to indicate that the system reduces the relevancy adjustment factor by an amount correlated to the user's reputation score.
Incorporating user reputation scores into the relevancy system provides a number of significant advantages to the relevancy system. One advantage is that changes to the relevancy adjustment factors may be made in a manner that is correlated to the user's skill. An experienced user will have a higher reputation score such that selections by that experienced user will change the relevancy system more than selections by a novice user.
Another advantage is that incorporating user reputation scores into the relevancy system can be used to prevent abuse of the relevancy system.
Specifically, an aggressive seller may attempt to thwart the restriction set forth in stage 645, which allows only one selection of an item by a particular user to count toward the adjustment for that item, by creating a large number of new accounts and selecting the seller's item from each of those new accounts. By setting the reputation score of new accounts to be zero or another low value, the selections made by such new accounts will have no or very little effect on the relevancy system. Thus, the creation of a large number of new accounts cannot be used to abuse the relevancy system.
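The reputation-weighted changes of stages 655 and 690 can be sketched as follows. The description only says the change is "correlated to" the reputation score, so the linear scaling, the reputation range of 0.0 to 1.0, the `base_step` value, and the clamping to the zero-to-two range used for multiplicative factors are all assumptions made for illustration.

```python
def update_factor(factor, reputation, selected,
                  base_step=0.05, lower=0.0, upper=2.0):
    """Raise (stage 655) or lower (stage 690) a relevancy adjustment factor.

    Assumptions: reputation lies in [0.0, 1.0], the change is proportional
    to reputation, and the result is clamped to [lower, upper].
    """
    delta = base_step * reputation
    new_factor = factor + delta if selected else factor - delta
    return max(lower, min(upper, new_factor))


# Selections from 1000 brand-new accounts (reputation 0) change nothing.
factor = 1.0
for _ in range(1000):
    factor = update_factor(factor, reputation=0.0, selected=True)
print(factor)  # 1.0

# A single selection by a fully trusted user does move the factor.
trusted = update_factor(1.0, reputation=1.0, selected=True)
```

Under these assumptions a zero-reputation account contributes a step of exactly zero, so mass account creation has no effect, while an established user's selection shifts the factor by the full step.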
In one embodiment, the users may have different reputation scores for different categories of products available at an internet marketplace. Thus, a person may have a high reputation for buying and selling electronics but only a novice reputation for buying and selling housewares. In such an embodiment the system would identify the category of product searched and use the user's reputation in that category when making changes to relevancy adjustment factors. In this manner, the system factors in a person's specific skill set such that their selections in their categories of high reputation will have significant effects on the relevancy system but their selections in other areas will not have significant effects on the relevancy system. Note that this will require a user to participate in a number of successful transactions before that user's selections have a significant effect on the relevancy system. This helps prevent a person from attempting to create many accounts that participate in one transaction each and then using those many accounts to abuse the relevancy system.
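A per-category reputation lookup might be sketched as below. Deriving reputation from a count of successful transactions, the `saturation` threshold of 20, and the resulting range of 0.0 to 1.0 are illustrative assumptions; the description does not state how category reputations are computed.

```python
from collections import Counter


def category_reputation(completed_categories, category, saturation=20):
    """Return a reputation in [0.0, 1.0] for one product category.

    completed_categories is a list with one category name per successful
    transaction. Assumption: reputation grows linearly with the count and
    saturates at 1.0 after `saturation` transactions.
    """
    count = Counter(completed_categories)[category]
    return min(1.0, count / saturation)


history = ["electronics"] * 20 + ["housewares"]
electronics_rep = category_reputation(history, "electronics")
housewares_rep = category_reputation(history, "housewares")
toys_rep = category_reputation(history, "toys")
```

With this history the user's selections would carry full weight for electronics searches, very little weight for housewares searches, and none for categories in which the user has never completed a transaction. An account with a single transaction therefore has almost no influence.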
Integration with Other Relevancy Systems
The relevancy system for dynamic data sets that has been disclosed may be integrated with other relevancy systems that are based on other factors.
For example, an alternate system may use the reputation of sellers when determining relevancy such that sellers with higher reputations receive higher relevancy scores than sellers with low reputations. In such a system, buyers will be presented with more reliable sellers at the top of the search results. The presented relevancy system for dynamic data sets could be combined with such a system (or multiple other relevancy systems) such that a combined relevance score is used to present search results.
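One simple way to form a combined relevance score is a weighted average of the scores produced by each relevancy system. This is only one possible combination rule, chosen here for illustration; the description does not specify how the scores are combined, and the names `combined_relevance`, `dynamic`, and `seller_reputation` as well as the weights are hypothetical.

```python
def combined_relevance(scores, weights):
    """Weighted average of scores from several relevancy systems.

    scores and weights are dictionaries keyed by system name. A system
    with no score for the item is treated as contributing 0.0.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weight * scores.get(name, 0.0)
                   for name, weight in weights.items())
    return weighted / total_weight


weights = {"dynamic": 3.0, "seller_reputation": 1.0}
item_a = combined_relevance({"dynamic": 0.8, "seller_reputation": 0.4}, weights)
item_b = combined_relevance({"dynamic": 0.5, "seller_reputation": 0.9}, weights)
```

With these example weights the dynamic data set score dominates, so item A ranks above item B even though item B comes from a seller with a higher reputation.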
Although the relevancy system has largely been disclosed with reference to an internet marketplace embodiment, it must be stressed that the relevancy system can be used in many other embodiments. In other embodiments, the user reputation score may be replaced with another similar measure of a user's experience with a system. For example, in an embodiment for a message posting board the user reputation score may be replaced with a number of postings made by that user. Furthermore, the invention has been described using other words in an item's description field as the descriptive factor, but any other descriptive factor that can be used to identify similar items in the future can be used.
The preceding description is intended to be illustrative, and not restrictive. For example, the above-described embodiments (or one or more aspects thereof) may be used in combination with each other. Other embodiments will be apparent to those of skill in the art upon reviewing the above description. The scope of the claims should, therefore, be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled. In the appended claims, the terms "including" and "in which" are used as the plain-English equivalents of the respective terms "comprising" and "wherein." Also, in the following claims, the terms "including" and "comprising" are open-ended, that is, a system, device, article, or process that includes elements in addition to those listed after such a term in a claim is still deemed to fall within the scope of that claim. Moreover, in the following claims, the terms "first," "second," and "third," etc. are used merely as labels, and are not intended to impose numerical requirements on their objects.
While embodiments of the invention have been described in the detailed description, the scope of the claims should not be limited by the preferred embodiments set forth in the examples, but should be given the broadest interpretation consistent with the description as a whole.
Claims (22)
1. A computer-implemented method of inferring relevancy from search query results, said method comprising:
accepting a search query from a user of a search engine;
generating a set of search result items in response to said search query from said user;
accepting a selection by said user of a first item from said set of search result items; and
modifying a relevancy adjustment factor for a descriptive factor associated with said first item by an amount correlated to a reputation score of said user, said descriptive factor being abstracted out from said first item such that said descriptive factor identifies said first item and a second item, said second item not included in said set of search result items.
2. The computer-implemented method of inferring relevancy from search query results as set forth in claim 1 wherein said reputation score of said user is based on a rating given by a party that entered into a transaction with said user.
3. The method of inferring relevancy from search query results as set forth in claim 1 wherein said descriptive factor comprises a word from a description field of said second item.
4. The computer-implemented method of inferring relevancy from search query results as set forth in claim 1 wherein said generating said set of search result items in response to said search query from said user comprises ranking said search result items using a composite relevancy score based upon said relevancy adjustment factor.
5. The computer-implemented method of inferring relevancy from search query results as set forth in claim 1 wherein said modifying of said relevancy adjustment factor for said descriptive factor was not performed when said user posted said first item.
6. The computer-implemented method of inferring relevancy from search query results as set forth in claim 1 wherein said relevancy adjustment factor for said descriptive factor is only valid for said search query.
7. The computer-implemented method of inferring relevancy from search query results as set forth in claim 1 said method further comprising modifying a relevancy adjustment factor for a second descriptive factor associated with an item in said set of search result that was not selected by said user by an amount correlated to said reputation score of said user.
8. The computer-implemented method of inferring relevancy from search query results as set forth in claim 7 wherein said second descriptive factor is in more than one item and not in said item selected by said user.
9. The computer-implemented method of inferring relevancy from search query results as set forth in claim 1 wherein said relevancy adjustment factor is used to adjust a relevancy score of a search result item in said set of search result items.
10. The computer-implemented method of inferring relevancy from search query results as set forth in claim 1 wherein said search result items comprise items for sale in an online marketplace.
11. A computer-readable medium, said computer readable medium comprising a set of instructions for inferring relevancy from search query results, said set of instructions to perform operations according to the method of any one of claims 1 to 10.
12. A computer-implemented method comprising:
accepting a search query from a user of a search engine;
generating a set of search result items in response to said search query from said user;
accepting a selection by said user of a first item from said set of search result items; and
modifying a relevancy adjustment factor for a descriptive factor associated with said first item by an amount correlated to a reputation score of said user, said descriptive factor identifying said first item and a second item.
13. The computer-implemented method of claim 12, wherein said reputation score of said user is based on a rating given by a party that entered into a transaction with said user.
14. The computer-implemented method of claim 12, wherein said descriptive factor comprises a word from a description field of said second item.
15. The computer-implemented method of claim 12, wherein said generating of said set of search result items in response to said search query from said user comprises ranking said search result items using a composite relevancy score based upon said relevancy adjustment factor.
16. The computer-implemented method of claim 12, wherein said modifying of said relevancy adjustment factor for said descriptive factor was not performed when said user posted said first item.
17. The computer-implemented method of claim 12, wherein said relevancy adjustment factor for said descriptive factor is only valid for said search query.
18. The computer-implemented method of claim 12, further comprising modifying a relevancy adjustment factor for a second descriptive factor associated with an item in said set of search result items that was not selected by said user by an amount correlated to said reputation score of said user.
19. The computer-implemented method of claim 12, wherein said second descriptive factor is in more than one item and not in said item selected by said user.
20. The computer-implemented method of claim 12, wherein said relevancy adjustment factor is used to adjust a relevancy score of a search result item in said set of search result items.
21. The computer-implemented method of claim 12, wherein said search result items comprise items for sale in an online marketplace.
22. A computer-readable medium, said computer readable medium comprising a set of instructions for inferring relevancy from search query results, said set of instructions to perform operations according to the method of any one of claims 12 to 21.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA2856645A CA2856645C (en) | 2007-11-30 | 2008-11-25 | Using reputation measures to improve search relevance |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/948,788 US8583633B2 (en) | 2007-11-30 | 2007-11-30 | Using reputation measures to improve search relevance |
US11/948,788 | 2007-11-30 | ||
PCT/US2008/013118 WO2009070287A1 (en) | 2007-11-30 | 2008-11-25 | Using reputation measures to improve search relevance |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA2856645A Division CA2856645C (en) | 2007-11-30 | 2008-11-25 | Using reputation measures to improve search relevance |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2706773A1 CA2706773A1 (en) | 2009-06-04 |
CA2706773C true CA2706773C (en) | 2014-07-15 |
Family
ID=40676786
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA2856645A Active CA2856645C (en) | 2007-11-30 | 2008-11-25 | Using reputation measures to improve search relevance |
CA2706773A Active CA2706773C (en) | 2007-11-30 | 2008-11-25 | Using reputation measures to improve search relevance |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA2856645A Active CA2856645C (en) | 2007-11-30 | 2008-11-25 | Using reputation measures to improve search relevance |
Country Status (8)
Country | Link |
---|---|
US (3) | US8583633B2 (en) |
EP (1) | EP2225671A4 (en) |
JP (1) | JP5141994B2 (en) |
KR (1) | KR101215791B1 (en) |
CN (2) | CN101884042B (en) |
AU (1) | AU2008330082B2 (en) |
CA (2) | CA2856645C (en) |
WO (1) | WO2009070287A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9063986B2 (en) | 2007-11-30 | 2015-06-23 | Ebay Inc. | Using reputation measures to improve search relevance |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4725627B2 (en) * | 2008-10-02 | 2011-07-13 | ブラザー工業株式会社 | Communication device |
US9336310B2 (en) | 2009-07-06 | 2016-05-10 | Google Inc. | Monitoring of negative feedback systems |
US8627476B1 (en) * | 2010-07-05 | 2014-01-07 | Symantec Corporation | Altering application behavior based on content provider reputation |
CN102456057B (en) * | 2010-11-01 | 2016-08-17 | 阿里巴巴集团控股有限公司 | Search method based on online trade platform, device and server |
US20120210240A1 (en) * | 2011-02-10 | 2012-08-16 | Microsoft Corporation | User interfaces for personalized recommendations |
US9870424B2 (en) * | 2011-02-10 | 2018-01-16 | Microsoft Technology Licensing, Llc | Social network based contextual ranking |
US8819000B1 (en) * | 2011-05-03 | 2014-08-26 | Google Inc. | Query modification |
US8825644B1 (en) | 2011-10-14 | 2014-09-02 | Google Inc. | Adjusting a ranking of search results |
US8887238B2 (en) | 2011-12-07 | 2014-11-11 | Time Warner Cable Enterprises Llc | Mechanism for establishing reputation in a network environment |
US8606777B1 (en) * | 2012-05-15 | 2013-12-10 | International Business Machines Corporation | Re-ranking a search result in view of social reputation |
US9152714B1 (en) | 2012-10-01 | 2015-10-06 | Google Inc. | Selecting score improvements |
CN103793388B (en) * | 2012-10-29 | 2017-08-25 | 阿里巴巴集团控股有限公司 | The sort method and device of search result |
US9298785B2 (en) | 2013-07-19 | 2016-03-29 | Paypal, Inc. | Methods, systems, and apparatus for generating search results |
US10140644B1 (en) * | 2013-10-10 | 2018-11-27 | Go Daddy Operating Company, LLC | System and method for grouping candidate domain names for display |
US9866526B2 (en) | 2013-10-10 | 2018-01-09 | Go Daddy Operating Company, LLC | Presentation of candidate domain name stacks in a user interface |
CN103914553A (en) * | 2014-04-14 | 2014-07-09 | 百度在线网络技术(北京)有限公司 | Search method and search engine |
CN104899322B (en) * | 2015-06-18 | 2021-09-17 | 百度在线网络技术(北京)有限公司 | Search engine and implementation method thereof |
US10198512B2 (en) * | 2015-06-29 | 2019-02-05 | Microsoft Technology Licensing, Llc | Search relevance using past searchers' reputation |
US10872124B2 (en) * | 2018-06-27 | 2020-12-22 | Sap Se | Search engine |
Family Cites Families (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5855015A (en) | 1995-03-20 | 1998-12-29 | Interval Research Corporation | System and method for retrieval of hyperlinked information resources |
JP3470782B2 (en) | 1996-01-09 | 2003-11-25 | 沖電気工業株式会社 | Information retrieval device |
US6493702B1 (en) * | 1999-05-05 | 2002-12-10 | Xerox Corporation | System and method for searching and recommending documents in a collection using share bookmarks |
US7080064B2 (en) * | 2000-01-20 | 2006-07-18 | International Business Machines Corporation | System and method for integrating on-line user ratings of businesses with search engines |
US20020103798A1 (en) | 2001-02-01 | 2002-08-01 | Abrol Mani S. | Adaptive document ranking method based on user behavior |
US7698276B2 (en) * | 2002-06-26 | 2010-04-13 | Microsoft Corporation | Framework for providing a subscription based notification system |
US20040015416A1 (en) * | 2002-07-22 | 2004-01-22 | Benjamin David Foster | Seller configurable merchandising in an electronic marketplace |
US6829599B2 (en) * | 2002-10-02 | 2004-12-07 | Xerox Corporation | System and method for improving answer relevance in meta-search engines |
GB0227613D0 (en) * | 2002-11-27 | 2002-12-31 | Hewlett Packard Co | Collecting browsing effectiveness data via refined transport buttons |
US8856163B2 (en) * | 2003-07-28 | 2014-10-07 | Google Inc. | System and method for providing a user interface with search query broadening |
US7822631B1 (en) * | 2003-08-22 | 2010-10-26 | Amazon Technologies, Inc. | Assessing content based on assessed trust in users |
US20050222987A1 (en) * | 2004-04-02 | 2005-10-06 | Vadon Eric R | Automated detection of associations between search criteria and item categories based on collective analysis of user activity data |
US20060010117A1 (en) | 2004-07-06 | 2006-01-12 | Icosystem Corporation | Methods and systems for interactive search |
US8010460B2 (en) * | 2004-09-02 | 2011-08-30 | Linkedin Corporation | Method and system for reputation evaluation of online users in a social networking scheme |
WO2007002820A2 (en) * | 2005-06-28 | 2007-01-04 | Yahoo! Inc. | Search engine with augmented relevance ranking by community participation |
KR100776697B1 (en) | 2006-01-05 | 2007-11-16 | 주식회사 인터파크지마켓 | Method for searching products intelligently based on analysis of customer's purchasing behavior and system therefor |
US20070168344A1 (en) | 2006-01-19 | 2007-07-19 | Brinson Robert M Jr | Data product search using related concepts |
US9443333B2 (en) * | 2006-02-09 | 2016-09-13 | Ebay Inc. | Methods and systems to communicate information |
US7844603B2 (en) * | 2006-02-17 | 2010-11-30 | Google Inc. | Sharing user distributed search results |
US7603350B1 (en) * | 2006-05-09 | 2009-10-13 | Google Inc. | Search result ranking based on trust |
EP1855245A1 (en) * | 2006-05-11 | 2007-11-14 | Deutsche Telekom AG | A method and a system for detecting a dishonest user in an online rating system |
US20070266025A1 (en) * | 2006-05-12 | 2007-11-15 | Microsoft Corporation | Implicit tokenized result ranking |
US20070288602A1 (en) * | 2006-06-09 | 2007-12-13 | Ebay Inc. | Interest-based communities |
JP5122795B2 (en) * | 2006-11-28 | 2013-01-16 | 株式会社エヌ・ティ・ティ・ドコモ | Search system and search method |
US20080288481A1 (en) * | 2007-05-15 | 2008-11-20 | Microsoft Corporation | Ranking online advertisement using product and seller reputation |
US8548996B2 (en) * | 2007-06-29 | 2013-10-01 | Pulsepoint, Inc. | Ranking content items related to an event |
US8583633B2 (en) | 2007-11-30 | 2013-11-12 | Ebay Inc. | Using reputation measures to improve search relevance |
US20100010987A1 (en) * | 2008-07-01 | 2010-01-14 | Barry Smyth | Searching system having a server which automatically generates search data sets for shared searching |
US8886633B2 (en) * | 2010-03-22 | 2014-11-11 | Heystaks Technology Limited | Systems and methods for user interactive social metasearching |
Date | Country | Application Number | Publication Number | Status |
---|---|---|---|---|
2007-11-30 | US | US11/948,788 | US8583633B2 | Active |
2008-11-25 | WO | PCT/US2008/013118 | WO2009070287A1 | Application Filing |
2008-11-25 | JP | JP2010535997A | JP5141994B2 | Active |
2008-11-25 | CN | CN200880118613.8A | CN101884042B | Active |
2008-11-25 | KR | KR1020107014547A | KR101215791B1 | IP Right Grant |
2008-11-25 | EP | EP08855484A | EP2225671A4 | Ceased |
2008-11-25 | AU | AU2008330082A | AU2008330082B2 | Active |
2008-11-25 | CA | CA2856645A | CA2856645C | Active |
2008-11-25 | CN | CN201410274098.XA | CN104111974A | Pending |
2008-11-25 | CA | CA2706773A | CA2706773C | Active |
2013-11-06 | US | US14/073,564 | US9063986B2 | Active |
2015-05-29 | US | US14/725,815 | US20150261763A1 | Abandoned |
Also Published As
Publication number | Publication date |
---|---|
CN104111974A (en) | 2014-10-22 |
US8583633B2 (en) | 2013-11-12 |
AU2008330082A1 (en) | 2009-06-04 |
US9063986B2 (en) | 2015-06-23 |
JP2011505628A (en) | 2011-02-24 |
CA2856645A1 (en) | 2009-06-04 |
AU2008330082B2 (en) | 2011-12-22 |
CN101884042B (en) | 2014-07-16 |
US20150261763A1 (en) | 2015-09-17 |
CA2706773A1 (en) | 2009-06-04 |
US20140067785A1 (en) | 2014-03-06 |
EP2225671A4 (en) | 2011-05-11 |
EP2225671A1 (en) | 2010-09-08 |
US20090144259A1 (en) | 2009-06-04 |
WO2009070287A1 (en) | 2009-06-04 |
KR20100101621A (en) | 2010-09-17 |
KR101215791B1 (en) | 2012-12-26 |
CN101884042A (en) | 2010-11-10 |
JP5141994B2 (en) | 2013-02-13 |
CA2856645C (en) | 2017-01-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request |