WO2002011446A2 - Transcript triggers for video enhancement


Info

Publication number
WO2002011446A2
Authority
WO
WIPO (PCT)
Prior art keywords
video program
information
segment
user profile
rules
Application number
PCT/EP2001/007965
Other languages
French (fr)
Other versions
WO2002011446A3 (en)
Inventor
Thomas Mcgee
Nevenka Dimitrova
Lalitha Agnihotri
Original Assignee
Koninklijke Philips Electronics N.V.
Application filed by Koninklijke Philips Electronics N.V.
Priority to JP2002515840A (published as JP2004505563A)
Priority to KR1020027003919A (published as KR20020054325A)
Priority to EP01951665A (published as EP1410637A2)
Publication of WO2002011446A2
Publication of WO2002011446A3

Classifications

    • H04N 7/163 - Authorising the user terminal, e.g. by paying; registering the use of a subscription channel, e.g. billing, by receiver means only
    • H04N 21/41265 - The peripheral being portable, e.g. PDAs or mobile phones, having a remote control device for bidirectional communication between the remote control device and client device
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; client middleware
    • G06F 16/735 - Querying of video data; filtering based on additional data, e.g. user or group profiles
    • G06F 16/7844 - Retrieval of video data characterised by using metadata automatically derived from the content, using original textual content or text extracted from visual content or a transcript of audio data
    • H04N 21/2353 - Processing of additional data specifically adapted to content descriptors, e.g. coding, compressing or processing of metadata
    • H04N 21/44008 - Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N 21/44222 - Analytics of user selections, e.g. selection of programs or purchase activity
    • H04N 21/4532 - Management of client data or end-user data involving end-user characteristics, e.g. viewer profile, preferences
    • H04N 21/454 - Content or additional data filtering, e.g. blocking advertisements
    • H04N 21/4622 - Retrieving content or additional data from different sources, e.g. from a broadcast channel and the Internet
    • H04N 21/4755 - End-user interface for inputting end-user data for defining user preferences, e.g. favourite actors or genre
    • H04N 21/4782 - Supplemental services: web browsing, e.g. WebTV
    • H04N 21/4786 - Supplemental services: e-mailing
    • H04N 21/812 - Monomedia components involving advertisement data
    • H04N 21/8133 - Monomedia components involving additional data specifically related to the content, e.g. biography of the actors in a movie, detailed information about an article seen in a video program
    • H04N 21/84 - Generation or processing of descriptive data, e.g. content descriptors
    • H04N 21/8586 - Linking data to content by using a URL
    • H04N 7/17318 - Direct or substantially direct transmission and handling of requests in two-way analogue subscription systems

Definitions

  • FIG. 2 is a diagram of the processor elements.
  • A profile generator 50 generates and stores a profile of the user's known interests, which includes trigger information or keywords of interest. This is accomplished, for example, through user input, by having the user respond to a series of queries, by creating a default profile based on user characteristics which are modified by the user, or by monitoring user activity to discover areas of interest.
  • The rule generator 52 generates the association rules which logically combine each trigger with a variety of contexts to determine which supplementary information should be displayed to the user.
  • The recognition engine 54 compares each trigger with the transcript text and determines whether the trigger exists as a keyword in the text. When a trigger is matched, the retrieving portion 56 retrieves the supplementary information and the formatting portion 58 formats the data for display.
  • The context monitor 60 monitors the context to see whether it is changing due to the display of a new program segment. When a context change occurs, the context monitor 60 accesses the secondary storage 18 to retrieve a new subset of association rules.
  • The data updater 62 is used to update the supplementary information to incorporate new web sites, for example, or to reflect the results of searches performed by various search engines.
  • The repetition counter 64 counts the frequency with which a particular piece of information is requested, and the clickstream monitor 66 measures the frequency with which a user requests supplementary data in general.
  • Figures 3a and 3b are flow diagrams illustrating the method of the invention.
  • The input video is input to a receiver.
  • The video is in analog or digital form.
  • The transcript extractor, which is separate from or incorporated into the processor, extracts the transcript text in step S202 and identifies the beginning and end of each video segment.
  • The processor retrieves the keywords from the transcript text. Extraction of keywords is well known in the art; one such method of extraction is described in U.S. Patent No. 5,809,471 to Brodsky, entitled "Retrieval of additional information not found in interactive TV or telephony signal by application using dynamically extracted vocabulary."
  • These keywords 152 are extracted from the transcript text 150 and expanded 154 to achieve more meaningful and complete results by associating them with synonymous or related keywords, as shown in Figure 3a, step S204.
  • A thesaurus or a database such as Wordnet® is used for this purpose.
  • Wordnet® is an on-line lexical reference system whose design is inspired by current psycholinguistic theories. The various parts of speech are organized into synonym sets, each representing one underlying lexical concept.
  • Keywords can also be expanded by identifying the theme of the transcript text. For example, the presence of the trigger "economy" in transcript text can be derived when a number of words such as "inflation", "Alan Greenspan", and "unemployment rate" are simultaneously present (a minimal sketch of this theme-based derivation appears after this list). Similarly, the presence of the trigger "President Clinton" can be derived if the keyword "President of the United States" is present in the transcript text.
  • Triggers are mapped to a variety of keywords depending on the level of understanding of the viewer. For example, if the viewer is a child or a foreign-speaking viewer, the trigger "unemployment" would be mapped to the keyword phrase "without a job" but would not be mapped to the keyword "redundancy."
  • In a parental control application, the keywords are expanded as described above, and parental control is implemented below the program level, at the program segment or contextual level. Therefore, parents need not worry if a commercial inappropriate for children is shown during an otherwise appropriate cartoon show, for example. The child viewer is presented with a special screen only during the commercial.
  • This special screen may take the form of a toy advertisement instead of merely a typical blocking screen.
  • Blocking triggers are also expanded to enhance the effectiveness of the blocking. For example, if the parent does not want the child to see video segments related to war, the trigger "war” is mapped to keywords and phrases such as "armed conflict” and "bombing.”
  • An example of trigger expansion is shown in Figure 4a.
  • The personal profile containing the triggers is read.
  • The processor matches the keywords developed from the transcript text against the triggers contained in the user profile in step S206. If there is no match, the processor continues by extracting additional transcript text.
  • In step S207 of Fig. 3b, the context of the ongoing video program is identified. This is done in several ways, using the closed caption data, EPG data, object tracking methods, or low-level feature extraction such as color, motion, texture, or shape.
  • The context of the program segment is also extracted from the transcript text using natural language techniques. For example, Microsoft Corporation has developed software that learns by analyzing existing texts, including online dictionaries and encyclopedias, and automatically acquiring knowledge from this analysis. This knowledge is then used to help constrain the interpretation of the word "plane" in a sentence like "Flying planes can be dangerous" and to determine that the sentence pertains to aviation rather than woodworking.
  • Software also operates at the discourse level, using discourse analysis to identify the structure of the closed caption text and thereby its context. For example, a news program is identified because it would generally report the most important facts, "who, what, when, where, how", in its beginning. Accordingly, a program that began with the sentence "Clint Eastwood was in a gun fight, in Carmel, California, at seven a.m. on Main Street, by a bystander with a home video camera" is identified as a news story.
  • The context is also available in the EPG data, from the genre and subgenre fields or a combination of fields, as explained above.
  • In step S208, the association rules are read.
  • The association rules determine which supplementary data from a stored database should be retrieved, based upon the keyword and context.
  • In step S209, the customized display modules are read. These modules enable the user to restrict the types of information, and therefore also the amount of information, the user wants to view. For example, the user may only wish to see the Uniform Resource Locator (URL) of a WWW page, only larger titles from the page, a page summary, or a full page. The user can choose the supplementary sources he wants to view and prioritize these sources (a minimal sketch of such display modules appears after this list).
  • The supplementary data is retrieved from a database stored in memory.
  • The database contains items of interest, or pointers to items of interest, ancillary to the trigger. For example, the database contains any of the following: names of celebrities and public figures, geographic information such as countries, capitals, and presidents, product and brand names, and assorted categories and topics.
  • The database is maintained and refreshed from an established set of sources. These include, for example, the Bloomberg site, encyclopedias, thesauri, dictionaries, and a set of web sites or search engines. Information from the EPG and closed caption data is also incorporated into the database. A set of refresh and cleanup rules, as shown in Figures 5 and 6, is also stored in a database or a viewer's profile, for example, and maintained for managing the size of the database or profile and its currency. For example, "stale" items such as election results and links to information about polls and the candidates would be deleted after an election takes place.
  • The supplementary information is formatted for display and is displayed in a window or superimposed unobtrusively over the main video segment.
  • Figure 4 illustrates the set of association rules 100 for several triggers 102.
  • The first column represents the triggers 102, and columns 2-5 represent the possible contexts 104, 106, 108, 110 for the example triggers shown.
  • According to the association rule 120 for the first trigger 102, "Clint Eastwood", one of three different items of supplementary information 116, 118, 120 is retrieved for display, depending on the context in which Clint Eastwood appears in the video segment being viewed. Although only one link is shown in each box of the example table, multiple links can exist.
  • If Clint Eastwood appears in a commercial, the system will link to the WWW page located at www.imdb.com and display the page in accordance with the customized display model. If Clint Eastwood appears on a talk show, the talk show segment where he appears will be stored for retrieval 118 and/or an alert sent to the viewer in real time.
  • An offline alert is transmitted for later viewing, notifying the viewer that the segment has been stored.
  • Alerts are automatically or manually retrieved. Alert transmission is also keyed to a topic such that the alert is displayed the next time a Clint Eastwood movie is shown. If Clint Eastwood appears on a news program, the system will link to the WWW page located at www.cnn.com. Alerts have priorities enabling the user to select the circumstances when the user wants to be notified. For example, a user may only want to view alerts pertaining to severe weather warnings.
  • The second association rule 122, for the trigger 102 "Macedonia", deals with four different contexts. If the trigger "Macedonia" appears in an advertisement, the system links to the WWW page at www.travel.com 130.
  • Association rules 3-5 (124, 126, 128) should be interpreted in the same manner as the above examples. As shown in the table, when certain triggers 102 such as "Meryl Streep" appear in transcript text, the system will only provide supplementary information for certain contexts. In the case of "Meryl Streep", supplementary information is only supplied for the Talk Show and News contexts. If desired, such a rule is broadened to apply to a list of well-known actors or all actors.
  • Figure 4a illustrates how both the triggers and keywords can be expanded to retrieve supplementary information.
  • the keyword 152 "Lyme Disease” is extracted from the transcript text 150.
  • the keyword 152 is then expanded to map to the additional key words “tick”, “tick bite”, “bull's eye rash” and "deer tick.” If any of these expanded keywords appear in the transcript text, supplementary information related to Lyme Disease will be retrieved.
  • Figure 4a also illustrates how triggers are expanded.
  • the trigger 102 "Lyme Disease” is expanded 156 to include the related terms "tick bite”, "West Nile virus, and "mosquito spraying.” Accordingly, if the transcript text 150 contains any of the expanded triggers the segment is stored, for example.
  • Figure 5 illustrates how a learning model is implemented to continually update the customized display modules and association rules.
  • The repetition counter 20 maintains a count of how often the user requests the same supplementary data, for example by clicking on a URL. Also, more than one piece of supplementary information may be retrieved by the retrieving portion 56 of the processor, shown in Figure 2, for each segment, and the user may select the information the user wishes to view. If a user requests a particular piece of supplementary data fewer than a predetermined number of times, the stored association rules 26 are updated by the retrieval modifier 24 such that the supplementary data is eliminated from the rule or the rule is modified to include a new source (a sketch of this learning loop appears after this list).
  • The clickstream monitor 22 monitors how frequently the user requests any supplementary data. If the user selects supplementary data fewer than a predetermined number of times, the custom display module 28 for that user is modified by the retrieval modifier 24 such that less information is presented to the user.
  • Figure 6 illustrates how the dynamic association rules database is updated and maintained.
  • The database contains items of interest, or pointers to items of interest, that can provide ancillary information when triggered by a match between a keyword in the transcript text and a trigger in the user's profile.
  • The database is updated over time to reflect current events and to match the evolving user profile.
  • The existing data sources set 36 specifies the data sources from which the association rules database 26 is constructed.
  • The data sources set 36, which includes external data 38 from a variety of published sources, proprietary information, and data from the Internet 14, is updated by the data updater 40 to incorporate new web sites, for example, or to reflect the results of searches performed by various search engines.
  • A set of refresh rules 32 is maintained to keep the size of the database at a preset limit. According to a set of established priorities, information is deleted when necessary.
  • A set of cleanup rules 34 is also maintained, specifying when and how "stale" information is deleted. Information in certain categories is date stamped, and information older than a preset number of months and/or years is deleted (a sketch of these refresh and cleanup rules appears after this list).
  • Figure 7 illustrates an embodiment in which the supplementary information 70 is displayed superimposed unobtrusively over the main video segment.
  • In the example shown, the supplementary information appears at the bottom of the picture.
  • Figure 8 illustrates an embodiment in which a set-top box 75 comprises a receiver 2, which receives the video program and transcript text.
  • A transcript text extractor and segmenter 4 extracts the transcript text 150 from the video signal and associates it with segments of the video program such as commercials and news flashes.
  • A processor system 6 includes processing elements well known in the art: an input/output portion 8, a memory 10, and a processor 12. Via a communication means 17, the processor system retrieves information supplemental to the video program from a variety of sources. Three of these sources, the Internet 14, proprietary (non-public) databases 13, and mobile devices 15 such as PDAs, are shown in the figure as examples.
  • The communication means 17 can connect to other devices not specifically shown, via wireless means, cable modem, a digital subscriber line, or a network, for example.
  • The secondary storage 18 is used to store the supplementary information as well as the rules for retrieving the information.
  • The set-top box can be interfaced to a display such as a PC display or a television.
  • Figure 9 illustrates another embodiment in which a television 80 comprises a receiver 2, a transcript text extractor and segmenter 4, a processor system 6, secondary storage 18, a communication means 17, and a display 16.
  • The processor system 6 includes processing elements well known in the art: an input/output portion 8, a memory 10, and a processor 12.
  • The television 80 interfaces to sources of supplementary information via the communication means 17, which interfaces to the Internet 14, proprietary sources 13, and mobile devices 15, for example.
  • The present invention has been described with respect to particular illustrative embodiments. It is to be understood that the invention is not limited to the above-described embodiments and modifications thereto, and that various changes and modifications may be made by those of ordinary skill in the art without departing from the spirit and scope of the appended claims.
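The theme-based derivation mentioned above (the "economy" example) can be pictured with the following minimal Python sketch. The indicator word lists, the firing threshold, and all names are illustrative assumptions, not taken from the patent.

```python
# Minimal sketch of theme-based trigger derivation: a derived trigger fires
# when enough of its indicator words co-occur in the transcript text.
# Indicator lists and the threshold are illustrative assumptions.
THEMES = {
    "economy": {"inflation", "alan greenspan", "unemployment rate"},
    "President Clinton": {"president of the united states"},
}

def derived_triggers(transcript: str, min_hits: int = 2) -> set[str]:
    text = transcript.lower()
    fired = set()
    for theme, indicators in THEMES.items():
        hits = sum(1 for phrase in indicators if phrase in text)
        # themes with a single indicator fire on one hit; others need min_hits
        if hits >= min(min_hits, len(indicators)):
            fired.add(theme)
    return fired

text = "Alan Greenspan warned of inflation despite a low unemployment rate."
print(derived_triggers(text))  # {'economy'}
```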
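The customized display modules read in step S209 might look like the sketch below, which trims a retrieved page to the user's chosen level of detail (URL only, titles, summary, or full page). The record fields are assumptions made for illustration.

```python
# Sketch of a customized display module: restrict how much of a retrieved
# page is shown. Levels mirror the text above; field names are assumptions.
def format_for_display(page: dict, level: str) -> str:
    if level == "url":
        return page["url"]
    if level == "titles":
        return "\n".join(page["titles"])
    if level == "summary":
        return page["summary"]
    return page["full_text"]  # default: the full page

page = {
    "url": "http://www.imdb.com",
    "titles": ["Clint Eastwood filmography"],
    "summary": "A one-paragraph summary of the page.",
    "full_text": "The complete page text ...",
}
print(format_for_display(page, "summary"))
```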
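The learning loop of Figure 5 (repetition counter 20, clickstream monitor 22, retrieval modifier 24) might be sketched as follows. Thresholds, method names, and data structures are assumptions for illustration, not the disclosed mechanism.

```python
# Sketch of the Figure 5 learning model: prune association rules whose
# supplementary source the user rarely requests, and present less
# information to users who rarely click anything. Thresholds are assumed.
from typing import Optional

class LearningModel:
    def __init__(self, min_source_requests: int = 3, min_click_rate: float = 0.1):
        self.min_source_requests = min_source_requests
        self.min_click_rate = min_click_rate
        self.request_counts: dict[str, int] = {}  # repetition counter 20
        self.offers = 0                           # clickstream monitor 22
        self.clicks = 0

    def record(self, requested_source: Optional[str]) -> None:
        """Call once per offered item; pass None if the user ignored it."""
        self.offers += 1
        if requested_source is not None:
            self.clicks += 1
            self.request_counts[requested_source] = (
                self.request_counts.get(requested_source, 0) + 1)

    def prune_rules(self, rules: dict) -> dict:
        """Retrieval modifier 24: drop rules pointing at rarely used sources."""
        return {key: source for key, source in rules.items()
                if self.request_counts.get(source, 0) >= self.min_source_requests}

    def display_level(self) -> str:
        """Shrink the custom display module 28 for low-click users."""
        if self.offers and self.clicks / self.offers < self.min_click_rate:
            return "url"  # most compact level
        return "summary"

lm = LearningModel()
lm.record("http://www.cnn.com")  # user clicked a link
lm.record(None)                  # user ignored an offer
```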
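The refresh rules 32 and cleanup rules 34 of Figures 5 and 6 might work as in the sketch below: date-stamped entries older than a cutoff are deleted, and the database is trimmed to a preset size by priority. The cutoff, size limit, and field names are assumptions.

```python
# Sketch of refresh/cleanup: delete "stale" date-stamped entries, then trim
# the database to a preset size, keeping the highest-priority items first.
from datetime import datetime, timedelta

def refresh(entries: list[dict], max_age_days: int = 180,
            max_size: int = 1000) -> list[dict]:
    cutoff = datetime.now() - timedelta(days=max_age_days)
    fresh = [e for e in entries if e["stamped"] >= cutoff]   # cleanup rules 34
    fresh.sort(key=lambda e: e["priority"], reverse=True)    # refresh rules 32
    return fresh[:max_size]

db = [
    {"item": "election results", "stamped": datetime(2000, 11, 8), "priority": 1},
    {"item": "Clint Eastwood biography", "stamped": datetime.now(), "priority": 5},
]
print([e["item"] for e in refresh(db)])  # ['Clint Eastwood biography']
```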

Abstract

A system and method for retrieving information supplemental to video programming. Transcript text is searched for terms of interest and information associated with the terms is identified. Depending upon a user profile and the category of video segment being viewed, the supplemental information is formatted for display. Over time, the rules for associating the supplemental information with the terms of interest may be modified using a learning model.

Description

Transcript triggers for video enhancement
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is directed to the field of media technology. It is particularly directed to video and related transcript text.
2. Cross-Reference to Related Applications
This invention associates video with supplementary information using a text transcript, and extracts and augments textual features, as does co-pending application Ser. No. 09/351,086, filed July 9, 1999, by the assignee and incorporated by reference herein.
3. Description of the Related Art
In recent years, the number of media sources has increased and the volume of information from each source has also increased, resulting in information overload. Most consumers have neither the time nor the inclination to sift through the morass of information for what is pertinent to their wants and needs. Accordingly, so-called "push technology" has developed. Webcasting applications such as PointCast or Backweb, or the newer web browsers, ask the user which information categories and web sites the user is interested in. A web server then "pushes" information of interest to the user instead of waiting until the user requests it. This is done periodically and in an unobtrusive manner.
Concurrently, as media technology has progressed, the lines between video, audio, and other media have been blurred. Advances in media technology have enabled the delivery of Internet information and other informational material to the consumer's video display, along with the traditional television programming. Because the Internet has become a tool of e-commerce, consumers are conditioned to view a combination of media, video, audio, and text information on the same or associated topics. Consumers are acquainted with the hyperlink concept and the notion of "drilling down" to retrieve additional information on a subject they are viewing on the World Wide Web (WWW).
Retrieval of this additional information can currently be accomplished using closed caption text, audio, and automated story segmentation and identification. The Broadcast News Editor (BNE), provided by Mitre Corporation, enables such retrieval by automatically partitioning newscasts into individual story segments, and providing a summary of each story segment in the first line of the closed-caption text associated with the segment. Keywords from the closed-caption text or audio are also determined for each story segment.
The Broadcast News Navigator (BNN), also from Mitre Corporation, sorts story segments by the number of keywords in each story segment that match search words selected by the consumer. Accordingly, story segments likely to be of interest to a particular consumer can be readily identified. However, using a combination of BNN and BNE requires that the consumer have an explicit search topic in mind, which is usually not the case in a typical channel-surfing scenario. Patents which disclose providing the user with information supplemental to a television program include US Patent No. 5,809,471 to Brodsky, entitled "Retrieval of additional information not found in interactive TV or telephony signal by application using dynamically extracted vocabulary", and US Patent No. 6,005,565 to Legall et al., entitled "Integrated search of electronic program guide, internet and other information resources." In the '471 patent, keywords are extracted from a television program or closed caption text, creating a dynamically changing dictionary. The user requests information based upon an item seen or word heard in the television broadcast. The user's request is matched against the dictionary, and when there is a match, a search for supplemental information to display is initiated. In the '565 patent, the user selects topics and sources to search. Based on the user input, the search tool performs a search of the electronic program guide and other information resources such as the World Wide Web, and displays the results. Both the '471 patent and the '565 patent require that the user provide a keyword of interest. Neither patent relates the supplementary information retrieved to the global context of the program (e.g., a news program), as opposed to the subject matter of the program (e.g., a stock market report).
SUMMARY OF THE INVENTION
Accordingly, it would be advantageous to provide a method and system employing transcript text for automatically providing supplementary multimedia information enhancing the consumer's television viewing experience. So-called transcript text comprises at least one of the following: video text, text generated by speech recognition software, program transcripts, electronic program guide information, and closed caption text that contains all or part of the program information. Video text is superimposed or overlaid text displayed in the foreground, with the image as a background. Anchor names, for example, often appear as video text. Video text may also take the form of embedded text, for example, a street sign that can be identified and extracted from the video image.
It would also be advantageous to provide supplementary information which is specific not just to the individual consumer's known interests or profile, but also to the context of the program being viewed. For example, news segments would be associated with links to the Cable Network News (CNN) Web page while commercials would be associated with additional product information. The method and system would use learning models to continually develop new associations between the television content and other media content as well as to customize which type and how much supplementary information should be displayed. In this way, supplementary information would be integrated seamlessly with a television program without disturbing the viewer or requiring any action on the viewer's part.
The present invention addresses the foregoing needs by providing a system (i.e., a method, an apparatus, and computer-executable process steps) for retrieval of supplementary information associated with a video segment, for display on the consumer's video display. The system includes a recognition engine for determining whether expanded keywords for retrieving supplementary information are contained in the closed-caption text accompanying the video segment or in other transcript-related text. If a keyword is found, a stored rule indicates the supplementary information to be displayed, the information having been selected from a larger set of information in accordance with a user profile and the context of the segment. Alternatively, the transcript keywords are expanded and then matched to the user's profile. The context of the segment is automatically determined based upon classification data. These data include the program classification, object tracking methods, natural language processing of transcript information, and/or electronic program guide information.
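For illustration only, the following Python sketch shows one way such a recognition step could work: it scans the transcript text for (already expanded) trigger phrases from the user profile. The names and implementation are assumptions; the patent discloses no code.

```python
# Hypothetical sketch of the recognition step: report which (expanded)
# trigger phrases from the user profile occur in the transcript text.
import re

def find_triggers(transcript: str, triggers: set[str]) -> set[str]:
    text = transcript.lower()
    matched = set()
    for trigger in triggers:
        # word boundaries so that "war" does not fire inside "warden"
        if re.search(r"\b" + re.escape(trigger.lower()) + r"\b", text):
            matched.add(trigger)
    return matched

caption = "We will be returning shortly to Clint Eastwood after these announcements."
print(find_triggers(caption, {"Clint Eastwood", "hockey"}))  # {'Clint Eastwood'}
```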
The information is displayed in a window or superimposed unobtrusively over the main video segment. Alternatively, the information is transmitted, for example to a handheld device or an email account, stored to secondary storage, or cached in local memory. The system automatically recognizes the beginning and end of each segment and its story classification, and so is able to update the subset of rules to correspond to the program segment context.
In a further aspect of the invention, the set of rules for associating supplementary information with the video segment being viewed is dynamic and based upon a learning model. The set of rules is updated from a set of sources, including third-party sources, and makes information available to the user in accordance with the user's choices and pattern of behavior. In one embodiment, the rules are transmitted from a Personal Digital Assistant (PDA) enabled with a wireless connection.
This brief summary has been provided so that the nature of the invention will be understood quickly. A more complete understanding of the invention is obtained by reference to the following detailed description of the preferred embodiments thereof in connection with the attached drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 depicts a system on which the present invention is implemented.
Figure 2 depicts elements of the processor contained within the system. Figures 3a and 3b are flow diagrams used for explaining the operation of the present invention.
Figure 4 is a table illustrating supplementary information triggers for a given video segment, according to the present invention.
Figure 4a illustrates how keywords and triggers are expanded. Figure 5 is a diagram of an embodiment of the invention illustrating a learning model.
Figure 6 is a diagram illustrating how the association rules database, for retrieving supplementary information, is updated and maintained.
Figure 7 is a diagram illustrating how supplementary information is displayed. Figure 8 is a diagram illustrating one embodiment of the invention in which a set-top box is used.
Figure 9 is a diagram illustrating another embodiment of the invention in which a television display is used.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Figure 1 shows a representative embodiment of a system on which the present invention is implemented. In this embodiment, a multimedia processor system 6 includes a processor 12, a memory 10, input/output circuitry 8, and other circuitry and components well known to those skilled in the art. An analog video signal or a digital stream is input to the receiver 2. This stream is compliant with MPEG or other proprietary broadcast formats.
In accordance with the MPEG standard, video data is encoded using discrete cosine transform encoding and is arranged into variable-length encoded data packets for transmission. One version of the MPEG standard, MPEG-2, is described in the International Standards Organization - Moving Pictures Experts Group document "Coding of Moving Pictures and Audio", ISO/IEC JTC1/SC29/WG11, July 1996. MPEG is just one example of a format which can be utilized in the system. Transcript text, transmitted in the video signal 162, is extracted by the transcript extractor 4 from either line 21 of the analog video signal or the user data field of the MPEG stream. The transcript extractor 4 also partitions the video program into segments. The transcript text for the particular frame may be stored in the memory 10; alternatively, it is analyzed as a real-time data stream.
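The patent does not spell out how the extractor partitions the program into segments; one common heuristic for US closed captions, shown in the hedged sketch below, is the conventional ">>>" story-change marker. This is an illustrative assumption, not the claimed method.

```python
# Illustrative only: split a caption stream into story segments at the
# conventional ">>>" story-change marker used in US closed captioning.
def segment_captions(caption_stream: str) -> list[str]:
    segments = [s.strip() for s in caption_stream.split(">>>")]
    return [s for s in segments if s]  # drop the empty leading chunk

stream = (">>> Clint Eastwood was in a gun fight in Carmel, California. "
          ">>> And now the weather: sunny skies are expected all week.")
for i, segment in enumerate(segment_captions(stream), 1):
    print(f"segment {i}: {segment}")
```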
Also stored in the memory 10 is Electronic Program Guide (EPG) information. This information, describing television broadcasts for a period of days or weeks, is downloaded on user request or at a preprogrammed time. It is transmitted by local analog TV broadcasters over the vertical blanking interval or through MPEG-2 private tables on a "home barker" channel. It can also be transmitted via telephone line or through wireless means. EPG data includes information such as the program's genre and subgenre, its rating, and a short program description. EPG data is used to determine the context of a program, such as whether it is a news program, a paid programming excerpt, a soap opera, or a travelogue.
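As a rough illustration of context determination from EPG data, the sketch below maps an EPG genre field to a coarse context label. The field names and the mapping table are assumptions, not taken from the patent.

```python
# Hedged sketch: derive a coarse program context from the EPG genre field.
def context_from_epg(epg_record: dict) -> str:
    genre = epg_record.get("genre", "").lower()
    mapping = {
        "news": "news",
        "paid programming": "commercial",
        "soap": "soap opera",
        "travel": "travelogue",
        "talk": "talk show",
    }
    for key, context in mapping.items():
        if key in genre:
            return context
    return "unknown"

print(context_from_epg({"genre": "News/Weather", "rating": "TV-G"}))  # news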
Also stored in secondary storage 18 and available in the memory 10 is personal profile information, in the form of keywords or "triggers," describing the user's interests. Typical triggers could be "Clint Eastwood", "environment", "presidential election" or "hockey". These triggers are expanded in one aspect of the invention to include synonymous and related terms.
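This kind of expansion can be illustrated with WordNet, which the detailed description cites as one possible lexical database. The sketch below requires the nltk package and a one-time `nltk.download("wordnet")`; the exact expansion policy (all senses, all lemmas) is an assumption.

```python
# Trigger expansion via WordNet, the lexical database the patent itself cites.
from nltk.corpus import wordnet as wn

def expand_trigger(trigger: str) -> set[str]:
    """Return the trigger plus WordNet synonyms of each of its senses."""
    expanded = {trigger}
    for synset in wn.synsets(trigger.replace(" ", "_")):
        for lemma in synset.lemma_names():
            expanded.add(lemma.replace("_", " "))
    return expanded

print(expand_trigger("environment"))
# e.g. {'environment', 'environs', 'surroundings', 'surround'}
```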
As is well known in the prior art, a personal profile of the user's interests is established automatically, by user input, or by a combination of both methods. For example, the TiVo™ Personal TV Service allows the user to indicate which programs the user prefers using a "Thumbs Up" or "Thumbs Down" button on the TiVo™ remote. TiVo™ then builds upon this information to select other related programs the user likes to view.
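As a purely illustrative sketch (not TiVo's actual algorithm), a profile of this kind can be approximated by per-keyword weights nudged up or down by viewer feedback:

```python
# Illustrative profile builder: per-keyword weights adjusted by
# thumbs-up/thumbs-down feedback; top-weighted keywords become triggers.
from collections import defaultdict

class Profile:
    def __init__(self, trigger_count: int = 10):
        self.weights: dict[str, float] = defaultdict(float)
        self.trigger_count = trigger_count

    def feedback(self, program_keywords: list[str], thumbs_up: bool) -> None:
        delta = 1.0 if thumbs_up else -1.0
        for keyword in program_keywords:
            self.weights[keyword] += delta

    def triggers(self) -> list[str]:
        """The highest-weighted keywords become the profile's triggers."""
        ranked = sorted(self.weights, key=self.weights.get, reverse=True)
        return [k for k in ranked[: self.trigger_count] if self.weights[k] > 0]

profile = Profile()
profile.feedback(["western", "Clint Eastwood"], thumbs_up=True)
profile.feedback(["soap opera"], thumbs_up=False)
print(profile.triggers())  # ['western', 'Clint Eastwood']
```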
When a trigger matches keywords contained in the transcript text, supplementary data is retrieved, for example from the Internet 14 or proprietary sources 13 through the communication means 17. Another source for supplementary data is, for example, another channel. The data is then displayed to the user on a display 16 either as a Web page or a portion thereof or superimposed over the main video in a non-intrusive fashion. Alternatively or additionally, a simple Uniform Resource Locator (URL) or informative message is returned to the viewer.
Rules for associating these triggers with supplementary data such as World Wide Web (WWW) pages are also stored in the secondary memory 18 and available from the memory 10. These rules are established through a default profile that is updated based on user behavior, or through a query program that prompts the user for interests and then generates the rule set. The rules can also be received from a mobile device 15 such as a Personal Digital Assistant (PDA) or cell phone through the communications means 17. These rules associate supplementary information with the triggers, depending on the context of the program segment being viewed. For example, if a program segment is an advertisement for Clint Eastwood's new movie, the context is commercial and the supplementary data retrieved is a description of the movie he is starring in. If a program segment is a description of Clint Eastwood's car accident, the context is news, and the supplementary data retrieved is a biographical web page or a link to www.cnn.com to obtain more information about why he is in the news.
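By way of illustration only, the following Python sketch shows one way such context-dependent association rules could be represented and applied; the table contents, function names, and substring-matching strategy are assumptions of this sketch, not part of the disclosed embodiment.

    # Illustrative sketch: association rules as a table keyed by
    # (trigger, context) pairs; values stand in for supplementary data.
    ASSOCIATION_RULES = {
        ("clint eastwood", "commercial"): "http://www.imdb.com",
        ("clint eastwood", "news"): "http://www.cnn.com",
        ("macedonia", "commercial"): "http://www.travel.com",
    }

    def match_triggers(transcript_text, profile_triggers):
        # Return the profile triggers that occur as keywords in the text.
        text = transcript_text.lower()
        return [t for t in profile_triggers if t.lower() in text]

    def retrieve_supplementary(transcript_text, profile_triggers, context):
        # Look up the rule for every matched trigger in the current context.
        results = []
        for trigger in match_triggers(transcript_text, profile_triggers):
            url = ASSOCIATION_RULES.get((trigger.lower(), context))
            if url is not None:
                results.append((trigger, url))
        return results

For the commercial example above, retrieve_supplementary(caption, ["Clint Eastwood"], "commercial") would yield the www.imdb.com link.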
As illustrated above, association rules are also dependent upon a combination of EPG fields. For example, if "Clint Eastwood" appears in the actor's field of the EPG data, and the context is determined to be commercial, and the closed caption data is "We will be returning shortly to Clint Eastwood and A Fistful of Dollars after these announcements," then the association rule retrieves supplementary data pertaining to the particular movie being shown. On the other hand, if "Clint Eastwood" does not appear in the actor's field of the EPG data, and the context is commercial, and the closed caption data is "High Plains Drifter starring Clint Eastwood will be aired on Friday," then the association rule retrieves supplementary data pertaining to showtimes for the movie. These differences can be determined, for example, by comparing the text of the credits with text extracted from the closed caption data. If there is a match, then the program being advertised is the program being viewed. Alternatively, natural language processing can be used to identify key phrases such as "returning to" which would also indicate that the program being advertised is the program being viewed. Alternatively, if "Clint Eastwood" does not appear in the actor's field of the EPG data, and the context is commercial, and the closed caption data says "Clint Eastwood's new movie will be released shortly", then the association rule retrieves supplementary data by linking to the Clint Eastwood home page to find out more about the movie. Association rules also determine the category of media to be retrieved. For example, if "Kosovo" is the trigger and the program is sponsored by National Geographic, the association rule retrieves a map of the region. Alternatively, if the program segment context is news and the word "war" is located in the EPG data, then the association rule retrieves a recent political history of the region.
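The EPG-field combinations in these examples might be tested in code roughly as follows; the function, its parameters, and the cue phrases are hypothetical and merely mirror the prose above.

    # Illustrative sketch only.  'epg_actors' stands for the actor's field
    # of the EPG data, 'caption' for the closed caption text, and 'credits'
    # for text extracted from the credits.
    def commercial_supplement(trigger, epg_actors, caption, credits):
        advertised_is_current = (
            trigger in epg_actors
            or "returning to" in caption
            or any(line in caption for line in credits)
        )
        if advertised_is_current:
            return "description of the movie being shown"
        if "will be aired" in caption:
            return "showtimes for the advertised movie"
        if "will be released" in caption:
            return "link to the actor's home page"
        return None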
In alternative embodiments, the system includes a video display with built-in processing and memory, or a separate set-top box for processing and storing information. These embodiments can include communication means or an interface to communication means. Receipt of the video signal and Internet information is via wireless, satellite, cable or other media. This system is modifiable to transmit the supplementary information via the communication means 17 as an output signal over a radio transmitter, or via wireless means, where the signal is embodied in a carrier wave 160. The supplementary information is transmittable to an e-mail list, downloadable to the voice mail feature of mobile devices 15 such as cell phones, and/or transmittable to a hand-held device such as the Palm Pilot®.
Figure 2 is a diagram of the processor elements. A profile generator 50 generates and stores a profile of the user's known interests, which includes trigger information or keywords of interest. This is accomplished, for example, through user input, by having the user respond to a series of queries, by creating a default profile based on user characteristics that is then modified by the user, or by monitoring user activity to discover areas of interest. The rule generator 52 generates the association rules which logically combine each trigger with a variety of contexts to determine which supplementary information should be displayed to the user. The recognition engine 54 compares each trigger with the transcript text and determines whether the trigger exists as a keyword in the text. When a trigger is matched, the retrieving portion 56 retrieves the supplementary information and the formatting portion 58 formats the data for display. The context monitor 60 monitors the context to see whether it is changing due to the display of a new program segment. When a context change occurs, the context monitor 60 accesses the secondary storage 18 to retrieve a new subset of association rules. The data updater 62 is used to update the supplementary information to incorporate new web sites, for example, or to reflect the results of searches performed by various search engines. The repetition counter 64 counts the frequency with which a particular piece of information is requested and the clickstream monitor 66 measures the frequency with which a user requests supplementary data in general. These intelligent agents work in conjunction with the retrieval modifier 68 to modify the type and amount of information presented to the user.
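Purely as a structural sketch of how the Figure 2 elements might be grouped in software, the following fragment uses class and attribute names of our own; the comments key them to the reference numerals, but nothing here is the disclosed implementation.

    from dataclasses import dataclass, field

    @dataclass
    class ProcessorElements:
        # Hypothetical grouping of the Figure 2 elements.
        triggers: set = field(default_factory=set)          # profile generator 50
        rules: dict = field(default_factory=dict)           # rule generator 52
        request_counts: dict = field(default_factory=dict)  # repetition counter 64
        total_requests: int = 0                             # clickstream monitor 66

        def recognize(self, transcript_text):               # recognition engine 54
            # Return the triggers that appear as keywords in the text.
            text = transcript_text.lower()
            return {t for t in self.triggers if t.lower() in text}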
Figures 3a and 3b are flow diagrams illustrating the method of the invention. To begin, in step S201, the video is input to a receiver. The video is in analog or digital form. The transcript extractor, which is separate from or incorporated into the processor, extracts the transcript text in step S202 and identifies the beginning and end of each video segment. Next, in step S203, the processor retrieves the keywords from the transcript text. Extraction of keywords is well known in the art and one such method of extraction is described in U.S. Patent No. 5,809,471 to Brodsky, entitled "Retrieval of additional information not found in interactive TV or telephony signal by application using dynamically extracted vocabulary." As shown in Figure 4a, these keywords 152 are extracted from the transcript text 150 and expanded 154 to achieve more meaningful and complete results, by associating them with synonymous or related keywords, as shown in Figure 3a, step S204. A thesaurus, or a database such as WordNet®, is used for this purpose. WordNet® is an on-line lexical reference system whose design is inspired by current psycholinguistic theories. The various parts of speech are organized into synonym sets, each representing one underlying lexical concept.
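A minimal sketch of this expansion step follows, assuming NLTK's WordNet interface is available and its corpus downloaded; the function name expand_keyword is illustrative.

    # Keyword expansion via WordNet synonym sets, using NLTK's interface.
    from nltk.corpus import wordnet as wn

    def expand_keyword(keyword):
        # Return the keyword together with lemmas from its synonym sets.
        related = {keyword}
        for synset in wn.synsets(keyword):
            for lemma in synset.lemmas():
                related.add(lemma.name().replace("_", " "))
        return related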
Keywords can also be expanded by identifying the theme of the transcript text. For example, the presence of the trigger "economy" in transcript text can be derived when a number of words such as "inflation", "Alan Greenspan", and "unemployment rate" are simultaneously present. Similarly, the presence of the trigger "President Clinton" can be derived if the keyword "President of the United States" is present in the transcript text.
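This theme-based derivation might be sketched as follows; the indicator sets and the co-occurrence threshold are assumptions made for illustration.

    # A trigger is derived when enough of its indicator phrases co-occur
    # in the transcript text.
    THEME_INDICATORS = {
        "economy": {"inflation", "alan greenspan", "unemployment rate"},
        "president clinton": {"president of the united states"},
    }

    def derive_triggers(transcript_text):
        text = transcript_text.lower()
        derived = set()
        for trigger, indicators in THEME_INDICATORS.items():
            needed = min(len(indicators), 2)   # assumed threshold
            if sum(1 for phrase in indicators if phrase in text) >= needed:
                derived.add(trigger)
        return derived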
Special rules apply when the supplementary data is contained in reference tools such as dictionaries and encyclopedias, as shown in Figure 4 (reference numerals 114, 132). In one mode, triggers are mapped to a variety of keywords depending on the level of understanding of the viewer. For example, if the viewer is a child or a foreign-speaking viewer, the trigger "unemployment" would be mapped to the keyword phrase "without a job" but would not be mapped to the keyword "redundancy." In an alternate mode, the keywords are expanded as described above. Parental control is implemented below the program level, at the program segment or contextual level. Therefore, parents need not worry if a commercial inappropriate for children is shown during an otherwise appropriate cartoon show, for example. The child viewer is presented with a special screen only during the commercial. This special screen may take the form of a toy advertisement instead of merely a typical blocking screen. Blocking triggers are also expanded to enhance the effectiveness of the blocking. For example, if the parent does not want the child to see video segments related to war, the trigger "war" is mapped to keywords and phrases such as "armed conflict" and "bombing." An example of trigger expansion is shown in Figure 4a (reference numerals 102, 156). Returning to Figure 3a, in step S205, the personal profile containing the triggers is read. The processor matches the keywords developed from the transcript text with the triggers contained in the user profile in step S206. If there is no match, the processor continues by extracting additional transcript text.
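The segment-level blocking with expanded triggers described above could be sketched as follows; the expansion table and matching strategy are illustrative only.

    # Parental control at the segment level: a segment is blocked when any
    # expansion of a blocking trigger occurs in its transcript text.
    BLOCKING_EXPANSIONS = {
        "war": {"war", "armed conflict", "bombing"},
    }

    def segment_blocked(transcript_text, blocking_triggers):
        text = transcript_text.lower()
        for trigger in blocking_triggers:
            expansions = BLOCKING_EXPANSIONS.get(trigger, {trigger})
            if any(phrase in text for phrase in expansions):
                return True   # present the special screen for this segment
        return False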
If there is a match, in step S207 of Figure 3b, the context of the ongoing video program is identified. This is done in several ways: using the closed caption data, the EPG data, object tracking methods, or low-level feature extraction such as color, motion, texture, or shape. The context of the program segment can also be extracted from the transcript text using natural language techniques. For example, Microsoft Corporation has developed software that learns by analyzing existing texts, including online dictionaries and encyclopedias, and automatically acquiring knowledge from this analysis. This knowledge is then used to help constrain the interpretation of the word "plane" in a sentence like "Flying planes can be dangerous" and to determine that the sentence pertains to aviation rather than woodworking.
Software also operates at the discourse level, using discourse analysis to identify the structure of the closed caption text and thereby its context. For example, a news program can be identified because it generally reports the most important facts, "who, what, when, where, how," at its beginning. Accordingly, a program that began with the sentence "Clint Eastwood was in a gun fight, in Carmel California, at seven a.m. on Main Street, by a bystander with a home video camera" is identified as a news story. The context is also available in the EPG data, from the genre and subgenre fields or a combination of fields as explained above.
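A crude sketch of such a discourse-level heuristic follows; the cue patterns and the score threshold are assumptions rather than the disclosed method.

    # Count how many of the "who, when, where" cues appear in a segment's
    # opening sentence; enough cues suggest a news story.
    import re

    TIME_CUES = re.compile(r"\b(?:a\.m\.|p\.m\.|yesterday|today|tonight)\b", re.I)
    PLACE_CUES = re.compile(r"\b(?:in|at|on)\s+[A-Z][a-z]+")
    WHO_CUES = re.compile(r"^[A-Z][a-z]+(?:\s+[A-Z][a-z]+)?\s+(?:was|were|has|have)\b")

    def looks_like_news(opening_sentence):
        score = sum(bool(pattern.search(opening_sentence))
                    for pattern in (TIME_CUES, PLACE_CUES, WHO_CUES))
        return score >= 2

Applied to the Clint Eastwood sentence above, all three cues fire, so the segment would be classified as news.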
Next, in step S208, the association rules are read. The association rules determine which supplementary data from a stored database should be retrieved, based upon the keyword and context. In step S209, the customized display modules are read. These modules enable the user to restrict the types of information, and therefore also the amount of information, the user wants to view. For example, the user may only wish to see the Uniform Resource Locator (URL) of a WWW page, only larger titles from the page, a page summary, or a full page. The user can choose the supplementary sources he wants to view and prioritize these sources. In step S210, the supplementary data is retrieved from a database stored in memory. The database contains items of interest, or pointers to items of interest, ancillary to the trigger. For example, the database contains any of the following: names of celebrities and public figures; geographic information such as countries, capitals, and presidents; product and brand names; and assorted categories and topics.
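One possible shape for such a customized display module is sketched below; the detail levels and the field names of the retrieved page are assumptions.

    # The viewer-selected detail level limits what part of a retrieved
    # page is formatted for display.
    def format_for_display(page, detail_level):
        if detail_level == "url":
            return page["url"]
        if detail_level == "titles":
            return "\n".join(page["titles"])
        if detail_level == "summary":
            return page["summary"]
        return page["full_text"]   # default: the full page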
The database is maintained and refreshed from an established set of sources. These include, for example, the Bloomberg site, encyclopedias, thesauri, dictionaries, and a set of web sites or search engines. Information from the EPG and closed caption data is also incorporated into the database. A set of refresh and cleanup rules, as shown in Figures 5 and 6, is also stored in a database or a viewer's profile, for example, and maintained for managing the size of the database or profile and its currency. For example, "stale" items such as election results and links to information about polls and the candidates would be deleted after an election takes place. Returning to Figure 3b, in step S211, the supplementary information is formatted for display. The information is displayed in a window or superimposed unobtrusively over the main video segment. Alternatively, the information is formatted for transmittal, for example to a hand-held device such as the Palm Pilot™ distributed by Palm, Inc., or to an email account. Figure 4 illustrates the set of association rules 100 for several triggers 102. In the table, the first column represents the triggers 102 and the remaining columns represent the possible contexts 104, 106, 108, 110 for the example triggers shown. Beginning with the association rule 120 for the first trigger 102, "Clint Eastwood", when this trigger 102 appears in a user's profile, one of three different items of supplementary information 116, 118, 120 is retrieved for display, depending on the context in which Clint Eastwood appears in the video segment being viewed. Although only one link is shown in each box of the example table, multiple links can exist. If Clint Eastwood appears in a commercial, the system will link to the WWW page located at www.imdb.com and display the page in accordance with the customized display model. If Clint Eastwood appears on a talk show, the talk show segment where he appears will be stored for retrieval 118 and/or an alert will be sent to the viewer in real time.
Alternatively, an offline alert is transmitted for later viewing, notifying the viewer that the segment has been stored.
Alerts are automatically or manually retrieved. Alert transmission can also be keyed to a topic such that the alert is displayed the next time a Clint Eastwood movie is shown. If Clint Eastwood appears on a news program, the system will link to the WWW page located at www.cnn.com. Alerts have priorities enabling the user to select the circumstances under which the user wants to be notified. For example, a user may only want to view alerts pertaining to severe weather warnings. The second association rule 122, for the trigger 102 "Macedonia", deals with four different contexts. If the trigger "Macedonia" appears in an advertisement, the system links to the WWW page at www.travel.com 130. If Macedonia is the subject of a talk show, the system links to an entry for "Macedonia" in Compton's Encyclopedia 132. If Macedonia is the subject of a news show, the user is tuned to the station where the program is being aired 134. If Macedonia is the subject of a program sponsored by National Geographic magazine, the system links to www.yahoo.com/maps 136 to display a map of Macedonia.
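For concreteness, this second association rule could be encoded as data along the following lines; the action/target vocabulary is our own shorthand, not part of the disclosure.

    # Association rule 122 of Figure 4, expressed as a context table.
    MACEDONIA_RULE = {
        "commercial": ("link", "http://www.travel.com"),
        "talk show": ("encyclopedia", "Macedonia"),
        "news": ("tune", "station airing the program"),
        "national geographic": ("link", "http://www.yahoo.com/maps"),
    }

    action, target = MACEDONIA_RULE.get("news", (None, None))
    # -> ("tune", "station airing the program")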
Association rules 3-5 (reference numerals 124, 126, 128) should be interpreted in the same manner as the above examples. As shown in the table, when certain triggers 102 such as "Meryl Streep" appear in transcript text, the system will only provide supplementary information for certain contexts. In the case of "Meryl Streep", supplementary information is only supplied for the Talk Show and News contexts. If desired, such a rule can be broadened to apply to a list of well-known actors or to all actors.
Figure 4a illustrates how both the triggers and keywords can be expanded to retrieve supplementary information. For the example transcript text 150 shown, the keyword 152 "Lyme Disease" is extracted from the transcript text 150. The keyword 152 is then expanded to map to the additional keywords "tick", "tick bite", "bull's eye rash" and "deer tick." If any of these expanded keywords appear in the transcript text, supplementary information related to Lyme Disease will be retrieved.
Figure 4a also illustrates how triggers are expanded. The trigger 102 "Lyme Disease" is expanded 156 to include the related terms "tick bite", "West Nile virus", and "mosquito spraying." Accordingly, if the transcript text 150 contains any of the expanded triggers, the segment is stored, for example.
Figure 5 illustrates how a learning model is implemented to continually update the customized display modules and association rules. The repetition counter 20 maintains a count of how often the user requests the same supplementary data, for example by clicking on a URL. Also, more than one piece of supplementary information may be retrieved by the retrieving portion 56 of the processor, shown in Figure 2, for each segment, and the user may select the information the user wishes to view. If a user requests a particular piece of supplementary data fewer than a predetermined number of times, the stored association rules 26 are updated by the retrieval modifier 24 such that the supplementary data is eliminated from the rule or the rule is modified to include a new source. The clickstream monitor 22 monitors how frequently the user requests any supplementary data. If the user selects supplementary data fewer than a predetermined number of times, the custom display module 28 for that user is modified by the retrieval modifier 24 such that less information is presented to the user.
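A sketch of this learning behavior follows; the class name, method names, and thresholds are assumptions made for illustration.

    # Usage statistics modify both the association rules and the amount of
    # information formatted for display, as in Figure 5.
    class RetrievalModifier:
        def __init__(self, rule_threshold=3, click_threshold=5):
            self.request_counts = {}      # repetition counter 20
            self.total_clicks = 0         # clickstream monitor 22
            self.rule_threshold = rule_threshold
            self.click_threshold = click_threshold

        def record_click(self, url):
            self.total_clicks += 1
            self.request_counts[url] = self.request_counts.get(url, 0) + 1

        def prune_rules(self, rules):
            # Drop supplementary sources the user rarely requests.
            return {key: url for key, url in rules.items()
                    if self.request_counts.get(url, 0) >= self.rule_threshold}

        def display_level(self):
            # Present less information to viewers who rarely click through.
            return "full" if self.total_clicks >= self.click_threshold else "summary"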
Figure 6 illustrates how the dynamic association rules database is updated and maintained. The database contains items of interest or pointers to items of interest that can provide ancillary information, when triggered by a match between a keyword in the transcript text and a trigger in the user's profile. The database is updated over time to reflect current events and to match the evolving user profile.
The existing data sources set 36 specifies the data sources from which the association rules database 26 is constructed. The data sources set 36, which includes external data 38 from a variety of published sources, proprietary information, and data from the Internet 14, is updated by the data updater 40 to incorporate new web sites, for example, or to reflect the results of searches performed by various search engines. A set of refresh rules 32 is maintained to keep the size of the database at a preset limit. According to a set of established priorities, information is deleted when necessary. A set of cleanup rules 34 is also maintained which specifies when and how "stale" information can be deleted. Information in certain categories is date stamped, and information older than a preset number of months and/or years is deleted.
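The refresh and cleanup rules might be sketched together as follows; the item fields ("stamped", "priority") and the numeric limits are assumptions of this sketch.

    # Cleanup rule 34: delete date-stamped items that have gone stale.
    # Refresh rule 32: keep the database at a preset size limit, deleting
    # the lowest-priority information first.
    from datetime import datetime, timedelta

    def clean_database(items, max_age_days=180, max_size=1000):
        cutoff = datetime.now() - timedelta(days=max_age_days)
        fresh = [item for item in items if item["stamped"] >= cutoff]
        fresh.sort(key=lambda item: item["priority"], reverse=True)
        return fresh[:max_size]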
Figure 7 illustrates an embodiment in which the supplementary information 70 is displayed superimposed unobtrusively over the main video segment. The supplementary information appears at the bottom of the picture.
Figure 8 illustrates an embodiment in which a set-top box 75 comprises a receiver 2, which receives the video program and transcript text. A transcript text extractor and segmenter 4 extracts the transcript text 150 from the video signal and associates it with segments of the video program such as commercials and news flashes. A processor system 6 includes processing elements well known in the art: an input/output portion 8, a memory 10, and a processor 12. Via a communication means 17, the processor system retrieves information supplemental to the video program from a variety of sources. Three of these sources, the Internet 14, proprietary (non-public) databases 13, and mobile devices 15 such as PDAs, are shown in the figure as examples. The communication means 17 can connect to other devices not specifically shown, via wireless means, cable modem, a digital subscriber line, or a network, for example. The secondary storage 18 is used to store the supplementary information as well as the rules for retrieving the information. The set-top box can be interfaced to a display such as a PC display or a television.
Figure 9 illustrates another embodiment in which a television 80 comprises a receiver 2, a transcript text extractor and segmenter 4, a processor system 6, secondary storage 18, a communication means 17, and a display 16. The processor system 6 includes processing elements well known in the art: an input/output portion 8, a memory 10, and a processor 12. The television 80 interfaces to sources of supplementary information via the communication means 17, which interfaces to the Internet 14, proprietary sources 13, and mobile devices 15, for example. The present invention has been described with respect to particular illustrative embodiments. It is to be understood that the invention is not limited to the above-described embodiments and modifications thereto, and that various changes and modifications may be made by those of ordinary skill in the art without departing from the spirit and scope of the appended claims.


CLAIMS:
1. An association method for retrieving information supplemental to a video program comprising the steps of:
receiving the video program (2);
identifying in the video program at least one segment (4);
receiving classification data for said at least one segment (4,2);
receiving transcript text for the video program (4);
identifying a user profile for a video program viewer (50);
identifying a set of rules (52) incorporating the classification data, for associating the supplementary information with the video program, when the transcript text and the user profile satisfy a set of conditions; and
automatically retrieving the supplementary information based upon the set of rules for display on a display (56).
2. The method according to Claim 1, wherein the set of rules (100) includes information from the user profile (102).
3. The method according to Claim 2, wherein the user profile contains at least one trigger (102) which identifies a topic of interest to the video program viewer.
4. A method according to Claim 3, wherein the set of conditions specifies that a recognition engine (54) retrieve the supplementary information only when a keyword in the transcript text matches (S206) the at least one trigger (102) in the user profile.
5. The method according to Claim 1, wherein the transcript text is comprised of closed caption text, video text, program transcripts or electronic program guide information.
6. The method according to Claim 1, wherein the transcript text (150) is generated by speech recognition software.
7. The method according to Claim 1, further including the step of receiving at least a portion of the set of rules (100) from a mobile device (15) or a third-party source (13).
8. The method according to Claim 1, wherein at least part of the supplementary information and pointers to the supplementary information are stored in a database (26) or transmitted to a personal digital assistant (15) or to an electronic mail address (14).
9. The method according to Claim 1, wherein the retrieval of the supplementary information (116,118,120) is in real-time.
10. The method according to Claim 1, wherein the supplementary information (116,118,120) is formatted for display in a window (70) or for superimposition over the video program on a display (16).
11. The method according to Claim 1, wherein the supplementary information is text information (114) or a page from the World Wide Web (116).
12. The method according to Claim 5, further including the step of automatically selecting the set of rules (100) for each video program segment from the electronic program guide information (150).
13. The method according to Claim 3, further including the step of automatically selecting the set of rules (100) by applying natural language processing to the transcript text (150) for each video program segment to identify whether a keyword (S203) in the transcript text (4) matches a trigger (102) in the user profile.
14. The method according to Claim 3, further including the step of identifying at least one keyword (S203, 152) in the transcript text (150), expanding the at least one keyword (S204, 152) to include related terms (154), and retrieving the supplementary information (S210) when the keyword or related terms matches (S206) the at least one trigger (102) in the user profile.
15. The method according to Claim 3, further including the step of automatically generating the set of rules (52) by applying discourse analysis to the transcript text (150) for each video program segment to identify whether a keyword (152) in the transcript text (150) matches a trigger (S206,102) in the user profile.
16. The method according to Claim 3, further including the step of expanding at least one trigger (154) in the user profile to include related terms, identifying at least one keyword in the transcript text, and retrieving the supplementary information when the trigger or related terms matches the at least one keyword in the transcript text.
17. The method according to Claim 8, further including the step of deleting (40) supplementary information (26) or pointers to supplementary information added to the database before a certain date or related to events that have terminated.
18. The method according to Claim 11, wherein only the Uniform Resource Locator (URL) (28,70) of the page, or a portion of the page (28) which is less than the entire page, or a summary of the page (28) is displayed.
19. The method according to Claim 1, further including the step of monitoring (22) the amount of supplementary information viewed by the video program viewer, and the frequency (20) with which the video program viewer views the supplementary information, and varying (24) the amount of supplementary information formatted for display correspondingly, according to a predetermined formula.
20. The method according to Claim 1, wherein the supplemental information is included in an electronic mail message (15) or is downloaded (17) to a personal information manager (15).
21. An apparatus for retrieving information supplementary to a video program, the apparatus comprising:
a receiver (2) which receives the video program, classification data for the video program, and transcript text for the video program;
a transcript extractor (4) which identifies at least one segment within the video program and associates transcript text with said one segment;
a context monitor (60, S207), which monitors the classification data (104,106,108,110) for each segment, thereby identifying a context for each segment;
a profile generator (50), which establishes a user profile for a video program viewer;
a rule generator (52), incorporating the classification data (102,104,106,108,110), which establishes a set of rules (100) for associating supplementary information (116,118,120) with the video program, when the transcript text (150) and the user profile (102) satisfy a set of conditions;
a retrieving portion (56), which retrieves the supplementary information (116,118,120), based upon the set of rules (100); and
a formatting portion (58) which formats (S211) the retrieved supplementary information for display along with the video program.
22. An apparatus according to Claim 21 wherein the retrieving portion retrieves (S210) the supplementary information (116,118,120) when a trigger (102) within the user profile matches (S206) a keyword (152) within the transcript text.
23. An apparatus according to Claim 22, wherein at least one trigger (102) in the user profile is expanded (156) to include related terms and the trigger and the related terms are compared (S206) with the keyword (152).
24. An apparatus according to Claim 22, wherein at least one keyword (152) within the transcript text (150) is expanded (154, S204) to include related terms and the trigger (102) is compared with the keyword (154) and the related terms.
25. An apparatus according to Claim 21, wherein the retrieving (S207, 104, 106, 108, 110) portion (56) retrieves information for the segment based upon the context of the segment.
26. Computer-executable process steps to retrieve information supplemental to a video program, the computer-executable process steps being stored on a computer-readable medium (18) and comprising:
a receiving step (S201) to receive the video program, classification data describing the video program, and transcript text for the video program;
a context identifying step (S207) to identify at least one segment in the video program and the context of the segment based upon the classification data;
a keyword identification step (S203) to identify keywords in the transcript text for the at least one segment in the video program;
a keyword expanding step (S204) to expand the keywords to include related terms;
a personal profile retrieving step (S205) to retrieve a user profile for a viewer viewing the video program;
a keyword matching step (S206) to match the keywords and the related terms with at least one trigger in the user profile;
an association rules retrieving step (S208) to retrieve a set of rules specifying which information supplemental to the video program will be retrieved, depending upon the identified context;
a retrieving step (S210) to retrieve the supplementary information based upon the set of rules when the keyword matching step is successful; and
a formatting step (S211) to format the retrieved supplementary information for display.
27. A signal (160), embodied in a carrier wave, representing a video program (162) and information supplemental thereto (116,118,120), comprising: video program classification data (104,106,108,110); transcript text (150); a user profile (102); and rules (100) incorporating the video program classification data, for associating the supplementary information with the video program when the transcript text and the user profile satisfy a set of conditions (S206).
28. An apparatus for retrieving and displaying information supplemental to a video program comprising:
means (2) for receiving the video program (162);
means for identifying in the video program at least one segment (4);
means for receiving program classification data describing the at least one segment (4,2);
means for receiving transcript text (150) for the video program and associating the transcript text with the at least one segment (4);
means for retrieving a user profile for a video program viewer (50);
means for identifying (52) a set of rules (100), incorporating the classification data (104,106,108,110), for associating the supplementary information (116,118,120) with the video program, when the transcript text and the user profile (102) satisfy a set of conditions (S206);
means for retrieving the supplementary information based upon the set of rules (56, S210); and
means for formatting (58) the supplementary information for display along with the video program.
29. A set-top box (75) for a video program viewer, comprising:
receiving means (2) which receives a video program (102), classification data for the video program (104,106,108,110), and transcript text (150) for the video program;
transcript text extraction and segmenting means (4) which identifies at least one segment in the video program and associates transcript text with the at least one segment;
communication means (17) which connects to at least one information source (14,13,15) and receives information supplemental to the video program (116,118,120);
processor means (6) which a) retrieves a user profile (50) for the video program viewer which contains at least one trigger (102) reflecting an interest of the video program viewer, b) associates the classification data with the at least one segment (60, S207), c) identifies a set of rules (52) incorporating the classification data, for associating the supplemental information with the segment, d) searches the transcript text for a trigger contained in the user profile (54), e) retrieves the supplemental information (56), using the communication means (17) and based upon the set of rules (100), when the trigger (102) is contained within the transcript text (150), and f) formats (58) the retrieved supplemental information for display; and
storage means (18) which stores the transcript text, the user profile, the set of rules, and the supplemental information.
30. The set-top box (75) according to Claim 29, wherein the receiving means receives a digital video program.
31. The set-top box (75) according to Claim 29, wherein the processor (12) decodes and formats the digital video program for display on an analog display.
32. The set-top box (75) according to Claim 29, wherein the video program viewer selects a destination (15) where the supplementary information will be transmitted via the communication means (17).
33. The set-top box (75) according to Claim 29, wherein more than one type of supplementary information (116,118,120) is retrieved by the processor (12) for each segment, the retrieved supplementary information is automatically placed in an order of priority according to the user profile (S209), and the supplementary information with highest priority is formatted for display (S211) by default.
34. The set-top box (75) according to Claim 29, wherein more than one type of supplementary information (116,118,120) is retrieved by the processor (12) for each segment, and the video program viewer selects the retrieved supplementary information the video program viewer wishes to view.
35. A television set (80) comprising:
receiving means (2) which receives a video program (162), classification data for the video program (104,106,108,110), and transcript text (150) for the video program;
transcript text extraction and segmenting means (4) which identifies at least one segment in the video program and associates transcript text with the at least one segment;
communication means (17) which connects to at least one information source and receives information supplemental to the video program;
processor means (12) which a) retrieves a user profile (50) for a video program viewer which contains at least one trigger reflecting an interest of the video program viewer, b) associates the classification data with the at least one segment (4,2), c) identifies a set (52) of rules (100), incorporating the classification data, for associating the supplemental information with the segment, d) searches the transcript text (54) for a trigger (102) contained in the user profile, e) retrieves the supplemental information (116,118,120), using the communication means (17), and based upon the set of rules (100), when the trigger (102) is contained within the transcript text, and f) formats (58) the retrieved supplemental information for display;
storage means (18) which stores the transcript text, the user profile, the set of rules, and the supplemental information; and
display means which displays the video program and the retrieved and formatted supplemental information.
36. Computer-executable process steps to retrieve information supplemental to a video program, the computer-executable process steps being stored on a computer-readable medium (18) and comprising:
a receiving step (S201) for receiving the video program, classification data describing the video program, and transcript data for the video program;
a segmenting step (S202) for identifying at least one segment in the video program and classification data for the segment;
a first identifying step (S205) for identifying a user profile for a video program viewer;
a second identifying step (S208) for identifying a set of rules incorporating the classification data, for associating the supplementary information with the video program, when the transcript text and the user profile satisfy a set of conditions; and
a retrieving step (S210) for automatically retrieving the supplementary information based upon the set of rules.
PCT/EP2001/007965 2000-07-27 2001-07-11 Transcript triggers for video enhancement WO2002011446A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2002515840A JP2004505563A (en) 2000-07-27 2001-07-11 Transcript trigger information for video enhancement
KR1020027003919A KR20020054325A (en) 2000-07-27 2001-07-11 Transcript triggers for video enhancement
EP01951665A EP1410637A2 (en) 2000-07-27 2001-07-11 Transcript triggers for video enhancement

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US62718800A 2000-07-27 2000-07-27
US09/627,188 2000-07-27

Publications (2)

Publication Number Publication Date
WO2002011446A2 true WO2002011446A2 (en) 2002-02-07
WO2002011446A3 WO2002011446A3 (en) 2002-04-11

Family

ID=24513587

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2001/007965 WO2002011446A2 (en) 2000-07-27 2001-07-11 Transcript triggers for video enhancement

Country Status (5)

Country Link
EP (1) EP1410637A2 (en)
JP (1) JP2004505563A (en)
KR (1) KR20020054325A (en)
CN (1) CN1187982C (en)
WO (1) WO2002011446A2 (en)


Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10635723B2 (en) 2004-02-15 2020-04-28 Google Llc Search engines and systems with handheld document data capture devices
US9116890B2 (en) 2004-04-01 2015-08-25 Google Inc. Triggering actions in response to optically or acoustically capturing keywords from a rendered document
US9143638B2 (en) 2004-04-01 2015-09-22 Google Inc. Data capture from rendered documents using handheld device
US8620083B2 (en) 2004-12-03 2013-12-31 Google Inc. Method and system for character recognition
US8874504B2 (en) 2004-12-03 2014-10-28 Google Inc. Processing techniques for visual capture data from a rendered document
US8953908B2 (en) 2004-06-22 2015-02-10 Digimarc Corporation Metadata management and generation using perceptual features
US8346620B2 (en) 2004-07-19 2013-01-01 Google Inc. Automatic modification of web pages
US8115869B2 (en) * 2007-02-28 2012-02-14 Samsung Electronics Co., Ltd. Method and system for extracting relevant information from content metadata
US8510453B2 (en) * 2007-03-21 2013-08-13 Samsung Electronics Co., Ltd. Framework for correlating content on a local network with information on an external network
US8863221B2 (en) 2006-03-07 2014-10-14 Samsung Electronics Co., Ltd. Method and system for integrating content and services among multiple networks
US8209724B2 (en) 2007-04-25 2012-06-26 Samsung Electronics Co., Ltd. Method and system for providing access to information of potential interest to a user
US8843467B2 (en) 2007-05-15 2014-09-23 Samsung Electronics Co., Ltd. Method and system for providing relevant information to a user of a device in a local network
US8935269B2 (en) 2006-12-04 2015-01-13 Samsung Electronics Co., Ltd. Method and apparatus for contextual search and query refinement on consumer electronics devices
CN101272477A (en) * 2007-03-22 2008-09-24 华为技术有限公司 IPTV system, medium service apparatus and IPTV program searching and locating method
US9286385B2 (en) 2007-04-25 2016-03-15 Samsung Electronics Co., Ltd. Method and system for providing access to information of potential interest to a user
US8176068B2 (en) 2007-10-31 2012-05-08 Samsung Electronics Co., Ltd. Method and system for suggesting search queries on electronic devices
US8938465B2 (en) 2008-09-10 2015-01-20 Samsung Electronics Co., Ltd. Method and system for utilizing packaged content sources to identify and provide information based on contextual information
WO2010105245A2 (en) 2009-03-12 2010-09-16 Exbiblio B.V. Automatically providing content associated with captured information, such as information captured in real-time
US9323784B2 (en) 2009-12-09 2016-04-26 Google Inc. Image search using text-based elements within the contents of images
CN101930779B (en) * 2010-07-29 2012-02-29 华为终端有限公司 Video commenting method and video player
CN102346731B (en) 2010-08-02 2014-09-03 联想(北京)有限公司 File processing method and file processing device
TW201227366A (en) * 2010-12-31 2012-07-01 Acer Inc Method for integrating multimedia information source and hyperlink generation apparatus and electronic apparatus
CN103096173B (en) * 2011-10-27 2016-05-11 腾讯科技(深圳)有限公司 The information processing method of network television system and device
US8839309B2 (en) * 2012-12-05 2014-09-16 United Video Properties, Inc. Methods and systems for displaying contextually relevant information from a plurality of users in real-time regarding a media asset
CN104079988A (en) * 2014-06-30 2014-10-01 北京酷云互动科技有限公司 Television program related information pushing device and method
US10423727B1 (en) 2018-01-11 2019-09-24 Wells Fargo Bank, N.A. Systems and methods for processing nuances in natural language


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5818510A (en) * 1994-10-21 1998-10-06 Intel Corporation Method and apparatus for providing broadcast information with indexing
EP0848554A2 (en) * 1996-12-11 1998-06-17 International Business Machines Corporation Accessing television program information
EP0952734A2 (en) * 1998-04-21 1999-10-27 International Business Machines Corporation System for selecting, accessing, and viewing portions of an information stream(s) using a television companion device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TAKAGI T ET AL: "Conceptual Matching and its Application to Selection of TV Programs and BGMs" 1999 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS MAN AND CYBERNETICS. SMC'99. HUMAN COMMUNICATION AND CYBERNETICS. TOKYO, JAPAN, OCT. 12 - 15, 1999, IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, NEW YORK, NY: IEEE, US, vol. 3 OF 6, 12 October 1999 (1999-10-12), pages 269-273, XP002178872 ISBN: 0-7803-5732-9 *

Cited By (71)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9630443B2 (en) 1995-07-27 2017-04-25 Digimarc Corporation Printer driver separately applying watermark and information
US7095871B2 (en) 1995-07-27 2006-08-22 Digimarc Corporation Digital asset management and linking media signals with related data using watermarks
US9740373B2 (en) 1998-10-01 2017-08-22 Digimarc Corporation Content sensitive connected content
US7657064B1 (en) 2000-09-26 2010-02-02 Digimarc Corporation Methods of processing text found in images
US8644546B2 (en) 2000-09-26 2014-02-04 Digimarc Corporation Method and systems for processing text found in images
US8228563B2 (en) 2002-01-30 2012-07-24 Digimarc Corporation Watermarking a page description language file
EP1351505A2 (en) * 2002-02-28 2003-10-08 Kabushiki Kaisha Toshiba Stream processing system with function for selectively playbacking arbitrary part of ream stream
EP1351505A3 (en) * 2002-02-28 2006-01-18 Kabushiki Kaisha Toshiba Stream processing system with function for selectively playbacking arbitrary part of ream stream
US7389035B2 (en) 2002-02-28 2008-06-17 Kabushiki Kaisha Toshiba Stream processing system with function for selectively playbacking arbitrary part of ream stream
EP1497988A2 (en) * 2002-03-22 2005-01-19 Scientific-Atlanta, Inc. Exporting data from a digital home communication terminal to a client device
EP1497988A4 (en) * 2002-03-22 2005-08-31 Scientific Atlanta Exporting data from a digital home communication terminal to a client device
EP1482735A4 (en) * 2002-04-12 2005-09-21 Mitsubishi Electric Corp Video content transmission device and method, video content storage device, video content reproduction device and method, meta data generation device, and video content management method
EP1482735A1 (en) * 2002-04-12 2004-12-01 Mitsubishi Denki Kabushiki Kaisha Video content transmission device and method, video content storage device, video content reproduction device and method, meta data generation device, and video content management method
WO2003105476A1 (en) * 2002-06-10 2003-12-18 Koninklijke Philips Electronics N.V. Anticipatory content augmentation
EP1383325A3 (en) * 2002-06-27 2004-08-11 Microsoft Corporation Aggregated EPG manager
EP1383325A2 (en) * 2002-06-27 2004-01-21 Microsoft Corporation Aggregated EPG manager
US9832017B2 (en) 2002-09-30 2017-11-28 Myport Ip, Inc. Apparatus for personal voice assistant, location services, multi-media capture, transmission, speech to text conversion, photo/video image/object recognition, creation of searchable metatag(s)/ contextual tag(s), storage and search retrieval
US8068638B2 (en) 2002-09-30 2011-11-29 Myport Technologies, Inc. Apparatus and method for embedding searchable information into a file for transmission, storage and retrieval
US8509477B2 (en) 2002-09-30 2013-08-13 Myport Technologies, Inc. Method for multi-media capture, transmission, conversion, metatags creation, storage and search retrieval
US8687841B2 (en) 2002-09-30 2014-04-01 Myport Technologies, Inc. Apparatus and method for embedding searchable information into a file, encryption, transmission, storage and retrieval
US10237067B2 (en) 2002-09-30 2019-03-19 Myport Technologies, Inc. Apparatus for voice assistant, location tagging, multi-media capture, transmission, speech to text conversion, photo/video image/object recognition, creation of searchable metatags/contextual tags, storage and search retrieval
US10721066B2 (en) 2002-09-30 2020-07-21 Myport Ip, Inc. Method for voice assistant, location tagging, multi-media capture, transmission, speech to text conversion, photo/video image/object recognition, creation of searchable metatags/contextual tags, storage and search retrieval
US9159113B2 (en) 2002-09-30 2015-10-13 Myport Technologies, Inc. Apparatus and method for embedding searchable information, encryption, transmission, storage and retrieval
US8135169B2 (en) 2002-09-30 2012-03-13 Myport Technologies, Inc. Method for multi-media recognition, data conversion, creation of metatags, storage and search retrieval
US7778438B2 (en) 2002-09-30 2010-08-17 Myport Technologies, Inc. Method for multi-media recognition, data conversion, creation of metatags, storage and search retrieval
US7778440B2 (en) 2002-09-30 2010-08-17 Myport Technologies, Inc. Apparatus and method for embedding searchable information into a file for transmission, storage and retrieval
US9070193B2 (en) 2002-09-30 2015-06-30 Myport Technologies, Inc. Apparatus and method to embed searchable information into a file, encryption, transmission, storage and retrieval
US9922391B2 (en) 2002-09-30 2018-03-20 Myport Technologies, Inc. System for embedding searchable information, encryption, signing operation, transmission, storage and retrieval
US8983119B2 (en) 2002-09-30 2015-03-17 Myport Technologies, Inc. Method for voice command activation, multi-media capture, transmission, speech conversion, metatags creation, storage and search retrieval
US9762970B2 (en) 2002-10-04 2017-09-12 Tech 5 Access of stored video from peer devices in a local network
WO2004053732A3 (en) * 2002-12-11 2004-11-25 Koninkl Philips Electronics Nv Method and system for utilizing video content to obtain text keywords or phrases for providing content related links to network-based resources
WO2004053732A2 (en) * 2002-12-11 2004-06-24 Koninklijke Philips Electronics N.V. Method and system for utilizing video content to obtain text keywords or phrases for providing content related links to network-based resources
US7865925B2 (en) 2003-01-15 2011-01-04 Robertson Neil C Optimization of a full duplex wideband communications system
WO2004079592A1 (en) * 2003-03-01 2004-09-16 Koninklijke Philips Electronics N.V. Real-time synchronization of content viewers
US8014557B2 (en) 2003-06-23 2011-09-06 Digimarc Corporation Watermarking electronic text documents
US8320611B2 (en) 2003-06-23 2012-11-27 Digimarc Corporation Watermarking electronic text documents
WO2005020579A1 (en) * 2003-08-25 2005-03-03 Koninklijke Philips Electronics, N.V. Real-time media dictionary
US9008447B2 (en) 2004-04-01 2015-04-14 Google Inc. Method and system for character recognition
US9030699B2 (en) 2004-04-19 2015-05-12 Google Inc. Association of a portable scanner with input/output and storage devices
US8307403B2 (en) 2005-12-02 2012-11-06 Microsoft Corporation Triggerless interactive television
US20070130611A1 (en) * 2005-12-02 2007-06-07 Microsoft Corporation Triggerless interactive television
EP1964406A4 (en) * 2005-12-02 2010-09-08 Microsoft Corp Triggerless interactive television
EP1964406A1 (en) * 2005-12-02 2008-09-03 Microsoft Corporation Triggerless interactive television
WO2007064438A1 (en) 2005-12-02 2007-06-07 Microsoft Corporation Triggerless interactive television
US8225355B2 (en) 2006-05-01 2012-07-17 Canon Kabushiki Kaisha Program search apparatus and program search method for same
WO2007141020A1 (en) * 2006-06-06 2007-12-13 Exbiblio B.V. Contextual dynamic advertising based upon captured rendered text
WO2008031625A3 (en) * 2006-09-15 2008-12-11 Exbiblio Bv Capture and display of annotations in paper and electronic documents
EP2108157A4 (en) * 2007-01-29 2012-09-05 Samsung Electronics Co Ltd Method and system for facilitating information searching on electronic devices
US8782056B2 (en) 2007-01-29 2014-07-15 Samsung Electronics Co., Ltd. Method and system for facilitating information searching on electronic devices
WO2008093989A1 (en) 2007-01-29 2008-08-07 Samsung Electronics Co, . Ltd. Method and system for facilitating information searching on electronic devices
EP2108157A1 (en) * 2007-01-29 2009-10-14 Samsung Electronics Co., Ltd. Method and system for facilitating information searching on electronic devices
EP2134075A1 (en) * 2008-06-13 2009-12-16 Sony Corporation Information processing apparatus, information processing method, and program
US9094736B2 (en) 2008-06-13 2015-07-28 Sony Corporation Information processing apparatus, information processing method, and program
US9075779B2 (en) 2009-03-12 2015-07-07 Google Inc. Performing actions based on capturing information from rendered documents, such as documents under copyright
WO2010149814A1 (en) * 2009-06-24 2010-12-29 Francisco Monserrat Viscarri Device, method and system for generating additional audiovisual events
US9009758B2 (en) 2009-08-07 2015-04-14 Thomson Licensing, LLC System and method for searching an internet networking client on a video device
US10038939B2 (en) 2009-08-07 2018-07-31 Thomson Licensing System and method for interacting with an internet site
US9596518B2 (en) 2009-08-07 2017-03-14 Thomson Licensing System and method for searching an internet networking client on a video device
WO2011017316A1 (en) * 2009-08-07 2011-02-10 Thomson Licensing System and method for searching in internet on a video device
US9081799B2 (en) 2009-12-04 2015-07-14 Google Inc. Using gestalt information to identify locations in printed information
CN102087713A (en) * 2009-12-04 2011-06-08 索尼公司 Information processing device, information processing method, and program
EP2727370A4 (en) * 2011-06-30 2015-04-01 Intel Corp Blended search for next generation television
EP2727370A2 (en) * 2011-06-30 2014-05-07 Intel Corporation Blended search for next generation television
US9137484B2 (en) 2012-10-19 2015-09-15 Sony Corporation Device, method and software for providing supplementary information
GB2507097A (en) * 2012-10-19 2014-04-23 Sony Corp Providing customised supplementary content to a personal user device
US20150127675A1 (en) 2013-11-05 2015-05-07 Samsung Electronics Co., Ltd. Display apparatus and method of controlling the same
EP3066839A4 (en) * 2013-11-05 2017-08-23 Samsung Electronics Co., Ltd. Display apparatus and method of controlling the same
US10387508B2 (en) 2013-11-05 2019-08-20 Samsung Electronics Co., Ltd. Method and apparatus for providing information about content
US11409817B2 (en) 2013-11-05 2022-08-09 Samsung Electronics Co., Ltd. Display apparatus and method of controlling the same
EP3080996A4 (en) * 2014-05-27 2017-08-16 Samsung Electronics Co., Ltd. Apparatus and method for providing information
WO2023220274A1 (en) * 2022-05-13 2023-11-16 Google Llc Entity cards including descriptive content relating to entities from a video

Also Published As

Publication number Publication date
JP2004505563A (en) 2004-02-19
CN1187982C (en) 2005-02-02
WO2002011446A3 (en) 2002-04-11
CN1393107A (en) 2003-01-22
EP1410637A2 (en) 2004-04-21
KR20020054325A (en) 2002-07-06

Similar Documents

Publication Publication Date Title
WO2002011446A2 (en) Transcript triggers for video enhancement
US5809471A (en) Retrieval of additional information not found in interactive TV or telephony signal by application using dynamically extracted vocabulary
US7685620B2 (en) Apparatus and method of searching for desired television content
US8839283B2 (en) Blocking television commercials and providing an archive interrogation program
US7240354B2 (en) Apparatus and method for blocking television commercials with a content interrogation program
US6569206B1 (en) Facilitation of hypervideo by automatic IR techniques in response to user requests
US6493707B1 (en) Hypervideo: information retrieval using realtime buffers
US6490580B1 (en) Hypervideo information retrieval using multimedia
US9202523B2 (en) Method and apparatus for providing information related to broadcast programs
US7725467B2 (en) Information search system, information processing apparatus and method, and information search apparatus and method
US7802177B2 (en) Hypervideo: information retrieval using time-related multimedia
US7765462B2 (en) Facilitation of hypervideo by automatic IR techniques utilizing text extracted from multimedia document in response to user requests
US8209724B2 (en) Method and system for providing access to information of potential interest to a user
US8646006B2 (en) System and method for automatically authoring interactive television content
US20020184195A1 (en) Integrating content from media sources
KR20030007727A (en) Automatic video retriever genie
WO2003065229A1 (en) System and method for the efficient use of network resources and the provision of television broadcast information
JP2010218385A (en) Content retrieval device and computer program

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): CN JP KR

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

ENP Entry into the national phase

Ref country code: JP

Ref document number: 2002 515840

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1020027003919

Country of ref document: KR

121 EP: The EPO has been informed by WIPO that EP was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): CN JP KR

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

WWE Wipo information: entry into national phase

Ref document number: 018028810

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 1020027003919

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2001951665

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2001951665

Country of ref document: EP