WO2001039178A1 - Referencing web pages by categories for voice navigation - Google Patents

Referencing web pages by categories for voice navigation Download PDF

Info

Publication number
WO2001039178A1
Authority
WO
WIPO (PCT)
Prior art keywords
search
internet
search criterion
assigned
phoneme sequence
Prior art date
Application number
PCT/EP2000/011299
Other languages
French (fr)
Inventor
Tom Stolk
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V. filed Critical Koninklijke Philips Electronics N.V.
Priority to JP2001540761A (JP2003515832A)
Priority to EP00977543A (EP1157373A1)
Publication of WO2001039178A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • According to a fifth example of application, a search phoneme sequence SP of the search criterion "PHOTOS" detected by the speech recognition stage 20 is compared by the command determining stage 14 with the search criterion phoneme sequences KP[KT(HL)] stored in the first search table 23.
  • The command determining stage 14 determines that none of the stored search criterion phoneme sequences KP[KT(HL)] has a sufficiently great correspondence to the search phoneme sequence SP.
  • Search text information ST = "PHOTOS" detected by the speech recognition stage 20 is therefore delivered by it to the command determining stage 14 and passed on by this stage to the control means 21.
  • In a second example of embodiment, the first search table 23 is not stored in the second computer program product - thus in the search storage means 22 of the computer 1 - and continuously updated, as explained with reference to the first example of application. Instead, the Internet address URL of the third computer server 25 is stored in the control means of the second computer program product and, subsequently, the detected search phoneme sequence SP of the search criterion KT is delivered to the Internet browser 6, which delivers the search phoneme sequence SP to the third computer server 25 via the Internet NET.
  • The third computer server 25 comprises means corresponding to the command determining stage 14 and the search storage means 22 of the computer 1, and it executes a selection method of selecting Internet addresses URL[KT(HL)] of Internet pages assigned to a search criterion (a minimal sketch of this client/server exchange follows after these notes).
  • A search phoneme sequence SP sent via the Internet NET to the third computer server 25 is received by the third computer server 25.
  • The third computer server 25 detects, from a search table stored in the third computer server 25, a search criterion phoneme sequence KP(KT) that has a sufficiently large correspondence to the received search phoneme sequence SP and detects at least one Internet address URL[KT(HL)] stored and assigned to this search criterion phoneme sequence KP(KT). These one or more Internet addresses URL[KT(HL)] detected by the third computer server 25 via the search criterion KT are then delivered by the third computer server 25 to the computer from which the search phoneme sequence SP was received.
  • When the speech recognition method is subdivided - as described with reference to the second example of embodiment - into a part processed by the user's computer (client) and a part processed by the computer server, with phonemes or information corresponding to the phonemes being transmitted from the client to the server, there are two essential advantages.
  • The speaker-dependent processing operations of the speech recognition method are performed at the client, so that the server advantageously need not process speaker-dependent information.
  • All processing operations that require much memory space are performed centrally by the server, so that the users' computers (clients) advantageously need not have a large memory capacity.
  • The second computer program product can be loaded from a CD-ROM or a floppy disc into the internal memory of the computer 1 and can thus, advantageously, be installed in the computer 1 in an extremely simple manner.
  • The speech recognition method and the second computer program product can be implemented and processed, respectively, by any product that can be connected to the Internet NET. Such products may be, for example, a personal digital assistant, a set-top box or a mobile telephone which can set up a connection to the Internet.
  • When the hypertext HT(HL) of a hyperlink HL of an Internet page displayed by the monitor 5 contains the same text information as a search command that can be detected by the speech recognition stage 20 together with the subsequent search criterion KT, the hyperlink HL of the Internet page can be activated, for example, by speaking the command "CLICK" before the hypertext HT(HL) is spoken.
  • The second computer program product may also form part of the first computer program product - thus part of the Internet browser.
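
For the client/server variant outlined in the notes above, the essential exchange is that the client sends the search phoneme sequence SP and the third computer server 25 performs the matching against its own search table and returns the assigned addresses. The sketch below keeps both sides in one process and replaces the transfer over the Internet NET by a direct function call; the phoneme notation is invented, while http://www.amazon.com is the address cited in the text for the search criterion BOOKS.

```python
from difflib import SequenceMatcher

# Search table kept on the third computer server 25 (phoneme notation invented).
SERVER_SEARCH_TABLE = {"b uh k s": ["http://www.amazon.com"]}

def server_select(search_phonemes, table=SERVER_SEARCH_TABLE, threshold=0.6):
    """Server side: finds the search criterion phoneme sequence KP(KT) that
    corresponds sufficiently to the received search phoneme sequence SP and
    returns the assigned Internet addresses URL[KT(HL)]."""
    best_urls, best_score = [], 0.0
    for criterion_phonemes, urls in table.items():
        score = SequenceMatcher(None, search_phonemes, criterion_phonemes).ratio()
        if score > best_score:
            best_urls, best_score = urls, score
    return best_urls if best_score >= threshold else []

def client_search(search_phonemes):
    """Client side: in the described method the phoneme sequence would be sent
    over the Internet NET; here the server function is simply called directly."""
    return server_select(search_phonemes)

print(client_search("b uh k s"))   # ['http://www.amazon.com']
```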

Abstract

In a speech recognition method of selecting Internet addresses (URL[KT(HL)]) of Internet pages (8) assigned to a search criterion (KT), the method comprises the following steps: receiving (18) a spoken command (AS) of a user which represents a search command and a search criterion (KT); determining a search phoneme sequence (SP) corresponding to the search criterion (KT) of the spoken command (AS); determining, from stored (22; 25) search criterion phoneme sequences (KP(KT)) of search criterions (KT), a search criterion phoneme sequence (KP(KT)) that has a sufficiently large correspondence to the search phoneme sequence (SP); determining (14) at least one stored Internet address (URL[KT(HL)]) of at least one Internet page assigned to the determined search criterion phoneme sequence (KP(KT)); and displaying this at least one assigned Internet page.

Description

REFERENCING WEB PAGES BY CATEGORIES FOR VOICE NAVIGATION
The invention relates to a speech recognition method of selecting Internet addresses of Internet pages assigned to a search criterion, to a computer program product in accordance with the introductory part of claim 8 and to a selection method of selecting Internet addresses of Internet pages assigned to a search criterion.
Such a speech recognition method, such a selection method and such a computer program product are known from the speech recognition software "Free Speech Browser", which has been marketed by Philips since mid-November 1999. When the known speech recognition software is loaded into an internal memory of a computer and is processed by the computer, the known speech recognition method is executed. When this computer also runs the software of a so-called Internet browser (for example, "Microsoft Explorer" by Microsoft) and the computer is connected to the Internet, a user of the computer can select, by commands spoken into a microphone, Internet pages which are then displayed on a monitor of the computer.
To achieve this, the user of the computer speaks an Internet address into the microphone, after which the text information of the Internet address recognized via the speech recognition method is delivered to the Internet browser. The Internet browser then retrieves the Internet page featured by this Internet address from the respective computer server connected to the Internet and displays this Internet page on the monitor.
If a user would like to search for Internet pages containing information about a certain search criterion - such as, for example, books or motorcars - the user has to proceed as follows in accordance with the known speech recognition method of selecting Internet pages. The user speaks the Internet address of a so-called search engine - such as, for example, YAHOO or ALTAVISTA - into the microphone and waits for the search engine to be displayed on the monitor. Subsequently, the user speaks the search criterion into the microphone; it is recognized in accordance with the speech recognition method and is inserted into an entry field of the Internet page of the search engine. Finally, the user speaks the search command "SEARCH" into the microphone to activate the search procedure of the search engine. As a result of the search the user receives on the monitor a survey Internet page with hyperlinks to Internet pages which contain information about the entered search criterion.
With the known speech recognition method and selection method of selecting Internet pages the disadvantage has appeared that the user has to speak certain information or commands into the microphone three times at specific instants before, after a relatively long period of time, the result of the search is displayed on the monitor.
It is an object of the invention to eliminate the problems defined above and to provide an improved speech recognition method, an improved selection method and an improved computer program product. These objects are achieved with such a speech recognition method by the measures of claim 1, with such a computer program product by the measures of the characterizing part of claim 8 and with such a selection method by the measures of claim 10.
As a result, a user can, for example, speak the command "search" and then a search criterion into the microphone, and a search phoneme sequence will be determined for this search criterion in accordance with the speech recognition method. The search phoneme sequence is compared with a number of stored search criterion phoneme sequences of search criterions, for each of which at least one Internet address of an Internet page is stored and assigned, which page contains the information corresponding to the search criterion. If one of the stored search criterion phoneme sequences sufficiently corresponds to the determined search phoneme sequence, the Internet page of the Internet address assigned to this search criterion phoneme sequence is displayed on the monitor. This offers the advantage that a user needs to speak a command into the microphone only once to immediately obtain an Internet page with information about the search criterion on the monitor. Particularly advantageous is the fact that the creator of the database, which contains the stored search criterion phoneme sequences and stored Internet addresses, can ask for a registration fee from persons or companies whose Internet address is to be entered in the database under a search criterion. In this manner the speech recognition method as claimed in claim 1 provides, in addition to the technical measures according to the invention, an economically interesting new method of doing business on the Internet.
With the speech recognition method as claimed in claim 1 it has proved to be advantageous to provide the measure as claimed in claim 2. This achieves that, when more than one Internet address is stored in the database for a search criterion phoneme sequence, an Internet index page is shown with hypertexts of hyperlinks to Internet pages which contain information relating to the search criterion. As a result, the user, by activating the hyperlinks of the Internet index page, can have the Internet pages with information about the search criterion shown on the monitor one after the other.
With the speech recognition method as claimed in claim 1 it has proved to be advantageous to provide the measures as claimed in claim 3. As a result, the database containing Internet addresses and search criterions, which is stored in a computer server connected to the Internet and is continuously updated, is retrieved at longer or shorter time intervals by the computer which executes the speech recognition method and is stored by that computer. This advantageously achieves, on the one hand, that the result of a search is displayed on the monitor very fast and, on the other hand, that up-to-date data are always processed.
With the speech recognition method as claimed in claim 1 it has proved to be advantageous to provide the measures as claimed in claim 4. A user himself can then enter, via a keyboard of the computer, a search criterion as search criterion text information together with one or more Internet addresses with information about the search criterion. Subsequently, a search criterion phoneme sequence corresponding to the entered search criterion text information is determined and stored in the database, assigned to the entered Internet address(es). The database updated in this way can be stored in the computer, but can also be conveyed to the computer server over the Internet to update the database stored in the computer server.
This advantageously achieves that a user can enter his own search criterions and associated Internet addresses, in which a search can then be made by speaking a special search command (for example "MY"), which will be further discussed with reference to the example of embodiment. With the speech recognition method as claimed in claim 1 it has proved to be advantageous to provide the measures as claimed in claim 5. They achieve that, when a search criterion is entered as a spoken command whose search criterion phoneme sequence is not stored in the database, search text information is determined for the search criterion and delivered to a computer server which forms a search engine such as, for example, "YAHOO". The result of the search engine is then displayed by the monitor, as was described above. With the speech recognition method as claimed in claim 1 it has proved to be advantageous to provide the measures as claimed in claim 6. They provide that a search phoneme sequence associated with a search criterion is delivered to the Internet server which stores the database with Internet addresses and search criterion phoneme sequences. The Internet server then compares the received search phoneme sequence with the stored search criterion phoneme sequences, and the result of the search is then delivered to the user's computer via the Internet and displayed on the monitor.
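
The fall-back described for claim 5 (handing an unknown search criterion to a public search engine) amounts to building a query address from the determined search text information. The query URL pattern below is an assumption of this sketch; the text merely names YAHOO as an example of such a search engine.

```python
from urllib.parse import quote_plus

def fallback_query_url(search_text_information, engine="https://search.yahoo.com/search?p="):
    """Builds a search engine query URL for search text information ST whose
    search criterion phoneme sequence is not found in the stored database.
    The URL pattern is an assumption of this sketch, not taken from the text."""
    return engine + quote_plus(search_text_information)

print(fallback_query_url("PHOTOS"))   # https://search.yahoo.com/search?p=PHOTOS
```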
With the speech recognition method as claimed in claim 1 it has proved to be advantageous to provide the measures as claimed in claim 7. They can help the user decide, by selecting the search command, which of the search criterion phoneme sequences stored in the database are to be compared with the search phoneme sequence.
With a computer program product as claimed in claim 8 it has proved to be advantageous to store this product on a medium that can be read by a computer so as to simplify transportation and installation of the computer program product.
The invention will be described with reference to five examples of application of a first example of embodiment shown in the Figures and with reference to a second example of embodiment to which, however, the invention is not restricted.
Fig. 1 diagrammatically shows in the form of a block diagram a computer connected to the Internet by which a speech recognition method is executed for selecting Internet addresses from Internet pages assigned to a search criterion,
Fig. 2 shows text information and picture information of an Internet page displayed on a monitor,
Fig. 3 shows a command table stored in a command determining stage of the computer shown in Fig. 1,
Fig. 4 shows a first search table stored in search storage means of the computer shown in Fig. 1,
Fig. 5 shows a second search table stored in the search storage means of the computer shown in Fig. 1,
Fig. 6 shows an Internet index page displayed on the monitor, and
Fig. 7 shows a second search table shown in Fig. 5 with a further search criterion and an associated Internet address entered in the second search table.
Fig. 1 diagrammatically shows in the form of a block diagram a computer 1, a first computer server 2 and a second computer server 3, which are all connected to the Internet NET. To the monitor port 4 of the computer 1 is connected a monitor 5 by which picture information BI and text information TI of an Internet page can be displayed. A monitor signal MS containing picture information BI and text information TI can be delivered through the monitor port 4 of the computer 1 to the monitor 5.
In an internal memory of the computer 1 can be loaded a first computer program product which, when running on the computer 1, forms a so-called Internet browser 6. The first computer program product contains software code sections and may be formed, for example, by the known computer software "Microsoft Explorer" of Microsoft or, for example, by the known computer software "Netscape Navigator" of Netscape.
An Internet address URL can be applied to the Internet browser 6 via a keyboard 7, and the Internet browser 6 is then arranged for searching for the computer server 2 or 3 connected to the Internet NET and featured by the Internet address URL. Once the sought computer server 2 or 3 has been found, the Internet browser 6 scans and receives the Internet page featured by the Internet address URL and stored on the computer server 2 or 3, which page usually contains text information TI and picture information BI and is HTML coded. The Internet browser 6, after receiving the text information TI and picture information BI of an Internet page, delivers the monitor signal MS containing this information to the monitor port 4.
In Fig. 2 is shown an Internet page 8, which can be displayed or reproduced by the monitor 5. The Internet page 8 contains text information TI1 and TI2 as well as picture information BI1 and BI2. The Internet page 8 contains further text information TI3, TI4, TI5, TI6 and TI7, which is shown underlined and forms hypertexts HT(HL) of so-called hyperlinks HL. Each hyperlink contains both a hypertext HT(HL) and an Internet address URL(HL) of the hyperlink HL assigned to the hypertext HT(HL), which address, however, is not displayed by the monitor 5. When a user of the computer 1 - for example by actuating keys of the keyboard 7 - selects one of the represented hypertexts TI3 to TI7 and thus activates the hyperlink HL, the Internet browser 6 loads text information TI and picture information BI of the Internet page featured by the Internet address URL(HL) of the activated hyperlink HL, as was described above.
In the internal memory of the computer 1 can further be stored a second computer program product which, when it runs on the computer 1, forms a speech recognition device 9, as a result of which the computer 1 executes a speech recognition method. The speech recognition device 9 is arranged for controlling the Internet browser 6. For this purpose, the speech recognition device 9 is arranged for delivering to the Internet browser 6 the Internet address URL of an Internet page selected by a user of the computer 1 via a spoken command.
The speech recognition device 9 has hyperlink identification means 10. To the hyperlink identification means 10 can be applied the text information TI of the Internet page 8 displayed by the monitor 5 and received by the Internet browser 6. The hyperlink identification means 10 are arranged for detecting the text information TI3 to TI7 of the hypertexts HT(HL) of the hyperlinks HL from the text information TI of the received Internet page 8. This text information TI3 to TI7 can be delivered by the hyperlink identification means 10 as the hypertexts HT(HL) of the hyperlinks HL of the Internet page 8. The hyperlink identification means 10 are further arranged for detecting the Internet addresses URL(HL) of the hyperlinks HL - not shown in Fig. 2 - from the text information TI of the received Internet page 8. To each hypertext HT(HL) of a hyperlink HL that can be produced by the hyperlink identification means 10 an Internet address URL(HL) of the hyperlink HL can thus be assigned.
The speech recognition device 9 further includes correlation means 11 for determining first phoneme sequences PI1[HT(HL)] corresponding to these hypertexts HT(HL). For this purpose the correlation means 11 include a correlation stage 12 and a word memory 13. In the word memory 13 are stored 64,000 English words as a so-called background lexicon. Stored in the word memory 13 and assigned to each of these words is a phoneme sequence PI, which corresponds to the acoustic pronunciation of this word.
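
The hyperlink identification means 10 can be pictured as a small scanner over the HTML of the received page that pairs each hypertext HT(HL) with its Internet address URL(HL). The following is only a minimal sketch of that idea in Python, assuming plain HTML anchors; the class name, the example page and its address are invented for illustration and do not come from the patent.

```python
from html.parser import HTMLParser

class HyperlinkIdentifier(HTMLParser):
    """Collects (hypertext, URL) pairs, i.e. HT(HL) and URL(HL), from an HTML page."""

    def __init__(self):
        super().__init__()
        self._current_href = None
        self._current_text = []
        self.hyperlinks = []          # list of (hypertext, url) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._current_text = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            hypertext = " ".join("".join(self._current_text).split())
            if hypertext:
                self.hyperlinks.append((hypertext, self._current_href))
            self._current_href = None

# Invented example page standing in for the underlined hypertexts TI3..TI7 of Fig. 2.
page = '<p>Welcome</p><a href="http://www.example.com/news">Latest news</a>'
parser = HyperlinkIdentifier()
parser.feed(page)
print(parser.hyperlinks)   # [('Latest news', 'http://www.example.com/news')]
```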
The correlation stage 12 is arranged for determining a first phoneme sequence PI1[HT(HL)] for each hypertext HT(HL) of a hyperlink HL delivered to the correlation stage 12 by the hyperlink identification means 10. The correlation stage 12 is then arranged for comparing text portions of the hypertext HT(HL) of a hyperlink HL with words stored in the word memory 13. When a large degree of correspondence has been detected between a text portion of the hypertext HT(HL) of the hyperlink HL and a word of the word memory 13, the phoneme sequence PI assigned to this word and stored in the word memory 13 is incorporated in the first phoneme sequence PI1[HT(HL)] of this hypertext HT(HL).
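
The look-up performed by the correlation stage 12 against the word memory 13 can be sketched as below; the lexicon entries and the phoneme notation are invented, and an exact dictionary look-up stands in for the "large degree of correspondence" test.

```python
# Hypothetical background lexicon: word -> phoneme sequence (notation invented).
WORD_MEMORY = {
    "latest": "l ey t ax s t",
    "news":   "n uw z",
    "books":  "b uh k s",
}

def phoneme_sequence_for_hypertext(hypertext, word_memory=WORD_MEMORY):
    """Builds the first phoneme sequence PI1[HT(HL)] of a hypertext by looking up
    each text portion that matches a word stored in the word memory; an exact
    match stands in for the fuzzy correspondence test of the correlation stage."""
    phonemes = []
    for portion in hypertext.lower().split():
        if portion in word_memory:
            phonemes.append(word_memory[portion])
    return " ".join(phonemes)

print(phoneme_sequence_for_hypertext("Latest news"))   # 'l ey t ax s t n uw z'
```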
The speech recognition device 9 further includes a command determining stage 14 by which a command table 15 shown in Fig. 3 is stored. In a first column of the command table 15 are stored the first phoneme sequences PI1[HT(HL)] determined by the correlation stage 12 and delivered to the command determining stage 14. In a third column of the command table 15 are stored the Internet addresses URL(HL) of the hyperlink HL detected by the hyperlink identification means 10 and delivered to the command determining stage 14 by the correlation stage 12.
This achieves that in the command table 15 of the command determining stage 14 are stored first phoneme sequences PI1[HT(HL)] and Internet addresses URL(HL) for each hyperlink HL of the Internet page 8 displayed by the monitor 5. The hypertexts HT(HL) of the phoneme sequences PI1[HT(HL)] stored in the command determining stage 14 form the spoken commands that can be recognized by the speech recognition device 9 when the Internet page 8 is displayed by the monitor 5. In addition, the speech recognition device 9 also recognizes search commands and subsequent search criterions in spoken commands, which will be further discussed hereinbelow.
The computer 1 has an audio port 16 to which a microphone 17 can be connected. A user of the computer 1 can speak a command into the microphone 17, after which an audio signal AS corresponding to the command is delivered to the audio port 16 by the microphone 17. For activating a hyperlink HL the user can speak a part of or also the whole text information TI3, TI4, TI5, TI6 or TI7 of a hypertext HT(HL) of a hyperlink HL into the microphone 17 as a command. The user of the computer 1 can also speak a command into the microphone 17 which represents one of the search commands that can be recognized by the speech recognition device 9 together with a search criterion, in order to find Internet pages whose contents correspond to the search criterion. This will be further explained below with reference to the following examples of application.
The speech recognition device further includes receiving means 18 for receiving an audio signal AS of a user-uttered command applied to the audio port 16. The receiving means 18 include an input amplifier for amplifying the audio signal AS and an analog-to-digital converter for digitizing the analog audio signal AS. The receiving means 18 can produce digital audio data AD representing the command uttered by the user.
The speech recognition device 9 further includes speech recognition means 19 for detecting a phoneme sequence P corresponding to the spoken command and for detecting the hyperlink HL selected by the user by comparing the determined phoneme sequence P with the phoneme sequences P[HT(HL)] stored in the command word determining stage 14. For this purpose, the speech recognition means 19 include a speech recognition stage 20 and the command word determining stage 14.
The speech recognition stage 20 can be supplied with digital audio data AD which can be delivered by the receiving means 18. The speech recognition stage 20 is arranged for detecting the phoneme sequence P corresponding to the digital audio data AD of the command spoken by the user, as this has already been known for a long time with speech recognition devices. A phoneme sequence P detected by the speech recognition stage 20 can be delivered by this stage to the command word determining stage 14.
The speech recognition stage 20 is further arranged for comparing the detected phoneme sequence P with phoneme sequences of recognizable search commands stored in the speech recognition stage 20. When a command spoken by the user represents a search command known to the speech recognition stage 20, or contains such a search command, the phoneme sequence detected by the speech recognition stage 20 for the search criterion following the search command in the spoken command can be delivered as a search phoneme sequence SP to the command determining stage 14.
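
For illustration, the recognition of a search command at the start of an utterance can be sketched as follows; the sketch works on recognised words rather than on phoneme data, which is a simplification, and the command words are taken from the examples given in this text.

```python
SEARCH_COMMANDS = ("SEARCH", "MY", "GO TO")   # search commands named in the examples

def split_search_command(recognised_text):
    """Returns (search_command, search_criterion) if the utterance starts with a
    known search command, otherwise None; stands in for the phoneme-level test
    performed by the speech recognition stage 20."""
    upper = recognised_text.upper()
    for command in SEARCH_COMMANDS:
        if upper.startswith(command + " "):
            return command, upper[len(command):].strip()
    return None

print(split_search_command("search books"))       # ('SEARCH', 'BOOKS')
print(split_search_command("my pizza service"))   # ('MY', 'PIZZA SERVICE')
```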
After receiving a phoneme sequence P from the speech recognition stage 20, the command determining stage 14 is arranged for comparing the received phoneme sequence P with the phoneme sequences P[HT(HL)] stored in the command table 15. The command determining stage 14 is further arranged for delivering the Internet address URL(HL) of the hyperlink HL stored in the command table 15 whose phoneme sequence P[HT(HL)] of the hypertext HT(HL) corresponds best to the phoneme sequence P delivered to the command determining stage 14.
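
The "corresponds best" decision of the command determining stage 14 can be mimicked with any string-similarity measure; the sketch below uses difflib from the Python standard library on textual phoneme strings, which is an assumption of this illustration rather than the matching actually used, and the table contents are invented.

```python
from difflib import SequenceMatcher

# Hypothetical command table 15: phoneme sequence of a hypertext -> address of the hyperlink.
COMMAND_TABLE = {
    "l ey t ax s t n uw z": "http://www.example.com/news",
    "k ao n t ae k t":      "http://www.example.com/contact",
}

def best_matching_url(spoken_phonemes, command_table=COMMAND_TABLE, threshold=0.6):
    """Returns the URL(HL) whose stored phoneme sequence P[HT(HL)] corresponds
    best to the spoken phoneme sequence P, or None if no entry is close enough."""
    best_url, best_score = None, 0.0
    for stored_phonemes, url in command_table.items():
        score = SequenceMatcher(None, spoken_phonemes, stored_phonemes).ratio()
        if score > best_score:
            best_url, best_score = url, score
    return best_url if best_score >= threshold else None

print(best_matching_url("l ey t ax s t n uw z"))   # http://www.example.com/news
```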
The speech recognition device 9 further includes control means 21 for controlling the Internet browser 6 to enable reception of text information TI and picture information BI of the Internet page featured by the hyperlink HL selected by the user. For this purpose, the Internet address URL(HL) of the hyperlink HL determined by the command determining stage 14 can be applied to the control means 21. The control means 21 form an interface to the Internet browser 6 and deliver the Internet address URL(HL) of the selected hyperlink HL applied to the control means 21 to the Internet browser 6 in a data format such that the first computer program product can immediately process the Internet address URL. This achieves that, for selecting a hyperlink HL of the Internet page 8 shown on the monitor 5, a user can speak one of the items of text information TI3 to TI7 into the microphone 17 as a command, and the Internet page featured by the Internet address URL(HL) of the selected hyperlink HL is automatically retrieved from the respective computer server 2 or 3 and displayed by the monitor 5.
The speech recognition device 9 furthermore has search storage means 22 which store a first search table 23 shown in Fig. 4 and a second search table 24 shown in Fig. 5. In a first column of the search tables 23 and 24 are stored search criterion phoneme sequences KP(KT) of search criterions KT. For example, the search criterion phoneme sequence KP(BOOKS) of the search criterion KT = "BOOKS", stored in the second row of the first search table 23, would correspond to the phoneme sequence P detected by the speech recognition stage 20 if the user spoke the word "BOOKS" into the microphone 17.
In a second column of the search tables 23 and 24 are stored one or more Internet addresses URL[KT(HL)] of Internet pages, stored and assigned to the search criterion phoneme sequences KP(KT) contained in the first column, the contents of these Internet pages containing information that was assigned to the search criterion KT. In this manner, for example, an Internet address URL[BOOKS(1)] = http://www.amazon.com features an Internet page by which books can be searched for on the Internet NET. Similarly, for example, an Internet address URL[MUSIC(1)] = http://www.mtv.com, stored in the search table 24 and assigned to the search criterion phoneme sequence KP(MUSIC), features an Internet page of the music channel MTV.
A third column of the search tables 23 and 24 contains hypertexts HT[KT(HL)] of hyperlinks HL assigned to a search criterion phoneme sequence KP(KT) when more than one Internet address URL[KT(HL)] is assigned to that search criterion phoneme sequence KP(KT) in the second column. This will be further discussed with reference to a second example of application.
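
Read as a data structure, each row of the search tables 23 and 24 couples one search criterion phoneme sequence KP(KT) with one or more Internet addresses URL[KT(HL)] and, where there are several addresses, with the hypertexts HT[KT(HL)] used on the Internet index page. One possible representation is sketched below; the phoneme notation is invented and, apart from http://www.amazon.com, the addresses are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class SearchTableRow:
    criterion_phonemes: str                           # KP(KT), written here as a plain string
    urls: list = field(default_factory=list)          # URL[KT(HL)]
    hypertexts: list = field(default_factory=list)    # HT[KT(HL)], used when len(urls) > 1

# Excerpt of the provider-maintained first search table 23 (illustrative rows only).
FIRST_SEARCH_TABLE = [
    SearchTableRow("b uh k s", ["http://www.amazon.com"]),
    SearchTableRow("b ey b iy k l ow dh z",
                   ["http://www.baby-one.example", "http://www.baby-two.example"],
                   ["Baby shop one", "Baby shop two"]),
]

# Second search table 24 (user-maintained), initially empty.
SECOND_SEARCH_TABLE = []
```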
In accordance with a first example of application of the speech recognition method of selecting Internet pages it is assumed that the Internet page 8 is displayed by the monitor 5 and that the user of the computer 1 would like to receive information about books from the Internet NET. To achieve this, the user speaks the command "SEARCH BOOKS" into the microphone 17, after which a respective audio signal AS is delivered to the receiving means 18 and corresponding audio data AD are delivered to the speech recognition stage 20 by the receiving means 18. The speech recognition stage 20 detects phoneme sequences P corresponding to the words "SEARCH" and "BOOKS" and compares the phoneme sequence P of the first word "SEARCH" to phoneme sequences of search commands that can be recognized and are stored in the speech recognition stage 20. The speech recognition stage 20 recognizes the first word of the spoken command as a search command and delivers the phoneme sequence P of the second word "BOOKS" as a search phoneme sequence SP to the command determining stage 14.
The command determining stage 14 is then arranged for comparing the search phoneme sequences SP delivered thereto with search criterion phoneme sequences KP(KT) contained in the first column of the first search table 23. The command determining stage 14 then determines that the search phoneme sequence SP largely matches the search criterion phoneme sequence KP(BOOKS) stored in the second row of the first search table 23 and reads the Internet address URL[BOOKS(l)] = http://www.amazon.com stored in the second column of the second row of the first search table 23 and assigned to this search criterion phoneme sequence KP(BOOKS) from the search storage means 22.
This Internet address URL[BOOKS(l)] = http://www.amazon.com is then delivered to the control means 21 by the command determining stage 14. The control means 21 deliver this Internet address URL[BOOKS(l)] in the data format processed by the Internet browser 6 as an Internet address URL = http://www.amazon.com to the Internet browser 6, as a result of which the Internet page featured by this Internet address URL is displayed on the monitor 5.
This achieves that the user only has to speak the command "SEARCH BOOKS" into the microphone to obtain an Internet page with information about the search criterion "BOOKS" on the monitor 5. Advantageously, it is not necessary for a corresponding hyperlink HL to be contained in the Internet page 8 displayed on the monitor 5. This advantage is obtained in particular because the first search table 23 already stores Internet addresses URL[KT(HL)] for certain search criteria KT, which therefore need not first be retrieved from the Internet NET with a so-called search engine (for example YAHOO).
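
Reduced to code, this first example of application is a single table look-up followed by a hand-over of the address to the Internet browser. A minimal self-contained sketch under the same assumptions as above (invented phoneme notation, similarity matching via difflib):

```python
from difflib import SequenceMatcher

# Hypothetical excerpt of the first search table 23: criterion phonemes -> Internet addresses.
FIRST_SEARCH_TABLE = {"b uh k s": ["http://www.amazon.com"]}

def lookup_search_criterion(search_phonemes, table=FIRST_SEARCH_TABLE, threshold=0.6):
    """Returns the Internet addresses URL[KT(HL)] stored for the search criterion
    phoneme sequence KP(KT) that corresponds best to the search phoneme sequence
    SP, or None if no stored sequence is close enough."""
    best_urls, best_score = None, 0.0
    for criterion_phonemes, urls in table.items():
        score = SequenceMatcher(None, search_phonemes, criterion_phonemes).ratio()
        if score > best_score:
            best_urls, best_score = urls, score
    return best_urls if best_score >= threshold else None

urls = lookup_search_criterion("b uh k s")            # spoken criterion "BOOKS"
if urls and len(urls) == 1:
    print("hand over to the Internet browser:", urls[0])   # http://www.amazon.com
```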
A further advantage of the speech recognition method is provided in that the provider of the first search table 23 forming a database may ask for a registration fee from persons or businesses whose Internet address URL[KT(HL)] is to be registered in the first search table 23 with a certain search criterion KT, as a result of which an economically interesting method of doing business on the Internet is obtained.
Since Internet addresses URL change relatively often, it is advantageous to continuously update the first search table 23. For this purpose the provider of the first search table 23 uses a third computer server 25 which is connected to the Internet NET and on which the first search table 23, as updated by the provider, is stored. At certain instants (for example, every week or each time a search command is recognized by the speech recognition stage 20), the control means 21 deliver an Internet address URL featuring the third computer server 25 to the Internet browser 6, after which the updated first search table 23 is received by the Internet browser 6 and stored in the search storage means 22. This updating of the first search table 23 may advantageously be effected automatically and without the user of the computer 1 being involved. This offers the advantage that the first search table 23 is continuously updated in all the computers running the second computer program product. Furthermore, the provider of the first search table 23 may also demand different registration fees from a person or business for entering the Internet address URL[KT(HL)] in the first search table 23 for different time ranges.
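
The automatic refresh of the first search table 23 can be pictured as a periodic download from the provider's server. The address, the JSON file format and the schedule in this sketch are assumptions made only for illustration.

```python
import json
import urllib.request

# Hypothetical address of the provider's server (third computer server 25).
SEARCH_TABLE_URL = "http://provider.example/first-search-table.json"

def refresh_first_search_table(url=SEARCH_TABLE_URL, timeout=10):
    """Downloads the provider-maintained first search table 23, e.g. once a week
    or whenever a search command has been recognised, and returns its contents."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)   # e.g. {"b uh k s": ["http://www.amazon.com"], ...}
```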
According to a second example of application of the speech recognition method of selecting Internet pages it is assumed that the Internet page 8 is displayed by the monitor 5 and the user of the computer 1 would like to have information about baby clothing from the Internet NET. For this purpose, the user speaks the command "SEARCH BABY CLOTHES" into the microphone 17, after which the speech recognition stage 20 - as described above - applies a phoneme sequence P corresponding to the search criterion "BABY CLOTHES" as a search phoneme sequence SP to the command determining stage 14.
The command determining stage 14 determines that the search criterion phoneme sequence KP(BABY CLOTHES) entered on the third line of the first search table 23 has a sufficiently large correspondence to the search phoneme sequence SP. Subsequently, the command determining stage 14 reads the four Internet addresses URL[BABY CLOTHES(1)] to URL[BABY CLOTHES(4)] and the four associated hypertexts HT[BABY CLOTHES(1)] to HT[BABY CLOTHES(4)], stored and assigned to the search criterion phoneme sequence KP(BABY CLOTHES) in the first search table 23, from the search storage means 22 and delivers this information to the control means 21.
The control means 21 then generate text information TI of an Internet index page 26 shown in Fig. 6 and apply this to the Internet browser 6 to be displayed on the monitor 5. Text information TI8 then indicates the search criterion KT entered as a search command by the user. Text information TI9 to TI12 forms hyperlinks HL to the Internet pages featured by the Internet addresses URL[BABY CLOTHES(1)] to URL[BABY CLOTHES(4)] with information about the search criterion "BABY CLOTHES", which hyperlinks can be activated by the user.
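
How an Internet index page such as the one of Fig. 6 could be assembled from the stored addresses and hypertexts is sketched below; the HTML layout and the example entries are assumptions of this illustration.

```python
def build_index_page(search_criterion, entries):
    """Builds simple HTML for an Internet index page: a heading with the search
    criterion (TI8) followed by one hyperlink per stored address (TI9, TI10, ...)."""
    lines = [f"<html><body><h1>Search criterion: {search_criterion}</h1><ul>"]
    for hypertext, url in entries:
        lines.append(f'<li><a href="{url}">{hypertext}</a></li>')
    lines.append("</ul></body></html>")
    return "\n".join(lines)

# Invented example entries standing in for URL[BABY CLOTHES(1)] to URL[BABY CLOTHES(4)].
page = build_index_page("BABY CLOTHES", [
    ("Baby shop one", "http://www.baby-one.example"),
    ("Baby shop two", "http://www.baby-two.example"),
])
print(page)
```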
This offers the advantage that, when more than one Internet address URL[KT(HL)] is registered for a search criterion KT in the search table 23 or 24, the Internet index page 26 is shown and the user, by activating the hyperlinks HL of the Internet index page 26, can retrieve, one after the other, the Internet pages with information about the search criterion KT from an Internet server connected to the Internet NET. According to a third example of application of the speech recognition method it is assumed that the user of the computer 1 would like to link the Internet page of a certain pizza service to the search criterion "PIZZA SERVICE" so as to find the page again rapidly. For this purpose, the user enters search criterion text information KTI = "PIZZA SERVICE" and the assigned Internet address URL[PIZZA SERVICE(HL)] = http://www.pizza.com with the keys of the keyboard 7.
This information entered with the keyboard 7 is received by the correlation stage 12. The correlation stage 12 then determines for the received search criterion text information KTI = "PIZZA SERVICE" an associated search criterion phoneme sequence KP(PIZZA SERVICE) and applies this sequence together with the received Internet address URL[PIZZA SERVICE(HL)] to the search storage means 22.
Fig. 7 shows the second search table 24 in which, in addition to the contents of the second search table 24 shown in Fig. 5, a fifth row containing the search criterion phoneme sequence KP(PIZZA SERVICE) and the Internet address URL[PIZZA SERVICE(HL)] = http://www.pizza.com has been entered. The search table 24 shown in Fig. 7 is then stored in the search storage means 22.
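The following is a sketch of what the correlation stage 12 might do with such a keyboard entry, under the simplifying assumption that the grapheme-to-phoneme conversion is a trivial letter-by-letter stand-in; a real correlation stage would use a pronunciation lexicon or a grapheme-to-phoneme model, and all names here are assumptions.

```python
# User-maintained second search table: phoneme sequence -> list of (URL, hypertext).
SECOND_SEARCH_TABLE = {}


def text_to_phonemes(text: str):
    """Placeholder grapheme-to-phoneme conversion: one symbol per letter/space.
    Stands in for a real pronunciation lexicon or G2P model."""
    return tuple(ch.lower() for ch in text if ch.isalpha() or ch == " ")


def add_user_entry(criterion_text: str, url: str, hypertext: str = ""):
    """Store a user-defined search criterion and its Internet address in the
    second search table, keyed by the derived phoneme sequence."""
    key = text_to_phonemes(criterion_text)
    SECOND_SEARCH_TABLE.setdefault(key, []).append((url, hypertext or url))


# Example corresponding to the third example of application above.
add_user_entry("PIZZA SERVICE", "http://www.pizza.com")
```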
This advantageously enables the user of the computer 1 to assign search criterions KT to Internet addresses URL of interest to himself and to store them in the second search table 24. The first search table 23 is provided and updated by the provider of the database, whereas the second search table 24 can be provided and continuously updated by the user of the computer 1 himself.
It may be observed that the user can also enter a plurality of Internet addresses URL[KT(HL)] and hypertexts HT(HL) of hyperlinks HL for a given search criterion text information KTI by using the keyboard 7. They are then also entered in the search table 24, as is represented in the third row of the search table 24 under the search criterion "VIDEO".

According to a fourth example of application of the speech recognition method, the user would like to order a pizza from his pizza service. For this purpose the user speaks the command "MY PIZZA SERVICE" into the microphone 17. The speech recognition stage 20 then compares the detected phoneme sequence P with the phoneme sequences of recognizable speech commands and recognizes the speech command "MY".
Then the speech recognition stage 20 delivers the phoneme sequence P of the search criterion KT = "PIZZA SERVICE" as a specially featured search phoneme sequence SP to the command determining stage 14. The command determining stage 14 compares specially featured search phoneme sequences SP only with search criterion phoneme sequences KP(KT) entered in the second search table 24, because the second search table 24 contains the search criterions KT and assigned Internet addresses URL[KT(HL)] entered by the user. Subsequently, the Internet address URL[KT(HL)] = http://www.pizza.com delivered to the control means 21 by the command determining stage 14 is delivered as an Internet address URL to the Internet browser 6, after which the desired Internet page of the pizza service for ordering a pizza is shown on the monitor 5.
This offers the advantage that the user, by selecting the search command "SEARCH" or "MY", can decide whether a search is to be made in the first search table 23 or in the second search table 24 for Internet addresses URL[KT(HL)] corresponding to the search criterion KT.
It may be observed that the user - for example by speaking the command "GO TO BABY CLOTHES" - can determine that only the Internet address URL[KT(3)] = http://www.baby.com of the Internet page stored in the first search table 23 for the search criterion KT is to be delivered to the control means 21 and displayed by the monitor 5, which Internet page was featured as the most interesting Internet page for this search criterion KT by the provider of the first search table 23.
For the user this brings the advantage that, upon request (when he speaks "GO TO" instead of "SEARCH" as a command), the user is pointed directly to a highly interesting Internet page for the search criterion. For the provider of the first search table 23 it is advantageous that he may charge an additional registration fee from persons or businesses whose Internet addresses URL[KT(HL)] are to be highlighted in this way in the first search table 23.
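The following sketch illustrates how the command determining stage 14 could route the three spoken commands discussed above to the two search tables. The exact-match helper merely stands in for the fuzzy phoneme lookup sketched earlier, and all names and table formats are assumptions made for the example.

```python
def exact_look_up(phonemes, table):
    """Stand-in for the fuzzy look_up sketched earlier; exact match only."""
    return table.get(tuple(phonemes))


def handle_command(command: str, phonemes, first_table, second_table):
    """Route a recognized spoken command:
    "SEARCH" -> provider's first search table (may yield an index page),
    "MY"     -> user's second search table only,
    "GO TO"  -> first search table, but only the highlighted first entry."""
    if command == "MY":
        return exact_look_up(phonemes, second_table)
    entries = exact_look_up(phonemes, first_table)
    if entries and command == "GO TO":
        return [entries[0]]  # only the page featured as most interesting
    return entries
```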
According to a fifth example of application of the speech recognition method the user speaks the command "SEARCH PHOTOS" into the microphone 17. Subsequently, the search phoneme sequence SP of the search criterion "PHOTOS" detected by the speech recognition stage 20 is compared by the command determining stage 14 with the search criterion phoneme sequences KP[KT(HL)] stored in the first search table 23. The command determining stage 14 then determines that none of the stored search criterion phoneme sequences KP[KT(HL)] has a sufficiently large correspondence to the search phoneme sequence SP. The search text information ST = "PHOTOS" detected by the speech recognition stage 20 is then delivered to the command determining stage 14 and passed on by this stage to the control means 21. When search text information ST is received, the control means 21 deliver an Internet address URL of a search engine (for example, URL = http://www.yahoo.com of the search engine YAHOO) stored by the control means 21 to the Internet browser 6. Subsequently, the control means 21 deliver the search text information ST to the Internet browser 6, which inserts the search text information ST into the entry field of the Internet page of YAHOO and activates the search with the search engine YAHOO. The YAHOO search engine then finds Internet pages to be assigned to the search criterion "PHOTOS" and a respective Internet index page is displayed on the monitor 5.
This offers the advantage that when, for a search criterion KT of a spoken search command, no search criterion phoneme sequence KP[KT(HL)] is stored in the search table 23 or 24, the search for Internet pages containing information about this search criterion KT is transferred to a search engine.
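In the description above the search text information ST is inserted into the entry field of the search engine's Internet page. An equivalent and simpler way to sketch the same fallback is to construct the engine's query address directly; the query-parameter format shown is an assumption, as each search engine defines its own.

```python
from urllib.parse import quote_plus


def search_engine_url(search_text: str,
                      engine: str = "http://www.yahoo.com/search?p=") -> str:
    """Build the address the Internet browser is directed to when no stored
    search criterion phoneme sequence matches the spoken criterion."""
    return engine + quote_plus(search_text)


# Example: falling back to the search engine for the unmatched criterion "PHOTOS".
fallback = search_engine_url("PHOTOS")
```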
According to a second example of embodiment of the speech recognition method, not shown in the Figures, the first search table 23 is not stored in the second computer program product, thus not in the search storage means 22 of the computer 1, and is not continuously updated there, as was the case in the first example of application. When a user speaks a command into the microphone, the Internet address URL of the third computer server 25 stored in the control means of the second computer program product is delivered to the Internet browser 6 and, subsequently, the detected search phoneme sequence SP of the search criterion KT is delivered to the Internet browser 6, which delivers the search phoneme sequence SP to the third computer server 25 via the Internet NET.
According to the second example of embodiment the third computer server 25 comprises means corresponding to the command determining stage 14 and the search storage means 22 of the computer 1. The third computer server 25 executes a selection method of selecting Internet addresses URL[KT(HL)] of Internet pages assigned to a search criterion
KT. A search phoneme sequence SP sent via the Internet NET to the third computer server 25 is received by the third computer server 25.
Subsequently, the third computer server 25 determines, from a search table stored in the third computer server 25, a search criterion phoneme sequence KP(KT) that has a sufficiently large correspondence to the received search phoneme sequence SP and determines at least one Internet address URL[KT(HL)] stored and assigned to this search criterion phoneme sequence KP(KT). The one or more Internet addresses URL[KT(HL)] thus determined for the search criterion KT are then delivered by the third computer server 25 to the computer from which the search phoneme sequence SP was received.
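A minimal sketch of such a server-side selection method follows, assuming the search phoneme sequence is posted as JSON over HTTP and the lookup is an exact match (the fuzzy correspondence test sketched earlier could be substituted). The port, the transport format, the key encoding and the table contents are all assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Server-side copy of the search table; in practice filled from the provider's
# database. Keys are phoneme sequences joined with spaces for easy transport.
SERVER_SEARCH_TABLE = {
    "b ey b iy k l ow dh z": ["http://www.baby.com", "http://www.clothes.example"],
}


class SelectionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The client posts the search phoneme sequence as a JSON body,
        # e.g. {"phonemes": "b ey b iy k l ow dh z"}.
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        urls = SERVER_SEARCH_TABLE.get(request.get("phonemes", ""), [])
        body = json.dumps({"urls": urls}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("", 8080), SelectionHandler).serve_forever()
```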
This offers the advantage that the search table, which may need a large memory capacity, need not be stored in each computer that runs the second computer program product. In addition, it is no longer necessary to continuously update the search table in these computers.
When the speech recognition method is subdivided, as described with reference to the second example of embodiment, into a part to be processed by the computer (client) of the user and a part to be processed by the computer server, in which phonemes or information corresponding to the phonemes are transmitted from the client to the server, there are two essential advantages. The speaker-dependent processing operations of the speech recognition method are processed at the client, so that the server advantageously need not process speaker-dependent information. Furthermore, all processing operations that require much memory are uniformly processed by the server, so that the computers (clients) of the users advantageously need not have much memory capacity.
It may be observed that the second computer program product can be loaded from a CD-ROM or a floppy disc into the internal memory of the computer 1 and thus, advantageously, can be installed in the computer 1 in an extremely simple manner.
It may be observed that the speech recognition method and the second computer program product can be implemented and processed respectively by any product that can be connected to the Internet NET. Such products may be, for example, a personal digital assistant, a set top box or a mobile telephone, which can set up a connection to the Internet.
It may be observed that determining whether a phoneme sequence P has a sufficiently large correspondence to another phoneme sequence is a routine job for the expert.
It may be observed that when the hypertext HT(HL) of a hyperlink HL of an Internet page displayed by the monitor 5 contains the same text information as a search command that can be detected by the speech recognition stage 20 together with the subsequent search criterion KT, the hyperlink HL of the Internet page can be activated, for example, by speaking a command "CLICK" before the hypertext HT(HL) is spoken.
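A sketch of how such a spoken "CLICK" command could be resolved against the hyperlinks of the currently displayed Internet page is given below, assuming the recognizer already returns the command as text and the browser exposes the hypertext/URL pairs of that page; everything shown is an assumption made for the example.

```python
def find_hyperlink(spoken: str, hyperlinks: dict):
    """Given a command like 'CLICK BABY CLOTHES AT BABY.COM', return the URL
    of the hyperlink whose hypertext matches the words spoken after 'CLICK'.
    hyperlinks maps the hypertext of each link on the displayed page to its URL."""
    if not spoken.upper().startswith("CLICK "):
        return None
    target = spoken[len("CLICK "):].strip().upper()
    for hypertext, url in hyperlinks.items():
        if hypertext.upper() == target:
            return url
    return None


# Example: activating a link on the displayed page by voice.
url = find_hyperlink(
    "CLICK BABY CLOTHES AT BABY.COM",
    {"Baby clothes at baby.com": "http://www.baby.com"},
)
```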
It may be observed that the second computer program product may also form part of the first computer program product - thus part of the Internet browser.
It may be observed that all the measures explained in connection with the Internet can equally well be applied to other data networks.

CLAIMS:
1. A speech recognition method of selecting Internet addresses (URL[KT(HL)]) of Internet pages (8) assigned to a search criterion (KT), the method comprising the following steps: reception (18) of a spoken command (AS) of a user, which represents a search command and a search criterion (KT) and determining (20) a search phoneme sequence (SP) corresponding to the search criterion (KT) of the spoken command (AS) and determining (14) a search criterion phoneme sequence (KP(KT)) which has a sufficiently large correspondence to the search phoneme sequence (SP) from stored (22; 25) search criterion phoneme sequences (KP(KT)) of search criterions (KT) and determining (14) at least one of the Internet addresses (URL[KT(HL)]) of at least one
Internet page assigned to the search criterion (KT) which stored address is assigned to the determined search criterion phoneme sequence (KP(KT)) and displaying (5) this at least one assigned Internet page.
2. A speech recognition method as claimed in claim 1, in which, when at least two Internet addresses (URL[KT(HL)]) of the determined search criterion phoneme sequence (KP(KT)) are assignedly stored, the display (5) of an Internet index page (26) is effected in which hypertexts (HT[KT(HL)]) stored and assigned to the determined Internet addresses (URL[KT(HL)]) are represented by hyperlinks (HL) so as to provide that the user by activating one of these hyperlinks (HL) effects the display (5) of an Internet page selected by the user and to be assigned to the search criterion (KT).
3. A speech recognition method as claimed in claim 1, in which the method comprises the following further steps: retrieving (21) at least one search criterion phoneme sequence (KP(KT)) and at least one assigned Internet address (URL[KT(HL)]) of at least one Internet page assigned to a search criterion (KT) by a computer server (25) connected to the Internet (NET) and storing (22) the retrieved search criterion phoneme sequences (KP(KT)) and assigned Internet addresses (URL[KT(HL)]) for executing the steps of the speech recognition method.
4. A speech recognition method as claimed in claim 1, in which the method comprises the following further steps: receiving (12) search criterion text information (KTI) and at least one assigned Internet address (URL[KT(HL)]) and determining (12) a search criterion phoneme sequence (KP(KT)) corresponding to the search criterion text information (KTI) and storing (22) the determined search criterion phoneme sequence (KP(KT)) and the received Internet addresses (URL[KT(HL)]) for executing the steps of the speech recognition method.
5. A speech recognition method as claimed in claim 1, in which when none of the stored search criterion phoneme sequences (KP[KT(HL)]) has a sufficiently large correspondence to the search phoneme sequence (SP), search text information (ST) about the search criterion (KT) of the spoken command is determined (20) and delivered to a computer server (25) connected to the Internet (NET), by which server a computer program product (6) of a search engine is processed, to enable the display of an Internet index page (5) containing hypertexts (HT(HL)) of hyperlinks (HL) detected by the search engine.
6. A speech recognition method as claimed in claim 1, in which the method comprises the following further steps: transmitting (21) the determined search phoneme sequence (SP) to an Internet server (25), which is featured by an Internet address (URL) predefined by the search command, and determining from search criterion phoneme sequences (KP[KT(HL)]) stored by the Internet server (25) a search criterion phoneme sequence (KP[KT(HL)]) that has a sufficiently large correspondence to the search phoneme sequence (SP) and determining at least one Internet address (URL[KT(HL)]), stored by the Internet server (25) and assigned to the determined search criterion phoneme sequence (KP[KT(HL)]), of at least one Internet page assigned to the search criterion (KT) and displaying this at least one assigned Internet page.
7. A speech recognition method as claimed in claim 1, in which the method comprises the following further steps: laying down (20) a selection of the stored search criterion phoneme sequences (KP(KT)) to determine the correspondence with the search phoneme sequence (SP) by comparing the search command of the user with possible search commands.
8. A computer program product (9) which can be directly loaded into the internal memory of a digital computer (1) and comprises software code sections, characterized in that with the computer (1) the steps of the method are executed in accordance with claim 1 when the product (9) runs on the computer (1).
9. A computer program product (9) as claimed in claim 8, characterized in that it is stored on a medium that can be read by the computer.
10. A selection method of selecting Internet addresses (URL[KT(HL)]) of Internet pages (8) assigned to a search criterion (KT), the method comprising the following steps: receiving a search phoneme sequence (SP) which represents a search criterion (KT) of a spoken command (AS) of a user and determining from stored (25) search criterion phoneme sequences (KP(KT)) of search criterions (KT) a search criterion phoneme sequence (KP(KT)) that has a sufficiently large correspondence to the search phoneme sequence (SP) and determining at least one stored Internet address (URL[KT(HL)]) assigned to the determined search criterion phoneme sequence (KP(KT)) of at least one Internet page assigned to the search criterion and producing the at least one determined Internet address (URL[KT(HL)]) to enable a display (5) of this at least one assigned Internet page.
PCT/EP2000/011299 1999-11-25 2000-11-10 Referencing web pages by categories for voice navigation WO2001039178A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2001540761A JP2003515832A (en) 1999-11-25 2000-11-10 Browse Web Pages by Category for Voice Navigation
EP00977543A EP1157373A1 (en) 1999-11-25 2000-11-10 Referencing web pages by categories for voice navigation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP99890371 1999-11-25
EP99890371.0 1999-11-25

Publications (1)

Publication Number Publication Date
WO2001039178A1 (en) 2001-05-31

Family

ID=8244028

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2000/011299 WO2001039178A1 (en) 1999-11-25 2000-11-10 Referencing web pages by categories for voice navigation

Country Status (3)

Country Link
EP (1) EP1157373A1 (en)
JP (1) JP2003515832A (en)
WO (1) WO2001039178A1 (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5960399A (en) * 1996-12-24 1999-09-28 Gte Internetworking Incorporated Client/server speech processor/recognizer
WO1999050830A1 (en) * 1998-03-30 1999-10-07 Microsoft Corporation Information retrieval and speech recognition based on language models

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DATABASE INSPEC [online] INSTITUTE OF ELECTRICAL ENGINEERS, STEVENAGE, GB; KATSUURA M ET AL: "The WWW browser system with spoken keyword recognition", XP002160022, Database accession no. 6264744 *
LAU R ET AL: "WebGALAXY: beyond point and click -- a conversational interface to a browser", COMPUTER NETWORKS AND ISDN SYSTEMS,NL,NORTH HOLLAND PUBLISHING. AMSTERDAM, vol. 29, no. 8-13, 1 September 1997 (1997-09-01), pages 1385 - 1393, XP004095333, ISSN: 0169-7552 *
TRANSACTIONS OF THE INFORMATION PROCESSING SOCIETY OF JAPAN, FEB. 1999, INF. PROCESS. SOC. JAPAN, JAPAN, vol. 40, no. 2, pages 443 - 452, ISSN: 0387-5806 *

Cited By (26)

Publication number Priority date Publication date Assignee Title
USRE44326E1 (en) 2000-06-08 2013-06-25 Promptu Systems Corporation System and method of voice recognition near a wireline node of a network supporting cable television and/or video delivery
US7162424B2 (en) 2001-04-26 2007-01-09 Siemens Aktiengesellschaft Method and system for defining a sequence of sound modules for synthesis of a speech signal in a tonal language
US9848243B2 (en) 2001-10-03 2017-12-19 Promptu Systems Corporation Global speech user interface
US11070882B2 (en) 2001-10-03 2021-07-20 Promptu Systems Corporation Global speech user interface
US11172260B2 (en) 2001-10-03 2021-11-09 Promptu Systems Corporation Speech interface
US10932005B2 (en) 2001-10-03 2021-02-23 Promptu Systems Corporation Speech interface
US8005679B2 (en) 2001-10-03 2011-08-23 Promptu Systems Corporation Global speech user interface
US10257576B2 (en) 2001-10-03 2019-04-09 Promptu Systems Corporation Global speech user interface
US8983838B2 (en) 2001-10-03 2015-03-17 Promptu Systems Corporation Global speech user interface
US8407056B2 (en) 2001-10-03 2013-03-26 Promptu Systems Corporation Global speech user interface
US7324947B2 (en) 2001-10-03 2008-01-29 Promptu Systems Corporation Global speech user interface
US8818804B2 (en) 2001-10-03 2014-08-26 Promptu Systems Corporation Global speech user interface
US7289960B2 (en) 2001-10-24 2007-10-30 Agiletv Corporation System and method for speech activated internet browsing using open vocabulary enhancement
US10121469B2 (en) 2002-10-31 2018-11-06 Promptu Systems Corporation Efficient empirical determination, computation, and use of acoustic confusability measures
US9305549B2 (en) 2002-10-31 2016-04-05 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
US8321427B2 (en) 2002-10-31 2012-11-27 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
US9626965B2 (en) 2002-10-31 2017-04-18 Promptu Systems Corporation Efficient empirical computation and utilization of acoustic confusability
US11587558B2 (en) 2002-10-31 2023-02-21 Promptu Systems Corporation Efficient empirical determination, computation, and use of acoustic confusability measures
US10748527B2 (en) 2002-10-31 2020-08-18 Promptu Systems Corporation Efficient empirical determination, computation, and use of acoustic confusability measures
US8959019B2 (en) 2002-10-31 2015-02-17 Promptu Systems Corporation Efficient empirical determination, computation, and use of acoustic confusability measures
US7519534B2 (en) 2002-10-31 2009-04-14 Agiletv Corporation Speech controlled access to content on a presentation medium
US8862596B2 (en) 2002-10-31 2014-10-14 Promptu Systems Corporation Method and apparatus for generation and augmentation of search terms from external and internal sources
US8185390B2 (en) 2003-06-26 2012-05-22 Promptu Systems Corporation Zero-search, zero-memory vector quantization
US7729910B2 (en) 2003-06-26 2010-06-01 Agiletv Corporation Zero-search, zero-memory vector quantization
US7428273B2 (en) 2003-09-18 2008-09-23 Promptu Systems Corporation Method and apparatus for efficient preamble detection in digital data receivers
WO2016058425A1 (en) * 2014-10-17 2016-04-21 百度在线网络技术(北京)有限公司 Voice search method, apparatus and device, and computer storage medium

Also Published As

Publication number Publication date
EP1157373A1 (en) 2001-11-28
JP2003515832A (en) 2003-05-07


Legal Events

Date Code Title Description
AK Designated states: Kind code of ref document: A1; Designated state(s): JP
AL Designated countries for regional patents: Kind code of ref document: A1; Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE
ENP Entry into the national phase: Ref country code: JP; Ref document number: 2001 540761; Kind code of ref document: A; Format of ref document f/p: F
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase: Ref document number: 2000977543; Country of ref document: EP
WWP Wipo information: published in national office: Ref document number: 2000977543; Country of ref document: EP
WWR Wipo information: refused in national office: Ref document number: 2000977543; Country of ref document: EP
WWW Wipo information: withdrawn in national office: Ref document number: 2000977543; Country of ref document: EP