WO2008100420A1

WO2008100420A1 - Providing network-based access to personalized user information

Info

Publication number: WO2008100420A1
Application number: PCT/US2008/001677
Authority: WO
Inventors: Christopher F. Mcconnell; Claire Mcconnell; Kevin Loftus; Jennifer W. Parker
Original assignee: Mcconnell Christopher F; Claire Mcconnell; Kevin Loftus; Parker Jennifer W
Priority date: 2007-02-09
Filing date: 2008-02-08
Publication date: 2008-08-21

Abstract

An apparatus and method for providing network-based access to personalized user content. The apparatus may include a memory that is publicly-accessible via a communications network. The memory may store configuration information that is customized by a user and that defines personalized content of the user. The apparatus may further include an interface module in communication with the memory. The interface module may also be publicly-accessible via the communications network and may include a processor configured to access the configuration information to determine the personalized content to be provided to a remote user device associated with the user. The processor may further be configured to establish a communications connection with the remote user device via the communications network, access the personalized content, and provide, in an audio form, at least a portion of the personalized content to the remote user device via the communications network.

Description

PROVIDING NETWORK-BASED ACCESS TO PERSONALIZED USER INFORMATION

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims benefit under 35 U.S.C. § 119(e) of provisional U.S. Patent Application No. 60/889,092, filed February 9, 2007 and entitled "A SYSTEM AND METHOD FOR RETRIEVING INFORMATION FROM A WEB-BASED SYSTEM USING DIGITIZED AUDIO SYSTEMS," the disclosure of which is hereby incorporated by reference in its entirety.

BACKGROUND

[0002] The public is increasingly using computers to store and access information that affects their daily lives. Personal information such as appointments, tasks, and contacts, as well as enterprise data such as data in spreadsheets, databases, word processing documents and the like are all types of information that are particularly amenable to storage in a computer because of the ease of updating, organizing, and accessing such information. In addition, computers are able to remotely access time-sensitive information, such as stock quotes, weather reports and so forth, on or near a real-time basis from the Internet and/or other networks. To perform all of the tasks required of them, computers have become quite sophisticated and computationally powerful. In addition, computers have become more versatile in the manner in which they can be implemented.

[0003] For example, a computer may be embedded within another device, such as a consumer product, so as to enable the product to have enhanced functionality that is beyond the capabilities of a typical device. Generally, however, a user may have one primary computer on which data is stored. Other devices may be synchronized with portions of the information stored on this primary computer. Thus, while a user has access to his or her primary computer - that is, while the user is at home or at the office - the user is able to easily access the full computational power to perform a desired task.

[0004] Increasingly, however, a user will require access to such information while traveling or while simply away from his or her computer. People want access to specific information on-demand, wherever they are. Furthermore, users are often multitasking and would prefer to request and receive information in an audio format, leaving hands and eyes free for other tasks. Unfortunately, the full analytical power of a computer is, for the most part, immobile.

[0005] For example, a desktop computer is designed to be placed at a fixed location, and is, therefore, unsuitable for mobile applications. Similarly, a consumer product with an embedded computer would be immobile and have limited functionality in most cases. Laptop computers are more transportable than desktop computers, and have comparable computing power, but are costly and still fairly cumbersome, and in many situations it is awkward or virtually impossible to retrieve data from a laptop (e.g., walking to work, driving an automobile, commuting on the subway, etc.). In addition, long range wireless Internet connectivity (e.g., via a wireless wide area network (WAN)) is expensive and still not widely available, and a cellular telephone connection for such a laptop is slow by current Internet standards. In addition, having remote Internet connectivity is duplicative of the Internet connectivity a user may have at his or her home or office, with an attendant duplication of costs.

[0006] Conventionally, a personal digital assistant (PDA) can be used to access a user's information. Such a PDA can connect intermittently with a computer through a cradle or IR beam and thereby upload or download information with the computer. Some PDAs can access the information through a wireless connection, or may double as a cellular telephone. However, PDAs have numerous shortcomings. For example, PDAs are expensive, often duplicate some of the computing power that already exists in the user's computer, sometimes require a subscription to a costly service, often require synchronization with a base station or personal computer, are difficult to use - both in terms of learning to use a PDA and in terms of a PDA's small screen and input devices requiring two-handed use - and have limited functionality as compared to a user's computer. As the amount of mobile computing power is increased, the expense and complexity of PDAs has increased concomitantly. In addition, because a conventional PDA stores the user's information on-board, a PDA carries with it the risk of data loss through theft or loss of the PDA.

[0007] Voice activated software is a growing trend. In general, however, software applications that are capable of recognizing speech are either impersonal, menu driven, server-based systems or are primarily intended for a user that is co-located with the computer. For example, voice recognition systems for call centers need to be run on powerful servers due to the systems' large size and complexity. Such systems are large and complex in part because they need to be able to recognize speech from speakers having a variety of accents and speech patterns. Such systems, despite their complex nature, are still typically limited to menu-driven responses, and cannot deliver information according to a user determined preference. In other words, a caller to a typical voice recognition software package must proceed through one or more layers of a menu to get to the desired functions, rather than being able to simply speak the desired request and have the system recognize the request.

[0008] Conventional methods for improving such software's ability to recognize diverse commands from a disparate number of voices typically involve providing a large speech vocabulary for the software to attempt to match to a spoken command. Using a large vocabulary, however, again requires a powerful computing device because of the many comparisons that would need to be made in order to match a sound, word or phrase in the large vocabulary to a spoken command. In contrast, conventional voice recognition software that is designed to run on a personal computer is primarily directed to dictation, and such software is further limited to being used while the user is in front of the computer and to accessing simple menu items that are determined by the software. Thus, conventional voice recognition software merely serves to act as a replacement for or a supplement to typical input devices, such as a keyboard or mouse. Under both regimes, the information that the user receives back in response to audio signals is often quite limited, not in an audio format, or is inflexible and impersonal.

[0009] Cellular telephones, and telephones in general, are pervasive. In developed countries (and even in some developing countries), a significant portion of the population has access to a land-line or cellular telephone. Moreover, Voice over Internet Protocol (VoIP) is rapidly emerging to make telephony even more ubiquitous. Recently, cellular telephones have begun to merge with other types of consumer electronics, such as digital cameras, PDAs, smartphones, digital music players, and the like to create portable, multi-purpose telecommunications devices that provide consumers with an ever-increasing array of features while "on the move." Some conventional cellular telephones have limited voice activation capability to perform simple tasks using audio commands such as calling the telephone of a specified person (the number is stored in the cellular phone). Similarly, some advanced cellular telephones can recognize sounds in the context of receiving simple commands. In such conventional systems, the software involved simply identifies a known command (i.e., sound) which causes the desired function to be performed, such as calling a desired person. In other words, a conventional system matches a sound to a desired function, without determining the meaning of the word(s) spoken.

[0010] Despite such advances, current multi-purpose telecommunication devices have various shortcomings. For example, such devices may be expensive to purchase and expensive to own, typically requiring users to incur significant monthly charges for data and/or voice services. In addition, current telecommunications devices often employ small screens for displaying information. Thus, displayed information that is retrieved by current telecommunications devices, using WAN Internet connectivity and/or other cellular-based networks, may be difficult to read. Aside from being an annoyance, such limitations may compromise user safety if, for example, users attempt to read the displayed information while driving a vehicle.

[0011] One way to overcome one or more of the foregoing shortcomings is to enable users to receive information using conventional aspects of the telecommunications device, i.e., enabling users to receive requested information in an audio form.

[0012] For example, Personal Audio Link® (PAL®) software from Adondo Corporation enables users to call their personal computer to request information and to retrieve the requested information in an audio form. However, to take advantage of such capabilities, users generally must keep their personal computer turned on and connected to a network, such as the Internet. For some users, this may not be practicable or even feasible.

[0013] Alternatively, users may call a service, such as Tellme from Tellme Networks, Inc., that employs an interactive voice response (IVR) system to receive and process user requests for information, such as weather reports, stock quotes, sports scores, etc., and to provide the information to the users in the form of an audio response. However, the information made available to the users is determined by the service provider. As such, users are unable to customize the service according to their personal preferences or needs. Thus, such services have limited flexibility as to what information can be retrieved by the users and do not provide users the opportunity for personalization.

[0014] Thus, as people require access to both private and public information it is desirable to have a reliable always-on system to readily retrieve information anywhere, anytime.

SUMMARY

[0015] The disclosed embodiments include an apparatus and method for providing network-based access to personalized user content. The apparatus may include a memory that is publicly-accessible via a communications network. The memory may store configuration information that is customized by a user and that defines personalized content of the user. The apparatus may further include an interface module in communication with the memory. The interface module may also be publicly-accessible via the communications network and may include a processor configured to access the configuration information to determine the personalized content to be provided to a remote user device associated with the user. The processor may further be configured to establish a communications connection with the remote user device via the communications network, access the personalized content, and provide, in an audio form, at least a portion of the personalized content to the remote user device via the communications network.

[0016] The method may include storing configuration information in a datastore that is publicly-accessible via a communications network. The configuration information may be customized by a user and may define personalized content of the user. The configuration information may be accessed to determine the personalized content to be provided to a remote user device associated with the user. The method may further include accessing the personalized content, establishing a communications connection with the remote user device via the communications network, and providing, in an audio form, at least a portion of the personalized content to the remote user device via the communications network.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] The foregoing and other aspects of the disclosed embodiments will be better understood from the following detailed description with reference to the drawings.

[0018] FIGs. 1 and 2 are diagrams of exemplary network configurations in which one or more embodiments may be implemented.

[0019] FIG. 3 is a block diagram of a web-based system for providing access to personalized user content.

[0020] FIGs. 4A, 4B, 4C and 5 are flow diagrams of exemplary methods for providing access to personalized user content according to one or more embodiments.

[0021] FIGs. 6A, 6B and 6C are illustrations of exemplary web-based user interfaces for configuring user preferences associated with the personalized user content.

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

[0022] The rapid development of the Internet and cellular technology has led users to demand access to information wherever they are at all times of the day or night. In addition, it is preferable that the user can determine what information they are getting and that the information can be personal information (e.g., e-mail, appointments, contacts, passwords, etc.), public information (e.g., news, stock quotes, music, podcasts, etc.) or enterprise information (e.g., client data, price lists, sales information, etc.).

[0023] The disclosed embodiments include a web-based system that enables a user to place a phone call, from a cell phone or any type of telephony device, to obtain the information they desire in an audio format. The system may be configured to respond to user commands, which can be transmitted via DTMF tone, voice, text, and the like. The system may then respond with the desired audio information.

[0024] Software applications, such as Adondo® Personal Audio Link (PAL®) software (which is described in U.S. Patent Application Nos. 10/529,415, 11/048,948, 11/300,042, the disclosures of which are hereby incorporated by reference in their entirety), have been designed for audio communication between a remote communications device, such as a cellular phone, and a personal computing device, such as a home personal computer. Although such software greatly increases a user's access to data, it generally requires that the user's personal computing device remain on and available for connection to the user's remote devices. Often a user's personal computing device is not available when the user is in a remote location. For example, the user's computer may not be left running all of the time, or firewalls may limit external access to data on the computer. Consequently, it is desirable to have data available from an alternative source that is "always-on."

[0025] A system and method for operatively connecting a user's telephony communications device with a publicly-accessible computer or server by way of telephony signals (e.g., voice, DTMF, or text SMS commands, etc.) is described herein. In one embodiment, a telephony communications device such as, for example, a cellular phone, a smartphone, or the like, may be used to transmit audio or spoken commands and responses to a web-based system, which may include a telephony server that operates in conjunction with a configuration server. The telephony server and the configuration server may be publicly-accessible (i.e., may be available to a variety of users as part of a web-based service or product).

[0026] In another embodiment, the configuration server may initiate a phone call to the user's telephony device by way of the same telephony server. Furthermore, the user can configure the data stored and/or defined on the configuration server. Thus, a session between the user and the configuration server may be personalized such that the user has access to data that is significant and/or customized to him or her.

[0027] As noted above, the web-based system may include a telephony server and a configuration server, though it will be appreciated that the functions of the telephony server and the configuration server may be consolidated on a single server. The telephony server may be configured to connect an incoming call from a user, access configuration information, including the user's unique and tailored configuration residing on the configuration server, recognize user commands transmitted in the form of DTMF signals, voice commands or commands via other telephony medium, obtain the desired information for the user, and communicate the information back to the user in an audio form. The configuration server may store each user's configuration information (i.e., personal preferences, personal data, etc.). When the user desires to obtain the information, the user may place a phone call, using any type of phone, to a phone number mapped to the telephony server. The call may then be connected over a telephony network to the telephony server. The telephony server may answer the phone call, possibly query the configuration server for information about the caller, retrieve the desired information from a data source for the user, and send a telephony message back to the user. Information from the configuration server may be temporarily cached to the telephony server while the telephony server is connected to the user. Alternatively, the activities performed by the telephony server and the configuration server may be combined and performed on one server so that the telephony server actually stores the configuration information directly.
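The call flow described in the preceding paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the class names ConfigurationServer and TelephonyServer, the dictionary-based configuration store, the command-to-source mapping, and the synthesize placeholder (which stands in for text-to-speech) are all hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigurationServer:
    """Hypothetical store of per-user configuration, keyed by caller number."""
    configs: dict = field(default_factory=dict)

    def lookup(self, caller_number):
        # Returns the user's configuration, or None for an unknown caller.
        return self.configs.get(caller_number)


@dataclass
class TelephonyServer:
    """Hypothetical front end that answers calls and returns audio responses."""
    config_server: ConfigurationServer
    data_sources: dict                     # source name -> callable returning text
    cache: dict = field(default_factory=dict)

    def answer(self, caller_number):
        # Query the configuration server once and cache the result for the call.
        config = self.config_server.lookup(caller_number)
        if config is None:
            return False
        self.cache[caller_number] = config
        return True

    def handle_command(self, caller_number, command):
        # Use the cached configuration to decide which data source to read.
        config = self.cache.get(caller_number)
        if config is None:
            return self.synthesize("No active session.")
        source_name = config["commands"].get(command)
        if source_name is None:
            return self.synthesize("Command not configured.")
        return self.synthesize(self.data_sources[source_name]())

    def hang_up(self, caller_number):
        # Drop the cached configuration when the call ends.
        self.cache.pop(caller_number, None)

    @staticmethod
    def synthesize(text):
        # Placeholder for text-to-speech: tags the text instead of producing audio.
        return "AUDIO<" + text + ">"


# Example usage with made-up data.
config_server = ConfigurationServer(
    {"5550100": {"commands": {"weather": "weather_feed"}}}
)
server = TelephonyServer(config_server, {"weather_feed": lambda: "Sunny, 20 degrees"})
server.answer("5550100")
print(server.handle_command("5550100", "weather"))  # AUDIO<Sunny, 20 degrees>
server.hang_up("5550100")
```

Keeping the cached configuration on the telephony side only for the duration of the call mirrors the temporary caching described above. Merging the two classes into one would correspond to the single-server alternative.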

[0028] The configuration server may be operatively connected to the telephony server and may execute a configuration program. The configuration program may be configured to identify the incoming communication as coming from a specific user when a phone transmission is answered by the telephony server. The user's personal configuration data may then be opened, and, in one embodiment, cached to the telephony server. An interface program on the configuration server and/or the telephony server may employ speech recognition software to recognize the user's spoken utterance and/or text-to-speech software to enable the user to access personalized user information such as appointment and/or email data, personal database information, favorite web-sites on the Internet or other network and the like; thereby enabling the user to retrieve data from multiple sources in audio format. It will be appreciated that the disclosed embodiments may enable a user to use a telephony communications device to communicate with his/her personalized data and preferences from any location.

[0029] For example, a user may operate a cellular phone to access his or her account on the web-based system. Upon establishing communications, the user may request any type of information the software component is configured to access ranging from stored e-mail to favorite podcasts or up-to-the-minute stock quotes. In another embodiment, the web-based system may contact the user by way of such cellular phone to, for example, notify the user of an appointment or the like, or provide the user with a breaking news alert. It will also be appreciated that the cellular phone may not perform any voice recognition or contain any of the user information that the user wishes to access. In fact, an "off-the-shelf" cellular phone or wired phone or the like may be used with a remote computer running software according to one embodiment of the present invention. As a result, the disclosed embodiments enable a user to use the extensive computing power of a telephony and/or configuration server from any location, and by using any of a wide variety of communications devices.

[0030] An example of such a web-based system is discussed below in connection with FIG. 1. Exemplary device configurations of the telephony server, the configuration server and one or more remote communications devices are discussed below in connection with FIG. 2. As noted above, an interface program may operatively interconnect a telephony device to other software programs and/or data files (such as an XML file) for the purpose of implementing one or more embodiments, and an exemplary configuration of such program and software is discussed below in connection with FIG. 3. An exemplary method of a user-initiated transaction is discussed below in connection with FIGs. 4A, 4B and 4C. An exemplary method of a computer-initiated transaction is discussed below in connection with FIG. 5. FIGs. 6A, 6B and 6C illustrate a web-based user configuration interface that allows a user to specify aspects of the configuration software.

[0031] Turning now to FIG. 1, an exemplary configuration server 100 is illustrated in which aspects of one or more embodiments may be implemented. Configuration server 100 may be any general purpose or specialized computing device capable of performing the methods discussed herein. It will be appreciated that the configuration server 100 may be configured in any number of ways while remaining consistent with an embodiment. The configuration server 100 may be a remote server (or interconnected servers) that is essentially always available to operatively connect with a telephony server 140.

[0032] For example, the configuration server 100 may be configured to accept an input regarding a telephony request via a signal from the telephony server 140. A configuration program 130 executing on the configuration server 100 may be accessed by the user with a data signal from the telephony server 140. Additionally, the configuration program 130 may be pre-configured by the user by way of a wide variety of methods, such as a software program residing on a personal computer, a web-based interface, or a telephone call. The configuration program 130 on the configuration server may be part of a system 300, which will be further discussed below in connection with FIGs. 3, 6A, 6B and 6C.

[0033] The configuration server 100 may be operatively connected to a network 120, which may include a public switched telephone network (PSTN), a cellular network, a Voice over Internet Protocol (VoIP) network, a radio network, a local area network (LAN), a wide area network (WAN), or the Internet. The configuration server 100 may include a processor 112 for data processing, a memory 110 for storing data, and an input/output (I/O) 114 for communicating with the network 120 and/or another communications medium such as a telephone line or the like. It will be appreciated that the processor 112 of the configuration server 100 may be a single processor, or may be a plurality of interconnected processors. The memory 110 may be, for example, random-access-memory (RAM), read-only-memory (ROM), a hard drive, CD-ROM, a universal serial bus (USB) storage device, or any combination thereof. In addition, the memory 110 may be located internal or external to the configuration server 100. The I/O 114 may be any hardware and/or software component that permits a user or external device to communicate with the configuration server 100. The I/O 114 may be a plurality of devices located internally and/or externally to the configuration server 100.

[0034] The configuration server 100 may be coupled to the telephony server 140, which may receive and send phone calls via a network 148 (which may be the same or different than the network 120 above). Thus, the network 148 may include a PSTN, a cellular network, a VoIP network, a radio network, a LAN, a WAN, or the Internet. Like the configuration server 100, the telephony server 140 may include a processor 142, a memory 141, and an I/O 144 for communicating with the network 148. It will be appreciated that the processor 142 of telephony server 140 may be a single processor, or may be a plurality of interconnected processors. The memory 141 may be, for example, RAM, ROM, a hard drive, CD-ROM, USB storage device, or any combination thereof. In addition, the memory 141 may be located internal or external to the telephony server 140. The I/O 144 may be any hardware and/or software component that permits a user or external device to communicate with the telephony server 140.

[0035] The telephony server 140 may include an interface program 146 that enables the telephony server 140 to answer a telephony call and translate the audio information received on that call into digital information that can be used by either the configuration server 100 or by the telephony server 140. Like the configuration program 130, the interface program 146 may be part of the system 300 described in connection with FIG. 3 below.

[0036] In a preferred embodiment, the telephony server 140 may be a VoIP server that is receiving digitized internet protocol (IP) packets over the Internet. In one example, the telephony server 140 may receive a phone call from a user, and request information about the user in order to answer the phone call from the user. For example, the telephony server 140 may be programmed with caller identification software to recognize that a call from a specific number is from John Smith. When the telephony server 140 receives the phone call, it may request information from the configuration server 100 about John Smith, such as a PIN and other personal configuration information about John Smith. The telephony server 140 and the configuration server 100 may then work together to respond to John Smith's requests. In a preferred embodiment, the programs handling the large bandwidth associated with audio files reside on the telephony server 140 while as much other computational work and data storage as possible is handled by the configuration server 100. Furthermore, as noted above, the configuration server 100 and the telephony server 140 may be the same computer or server.
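The caller identification and PIN lookup in the John Smith example might look like the following Python sketch. The directory contents, the phone number, the PIN value, and the function names are hypothetical and exist only for illustration. The PIN is stored in plain text here purely to keep the example short; a deployed system would more likely store a salted hash.

```python
import hmac

# Hypothetical directory mapping caller numbers to user records.
USERS = {
    "+12155550100": {"name": "John Smith", "pin": "4821"},
}


def identify_caller(caller_number, directory=USERS):
    """Return the user record for a caller number, or None if unknown."""
    return directory.get(caller_number)


def verify_pin(user, entered_digits):
    """Compare the digits keyed in by the caller with the stored PIN."""
    # compare_digest performs a constant-time comparison of two ASCII strings.
    return hmac.compare_digest(user["pin"], entered_digits)


def authenticate(caller_number, entered_digits, directory=USERS):
    """Return the user's name when both caller ID and PIN match, else None."""
    user = identify_caller(caller_number, directory)
    if user is None:
        return None
    return user["name"] if verify_pin(user, entered_digits) else None
```

In this sketch the caller number alone selects the account and the PIN acts as a second check, so a call from a recognized number with a wrong PIN is rejected in the same way as a call from an unknown number.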

[0037] As shown in FIG. 2, a user may call a telephone number corresponding to a virtual telephone 206 that resides within the telephony server 140 by way of a remote telephone 204 or a cellular phone 208. The remote telephone 204 and the cellular phone 208 may include any type of device configured for wired and/or wireless telephony-based communications. Examples include, but are not limited to, a cellular telephone, a PDA, a smartphone, a cordless telephone, a corded telephone, a personal computer having telephony software, a personal computer having a VoIP connection, a personal computer having instant messaging software, and an automobile telephony system (e.g., an integrated telephony system within an automobile that may communicate with a network via a radio, a cellular, or a satellite signal).

[0038] In such an embodiment, the telephony server 140 may monitor all incoming calls for a predetermined signal or the like, and upon detecting such signal, the telephony server 140 may forward such information from the call to the configuration program 130 or other software component on the configuration server 100. In such a manner, the configuration server 100 may, upon receiving data about the telephony call from the telephony server 140, retrieve stored information from a datastore that is on or connected to the configuration server 100 or the network 120. The configuration server 100 may send a digitized answer to the telephony server 140 where the information may be formatted into a telephony response and delivered to the user via a telephony medium. Conversely, the configuration server 100 may initiate a conversation with the user by calling the user at either the remote telephone 204 or the cellular phone 208. As may be appreciated, the configuration server 100 may operatively connect to the telephony server 140, which may use the virtual telephone 206. In another embodiment, VoIP software may reside in the configuration server 100, and the virtual phone 206 will be directly connected by the configuration server 100.
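For the server-initiated case, in which the configuration server calls the user, one possible trigger is an upcoming appointment. The Python sketch below only decides which calls are due and what would be spoken. The appointment record layout, the 15-minute lead time, and the function name due_reminders are assumptions made for illustration, and actually placing the call through the telephony server is outside the scope of the sketch.

```python
from datetime import datetime, timedelta


def due_reminders(appointments, now, lead_time=timedelta(minutes=15)):
    """Return (phone_number, message) pairs for appointments starting within lead_time."""
    calls = []
    for appt in appointments:
        time_until = appt["start"] - now
        # Skip appointments that have already started or are too far away.
        if timedelta(0) <= time_until <= lead_time:
            minutes = int(time_until.total_seconds() // 60)
            message = f"Reminder: {appt['title']} starts in {minutes} minutes."
            calls.append((appt["phone"], message))
    return calls


# Example usage with made-up appointments.
now = datetime(2008, 2, 8, 9, 0)
appointments = [
    {"phone": "5550100", "title": "Staff meeting", "start": datetime(2008, 2, 8, 9, 10)},
    {"phone": "5550101", "title": "Lunch", "start": datetime(2008, 2, 8, 12, 0)},
]
for number, message in due_reminders(appointments, now):
    print(number, message)  # 5550100 Reminder: Staff meeting starts in 10 minutes.
```

A scheduler would run such a check periodically and hand each resulting pair to the telephony server, which would dial the number and render the message as audio.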

[0039] It will be appreciated that the remote telephone 204 and the cellular phone 208 may be any type of device for reproducing telephony signals at a distance in which sound is converted into electrical impulses (in either analog or digital format) and transmitted either by way of wire or wirelessly by, for example, a cellular network or the like. As may be appreciated, the use of a telephone for remotely accessing the configuration server 100 may ensure relatively low cost and ready availability of handsets for the user. In addition, any type or number of peripherals may be employed in connection with the remote telephone 204 and the cellular phone 208, and any such type of peripheral is equally consistent with one or more embodiments. In addition, any type of filtering or noise cancellation hardware or software may be used - either at the remote telephone 204, the cellular phone 208, the configuration server 100, and/or the telephony server 140 - so as to increase the signal strength and/or clarity of the signal received from the remote telephone 204 and the cellular phone 208. It will be appreciated that in an alternate embodiment, a Session Initiation Protocol (SIP) telephone from a remote computer may be used to communicate via the telephony server 140 to the user.

[0040] It will be further appreciated that while one or more embodiments have been described in the context of a single user operating the remote telephone 204 or the cellular phone 208, any number of users and any number of remote telephones 204 and cellular phones 208 are consistent with an embodiment, and that telephony server 140 may be configured to handle a large number of incoming and outgoing telephony messages simultaneously.

[0041] Preferably such telephone communication may be configured by way of a VoIP connection. In such an embodiment, any remote phone 204 and/or cellular phone 208 may be able to dial a number that corresponds to the IP address of the telephony server 140. Often this number is a 10-digit number that is similar to a standard phone number. In other cases, the number may be an abbreviated dialing code. The telephony server 140 may then connect the configuration program 130 on the configuration server 100 by way of a connection 150, which may include an electronic connection via a PSTN, a cellular network, a VoIP network, a radio network, a LAN, a WAN, the Internet, and/or an electronic bus. In many embodiments, the interface program 146 accesses data and other programs from a variety of sources. In some embodiments, the interface program 146 may reside on the telephony server 140. In other embodiments, the interface program 146 may reside on the configuration server 100. In still other embodiments, the configuration server 100 and the telephony server 140 may be the same machine or device. Thus, as will be appreciated by one skilled in the art, the configuration program 130 and the interface program 146 may each be part of the system 300, which may be executed by one or more computing devices, such as the configuration server 100 and the telephony server 140. The system 300 is discussed in greater detail below in connection with FIGs. 3, 6A, 6B and 6C.

[0042] Turning now to FIG. 3, a block diagram of an exemplary system 300 is illustrated. As may be appreciated, in one embodiment, the system 300 may be modular in nature and some of the software may be executed by the telephony server 140 and some of the software may be executed by the configuration server 100. In such a manner, the computing power of the configuration server 100 and the telephony server 140 is utilized, rather than attempting to implement such software on a remote communications device such as, for example, the remote telephone 204 and the cellular phone 208 discussed above in connection with FIG. 2. It will be appreciated that each system component illustrated in FIG. 3 may be operatively connected to at least one other system component (as illustrated by the dotted lines). In addition, it will be appreciated that FIG. 3 illustrates only one embodiment, as other configurations of components are consistent with one or more embodiments as well. It will be appreciated that the software components illustrated in FIG. 3 may be stand-alone programs, application program interfaces (APIs) or the like. In addition, some software components already may be present on the configuration server 100 and/or the telephony server 140, thus substantially lowering costs, reducing complexity, saving hard disk space, and improving efficiency.

[0043] A telephony input 302 is any type of component that permits a user to communicate by way of telephony signal (e.g., spoken utterance, DTMF signals, text messages) with the telephony server 140 via, for example, input devices as discussed above in connection with FIG. 2. Likewise, a telephony output 304 is provided for outputting electrical signals as sound for a user to hear. It will be appreciated that both the telephony input 302 and the telephony output 304 may be adapted for other purposes such as, for example, receiving and transmitting signals to the remote telephone 204, the cellular phone 208, and/or the network 120, including having the functionality necessary to establish a connection by way of the remote telephone 204, the cellular phone 208, and/or the network 120.

[0044] Also provided is a voice recognition software 310 which, as the name implies, is adapted to accept an electronic signal - such as a signal received by telephony input 302 - wherein the signal represents a spoken utterance by a user, and to decipher such utterance. The voice recognition software 310 may be, for example, any type of specialized or off-the-shelf voice recognition software. The voice recognition software 310 may include user training for better-optimized speech recognition. In other embodiments, it may be preferable to use a DTMF (dual tone multi-frequency) recognizer in place of, or in addition to, voice recognition software. In addition, a text-to-speech engine 315 for communicating with a user is illustrated. The text-to-speech engine 315, in an embodiment, may generate spoken statements from electronic data (generally text-based) that are then transmitted to the user. In an embodiment as illustrated in FIG. 3, a natural language processing module 325 and a natural language synthesis module 330 may be provided to interpret and construct, respectively, spoken statements.

[0045] User data 320 may include any kind of information that is stored or accessible to the configuration server 100 and/or the telephony server 140, and that may be accessed and used in accordance with one or more embodiments. For example, a personal information data file 322 may be any type of computer file that contains any type of information. Email, appointment files, personal information and the like are examples of the type of information that is stored in the personal information data file 322. Additionally, the personal information data file 322 may be a type of file such as, for example, a spreadsheet, database, document file, email data, and so forth. Furthermore, the personal information data file 322 (as well as a network-based data file 324 discussed below) may be able to perform tasks at the user's direction such as, for example, print a document, send a fax, send an e-mail, interface with communications devices and/or systems, and so forth. Such functionality may be included in the personal information data file 322 and the network-based data file 324, or may be accessible to the personal information data file 322 and the network-based data file 324 by way of, for example, the telephony input 302 and the output 304, the Input/Output 350, and/or the like. It will be appreciated that an interface program 301 (which may correspond to the interface program 146) within the system 300 may be able to carry out such tasks using components, such as those discussed above, that are internal to the configuration server 100 and/or the telephony server 140, or the interface program 301 may interface - using the telephony input 302 and the output 304, the Input/Output 350, and/or the like - with devices external to the system 300.

[0046] An additional file that may be accessed by the system 300 on behalf of a user is the network-based data file 324. The network-based data file 324 may include text streams, queries, XML data, or other functionality that accesses the network 120, such as the Internet, to obtain up-to-date information for the user. Such information may be, for example, stock prices, weather reports, news, music, and the like. As will be appreciated, the term user data 320 as used herein refers to any type of data file including the personal information data file 322 and/or the network-based data file 324. A data file interface 335 is provided to permit the interface program 301 to access the user data 320. As may be appreciated, there may be a single data file interface 335, or multiple data file interfaces 335, which may interface only with specific files or file types. Also, in one embodiment, a system clock 340 is provided for enabling the interface program 301 to determine time and date information. Operatively connected (as indicated by the dotted lines) to the aforementioned system components is the interface program 301. Details of an exemplary user interface associated with such interface program 301 are discussed below in connection with FIGs. 6A, 6B and 6C. However, the interface program 301 itself may be either a stand-alone program, or a software component that orchestrates the performance of tasks in accordance with one or more embodiments. For example, the interface program 301 may control the other software components, and also control what user data 320 is open and what "grammars" (expected phrases to be uttered by a user) are listened for.

[0047] It will be appreciated that the interface program 301 need not itself contain the user data 320 in which the user is interested. In such a manner, the interface program 301 may remain a relatively small and efficient program that can be modified and updated independently of any user data 320 or other software components as discussed above. In addition, such a modular configuration enables the interface program 301 to be used in any computer that is running any type of software components. As a result, many compatibility concerns are alleviated.

[0048] Furthermore, it will be appreciated that the modular nature of the system 300 may allow some of the software to reside on the telephony server 140, and some of the software to reside on the configuration server 100. In some cases, the system 300 may reside entirely on one computer or server. In addition, the modularity of the system 300 may allow great flexibility in the configuration of the configuration server 100 and the telephony server 140, optimizing each to handle a significant number of simultaneous users.

[0049] It will also be appreciated that the modular nature of one or more embodiments may allow for the use of virtually any voice recognition software 310 (or DTMF recognition software). However, the large variances in human speech patterns and dialects often limit the accuracy of the voice recognition software 310. Thus, in one embodiment, the accuracy of the voice recognition software 310 may be improved by limiting the context of the spoken material the voice recognition software 310 is recognizing. For example, if the voice recognition software 310 is limited to recognizing words from a particular subject area, the voice recognition software 310 may be more likely to correctly recognize an utterance - that may sound similar to any number of unrelated words - as a word that is related to the desired subject area. Therefore, in one embodiment, the user data 320 that may be accessed by the interface program 301 may be configured and organized in such a manner as to perform such context limiting. Such configuration can be done in the user data 320 itself, rather than requiring a change to the interface program 301 or other software components as illustrated in FIG. 3.

[0050] For example, a spreadsheet application such as Microsoft® Excel® or the like may be configured for storing and accessing data in a manner suitable for use with the interface program 301. Script files, alarm files, look-up files, command files, solver files and the like are all types of spreadsheet files that are available for use in one or more embodiments.

[0051] A script file is a spreadsheet that provides for a spoken dialogue between a user and the configuration server 100 and/or the telephony server 140. For example, in one embodiment, one or more columns (or rows) of a spreadsheet represent a grammar that may be spoken by a user - and therefore will be recognized by the interface program 301 - and one or more columns (or rows) of the spreadsheet represent the response of the configuration server 100 and/or the telephony server 140. Thus, if a user says, for example, "hello," the configuration server 100 and/or the telephony server 140 may say "hi" or "good morning" or the like. Such a script file thereby enables a more user-friendly interaction with the configuration server 100 and/or the telephony server 140.
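By way of illustration only, the grammar-and-response arrangement of a script file may be sketched as follows. The sketch assumes a two-column layout exported as comma-separated text; the column names `grammar` and `response`, the sample rows, and the function names are hypothetical and are not taken from any particular embodiment.

```python
import csv
import io

# Hypothetical script-file contents: the first column holds a grammar
# (an expected utterance) and the second column holds the spoken response.
SCRIPT_CSV = """grammar,response
hello,hi
good morning,good morning to you
"""


def load_script(text):
    """Build a grammar -> response map from spreadsheet-style rows."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["grammar"].strip().lower(): row["response"] for row in reader}


def respond(script, utterance):
    """Return the scripted response, or None if the utterance is not a known grammar."""
    return script.get(utterance.strip().lower())


script = load_script(SCRIPT_CSV)
```

In such a sketch, a recognized utterance is simply looked up in the table, so that the dialogue may be changed by editing the spreadsheet rather than the program.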

[0052] An alarm file, in one embodiment, has entries in one or more columns (or rows) of a spreadsheet that correspond to a desired function. For example, an entry in the spreadsheet may correspond to a reminder, set for a particular date and/or time, for the user to take medication, attend a meeting, etc. Thus, the interface program 301 may interface with a component such as the telephony output 304 to contact the user and inform him or her of the reminder. Thus, it will be appreciated that an alarm file is, in some embodiments, always active because it may be running to generate an action upon a predetermined condition.

[0053] A look-up file, in one embodiment, is a spreadsheet that contains information or is cross-referenced to information. In one embodiment, the information may be contained entirely within the look-up file, while in other embodiments the look-up file may reference information from data sources outside of the look-up file. For example, spreadsheets may contain cells that reference data that is available on the Internet (using, for example, "smart tags" or the like), and that can be "refreshed" at a predetermined interval to ensure the information is up-to-date. Therefore, a look-up file may be used to find information from a network such as, for example, stock quotes, sports scores, weather conditions and the like.

[0054] As noted above, a script file represents a simple application of spreadsheet technology that may be leveraged by the interface program 301 to provide a user with the desired information or to perform the desired task. It will be appreciated that, depending on the particular voice recognition software 310 being used in an embodiment, the syntax of such scripts affects what such software is listening for in terms of a spoken utterance from a user. One or more embodiments may provide flexible grammars, as well as a user-friendly way of programming such grammars, so a user does not have to remember an exact statement that must be spoken in order to cause the configuration server 100 and/or the telephony server 140 to yield a desired response.

[0055] One embodiment is configured so as to only open, for example, a look-up file when requested by a user. In such a manner, the number of grammars that the configuration server 100 and/or the telephony server 140 must potentially decipher is reduced, thereby increasing the speed and reliability of any such voice recognition. In addition, such a configuration also frees up the resources of the configuration server 100 and/or the telephony server 140 for other activities. If a user desires to open such a file, the user may issue a verbal command such as, for example, "look up stock prices" or the like. The configuration server 100 and/or the telephony server 140 may then determine which of the personal information data file 322 or the network-based data file 324 corresponds to the spoken utterance and open it. The configuration server 100 and/or the telephony server 140 may then inform the user, by way of a verbal cue, that the data is now accessible.
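As a minimal sketch of such context limiting, the set of grammars listened for may be kept small until a file is opened. The file name `stocks`, the sample phrases, and the function names below are assumptions made for illustration only.

```python
# Hypothetical commands that are always listened for; each opens one file.
BASE_GRAMMARS = {"look up stock prices": "stocks"}

# Hypothetical grammars that become active only while a given file is open.
FILE_GRAMMARS = {"stocks": {"quote for acme", "close file"}}


def active_grammars(open_file):
    """Return the phrases to listen for, given which file (if any) is open."""
    grammars = set(BASE_GRAMMARS)
    if open_file is not None:
        grammars |= FILE_GRAMMARS[open_file]
    return grammars


def handle(utterance, open_file):
    """Open a file when its command is spoken; otherwise keep the current file."""
    return BASE_GRAMMARS.get(utterance, open_file)
```

With no file open, only the opening command is a candidate for recognition; the file-specific phrases are added once the corresponding file has been opened.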

[0056] Therefore, in such an exemplary configuration as illustrated in FIG. 3, the interface program 301, according to an embodiment, may be able to send information to and receive such information from a user. Such information may contain the user data 320 that may be contained within the configuration server 100 and/or the telephony server 140 (such as, for example, in the memory 110 or the memory 141, respectively), or in the network 120. A method of performing such tasks is discussed below in connection with FIGs. 4A, 4B, 4C and 5.

[0057] Turning now to FIGs. 4A, 4B and 4C, flowcharts of an exemplary method of a user-initiated transaction in accordance with an embodiment are shown. As was noted in the discussion of alarm files in connection with FIG. 3 above, it will be appreciated that in one embodiment the interface program 301, by way of the telephony output 304, is able to initiate a transaction as well. An example of such a transaction is discussed below in connection with FIG. 5.

[0058] At step 405 of FIG. 4A, a user may establish communications with the system 300 by way of the telephony server 140. Such an establishment may take place, for example, by the user calling the telephony server 140 by way of the cellular phone 208 as discussed above in connection with FIG. 2. It will be appreciated that such an establishment may also have intermediate steps that may, for example, establish a security clearance to access the user data 320 or the like. At optional step 410, a "spoken" prompt may be provided to the user. Such a prompt may simply be to indicate to the user that the configuration server 100 and/or the telephony server 140 is ready to listen for a spoken utterance, or such prompt may comprise other information such as a date and time, or the like.

[0059] At step 415, a user request may be received by way of, for example, the telephony input 302. At step 420, the user request may be parsed and/or analyzed to determine the content of the request. Such parsing and/or analyzing may be performed by, for example, the voice recognition software 310 and/or the natural language processing module 325. At step 425, the desired function corresponding to the user's request may be determined. It will be appreciated that steps 410-425 may be repeated as many times as necessary for the voice recognition software 310 to recognize the user's request. Such repetition may be necessary, for example, when the communications channel by which the user is communicating with the configuration server 100 and/or the telephony server 140 is of poor quality, the user is speaking unclearly, or for any other reason.

[0060] If the determination of step 425 is that the user is requesting existing information or for the configuration server 100 to perform an action, the method may proceed to step 430 of FIG. 4B. For example, the user may wish to have the telephony server 140 (after the information is accessed from the configuration server 100) read his or her appointments for the following day. Alternatively, the user may wish to find out current stock quotes. If instead the determination of step 425 is that the desired function corresponding to the user request is to add or create data, the method may proceed to step 450 of FIG. 4C. For example, the user may wish to record a message, enter a new phone number for an existing or new contact, and/or the like.

[0061] Thus, and turning now to FIG. 4B, at step 430 the requested user data 320 may be selected and retrieved by the interface program 301. As noted above in connection with FIG. 3, the appropriate data file interface 335 may be activated by the interface program 301 to interact with user data 320 and access the requested information. It will be appreciated that the determination of step 425 may result in a determination that the user is requesting a particular action be performed. For example, the user may wish to initiate a phone call. In such an embodiment, the interface program 301 directs Session Initiation Protocol (SIP) softphone software by way of the telephony input 302 and the telephony output 304, the Input/Output 350, and the like to place a call to a telephone number as directed by the user. In another embodiment, the user may request a call to a telephone number that resides in, for example, the Microsoft® Outlook® or other contact database. In such an embodiment the user requests that the interface program 301 call a particular name or other entry in the contact database and the interface program 301 causes the SIP softphone to dial the phone number associated with that name or other entry in the contact database. It will be appreciated that, while the present discussion relates to a single telephone call, any number of calls may be placed or connected, thereby allowing conference calls and the like.

[0062] When placing a call in such an embodiment, the interface program 301 may initiate, for example, a conference call utilizing the SIP phone, such that the user and one or more other users are connected together on the same line and, in addition, have the ability to verbally issue commands and request information from the program. Specific grammars would enable the program to "listen" quietly to the conversation among the users until the interface program 301 is specifically requested to provide information and/or perform a particular activity. Alternatively, the interface program 301 may "disconnect" from the user once the program has initiated the call to another user or a conference call among multiple users.

[0063] As discussed above in connection with FIG. 4A, the user may desire to add or create data instead of simply requesting to retrieve such data or take a specified action. Thus, referring now to FIG. 4C, at step 450 the user data 320, in the form of a new database, spreadsheet or the like - or as a new entry in an existing file - may be selected or created in accordance with the user instruction received in connection with FIG. 4A. At step 452, a spoken prompt is provided to the user, whereby the user is instructed to speak the new data or instruction. At step 454, the user response is received, and at step 456, the response may be parsed and/or analyzed. At step 458, the spoken data or field may be added to the user data 320 that was created or selected in step 450. At optional step 460, if necessary, a spoken prompt is again provided to the user to request additional new data. At optional step 462, such data is received in the form of the user's spoken response, and at optional step 464, such response may be parsed and/or analyzed. At step 466, a determination may be made as to whether further action is required. If so, the method returns to step 458 to add the spoken data or field to the user data 320. If no further action is required, at step 468 the conversation ends or is placed in a standby mode to await further user input. It will be appreciated that such prompting and receipt of user utterances takes place as discussed above in connection with FIGs. 4A and 4B.

[0064] In contrast to the method described above in connection with FIGs. 4A, 4B and 4C, the method of FIG. 5 is an exemplary method of a computer-initiated transaction in accordance with an embodiment. Accordingly, at step 500 the user data 320 may be monitored. As may be appreciated, multiple instances of user data 320 may be monitored by the interface program 301 such as, for example, an alarm file, an appointment database, an email/scheduling program file and the like. In addition, external data may be monitored, such as news alerts, school closings, etc. At step 505, a determination may be made as to whether the user data 320 being monitored contains an action item (such as a wake-up call), or a user alert is deemed necessary (such as a significant change in a stock price). It will be appreciated that in an embodiment the interface program 301 may be adapted to use the system clock 340 to, for example, review entries in a database and determine which currently-occurring items may require action. If no action items are detected, the interface program 301 may continue monitoring the user data 320 at step 500. If the user data 320 does contain an action item, the interface program 301, at step 510, may initiate a communication with the user. Such an initiation may take place, for example, by the interface program 301 causing a software component to contact the user by way of the remote telephone 204 or the cellular phone 208.
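The determination of step 505 may be sketched, under stated assumptions, as a comparison of stored entries against the current time. The entry fields `what`, `when`, and `done`, the sample entries, and the function name are hypothetical; an actual embodiment may store and evaluate action items differently.

```python
from datetime import datetime


def due_action_items(entries, now):
    """Return the entries whose scheduled time has arrived and that are not yet handled."""
    return [e for e in entries if not e["done"] and e["when"] <= now]


# Hypothetical alarm-file entries for illustration.
entries = [
    {"what": "wake-up call", "when": datetime(2008, 2, 8, 6, 30), "done": False},
    {"what": "take medication", "when": datetime(2008, 2, 8, 9, 0), "done": False},
]
```

In such a sketch, an empty result corresponds to returning to step 500 to continue monitoring, while a non-empty result corresponds to proceeding to step 510 to contact the user.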

[0065] At step 515, a spoken prompt may be issued to the user. For example, upon the user answering his or her cellular phone 208, the interface program 301 may cause the text-to-speech engine 315 to generate a statement regarding the action item. It will be appreciated that other, non-action-item-related statements may also be spoken to the user at such time such as, for example, security checks, predetermined pleasantries, and the like. At step 520, the user response is received, and at step 525, the response may be parsed and/or analyzed as discussed above in connection with FIGs. 4A and 4B. At step 530, a determination may be made as to whether further action is required, based on the spoken utterance. If so, the method returns to step 515. If no further action is required, at optional step 535 the interface program 301 may make any adjustments that need to be made to the user data 320 to complete the user's request such as, for example, causing the data file interface 335 to save changes or settings, set an alarm, and the like. The interface program 301 may then return to step 500 to continue monitoring the user data 320. It will be appreciated that the user may disconnect from the configuration server 100 and/or the telephony server 140, or may remain connected to perform other tasks. In fact, the user may then, for example, issue instructions that are handled according to the method discussed above in connection with FIGs. 4A, 4B and/or 4C.

[0066] Thus, it will be appreciated that the interface program 301 is capable of both initiating and receiving contact from a user with respect to the user data 320 stored on or accessible to the configuration server 100 and/or the telephony server 140. It will also be appreciated that the interface program 301, in some embodiments, may run without being seen by the user, as the user accesses the configuration server 100 and/or the telephony server 140 remotely. In another embodiment, the user may have access to the interface program 301 by way of a web-interface. In yet another embodiment, the user may have local software running on a local computer that connects via a network (such as the network 120) to the configuration server 100, the telephony server 140, and/or the interface program 301. However, the user may have to configure or modify the interface program 301 so as to have the interface program 301 operate according to the user's preferences. Furthermore, it is possible that the interface program 301 may be configured using telephony signals only.

[0067] In one embodiment, the interface program 301 on the system 300 may access certain kinds of data, such as pre-existing audio files. For example, the user may call a specific phone number, then press *5 on their phone, and hear a desired podcast played over the remote telephone 204 and/or the cellular phone 208. When the user initially signs up for the web-based system, the user may provide log-in information such as a username and password, which the user can later use to access the configuration interface. In addition, the user may provide personal profile information such as address, zip code, phone number, birth date, interests, favorite retail stores, favorite products, email address, server information, etc. The configuration interface may be accessed by way of a computer using a Web page, a computer program that is resident on a local computer (such as PAL®) or by a telephony device. When the user is connected with the interface program 301, the interface program 301 may retrieve the data from another source such as the Internet, and either send a pre-existing audio file to the user or, by way of text-to-speech, convert text data to an audio signal for the user.

[0068] FIG. 6A illustrates an embodiment of a web-based interface wherein the user may provide data to the system 300, including security information such as a PIN, to be stored in the configuration file. Use of such simple data as zip code may enable the user to receive publicly available information that is of personal interest to the user, such as appropriate weather reports, travel advisories, and movie times. FIG. 6B is an illustration of how a web-based program may show the user current data files that can be accessed using DTMF tones in the user's current configuration. A large number of DTMF tones may be pre-selected by the user to obtain data from a user-designated (or default) URL or other defined storage location. FIG. 6C is illustrative of one configuration of a web-based program that allows the user to set the URLs for the interface program 301 to access when the user sends a command.

[0069] In another embodiment, the system 300 would incorporate the telephony server 140, the configuration server 100, and the user's personal computer with software such as PAL® to allow audio communication with the user's personal computer, which may be connected to the Internet. When the user calls the system 300, the system 300 may first attempt to route the call to the user's personal computer. If the user's personal computer were on and connected to the Internet, the user may hear desired information by way of a program such as PAL®. If the user's personal computer were not available, then the telephony server 140 may connect the user's incoming call, access configuration information, including the user's unique and tailored configuration residing on the configuration server 100, recognize user commands transmitted in the form of DTMF signals, voice commands or commands via another telephony medium, obtain the desired information for the user, and communicate the information back to the user via network 148. The user may dial the same number; offline access to the configuration server 100 has no special number. On answering, the system 300 may answer the caller in a personalized manner, or may require an ID, such as a DTMF personal identification number. When the user's personal computer is connected to the Internet, specified data may, in one embodiment, be synchronized between the user's personal computer and the configuration information stored on the configuration server 100. In another embodiment, the user's personal computer may connect to a VoIP network, such as the network 120, and then to the configuration server 100.

[0070] In another embodiment, the system 300 may be capable of connecting the user to other people on a telephony network via a phone call. The user may request by way of telephony signal to make a phone call to one of his/her contacts, or to some other number available to the configuration server 100. The interface program 301 may confirm that the user wants to place this phone call, and then place the outgoing phone call by way of the telephony server 140. The system 300 may remain on the line, as in a conference call, so that after the user has finished the call to the contact, the system 300 may respond to further calls. Alternatively, the system 300 may just transfer the call to the contact.

[0071] In a preferred embodiment, the system 300 may include control commands that allow the user to navigate throughout the system 300. Such commands may include: Pause, Skip, Resume, Go Back, Yes, No, Place Call, etc. The system 300 may be programmed to interpret user commands/requests communicated in a variety of ways. In addition to DTMF touch tone key presses, speech recognition may be used by the system 300 as a means to interpret user requests. The system 300 may also interpret commands communicated via other means including, but not limited to: email, text messages, voice messages, web page entries, etc. In some embodiments, there may be duplication of commands; for example, DTMF 5 (a single key) may be the key for Pause/Resume (i.e., interrupt or break), while hitting 1-4 means "Yes" and hitting 7-# means "No" on a conventional keypad. Table 1 is an example of configuration commands that may be recognized by the interface program 301.

Table 1 - System Configuration Commands
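By way of illustration only, the duplicated key assignments described in paragraph [0071] may be sketched as follows. The sketch assumes that "1-4" denotes keys 1 through 4 and that "7-#" denotes keys 7, 8, 9, *, 0 and # of a conventional keypad; that reading, and all names below, are assumptions and do not reproduce the contents of Table 1.

```python
# Assumed reading of the example: 5 toggles Pause/Resume, keys 1-4 mean "Yes",
# and the bottom two keypad rows (7, 8, 9, *, 0, #) mean "No".
YES_KEYS = set("1234")
NO_KEYS = set("789*0#")


def interpret_key(key, paused):
    """Return (command, paused) for a single DTMF key press."""
    if key == "5":
        # One key serves two commands depending on the current playback state.
        return ("Resume" if paused else "Pause", not paused)
    if key in YES_KEYS:
        return ("Yes", paused)
    if key in NO_KEYS:
        return ("No", paused)
    return (None, paused)  # key 6 is unassigned in this sketch
```

Assigning several adjacent keys to the same command may make a response easier to enter without looking at the keypad.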

[0072] In a preferred embodiment, new users may be provided with a default configuration that will appear in their personal configuration interface and that will be stored on the configuration server 100. The configuration may include an identification of the content, media, and information that the user can access as well as the commands that the user may communicate to the system 300 to access the information. For example, in the default configuration, the DTMF tone *1 may prompt the system 300 to read the user's email. In this case, when the user calls the system 300 and presses *1 on their phone, the system 300 may read their email over the phone. The voice command "Play CNN News" may be designated to play a CNN News podcast. In this case, when the user calls the system 300 and says "Play CNN News" the system 300 may play the CNN News podcast over the phone. The information that the user provides in their personal profile may determine either part or all of the user's default configuration. For example, based on a user's zip code, their default configuration may include specific DTMF tones assigned to access instant traffic reports for the major roads in the city where they live as well as the local weather forecast. The podcasts, RSS feeds and radio stations that the user may access in the default configuration may be tailored based on the user's personal interests. The user's birth date may determine the horoscope that is included as part of their default configuration.

[0073] In another embodiment, in providing increased personalization and flexibility for the user, the user may modify his or her personal default configuration to specify, for example, the content, media, and information that they want to access from the system 300 and the commands that they want to communicate to access that information. For example, the user may specify the URL for a blog or RSS feed that they want to access and the DTMF tone (e.g., #7) that they want to press to hear the blog read to them after they call the system 300. They may also, for example, identify the podcast subscription site URL so they can hear a particular episode.
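A minimal sketch of a profile-derived default configuration and a subsequent user modification is given below. The dictionary layout, the action names, the assignment of *2 to weather, and the example.com URL are all hypothetical; only the *1 and #7 key examples come from the description above.

```python
def default_configuration(profile):
    """Build a default command map, tailored by profile fields when present."""
    config = {"*1": {"action": "read_email"}}
    if "zip_code" in profile:
        # Hypothetical key assignment for a local weather forecast.
        config["*2"] = {"action": "weather", "zip_code": profile["zip_code"]}
    return config


def apply_user_changes(config, changes):
    """Return a new command map in which user-specified entries override defaults."""
    merged = dict(config)
    merged.update(changes)
    return merged


defaults = default_configuration({"zip_code": "19103"})
custom = apply_user_changes(
    defaults,
    {"#7": {"action": "read_feed", "url": "https://example.com/feed.rss"}},
)
```

Because the merge returns a new map, the default configuration remains available should the user later wish to revert a change.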

[0074] The user configuration information may be edited in a variety of ways. In one embodiment, the configuration information may be edited via a web interface. In another embodiment, the user may provide answers to audio questions using a telephony device to edit configuration information. In another embodiment, the user may call the system 300 and press DTMF tones on their phone to change their configuration information. In another embodiment, audio signals may be used to change the configuration information. In a more advanced embodiment, even an e-mail message or a text message may be used to adjust the configuration information. Such a message may be authored by the user, by another individual, or by some other device or the system 300. In another embodiment, the user may edit their touch-tone settings in their version of desktop software such as PAL® software, and these PAL® settings may be automatically communicated to the system 300 and then either used as the configuration information or used as the basis for modifying the configuration information. There are many possible ways in which user configuration information may be created and/or edited, and nothing whatsoever here is intended to limit the scope of one or more embodiments.

[0075] When the user calls the system 300 from any cell phone or telephony device to obtain the information that they desire, the system 300 may receive the call and identify the user. In one embodiment the system 300 may identify the user directly from the individualized phone number that the user dialed. In another embodiment, the user may call a generic number (such as an 800 number) and then enter a unique extension to begin accessing the information. In this case, the system 300 may identify the user from the combination of the generic number and extension. In another embodiment, the user may call the system 300 and then enter a security code which would uniquely identify the user. Once the system 300 has identified the user, the system 300 would access and use that user's configuration information to recognize the user's commands and then provide the appropriate information to the user in return.
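The three identification paths described above (individualized number, generic number plus extension, and security code) can be sketched as a lookup that tries each in turn. This is an illustrative sketch only: the function identify_user, the directory layout, and all phone numbers, extensions, codes and user names are hypothetical.

```python
def identify_user(directory, dialed_number, extension=None, security_code=None):
    """Return the user id for an inbound call, or None if no rule matches."""
    # 1. An individualized number maps directly to one user.
    user = directory["by_number"].get(dialed_number)
    if user is not None:
        return user
    # 2. A generic number combined with a unique extension.
    if extension is not None:
        user = directory["by_extension"].get((dialed_number, extension))
        if user is not None:
            return user
    # 3. A security code entered after the call connects.
    if security_code is not None:
        return directory["by_code"].get(security_code)
    return None


directory = {
    "by_number": {"+12155550101": "alice"},
    "by_extension": {("+18005550100", "4321"): "bob"},
    "by_code": {"987654": "carol"},
}
```

Once a user id is returned, the system would load that user's configuration information and interpret subsequent commands against it.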

[0076] Many types of audio responses are possible from the system 300. These can range from pragmatic to entertaining, and cover a wide range of media types. For example, text-based information (including numeric data, web feeds, email, etc.) may be recited to the user and/or recorded for later use by a text-to-speech engine. For example, the user may access: email content, stock quotes, traffic reports, weather reports, podcasts, live audio streams, songs and playlists, recordings of all types, document content, spreadsheet data, database content, website content, blogs, RSS Feeds and the like. The system 300 may be programmed to interact with and provide information from different content sources, including both public and private sources. Similarly, the information may reside on the configuration server 100 or on other computers or servers. For example, a user interested in accessing information about movies playing locally may call the system 300 and press a DTMF key to access a movie content site like Fandango.com.

[0077] In another embodiment, the system 300 may determine the location of the user when they call in from a cell phone, based on the cell phone's location program or other GPS system. A user interested in finding the location of a nearby Starbucks may call the system 300 and press a DTMF key to access the Starbucks web site. The system 300 may then automatically enter the zip code of the user's current location into the web site to obtain the results, and then read the Starbucks location information from that web page back to the user. If the user's cell phone has GPS or other precise location technology, the system 300 may provide directions to the closest Starbucks or other desired location (e.g., a gas station, ATM, etc.).

[0078] In addition to playing audio and/or reciting text, in a preferred embodiment, the system 300 may be capable of recording audio content. This would be useful, for example, if the user wished to record some thoughts by dictation, or transmit an audio e-mail. For example, the user may call the system 300, press (say) "Star Zero" and then begin recording a message. When finished (as determined, for example, by a detected period of silence), the system 300 may save and/or transmit the recording. The system 300 may be able to record inbound audio (the user's voice) for "Take a Note" or "Reply to that Message" commands. In one example, the interface is user-friendly, such as:

User: #8 (DTMF = Reply to All)
System: "Reply to that Message?"
User: #1 (DTMF = Yes)
System: "I'm recording now..."
User: [speaks message]
User: [3 seconds of relative silence]
System: "Should I send that message?"
User: #0 (DTMF = No)
System: "Should I save it?" (MP3)
User: #1 (DTMF = Yes)
System: "I saved your message as a Draft."
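One way the end of a dictated message might be detected from "3 seconds of relative silence" is to watch for a sufficiently long run of low-level audio frames. The sketch below assumes the audio has already been reduced to one loudness value per frame; the function name find_end_of_message, the half-second frame length and the 0.1 threshold are assumptions made for illustration, not values from the disclosure.

```python
def find_end_of_message(frame_levels, frame_seconds=0.5,
                        silence_threshold=0.1, silence_seconds=3.0):
    """Return the index of the first frame of the closing silence, or None.

    frame_levels holds one loudness value per audio frame (assumed to be
    normalized so that speech is well above silence_threshold).
    """
    frames_needed = int(round(silence_seconds / frame_seconds))
    quiet_run = 0
    for index, level in enumerate(frame_levels):
        if level < silence_threshold:
            quiet_run += 1
            if quiet_run >= frames_needed:
                # The message ends where this run of quiet frames began.
                return index - frames_needed + 1
        else:
            # Any louder frame resets the count, so short pauses are ignored.
            quiet_run = 0
    return None


# Speech, a brief pause, more speech, then three seconds of quiet.
levels = [0.8, 0.7, 0.05, 0.9] + [0.02] * 6
end_frame = find_end_of_message(levels)
```

Resetting the counter on any louder frame is what makes the silence "relative": a short pause between sentences does not end the recording.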

[0079] In another configuration, in addition to providing information as a response to a request (where the user "pulls" the information from the system 300), the system 300 may provide information on a set schedule or as a result of the occurrence of an event (where the system 300 "pushes" the information to the user). The information "push" may occur on a one-time or a recurring basis. For example, the system 300 may call the user on a specific phone number at a particular time each day, and when the user answers the call, the system 300 may read the user's email and provide a traffic report and weather forecast. As another example, the system 300 may remind the user, on a regular schedule, to take medication. In this way, the system 300 may act as a sophisticated alarm system that may be programmed by a user or by a third party such as a doctor. The system 300 may also call the user when email arrives from someone that they have designated as an important or critical contact or when a particular podcast has been updated. The system 300 may also store voice mail for the user, and then call the user when a specific number of voice mails are stored.
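The "push" triggers described above (a scheduled time, email from a designated contact, and a voice mail count) can be sketched as a set of rules evaluated against the current state. This is a hedged illustration: the rule format, the function due_pushes, the rule names and the example address are all hypothetical.

```python
from datetime import time


def due_pushes(rules, now, new_email_senders, voicemail_count):
    """Return the names of push rules that should trigger an outbound call."""
    due = []
    for rule in rules:
        if rule["trigger"] == "time" and rule["at"] == now:
            due.append(rule["name"])
        elif rule["trigger"] == "email_from" and rule["senders"] & new_email_senders:
            due.append(rule["name"])
        elif rule["trigger"] == "voicemail_count" and voicemail_count >= rule["minimum"]:
            due.append(rule["name"])
    return due


rules = [
    {"name": "morning briefing", "trigger": "time", "at": time(7, 0)},
    {"name": "important email", "trigger": "email_from",
     "senders": {"doctor@example.com"}},
    {"name": "voice mail backlog", "trigger": "voicemail_count", "minimum": 3},
]

morning = due_pushes(rules, time(7, 0), set(), 0)
afternoon = due_pushes(rules, time(14, 30), {"doctor@example.com"}, 3)
```

A recurring push would simply leave its rule in place after it fires, while a one-time push would remove the rule; that bookkeeping is omitted here.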

[0080] Furthermore, the system 300 may be configured to provide other forms of communication with the user, in addition to immediate telephony responses, including but not limited to e-mail, text message, voice message, return phone calls, web posting, etc. These responses may be immediate or delayed, as desired. For example, the system 300 may receive a text message request (as opposed to a touch tone signal or voice command) and "know" to call the user with a response, either right away or at a later time.

[0081] In another embodiment, content may be provided by third party sources such as advertisers. For example, a retailer or consumer products company may offer a podcast describing specials or discounts for the week or new product introductions. Users may elect to hear this information on demand or have it pushed to them on some set schedule or event-driven basis. The system 300 may "rent" or sell space to third parties to provide "infomercials" to the user. For example, a golf equipment provider may sponsor "Tiger Woods' Tip of the Day" which the user may get on demand. Similarly, advice from a well-known personality such as Oprah Winfrey may be provided by a third party sponsor. In another embodiment, retailers may provide lists from which a user may purchase desired items. For example, a retailer such as Amazon.com may provide a list of the current bestsellers and allow the user to buy items from it using audio commands. In such an embodiment the user's credit card information may be in the configuration information, or may be input using DTMF. In a similar embodiment, retail grocery stores may sponsor space, and provide access to simplified grocery lists or take-out menus from which items may be ordered over the user's telephony device.

[0082] In yet another embodiment, one user potentially may "send" content and/or configuration fragments to another user's configuration information set. For example, a user may have an "Inbox" for text and/or audio recordings sent from other users. Such an Inbox may offer all the capabilities of e-mail and voicemail, yet be more powerful still, since it can link to and/or contain a wide array of media types. The system 300 may allow users to set up "buddy lists" so that messages may be sent to or from other users in a confidential manner.

[0083] The system 300 may be programmed to operate in different navigation "modes". These modes may be changed back and forth by the user. In one mode, users who do not wish to be burdened with cumbersome menu-based navigation may call the system 300 and immediately communicate their commands. In another mode, the system 300 would act more like an IVR system by providing prompts to which the user may respond in order to hear the desired information. This may be more appropriate for users who do not wish to remember their configuration or who have specified a large amount of content to access. A third mode may be a combination of these two modes, where some content may be accessed with direct commands, and other content areas would be accessed in response to prompts from the system 300.

[0084] In a novel approach to navigating the data available to the user, the system 300 may sense a category request from the user (as a DTMF signal, voice command, etc.) and then begin reciting the available topics from within that category. So, for example, the user may press #2 or speak, "Traffic reports." The system 300 may then respond with a recitation like, "The following routes have heavy traffic: the Expressway, the Turnpike, the Blue Route, Route 202 ..." When the user wants to hear more about a specific topic, the user may interrupt (also called "barge in") with a voice command, DTMF signal, loud noise, or other distinguishable sound or signal. At that point the system 300 may stop reciting and "drill in" to that particular topic. For example, the system 300 may have been interrupted at or near the moment mentioned above, and provide traffic details for Route 202. In a preferred embodiment, the user may navigate to adjacent topics with commands or equivalent signals to hear the Next Topic or Previous Topic. That would reduce the need to be so precise when interrupting a recitation. The system 300 may even anticipate that the user cannot act instantly, and may assume that the prior topic was intended (the Blue Route in the case above). Other user commands or signals may raise the navigation context back up in the "tree," or jump to other branches in the tree.
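The barge-in behavior, including the allowance for a user who cannot act instantly, can be sketched as follows. The sketch assumes the system knows when it began speaking each topic; the function names topic_at_interrupt and step, the start times, and the half-second reaction allowance are hypothetical values chosen for illustration.

```python
def topic_at_interrupt(start_times, interrupt_time, reaction_seconds=0.0):
    """Return the index of the topic the user most likely meant to select.

    start_times[i] is when the system began reciting topic i. If the
    interruption arrives within reaction_seconds of a topic starting,
    the previous topic is assumed to have been intended.
    """
    index = 0
    for i, start in enumerate(start_times):
        if start <= interrupt_time:
            index = i
    if index > 0 and interrupt_time - start_times[index] < reaction_seconds:
        index -= 1
    return index


def step(topic_count, index, command):
    """Move to an adjacent topic for 'next' or 'previous', staying in range."""
    if command == "next":
        return min(index + 1, topic_count - 1)
    if command == "previous":
        return max(index - 1, 0)
    return index


topics = ["the Expressway", "the Turnpike", "the Blue Route", "Route 202"]
start_times = [0.0, 1.5, 3.0, 4.5]

literal = topics[topic_at_interrupt(start_times, 4.8)]
lenient = topics[topic_at_interrupt(start_times, 4.8, reaction_seconds=0.5)]
```

With no reaction allowance, an interruption 0.3 seconds into "Route 202" selects Route 202; with a half-second allowance the same interruption selects the Blue Route, and the step function lets the user correct either guess with a Next Topic or Previous Topic command.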

[0085] Thus, a system, method, and apparatus for operatively connecting to an external server with a telephony communications device by way of telephony media including verbal, audio, or text commands has been provided. While the disclosed embodiments have been described in connection with the various figures, it is to be understood that other similar embodiments may be used or modifications and additions may be made to the disclosed embodiments for performing the same or similar functions. For example, one skilled in the art will recognize that the disclosed embodiments may apply to any configuration of communications devices or software applications. Therefore, the present invention should not be limited to any single embodiment, but rather should be construed in breadth and scope in accordance with the appended claims.

Claims

What is Claimed:

1. A method for providing network-based access to personalized content, the method comprising: storing configuration information in a datastore that is publicly-accessible via a communications network, wherein the configuration information is customized by a user and defines personalized content of the user; accessing the configuration information to determine the personalized content to be provided to a remote user device; accessing the personalized content; establishing a communications connection with the remote user device via the communications network; and providing, in an audio form, at least a portion of the personalized content to the remote user device via the communications network.

2. The method of claim 1, further comprising receiving a signal from the remote user device via the communications network, wherein the signal comprises a user request for the personalized content defined in the configuration information.

3. The method of claim 2, further comprising interpreting the signal to process the user request.

4. The method of claim 2, wherein the signal comprises at least one of a spoken utterance, a dual-tone multi-frequency (DTMF) signal, or a text-based message.

5. The method of claim 1, wherein the communications network comprises at least one of a public switched telephone network (PSTN), a cellular network, a voice over internet protocol (VoIP) network, a radio network, a local area network (LAN), a wide area network (WAN), or the Internet.

6. The method of claim 1, wherein the remote user device comprises at least one of a cellular telephone, a personal digital assistant, a smartphone, a cordless telephone, a corded telephone, a first personal computer having telephony software, a second personal computer having a voice over internet protocol (VoIP) connection, a third personal computer having instant messaging software, or an automobile telephony system.

7. The method of claim 1, wherein the personalized content comprises at least one of a file-based content, an e-mail-based content, a web-based content, or an advertising-based content.

8. The method of claim 1, further comprising receiving the personalized content from the user and storing the personalized content in the datastore.

9. The method of claim 1, further comprising accessing the personalized content from at least one of the datastore, the Internet, or a personal computer associated with the user.

10. The method of claim 1, wherein the communications connection with the remote user device is initiated via a call from the remote user device.

11. The method of claim 1, further comprising initiating the communications connection with the remote user device upon at least one of a predetermined time or a predetermined event.

12. The method of claim 1, further comprising synchronizing the configuration information on the datastore with a personal computer associated with the user.

13. The method of claim 1, further comprising protecting the configuration information on the datastore via at least one security feature that prevents entities, other than the user, from accessing the configuration information.

14. The method of claim 1, further comprising receiving the configuration information from the user via a web-interface that is accessible over the Internet.

15. The method of claim 1, wherein the configuration information comprises at least one of user log-in information, user profile information, or a user command used to identify the personalized content to be provided to the remote user device.

16. The method of claim 1, further comprising initially establishing the configuration information as a default configuration.

17. The method of claim 16, further comprising receiving at least one of user preference information or user profile information and defining the default configuration in accordance therewith.

18. The method of claim 1, further comprising receiving a user request to edit the configuration information on the datastore via at least one of the remote user device or a personal computer associated with the user.

19. The method of claim 18, wherein the user request is received via at least one of a web-interface, a spoken utterance, a dual-tone multi-frequency (DTMF) signal, or a text-based message.

20. The method of claim 1, further comprising identifying the user associated with the remote user device based on at least one of a dialed number or a personal code that is supplied by the remote user device.

21. The method of claim 1, further comprising determining a location of the remote user device and accessing the configuration information to determine the personalized content to be provided based on the location of the remote user device.

22. The method of claim 1, further comprising automatically entering at least a portion of the configuration information into a web-interface to access the personalized content from the Internet.

23. The method of claim 1, further comprising receiving the personalized content as a voice audio signal and recording the voice audio signal.

24. The method of claim 1, further comprising receiving a user request to purchase the personalized content and processing the user request using a user account defined in the configuration information.

25. The method of claim 1, further comprising receiving the personalized content from another user.

26. The method of claim 1, further comprising receiving a call from the remote user device and routing the call to a personal computer associated with the user, wherein the call comprises a user request for the personalized content.

27. The method of claim 26, further comprising receiving the personalized content from the personal computer and providing the personalized content to the remote user device via the communications network.

28. The method of claim 26, further comprising processing the user request if the personal computer is not available.

29. The method of claim 1, further comprising processing the personalized content to provide the personalized content to the remote user device in the audio form.

30. The method of claim 1, further comprising manipulating the personalized content being provided to the remote user device in response to a user request.

31. A network-based apparatus for providing network-based access to personalized content, the network-based apparatus comprising: a memory for storing configuration information that is customized by a user and that defines personalized content of the user, wherein the memory is publicly-accessible via a communications network; and an interface module in communication with the memory, wherein the interface module is publicly-accessible via the communications network and includes a processor configured to: access the configuration information to determine the personalized content to be provided to a remote user device; establish a communications connection with the remote user device via the communications network; access the personalized content; and provide, in an audio form, at least a portion of the personalized content to the remote user device via the communications network.

32. The network-based apparatus of claim 31, wherein the processor is further configured to receive a signal from the remote user device via the communications network, wherein the signal comprises a user request for the personalized content defined in the configuration information.

33. The network-based apparatus of claim 32, wherein the processor is further configured to interpret the signal to process the user request.

34. The network-based apparatus of claim 32, wherein the signal comprises at least one of a spoken utterance, a dual-tone multi-frequency (DTMF) signal, or a text-based message.

35. The network-based apparatus of claim 31, wherein the communications network comprises at least one of a public switched telephone network (PSTN), a cellular network, a voice over internet protocol (VoIP) network, a radio network, a local area network (LAN), a wide area network (WAN), or the Internet.

36. The network-based apparatus of claim 31, wherein the remote user device comprises at least one of a cellular telephone, a personal digital assistant, a smartphone, a cordless telephone, a corded telephone, a first personal computer having telephony software, a second personal computer having a voice over internet protocol (VoIP) connection, a third personal computer having instant messaging software, or an automobile telephony system.

37. The network-based apparatus of claim 31, wherein the personalized content comprises at least one of a file-based content, an e-mail-based content, a web-based content, or an advertising-based content.

38. The network-based apparatus of claim 31, wherein the processor is further configured to receive the personalized content from the user and to store the personalized content in the memory.

39. The network-based apparatus of claim 31, wherein the processor is further configured to access the personalized content from at least one of the memory, the Internet, or a personal computer associated with the user.

40. The network-based apparatus of claim 31, wherein the processor is further configured to establish the communications connection with the remote user device in response to a call from the remote user device.

41. The network-based apparatus of claim 31, wherein the processor is further configured to initiate the communications connection with the remote user device upon at least one of a predetermined time or a predetermined event.

42. The network-based apparatus of claim 31, wherein the processor is further configured to synchronize the configuration information in the memory with a personal computer associated with the user.

43. The network-based apparatus of claim 31, wherein the processor is further configured to protect the configuration information in the memory by employing at least one security feature that prevents entities, other than the user, from accessing the configuration information.

44. The network-based apparatus of claim 31, wherein the processor is further configured to receive the configuration information from the user via a web-interface that is accessible over the Internet.

45. The network-based apparatus of claim 31, wherein the configuration information comprises at least one of user log-in information, user profile information, or a user command used to identify the personalized content to be provided to the remote user device.

46. The network-based apparatus of claim 31, wherein the processor is further configured to initially establish the configuration information as a default configuration.

47. The network-based apparatus of claim 46, wherein the processor is further configured to receive at least one of user preference information or user profile information and to define the default configuration in accordance therewith.

48. The network-based apparatus of claim 31, wherein the processor is further configured to edit the configuration information in the memory in response to a user request received via at least one of the remote user device or a personal computer associated with the user.

49. The network-based apparatus of claim 48, wherein the user request is received via at least one of a web-interface, a spoken utterance, a dual-tone multi-frequency (DTMF) signal, or a text-based message.

50. The network-based apparatus of claim 31, wherein the processor is further configured to identify the user associated with the remote user device based on at least one of a dialed number or a personal code that is supplied by the remote user device.

51. The network-based apparatus of claim 31, wherein the processor is further configured to determine a location of the remote user device and to determine the personalized content to be provided based on the location of the remote user device.

52. The network-based apparatus of claim 31, wherein the processor is further configured to automatically enter at least a portion of the configuration information into a web-interface to access the personalized content from the Internet.

53. The network-based apparatus of claim 31, wherein the processor is further configured to receive the personalized content as a voice audio signal and to record the voice audio signal.

54. The network-based apparatus of claim 31, wherein the processor is further configured to receive a user request to purchase the personalized content and to process the user request using a user account defined in the configuration information.

55. The network-based apparatus of claim 31, wherein the processor is further configured to receive the personalized content from another user.

56. The network-based apparatus of claim 31, wherein the processor is further configured to receive a call from the remote user device and to route the call to a personal computer associated with the user, wherein the call comprises a user request for the personalized content.

57. The network-based apparatus of claim 56, wherein the processor is further configured to receive the personalized content from the personal computer and to provide the personalized content to the remote user device via the communications network.

58. The network-based apparatus of claim 56, wherein the processor is further configured to process the user request if the personal computer is not available.

59. The network-based apparatus of claim 31, wherein the processor is further configured to process the personalized content to provide the personalized content to the remote user device in the audio form.

60. The network-based apparatus of claim 31, wherein the processor is further configured to manipulate the personalized content being provided to the remote user device in response to a user request.

61. A computer-readable medium having stored thereon computer-executable instructions for providing web-based access to personalized content, the computer-executable instructions comprising instructions for: storing configuration information in a datastore that is publicly-accessible via a communications network, wherein the configuration information is customized by a user and defines personalized content of the user; accessing the configuration information to determine the personalized content to be provided to a remote user device; accessing the personalized content; establishing a communications connection with the remote user device via the communications network; and providing, in an audio form, at least a portion of the personalized content to the remote user device via the communications network.

62. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for receiving a signal from the remote user device via the communications network, wherein the signal comprises a user request for the personalized content defined in the configuration information.

63. The computer-readable medium of claim 62, wherein the computer-executable instructions further comprise instructions for interpreting the signal to process the user request.

64. The computer-readable medium of claim 62, wherein the signal comprises at least one of a spoken utterance, a dual-tone multi-frequency (DTMF) signal, or a text-based message.

65. The computer-readable medium of claim 61, wherein the communications network comprises at least one of a public switched telephone network (PSTN), a cellular network, a voice over internet protocol (VoIP) network, a radio network, a local area network (LAN), a wide area network (WAN), or the Internet.

66. The computer-readable medium of claim 61, wherein the remote user device comprises at least one of a cellular telephone, a personal digital assistant, a smartphone, a cordless telephone, a corded telephone, a first personal computer having telephony software, a second personal computer having a voice over internet protocol (VoIP) connection, a third personal computer having instant messaging software, or an automobile telephony system.

67. The computer-readable medium of claim 61, wherein the personalized content comprises at least one of a file-based content, an e-mail-based content, a web-based content, or an advertising-based content.

68. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for receiving the personalized content from the user and storing the personalized content in the datastore.

69. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for accessing the personalized content from at least one of the datastore, the Internet, or a personal computer associated with the user.

70. The computer-readable medium of claim 61, wherein the communications connection with the remote user device is initiated via a call from the remote user device.

71. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for initiating the communications connection with the remote user device upon at least one of a predetermined time or a predetermined event.

72. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for synchronizing the configuration information on the datastore with a personal computer associated with the user.

73. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for protecting the configuration information on the datastore via at least one security feature that prevents entities, other than the user, from accessing the configuration information.

74. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for receiving the configuration information from the user via a web-interface that is accessible over the Internet.

75. The computer-readable medium of claim 61, wherein the configuration information comprises at least one of user log-in information, user profile information, or a user command used to identify the personalized content to be provided to the remote user device.

76. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for initially establishing the configuration information as a default configuration.

77. The computer-readable medium of claim 76, wherein the computer-executable instructions further comprise instructions for receiving at least one of user preference information or user profile information and defining the default configuration in accordance therewith.

78. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for receiving a user request to edit the configuration information on the datastore via at least one of the remote user device or a personal computer associated with the user.

79. The computer-readable medium of claim 78, wherein the user request is received via at least one of a web-interface, a spoken utterance, a dual-tone multi-frequency (DTMF) signal, or a text-based message.

80. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for identifying the user associated with the remote user device based on at least one of a dialed number or a personal code that is supplied by the remote user device.

81. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for determining a location of the remote user device and accessing the configuration information to determine the personalized content to be provided based on the location of the remote user device.

82. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for automatically entering at least a portion of the configuration information into a web-interface to access the personalized content from the Internet.

83. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for receiving the personalized content as a voice audio signal and recording the voice audio signal.

84. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for receiving a user request to purchase the personalized content and processing the user request using a user account defined in the configuration information.

85. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for receiving the personalized content from another user.

86. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for receiving a call from the remote user device and routing the call to a personal computer associated with the user, wherein the call comprises a user request for the personalized content.

87. The computer-readable medium of claim 86, wherein the computer-executable instructions further comprise instructions for receiving the personalized content from the personal computer and providing the personalized content to the remote user device via the communications network.

88. The computer-readable medium of claim 86, wherein the computer-executable instructions further comprise instructions for processing the user request if the personal computer is not available.

89. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for processing the personalized content to provide the personalized content to the remote user device in the audio form.

90. The computer-readable medium of claim 61, wherein the computer-executable instructions further comprise instructions for manipulating the personalized content being provided to the remote user device in response to a user request.