WO2013056756A1 - Method and apparatus for displaying visual information about participants in a teleconference

Info

Publication number
WO2013056756A1
Authority
WO
WIPO (PCT)
Prior art keywords
teleconference
participant
speaking
participants
during
Application number
PCT/EP2012/003034
Other languages
French (fr)
Inventor
Christos FOUSTERIS
Original Assignee
Siemens Enterprise Communications Gmbh & Co. Kg
Priority claimed from PCT/EP2011/005234 (WO2013056721A1)
Application filed by Siemens Enterprise Communications Gmbh & Co. Kg
Priority to PCT/EP2012/003034 (WO2013056756A1)
Publication of WO2013056756A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H04L12/1822 Conducting the conference, e.g. admission, detection, selection or grouping of participants, correlating users to one or more conference sessions, prioritising transmission
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H04L12/1831 Tracking arrangements for later retrieval, e.g. recording contents, participants activities or behavior, network status
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/42221 Conversation recording systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/567 Multimedia conference systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/147 Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H04N7/152 Multipoint control units therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/56 Unified messaging, e.g. interactions between e-mail, instant messaging or converged IP messaging [CPM]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/41 Electronic components, circuits, software, systems or apparatus used in telephone systems using speaker recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/568 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
    • H04M3/569 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants using the instant speaker's algorithm

Definitions

  • the method further comprises, upon start up of the teleconference, storing, or checking that there has been stored, at least one digital image associated with each participant in a storage device accessible to a teleconference system managing the information displayed to the participants of the teleconference.
  • the digital images to be displayed during the conference are copied from digital files, such as corporate directories, personal web pages, etc., containing user specific information such as e.g. Employee Name, Employee Location, Employee e-mail, Employee Picture, Employee Group, etc., and are subsequently stored on an application server.
  • These actions are preferably implemented using the LDAP (Lightweight Directory Access Protocol) .
  • these digital images are copied and stored in a digital data storage folder associated with this teleconference and created upon start up of this teleconference.
  • the digital data storage folder associated with this teleconference is deleted upon conference closure. This offers the advantage of saving storage space and meeting several requirements of standard policies for privacy protection .
  • an apparatus for displaying visual information about participants in a teleconference, the apparatus comprising a conference bridge for mixing of audio signals originating from participants in the teleconference, means for providing an automatic identification of a participant currently speaking and means for causing a communication terminal device to display at least one static digital image associated with the identified participant currently speaking at least during a part of the time while this participant is speaking.
  • a system for displaying visual information about participants in a teleconference, the system comprising at least one apparatus according to the invention and a plurality of communication terminal devices receiving communication data from the at least one apparatus.
  • Fig. 1 illustrates a preferred system configuration of a system according to the present invention.
  • a system for displaying visual information about participants in a teleconference, the system comprising at least one apparatus 4, 8, 9, 10, 11 according to the invention and a plurality of communication terminal devices 1a, 1b, 2a, 2b, 3a and 3b receiving communication data from the at least one apparatus.
  • the communication terminal devices preferably comprise phones 1a, 2a and 3a and screens 1b, 2b and 3b.
  • the screens are preferably computer screens, connected to a personal computer.
  • the phones and the screens are preferably connected to universal communication front ends (UCFE) 5, 6 and 7, which are preferably equipped with local storage means 5a, 6a and 7a.
  • UCFE universal communication front ends
  • this at least one apparatus 4, 8, 9, 10, 11 is configured for displaying visual information about participants in a teleconference, the apparatus comprising a conference bridge 8 for mixing of audio signals originating from participants in the teleconference, means 9 for providing an automatic identification of a participant currently speaking and means 4 for causing a communication terminal device to display at least one static digital image associated with the identified participant currently speaking at least during a part of the time while this participant is speaking.
  • the digital images to be displayed during the conference are copied from digital files, such as corporate directories 10, personal web pages, etc., containing user specific information 11 such as e.g. Employee Name, Employee Location, Employee e-mail, Employee Picture, Employee Group, etc., and are subsequently stored on an application server 4.
  • user specific information 11 such as e.g. Employee Name, Employee Location, Employee e-mail, Employee Picture, Employee Group, etc.
  • LDAP Lightweight Directory Access Protocol
  • the images to be displayed are stored in photo repositories, which can be located on various places depending on the needs of the customer, e.g. in universal communication front ends (UCFE) 5, 6, 7 or their storage devices 5a, 6a, 7a.
  • the customer's corporate directory 10 may for instance be used as a source for the contact and communication resources and information about the company employees, who may be modeled in a CMP (Common Management Portal) as UM (User Management) users.
  • CMP Common Management Portal
  • UM User Management
  • XML-files are stored on one or several of the universal communication front ends (UCFE) 5, 6, 7 or their storage devices 5a, 6a, 7a.
  • UCFE universal communication front ends
  • XML applications are based on the client-server architecture.
  • the client which is preferably running in the phone software, requests an XML document from the server-side program.
  • the HTTP/HTTPS GET request sent by the client includes the phone's call number, for instance:
  • the server-side program then preferably generates an XML document, which is preferably delivered to the phone over HTTP/HTTPS.
  • the XML document is preferably parsed and displayed on the graphic display.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephonic Communication Services (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

A method for displaying visual information about participants in a teleconference comprises mixing of audio signals originating from participants in the teleconference, providing an automatic identification of a participant currently speaking and displaying at least one static digital image associated with the identified participant currently speaking at least during a part of the time while this participant is speaking.

Description

METHOD AND APPARATUS FOR DISPLAYING VISUAL INFORMATION ABOUT PARTICIPANTS IN A TELECONFERENCE
BACKGROUND OF THE INVENTION
The present invention relates generally to a method, an apparatus and a system for displaying visual information about participants in a teleconference.
Teleconferencing is nowadays a preferred method of communication for employees of medium-sized and larger enterprises. Various methods have been presented to meet the natural human need to see a visual representation of the participants speaking in a teleconference.
US 2006/0098085 A1 discloses a method and apparatus for managing a display during a teleconference between a primary participant and one or more secondary participants. According to this publication, a primary image corresponding to the primary participant and a subset of secondary images that correspond to secondary participants are displayed on first and second sections of the display, respectively. By scrolling through the secondary images during the teleconference, different subsets of the secondary images may be displayed. It is an object of the present invention to improve known methods for displaying visual information about participants in a teleconference.
SUMMARY OF THE INVENTION
According to the present invention, a method is provided comprising mixing of audio signals originating from participants in the teleconference. Further, an automatic identification of a participant currently speaking is provided. At least one static digital image associated with the identified participant currently speaking is displayed at least during a part of the time while this participant is speaking.
In the present context, the term audio signal shall refer to all kinds of signals, especially digital signals, that represent audible information of any kind, especially speech signals originating from participants of a teleconference. The term teleconference shall refer to all kinds of telecommunication processes which support the communication among participants taking part in a conference by means of telecommunication equipment, including telephones, cameras, IP-phones, PC-clients, mobile phones or other kinds of telecommunication terminal devices (e.g. UCFE, Universal Communications Front End). Preferably, these terminal devices are combinations of phones and (computer) screens. Usually, at least some of the participants are at remote locations, so that they cannot communicate without using technical communication equipment.
The audio and possibly other signals, e.g. video signals, originating from various participants of the teleconference are mixed so that they can be made available to other participants. Typically, the signals originating from the participant currently speaking are distributed to other participants so that these can listen to the participant currently speaking. The mixing is preferably done by a conference bridge.
In the present context, the term conference bridge shall refer to a system configured to mix the signals, especially speech signals, originating from the participants. Such a conference bridge can, for example, be realized in the form of an application running on a personal computer. Such a personal computer is frequently referred to as a media server or conference server. This server receives the signals originating from the terminal devices used by the participants and sends the mixed signals to the terminal devices.
The term teleconference shall also include telecommunication processes in which participants communicate by audio signals and video signals and possibly by application sharing.
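The mixing performed by such a conference bridge can be illustrated with a minimal Python sketch. It assumes that each participant delivers one frame of signed 16-bit PCM samples per mixing interval and that every participant receives the sum of all other participants' signals; the function names `mix_frames` and `mix_for_all`, the sample format and the exclusion of a listener's own signal are assumptions made for illustration and are not taken from the description.

```python
from typing import Dict, List, Optional

# Assumed sample format: signed 16-bit PCM.
INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(frames: Dict[str, List[int]], exclude: Optional[str] = None) -> List[int]:
    """Sum the frames sample by sample, leaving out the participant `exclude`."""
    sources = [samples for pid, samples in frames.items() if pid != exclude]
    if not sources:
        return []
    length = min(len(samples) for samples in sources)
    mixed = []
    for i in range(length):
        total = sum(samples[i] for samples in sources)
        # Clip to the 16-bit range so that loud sums do not wrap around.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


def mix_for_all(frames: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Return, per participant, the mix of everybody else's signal."""
    return {pid: mix_frames(frames, exclude=pid) for pid in frames}
```

Clipping is the simplest way to keep the sum within range; a real bridge would more likely apply gain control, which is outside the scope of this sketch.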
In order to be able to provide the signals originating from the current speaker to other participants without manually switching between the signal sources, an automatic identification of a participant currently speaking is provided. This is also referred to as speaker recognition and is preferably done by voice recognition or by speech analysis.
The various technologies used to process and store voice prints include frequency estimation, hidden Markov models, Gaussian mixture models, pattern matching algorithms, neural networks, matrix representation, Vector Quantization and decision trees ("Speaker recognition" in Wikipedia, The Free Encyclopedia. Date of last revision: 11 May 2012 23:38 UTC. Date retrieved: 28 May 2012 18:57 UTC. Permanent link: http://en.wikipedia.org/w/index.php?title=Speaker_recognition&oldid=492101918). For example, a method of speaker recognition is presented in the paper Look at Who's Talking: Voice Activity Detection by Automated Gesture Analysis by Marco Cristiani et al. Another method is published in LOOK WHO'S TALKING: SPEAKER DETECTION USING VIDEO AND AUDIO CORRELATION by Ross Cutler and Larry Davis, Institute for Advanced Computer Studies, University of Maryland, College Park. Both papers are easily found on the internet.
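As a rough illustration of automatic identification of the current speaker, the following Python sketch picks the participant whose audio frame has the highest signal level. This is a deliberately simple energy comparison and not one of the voice-print techniques listed above; it assumes that the conference bridge still has a separate frame per participant, and the function names and the threshold value are hypothetical.

```python
import math
from typing import Dict, List, Optional


def rms(samples: List[int]) -> float:
    """Root-mean-square level of one frame of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def current_speaker(frames: Dict[str, List[int]], threshold: float = 500.0) -> Optional[str]:
    """Return the participant with the loudest frame, or None if nobody exceeds the threshold."""
    best_id, best_level = None, threshold
    for pid, samples in frames.items():
        level = rms(samples)
        if level > best_level:
            best_id, best_level = pid, level
    return best_id
```

The threshold (an assumed value) keeps background noise from being reported as speech; in practice it would have to be tuned or replaced by a proper voice activity detector.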
According to the invention, a static digital image associated with the identified participant currently speaking is displayed at least during a part of the time while this participant is speaking. The term static digital image shall refer to a digital image with constant content, i.e. not changing, during a certain time period. This static image may be a portrait of the current speaker or an image of a person associated with the current speaker or another kind of static digital image.
Preferably, the method according to the invention is implemented with the help of a component, preferably a UC component, that resides on Front End Servers. This component preferably performs the following tasks:
1. Listen to specific UC events;
2. Create one folder per conference ID (Conf.ID) upon conference start up;
3. Copy images of participants into the Conf.ID folder;
4. Dynamically rename the picture of the active speaker and of previous speakers;
5. Generate XML and HTML pages;
6. Delete the Conf.ID folder upon conference closure.
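The folder-related tasks in this list (creating the Conf.ID folder, copying the images, renaming the picture of the active speaker, deleting the folder) can be sketched in Python as follows. The class name `ConferenceFolder`, the fixed file name `active` for the current speaker's picture and the choice to give a previous speaker's picture its participant name back are assumptions for illustration; the description does not specify a naming scheme.

```python
import shutil
from pathlib import Path
from typing import Dict, Optional


class ConferenceFolder:
    """Per-conference image folder; hypothetical sketch of tasks 2, 3, 4 and 6."""

    ACTIVE = "active"  # assumed well-known name for the current speaker's picture

    def __init__(self, root: Path, conf_id: str, portraits: Dict[str, Path]) -> None:
        # Task 2: one folder per conference ID, created at conference start up.
        self.path = Path(root) / conf_id
        self.path.mkdir(parents=True)
        self._files: Dict[str, Path] = {}
        self._active: Optional[str] = None
        # Task 3: copy each participant's image into the folder.
        for participant, source in portraits.items():
            target = self.path / f"{participant}{Path(source).suffix}"
            shutil.copyfile(source, target)
            self._files[participant] = target

    def set_active_speaker(self, participant: str) -> Path:
        """Task 4: rename pictures so the active speaker's file is called 'active'."""
        if self._active == participant:
            return self._files[participant]
        if self._active is not None:
            # Give the previous speaker's picture its participant name back.
            previous = self._active
            restored = self.path / f"{previous}{self._files[previous].suffix}"
            self._files[previous].rename(restored)
            self._files[previous] = restored
        current = self._files[participant]
        active = self.path / f"{self.ACTIVE}{current.suffix}"
        current.rename(active)
        self._files[participant] = active
        self._active = participant
        return active

    def close(self) -> None:
        """Task 6: delete the folder upon conference closure."""
        shutil.rmtree(self.path, ignore_errors=True)
```

With such a scheme, a generated page could always reference the same file name and would show whichever participant is currently speaking. The sketch does not handle a participant who is literally named `active`.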
Unified communications (UC) is the integration of real-time communication services such as instant messaging (chat), presence information, telephony (including IP telephony), video conferencing, data sharing (including web connected electronic whiteboards, aka IWBs or Interactive White Boards), call control and speech recognition with non-real-time communication services such as unified messaging (integrated voicemail, e-mail, SMS and fax) (Unified communications in Wikipedia, The Free Encyclopedia, date of last revision: 21 May 2012 19:21 UTC, date retrieved: 28 May 2012 18:39 UTC, permanent link: http://en.wikipedia.org/w/index.php?title=Unified_communications&oldid=493707199). UC is not necessarily a single product, but a set of products that provides a consistent unified user interface and user experience across multiple devices and media types. There have been attempts at creating a single-product solution; however, the most popular solution is dependent on multiple products.
In its broadest sense, UC can encompass all forms of communications that are exchanged via the medium of the TCP/IP network, including other forms of communications such as Internet Protocol Television (IPTV) and Digital Signage Communications, as they become an integrated part of the network communications deployment and may be directed as one-to-one communications or broadcast communications from one to many. UC allows an individual to send a message on one medium and receive the same communication on another medium. For example, one can receive a voicemail message and choose to access it through e-mail or a cell phone. If the sender is online according to the presence information and currently accepts calls, the response can be sent immediately through text chat or video call. Otherwise, it may be sent as a non-real-time message that can be accessed through a variety of media.
The invention will, depending on the chosen embodiment, provide various advantages:
The trend, especially in large companies, is nowadays to use video conferencing systems instead of just audio conferences. While live video meets the human need to "put a face to the voice", it does not always meet that need appropriately. Video conferences are suitable when a group of people at one location is communicating with a group of people at a different location, or in cases where just two people communicate ("one-on-one discussions"). The merits of video conferencing in other cases, e.g. meetings of teleworkers or of large global enterprise entities with employees at many different locations, are questionable.
Larger groups do not easily fit on a screen. Another question is whether employees always like to be seen for an hour straight in a daily or weekly meeting when they work remotely, may not be dressed for the occasion, or want to participate only "partially" in a video call.
The invention offers a solution right in between (pure) audio and (audio-)video conferences. While pure audio completely lacks pictures and video demands a continuous stream of pictures, the invention offers the possibility to match a picture to the voice (of the current speaker, the person the voice belongs to) and to display a predefined picture in real time on the terminal devices of all participants.
The invention offers the possibility to solve problems associated with conferences where participants are in their office or cubicle (e.g. in case of teleworking) and to reduce the required bandwidth compared to video conferences. Standard mechanisms of speaker recognition or "who is talking" detection may be used to easily implement the invention. These mechanisms are well known from contemporary UC web clients, where they are used to show the name of the current speaker. The images to be displayed can be stored in photo repositories, which can be located in various places depending on the needs of the customer. The customer's corporate directory may, for instance, be used as a source for the contact and communication resources and information about the company employees, who may be modeled in a CMP (Common Management Portal) as UM (User Management) users.
Unified Messaging (or UM) is the integration of different electronic messaging and communications media (e-mail, SMS, Fax, voicemail, video messaging, etc.) technologies into a single interface, accessible from a variety of different devices (Wikipedia: Unified messaging, author: Wikipedia contributors, publisher: Wikipedia, The Free Encyclopedia, date of last revision: 10 February 2012 12:55 UTC, date retrieved: 28 May 2012 18:31 UTC, Permanent link: http://en.wikipedia.org/w/index.php?title=Unified_messaging&oldid=476111606). While traditional communications systems delivered messages into several different types of stores such as voicemail systems, e-mail servers, and stand-alone fax machines, with Unified Messaging all types of messages are stored in one system. Voicemail messages, for example, can be delivered directly into the user's inbox and played either through a headset or the computer's speaker. This simplifies the user's experience (only one place to check for messages) and can offer new options for workflow such as appending notes or documents to forwarded voicemails.
According to a preferred embodiment of the present invention, the method further comprises displaying a temporal sequence of a plurality of static images associated with the identified participant currently speaking at least during a part of the time while this participant is speaking. Examples of such temporal sequences may be digital slide shows of pictures of the same person (e.g. portraits, pictures showing the speaker in various professional or leisure time situations, etc.) or slide shows comprising pictures of persons associated with or related to the current speaker, e.g. colleagues, assistants, etc. These slide shows may, however, also comprise documents such as presentation slides or similar material. This embodiment of the invention makes it possible to provide non-verbal information to the listening participants during the verbal presentation or statement of the current speaker.
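As a purely illustrative sketch of such a temporal sequence, the following Python fragment selects which image of a speaker's slide show to display after a given speaking time. The function name, the fixed display interval per image, the cyclic behaviour and the file names are assumptions made for this example and are not taken from the disclosure.

```python
def image_for_elapsed(images, elapsed_seconds, seconds_per_image=5.0):
    """Return the image to show after the speaker has talked for elapsed_seconds.

    The sequence cycles, so a long statement wraps around to the first image.
    """
    if not images:
        raise ValueError("at least one image is required")
    if elapsed_seconds < 0:
        raise ValueError("elapsed_seconds must be non-negative")
    index = int(elapsed_seconds // seconds_per_image) % len(images)
    return images[index]


# Hypothetical slide show for one participant.
slides = ["portrait.jpg", "at_work.jpg", "presentation_slide_1.png"]
```

With a single-element list the same function degenerates to the case of one static image shown for the whole statement.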
According to a preferred embodiment of the present invention, the method further comprises concurrently (i.e. simultaneously) displaying a plurality of static images associated with the identified participant currently speaking at least during a part of the time while this participant is speaking. Examples of such a plurality of static images may be combinations of portraits and texts, e.g. the name of the speaker, his or her affiliation, title, etc. This embodiment of the invention makes it possible to provide additional information to the listening participants during the verbal presentation or statement of the current speaker.
According to a preferred embodiment of the present invention, the method further comprises displaying only one static digital image associated with the identified participant currently speaking at least during a part of the time while this participant is speaking.
According to a preferred embodiment of the present invention, the method further comprises, upon start-up of the teleconference, storing at least one digital image associated with each participant, or checking that such an image has been stored, in a storage device accessible to a teleconference system managing the information displayed to the participants of the teleconference. Preferably, the digital images to be displayed during the conference are copied from digital files, such as corporate directories, personal web pages, etc., containing user-specific information such as Employee Name, Employee Location, Employee e-mail, Employee Picture, Employee Group, etc., and are subsequently stored on an application server. These actions are preferably implemented using LDAP (Lightweight Directory Access Protocol). According to a preferred embodiment of the present invention, these digital images are copied and stored in a digital data storage folder associated with this teleconference and created upon start-up of this teleconference.
According to a preferred embodiment of the present invention, the digital data storage folder associated with this teleconference is deleted upon conference closure. This offers the advantage of saving storage space and of meeting several requirements of standard privacy protection policies.
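The life cycle of such a conference folder (created at start-up, filled with one image per participant, checked for completeness, deleted at closure) might be sketched in Python as follows. The class name, the method names, the `.jpg` naming scheme and the use of a temporary directory are assumptions made for illustration only; the description does not prescribe any of them.

```python
import shutil
import tempfile
from pathlib import Path


class ConferenceImageStore:
    """Per-conference folder created at start-up and deleted at closure."""

    def __init__(self, conference_id, base_dir=None):
        # mkdtemp creates a fresh, uniquely named directory for this conference.
        self.folder = Path(
            tempfile.mkdtemp(prefix=f"conf-{conference_id}-", dir=base_dir)
        )

    def store(self, participant_id, image_bytes):
        """Copy one participant image into the conference folder."""
        path = self.folder / f"{participant_id}.jpg"
        path.write_bytes(image_bytes)
        return path

    def missing(self, participant_ids):
        """Return the participants for whom no image has been stored yet."""
        return [
            p for p in participant_ids
            if not (self.folder / f"{p}.jpg").exists()
        ]

    def close(self):
        """Delete the folder and every image in it upon conference closure."""
        shutil.rmtree(self.folder, ignore_errors=True)
```

A real deployment would additionally have to validate participant identifiers before using them as file names; that step is omitted here for brevity.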
According to the present invention, an apparatus is provided for displaying visual information about participants in a teleconference, the apparatus comprising a conference bridge for mixing of audio signals originating from participants in the teleconference, means for providing an automatic identification of a participant currently speaking and means for causing a communication terminal device to display at least one static digital image associated with the identified participant currently speaking at least during a part of the time while this participant is speaking.
According to the present invention, a system is provided for displaying visual information about participants in a teleconference, the system comprising at least one apparatus according to the invention and a plurality of communication terminal devices receiving communication data from the at least one apparatus.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates a preferred system configuration of a system according to the present invention.
DETAILED DESCRIPTION OF THE INVENTION
According to a preferred embodiment of the present invention, a system is provided for displaying visual information about participants in a teleconference, the system comprising at least one apparatus 4, 8, 9, 10, 11 according to the invention and a plurality of communication terminal devices 1a, 1b, 2a, 2b, 3a and 3b receiving communication data from the at least one apparatus. The communication terminal devices preferably comprise phones 1a, 2a and 3a and screens 1b, 2b and 3b. The screens are preferably computer screens connected to a personal computer. The phones and the screens are preferably connected to universal communication front ends (UCFE) 5, 6 and 7, which are preferably equipped with local storage means 5a, 6a and 7a.
According to the present invention, this at least one apparatus 4, 8, 9, 10, 11 is configured for displaying visual information about participants in a teleconference, the apparatus comprising a conference bridge 8 for mixing of audio signals originating from participants in the teleconference, means 9 for providing an automatic identification of a participant currently speaking and means 4 for causing a communication terminal device to display at least one static digital image associated with the identified participant currently speaking at least during a part of the time while this participant is speaking.
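The interplay of the three elements (mixing, identification of the current speaker, selection of the image) can be illustrated with a small Python sketch. It deliberately substitutes a simple loudest-stream heuristic for the speaker recognition or "who is talking" mechanisms mentioned in the description, which are not specified in detail. All function names, the energy threshold and the sample data are assumptions.

```python
def mix(frames):
    """Mix equally long frames of 16-bit PCM samples by clipped summation."""
    lo, hi = -32768, 32767
    return [max(lo, min(hi, sum(samples))) for samples in zip(*frames.values())]


def active_speaker(frames, threshold=1_000_000):
    """Return the participant with the highest frame energy, or None if all are quiet.

    This is an assumed stand-in heuristic, not the identification method of the patent.
    """
    energies = {pid: sum(s * s for s in samples) for pid, samples in frames.items()}
    if not energies:
        return None
    pid = max(energies, key=energies.get)
    return pid if energies[pid] >= threshold else None


def image_to_display(frames, images):
    """Map the detected speaker to the static image held for that participant."""
    return images.get(active_speaker(frames))


# Hypothetical audio frames and image table for two participants.
frames = {"alice": [1000, -1200, 900, -1100], "bob": [10, -5, 8, -3]}
images = {"alice": "alice.jpg", "bob": "bob.jpg"}
```

When nobody exceeds the threshold, `image_to_display` returns `None`, which a terminal could interpret as "keep the current picture".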
Preferably, the digital images to be displayed during the conference are copied from digital files, such as corporate directories 10, personal web pages, etc., containing user-specific information 11 such as Employee Name, Employee Location, Employee e-mail, Employee Picture, Employee Group, etc., and are subsequently stored on an application server 4. These actions are preferably implemented using LDAP (Lightweight Directory Access Protocol).
Preferably, the images to be displayed are stored in photo repositories, which can be located in various places depending on the needs of the customer, e.g. in universal communication front ends (UCFE) 5, 6, 7 or their storage devices 5a, 6a, 7a. The customer's corporate directory 10 may for instance be used as a source for the contact and communication resources and information about the company employees, who may be modeled in a CMP (Common Management Portal) as UM (User Management) users.
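How a directory lookup for one employee could be expressed is sketched below in Python. The sketch only builds an LDAP search filter and attribute list; it does not contact a server. The mapping of the employee fields onto the standard `inetOrgPerson` attributes `cn`, `l`, `mail`, `jpegPhoto` and `ou`, and the choice of the telephone number as search key, are assumptions, since the description does not name the directory schema.

```python
def escape_filter_value(value):
    """Escape characters that are special in LDAP search filters (RFC 4515)."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def photo_query(phone_number):
    """Build the search filter and attribute list for one employee's entry."""
    number = escape_filter_value(phone_number)
    search_filter = f"(&(objectClass=inetOrgPerson)(telephoneNumber={number}))"
    # Assumed mapping: name, location, e-mail, picture, group.
    attributes = ["cn", "l", "mail", "jpegPhoto", "ou"]
    return search_filter, attributes
```

The returned pair could then be handed to whichever LDAP client library the application server uses.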
The following is a listing of an example XML-file for the phones upon Conference creation.
[Figure imgf000014_0001: example XML file listing, reproduced as an image in the original publication]
Preferably, this XML file remains unchanged until the conference is closed. Preferably, such XML files are stored on one or several of the universal communication front ends (UCFE) 5, 6, 7 or their storage devices 5a, 6a, 7a.
XML applications are based on the client-server architecture. Comparable to web browsers and web servers in the WWW, the client, which is preferably running in the phone software, requests an XML document from the server-side program. The HTTP/HTTPS GET request sent by the client includes the phone's call number, for instance:
"137.223.238.174/serverProgram?phonenumber=4711"
The server-side program then preferably generates an XML document, which is preferably delivered to the phone over HTTP/HTTPS. In the phone, the XML document is preferably parsed and displayed on the graphic display.
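The request and response handling described above can be sketched in Python using only the standard library. The element and attribute names of the generated document (`PhoneScreen`, `Image`, `src`) and the image URL are hypothetical, because the example XML listing of the publication is not available as text. Only the `phonenumber` query parameter is taken from the description.

```python
from urllib.parse import parse_qs, urlsplit
import xml.etree.ElementTree as ET


def phone_number_from_request(url):
    """Extract the phonenumber query parameter from the client's GET URL."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("phonenumber")
    if not values:
        raise ValueError("request carries no phonenumber parameter")
    return values[0]


def build_document(phone_number, image_url):
    """Generate a small XML document for the phone (hypothetical schema)."""
    root = ET.Element("PhoneScreen", {"phonenumber": phone_number})
    ET.SubElement(root, "Image", {"src": image_url})
    return ET.tostring(root, encoding="unicode")


# Hypothetical request as it might arrive at the server-side program.
number = phone_number_from_request(
    "http://137.223.238.174/serverProgram?phonenumber=4711"
)
document = build_document(number, "http://example.invalid/photos/4711.jpg")
```

On the phone side, the counterpart would parse the received document and render the referenced image on the graphic display.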
The same mechanism can also be used for matching a voice to the right picture:
fireEvent: {} PmEvent type=activespeaker
(The BE service sends fireEvents to the FE service. In the case of an active speaker, the FE receives a fireEvent with type=activespeaker.)
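A front-end handler reacting to such an event might look like the following Python sketch. Only the event type `activespeaker` comes from the description. Representing the event as a dictionary, the `participant` field, the function name and the picture table are assumptions introduced for illustration.

```python
def on_fire_event(event, pictures, current=None):
    """Return the picture to show after the front end receives a fireEvent.

    Events of other types, and speakers without a stored picture,
    leave the currently displayed picture unchanged.
    """
    if event.get("type") != "activespeaker":
        return current
    return pictures.get(event.get("participant"), current)


# Hypothetical picture table keyed by call number.
pictures = {"4711": "4711.jpg", "4712": "4712.jpg"}
```

Returning the current picture as a fallback avoids a blank display when an unknown participant starts to speak.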
* * *

Claims

1. A method for displaying visual information about participants in a teleconference comprising: a) mixing of audio signals originating from participants in the teleconference; b) providing an automatic identification of a participant currently speaking; c) displaying at least one static digital image associated with the identified participant currently speaking at least during a part of the time while this participant is speaking.
2. The method according to claim 1, wherein a temporal sequence of a plurality of static images associated with the identified participant currently speaking is displayed at least during a part of the time while this participant is speaking.
3. The method according to one of the preceding claims, wherein a plurality of static images associated with the identified participant currently speaking is displayed concurrently at least during a part of the time while this participant is speaking.
4. The method according to one of the preceding claims, wherein only one static digital image associated with the identified participant currently speaking is displayed at least during a part of the time while this participant is speaking.
5. The method according to one of the preceding claims, wherein upon start up of the teleconference at least one digital image associated with each participant is stored or is checked to have been stored in a storage device accessible to a teleconference system managing the information displayed to the participants of the teleconference.
6. The method according to claim 5, wherein these digital images are copied and stored in a digital data storage folder associated with this teleconference and created upon start up of this teleconference.
7. The method according to claim 6, wherein the digital data storage folder associated with this teleconference is deleted upon conference closure.
8. Apparatus (4, 8, 9, 10, 11) for displaying visual information about participants in a teleconference comprising: a) a conference bridge (8) for mixing of audio signals originating from participants in the teleconference; b) means for providing an automatic identification of a participant currently speaking; c) means for causing a communication terminal device to display at least one static digital image associated with the identified participant currently speaking at least during a part of the time while this participant is speaking.
9. System for displaying visual information about participants in a teleconference comprising: a) at least one apparatus according to claim 8; b) a plurality of communication terminal devices (1a, 1b, 2a, 2b, 3a and 3b) receiving communication data from the at least one apparatus.
PCT/EP2012/003034 2011-10-18 2012-07-18 Method and apparatus for displaying visual information about participants in a teleconference WO2013056756A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2012/003034 WO2013056756A1 (en) 2011-10-18 2012-07-18 Method and apparatus for displaying visual information about participants in a teleconference

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
PCT/EP2011/005234 WO2013056721A1 (en) 2011-10-18 2011-10-18 Method and apparatus for providing data produced in a conference
EPPCT/EP2011/005234 2011-10-18
PCT/EP2012/003034 WO2013056756A1 (en) 2011-10-18 2012-07-18 Method and apparatus for displaying visual information about participants in a teleconference

Publications (1)

Publication Number Publication Date
WO2013056756A1 (en) 2013-04-25

Family

ID=69464124

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2012/003034 WO2013056756A1 (en) 2011-10-18 2012-07-18 Method and apparatus for displaying visual information about participants in a teleconference

Country Status (1)

Country Link
WO (1) WO2013056756A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111294472A (en) * 2018-12-10 2020-06-16 T移动美国公司 Participant identification for teleconferencing

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030158900A1 (en) * 2002-02-05 2003-08-21 Santos Richard A. Method of and apparatus for teleconferencing
US20040263636A1 (en) * 2003-06-26 2004-12-30 Microsoft Corporation System and method for distributed meetings
US20060098085A1 (en) 2004-11-05 2006-05-11 Nichols Paul H Display management during a multi-party conversation
US20090220065A1 (en) * 2008-03-03 2009-09-03 Sudhir Raman Ahuja Method and apparatus for active speaker selection using microphone arrays and speaker recognition
FR2949894A1 (en) * 2009-09-09 2011-03-11 Saooti Individual's e.g. moderator, courtesy determining method for e.g. broadcasting audio programs in radio, involves measuring time information of individual during discussion, and determining courtesy of individual from measured information

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"The Free Encyclopedia", vol. 19, 21 May 2012, UNIFIED COMMUNICATIONS IN WIKIPEDIA, pages: 21
"The Free Encyclopedia", vol. 23, 11 May 2012, article "Speaker recognition", pages: 38
ROSS CUTLER; LARRY DAVIS: "LOOK WHO'S TALKING: SPEAKER DETECTION USING VIDEO AND AUDIO CORRELATION", INSTITUTE FOR ADVANCED COMPUTER STUDIES UNIVERSITY OF MARYLAND

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12750315

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12750315

Country of ref document: EP

Kind code of ref document: A1