US20010043719A1 - Hand pointing device - Google Patents

Hand pointing device

Info

Publication number
US20010043719A1
US20010043719A1 (application US 09/040,436)
Authority
US
United States
Prior art keywords
person
recognized
images
image
image pickup
Prior art date
Legal status
Granted
Application number
US09/040,436
Other versions
US6385331B2
Inventor
Kenichi Harakawa
Kenichi Unno
Norio Igawa
Current Assignee
Takenaka Corp
Original Assignee
Takenaka Corp
Priority date
Filing date
Publication date
Application filed by Takenaka Corp filed Critical Takenaka Corp
Assigned to TAKENAKA CORPORATION. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HARAKAWA, KENICHI; IGAWA, NORIO; UNNO, KENICHI
Publication of US20010043719A1
Application granted
Publication of US6385331B2
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03 Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/0304 Detection arrangements using opto-electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/10 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals
    • A63F2300/1087 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by input arrangements for converting player-generated signals into game device control signals comprising photodetecting means, e.g. a camera

Definitions

  • the present invention relates to a hand pointing apparatus, and more specifically to a hand pointing apparatus for picking up the image of a person to be recognized and for determining a position or a direction pointed to by the person to be recognized.
  • a hand pointing input apparatus which comprises a display for displaying predetermined information, an illuminating device for illuminating an information inputting person who comes to the display, and a plurality of image pickup devices for picking up the image of the approaching information inputting person from different directions, wherein the plurality of image pickup devices pick up images of situations where the approaching information inputting person points with a finger or the like to an optional position on the display, the information inputting person is recognized in accordance with a plurality of images obtained by the image pickup, the position on the display pointed to by the information inputting person is determined, a cursor or the like is displayed on the position pointed to on the display, and the position on the display pointed to is recognized as being clicked at the time of detecting the fact that the information inputting person has performed a clicking action by raising a thumb, whereby a predetermined processing is performed (see, for example, Japanese Patent Application Laid-open (JP-A) Nos. 4-271423, 5-19957, 5-3241
  • because the information inputting person can give various instructions to an information processing apparatus and input various information to the information processing apparatus without touching an input device such as a keyboard or a mouse, it is possible to simplify the operation for using the information processing apparatus.
  • an object which is not a subject to be recognized, for example, the luggage of the information inputting person or trash, may exist around the information inputting person who is the subject to be recognized.
  • the surroundings of the information inputting person are also illuminated by an illuminating light emitted from the illuminating device.
  • this object which is not the subject to be recognized is present as a high-luminance object in the images picked up by the image pickup device.
  • an object which is not the subject to be recognized is recognized as the information inputting person by mistake.
  • a three-dimensional coordinate of a feature point has been heretofore determined by a calculation from the position of the feature point of the information inputting person on the picked-up image (for example, a tip of his/her forefinger or the like) so as to thereby determine the position on the display pointed to by the information inputting person.
  • the calculation processing for determining the three-dimensional coordinate of the feature point is complicated. Due to this fact, a long time is required for the determination of the instruction from the information inputting person in the same manner as the above-described case.
  • the present invention was completed in consideration of the above facts. It is a first object of the present invention to provide a hand pointing apparatus having a simple construction and being capable of reducing the time required for the determination of an instruction from a person to be recognized.
  • a hand pointing apparatus comprises: illuminating means for illuminating a person to be recognized; a plurality of image pickup means, located in different positions wherein the image pickup range is adjusted for each image so that the person to be recognized who is illuminated by the above-described illuminating means, may be within the image pickup range, and an illuminated range on a floor surface, which is illuminated by the above-described illuminating means, may be out of the image pickup range; and determining means for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, the situations being indicative of the person to be recognized pointing to either a specific position or a specific direction, and for determining either the position or the direction pointed to by the person to be recognized.
  • the person to be recognized may point to a specific position on, for example, the surface of a display screen or the like of a display, or may point to a specific direction (for example, the direction in which a specific object exists as seen from the person to be recognized).
  • the determining means extracts the image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, where the situations are indicative of the person to be recognized pointing to either the specific position or the specific direction, and the determining means determines either the position or the direction pointed to by the person to be recognized.
  • a three-dimensional coordinate of a feature point of the person to be recognized (a point whose position is changed in response to the motion by the person to be recognized to point to a specific position or a specific direction, for example, a tip of a predetermined part, (for example, the hand, the finger, or the like), of the body of the person to be recognized making the pointing motion, the tip of a pointer held by the person to be recognized or the like)
  • the determination of the specific position or direction pointed to can be accomplished based on the position of the person to be recognized and the three-dimensional coordinates of the feature point.
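  • As an illustration of the determination described above, the following minimal sketch (not part of the patent text; the coordinate convention, names and numbers are assumptions) extends the line from a reference point on the body through the fingertip feature point until it meets a display plane assumed to lie at y = 0:

```python
import numpy as np

def pointed_position_on_display(reference_pt, feature_pt, display_plane_y=0.0):
    """Extend the line reference_pt -> feature_pt until it meets the plane y = display_plane_y."""
    reference_pt = np.asarray(reference_pt, dtype=float)
    feature_pt = np.asarray(feature_pt, dtype=float)
    direction = feature_pt - reference_pt
    if abs(direction[1]) < 1e-9:              # line runs parallel to the display plane
        return None
    t = (display_plane_y - reference_pt[1]) / direction[1]
    if t <= 0:                                # the person is pointing away from the display
        return None
    hit = reference_pt + t * direction
    return float(hit[0]), float(hit[2])       # horizontal / vertical position on the display

# example: reference point (chest) and feature point (fingertip), coordinates in metres
print(pointed_position_on_display((0.3, 2.0, 1.2), (0.25, 1.5, 1.3)))
```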
  • the image pickup range of a plurality of pickup means is adjusted so that the person to be recognized, who is illuminated by the illuminating means, may be within the image pickup range, and the illuminated range on the floor surface which is illuminated by the illuminating means, may be out of the image pickup range.
  • the possibility that this object which is not the subject to be recognized comes within the image pickup range of the image pickup means is reduced.
  • even if the object which is not the subject to be recognized comes within the image pickup range, the object is not illuminated by the illuminating means and its luminance is thus reduced.
  • the image part corresponding to the object which is not the subject to be recognized may exist in the image picked up by the image pickup means; however, even if this image part exists, its luminance is reduced.
  • the image pickup range of a plurality of image pickup means is adjusted so that the person to be recognized, who is illuminated by the illuminating means, may be within the image pickup range, and the illuminated range on the floor surface which is illuminated by the illuminating means, may be out of the image pickup range.
  • a hand pointing apparatus comprises: a plurality of illuminating means for illuminating a person to be recognized from different directions; a plurality of image pickup means, located in different positions corresponding to each of the plurality of illuminating means, wherein an image pickup range is adjusted so that the person to be recognized, who is illuminated by the corresponding illuminating means, may be within the image pickup range, and the illuminated range on a floor surface, which is illuminated by the corresponding illuminating means, may be out of the image pickup range; controlling means for switching on/off the plurality of illuminating means one by one in sequence, and for controlling so as to pick up the image of the person to be recognized pointing to either a specific position or a specific direction by the image pickup means corresponding to the switched-on illuminating means; and determining means for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images picked up by the plurality of image pickup
  • the second aspect of the present invention is provided with a plurality of illuminating means for illuminating the person to be recognized from different directions.
  • the plurality of image pickup means are located in different positions corresponding to a plurality of illuminating means.
  • the image pickup range of the plurality of image pickup means is adjusted so that the person to be recognized, who is illuminated by the corresponding illuminating means, may be within the image pickup range, and the illuminated range on the floor surface, which is illuminated by the corresponding illuminating means, may be out of the image pickup range.
  • the possibility that this object which is not the subject to be recognized comes within the image pickup range of the image pickup means is reduced. Even if this object comes within the image pickup range of the image pickup means, the luminance of the picked-up image is reduced.
  • the controlling means switches on/off a plurality of illuminating means one by one in sequence, and controls so as to pick up the images of the person to be recognized pointing to either a specific position or a specific direction by the image pickup means corresponding to the switched-on illuminating means, whereby the picked-up images are output from each of the image pickup means.
  • the image pickup is performed by the image pickup means at low luminance.
  • the determining means extracts the image part corresponding to the person to be recognized from a plurality of images based on a plurality of images output by a plurality of image pickup means, and then it determines either the position or the direction indicated by the person to be recognized.
  • the image part corresponding to the object which is not the subject to be recognized exists. Even if this image part exists, the image part corresponding to the person to be recognized is extracted in accordance with a plurality of images whose luminance is low.
  • a hand pointing apparatus comprises: a plurality of illuminating means for illuminating a person to be recognized from different directions; at least one image pickup means for picking up the image of the person to be recognized, who is illuminated by the illuminating means; discriminating means for switching on/off the plurality of illuminating means one by one in sequence, for comparing a plurality of images of the person to be recognized pointing to either a specific position or a specific direction picked up by the same image pickup means during the switching on of the plurality of illuminating means, and for discriminating between an image part corresponding to the person to be recognized and an image part other than the image part corresponding to the person to be recognized in the plurality of images for at least one image pickup means; and determining means for extracting the image part corresponding to the person to be recognized from the plurality of images picked up by the image pickup means based on a result of a discrimination by the discriminating means, and for determining either the position or the direction
  • the discriminating means of the third aspect of the present invention switches on/off a plurality of illuminating means one by one in sequence, compares a plurality of images of the person to be recognized pointing to either a specific position or a specific direction picked up by the same image pickup means during the switching on of a plurality of illuminating means, and discriminates between the image part corresponding to the person to be recognized and the image part other than the image part corresponding to the person to be recognized in a plurality of images for at least one image pickup means.
  • the luminance is always high in the image part corresponding to the person to be recognized in a plurality of images picked up by the same image pickup means during the switching on of a plurality of illuminating means.
  • on the other hand, the luminance is considerably varied in the image part corresponding to the objects which are not the subject to be recognized, such as luggage and trash on the floor surface around the person to be recognized, depending on the direction of the illumination during the image pickup.
  • the determining means extracts the image part corresponding to the person to be recognized from the plurality of images picked up by the image pickup means based on the result of the discrimination by the discriminating means, and determines either the position or the direction pointed to by the person to be recognized. Therefore, it is possible to extract the image part corresponding to the person to be recognized in a short time by a simple processing without performing complicated image processing. It is also possible to reduce the time required for determining an instruction from the person to be recognized without the use of an image processor or the like having a high processing speed and a complicated construction.
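  • A minimal sketch of the discrimination idea described above, assuming two grayscale frames taken by the same camera while illuminator A and then illuminator B is switched on; the threshold value and function names are illustrative assumptions, not part of the patent:

```python
import numpy as np

def person_mask(frame_a, frame_b, threshold=128):
    """Pixels that stay bright under both illumination directions are taken as the person."""
    bright_a = np.asarray(frame_a) >= threshold
    bright_b = np.asarray(frame_b) >= threshold
    return bright_a & bright_b    # objects lit by only one illuminator drop out here
```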
  • a hand pointing apparatus comprises: illuminating means for illuminating a person to be recognized; a plurality of image pickup means for picking up the image of the person to be recognized, who is illuminated by the illuminating means from different directions; determining means for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, the situations being indicative of the person to be recognized pointing to either a specific position or a specific direction, and for determining either the position or the direction pointed to by the person to be recognized; and preventing means for preventing an object which is not the subject to be recognized from remaining on the floor surface around the person to be recognized.
  • the fourth aspect of the present invention is provided with the preventing means for preventing an object which is not the subject to be recognized from remaining on the floor surface around the person to be recognized. Since this prevents the object which is not the subject to be recognized from remaining around the person to be recognized, it is possible to prevent the image part corresponding to the object which is not the subject to be recognized from existing in the images picked up by the image pickup means.
  • the determining means extracts the image part corresponding to the person to be recognized based on a plurality of images obtained by the image pickup means, and determines either the position or the direction pointed to by the person to be recognized. Thus, it is possible to extract the image part corresponding to the person to be recognized in a short time by a simple processing without performing complicated image processing. It is therefore possible to reduce the time required for determining an instruction from the person to be recognized without the use of an image processor or the like having a high processing speed and a complicated construction.
  • an inclined surface (slope) formed on the floor surface around the person to be recognized can be used as the preventing means.
  • a relatively large object which is not the subject to be recognized, for example, the luggage of the person to be recognized
  • the object which is not the subject to be recognized slides down on the inclined surface.
  • Air flow generating means such as a fan for generating an air flow around the person to be recognized may be also applied as the preventing means.
  • a relatively small object which is not the subject to be recognized, for example, small trash, dust or the like
  • a storage tank for storing water or the like around the person to be recognized may be also arranged as the preventing means.
  • this storage tank may be circular in shape so that the water or the like may circulate through the storage tank, whereby it may be used as the preventing means.
  • in the fourth aspect of the present invention, since there is provided a preventing means for preventing an object which is not the subject to be recognized from remaining on the floor surface around the person to be recognized, the effect is obtained in which it is possible to provide a hand pointing apparatus of simple construction wherein the time required for the determination of an instruction from the person to be recognized is reduced.
  • a hand pointing apparatus comprises: illuminating means for illuminating a person to be recognized who arrives at a predetermined place; a plurality of image pickup means for picking up the image of the person to be recognized, who is illuminated by the illuminating means from different directions; storing means for storing information for corresponding the three-dimensional coordinates of a plurality of virtual points positioned near the predetermined place, to the positions of the plurality of virtual points on the plurality of images picked up by the plurality of image pickup means; and determining means: for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, the situations being indicative of the person to be recognized pointing to either a specific position or a specific direction; for determining the position of a feature point of the person to be recognized in each of the images; for determining the three-dimensional coordinate of the feature point based on the determined position of the feature
  • the storing means stores therein the information for corresponding the three-dimensional coordinates of a plurality of virtual points positioned near the predetermined place to the positions of the plurality of virtual points on the plurality of images picked up by the plurality of image pickup means.
  • the determining means extracts the image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, where the situations are indicative of the person to be recognized pointing to either a specific position or a specific direction, and the determining means determines the position of the feature point of the person to be recognized in each of the images.
  • the determining means determines the three-dimensional coordinates of the feature point based on the determined position of the feature point and the information stored in the storing means, and determines either the position or the direction pointed to by the person to be recognized based on the determined three-dimensional coordinates of the feature point.
  • a correspondence between the three-dimensional coordinates of a plurality of virtual points positioned near the predetermined place, and the positions of the plurality of virtual points on the plurality of images picked up by the plurality of image pickup means is previously confirmed from the information stored in the storing means.
  • the three-dimensional coordinates of the feature point of the person to be recognized are determined based on the information stored in the storing means.
  • the three-dimensional coordinate of the feature point of the person to be recognized can be determined by a very simple processing. Therefore, it is possible to reduce the time required for the determination of an instruction from the person to be recognized without the use of an image processor or the like having a high processing speed and a complicated construction.
  • the storing means stores the information for corresponding the three-dimensional coordinates of many virtual points equally spaced in a lattice arrangement near the predetermined place, to the positions of these many virtual points on the plurality of images picked up by the plurality of image pickup means.
  • the three-dimensional coordinate of the feature point can be determined in the following manner, for example.
  • the determining means of the fifth aspect of the present invention can determine the position of the feature point of the person to be recognized in the images, extract from the images the virtual points positioned in a region within a predetermined range including the feature point on the images, and determine the three-dimensional coordinates of the feature point in accordance with the three-dimensional coordinates of the common virtual points extracted from the images.
  • the virtual points positioned in the region within a predetermined range including the feature point on the images are extracted from the images, whereby all the virtual points which are likely to exist in the region adjacent to the feature point on the three-dimensional coordinate are extracted.
  • The area of this region can be set in accordance with the spacing between the virtual points.
  • the determining means determines the three-dimensional coordinates of the feature point based on the three-dimensional coordinates of the common virtual points extracted from the images.
  • the images picked up by the image pickup means show the situation within the image pickup range, namely, the subject projected on a plane. Therefore, even if a plurality of points, which are positioned as if they were superimposed when seen from the image pickup means, have different three-dimensional coordinates, the points are located in the same position when picked up on a two-dimensional image.
  • the three-dimensional coordinates of the feature point are determined from the three-dimensional coordinates of the common extracted virtual points, whereby the three-dimensional coordinates of the feature point can be determined with a higher level of accuracy.
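  • A minimal sketch of the lookup-based determination described above (all names and the data layout are assumptions): for each camera, the lattice points whose image positions fall within a small window around the feature point are collected, the sets from the two cameras are intersected, and the three-dimensional coordinates of the common lattice points are averaged:

```python
import numpy as np

def feature_point_3d(feature_xy_a, feature_xy_b, lattice_a, lattice_b, radius=10.0):
    """lattice_a / lattice_b map a lattice-point id to ((x, y, z), (img_x, img_y)) for each camera."""
    def nearby(feature_xy, lattice):
        fx, fy = feature_xy
        return {pid for pid, (_, (ix, iy)) in lattice.items()
                if (ix - fx) ** 2 + (iy - fy) ** 2 <= radius ** 2}

    common = nearby(feature_xy_a, lattice_a) & nearby(feature_xy_b, lattice_b)
    if not common:
        return None
    coords = np.array([lattice_a[pid][0] for pid in common])
    return coords.mean(axis=0)    # averaged 3-D coordinates of the common lattice points
```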
  • the information to be stored in the storing means can be set permanently based on the result of an experimental measurement or the like of the three-dimensional coordinates of plural virtual points positioned near a predetermined place, and the positions of plural virtual points on the images picked up by the image pickup means.
  • when the positional relationship between a predetermined place at which the person to be recognized arrives and the image pickup means is changed, or when this positional relationship is considerably different in design depending on the individual hand pointing apparatuses, it is necessary to reset the information to be stored in the storing means.
  • the fifth aspect of the present invention further can comprise: generating means for allowing the plurality of image pickup means to pick up images of the situations where markers are positioned in the positions of the virtual points, the generating means for generating the information for corresponding the three-dimensional coordinates of the virtual points to the positions of the virtual points on the images based on the three-dimensional coordinates of the virtual points and the marker positions on the images picked up by the plurality of image pickup means, and the generating means for allowing the storing means to store the generated information.
  • Any marker will do as long as the marker is easy to identify on the images obtained by the image pickup.
  • a particular-color mark and a light-emission source such as an LED can be used as the marker.
  • the marker may be manually positioned in a predetermined position by a person.
  • the marker may be automatically positioned by moving means for moving the marker to an optional position. When the marker is moved by the moving means, the three-dimensional coordinates of a predetermined position can be determined from the amount of movement of the marker caused by the moving means.
  • the generating means is provided in the above-mentioned manner, whereby the information for corresponding the three-dimensional coordinates of the virtual points to the positions of the virtual points on the images is automatically generated.
  • the information for corresponding the three-dimensional coordinates of a plurality of virtual points positioned near a predetermined place at which the person to be recognized arrives, to the positions of a plurality of virtual points on a plurality of images picked up by a plurality of image pickup means is stored.
  • the three-dimensional coordinates of the feature point are determined based on the position of the feature point on a plurality of images picked up by a plurality of image pickup means and the stored information.
  • a hand pointing apparatus comprises: illuminating means for illuminating a person to be recognized; a plurality of image pickup means for picking up the image of the person to be recognized, who is illuminated by the illuminating means from different directions; determining means for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, the situations being indicative of the person to be recognized pointing to either a specific position or a specific direction, and for determining either the position or the direction pointed to by the person to be recognized; first detecting means for extracting the image part corresponding to a predetermined part of the body of the person to be recognized from the plurality of images, and for detecting a change in any one of either an area of the extracted image part, an outline of the extracted image part and a length of an outline of the extracted image part; and processing means for executing a predetermined processing when the change is detected by the first
  • the sixth aspect of the present invention is provided with the first detecting means for extracting the image part corresponding to a predetermined part (for example, the hand, the arm or the like) of the body of the person to be recognized in the plurality of images and for detecting a change in the area of the extracted image part, a change in the contour of the extracted image part, or a change in the length of the contour line of the extracted image part.
  • the processing means executes a predetermined processing when a change is detected by the first detecting means.
  • the area, the contour, and the length of the contour line of the image part can be relatively easily detected.
  • when the person to be recognized moves a predetermined part of the body, even if his/her motion is not a predefined motion, in almost all cases the area, the contour, or the length of the contour line of the image part corresponding to the predetermined part is changed.
  • in the sixth aspect of the present invention, since a change in the area, the contour, or the length of the contour line of the image part is used, it is possible to improve the degree of freedom of movement which the person to be recognized has in order to instruct the processing means to execute a predetermined processing. This movement can be also detected in a short time. Thus, the effect is obtained in which the instruction from the person to be recognized can be determined in a short time.
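  • A minimal sketch of the change detection described in the sixth aspect, assuming a binary mask of the image part corresponding to the hand; the area is the pixel count, the contour length is approximated by counting boundary pixels, and the relative-change threshold is an illustrative assumption:

```python
import numpy as np

def region_area_and_perimeter(mask):
    """Area = pixel count; perimeter approximated by the number of boundary pixels."""
    mask = np.asarray(mask, dtype=bool)
    area = int(mask.sum())
    interior = (mask & np.roll(mask, 1, 0) & np.roll(mask, -1, 0)
                     & np.roll(mask, 1, 1) & np.roll(mask, -1, 1))
    perimeter = int((mask & ~interior).sum())
    return area, perimeter

def shape_changed(prev_mask, curr_mask, rel_change=0.2):
    """Signal a change when the area or contour length differs by more than rel_change."""
    a0, p0 = region_area_and_perimeter(prev_mask)
    a1, p1 = region_area_and_perimeter(curr_mask)
    return (abs(a1 - a0) > rel_change * max(a0, 1)
            or abs(p1 - p0) > rel_change * max(p0, 1))
```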
  • a hand pointing apparatus comprises: illuminating means for illuminating a person to be recognized; a plurality of image pickup means for picking up the image of person to be recognized, who is illuminated by the illuminating means from different directions; determining means for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, the situations being indicative of the person to be recognized pointing to either a specific position or a specific direction, for determining the three-dimensional coordinates of the feature point whose position is changed when the person to be recognized bends or extends an arm and the three-dimensional coordinates of a reference point whose position is not changed even if the person to be recognized bends or extends an arm, and for determining either the position or the direction pointed to by the person to be recognized in accordance with the three-dimensional coordinates of the feature point and the three-dimensional coordinates of the reference point; and
  • the determining means extracts the image part corresponding to the person to be recognized from a plurality of images, determines the three-dimensional coordinates of the feature point whose position is changed when the person to be recognized bends or extends an arm and the three-dimensional coordinates of the reference point whose position is not changed even if the person to be recognized bends or extends an arm, and determines either the position or the direction pointed to by the person to be recognized based on the three-dimensional coordinates of the feature point and the three-dimensional coordinates of the reference point.
  • the processing means calculates the distance between the reference point and the feature point, and executes a predetermined processing based on the change in the distance between the reference point and the feature point.
  • the tip of the hand, the finger or the like of the person to be recognized or the point corresponding to the tip or the like of a pointer held by the person to be recognized can be used as the feature point.
  • a point corresponding to the body (such as the chest and the shoulder joint) of the person to be recognized can be used as the reference point.
  • the pointed position or direction pointed to is determined by the determining means. If the person to be recognized makes a motion to bend or extend the arm, the distance between the reference point and the feature point is changed, so that a predetermined processing is thus performed based on this change in the distance.
  • the direction in which the image pickup means picks up the image can be set so that the reference point and the feature point can be reliably detected without taking into account motions such as the raising and lowering of the finger. Furthermore, since whether or not the execution of a predetermined processing is instructed is determined on the basis of the change in the distance (relative position) between the reference point and the feature point, it is unnecessary to detect additional image features in order to determine whether or not the execution of a predetermined processing is being instructed. In addition, the distance between the reference point and the feature point scarcely changes even if a person makes a motion to point to a specific position or a specific direction.
  • according to the seventh aspect of the present invention, it is possible to reliably detect the motion of the person to be recognized to instruct the execution of a predetermined processing (the motion to bend or extend the arm) in a short time.
  • the instruction from the person to be recognized can thus be confirmed in a short time.
  • the processing means can execute, as a predetermined processing, the processing associated with the position or direction pointed to by the person to be recognized, for example, when the distance between the reference point and the feature point is changed. Since the motion to bend or extend the arm is a very natural motion, if this motion is used to instruct the above-described execution of a predetermined processing, the person to be recognized can make the motion for instructing the execution of a predetermined processing without feeling a sense of uncomfortableness.
  • the direction of the change in the distance between the reference point and the feature point due to the motion to bend or extend the arm is of two types (a direction of increase in the distance and a direction of reduction in the distance).
  • a first predetermined processing may be carried out when the distance between the reference point and the feature point is increased.
  • a second predetermined processing differing from the first predetermined processing may be carried out when the distance between the reference point and the feature point is reduced.
  • the first predetermined processing is carried out.
  • the second predetermined processing is carried out. It is therefore possible for the person to be recognized to select the processing to be executed from either the first predetermined processing or the second predetermined processing, similarly to, for example, the left and right clicks of a mouse.
  • the person to be recognized makes either the extending motion or the bending motion, whereby it is possible to reliably execute the processing selected from either the first predetermined processing or second predetermined processing by the person to be recognized.
  • as for the determination of whether or not the execution of a predetermined processing is instructed on the basis of a change in the distance between the reference point and the feature point, more particularly, for example, the magnitude of the change in the distance between the reference point and the feature point is compared. If the change in the distance is a predetermined value or more, it is possible to determine that the execution of a predetermined processing is instructed. However, if the distance between the reference point and the feature point is considerably changed due to other motions having no intention of the execution of a predetermined processing, then it is possible that the instruction from the person to be recognized may be mistaken.
  • the processing means detects the rate of change in the distance between the reference point and the feature point, that is, the velocity of the change, and executes a predetermined processing when the detected velocity of change is at a threshold value or more.
  • the velocity of the change in the distance between the reference point and the feature point is detected, and a predetermined processing is then executed only when the detected velocity of the change is at the threshold value or more.
  • the person to be recognized makes a specific motion to quickly bend or extend an arm, whereby the velocity of the change in the distance between the reference point and the feature point reaches the threshold value or more, so that a predetermined processing is executed.
  • the rate of recognition of the motion of the person to be recognized for instructing the execution of a predetermined processing is improved. Only when the person to be recognized makes a motion for instructing the execution of a predetermined processing, is this motion reliably detected allowing a predetermined processing to be carried out.
  • the seventh aspect of the present invention further comprises threshold value setting means for requesting the person to be recognized to bend or extend the arm and for previously setting the threshold value based on the rate of the change in the distance between the reference point and the feature point when the person to be recognized bends or extends the arm.
  • the threshold value as to whether or not the processing means executes a predetermined processing is previously set based on the rate of the change in the distance between the reference point and the feature point when the person to be recognized bends or extends an arm (quickly bends or extends an arm) in order to allow the processing means to execute a predetermined processing, whereby the threshold value can be obtained in response to the physique, muscular strength, or the like of the individual persons to be recognized.
  • Whether or not the execution of a predetermined processing is instructed is determined by the use of this threshold value, whereby it is possible to reliably detect the motion of the person to be recognized to instruct the execution of a predetermined processing and to execute a predetermined processing, regardless of any variation in physique, muscular strength, or the like, depending on the individual person to be recognized.
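  • A minimal sketch of the click detection and threshold setting described above (parameter names and the calibration rule are assumptions): the rate of change of the reference-to-feature distance is compared with a threshold, the sign of the change selects between two processings, and the threshold can be derived from a requested calibration motion:

```python
import numpy as np

def click_from_distance(prev_dist, curr_dist, dt, speed_threshold):
    """Return 'forward', 'backward', or None from the change in the reference-feature distance."""
    velocity = (curr_dist - prev_dist) / dt
    if velocity >= speed_threshold:
        return "forward"     # arm extended quickly: distance increased
    if velocity <= -speed_threshold:
        return "backward"    # arm bent quickly: distance decreased
    return None

def calibrate_threshold(distance_samples, dt, factor=0.5):
    """Take a fraction of the peak speed seen while the user performs a requested arm motion."""
    speeds = np.abs(np.diff(np.asarray(distance_samples, dtype=float))) / dt
    return factor * float(speeds.max())
```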
  • the seventh aspect of the present invention further comprises second detecting means for extracting the image part corresponding to the arm of the person to be recognized from the plurality of images and for detecting whether or not the arm of the person to be recognized is lowered, wherein the processing means continues in its current state when the second detecting means detects that the arm of the person to be recognized is lowered. Namely, an execution state is continued when the processing is carried out, while a stop state is continued when the processing is stopped.
  • because the person to be recognized does not need to keep raising the arm in order to continuously execute a certain processing, the burden on the person to be recognized can be reduced.
  • the position or direction pointed to by the person to be recognized is determined on the basis of the three-dimensional coordinates of the feature point whose position is changed when the person to be recognized bends or extends an arm and on the basis of the three-dimensional coordinates of the reference point whose position is not changed even if the person to be recognized bends and extends an arm, and a predetermined processing is also executed based on the change in the distance between the reference point and the feature point.
  • FIG. 1 is a perspective view showing surroundings of an information input space.
  • FIG. 2 is a block diagram showing a schematic constitution of a hand pointing input apparatus according to the present embodiment.
  • FIG. 3 schematically shows an example of a relationship between an illumination range of an illuminator and an image pickup range of a video camera.
  • FIG. 4 is a perspective view of the information input space showing an example of a mark plate.
  • FIG. 5 is a flow chart of an initialization processing of information about a lattice point position.
  • FIG. 6 is a flow chart of an illumination control processing.
  • FIG. 7 is a timing chart showing a timing of the switch-on/off of illuminators A, B by the illumination control processing of FIG. 6 and of an output (capture) of an image picked up by the video camera.
  • FIGS. 8A and 8B are a flow chart of an instruction determination processing.
  • FIG. 9 is a side view of the information input space for describing a calculation of the height of an information inputting person and the position of the information inputting person on a floor surface.
  • FIG. 10A is an image illustration showing an image of the hand of the information inputting person picked up by the video camera.
  • FIG. 10B is a conceptual view of a search range for the lattice point for determining the coordinates of a feature point and the three-dimensional coordinates of the feature point.
  • FIG. 11A is a plan view of the information input space for describing the determination of the position on a display pointed to by the information inputting person.
  • FIG. 11B is a side view of the information input space shown in FIG. 11A.
  • FIGS. 12A-12C are image illustrations showing an example of a motion of the information inputting person.
  • FIG. 13 schematically shows another example of the relationship between the illumination range of the illuminator and the image pickup range of the video camera.
  • FIG. 14 is a flow chart of the illumination control processing in an arrangement shown in FIG. 13.
  • FIG. 15 is a timing chart showing the timing of the switch-on/off of the illuminators A, B by the illumination control processing of FIG. 14.
  • FIG. 16 is a perspective view of an aspect of a slope platform arranged on the floor surface in the information input space.
  • FIG. 17 is a perspective view of the information input space showing another example of the mark plate.
  • FIG. 18 is a perspective view of the information input space showing an example of a movement of a marker position by a robot arm unit.
  • FIG. 19 is a flow chart of another example of the instruction determination processing.
  • FIG. 20 is a flow chart of a further example of the instruction determination processing.
  • FIG. 21 is a flow chart of the processing for setting the click motion speed.
  • FIG. 22A is an image illustration for describing a forward click motion.
  • FIG. 22B is an image illustration for describing a backward click motion.
  • FIG. 23 is an image illustration for describing a data conversion into a dummy model.
  • a large-screen display 12 is built into a wall surface in a place at which an information inputting person 10, who is the person to be recognized of the present invention, arrives.
  • Known display means such as a liquid crystal display (LCD), a plasma display, a cathode ray tube (CRT) and an optical fiber display can be applied as the display 12 .
  • the display 12 is connected to an information processor 14 composed of a personal computer or the like (see FIG. 2).
  • the information processor 14 allows various types of information to be displayed on a display surface in various display forms, such as a figure, a table, a character, an image or the like.
  • the information inputting person 10 arrives at the place (information input space) shown in FIG. 1 in front of the display 12 .
  • the information inputting person 10 points to a position on the display surface of the display 12 on which various information is displayed, while he/she makes a click motion (described below in detail), whereby he/she gives various instructions to the information processor 14 and allows various types of processing to be executed.
  • a controller 22 of a hand pointing input apparatus 20 is connected to the information processor 14 .
  • the controller 22 includes a CPU 22A, a ROM 22B, a RAM 22C, and an I/O interface 22D. These elements are connected to each other through a bus.
  • the information processor 14, a non-volatile memory 24 capable of updating stored contents, a display 26 for displaying various types of information and a keyboard 28 for inputting various instructions and data by an operator are connected to the I/O interface 22D.
  • An illumination control device 30 is also connected to the I/O interface 22 D of the controller 22 .
  • a plurality of near-infrared light illuminators 32 A and 32 B for emitting a light of a wavelength within a near-infrared range in a beam manner are connected to the illumination control device 30 .
  • the near-infrared light illuminators 32 A and 32 B are arranged in different positions over the information input space. Their radiation ranges are adjusted so that the illuminators 32 A and 32 B may illuminate, from different directions, the information inputting person 10 who arrives at the information input space (see FIG. 3, too).
  • the illumination control device 30 controls the switch-on/off of the illuminators 32 A and 32 B in response to the instruction from the controller 22 .
  • an image pickup control device 34 is connected to the I/O interface 22D of the controller 22.
  • a plurality of video cameras 36 A and 36 B arranged in different positions over the information input space (see FIG. 1) are connected to this image pickup control device 34 .
  • the video cameras 36 A and 36 B include an area sensor composed of a near-infrared-light-sensitive CCD or the like.
  • a filter for transmitting only the light of the wavelength within the near-infrared range is also disposed on the light-incident side of an imaging lens for forming incident light into an image on a receptor surface of the area sensor.
  • the video camera 36A is oriented so that the information inputting person 10 who arrives at the information input space may be within an image pickup range. It is also oriented so that the light emitted from the illuminator 32A corresponding to the video camera 36A does not fall directly on the imaging lens, and so that the center of the image pickup range may cross the center of the range illuminated by the illuminator 32A at a predetermined height from the floor surface in the information input space. Therefore, the image pickup range of the video camera 36A is adjusted so that the range on the floor surface illuminated by the illuminator 32A corresponding to the video camera 36A may be out of the image pickup range.
  • the video camera 36 B is oriented so that the information inputting person 10 who arrives at the information input space may be within the image pickup range, the light emitted from the illuminator 32 B corresponding to the video camera 36 B may not fall directly on the imaging lens and the center of the image pickup range may cross the center of the range illuminated by the illuminator 32 B at a predetermined height from the floor surface in the information input space. Therefore, the image pickup range of the video camera 36 B is adjusted so that the range on the floor surface illuminated by the illuminator 32 B corresponding to the video camera 36 B may be out of the image pickup range.
  • the image pickup ranges of the video cameras 36 A and 36 B are adjusted so that the ranges on the floor surface illuminated by the different illuminators corresponding to the video cameras may be out of the image pickup ranges.
  • a mark plate driving unit 38 is also connected to the I/O interface 22 D of the controller 22 .
  • the hand pointing input apparatus 20 comprises a mark plate 40 arranged near the information input space.
  • the mark plate 40 is composed of a multiplicity of marks 40 A which are recorded so as to be equally spaced in a matrix form on a transparent flat plate.
  • the mark plate 40 can be moved so that it may move across the information input space in a direction perpendicular to the main surface of the mark plate 40 (a direction shown by arrow A in FIG. 4).
  • the marks 40 A are colored with a color which is easy to recognize on the image (for example, red).
  • the mark plate driving unit 38 allows the mark plate 40 to be moved in the direction of the arrow A in FIG. 4 in response to an instruction from the controller 22 .
  • step 100 the mark plate driving unit 38 allows the mark plate 40 to be moved to a predetermined position (a position corresponding to an end of the moving range of the mark plate 40 ), namely, a reference position.
  • step 102 the three-dimensional coordinates (x, y, z) of the multiplicity of marks 40 A recorded on the mark plate 40 in the information input space, in the current position of the mark plate 40 are calculated.
  • step 104 the image of the information input space is picked up by the video cameras 36 A and 36 B through the image pickup control device 34 .
  • step 106 the image of the information input space picked up by the video camera 36 A (referred to as an image A) is captured through the image pickup control device 34 .
  • step 108 the marks 40 A in the image A captured in step 106 are recognized (extracted).
  • step 110 the positions (XA, YA) of all the recognized marks 40A on the image A are calculated.
  • step 112 the three-dimensional coordinates (x, y, z) in the information input space of all the marks 40A in the image A are made to correspond to the positions (XA, YA) of all the marks 40A on the image A, and this correspondence is stored in the memory 24 as the lattice point position information of the video camera 36A.
  • subsequent steps 114 through 120 the processes of the video camera 36 B are performed in the same manner as in the above-described steps 106 through 112 .
  • the image of the information input space picked up by the video camera 36 B (referred to as an image B) is captured through the image pickup control device 34 .
  • the marks 40 A in the image B captured in step 114 are recognized (extracted).
  • the positions (XB, YB) of all the recognized marks 40A on the image B are calculated.
  • step 120 the three-dimensional coordinates (x, y, z) in the information input space of all the marks 40A in the image B are made to correspond to the positions (XB, YB) of all the marks 40A on the image B, and this correspondence is stored in the memory 24 as the lattice point position information of the video camera 36B.
  • step 122 whether or not the mark plate 40 is moved to a final position (a position corresponding to the end opposite to the predetermined position in step 100 within the moving range of the mark plate 40) is determined. If the determination is negative in step 122, the processing proceeds to step 124.
  • step 124 the mark plate driving unit 38 allows the mark plate 40 to be moved in a predetermined direction by a fixed distance (specifically, the distance corresponding to the space between the marks 40A on the mark plate 40). Then, the processing returns to step 102.
  • steps 102 through 124 are repeated.
  • the multiplicity of marks 40A recorded on the mark plate 40 are moved to the positions corresponding to the multiplicity of lattice points (corresponding to virtual points) which are uniformly spaced in a lattice arrangement in the information input space.
  • the correspondence between the three-dimensional coordinates of the lattice points in the information input space and the positions thereof on the image A is stored in the memory 24 as the lattice point position information of the video camera 36 A.
  • the correspondence between the three-dimensional coordinates of the lattice points in the information input space and the positions thereof on the image B is also stored in the memory 24 as the lattice point position information of the video camera 36 B.
  • the lattice point position information initialized by the above-mentioned lattice point position information initialization corresponds to the information for corresponding the three-dimensional coordinates of the virtual points to the positions of the virtual points on the image.
  • the memory 24 corresponds to the storing means of the present invention. Since the mark plate 40 and the mark plate driving unit 38 are used only for the above-mentioned lattice point position information initialization and are not used for the following processing, the mark plate 40 and the mark plate driving unit 38 may be removed after the initialization.
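  • A minimal sketch of how the lattice point position information could be assembled from the initialization described above (the data layout and names are assumptions); each entry pairs the known three-dimensional coordinates of a mark with its detected position on a camera image:

```python
def build_lattice_info(plate_stops):
    """plate_stops: iterable of (world_coords, detections) pairs, one per stop of the mark plate.
    world_coords[i] and detections[camera_name][i] must refer to the same mark."""
    lattice_info = {}
    for world_coords, detections in plate_stops:
        for camera_name, image_positions in detections.items():
            lattice_info.setdefault(camera_name, []).extend(zip(world_coords, image_positions))
    return lattice_info

# example with a single plate stop and two cameras
info = build_lattice_info([
    ([(0.0, 0.0, 0.5), (0.1, 0.0, 0.5)],
     {"camera_a": [(120, 300), (131, 300)], "camera_b": [(410, 298), (421, 299)]}),
])
print(len(info["camera_a"]))   # 2 correspondences stored for camera A
```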
  • step 130 the illumination control device 30 switches on the illuminator 32 A and switches off the illuminator 32 B.
  • step 132 an image of the information input space is picked up by the video camera 36 A, and the image is then output from the video camera 36 A.
  • step 134 whether or not a predetermined time period has passed since the illuminator 32A was switched on is determined. Processing does not begin until a positive determination is made.
  • If an affirmative determination is made in step 134, the processing proceeds to step 136.
  • step 136 the illumination control device 30 switches off the illuminator 32 A and switches on the illuminator 32 B.
  • step 138 an image of the information input space is picked up by the video camera 36 B, and the image is then output from the video camera 36 B.
  • step 140 whether or not a predetermined time period has passed since the illuminator 32B was switched on is determined. Processing does not begin until a positive determination is made. Then, if an affirmative determination is made in step 140, the processing returns to step 130.
  • the above-described illumination control processing allows the illuminators 32 A and 32 B to be alternately switched on/off at a predetermined time interval.
  • the illuminator 32 A is switched on, the image is picked up by the video camera 36 A, and image data indicating the image A picked up by the video camera 36 A is then output to the controller 22 through the image pickup control device 34 .
  • the illuminator 32 B is switched on, the image is picked up by the video camera 36 B, and the image data indicating the image B picked up by the video camera 36 B is then output to the controller 22 through the image pickup control device 34 .
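  • A minimal sketch of the alternating illumination control described above; the device interfaces (illuminator and camera objects, the capture method, the interval value) are assumptions, not part of the patent:

```python
import time

def illumination_cycle(illuminator_a, illuminator_b, camera_a, camera_b,
                       interval_s=0.1, handle_frame=print):
    """Alternately switch the illuminators and capture with the matching camera."""
    while True:
        illuminator_a.on()
        illuminator_b.off()
        handle_frame("A", camera_a.capture())   # image A, taken while only illuminator A is on
        time.sleep(interval_s)
        illuminator_a.off()
        illuminator_b.on()
        handle_frame("B", camera_b.capture())   # image B, taken while only illuminator B is on
        time.sleep(interval_s)
```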
  • the image pickup is performed by means of a near-infrared light
  • the luminance of the image part corresponding to the information inputting person 10 in the picked up image is not influenced, and thus not altered by a change in the luminance of the display 12 when a visible light is emitted therefrom, or by the skin color or clothing color of the information inputting person 10 . Therefore, in the instruction determination processing as described below, the image part corresponding to the information inputting person 10 can be extracted with a high level of accuracy.
  • even if a fluorescent tube, which is processed so that light in the near-infrared wavelength range is not emitted therefrom, is disposed near the information input space, the processing is not influenced by this fact.
  • the above-described alternate switch-on/off of the illuminators 32 A and 32 B does not give an uncomfortable feeling to the information inputting person 10 .
  • step 150 the image data indicating the image A output from the video camera 36 A and the image data indicating the image B output from the video camera 36 B are captured at the timing shown in FIG. 7.
  • step 152 whether or not the information inputting person 10 is present in the information input space is determined based on the image data of the images A and B captured in step 150 .
  • the image of the information input space is picked up by the video camera 36 A when the illuminator 32 A alone is switched on, and the image pickup range of the video camera 36 A is adjusted so as to be out of the range on the floor surface illuminated by the illuminator 32 A. Accordingly, even if an object 50 A which is not a subject to be recognized (see FIG. 3), such as the luggage of the information inputting person 10 or trash, is present within the range on the floor surface illuminated by the illuminator 32 A, this object 50 A is not within the image pickup range of the video camera 36 A.
  • similarly, the image of the information input space is picked up by the video camera 36 B when the illuminator 32 B alone is switched on, and the image pickup range of the video camera 36 B is adjusted so that it may be out of the range on the floor surface illuminated by the illuminator 32 B. Accordingly, even if an object 50 B which is not the subject to be recognized (see FIG. 3) is present on the floor surface illuminated by the illuminator 32 B, this object 50 B is not within the image pickup range of the video camera 36 B.
  • on the other hand, the image of the object 50 A which is not the subject to be recognized may be picked up by the video camera 36 B, so that the image part corresponding to the object 50 A is present in the image B.
  • however, since the object 50 A is outside the range illuminated by the illuminator 32 B, the luminance of the image part corresponding to the object 50 A in the image B is very low.
  • step 152 whether or not the information inputting person 10 is present in the information input space can be determined by a very simple determination of, for example, whether or not the image part having a high luminance, and an area of a predetermined value or more, is present in the images A and B.
  • if a negative determination is made in step 152 , no processing is carried out and the instruction determination processing is completed.
  • if an affirmative determination is made in step 152 , the processing proceeds to step 154 . The processing from step 154 onward corresponds to the determining means of the present invention.
  • the image parts corresponding to the full-length image of the information inputting person 10 are extracted from the images A and B.
  • the image part corresponding to the full-length image of the information inputting person 10 can be also easily extracted by determining a continuous region which is composed of high-luminance pixels and has the area of a predetermined value or more.
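As a rough illustration of the determinations in steps 152 and 154, the high-luminance, large-area test and the extraction of the continuous high-luminance region can be written with standard image-processing primitives. The sketch below assumes 8-bit grayscale images; the luminance threshold and minimum area are hypothetical values chosen only for illustration.

```python
import numpy as np
from scipy import ndimage

LUMINANCE_THRESHOLD = 128  # hypothetical: pixels at least this bright count as high luminance
MIN_PERSON_AREA = 5000     # hypothetical: minimum pixel count for a person-sized region


def person_present(image):
    """Step 152 (sketch): a person is judged present if a high-luminance region
    of sufficient area exists in the picked-up image."""
    mask = image >= LUMINANCE_THRESHOLD
    labels, n = ndimage.label(mask)
    if n == 0:
        return False
    areas = ndimage.sum(mask, labels, index=range(1, n + 1))
    return bool(np.max(areas) >= MIN_PERSON_AREA)


def extract_person_mask(image):
    """Step 154 (sketch): the continuous high-luminance region with the largest
    area is taken as the image part corresponding to the full-length image."""
    mask = image >= LUMINANCE_THRESHOLD
    labels, n = ndimage.label(mask)
    if n == 0:
        return np.zeros_like(mask)
    areas = ndimage.sum(mask, labels, index=range(1, n + 1))
    return labels == (int(np.argmax(areas)) + 1)
```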
  • step 156 the height of the information inputting person 10 is calculated based on the image part corresponding to the full-length image of the information inputting person 10 .
  • f denotes a focal length of the imaging lens of the video camera positioned at a point O
  • H denotes the distance between an intersection point Q of a vertical line passing through the point O and the floor surface in the information input space and the point O
  • R denotes the distance between the point Q and a point P on the floor surface on which the information inputting person 10 is standing
  • a distance h between a point P′ corresponding to the top of the head of the information inputting person 10 and the point P is defined as the height of the information inputting person 10 .
  • θ denotes ∠POQ
  • θ′ denotes ∠P′OQ
  • h′ denotes the length of the image of the information inputting person formed on the receptor surface of the area sensor of the video camera
  • a point p denotes an imaging point on the receptor surface corresponding to the point P
  • a point p′ denotes the imaging point on the receptor surface corresponding to the point P′
  • r denotes the distance between a center o of the receptor surface and the point p
  • r′ denotes the distance between the center o of the receptor surface and the point p′
  • the angles θ, θ′ and the distances r, r′ can be determined by the following equations (1) through (4).
  • the height h of the information inputting person 10 and the distance R can be determined by the following equations (5) and (6).
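The equations themselves are not reproduced in this text. The following block is a reconstruction from the definitions above, assuming the equidistant projection r = fθ that is typical of the wide-angle imaging lens mentioned below; it should be read as a plausible rendering of equations (1) through (6) rather than a verbatim copy.

```latex
\theta  = \tan^{-1}\!\left(\frac{R}{H}\right)            \quad (1)\\
\theta' = \tan^{-1}\!\left(\frac{R}{H-h}\right)          \quad (2)\\
r  = f\,\theta                                           \quad (3)\\
r' = f\,\theta'                                          \quad (4)\\
h = H\left\{1-\frac{\tan\left(r/f\right)}{\tan\left(r'/f\right)}\right\}  \quad (5)\\
R = H\tan\!\left(\frac{r}{f}\right)                      \quad (6)
```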
  • step 156 the distances r and r′ are determined from either the image A or the image B picked up by the video cameras 36 A or 36 B, and these determined distances r and r′ are then substituted in the equation (5), whereby the height h of the information inputting person 10 can be found.
  • step 156 the distances r are found from the images A and B, and the determined distances r are then substituted in the equation (6) so that the distances R are found, whereby the position (two-dimensional coordinates) of the information inputting person 10 on the floor surface is determined.
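Under the same assumption, the calculation of step 156 reduces to a few lines; the following sketch takes r, r′, f and H in mutually consistent units and simply evaluates the reconstructed equations (5) and (6).

```python
import math


def height_and_floor_distance(r, r_prime, f, H):
    """Step 156 (sketch): recover the height h of the information inputting person
    and the floor distance R from the receptor-surface distances r (foot point p)
    and r_prime (head point p'), assuming the equidistant projection r = f * theta."""
    R = H * math.tan(r / f)                                    # equation (6)
    h = H * (1.0 - math.tan(r / f) / math.tan(r_prime / f))    # equation (5)
    return h, R
```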
  • next step 158 the three-dimensional coordinates (x 0 , y 0 , z 0 ) of a reference point P 0 of the information inputting person 10 are determined based on the height h of the information inputting person 10 and the position of the information inputting person 10 on the floor surface determined in step 156 .
  • the point (the point P 0 shown in FIG. 11) corresponding to the back of the information inputting person 10 or the like can be used as the reference point P 0 .
  • the height (for example, the value z 0 ) of the reference point P 0 , corresponding to the back of the information inputting person 10 , from the floor surface is calculated in accordance with the height h of the information inputting person 10 .
  • the position (plane coordinates) of the information inputting person 10 on the floor surface is set to the plane coordinate (for example, the values x 0 and y 0 ) of the reference point P 0 , whereby the three-dimensional coordinates of the reference point P 0 can be determined.
  • step 159 whether or not the information inputting person 10 makes the pointing motion (the motion of pointing toward the display 12 with a finger or the like) is determined based on the shapes of the image parts corresponding to the full-length images of the information inputting person 10 in the images A and B. Since the direction of the display 12 seen from the information inputting person 10 is already known, the determination in step 159 can be accomplished by, for example, determining whether or not a portion projecting toward the display 12 , as seen from the information inputting person 10 , is present at a height determinable as the position of the hand of the information inputting person 10 , in the image part corresponding to the full-length image of the information inputting person 10 .
  • step 159 thus determines whether or not the information inputting person 10 is making a pointing motion. If a negative determination is made in step 159 , no processing is performed and the instruction determination processing is completed. On the other hand, if an affirmative determination is made in step 159 , the processing proceeds to step 160 .
  • step 160 a feature point P X of the information inputting person 10 in the image A is extracted on the basis of the image data indicating the image A captured from the video camera 36 A, and the position (X A , Y A ) of the feature point P X on the image A is calculated.
  • the point corresponding to the fingertip pointing to the display 12 or the like can be used as the feature point P X of the information inputting person 10 .
  • this calculation can be accomplished by defining, as the position of the feature point P X , the position of the tip of the portion which projects toward the display 12 and is positioned at a height determinable as the position of the hand of the information inputting person 10 , in the image part indicating the full-length image of the information inputting person 10 .
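One simple way to realize the tip search of step 160 is sketched below. It assumes, purely for illustration, that the display lies toward the smaller column indices of the image and that the rows judged to be at hand height have already been estimated from the height h; both assumptions are hypothetical simplifications.

```python
import numpy as np


def find_feature_point(person_mask, hand_rows):
    """Step 160 (sketch): within the rows judged to be at hand height, take the
    pixel of the person region that projects farthest toward the display
    (assumed here to lie toward smaller column indices) as the feature point."""
    band = np.zeros_like(person_mask, dtype=bool)
    band[hand_rows, :] = person_mask[hand_rows, :]
    rows, cols = np.nonzero(band)
    if cols.size == 0:
        return None                    # no portion projecting toward the display
    i = int(np.argmin(cols))           # farthest toward the display side
    return cols[i], rows[i]            # (X_A, Y_A) on the image
```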
  • step 162 all the lattice points whose positions on the image A are within the range (a range R shown in FIG. 10B) of (X A ⁇ dX, Y A ⁇ dY) are searched based on the lattice point position information of the video camera 36 A stored in the memory 24 .
  • the sizes of dX and dY are defined on the basis of the space between the lattice points (the space between the marks 40 A) so that at least one lattice point or more may be extracted.
  • a wide-angle lens is used as the imaging lens of the video camera.
  • if dX and dY are constant, the longer the distance between the video camera and the lattice points becomes, the more lattice points fall within the range of (X A ±dX, Y A ±dY), thereby resulting in a deterioration of the accuracy of calculating the three-dimensional coordinates of the feature point P X as described below.
  • dX and dY are therefore set so that their values are reduced as the distance from the video camera becomes longer on the three-dimensional coordinates.
  • the range corresponding to (X A ⁇ dX, Y A ⁇ dY) on the three-dimensional coordinate is shaped into a quadrangular pyramid whose bottom surface is positioned on the side of the video camera.
  • the virtual points positioned within a predetermined range including the feature point on the image are extracted.
  • step 164 in the same manner as the previous step 160 , the feature point P X of the information inputting person 10 in the image B is extracted on the basis of the image data indicating the image B, captured from the video camera 36 B, and the position (X B , Y B ) of the feature point P X on the image B is calculated.
  • step 166 in the same manner as the previous step 162 , all the lattice points whose positions on the image B are within the range of (X B ⁇ dX, Y B ⁇ dY) are searched on the basis of the lattice point position information of the video camera 36 B stored in the memory 24 .
  • the virtual points positioned within a predetermined range including the feature point on the image are also extracted.
  • next step 168 the common extracted lattice points are determined on the basis of the lattice points extracted from the images A and B as described above. By this determination, only the lattice points positioned adjacent to the feature point P X in the information input space are extracted.
  • step 170 the three-dimensional coordinates of the common lattice points extracted from the images A and B are captured from the lattice point position information.
  • the three-dimensional coordinates of the feature point P X are calculated by an interpolation from the three-dimensional coordinates of plural lattice points in the position adjacent to the feature point in the information input space, (more specifically, a coordinate value of the three-dimensional coordinates of the feature point is found by a weighted average of the coordinate values of the three-dimensional coordinates of plural lattice points).
  • a rate of interpolation from the three-dimensional coordinates of the common lattice points extracted from the images A and B (a weight given to the coordinate values of the three-dimensional coordinates of the lattice points) is determined based on the positions on the images A and B of the common lattice points extracted from the images A and B, the position (X A , Y A ) of the feature point P X on the image A, and the position (X B , Y B ) of the feature point P X on the image B.
  • this rate of interpolation can be determined so that the weight of the coordinate values of the three-dimensional coordinates of the lattice points in the positions adjacent to the feature points on the images A and B may be increased.
  • step 174 the three-dimensional coordinates (X X , Y X , Z X ) of the feature point P X are calculated on the basis of the three-dimensional coordinates of the common lattice points extracted from the images A and B and the rate of interpolation determined in step 172 .
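Steps 162 through 174 amount to looking up the calibrated lattice points that fall near the feature point in both images and averaging their three-dimensional coordinates with weights that grow as the lattice point approaches the feature point. The sketch below assumes the lattice point position information is held as tuples of (three-dimensional coordinates, position on image A, position on image B); the inverse-distance weighting is one reasonable choice, not necessarily the one used in the patent.

```python
import numpy as np


def feature_point_3d(lattice_info, p_a, p_b, d_x, d_y, eps=1e-6):
    """lattice_info: iterable of (xyz, pos_a, pos_b) built during the lattice point
    position information initialization; p_a, p_b: feature point positions on
    images A and B.  Returns the interpolated 3-D coordinates of the feature point."""
    p_a, p_b = np.asarray(p_a, float), np.asarray(p_b, float)
    coords, weights = [], []
    for xyz, pos_a, pos_b in lattice_info:
        pos_a, pos_b = np.asarray(pos_a, float), np.asarray(pos_b, float)
        # steps 162/166/168: keep lattice points inside (X +/- dX, Y +/- dY) on BOTH images
        if np.all(np.abs(pos_a - p_a) <= (d_x, d_y)) and np.all(np.abs(pos_b - p_b) <= (d_x, d_y)):
            # step 172: larger weight for lattice points closer to the feature point on the images
            d = np.linalg.norm(pos_a - p_a) + np.linalg.norm(pos_b - p_b)
            coords.append(np.asarray(xyz, float))
            weights.append(1.0 / (d + eps))
    if not coords:
        return None                    # no common lattice point found
    w = np.asarray(weights)
    return (np.asarray(coords) * w[:, None]).sum(axis=0) / w.sum()   # step 174
```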
  • step 176 based on the three-dimensional coordinates of the reference point P 0 of the information inputting person calculated in the previous step 158 , and the three-dimensional coordinates of the feature point P X calculated in step 174 , the direction of an extended virtual line (see virtual line 54 in FIG. 11) connecting the reference point and the feature point is determined as the direction pointed to by the information inputting person 10 , and the coordinates (plane coordinate) of the intersection point (see point S in FIG. 11) of the plane, including the display surface of the large-screen display 12 , and the virtual line are calculated in order to determine the position pointed to by the information inputting person 10 .
  • step 178 whether or not the information inputting person 10 is pointing to the display surface of the large-screen display 12 is determined based on the coordinates determined in step 176 . If a negative determination is made, a monitor flag (the flag for monitoring the click motion) is set at 0 in step 180 so as to thereby complete the instruction determination processing. On the other hand, if an affirmative determination is made in step 178 , the coordinates indicating the position pointed to by the information inputting person 10 calculated in step 176 are output to the information processor 14 in step 182 . Thus, the information processor 14 performs a processing such as displaying a cursor at a predetermined position, which is judged to be the position pointed to by the information inputting person 10 , on the display surface of the display 12 .
  • from step 184 onward, whether or not the information inputting person 10 makes the click motion is determined.
  • the click motion is defined as any motion of the hand of the information inputting person (for example, bending and turning a wrist, bending and extending a finger or the like).
  • step 184 the image part corresponding to the hand of the information inputting person 10 in the image A is extracted so that the area of the corresponding image part is calculated, and the image part corresponding to the hand of the information inputting person 10 in the image B is also extracted so that the area of the corresponding image part is calculated.
  • next step 186 whether or not the monitor flag is 1 is determined. Since a negative determination in step 186 indicates that the information inputting person 10 has not pointed to the display surface of the display 12 during the previous instruction determination processing, the monitor flag is set at 1 in step 188 .
  • the next step 190 the area of the image part corresponding to the hand of the information inputting person 10 calculated in step 184 is stored in the RAM 22 C in order to later determine the click motion, and the instruction determination processing is completed.
  • step 192 the area calculated in step 184 is compared to the area stored in the RAM 22 C or the like (the area which is calculated when the information inputting person 10 starts pointing at the display surface of the display 12 , namely, the time when the monitor flag was set at 1 in step 188 ), whereby, whether or not the area of the image part corresponding to the hand of the information inputting person 10 is changed beyond a predetermined value, is determined.
  • a negative determination in step 192 indicates that the information inputting person 10 has not made the click motion, so that the instruction determination processing is completed without any processing.
  • step 192 When the information inputting person 10 bends or turns the wrist (for example, changes from the attitude shown in FIG. 12B into the attitude shown in FIG. 12C or vice versa) or he/she bends or extends a finger, the areas of the image parts corresponding to the hand of the information inputting person 10 in the images A and B are changed beyond a predetermined value, whereby an affirmative determination is made in step 192 .
  • when an affirmative determination is made in step 192 , the information indicating “click detected” is output to the information processor 14 in step 194 .
  • the monitor flag is set at 0 and the instruction determination processing is then completed.
  • the information processor 14 determines that a predetermined position on the display surface of the display 12 , pointed to by the information inputting person 10 , (the position corresponding to the coordinates input in step 182 ) is clicked. Then, the information processor 14 performs the processing in response to the information displayed at a predetermined position on the display surface of the display 12 .
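The area-based click test of steps 184 through 194 compares the current area of the hand image part with the area stored when the pointing began. A minimal sketch follows; the relative change threshold is a hypothetical value, standing in for the “predetermined value” of step 192.

```python
AREA_CHANGE_RATIO = 0.2  # hypothetical: a 20% change in hand area counts as a click


class AreaClickMonitor:
    """Keeps the monitor flag and the stored hand area (steps 184-194, sketch)."""

    def __init__(self):
        self.monitor_flag = 0
        self.stored_area = None

    def update(self, hand_area):
        """Returns True when a click motion is judged to have occurred."""
        if self.monitor_flag == 0:          # pointing has just started (steps 186-190)
            self.monitor_flag = 1
            self.stored_area = hand_area
            return False
        # step 192: has the hand area changed beyond the predetermined value?
        if abs(hand_area - self.stored_area) > AREA_CHANGE_RATIO * self.stored_area:
            self.monitor_flag = 0           # click detected; reset the flag as in step 196
            return True
        return False
```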
  • the controller 22 of the hand pointing input apparatus 20 repeats the above-described instruction determination processing at a predetermined time interval, whereby it is possible to determine, in real time, the position on the display surface of the display 12 pointed to by the information inputting person 10 and whether or not the click motion is detected.
  • various uses are possible as described below by combining the instruction determination processing with the processing executed by the information processor 14 .
  • the display 12 is installed on the wall surface in an underground shopping mall or the like, and a product advertisement or the like is displayed on the display 12 by the information processor 14 .
  • the hand pointing input apparatus 20 permits an interactive communication with a user, for example, a picture may be displayed describing a particular product in detail, in response to the instruction of the user (the information inputting person). Furthermore, if the user possesses a pre-paid card, the user can buy the product by paying with this card.
  • the display 12 is installed in an entrance of a building, and an information map giving a guide to the building or the like is displayed on the display 12 by the information processor 14 .
  • the hand pointing input apparatus 20 permits interactive communication with the user, for example, a picture may be displayed describing in detail the place in the building which the user intends to visit, or a route to the place the user intends to visit may be shown in response to the instruction of the user (the information inputting person).
  • the display 12 may be arranged outside the clean room so as to be visible from inside the clean room, and the contents of the operating and other manuals are displayed on the display 12 in response to the instruction from the operator in the clean room determined by the hand pointing input apparatus 20 , whereby interactive communication between the inside and the outside of the clean room is possible, so that operating efficiency in the clean room is improved.
  • the large-screen display 12 , the hand pointing input apparatus 20 , and the information processor 14 may be operated as a game machine in an amusement park.
  • for example, an explanation may be displayed on the display 12 , and the player points at an optional position on the display surface of the display 12 .
  • the image pickup range of the video camera 36 A is adjusted so that the range on the floor surface illuminated by the illuminator 32 A may be out of the image pickup range of the video camera 36 A
  • the image pickup range of the video camera 36 B is adjusted so that the range on the floor surface illuminated by the illuminator 32 B may be out of the image pickup range of the video camera 36 B.
  • the image pickup is performed by the video camera 36 A when the illuminator 32 A alone is switched on, while the image pickup is performed by the video camera 36 B when the illuminator 32 B alone is switched on.
  • the present invention is not limited to this example. Even if the range on the floor surface illuminated by the illuminator 32 is within the image pickup range of the video camera, it is possible to pick up images from which the image parts corresponding to the information inputting person 10 are easily extracted.
  • the image pickup range of a video camera 36 includes the range on the floor surface illuminated by the illuminator 32 A, and the range on the floor surface illuminated by the illuminator 32 B.
  • the object 50 A, which is not the subject to be recognized on the floor surface illuminated by the illuminator 32 A, and the object 50 B, which is not the subject to be recognized on the floor surface illuminated by the illuminator 32 B, are picked up by the video camera 36 .
  • the illumination control processing shown in FIG. 14 may be performed.
  • step 250 the illuminator 32 A is switched on and the illuminator 32 B is switched off. Then, in step 252 , an image of information input space is picked up by the video camera 36 .
  • step 254 the image data output from the video camera 36 (the image indicated by the image data is referred to as a first image) is captured and stored in the RAM 22 C.
  • step 256 whether or not a predetermined time T passes after the illuminator 32 A is switched on is determined. Until a predetermined time T passes, the processing is not performed. If an affirmative determination is made in step 256 , the processing proceeds to step 258 .
  • step 258 the illuminator 32 B is switched on, and the illuminator 32 A is switched off after a predetermined time t 0 passes after the illuminator 32 B is switched on (where it should be noted that t 0 < T: see FIG. 15).
  • step 260 an image of the information input space is picked up by the video camera 36 .
  • step 262 the image data output from the video camera 36 (the image indicated by the image data is referred to as a second image) is captured.
  • step 264 the lower luminance value of the luminance values of a certain pixel in the first and second images is selected based on the image data indicating the first image stored in the RAM 22 C in step 254 , and the image data indicating the second image captured in step 262 .
  • the selected luminance value is used as the luminance value of the pixel. This processing is performed for all the pixels, whereby new image data is generated and the generated image data is output.
  • by this processing, it is possible to obtain the image in which only the image part corresponding to the information inputting person 10 has high luminance, namely, the image from which the image part corresponding to the information inputting person 10 is easily extracted (or the image data indicating this image).
  • step 266 whether or not a predetermined time T passes after the illuminator 32 B is switched on is determined. Until the predetermined time T passes, the processing is not performed. If an affirmative determination is made in step 266 , the processing proceeds to step 268 . In step 268 , the illuminator 32 A is switched on, and the illuminator 32 B is switched off after a predetermined time t 0 passes after the illuminator 32 A is switched on. Then, the processing is returned to step 252 .
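With the image data held as arrays, the per-pixel selection of the lower luminance value in step 264 is a single operation. The sketch below assumes two grayscale frames from the single video camera 36, one picked up under each illuminator.

```python
import numpy as np


def suppress_unrecognized_objects(first_image, second_image):
    """Step 264 (sketch): for every pixel, keep the lower of the two luminance
    values picked up under illuminator 32A alone and under illuminator 32B alone.
    Objects lit by only one illuminator drop out, while the information inputting
    person, lit in both frames, keeps a high luminance."""
    return np.minimum(first_image, second_image)
```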
  • FIG. 13 For a simple description, a single video camera 36 alone is shown in FIG. 13, and the processing alone for a single video camera 36 is shown in FIG. 14. However, even if a plurality of video cameras 36 for picking up the information input space from different directions are provided, the above-described processing is performed for each video camera 36 , whereby it is possible to obtain the images from which the image parts corresponding to the information inputting person 10 are easily extracted.
  • the image data is captured in synchronization with the switch-on/off timing of the illuminators 32 A and 32 B, only during the time period when either the illuminator 32 A or 32 B is switched on.
  • alternatively, the image data may be captured at a period equal to the predetermined time T divided by an integer (see FIGS. 14 and 15), whereby the processing in step 264 may be performed at a period of 2×T.
  • instead of selecting the lower luminance value of each pixel as in the previous step 264 , for example, the overlap period t 0 may intervene between cycles while the illuminators 32 A and 32 B are alternately switched on in fixed cycles (whereby the ratio of the switch-on time of each illuminator 32 A and 32 B is 50+a%, where a corresponds to the overlap period t 0 ).
  • average luminance in one switch-on cycle of the illuminators 32 A and 32 B may be used as the luminance of each pixel.
  • the direct-current component alone of the change in the luminance is extracted by a low-pass filter, a fast Fourier transform, or the like, whereby the luminance value corresponding to the extracted direct-current component of the luminance change may be used as the luminance value of each pixel.
  • thereby, a relatively high luminance value is used as the luminance value of the pixels corresponding to the information inputting person 10 , who is always illuminated by the illuminator 32 A or 32 B during one switch-on cycle of the illuminators 32 A and 32 B. It is thus possible to obtain an image from which the image part corresponding to the information inputting person 10 is easily extracted.
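The averaging alternative described above can be sketched just as briefly for a stack of frames covering one switch-on cycle; the mean over the cycle approximates the direct-current component of each pixel's luminance change.

```python
import numpy as np


def cycle_average_luminance(frames):
    """frames: array of shape (n_frames, height, width) captured over one
    switch-on cycle of the illuminators 32A and 32B.  Returns the per-pixel
    average luminance, which approximates the DC component of the change."""
    return np.mean(np.asarray(frames, dtype=float), axis=0)
```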
  • a slope platform 58 may be arranged on the floor surface in the information input space.
  • the slope platform 58 includes an inclined surface 58 A which is formed so that it may surround the information inputting person 10 who enters the information input space.
  • the slope platform 58 prevents the information inputting person 10 from putting the luggage or the like near himself/herself, so that the luggage or the like is put apart from the information inputting person 10 , namely, out of the image pickup range of the video camera 36 .
  • a fan or the like for generating an air flow may be provided around the information inputting person 10 so that the object which is not the subject to be recognized may be blown away by the air flow.
  • a storage tank for storing water or the like may be also arranged around the information inputting person 10 .
  • the storage tank may be circular in shape so that the water or the like may circulate through the storage tank. With a construction such as this, it is also possible to prevent an object which is not the subject to be recognized from remaining around the information inputting person 10 .
  • the lattice point position information is set by the use of the mark plate 40 composed of many marks 40 A which are recorded so that they may be equally spaced in a matrix shape on the transparent flat plate
  • the present invention is not limited to this example.
  • a mark plate 62 in which markers composed of many light emitting devices 62 A such as LED are arranged in a matrix shape on the transparent flat plate, may be used.
  • one light emitting device 62 A at a time is sequentially switched on. Whenever each light emitting device 62 A is switched on, the three-dimensional coordinates of the switched-on light emitting device 62 A are calculated. An image of the information input space is picked up by the video cameras 36 A and 36 B. The position of the light emitting device 62 A on the images A and B is calculated. The three-dimensional coordinates of the light emitting device 62 A are made to correspond to the position of the light emitting device 62 A on the images A and B. This correspondence is stored in the memory 24 as the lattice point position information. After all the light emitting devices 62 A on the mark plate 62 are switched on, the mark plate 62 is moved by a fixed amount by the mark plate driving unit 38 . The above processing has only to be repeated.
  • the mark plate 40 and the mark plate 62 can be replaced by a robot arm unit 66 capable of moving a hand 66 B mounted on the end of an arm 66 A to an optional position in the information input space in which the marker composed of a light emitting device 68 is attached to the hand 66 B.
  • the light emitting device 68 is switched on, and the light emitting device 68 is moved to the positions corresponding to many lattice points constantly spaced in the lattice arrangement in the information input space. Whenever the light emitting device 68 is positioned in each position, the three-dimensional coordinates of the light emitting device 68 are calculated.
  • the image of the information input space is picked up by the video cameras 36 A and 36 B.
  • the position of the light emitting device 68 on the images A and B is calculated.
  • the three-dimensional coordinates of the light emitting device 68 are allowed to correspond to the position of the light emitting device 68 on the images A and B. This correspondence has only to be stored in the memory 24 as the lattice point position information.
  • the markers are manually positioned in the positions corresponding to the multiplicity of lattice points by the operator and an image of this situation is picked up, whereby the lattice point position information initialization alone may be automatically performed.
  • the mark plate shown in FIGS. 17 and 18 can be also applied to the use of at least one video camera and a plurality of illuminators as shown in FIG. 13.
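Whichever marker is used (the mark plate 40, the light emitting devices 62A, or the light emitting device 68 on the robot arm unit 66), the lattice point position information reduces to a table of correspondences between three-dimensional coordinates and image positions. A minimal sketch of building it is shown below; `measure_marker_3d` and `locate_marker_on_image` are hypothetical helpers standing in for the marker positioning and the image processing described above.

```python
def build_lattice_point_info(marker_positions, cam_a, cam_b,
                             measure_marker_3d, locate_marker_on_image):
    """For each lattice point position, record (3-D coordinates, position on image A,
    position on image B); the resulting list plays the role of the lattice point
    position information stored in the memory 24."""
    info = []
    for pos in marker_positions:
        xyz = measure_marker_3d(pos)           # 3-D coordinates of the switched-on marker
        image_a = cam_a.pick_up()              # hypothetical camera interface
        image_b = cam_b.pick_up()
        info.append((xyz,
                     locate_marker_on_image(image_a),
                     locate_marker_on_image(image_b)))
    return info
```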
  • the instruction determination processing shown in FIGS. 8A and 8B may be replaced by the instruction determination processing shown in FIG. 19.
  • the image data output from the video cameras 36 A and 36 B is captured in step 230 , and whether or not the information inputting person 10 is present in the information input space is then determined on the basis of the captured image data in next step 232 .
  • step 280 whether or not an arrival flag (the flag for indicating that the information inputting person 10 has arrived at the information input space) is 1 is determined. Since the initial value of the arrival flag is 0, the negative determination is first made in step 280 , so that the instruction determination processing is completed without any processing.
  • a predetermined attraction picture (the picture for attracting passersby near the information input space to the information input space) is displayed on the display 12 by the information processor 14 .
  • step 234 whether or not the arrival flag is 0 is determined. If the affirmative determination is made in step 234 , the processing proceeds to step 236 .
  • step 236 the information processor 14 is informed that the information inputting person has arrived at the information input space. Thus, the information processor 14 switches the picture displayed on the display 12 from the attraction picture to an initial picture (for example, for a product advertisement, this may be a picture indicating a product list or the like).
  • step 238 since the information inputting person has arrived at the information input space, the arrival flag is set at 1, and an instruction flag (the flag for indicating that the information inputting person 10 is pointing to the display surface of the display 12 ) and the monitor flag are set at 0; the processing then proceeds to step 240 .
  • step 234 When a negative determination is made in step 234 , namely, when the information inputting person remains in the information input space after the previous execution of the instruction determination processing, the processing proceeds to step 240 without any processing in steps 236 and 238 .
  • step 240 in the same manner as steps 154 through 158 of the flow chart of FIGS. 8A and 8B, the image parts corresponding to the full-length image of the information inputting person 10 are extracted from the images picked up by the video cameras 36 A and 36 B, and the height h and the position on the floor surface of the information inputting person 10 are calculated, whereby the three-dimensional coordinates of the reference point of the information inputting person 10 are determined.
  • step 242 in the same manner as step 159 of the flow chart of FIGS. 8A and 8B, whether or not the information inputting person 10 is making a pointing motion is determined. If a negative determination is made in step 242 , whether or not the instruction flag is 1 is determined in step 270 . If a negative determination is also made in step 270 , the instruction determination processing is completed.
  • step 244 in the same manner as steps 160 through 176 of the flow chart of FIGS. 8A and 8B, the three-dimensional coordinates of the feature point of the information inputting person 10 are calculated, and the position pointed to by the information inputting person 10 is then calculated.
  • step 246 whether or not the information inputting person 10 points to the display surface of the display 12 is determined. If a negative determination is made in step 246 , the processing proceeds to step 270 . On the other hand, if an affirmative determination is made in step 246 , the pointing flag is set at 1 in step 247 . Then, in step 248 , the coordinates of the position on the display surface of the display 12 pointed to by the information inputting person 10 is output to the information processor 14 and the coordinates are stored in the RAM 22 C or the like. Thus, the information processor 14 allows the cursor or the like to be displayed at the position on the display surface of the display 12 pointed to by the information inputting person 10 .
  • step 250 the image part corresponding to the hand of the information inputting person 10 in the image is extracted so that the area thereof is calculated (step 250 ), and whether or not the monitor flag is 1 is determined (step 252 ). If a negative determination is made in step 252 , the monitor flag is set at 1 (step 254 ). The previously calculated area of the image part corresponding to the hand of the information inputting person is stored in the memory (step 256 ), and the instruction determination processing is completed.
  • step 252 If an affirmative determination is made in step 252 , the area calculated in step 250 is compared to the area stored in the RAM 22 C or the like, whereby whether or not the area of the image part corresponding to the hand of the information inputting person 10 is changed beyond a predetermined value is determined (step 258 ). If a negative determination is made in step 258 , the determination that the information inputting person 10 is not making a click motion is made, so that the instruction determination processing is completed without any processing. On the other hand, if an affirmative determination is made in step 258 , the information indicating “click detected” is output to the information processor 14 (step 260 ), whereby the information processor 14 executes a predetermined processing such as replacing the picture displayed on the display 12 . Then, the monitor flag and the pointing flag are set at 0 (step 262 ), and the instruction determination processing is completed.
  • step 272 the coordinates of the position on the display surface of the display 12 pointed to by the information inputting person 10 , (calculated and stored in the RAM 22 C in step 248 ), are output to the information processor 14 .
  • the information processor 14 allows the cursor to remain displayed at the position where the cursor was displayed before the information inputting person 10 lowered the arm.
  • step 232 If the information inputting person 10 goes out of the information input space, a negative determination is made in step 232 even midway through a series of processing acts by the information processor 14 , so that the processing proceeds to step 280 . Since the arrival flag is set at 1 when the information inputting person 10 goes out of the information input space, an affirmative determination is made in step 280 . In step 282 , the information processor 14 is informed that the information inputting person 10 has gone out of the information input space. Thus, if the processing is midway through being executed, the information processor 14 stops the execution of the processing and switches the picture displayed on the display 12 to the attraction picture. In the next step 284 , the arrival flag is set at 0, and the instruction determination processing is completed.
  • the click motion is defined as any motion of the hand of the information inputting person (for example, bending and turning the wrist, bending and extending a finger or the like), the present invention is not limited to these examples.
  • a forward quick motion of the hand of the information inputting person 10 (see FIG. 22A, hereinafter referred to as a “forward click”) and a backward quick motion of the hand of the information inputting person 10 (see FIG. 22B, hereinafter referred to as a “backward click”) may be defined as the click motion.
  • the above-described click motion can be detected by, for example, the instruction determination processing shown in FIG. 20 instead of the instruction determination processing shown in FIGS. 8 and 19 .
  • step 310 in the same manner as step 152 of the flow chart of FIGS. 8A and 8B and step 232 of the flow chart of FIG. 19, whether or not the information inputting person 10 has arrived at (is present in) the information input space is determined.
  • This determination can also be accomplished by the very simple determination of, for example, whether or not an image part having a high luminance and an area of a predetermined value or more is present in the images A and B. If a negative determination is made in step 310 , the processing is delayed until an affirmative determination is made.
  • step 312 a click motion speed setting processing is executed.
  • step 290 the information processor 14 is given an instruction to display on the display 12 a message to request the information inputting person 10 to make the click motion.
  • the information processor 14 allows the message to be displayed on the display 12 .
  • the information inputting person 10 bends or extends the arm and repeats the forward click motion or backward click motion.
  • step 292 a reference point/feature point coordinates calculation processing (the same processing as in steps 154 through 176 of the flow chart of FIGS. 8A and 8B) is performed, whereby the three-dimensional coordinates of the reference point P 0 and the feature point P X are determined.
  • step 294 whether or not the information inputting person 10 makes a pointing motion to point to the display 12 is determined. If a negative determination is made in step 294 , the processing returns to step 292 . Steps 292 and 294 are repeated until the information inputting person 10 makes the pointing motion. If an affirmative determination is made in step 294 , the processing proceeds to step 296 .
  • step 296 a distance k between the reference point P 0 and the feature point P X is calculated from the three-dimensional coordinates of the reference point P 0 , and the three-dimensional coordinate of the feature point P X which are captured in step 292 .
  • since step 296 is repeated, during the second and later repetitions the rate of change of the distance k, that is, a velocity of change V (the moving speed of the feature point P X relative to the reference point P 0 ), is calculated based on the difference between the current value of the distance k and the previous value of the distance k. This calculation result is stored.
  • step 298 whether or not a predetermined time passes after the message requesting the click motion is displayed on the display 12 is determined. If a negative determination is made in step 298 , the processing is returned to step 292 , and steps 292 through 298 are repeated. Therefore, until the predetermined time passes after the message requesting the click motion is displayed, the calculation and storage of the velocity of change V of the distance k between the reference point P 0 and the feature point P X are repeated.
  • step 300 The previously calculated and stored velocity of change V is captured, and a click motion speed V 0 is set and stored as the threshold value, based on the transition of the velocity of change V during a single click motion of the information inputting person 10 .
  • This click motion speed V 0 is used as the threshold value for determining whether or not the information inputting person 10 is making the click motion in the processing described below.
  • a click motion speed V 0 can be set at, for example, a value which is slightly smaller than the average value of the velocity of change V during a single click motion of the information inputting person 10 .
  • the click motion speed V 0 may be set at a minimum value of the velocity of change V during a single click motion of the information inputting person 10 .
  • the moving speed (the velocity of change V) of the feature point P X varies depending on the information inputting person 10 .
  • the above-described click motion speed setting processing is executed every time an information inputting person 10 arrives at the information input space. Therefore, when a new information inputting person 10 arrives at the information input space, an appropriate new value is set as the click motion speed V 0 in response to the physique, muscular strength, or the like of the new information inputting person 10 .
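The click motion speed setting processing boils down to collecting samples of the velocity of change V while the requested click motions are made and deriving the threshold V0 from them. Both derivations mentioned above (slightly below the average, or the minimum) are sketched below; the reduction factor is a hypothetical choice.

```python
import numpy as np

SPEED_MARGIN = 0.9  # hypothetical: set V0 slightly below the average velocity of change


def set_click_motion_speed(velocity_samples, use_minimum=False):
    """velocity_samples: velocities of change V of the distance k collected while the
    information inputting person repeats the click motion (steps 292-298).
    Returns the click motion speed V0 used as the detection threshold (step 300)."""
    v = np.abs(np.asarray(velocity_samples, dtype=float))
    v = v[v > 0]                     # ignore samples where the hand was not moving
    if v.size == 0:
        raise ValueError("no click motion was observed")
    return float(v.min()) if use_minimum else SPEED_MARGIN * float(v.mean())
```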
  • step 314 the reference point/feature point coordinates calculation processing (the same processing as in steps 154 through 176 of the flow chart of FIGS. 8A and 8B) is performed, whereby the three-dimensional coordinates of the reference point P 0 and the feature point P X are determined.
  • step 316 whether or not the information inputting person 10 is making the pointing motion is determined based on the three-dimensional coordinates of the reference point P 0 and the feature point P X determined in step 314 .
  • step 334 whether or not the information inputting person 10 has left the information input space is determined. In the same manner as step 310 described above, this determination can also be accomplished by the very simple determination of, for example, whether or not the image part having a high luminance and an area of a predetermined value or more is absent from the images A and B. If a negative determination is made, the processing returns to step 314 . Steps 314 , 316 , and 334 are repeated until the information inputting person 10 makes the pointing motion.
  • step 316 If an affirmative determination is made in step 316 , the processing proceeds to step 318 .
  • step 318 based on the three-dimensional coordinates of the reference point P 0 and the feature point P X calculated in step 314 , in the same manner as step 176 of the flow chart of FIGS. 8A and 8B, in order to determine the position pointed to by the information inputting person 10 , the coordinates of the intersection point of the plane including the display surface of the large-screen display 12 and the virtual line connecting the reference point and the feature point are calculated.
  • step 320 whether or not the information inputting person 10 points to the display surface of the large-screen display 12 is determined based on the coordinate calculated in step 318 .
  • step 320 If a negative determination is made in step 320 , the processing proceeds to step 334 without any processing. On the other hand, if an affirmative determination is made in step 320 , in step 322 , the coordinates calculated in step 318 are output to the information processor 14 , whereby the information processor 14 is given the instruction to display the cursor. Thus, the information processor 14 performs the processing allowing the cursor to be displayed on a predetermined position, which is judged to be the position pointed to by the information inputting person 10 , on the display surface of the display 12 .
  • step 324 the distance k between the reference point P 0 and the feature point P X is calculated based on the three-dimensional coordinates of the reference point P 0 and the feature point P X and whether or not the distance k is changed is determined.
  • Step 324 is repeated while the information inputting person 10 points to the display surface of the display 12 (while an affirmative determination is made in step 320 ). Since whether or not the distance k is changed cannot be determined when the distance k is calculated for the first time in step 324 , a negative determination is unconditionally made in step 324 .
  • if an affirmative determination is made in step 324 , the processing proceeds to step 326 .
  • step 326 the velocity of change V of the distance k is calculated, and whether or not the calculated velocity of change V is equal to or more than the threshold value (the click motion speed V 0 set by the click motion speed setting processing) is determined.
  • step 326 since the velocity of change V of the distance k cannot be determined when the distance k is calculated for the first time in step 324 , a negative determination is unconditionally made. If a negative determination is made in step 324 or 326 , the determination that the information inputting person 10 is not making a click motion is made, and the processing proceeds to step 334 without any processing.
  • on the other hand, if affirmative determinations are made in steps 324 and 326 , the determination that the information inputting person 10 is making a click motion is made.
  • step 328 the direction of the change in the distance k is determined, and the processing branches in response to the result of the determination.
  • when the distance k is increased (judged as the forward click motion), the processing proceeds to step 330 .
  • step 330 the information indicating that the forward click has been detected is output to the information processor 14 , and then the processing proceeds to step 334 .
  • when the distance k is decreased (judged as the backward click motion), the information indicating that the backward click has been detected is output to the information processor 14 in step 332 , and then the processing proceeds to step 334 .
  • the information processor 14 determines that the current position pointed to by the information inputting person 10 is clicked. If the forward click is detected, a first processing corresponding to the current position pointed to is performed. If the backward click is detected, a second processing (differing from the first processing) corresponding to the current position pointed to is performed. When the information inputting person 10 goes out of the information input space, an affirmative determination is made in step 334 , and the processing returns to step 310 .
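Putting steps 324 through 332 together, the detector watches the distance k between the reference point and the feature point, fires when the magnitude of its rate of change reaches the click motion speed V0, and uses the sign of the change to tell the forward click from the backward click. The sketch below processes one sample per instruction determination cycle of duration dt.

```python
class DistanceClickDetector:
    """Velocity-based click detection (steps 324-332), sketched.  v0 is the click
    motion speed set by the click motion speed setting processing."""

    def __init__(self, v0, dt):
        self.v0 = v0
        self.dt = dt
        self.prev_k = None

    def update(self, k):
        """k: current distance between the reference point P0 and the feature point PX.
        Returns 'forward', 'backward', or None."""
        if self.prev_k is None:                  # first sample: no change can be judged yet
            self.prev_k = k
            return None
        velocity = (k - self.prev_k) / self.dt   # velocity of change V of the distance k
        self.prev_k = k
        if abs(velocity) < self.v0:              # step 326: below the threshold, no click
            return None
        # step 328: branch on the direction of the change in the distance k
        return 'forward' if velocity > 0 else 'backward'
```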
  • since the click motion in the instruction determination processing is a very natural motion for pointing to and selecting a specific position on the display surface of the display 12 , the person to be recognized can make the click motion without feeling uncomfortable. Moreover, in the above description, since whether or not the click motion is performed, and whether the performed click motion is the forward click motion or the backward click motion, can be determined on the basis of the change in the distance k between the reference point and the feature point, the click motion can be detected in a short time. Since two types of click motion (the forward click motion and the backward click motion) are also defined as the click motion, the information inputting person can selectively execute the first processing and the second processing.
  • the value of the distance k before detecting the forward or backward click motion is previously stored as the value corresponding to the neutral position. Then, the detection of the click motion is stopped until the value of the distance k reaches the value corresponding to the neutral position after the forward or backward click motion is detected.
  • the cursor may remain displayed at the position on which the cursor was displayed before the arm was lowered.
  • the position pointed to by the information inputting person is calculated on the basis of the three-dimensional coordinates of the reference point and the feature point of the information inputting person
  • the present invention is not limited to this example.
  • an image part 72 corresponding to the full-length image of the information inputting person 10 is extracted from the image picked up by the video camera, and the height h and the position on the floor surface of the information inputting person 10 are calculated.
  • the full-length image of the information inputting person is converted into a dummy model 74 on the basis of various parameters including their height h.
  • Various motions of the information inputting person including the motion to point to the display surface of the display 12 may be recognized on the basis of this dummy model.
  • the subject to be pointed to by the information inputting person is not limited to the display.
  • the information inputting person may point to an optional direction or to an optional object positioned at an unfixed distance from the information inputting person.
  • in the instruction determination processing (for example, in step 176 of the flow chart of FIGS. 8A and 8B), the direction in which the virtual line connecting the reference point and the feature point of the information inputting person extends is determined, whereby the direction pointed to by the information inputting person can be determined.
  • when the information inputting person points to an optional object positioned at an unfixed distance from the information inputting person, the extending direction of the virtual line is determined in the previous step 176 , and the object at the end of the extended virtual line is then determined, whereby the object pointed to by the information inputting person can be determined.
  • the information inputting person may point to an optional direction in the following application.
  • the direction of emission of a spot light, and the directions of acoustic beams generated by a multiplicity of speakers in an array arrangement might be oriented to the direction pointed to by the operator (information inputting person).
  • the information inputting person may point to an optional object positioned at an unfixed distance from the information inputting person in the following application.
  • a crane and other machines might be operated in response to instructions from the operator (information inputting person).
  • the information inputting person might give various instructions to various devices in home automation.
  • the present invention is not limited to this example.
  • the image of the information input space may be picked up by more video cameras whereby the instruction from the information inputting person is determined.

Abstract

Over an information input space to which an information inputting person comes, a pair of near-infrared light illuminators are arranged in such a manner that the illumination ranges thereof are adjusted so as to illuminate the information inputting person from different directions. A pair of near-infrared-light-sensitive video cameras are also arranged in different positions so as to correspond to the illuminators. The image pickup range of the video cameras is adjusted so that it is out of the range on the floor surface illuminated by the corresponding illuminator, while the information inputting person is within the image pickup range. A controller allows one illuminator at a time to be switched on/off. An image of the information inputting person is picked up by the video camera corresponding to the switched-on illuminator. The information inputting person is extracted based on the images picked up by the video cameras, whereby the position or direction pointed to by the information inputting person is determined.

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of the Invention [0001]
  • The present invention relates to a hand pointing apparatus, and more specifically to a hand pointing apparatus for picking up a person to be recognized and for determining a position or a direction pointed to by the person to be recognized. [0002]
  • 2. Description of the Related Art [0003]
  • There has been heretofore known a hand pointing input apparatus which comprises a display for displaying predetermined information, an illuminating device for illuminating an information inputting person who comes to the display, and a plurality of image pickup devices for picking up the image of the approaching information inputting person from different directions, wherein a plurality of image pickup devices image pickup images of situations where the approaching information inputting person points with a finger or the like to an optional position on the display, the information inputting person is recognized in accordance with a plurality of images obtained by the image pickup, the position on the display pointed to by the information inputting person is determined, a cursor or the like is displayed on the position pointed to on the display, and the position on the display pointed to is recognized as being clicked at the time of detecting the fact that the information inputting person has performed a clicking action by raising a thumb, whereby a predetermined processing is performed (see, for example, Japanese Patent Application Laid-open (JP-A) Nos. 4-271423, 5-19957, 5-324181 or the like). [0004]
  • According to the above-described hand pointing input apparatus, since the information inputting person can give various instructions to an information processing apparatus and input various information to the information processing apparatus without touching an input device such as a keyboard or a mouse, it is possible to simplify the operation for using the information processing apparatus. [0005]
  • However, in an environment where the hand pointing input apparatus is actually operated, an object which is not a subject to be recognized, for example, the luggage of the information inputting person or trash, may exist around the information inputting person who is the subject to be recognized. The surroundings of the information inputting person are also illuminated by an illuminating light emitted from the illuminating device. Thus, if the above-described object which is not the subject to be recognized exists around the information inputting person, this object which is not the subject to be recognized is present as a high-luminance object in the images picked up by the image pickup device. Thus, there is a high possibility that an object which is not the subject to be recognized, is recognized as the information inputting person by mistake. [0006]
  • In order to avoid this wrong recognition of the information inputting person, it is necessary to improve the accuracy of the recognition of the information inputting person. For example, it is necessary to perform a complicated image processing such as the total recognition of the information inputting person by the use of a plurality of image features in addition to the luminance (for example, pattern matching or the like based on the subject is outline which is one of the image features). Therefore, since a heavy load is applied to the image processor for performing the image processing such as the recognition based on the picked-up images, this causes a long time to be taken until the instruction from the information inputting person can be determined. In order to reduce the time required for the determination of the instruction from the information inputting person, it is necessary to use an image processor with a higher processing speed. This causes the problem of the cost of the apparatus increasing. [0007]
  • Furthermore, a three-dimensional coordinate of a feature point has been heretofore determined by a calculation from the position of the feature point of the information inputting person on the picked-up image (for example, a tip of his/her forefinger or the like) so as to thereby determine the position on the display pointed to by the information inputting person. However, the calculation processing for determining the three-dimensional coordinate of the feature point is complicated. Due to this fact, a long time is required for the determination of the instruction from the information inputting person in the same manner as the above-described case. [0008]
  • Moreover, a motion raising the thumb has been heretofore predefined as representing a clicking action, and the motion of raising the thumb alone has been thus detected as the clicking. However, the degree of freedom of movement is low, which disadvantageously causes less ease-of-use. On the other hand, if motions other than the motion of raising the thumb are detected as the clicking, the processing to detect the clicking becomes complicated, causing a disadvantageously, long time to be taken before the clicking is detected. [0009]
  • SUMMARY OF THE INVENTION
  • The present invention was completed in consideration of the above facts. It is a first object of the present invention to provide a hand pointing apparatus having a simple construction and being capable of reducing the time required for the determination of an instruction from a person to be recognized. [0010]
  • It is a second object of the present invention to provide a hand pointing apparatus capable of improving the degree of freedom of the movement which the person to be recognized makes in order to give the instruction, without spending a long time in the determination of the instruction from the person to be recognized. [0011]
  • In order to achieve the above-described objects, a hand pointing apparatus according to a first aspect of the present invention comprises: illuminating means for illuminating a person to be recognized; a plurality of image pickup means located in different positions, wherein the image pickup range of each is adjusted so that the person to be recognized, who is illuminated by the above-described illuminating means, may be within the image pickup range, and an illuminated range on a floor surface, which is illuminated by the above-described illuminating means, may be out of the image pickup range; and determining means for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, the situations being indicative of the person to be recognized pointing to either a specific position or a specific direction, and for determining either the position or the direction pointed to by the person to be recognized. [0012]
  • In the first aspect of the present invention, the person to be recognized may point to a specific position on, for example, the surface of a display screen or the like of a display, or may point to a specific direction (for example, the direction in which a specific object exists as seen from the person to be recognized). The determining means extracts the image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, where the situations are indicative of the person to be recognized pointing to either the specific position or the specific direction, and the determining means determines either the position or the direction pointed to by the person to be recognized. By calculating the three-dimensional coordinates of a feature point of the person to be recognized (a point whose position changes in response to the motion made by the person to be recognized to point to a specific position or a specific direction, for example, the tip of a predetermined part of the body, such as the hand or the finger, making the pointing motion, or the tip of a pointer held by the person to be recognized, or the like), the determination of the specific position or direction pointed to can be accomplished based on the position of the person to be recognized and the three-dimensional coordinates of the feature point. [0013]
  • In the first aspect of the present invention, the image pickup range of the plurality of image pickup means is adjusted so that the person to be recognized, who is illuminated by the illuminating means, may be within the image pickup range, and the illuminated range on the floor surface, which is illuminated by the illuminating means, may be out of the image pickup range. Thus, even if an object which is not a subject to be recognized, such as luggage or trash, exists on the floor surface around the person to be recognized while the person to be recognized is illuminated, the possibility that this object which is not the subject to be recognized comes within the image pickup range of the image pickup means is reduced. Furthermore, even if the object which is not the subject to be recognized comes within the image pickup range, the object is not illuminated by the illuminating means and its luminance is thus reduced. Thus, there is little possibility of the image part corresponding to the object which is not the subject to be recognized existing in the image picked up by the image pickup means. Even if the image part corresponding to the object which is not the subject to be recognized exists, the luminance of the image part is reduced. [0014]
  • Thus, in an extraction of the image part corresponding to the person to be recognized by the determining means, it is possible to extract the image part corresponding to the person to be recognized in a short time by a simple processing without a complicated image processing. Therefore, it is possible to reduce the time required for the determination of the instruction from the person to be recognized without the use of an image processor or the like having a high processing speed and a complicated construction. [0015]
  • As described above, according to the first aspect of the present invention, the image pickup range of a plurality of image pickup means is adjusted so that the person to be recognized, who is illuminated by the illuminating means, may be within the image pickup range, and the illuminated range on the floor surface which is illuminated by the illuminating means, may be out of the image pickup range. Thus, an effect is obtained in which it is possible to provide a hand pointing apparatus of a simple construction whereby the time required for the determination of the instruction from the person to be recognized is reduced. [0016]
  • A hand pointing apparatus according to a second aspect of the present invention comprises: a plurality of illuminating means for illuminating a person to be recognized from different directions; a plurality of image pickup means, located in different positions corresponding to each of the plurality of illuminating means, wherein an image pickup range is adjusted so that the person to be recognized, who is illuminated by the corresponding illuminating means, may be within the image pickup range, and the illuminated range on a floor surface, which is illuminated by the corresponding illuminating means, may be out of the image pickup range; controlling means for switching on/off the plurality of illuminating means one by one in sequence, and for controlling so that images of the person to be recognized pointing to either a specific position or a specific direction are picked up by the image pickup means corresponding to the switched-on illuminating means; and determining means for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images picked up by the plurality of image pickup means, and for determining either the position or the direction pointed to by the person to be recognized. [0017]
  • The second aspect of the present invention is provided with a plurality of illuminating means for illuminating the person to be recognized from different directions. The plurality of image pickup means are located in different positions corresponding to a plurality of illuminating means. The image pickup range of the plurality of image pickup means is adjusted so that the person to be recognized, who is illuminated by the corresponding illuminating means, may be within the image pickup range, and the illuminated range on the floor surface, which is illuminated by the corresponding illuminating means, may be out of the image pickup range. Thus, as described in the first aspect of the present invention, even if an object which is not the subject to be recognized, such as luggage or trash, exists on the floor surface around the person to be recognized, the possibility that this object which is not the subject to be recognized comes within the image pickup range of the image pickup means is reduced. Even if this object comes within the image pickup range of the image pickup means, the luminance of the picked-up image is reduced. [0018]
  • The controlling means switches on/off a plurality of illuminating means one by one in sequence, and controls so that the images of the person to be recognized pointing to either a specific position or a specific direction are picked up by the image pickup means corresponding to the switched-on illuminating means, whereby the picked-up images are output from each of the image pickup means. Thus, even if an object which is not the subject to be recognized comes within the image pickup range, it appears in the picked-up image only at a low luminance. [0019]
  • The determining means extracts the image part corresponding to the person to be recognized from a plurality of images based on a plurality of images output by a plurality of image pickup means, and then it determines either the position or the direction indicated by the person to be recognized. Thus, in the same manner as the first aspect of the present invention, there is little possibility that the image part corresponding to the object which is not the subject to be recognized exists. Even if this image part exists, its luminance in the plurality of images is low. Thus, it is possible to extract the image part corresponding to the person to be recognized in a short time by a simple processing without performing complicated image processing. [0020]
  • Therefore, the effect is obtained in which it is possible to provide the hand pointing apparatus wherein the time required for the determination of the instruction from the person to be recognized is reduced, without using an image processor or the like having a high processing speed and a complicated construction. [0021]
  • A hand pointing apparatus according to a third aspect of the present invention comprises: a plurality of illuminating means for illuminating a person to be recognized from different directions; at least one image pickup means for picking up the image of the person to be recognized, who is illuminated by the illuminating means; discriminating means for switching on/off the plurality of illuminating means one by one in sequence, for comparing a plurality of images of the person to be recognized pointing to either a specific position or a specific direction picked up by the same image pickup means during the switching on of the plurality of illuminating means, and for discriminating between an image part corresponding to the person to be recognized and an image part other than the image part corresponding to the person to be recognized in the plurality of images for at least one image pickup means; and determining means for extracting the image part corresponding to the person to be recognized from the plurality of images picked up by the image pickup means based on a result of a discrimination by the discriminating means, and for determining either the position or the direction pointed to by the person to be recognized. [0022]
  • The discriminating means of the third aspect of the present invention switches on/off a plurality of illuminating means one by one in sequence, compares a plurality of images of the person to be recognized pointing to either a specific position or a specific direction picked up by the same image pickup means during the switching on of a plurality of illuminating means, and discriminates between the image part corresponding to the person to be recognized and the image part other than the image part corresponding to the person to be recognized in a plurality of images for at least one image pickup means. [0023]
  • Since a plurality of illuminating means illuminate the person to be recognized from different directions, the luminance is always high in the image part corresponding to the person to be recognized in a plurality of images picked up by the same image pickup means during the switching on of a plurality of illuminating means. In contrast, the luminance is considerably varied in the image parts corresponding to objects which are not the subject to be recognized, such as luggage and trash on the floor surface around the person to be recognized, depending on the direction of the illumination during the image pickup. Therefore, by a very simple processing to compare the luminance of the image parts in the images to each other over a plurality of images (for example, to compare average values or minimum values of the luminance in each image part), it is possible to discriminate between the image part corresponding to the person to be recognized and the image part other than the image part corresponding to the person to be recognized in a plurality of images. [0024]
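  • As a rough illustration of the luminance comparison described above (not part of the original disclosure), the following Python sketch assumes that each image is a 2-D luminance array and that candidate image parts have already been segmented into boolean pixel masks; the function name and the 8-bit threshold are placeholders.

```python
import numpy as np

def find_person_parts(images, candidate_masks, threshold=128):
    """Keep only the candidate image parts that stay bright under every illuminator.

    images          -- list of 2-D luminance arrays, one per switched-on illuminator
    candidate_masks -- list of boolean arrays (same shape as the images), one per
                       candidate image part
    threshold       -- assumed 8-bit luminance level above which a part counts
                       as illuminated

    The person to be recognized is lit from every direction, so the minimum of
    the per-image mean luminances of that part stays high; luggage or trash on
    the floor is unlit in at least one image, so its minimum falls below the
    threshold.
    """
    person_parts = []
    for mask in candidate_masks:
        mean_per_image = [float(img[mask].mean()) for img in images]
        if min(mean_per_image) >= threshold:
            person_parts.append(mask)
    return person_parts
```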
  • The determining means extracts the image part corresponding to the person to be recognized from the plurality of images picked up by the image pickup means based on the result of the discrimination by the discriminating means, and determines either the position or the direction pointed to by the person to be recognized. Therefore, it is possible to extract the image part corresponding to the person to be recognized in a short time by a simple processing without performing complicated image processing. It is also possible to reduce the time required for determining an instruction from the person to be recognized without the use of an image processor or the like having a high processing speed and a complicated construction. [0025]
  • A hand pointing apparatus according to a fourth aspect of the present invention comprises: illuminating means for illuminating a person to be recognized; a plurality of image pickup means for picking up the image of the person to be recognized, who is illuminated by the illuminating means from different directions; determining means for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, the situations being indicative of the person to be recognized pointing to either a specific position or a specific direction, and for determining either the position or the direction pointed to by the person to be recognized; and preventing means for preventing an object which is not the subject to be recognized from remaining on the floor surface around the person to be recognized. [0026]
  • The fourth aspect of the present invention is provided with the preventing means for preventing an object which is not the subject to be recognized from remaining on the floor surface around the person to be recognized. Since this prevents the object which is not the subject to be recognized from remaining around the person to be recognized, it is possible to prevent the image part corresponding to the object which is not the subject to be recognized from existing in the images picked up by the image pickup means. The determining means extracts the image part corresponding to the person to be recognized based on a plurality of images obtained by the image pickup means, and determines either the position or the direction pointed to by the person to be recognized. Thus, it is possible to extract the image part corresponding to the person to be recognized in a short time by a simple processing without performing complicated image processing. It is therefore possible to reduce the time required for determining an instruction from the person to be recognized without the use of an image processor or the like having a high processing speed and a complicated construction. [0027]
  • For example, an inclined surface (slope) formed on the floor surface around the person to be recognized can be used as the preventing means. Thus, even if a relatively large object which is not the subject to be recognized (for example, the luggage of the person to be recognized) is placed around the person to be recognized, the object which is not the subject to be recognized slides down on the inclined surface. Thus, it is possible to prevent an object which is not the subject to be recognized, such as the luggage of the person to be recognized, from being placed around the person to be recognized. [0028]
  • Air flow generating means such as a fan for generating an air flow around the person to be recognized may be also applied as the preventing means. Thus, since a relatively small object which is not the subject to be recognized (for example, small trash, dust or the like) is blown away by the generated air flow, it is possible to prevent the object which is not the subject to be recognized such as small trash from remaining around the person to be recognized. A storage tank for storing water or the like around the person to be recognized may be also arranged as the preventing means. Furthermore, this storage tank may be circular in shape so that the water or the like may circulate through the storage tank, whereby it may be used as the preventing means. [0029]
  • According to the fourth aspect of the present invention, since there is provided a preventing means for preventing an object which is not the subject to be recognized from remaining on the floor surface around the person to be recognized, the effect is obtained in which it is possible to provide a hand pointing apparatus of simple construction wherein the time required for the determination of an instruction from the person to be recognized is reduced. [0030]
  • A hand pointing apparatus according to a fifth aspect of the present invention comprises: illuminating means for illuminating a person to be recognized who arrives at a predetermined place; a plurality of image pickup means for picking up the image of the person to be recognized, who is illuminated by the illuminating means from different directions; storing means for storing information for corresponding the three-dimensional coordinates of a plurality of virtual points positioned near the predetermined place, to the positions of the plurality of virtual points on the plurality of images picked up by the plurality of image pickup means; and determining means: for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, the situations being indicative of the person to be recognized pointing to either a specific position or a specific direction; for determining the position of a feature point of the person to be recognized in each of the images; for determining the three-dimensional coordinate of the feature point based on the determined position of the feature point and the information stored in the storing means; and for determining either the position or the direction pointed to by the person to be recognized based on the determined three-dimensional coordinates of the feature point. [0031]
  • In the fifth aspect of the present invention, the storing means stores therein the information for corresponding the three-dimensional coordinates of a plurality of virtual points positioned near the predetermined place to the positions of the plurality of virtual points on the plurality of images picked up by the plurality of image pickup means. The determining means extracts the image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, where the situations are indicative of the person to be recognized pointing to either a specific position or a specific direction, and the determining means determines the position of the feature point of the person to be recognized in each image. Then, the determining means determines the three-dimensional coordinates of the feature point based on the determined position of the feature point and the information stored in the storing means, and determines either the position or the direction pointed to by the person to be recognized based on the determined three-dimensional coordinates of the feature point. [0032]
  • Thus, in the fifth aspect of the present invention, a correspondence between the three-dimensional coordinates of a plurality of virtual points positioned near the predetermined place, and the positions of the plurality of virtual points on the plurality of images picked up by the plurality of image pickup means is previously confirmed from the information stored in the storing means. The three-dimensional coordinates of the feature point of the person to be recognized are determined based on the information stored in the storing means. Thus, the three-dimensional coordinates of the feature point of the person to be recognized can be determined by a very simple processing. Therefore, it is possible to reduce the time required for the determination of an instruction from the person to be recognized without the use of an image processor or the like having a high processing speed and a complicated construction. [0033]
  • On the other hand, in the fifth aspect of the present invention, it is desirable that many virtual points are stored by corresponding the three-dimensional coordinates thereof to the positions thereof on the images in order to determine the three-dimensional coordinates of the feature point of the person to be recognized with a high level of accuracy. More preferably, the storing means stores the information for corresponding the three-dimensional coordinates of many virtual points uniformly spaced in a lattice arrangement near the predetermined place, to the positions of these many virtual points on the plurality of images picked up by the plurality of image pickup means. [0034]
  • In such a manner, many virtual points are uniformly spaced in the lattice arrangement, whereby, wherever the feature point is located near the predetermined place, some virtual point is positioned in proximity to the feature point. The three-dimensional coordinates of the feature point are determined based on the three-dimensional coordinates of the virtual points which are likely to exist in proximity to the feature point in the three-dimensional coordinate space, whereby the three-dimensional coordinates of the feature point can be determined with a high level of accuracy regardless of where the feature point is located. [0035]
  • When many virtual points are uniformly spaced in the lattice arrangement in the above-described manner, the three-dimensional coordinates of the feature point can be determined in the following manner, for example. [0036]
  • Namely, the determining means of the fifth aspect of the present invention can determine the position of the feature point of the person to be recognized in the images, extract from the images the virtual points positioned in a region within a predetermined range including the feature point on the images, and determine the three-dimensional coordinates of the feature point in accordance with the three-dimensional coordinates of the common virtual points extracted from the images. [0037]
  • Thus, the virtual points positioned in the region within a predetermined range including the feature point on the images are extracted from the images, whereby all the virtual points which are likely to exist in the region adjacent to the feature point in the three-dimensional coordinate space are extracted. The area of this region can be defined in accordance with the spacing between the virtual points. [0038]
  • Then, the determining means determines the three-dimensional coordinates of the feature point based on the three-dimensional coordinates of the common virtual points extracted from the images. The images picked up by the image pickup means show the situation within the image pickup range, namely, the subject projected on a plane. Therefore, even if a plurality of points, which are positioned as if they were superimposed when seen from the image pickup means, have different three-dimensional coordinates, the points are located in the same position when picked up on a two-dimensional image. On the other hand, since the common virtual points extracted from the images are present in the position adjacent to the feature point on the three-dimensional coordinates, the three-dimensional coordinates of the feature point are determined from the three-dimensional coordinates of the common extracted virtual points, whereby the three-dimensional coordinates of the feature point can be determined with a higher level of accuracy. [0039]
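  • The following Python sketch illustrates one way the common virtual points could be used to estimate the feature point's three-dimensional coordinates; the data layout (a dictionary per camera mapping lattice point identifiers to their 3-D coordinates and image positions), the search radius, and the averaging step are assumptions made only for illustration.

```python
import numpy as np

def feature_point_3d(feat_a, feat_b, lattice_a, lattice_b, radius=10.0):
    """Estimate the feature point's 3-D coordinates from its positions on two images.

    feat_a, feat_b       -- (X, Y) position of the feature point on image A / image B
    lattice_a, lattice_b -- dicts mapping a lattice point id to a pair
                            ((x, y, z), (X, Y)): the point's 3-D coordinates and
                            its position on the respective image
    radius               -- half-size in pixels of the search region around the
                            feature point, chosen in relation to the lattice spacing
    """
    def nearby(feat, lattice):
        fx, fy = feat
        return {pid for pid, (_, (X, Y)) in lattice.items()
                if abs(X - fx) <= radius and abs(Y - fy) <= radius}

    # Lattice points that lie near the feature point on both images are the ones
    # likely to be adjacent to it in three-dimensional space.
    common = nearby(feat_a, lattice_a) & nearby(feat_b, lattice_b)
    if not common:
        return None
    coords = np.array([lattice_a[pid][0] for pid in common])
    return coords.mean(axis=0)  # e.g. average the common points' coordinates
```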
  • When a positional relationship is exactly constant between a predetermined place at which the person to be recognized arrives and the image pickup means, the information to be stored in the storing means can be set permanently based on the result of an experimental measurement or the like of the three-dimensional coordinates of plural virtual points positioned near a predetermined place, and the positions of plural virtual points on the images picked up by the image pickup means. On the other hand, when there is a variation in the position between a predetermined place at which the person to be recognized arrives and the image pickup means, or when this positional relationship is considerably different in design depending on the individual hand pointing apparatuses, it is necessary to reset the information to be stored in the storing means. [0040]
  • From this point of view, the fifth aspect of the present invention can further comprise: generating means for allowing the plurality of image pickup means to pick up images of the situations where markers are positioned in the positions of the virtual points, for generating the information for corresponding the three-dimensional coordinates of the virtual points to the positions of the virtual points on the images based on the three-dimensional coordinates of the virtual points and the marker positions on the images picked up by the plurality of image pickup means, and for allowing the storing means to store the generated information. [0041]
  • Any marker will do as long as the marker is easy to identify on the images obtained by the image pickup. For example, a particular-color mark and a light-emission source such as LED can be used as the marker. The marker may be manually positioned in a predetermined position by a person. Alternatively, the marker may be automatically positioned by moving means for moving the marker to an optional position. When the marker is moved by the moving means, the three-dimensional coordinates of a predetermined position can be determined from the amount of movement of the marker caused by the moving means. [0042]
  • The generating means is provided in the above-mentioned manner, whereby the information for corresponding the three-dimensional coordinates of the virtual points to the positions of the virtual points on the images is automatically generated. Thus, even if there is variation in the position between a predetermined place at which the person to be recognized arrives and the image pickup means, or when this positional relationship is considerably different in design depending on the individual hand pointing apparatuses, it is possible to obtain automatically the information for corresponding the three-dimensional coordinates of the virtual points to the positions of the virtual points on the images with a high level of accuracy. [0043]
  • According to the fifth aspect of the present invention, the information for corresponding the three-dimensional coordinates of a plurality of virtual points positioned near a predetermined place at which the person to be recognized arrives, to the positions of a plurality of virtual points on a plurality of images picked up by a plurality of image pickup means is stored. The three-dimensional coordinates of the feature point are determined based on the position of the feature point on a plurality of images picked up by a plurality of image pickup means and the stored information. Thus, the effect is obtained in which it is possible to provide a hand pointing apparatus of simple construction wherein the time required for the determination of an instruction from the person to be recognized is reduced and the accuracy of instruction determination is excellent. [0044]
  • A hand pointing apparatus according to a sixth aspect of the present invention comprises: illuminating means for illuminating a person to be recognized; a plurality of image pickup means for picking up the image of the person to be recognized, who is illuminated by the illuminating means from different directions; determining means for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, the situations being indicative of the person to be recognized pointing to either a specific position or a specific direction, and for determining either the position or the direction pointed to by the person to be recognized; first detecting means for extracting the image part corresponding to a predetermined part of the body of the person to be recognized from the plurality of images, and for detecting a change in any one of either an area of the extracted image part, an outline of the extracted image part and a length of an outline of the extracted image part; and processing means for executing a predetermined processing when the change is detected by the first detecting means. [0045]
  • The sixth aspect of the present invention is provided with the first detecting means for extracting the image part corresponding to a predetermined part (for example, the hand, the arm or the like) of the body of the person to be recognized from the plurality of images, and for detecting a change in the area of the extracted image part, in the contour of the extracted image part, or in the length of the contour line of the extracted image part. The processing means executes a predetermined processing when a change is detected by the first detecting means. The area, the contour, and the length of the contour line of the image part can be relatively easily detected. Moreover, when the person to be recognized moves the predetermined part of the body, even if his/her motion is not a predefined motion, in almost all cases the area, the contour, and the length of the contour line of the image part corresponding to the predetermined part change. [0046]
  • Therefore, according to the sixth aspect of the present invention, since a change in the area, the contour, or the length of the contour line of the image part is used, it is possible to improve the degree of freedom of movement which the person to be recognized has in order to instruct the processing means to execute a predetermined processing. This movement can be also detected in a short time. Thus, the effect is obtained in which the instruction from the person to be recognized can be determined in a short time. [0047]
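  • As a hedged illustration of the detection described for the sixth aspect, the sketch below computes the area and an approximate contour length of a hand mask and reports a change when either varies by more than a tolerance between two frames; the neighbourhood-based contour approximation and the tolerance value are assumptions, not taken from the disclosure.

```python
import numpy as np

def hand_metrics(mask):
    """Return (area, contour_length) of a boolean hand mask.

    The area is the pixel count; the contour length is approximated by counting
    mask pixels that touch at least one non-mask pixel (4-neighbourhood).
    """
    area = int(mask.sum())
    padded = np.pad(mask, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    contour_length = int((mask & ~interior).sum())
    return area, contour_length

def change_detected(prev_mask, curr_mask, rel_tol=0.2):
    """Report a change when the area or contour length varies by more than rel_tol."""
    a0, c0 = hand_metrics(prev_mask)
    a1, c1 = hand_metrics(curr_mask)
    return (abs(a1 - a0) > rel_tol * max(a0, 1) or
            abs(c1 - c0) > rel_tol * max(c0, 1))
```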
  • On the other hand, when a person makes a movement to point to a specific position or a specific direction, even if the position or direction to be pointed to is changed, the fingertip or the like is generally merely moved along a virtual spherical surface centered in the vicinity of the shoulder joint, thereby resulting in little change in the distance between the fingertip or the like and the body, including the shoulder joint. [0048]
  • Thus, a hand pointing apparatus according to a seventh aspect of the present invention comprises: illuminating means for illuminating a person to be recognized; a plurality of image pickup means for picking up the image of the person to be recognized, who is illuminated by the illuminating means, from different directions; determining means for extracting an image part corresponding to the person to be recognized from a plurality of images based on a plurality of images of situations picked up by the plurality of image pickup means, the situations being indicative of the person to be recognized pointing to either a specific position or a specific direction, for determining the three-dimensional coordinates of the feature point whose position is changed when the person to be recognized bends or extends an arm and the three-dimensional coordinates of a reference point whose position is not changed even if the person to be recognized bends or extends an arm, and for determining either the position or the direction pointed to by the person to be recognized in accordance with the three-dimensional coordinates of the feature point and the three-dimensional coordinates of the reference point; and processing means for calculating the distance between the reference point and the feature point and for executing a predetermined processing based on the change in the distance between the reference point and the feature point. [0049]
  • The determining means according to the seventh aspect of the present invention extracts the image part corresponding to the person to be recognized from a plurality of images, determines the three-dimensional coordinates of the feature point whose position is changed when the person to be recognized bends or extends an arm and the three-dimensional coordinates of the reference point whose position is not changed even if the person to be recognized bends or extends an arm, and determines either the position or the direction pointed to by the person to be recognized based on the three-dimensional coordinates of the feature point and the three-dimensional coordinates of the reference point. The processing means calculates the distance between the reference point and the feature point, and executes a predetermined processing based on the change in the distance between the reference point and the feature point. For example, the tip of the hand, the finger or the like of the person to be recognized or the point corresponding to the tip or the like of a pointer held by the person to be recognized can be used as the feature point. For example, a point corresponding to the body (such as the chest and the shoulder joint) of the person to be recognized can be used as the reference point. [0050]
  • Thus, if the person to be recognized makes a motion to adjust the direction of the feature point with respect to the reference point so that the direction from the reference point toward the feature point may match the position or direction to be pointed to, the position or direction pointed to is determined by the determining means. If the person to be recognized makes a motion to bend or extend the arm, the distance between the reference point and the feature point is changed, so that a predetermined processing is performed based on this change in the distance. [0051]
  • Thus, in the seventh aspect of the present invention, since the position or direction pointed to by the person to be recognized is determined from the positional relationship between the reference point and the feature point, the direction in which the image pickup means picks up the image can be set so that the reference point and the feature point can be reliably detected without taking into account motions such as the raising and lowering of the finger. Furthermore, since whether or not the execution of a predetermined processing is instructed is determined on the basis of the change in the distance (relative position) between the reference point and the feature point, it is unnecessary to detect additional image features in order to determine whether or not the execution of a predetermined processing is being instructed. In addition, the distance between the reference point and the feature point scarcely changes even if a person makes a motion to point to a specific position or a specific direction. [0052]
  • Therefore, according to the seventh aspect of the present invention, it is possible to reliably detect the motion of the person to be recognized to instruct the execution of a predetermined processing (the motion to bend or extend the arm) in a short time. The instruction from the person to be recognized can thus be confirmed in a short time. [0053]
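  • As an illustrative sketch (not part of the disclosure), the position pointed to on a planar display could be obtained by intersecting the extension of the line from the reference point through the feature point with the display plane; the plane parameters are assumed to be known from the installation, and the function name is a placeholder.

```python
import numpy as np

def pointed_position(ref_pt, feat_pt, plane_point, plane_normal):
    """Intersect the ray from the reference point through the feature point
    with the display plane.

    ref_pt, feat_pt           -- 3-D coordinates of the reference point (e.g. on
                                 the body) and the feature point (e.g. the fingertip)
    plane_point, plane_normal -- any point on the display surface and its normal,
                                 assumed known from the installation
    Returns the 3-D intersection point, or None if the ray is parallel to the
    display or points away from it.
    """
    direction = feat_pt - ref_pt
    denom = float(np.dot(plane_normal, direction))
    if abs(denom) < 1e-9:
        return None
    t = float(np.dot(plane_normal, plane_point - ref_pt)) / denom
    if t < 0:
        return None  # pointing away from the display
    return ref_pt + t * direction
```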
  • The processing means can execute, as a predetermined processing, the processing associated with the position or direction pointed to by the person to be recognized, for example, when the distance between the reference point and the feature point is changed. Since the motion to bend or extend the arm is a very natural motion, if this motion is used to instruct the above-described execution of a predetermined processing, the person to be recognized can make the motion for instructing the execution of a predetermined processing without feeling a sense of discomfort. [0054]
  • Furthermore, the direction of the change in the distance between the reference point and the feature point due to the motion to bend or extend the arm is of two types (a direction of increase in the distance and a direction of reduction in the distance). Thus, when the distance between the reference point and the feature point is increased, a first predetermined processing may be carried out. When the distance between the reference point and the feature point is reduced, a second predetermined processing differing from the first predetermined processing may be carried out. [0055]
  • Thus, when the person to be recognized makes a motion to extend an arm (in this case, the distance between the reference point and the feature point is increased), the first predetermined processing is carried out. When the person to be recognized makes a motion to bend the arm (in this case, the distance between the reference point and the feature point is reduced), the second predetermined processing is carried out. It is therefore possible for the person to be recognized to select the processing to be executed from either the first predetermined processing or the second predetermined processing, similarly to the left and right clicks of a mouse. By making either the extending motion or the bending motion, the person to be recognized can reliably cause the selected one of the first and second predetermined processings to be executed. [0056]
  • For the determination of whether or not the execution of a predetermined processing is instructed on the basis of a change in the distance between the reference point and the feature point, more particularly, for example, the magnitude of the change in the distance between the reference point and the feature point is compared with a predetermined value, and if the change in the distance is equal to or greater than the predetermined value, it is possible to determine that the execution of a predetermined processing is instructed. However, if the distance between the reference point and the feature point is considerably changed by other motions made without any intention of instructing the execution of a predetermined processing, the instruction from the person to be recognized may be mistaken. [0057]
  • From this point of view, preferably, the processing means detects the rate of change in the distance between the reference point and the feature point, that is, the velocity of the change, and executes a predetermined processing when the detected velocity of the change is equal to or greater than a threshold value. [0058]
  • In the seventh aspect of the present invention, the velocity of the change in the distance between the reference point and the feature point is detected, and a predetermined processing is then executed only when the detected velocity of the change is equal to or greater than the threshold value. In such a manner, when the person to be recognized makes a specific motion to quickly bend or extend an arm, the velocity of the change in the distance between the reference point and the feature point reaches or exceeds the threshold value, so that a predetermined processing is executed. Thus, the rate of recognition of the motion by which the person to be recognized instructs the execution of a predetermined processing is improved. Only when the person to be recognized makes a motion for instructing the execution of a predetermined processing is this motion reliably detected, allowing a predetermined processing to be carried out. [0059]
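  • A minimal sketch of the click classification suggested above, assuming the three-dimensional coordinates of the reference point and feature point are already available; the function name, the sampling interval and the "forward"/"backward" labels are illustrative assumptions rather than elements of the disclosure.

```python
import numpy as np

def classify_click(ref_pt, feat_prev, feat_curr, dt, v_threshold):
    """Classify a click motion from the change in the reference-to-feature distance.

    ref_pt, feat_prev, feat_curr -- 3-D coordinates as arrays of shape (3,)
    dt          -- time in seconds between the two feature point measurements
    v_threshold -- minimum speed of change that counts as a deliberate arm
                   bend or extension

    Returns "forward" when the distance grows fast enough (arm extended),
    "backward" when it shrinks fast enough (arm bent), otherwise None.
    """
    d_prev = float(np.linalg.norm(feat_prev - ref_pt))
    d_curr = float(np.linalg.norm(feat_curr - ref_pt))
    velocity = (d_curr - d_prev) / dt
    if velocity >= v_threshold:
        return "forward"
    if velocity <= -v_threshold:
        return "backward"
    return None
```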
  • Moreover, as the physique, muscular strength, or the like varies depending on the person to be recognized, even if the person to be recognized makes a motion to quickly bend or extend an arm in order to allow the processing means to execute a predetermined processing, the velocity of the change in the distance between the reference point and the feature point varies from one person to be recognized to another. Therefore, in some cases, even if the person to be recognized makes a motion to quickly bend or extend an arm in order to instruct the processing means to execute a predetermined processing, this motion cannot be detected. Conversely, this motion is sometimes detected by mistake, although the person to be recognized has not made it. [0060]
  • Thus, preferably, the seventh aspect of the present invention further comprises threshold value setting means for requesting the person to be recognized to bend or extend the arm and for previously setting the threshold value based on the rate of the change in the distance between the reference point and the feature point when the person to be recognized bends or extends the arm. [0061]
  • In this manner, the threshold value as to whether or not the processing means executes a predetermined processing is previously set based on the rate of the change in the distance between the reference point and the feature point when the person to be recognized bends or extends an arm (quickly bends or extends an arm) in order to allow the processing means to execute a predetermined processing, whereby the threshold value can be obtained in response to the physique, muscular strength, or the like of the individual persons to be recognized. Whether or not the execution of a predetermined processing is instructed is determined by the use of this threshold value, whereby it is possible to reliably detect the motion of the person to be recognized to instruct the execution of a predetermined processing and to execute a predetermined processing, regardless of any variation in physique, muscular strength, or the like, depending on the individual person to be recognized. [0062]
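  • The threshold setting could, purely for illustration, be reduced to a sketch along the following lines, in which the threshold is derived as a fraction of the speeds measured while the person bends and extends an arm on request; the averaging and the fraction are assumed choices, not specified by the disclosure.

```python
def set_click_threshold(calibration_speeds, factor=0.5):
    """Derive a per-person click speed threshold from a calibration run.

    calibration_speeds -- absolute speeds |d(distance)/dt| recorded while the
                          person, on request, quickly bends and extends an arm
    factor             -- fraction of the typical calibration speed used as the
                          threshold (an assumed choice, not from the disclosure)
    """
    typical = sum(calibration_speeds) / len(calibration_speeds)
    return factor * typical
```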
  • Furthermore, the seventh aspect of the present invention further comprises second detecting means for extracting the image part corresponding to the arm of the person to be recognized from the plurality of images and for detecting whether or not the arm of the person to be recognized is lowered, wherein the processing means continues in its current state when the second detecting means detects that the arm of the person to be recognized is lowered. Namely, an execution state is continued when the processing is carried out, while a stop state is continued when the processing is stopped. Thus, since the person to be recognized does not need to keep raising the arm in order to continuously execute a certain processing, the task of the person to be recognized can be reduced. [0063]
  • According to the seventh aspect of the present invention, the position or direction pointed to by the person to be recognized is determined on the basis of the three-dimensional coordinates of the feature point whose position is changed when the person to be recognized bends or extends an arm and on the basis of the three-dimensional coordinates of the reference point whose position is not changed even if the person to be recognized bends and extends an arm, and a predetermined processing is also executed based on the change in the distance between the reference point and the feature point. Thus, the following effect is obtained. Namely, it is possible to reliably detect the motion of the person to be recognized to instruct the execution of a predetermined processing in a short time, and it is also possible to determine the instruction from the person to be recognized in a short time.[0064]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a perspective view showing surroundings of an information input space. [0065]
  • FIG. 2 is a block diagram showing a schematic constitution of a hand pointing input apparatus according to the present embodiment. [0066]
  • FIG. 3 schematically shows an example of a relationship between an illumination range of an illuminator and an image pickup range of a video camera. [0067]
  • FIG. 4 is a perspective view of the information input space showing an example of a mark plate. [0068]
  • FIG. 5 is a flow chart of an initialization processing of information about a lattice point position. [0069]
  • FIG. 6 is a flow chart of an illumination control processing. [0070]
  • FIG. 7 is a timing chart showing a timing of the switch-on/off of illuminators A, B by the illumination control processing of FIG. 6 and of an output (capture) of an image picked up by the video camera. [0071]
  • FIGS. 8A and 8B are a flow chart of an instruction determination processing. [0072]
  • FIG. 9 is a side view of the information input space for describing a calculation of the height of an information inputting person and the position of the information inputting person on a floor surface. [0073]
  • FIG. 10A is an image illustration showing an image of hand of the information inputting person picked up by the video camera. [0074]
  • FIG. 10B is a conceptual view of a search range for the lattice point for determining a coordinate of a feature point and three-dimensional coordinate of the feature point. [0075]
  • FIG. 11A is a plan view of the information input space for describing the determination of the position on a display pointed to by the information inputting person. [0076]
  • FIG. 11B is a side view of the information input space shown in FIG. 11A. [0077]
  • FIGS. 12A-12C are image illustrations showing an example of a motion of the information inputting person. [0078]
  • FIG. 13 schematically shows another example of the relationship between the illumination range of the illuminator and the image pickup range of the video camera. [0079]
  • FIG. 14 is a flow chart of the illumination control processing in an arrangement shown in FIG. 13. [0080]
  • FIG. 15 is a timing chart showing the timing of the switch-on/off of the illuminators A, B by the illumination control processing of FIG. 14. [0081]
  • FIG. 16 is a perspective view of an aspect of a slope platform arranged on the floor surface in the information input space. [0082]
  • FIG. 17 is a perspective view of the information input space showing another example of the mark plate. [0083]
  • FIG. 18 is a perspective view of the information input space showing an example of a movement of a marker position by a robot arm unit. [0084]
  • FIG. 19 is a flow chart of another example of the instruction determination processing. [0085]
  • FIG. 20 is a flow chart of a further example of the instruction determination processing. [0086]
  • FIG. 21 is a flow chart of the processing for setting the click motion speed. [0087]
  • FIG. 22A is an image illustration for describing a forward click motion. [0088]
  • FIG. 22B is an image illustration for describing a backward click motion. [0089]
  • FIG. 23 is an image illustration for describing a data conversion into a dummy model.[0090]
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • Embodiments of the present invention will be described below in detail with reference to the accompanying drawings. As shown in FIG. 1, a large-screen display 12 is built into a wall surface at a place at which an information inputting person 10, who is the person to be recognized of the present invention, arrives. Known display means such as a liquid crystal display (LCD), a plasma display, a cathode ray tube (CRT) and an optical fiber display can be applied as the display 12. [0091]
  • The display 12 is connected to an information processor 14 composed of a personal computer or the like (see FIG. 2). The information processor 14 allows various types of information to be displayed on a display surface in various display forms, such as a figure, a table, a character, an image or the like. In the present embodiment, the information inputting person 10 arrives at the place (information input space) shown in FIG. 1 in front of the display 12. The information inputting person 10 points to a position on the display surface of the display 12 on which various information is displayed, while he/she makes a click motion (described below in detail), whereby he/she gives various instructions to the information processor 14 and allows various types of processing to be executed. [0092]
  • As shown in FIG. 2, a controller 22 of a hand pointing input apparatus 20 according to the present embodiment is connected to the information processor 14. The controller 22 includes CPU 22A, ROM 22B, RAM 22C, and an I/O interface 22D. These elements are connected to each other through a bus. The information processor 14, a non-volatile memory 24 capable of updating stored contents, a display 26 for displaying various types of information and a keyboard 28 for inputting various instructions and data by an operator are connected to the I/O interface 22D. [0093]
  • An illumination control device 30 is also connected to the I/O interface 22D of the controller 22. A plurality of near-infrared light illuminators 32A and 32B for emitting a light of a wavelength within a near-infrared range in a beam manner are connected to the illumination control device 30. As shown in FIG. 1, the near-infrared light illuminators 32A and 32B are arranged in different positions over the information input space. Their radiation ranges are adjusted so that the illuminators 32A and 32B may illuminate, from different directions, the information inputting person 10 who arrives at the information input space (see FIG. 3, too). The illumination control device 30 controls the switch-on/off of the illuminators 32A and 32B in response to the instruction from the controller 22. [0094]
  • An image pickup control device 34 is connected to the I/O interface 22D of the controller 22. A plurality of video cameras 36A and 36B arranged in different positions over the information input space (see FIG. 1) are connected to this image pickup control device 34. Although a detailed illustration of the video cameras 36A and 36B is omitted, each includes an area sensor composed of a near-infrared-light-sensitive CCD or the like. A filter for transmitting only the light of the wavelength within the near-infrared range is also disposed on the light-incident side of an imaging lens for forming incident light into an image on a receptor surface of the area sensor. [0095]
  • As shown in FIG. 3, the video camera 36A is oriented so that the information inputting person 10 who arrives at the information input space may be within its image pickup range. It is also oriented so that the light emitted from the illuminator 32A corresponding to the video camera 36A does not fall directly on the imaging lens, and so that the center of the image pickup range may cross the center of the range illuminated by the illuminator 32A at a predetermined height from the floor surface in the information input space. Therefore, the image pickup range of the video camera 36A is adjusted so that the range on the floor surface illuminated by the illuminator 32A corresponding to the video camera 36A may be out of the image pickup range. In the same manner, the video camera 36B is oriented so that the information inputting person 10 who arrives at the information input space may be within the image pickup range, the light emitted from the illuminator 32B corresponding to the video camera 36B may not fall directly on the imaging lens, and the center of the image pickup range may cross the center of the range illuminated by the illuminator 32B at a predetermined height from the floor surface in the information input space. Therefore, the image pickup range of the video camera 36B is adjusted so that the range on the floor surface illuminated by the illuminator 32B corresponding to the video camera 36B may be out of the image pickup range. [0096]
  • In this manner, the image pickup ranges of the video cameras 36A and 36B are adjusted so that the ranges on the floor surface illuminated by the different illuminators corresponding to the video cameras may be out of the image pickup ranges. [0097]
  • A mark plate driving unit 38 is also connected to the I/O interface 22D of the controller 22. As shown in FIG. 4, the hand pointing input apparatus 20 comprises a mark plate 40 arranged near the information input space. The mark plate 40 is composed of a multiplicity of marks 40A which are recorded so as to be equally spaced in a matrix form on a transparent flat plate. The mark plate 40 can be moved across the information input space in a direction perpendicular to the main surface of the mark plate 40 (the direction shown by arrow A in FIG. 4). The marks 40A are colored with a color which is easy to recognize on the image (for example, red). The mark plate driving unit 38 allows the mark plate 40 to be moved in the direction of arrow A in FIG. 4 in response to an instruction from the controller 22. [0098]
  • A function of the present embodiment will be described below. Firstly, the initialization of lattice point position information during installation of the hand pointing input apparatus 20 will be described with reference to the flow chart of FIG. 5. [0099]
  • In step 100, the mark plate driving unit 38 allows the mark plate 40 to be moved to a predetermined position (a position corresponding to an end of the moving range of the mark plate 40), namely, a reference position. In the next step 102, the three-dimensional coordinates (x, y, z), in the information input space, of the multiplicity of marks 40A recorded on the mark plate 40 at the current position of the mark plate 40 are calculated. In step 104, the image of the information input space is picked up by the video cameras 36A and 36B through the image pickup control device 34. In the next step 106, the image of the information input space picked up by the video camera 36A (referred to as an image A) is captured through the image pickup control device 34. [0100]
  • In step 108, the marks 40A in the image A captured in step 106 are recognized (extracted). In the next step 110, the positions (XA, YA) of all the recognized marks 40A on the image A are calculated. In step 112, the three-dimensional coordinates (x, y, z) in the information input space of all the marks 40A in the image A are made to correspond to the positions (XA, YA) of all the marks 40A on the image A, and this correspondence is stored in the memory 24 as the lattice point position information of the video camera 36A. [0101]
  • In subsequent steps 114 through 120, the processes of the video camera 36B are performed in the same manner as in the above-described steps 106 through 112. Namely, in the next step 114, the image of the information input space picked up by the video camera 36B (referred to as an image B) is captured through the image pickup control device 34. In step 116, the marks 40A in the image B captured in step 114 are recognized (extracted). In the next step 118, the positions (XB, YB) of all the recognized marks 40A on the image B are calculated. In step 120, the three-dimensional coordinates (x, y, z) in the information input space of all the marks 40A in the image B are made to correspond to the positions (XB, YB) of all the marks 40A on the image B, and this correspondence is stored in the memory 24 as the lattice point position information of the video camera 36B. [0102]
  • In the next step 122, whether or not the mark plate 40 has been moved to a final position (a position corresponding to the end, opposite to the predetermined position in step 100, of the moving range of the mark plate 40) is determined. If the determination is negative in step 122, the processing proceeds to step 124. In step 124, the mark plate driving unit 38 allows the mark plate 40 to be moved in a predetermined direction by a fixed distance (specifically, the distance corresponding to the space between the marks 40A on the mark plate 40). Then, the processing returns to step 102. [0103]
  • As described above, until the mark plate 40 reaches the final position, steps 102 through 124 are repeated. Thus, the multiplicity of marks 40A recorded on the mark plate 40 are moved to the positions corresponding to the multiplicity of lattice points (corresponding to the virtual points) which are uniformly spaced in a lattice arrangement in the information input space. The correspondence between the three-dimensional coordinates of the lattice points in the information input space and the positions thereof on the image A is stored in the memory 24 as the lattice point position information of the video camera 36A. The correspondence between the three-dimensional coordinates of the lattice points in the information input space and the positions thereof on the image B is also stored in the memory 24 as the lattice point position information of the video camera 36B. [0104]
  • The lattice point position information initialized by the above-mentioned lattice point position information initialization corresponds to the information for corresponding the three-dimensional coordinates of the virtual points to the positions of the virtual points on the image. The [0105] memory 24 corresponds to the storing means of the present invention. Since the mark plate 40 and the mark plate driving unit 38 are used only for the above-mentioned lattice point position information initialization and are not used for the following processing, the mark plate 40 and the mark plate driving unit 38 may be removed after the initialization.
  • Referring to the flow chart of FIG. 6, the following description is provided for an illumination control processing which is regularly carried out by the [0106] controller 22 after the above-mentioned lattice point position information initialization. In step 130, the illumination control device 30 switches on the illuminator 32A and switches off the illuminator 32B. In step 132, an image of the information input space is picked up by the video camera 36A, and the image is then output from the video camera 36A. In step 134, whether or not a predetermined time period has passed since the illuminator 32A was switched on is determined. The processing does not proceed until an affirmative determination is made.
  • If an affirmative determination is made in [0107] step 134, the processing proceeds to step 136. In step 136, the illumination control device 30 switches off the illuminator 32A and switches on the illuminator 32B. In step 138, an image of the information input space is picked up by the video camera 36B, and the image is then output from the video camera 36B. In step 140, whether or not a predetermined time period has passed since the illuminator 32B was switched on is determined. The processing does not proceed until an affirmative determination is made. Then, if an affirmative determination is made in step 140, the processing returns to step 130.
  • As shown in FIG. 7, too, the above-described illumination control processing allows the [0108] illuminators 32A and 32B to be alternately switched on/off at a predetermined time interval. When the illuminator 32A is switched on, the image is picked up by the video camera 36A, and image data indicating the image A picked up by the video camera 36A is then output to the controller 22 through the image pickup control device 34. When the illuminator 32B is switched on, the image is picked up by the video camera 36B, and the image data indicating the image B picked up by the video camera 36B is then output to the controller 22 through the image pickup control device 34.
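  • As a rough illustration of the timing in FIG. 7, the sketch below alternates two hypothetical illuminator/capture callables at a fixed interval and yields each image pair. The callable names and the use of time.sleep are assumptions for illustration, not the patent's implementation.

```python
import time

def alternate_illumination(interval_s, switch_on_a_off_b, switch_on_b_off_a,
                           capture_a, capture_b):
    """Generator mimicking the alternation of illuminators 32A/32B and the
    paired image pickup by video cameras 36A/36B (steps 130-140)."""
    while True:
        switch_on_a_off_b()          # step 130: illuminator 32A on, 32B off
        image_a = capture_a()        # step 132: image A from video camera 36A
        time.sleep(interval_s)       # step 134: wait the predetermined period
        switch_on_b_off_a()          # step 136: illuminator 32B on, 32A off
        image_b = capture_b()        # step 138: image B from video camera 36B
        time.sleep(interval_s)       # step 140: wait again, then repeat
        yield image_a, image_b
```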
  • In the present embodiment, since the image pickup is performed by means of a near-infrared light, the luminance of the image part corresponding to the [0109] information inputting person 10 in the picked up image is not influenced, and thus not altered by a change in the luminance of the display 12 when a visible light is emitted therefrom, or by the skin color or clothing color of the information inputting person 10. Therefore, in the instruction determination processing as described below, the image part corresponding to the information inputting person 10 can be extracted with a high level of accuracy. Moreover, even if a fluorescent tube, which is processed so that light of the wavelength of the near-infrared range is not emitted therefrom, is disposed near the information input space, the processing is not influenced by this fact. Furthermore, since the emission of the near-infrared light is not perceived by the information inputting person 10, the above-described alternate switch-on/off of the illuminators 32A and 32B does not give an uncomfortable feeling to the information inputting person 10.
  • Referring to the flow chart of FIGS. 8A and 8B, the following description is provided for the instruction determination processing for determining the instruction from the [0110] information inputting person 10, which is repeated at a predetermined time interval by the controller 22, together with the aforementioned illumination control processing.
  • In [0111] step 150, the image data indicating the image A output from the video camera 36A and the image data indicating the image B output from the video camera 36B are captured at the timing shown in FIG. 7. In the next step 152, whether or not the information inputting person 10 is present in the information input space is determined based on the image data of the images A and B captured in step 150.
  • As described above, the image of the information input space is picked up by the [0112] video camera 36A when the illuminator 32A alone is switched on, and the image pickup range of the video camera 36A is adjusted so as to be out of the range on the floor surface illuminated by the illuminator 32A. Accordingly, even if an object 50A which is not a subject to be recognized (see FIG. 3) such as the luggage of the information inputting person 10 or trash is present within the range on the floor surface illuminated by the illuminator 32A, this object 50A which is not the subject to be recognized is not within the image pickup range of the video camera 36A. Furthermore, if an object 50B which is not the subject to be recognized (see FIG. 3) is present within the range on the floor surface picked up by the video camera 36A, an image of the object 50B which is not the subject to be recognized is picked up by the video camera 36A. However, since the object 50B which is not the subject to be recognized is out of the range illuminated by the illuminator 32A, the luminance of the image part corresponding to the object 50B which is not the subject to be recognized in the image A is very low.
  • Furthermore, the image of the information input space is picked up by the [0113] video camera 36B when the illuminator 32B alone is switched on, and the image pickup range of the video camera 36B is adjusted so that it may be out of the range on the floor surface illuminated by the illuminator 32B. Accordingly, even if the object 50B which is not the subject to be recognized is present on the floor surface illuminated by the illuminator 32B, this object 50B which is not the subject to be recognized is not within the image pickup range of the video camera 36B. Furthermore, if the object 50A which is not the subject to be recognized is present within the range on the floor surface picked up by the video camera 36B, the image of the object 50A which is not the subject to be recognized is picked up by the video camera 36B and thus the image part corresponding to the object 50A which is not the subject to be recognized is present in the image B. However, in the same manner as described above, the luminance of the image part corresponding to the object 50A is very low.
  • Therefore, in the [0114] previous step 152, whether or not the information inputting person 10 is present in the information input space can be determined by a very simple determination of, for example, whether or not the image part having a high luminance, and an area of a predetermined value or more, is present in the images A and B. When a negative determination is made in step 152, no processing is carried out and the instruction determination processing is completed.
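  • A minimal sketch of the presence check in step 152 follows; the luminance threshold and minimum area below are assumed example values, since the patent only requires an image part of high luminance and of a predetermined area or more.

```python
import numpy as np

LUMINANCE_THRESHOLD = 200   # assumed 8-bit value regarded as "high luminance"
MIN_AREA_PIXELS = 5000      # assumed "predetermined value" for the bright area

def person_present(image_a, image_b):
    """Step 152: judge that the information inputting person is present when
    both images contain a sufficiently large high-luminance image part."""
    def bright_area(image):
        return int(np.count_nonzero(np.asarray(image) >= LUMINANCE_THRESHOLD))
    return (bright_area(image_a) >= MIN_AREA_PIXELS and
            bright_area(image_b) >= MIN_AREA_PIXELS)
```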
  • On the other hand, if an affirmative determination is made in [0115] step 152, the processing proceeds to step 154. The processing from step 154 corresponds to the determining means of the present invention. In step 154, the image parts corresponding to the full-length image of the information inputting person 10 are extracted from the images A and B. The image part corresponding to the full-length image of the information inputting person 10 can also be easily extracted by determining a continuous region which is composed of high-luminance pixels and has an area of a predetermined value or more.
  • In step [0116] 156, the height of the information inputting person 10 is calculated based on the image part corresponding to the full-length image of the information inputting person 10. As shown in FIG. 9, f denotes the focal length of the imaging lens of the video camera positioned at a point O, H denotes the distance between the point O and an intersection point Q of the floor surface in the information input space with a vertical line passing through the point O, R denotes the distance between the point Q and a point P on the floor surface on which the information inputting person 10 is standing, and the distance h between the point P and a point P′ corresponding to the top of the head of the information inputting person 10 is defined as the height of the information inputting person 10. Assuming that θ denotes ∠POQ, θ′ denotes ∠P′OQ, h′ denotes the length of the image of the information inputting person 10 formed on the receptor surface of the area sensor of the video camera, a point p denotes the imaging point on the receptor surface corresponding to the point P, a point p′ denotes the imaging point on the receptor surface corresponding to the point P′, r denotes the distance between a center o of the receptor surface and the point p, and r′ denotes the distance between the center o of the receptor surface and the point p′, the angles θ and θ′ and the distances r and r′ can be determined by the following equations (1) through (4).
  • θ = tan⁻¹(R/H)  (1)
  • θ′ = tan⁻¹{R/(H−h)}  (2)
  • r = fθ  (3)
  • r′ = fθ′  (4)
  • Therefore, the height h of the [0117] information inputting person 10 and the distance R can be determined by the following equations (5) and (6).
  • h=H{1−tan(r/f)/tan(r′/f)}  (5)
  • R=Htan(r/f)  (6)
  • Since the distance H and the focal length f are already known, in step [0118] 156, the distances r and r′ are determined from either the image A or the image B picked up by the video cameras 36A or 36B, and these determined distances r and r′ are then substituted in the equation (5), whereby the height h of the information inputting person 10 can be found. In step 156, the distances r are found from the images A and B, and the determined distances r are then substituted in the equation (6) so that the distances R are found, whereby the position (two-dimensional coordinates) of the information inputting person 10 on the floor surface is determined.
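  • Equations (5) and (6) translate directly into code. The short sketch below assumes that r, r′ and the focal length f are expressed in the same units, so that r/f and r′/f give the angles θ and θ′ in radians.

```python
import math

def height_and_floor_distance(r, r_prime, H, f):
    """Recover the height h of the information inputting person and the floor
    distance R from the image-plane distances r and r' (FIG. 9).
    H is the camera height above the floor and f the focal length of the
    imaging lens; r, r' and f must share the same units."""
    h = H * (1.0 - math.tan(r / f) / math.tan(r_prime / f))   # equation (5)
    R = H * math.tan(r / f)                                   # equation (6)
    return h, R
```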
  • In [0119] next step 158, the three-dimensional coordinates (x0, y0, z0) of a reference point P0 of the information inputting person 10 are determined based on the height h of the information inputting person 10 and the position of the information inputting person 10 on the floor surface determined in step 156. For example, the point (the point P0 shown in FIG. 11) corresponding to the back of the information inputting person 10 or the like can be used as the reference point P0. In this case, the height (for example, the value z0) of the reference point P0, corresponding to the back of the information inputting person 10, from the floor surface is calculated in accordance with the height h of the information inputting person 10. Then, the position (plane coordinates) of the information inputting person 10 on the floor surface is set to the plane coordinates (for example, the values x0 and y0) of the reference point P0, whereby the three-dimensional coordinates of the reference point P0 can be determined.
  • In [0120] step 159, whether or not the information inputting person 10 is making the pointing motion (the motion of pointing toward the display 12 with a finger or the like) is determined based on the shapes of the image parts corresponding to the full-length images of the information inputting person 10 in the images A and B. Since the direction of the display 12 as seen from the information inputting person 10 is already known, the determination in step 159 can be accomplished by, for example, determining whether or not a portion projecting toward the display 12, as seen from the information inputting person 10, is present at a height determinable as the position of the hand of the information inputting person 10, in the image part corresponding to the full-length image of the information inputting person 10.
  • Thus, when the [0121] information inputting person 10 changes his/her attitude from an upright standing attitude, as shown in FIG. 12A, into an attitude of pointing with the hand to the display 12, as shown in FIG. 12B or 12C, it is determined that the information inputting person 10 is making a pointing motion. If a negative determination is made in step 159, no processing is performed and the instruction determination processing is completed. On the other hand, if an affirmative determination is made in step 159, the processing proceeds to step 160.
  • In [0122] step 160, a feature point PX of the information inputting person 10 in the image A is extracted on the basis of the image data indicating the image A captured from the video camera 36A, and the position (XA, YA) of the feature point PX on the image A is calculated. The point corresponding to the fingertip pointing to the display 12 or the like can be used as the feature point PX of the information inputting person 10. In this case, the calculation can be accomplished by defining, as the position of the feature point PX, the position of the tip of the portion projecting toward the display 12 at a height determinable as the position of the hand of the information inputting person 10, within the image part indicating the full-length image of the information inputting person 10.
  • Thus, when the image of the hand of the [0123] information inputting person 10 is picked up by the video camera 36A, as shown in FIG. 10A, the coordinates (XA, YA) of the feature point PX, as shown in FIG. 10B, are calculated in order to determine the position of the feature point PX.
  • In step [0124] 162, all the lattice points whose positions on the image A are within the range (a range R shown in FIG. 10B) of (XA±dX, YA±dY) are searched based on the lattice point position information of the video camera 36A stored in the memory 24. The sizes of dX and dY are defined on the basis of the space between the lattice points (the space between the marks 40A) so that at least one lattice point or more may be extracted.
  • In the present embodiment, a wide-angle lens is used as the imaging lens of the video camera. Thus, assuming that dX and dY are constant, the longer the distance between the video camera and the lattice points gets, the more lattice points fall within the range of (XA±dX, YA±dY) [0125], thereby resulting in a deterioration of the accuracy of calculating the three-dimensional coordinates of the feature point PX as described below. Thus, dX and dY are set so that their values are reduced as the distance from the video camera, in the three-dimensional coordinate space, becomes longer. Therefore, the range corresponding to (XA±dX, YA±dY) in the three-dimensional coordinate space is shaped into a quadrangular pyramid whose bottom surface is positioned on the side of the video camera. In this step 162, the virtual points positioned within a predetermined range including the feature point on the image are extracted.
  • In [0126] step 164, in the same manner as the previous step 160, the feature point PX of the information inputting person 10 in the image B is extracted on the basis of the image data indicating the image B, captured from the video camera 36B, and the position (XB, YB) of the feature point PX on the image B is calculated. In step 166, in the same manner as the previous step 162, all the lattice points whose positions on the image B are within the range of (XB±dX, YB±dY) are searched on the basis of the lattice point position information of the video camera 36B stored in the memory 24. In this step 166, the virtual points positioned within a predetermined range including the feature point on the image are also extracted.
  • In [0127] next step 168, the common extracted lattice points are determined on the basis of the lattice points extracted from the images A and B as described above. By this determination, only a plurality of lattice points in the position adjacent to the feature point PX in the information input space are extracted. In step 170, the three-dimensional coordinates of the common lattice points extracted from the images A and B are captured from the lattice point position information.
  • In this embodiment, as described below, the three-dimensional coordinates of the feature point PX [0128] are calculated by an interpolation from the three-dimensional coordinates of the plural lattice points in the positions adjacent to the feature point in the information input space (more specifically, a coordinate value of the three-dimensional coordinates of the feature point is found by a weighted average of the coordinate values of the three-dimensional coordinates of the plural lattice points). Thus, prior to the calculation of the three-dimensional coordinates of the feature point PX, in the next step 172, a rate of interpolation from the three-dimensional coordinates of the common lattice points extracted from the images A and B (a weight given to the coordinate values of the three-dimensional coordinates of the lattice points) is determined based on the positions on the images A and B of the common lattice points extracted from the images A and B, the position (XA, YA) of the feature point PX on the image A, and the position (XB, YB) of the feature point PX on the image B. For example, this rate of interpolation can be determined so that the weight given to the coordinate values of the three-dimensional coordinates of the lattice points in the positions adjacent to the feature point on the images A and B may be increased.
  • In [0129] step 174, the three-dimensional coordinates (XX, YX, ZX) of the feature point PX are calculated on the basis of the three-dimensional coordinates of the common lattice points extracted from the images A and B and the rate of interpolation determined in step 172.
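  • The interpolation of steps 168 through 174 can be sketched as a weighted average over the common lattice points, with weights that grow as a lattice point's image position approaches the feature point. The inverse-distance weighting below is one plausible choice for the rate of interpolation; the patent does not prescribe the exact rule.

```python
import numpy as np

def interpolate_feature_point(lattice_3d, lattice_on_a, lattice_on_b,
                              feature_on_a, feature_on_b):
    """Steps 168-174 (sketch): estimate the 3D coordinates of the feature point
    from the common lattice points extracted from images A and B.

    lattice_3d     -- (N, 3) 3D coordinates of the common lattice points
    lattice_on_a/b -- (N, 2) their positions on images A and B
    feature_on_a/b -- (2,) position of the feature point on images A and B
    """
    lattice_3d = np.asarray(lattice_3d, dtype=float)
    d_a = np.linalg.norm(np.asarray(lattice_on_a, dtype=float)
                         - np.asarray(feature_on_a, dtype=float), axis=1)
    d_b = np.linalg.norm(np.asarray(lattice_on_b, dtype=float)
                         - np.asarray(feature_on_b, dtype=float), axis=1)
    weights = 1.0 / (d_a + d_b + 1e-6)   # step 172: nearer lattice points weigh more
    weights /= weights.sum()
    return weights @ lattice_3d          # step 174: interpolated (x, y, z) of the feature point
```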
  • In [0130] step 176, based on the three-dimensional coordinates of the reference point P0 of the information inputting person calculated in the previous step 158, and the three-dimensional coordinates of the feature point PX calculated in step 174, the direction of an extended virtual line (see virtual line 54 in FIG. 11) connecting the reference point and the feature point is determined as the direction pointed to by the information inputting person 10, and the coordinates (plane coordinate) of the intersection point (see point S in FIG. 11) of the plane, including the display surface of the large-screen display 12, and the virtual line are calculated in order to determine the position pointed to by the information inputting person 10.
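  • The position pointed to in step 176 is the intersection of the virtual line through the reference point P0 and the feature point PX with the plane that contains the display surface. A minimal sketch follows, assuming the plane is given by a point on it and its normal vector.

```python
import numpy as np

def pointed_position(p0, px, plane_point, plane_normal):
    """Step 176 (sketch): intersect the virtual line P0 -> PX with the plane
    including the display surface of the display 12 (point S in FIG. 11)."""
    p0 = np.asarray(p0, dtype=float)
    direction = np.asarray(px, dtype=float) - p0     # direction pointed to
    normal = np.asarray(plane_normal, dtype=float)
    denom = direction @ normal
    if abs(denom) < 1e-9:
        return None                                  # line runs parallel to the display plane
    t = ((np.asarray(plane_point, dtype=float) - p0) @ normal) / denom
    if t <= 0:
        return None                                  # the person is pointing away from the display
    return p0 + t * direction                        # coordinates of the intersection point S
```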
  • In the [0131] next step 178, whether or not the information inputting person 10 is pointing to the display surface of the large-screen display 12 is determined based on the coordinates determined in step 176. If a negative determination is made, a monitor flag (the flag for monitoring the click motion) is set at 0 in step 180 so as to thereby complete the instruction determination processing. On the other hand, if an affirmative determination is made in step 178, the coordinates indicating the position pointed to by the information inputting person 10 calculated in step 176 are output to the information processor 14 in step 182. Thus, the information processor 14 performs processing such as displaying a cursor at a predetermined position, which is judged to be the position pointed to by the information inputting person 10, on the display surface of the display 12.
  • From the [0132] next step 184 and the steps following step 184, whether or not the information inputting person 10 makes the click motion is determined. In the present embodiment, the click motion is defined as any motion of the hand of the information inputting person (for example, bending and turning a wrist, bending and extending a finger or the like). In step 184, the image part corresponding to the hand of the information inputting person 10 in the image A is extracted so that the area of the corresponding image part is calculated, and the image part corresponding to the hand of the information inputting person 10 in the image B is also extracted so that the area of the corresponding image part is calculated.
  • In [0133] next step 186, whether or not the monitor flag is 1 is determined. Since a negative determination in step 186 indicates that the information inputting person 10 has not pointed to the display surface of the display 12 during the previous instruction determination processing, the monitor flag is set at 1 in step 188. In the next step 190, the area of the image part corresponding to the hand of the information inputting person 10 calculated in step 184 is stored in the RAM 22C in order to later determine the click motion, and the instruction determination processing is completed.
  • On the other hand, since an affirmative determination in [0134] step 186 indicates that the information inputting person 10 is continuing to point at the display surface of the display 12, the processing proceeds to step 192. In step 192, the area calculated in step 184 is compared to the area stored in the RAM 22C or the like (the area which was calculated when the information inputting person 10 started pointing at the display surface of the display 12, namely, at the time when the monitor flag was set at 1 in step 188), whereby it is determined whether or not the area of the image part corresponding to the hand of the information inputting person 10 has changed beyond a predetermined value. A negative determination in step 192 indicates that the information inputting person 10 has not made the click motion, so that the instruction determination processing is completed without any processing.
  • When the [0135] information inputting person 10 bends or turns the wrist (for example, changes from the attitude shown in FIG. 12B into the attitude shown in FIG. 12C or vice versa) or he/she bends or extends a finger, the areas of the image parts corresponding to the hand of the information inputting person 10 in the images A and B change beyond the predetermined value, whereby an affirmative determination is made in step 192. When an affirmative determination is made in step 192, the information indicating "click detected" is output to the information processor 14 in step 194. In the next step 196, the monitor flag is set at 0 and the instruction determination processing is then completed.
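  • The click determination of steps 184 through 196 reduces to comparing the current area of the hand image part with the area stored when pointing began. A minimal sketch follows; the 20% relative change is an assumed example of the "predetermined value".

```python
def area_click_detected(current_area, stored_area, relative_change=0.2):
    """Steps 192-194 (sketch): report a click when the area of the image part
    corresponding to the hand has changed beyond a predetermined value since
    the monitor flag was set (step 188)."""
    if stored_area <= 0:
        return False
    return abs(current_area - stored_area) / stored_area >= relative_change
```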
  • Thus, the [0136] information processor 14 determines that a predetermined position on the display surface of the display 12, pointed to by the information inputting person 10, (the position corresponding to the coordinates input in step 182) is clicked. Then, the information processor 14 performs the processing in response to the information displayed at a predetermined position on the display surface of the display 12.
  • The [0137] controller 22 of the hand pointing input apparatus 20 repeats the above-described instruction determination processing at a predetermined time interval, whereby it is possible to determine, in real time, the position on the display surface of the display 12 pointed to by the information inputting person 10 and whether or not the click motion is detected. Thus, various uses are possible as described below by combining the instruction determination processing with the processing executed by the information processor 14.
  • For example, the [0138] display 12 is installed on the wall surface in an underground shopping mall or the like, and a product advertisement or the like is displayed on the display 12 by the information processor 14. In this case, the hand pointing input apparatus 20 according to the present embodiment permits an interactive communication with a user, for example, a picture may be displayed describing a particular product in detail, in response to the instruction of the user (the information inputting person). Furthermore, if the user possesses a pre-paid card, the user can buy the product by paying with this card.
  • Moreover, for example, the [0139] display 12 is installed in an entrance of a building, and an information map giving a guide to the building or the like is displayed on the display 12 by the information processor 14. In this case, the hand pointing input apparatus 20 according to the present embodiment permits interactive communication with the user, for example, a picture may be displayed describing in detail the place in the building which the user intends to visit, or a route to the place the user intends to visit may be shown in response to the instruction of the user (the information inputting person).
  • In general, operating manuals and other manuals are not carried into a clean room. However, for example, the [0140] display 12 may be arranged outside the clean room so as to be visible from inside the clean room, and the contents of the operating and other manuals are displayed on the display 12 in response to the instruction from the operator in the clean room determined by the hand pointing input apparatus 20, whereby interactive communication between the inside and the outside of the clean room is possible, so that operating efficiency in the clean room is improved.
  • The following applications are also possible. For example, the large-[0141] screen display 12, the hand pointing input apparatus 20, and the information processor 14 may be operated as a game machine in an amusement park. In a presentation at a conference, an explanation may be displayed on the display 12, and an optional position on the display surface of the display 12 is pointed at.
  • In the above description, the image pickup range of the [0142] video camera 36A is adjusted so that the range on the floor surface illuminated by the illuminator 32A may be out of the image pickup range of the video camera 36A, while the image pickup range of the video camera 36B is adjusted so that the range on the floor surface illuminated by the illuminator 32B may be out of the image pickup range of the video camera 36B. The image pickup is performed by the video camera 36A when the illuminator 32A alone is switched on, while the image pickup is performed by the video camera 36B when the illuminator 32B alone is switched on. Although the images A and B, from which the image parts corresponding to the information inputting person 10 are easily extracted, are thus picked up, the present invention is not limited to this example. Even if the range on the floor surface illuminated by the illuminator 32 is within the image pickup range of the video camera, it is possible to pick up images from which the image parts corresponding to the information inputting person 10 are easily extracted.
  • In the example shown in FIG. 13, the image pickup range of a [0143] video camera 36 includes the range on the floor surface illuminated by the illuminator 32A, and the range on the floor surface illuminated by the illuminator 32B. The object 50A, which is not the subject to be recognized on the floor surface illuminated by the illuminator 32A, and the object 50B, which is not the subject to be recognized on the floor surface illuminated by the illuminator 32B, are picked up by the video camera 36. In such cases, the illumination control processing shown in FIG. 14 may be performed.
  • In the illumination control processing shown in FIG. 14, in [0144] step 250, the illuminator 32A is switched on and the illuminator 32B is switched off. Then, in step 252, an image of the information input space is picked up by the video camera 36. In step 254, the image data output from the video camera 36 (the image indicated by the image data is referred to as a first image) is captured and stored in the RAM 22C. In step 256, whether or not a predetermined time T has passed after the illuminator 32A was switched on is determined. Until the predetermined time T passes, the processing is not performed. If an affirmative determination is made in step 256, the processing proceeds to step 258. In step 258, the illuminator 32B is switched on, and the illuminator 32A is switched off after a predetermined time t0 passes after the illuminator 32B is switched on (where it should be noted that t0<T: see FIG. 15).
  • In the [0145] next step 260, an image of the information input space is picked up by the video camera 36. In step 262, the image data output from the video camera 36 (the image indicated by the image data is referred to as a second image) is captured. In step 264, the lower luminance value of the luminance values of a certain pixel in the first and second images is selected based on the image data indicating the first image stored in the RAM 22C in step 254, and the image data indicating the second image captured in step 262. The selected luminance value is used as the luminance value of the pixel. This processing is performed for all the pixels, whereby new image data is generated and the generated image data is output.
  • In this illumination control processing, as shown in FIG. 15, since the time period when the [0146] illuminator 32A is switched on overlaps with the time period when the illuminator 32B is switched on during the predetermined time t0, the information inputting person 10 is illuminated at all times. On the other hand, as shown in FIG. 13, the object 50A which is not the subject to be recognized is illuminated only when the illuminator 32A is switched on, and the object 50B which is not the subject to be recognized is illuminated only when the illuminator 32B is switched on. Therefore, by the processing in step 264, it is possible to obtain the image in which only the image part corresponding to the information inputting person 10 has high luminance, namely, the image from which the image part corresponding to the information inputting person 10 is easily extracted (or the image data indicating this image).
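  • The selection in step 264 is simply a pixel-wise minimum of the two images, as in the sketch below. NumPy is used here only for brevity; the patent does not specify an implementation.

```python
import numpy as np

def compose_person_only_image(first_image, second_image):
    """Step 264 (sketch): keep, for every pixel, the lower of the two luminance
    values.  The information inputting person, lit in both images, remains
    bright; objects lit by only one illuminator drop to their dark value."""
    return np.minimum(np.asarray(first_image), np.asarray(second_image))
```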
  • In the [0147] next step 266, whether or not a predetermined time T has passed after the illuminator 32B was switched on is determined. Until the predetermined time T passes, the processing is not performed. If an affirmative determination is made in step 266, the processing proceeds to step 268. In step 268, the illuminator 32A is switched on, and the illuminator 32B is switched off after a predetermined time t0 passes after the illuminator 32A is switched on. Then, the processing is returned to step 252.
  • For a simple description, a [0148] single video camera 36 alone is shown in FIG. 13, and the processing alone for a single video camera 36 is shown in FIG. 14. However, even if a plurality of video cameras 36 for picking up the information input space from different directions are provided, the above-described processing is performed for each video camera 36, whereby it is possible to obtain the images from which the image parts corresponding to the information inputting person 10 are easily extracted.
  • In the illumination control processing shown in FIG. 14, the image data is captured in synchronization with the switch-on/off timing of the [0149] illuminators 32A and 32B, only during the time period when either the illuminator 32A or 32B is switched on. However, for example, regardless of the switch-on/off timing of the illuminators 32A and 32B, the image data may be captured at a period equal to the predetermined time T divided by an integer (see FIGS. 14 and 15), and the processing in step 264 may then be performed at a period of 2×T.
  • Instead of selecting the lower luminance value of each pixel as in the [0150] previous step 264, the illuminators 32A and 32B may, for example, be alternately switched on in fixed cycles with the overlap period t0 intervening between cycles (whereby the ratio of switch-on time for each of the illuminators 32A and 32B is 50+a %, where a corresponds to the overlap period). For each pixel, the average luminance over one switch-on cycle of the illuminators 32A and 32B may then be used as the luminance value of the pixel. Alternatively, for the change in the luminance of each pixel in one switch-on cycle of the illuminators 32A and 32B, the direct-current component alone of the change in the luminance is extracted by a low-pass filter, a fast Fourier transform, or the like, whereby the luminance value corresponding to the extracted direct-current component of the luminance change may be used as the luminance value of each pixel. Even in these cases, a relatively high luminance value is used as the luminance value of the pixels corresponding to the information inputting person 10, who is always illuminated by the illuminator 32A or 32B during one switch-on cycle of the illuminators 32A and 32B. It is thus possible to obtain an image from which the image part corresponding to the information inputting person 10 is easily extracted.
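  • For the averaging alternative just described, extracting the direct-current component of each pixel's luminance over one switch-on cycle is, for a real-valued signal, the same as taking its mean over that cycle. A minimal sketch under that reading:

```python
import numpy as np

def dc_luminance_over_cycle(frames):
    """Average each pixel's luminance over one switch-on cycle of the
    illuminators 32A and 32B; this equals the 0 Hz (direct-current) component
    of the luminance change, up to the usual 1/N scaling of the FFT."""
    frames = np.asarray(frames, dtype=float)   # shape: (frames_per_cycle, height, width)
    return frames.mean(axis=0)
```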
  • In order to obtain an image from which the image part corresponding to the [0151] information inputting person 10 is easily extracted, as shown in FIG. 16, a slope platform 58 may be arranged on the floor surface in the information input space. The slope platform 58 includes an inclined surface 58A which is formed so that it may surround the information inputting person 10 who enters the information input space. Thus, for example, even if the information inputting person 10 comes to the information input space with luggage or the like, the slope platform 58 prevents the information inputting person 10 from putting the luggage or the like near himself/herself, so that the luggage or the like is put apart from the information inputting person 10, namely, out of the image pickup range of the video camera 36. Therefore, the presence of an image part corresponding to an object which is not the subject to be recognized, such as the luggage of the information inputting person 10, in the image picked up by the video camera 36 is prevented. It is thus possible to obtain the image from which the image part corresponding to the information inputting person 10 is easily extracted.
  • When an object which is not the subject to be recognized, such as relatively small trash or dust remains around the [0152] information inputting person 10, a fan or the like for generating an air flow may be provided around the information inputting person 10 so that the object which is not the subject to be recognized may be blown away by the air flow. Alternatively, a storage tank for storing water or the like may be also arranged around the information inputting person 10. Furthermore, the storage tank may be circular in shape so that the water or the like may circulate through the storage tank. With a construction such as this, it is also possible to prevent an object which is not the subject to be recognized from remaining around the information inputting person 10.
  • Although, in the above description, the lattice point position information is set by the use of the mark plate [0153] 40 composed of many marks 40A which are recorded so that they may be equally spaced in a matrix shape on the transparent flat plate, the present invention is not limited to this example. As shown in FIG. 17, a mark plate 62, in which markers composed of many light emitting devices 62A such as LED are arranged in a matrix shape on the transparent flat plate, may be used.
  • In this case, in the lattice point position information initialization, one [0154] light emitting device 62A at a time is sequentially switched on. Whenever each light emitting device 62A is switched on, the three-dimensional coordinates of the switched-on light emitting device 62A are calculated. An image of the information input space is picked up by the video cameras 36A and 36B. The position of the light emitting device 62A on the images A and B is calculated. The three-dimensional coordinates of the light emitting device 62A are made to correspond to the position of the light emitting device 62A on the images A and B. This correspondence is stored in the memory 24 as the lattice point position information. After all the light emitting devices 62A on the mark plate 62 are switched on, the mark plate 62 is moved by a fixed amount by the mark plate driving unit 38. The above processing has only to be repeated.
  • As shown in FIG. 18, the mark plate [0155] 40 and the mark plate 62 can be replaced by a robot arm unit 66 capable of moving a hand 66B mounted on the end of an arm 66A to an optional position in the information input space in which the marker composed of a light emitting device 68 is attached to the hand 66B. In this case, in the lattice point position information initialization, the light emitting device 68 is switched on, and the light emitting device 68 is moved to the positions corresponding to many lattice points constantly spaced in the lattice arrangement in the information input space. Whenever the light emitting device 68 is positioned in each position, the three-dimensional coordinates of the light emitting device 68 are calculated. The image of the information input space is picked up by the video cameras 36A and 36B. The position of the light emitting device 68 on the images A and B is calculated. The three-dimensional coordinates of the light emitting device 68 are allowed to correspond to the position of the light emitting device 68 on the images A and B. This correspondence has only to be stored in the memory 24 as the lattice point position information.
  • Furthermore, instead of an automatic positioning of the markers (the marks [0156] 40A, the light emitting devices 62A or the light emitting device 68) in the positions corresponding to a multiplicity of lattice points uniformly spaced in the lattice arrangement in the information input space by driving the mark plate 40, the mark plate 62, the robot arm unit 66 or the like as described above, the markers are manually positioned in the positions corresponding to the multiplicity of lattice points by the operator and an image of this situation is picked up, whereby the lattice point position information initialization alone may be automatically performed.
  • The mark plate shown in FIGS. 17 and 18 can be also applied to the use of at least one video camera and a plurality of illuminators as shown in FIG. 13. [0157]
  • In the instruction determination processing shown in FIGS. 8A and 8B, when the [0158] information inputting person 10 does not make the pointing motion (when the negative determination is made in step 159), the coordinates of the position on the display surface of the display 12 pointed at by the information inputting person 10 are not calculated and thus the coordinates are not output to the information processor 14. As a result, when the information inputting person 10 does not make the pointing motion, the cursor or the like is not displayed on the display 12. Therefore, in order to keep the cursor or the like displayed on the display 12, the information inputting person 10 is required to keep pointing to a desired position on which the cursor or the like is displayed. Disadvantageously, this results in a heavy burden on the information inputting person 10.
  • For this reason, the instruction determination processing shown in FIGS. 8A and 8B may be replaced by the instruction determination processing shown in FIG. 19. In this instruction determination processing, in the same manner as [0159] steps 150 and 152 in the instruction determination processing of FIGS. 8A and 8B, the image data output from the video cameras 36A and 36B is captured in step 230, and whether or not the information inputting person 10 is present in the information input space is then determined on the basis of the captured image data in next step 232.
  • If a negative determination is made, the processing proceeds to step [0160] 280. In step 280, whether or not an arrival flag (the flag for indicating that the information inputting person 10 has arrived at the information input space) is 1 is determined. Since the initial value of the arrival flag is 0, the negative determination is first made in step 280, so that the instruction determination processing is completed without any processing. When the information inputting person does not arrive at the information input space, a predetermined attraction picture (the picture for attracting passersby near the information input space to the information input space) is displayed on the display 12 by the information processor 14.
  • On the other hand, when the [0161] information inputting person 10 does arrive at the information input space, an affirmative determination is made in step 232, and the processing proceeds to step 234. In step 234, whether or not the arrival flag is 0 is determined. If the affirmative determination is made in step 234, the processing proceeds to step 236. In step 236, the information processor 14 is informed that the information inputting person has arrived at the information input space. Thus, the information processor 14 switches the picture displayed on the display 12 from the attraction picture to an initial picture (for example, for a product advertisement, this may be a picture indicating a product list or the like).
  • In the [0162] next step 238, since the information inputting person has arrived at the information input space, the arrival flag is set at 1, and a pointing flag (the flag for indicating that the information inputting person 10 is pointing to the display surface of the display 12) and the monitor flag are set at 0. The processing then proceeds to step 240. When a negative determination is made in step 234, namely, when the information inputting person remains in the information input space after the previous execution of the instruction determination processing, the processing proceeds to step 240 without any processing in steps 236 and 238.
  • In [0163] step 240, in the same manner as steps 154 through 158 of the flow chart of FIGS. 8A and 8B, the image parts corresponding to the full-length image of the information inputting person 10 are extracted from the images picked up by the video cameras 36A and 36B, and the height h and the position on the floor surface of the information inputting person 10 are calculated, whereby the three-dimensional coordinates of the reference point of the information inputting person 10 are determined. In next step 242, in the same manner as step 159 of the flow chart of FIGS. 8A and 8B, whether or not the information inputting person 10 is making a pointing motion is determined. If a negative determination is made in step 242, whether or not the pointing flag is 1 is determined in step 270. If a negative determination is also made in step 270, the instruction determination processing is completed.
  • On the other hand, when the [0164] information inputting person 10 changes his/her attitude from an upright standing attitude as shown in FIG. 12A into an attitude of pointing with the hand to the display 12 as shown in FIG. 12B or 12C, an affirmative determination is made in step 242, and then the processing proceeds to step 244. In step 244, in the same manner as steps 160 through 176 of the flow chart of FIGS. 8A and 8B, the three-dimensional coordinates of the feature point of the information inputting person 10 are calculated, and the position pointed to by the information inputting person 10 is then calculated.
  • In step [0165] 246, whether or not the information inputting person 10 points to the display surface of the display 12 is determined. If a negative determination is made in step 246, the processing proceeds to step 270. On the other hand, if an affirmative determination is made in step 246, the pointing flag is set at 1 in step 247. Then, in step 248, the coordinates of the position on the display surface of the display 12 pointed to by the information inputting person 10 are output to the information processor 14 and are stored in the RAM 22C or the like. Thus, the information processor 14 allows the cursor or the like to be displayed at the position on the display surface of the display 12 pointed to by the information inputting person 10.
  • The processing in the [0166] steps 250 through 262 is performed in the same manner as steps 184 through 196 of the flow chart of FIGS. 8A and 8B, whereby the click motion is detected. Namely, the image part corresponding to the hand of the information inputting person 10 in the image is extracted so that the area thereof is calculated (step 250), and whether or not the monitor flag is 1 is determined (step 252). If a negative determination is made in step 252, the monitor flag is set at 1 (step 254). The previously calculated area of the image part corresponding to the hand of the information inputting person is stored in the memory (step 256), and the instruction determination processing is completed.
  • If an affirmative determination is made in [0167] step 252, the area calculated in step 250 is compared to the area stored in the RAM 22C or the like, whereby whether or not the area of the image part corresponding to the hand of the information inputting person 10 has changed beyond a predetermined value is determined (step 258). If a negative determination is made in step 258, the determination that the information inputting person 10 is not making a click motion is made, so that the instruction determination processing is completed without any processing. On the other hand, if an affirmative determination is made in step 258, the information indicating "click detected" is output to the information processor 14 (step 260), whereby the information processor 14 executes predetermined processing such as replacing the picture displayed on the display 12. Then, the monitor flag and the pointing flag are set at 0 (step 262), and the instruction determination processing is completed.
  • If the [0168] information inputting person 10 points to the display surface of the display 12, and then he/she lowers the arm without a click motion, a negative determination is made in step 242 and the processing proceeds to step 270. At this time, since the pointing flag is 1, an affirmative determination is made in step 270, and then the processing proceeds to step 272. In step 272, the coordinates of the position on the display surface of the display 12 pointed to by the information inputting person 10 (calculated and stored in the RAM 22C in step 248) are output to the information processor 14. Thus, the information processor 14 allows the cursor to remain displayed at the position where the cursor was displayed before the information inputting person 10 lowered the arm.
  • In the above description, even if the attitude of the [0169] information inputting person 10 is changed from the attitude shown in FIG. 12B or 12C into the attitude shown in FIG. 12A, the cursor remains displayed. Thus, even when the information inputting person 10 desires to keep the cursor displayed (for example, during a presentation at a conference), the information inputting person 10 is not required to keep the arm raised. Accordingly, the burden on the information inputting person 10 can be reduced.
  • If the [0170] information inputting person 10 goes out of the information input space, a negative determination is made in step 232 even midway through a series of processing acts by the information processor 14, so that the processing proceeds to step 280. Since the arrival flag is set at 1 when the information inputting person 10 goes out of the information input space, an affirmative determination is made in step 280. In step 282, the information processor 14 is informed that the information inputting person 10 has gone out of the information input space. Thus, if the processing is midway through being executed, the information processor 14 stops the execution of the processing and switches the picture displayed on the display 12 to the attraction picture. In the next step 284, the arrival flag is set at 0, and the instruction determination processing is completed.
  • In this manner, when an [0171] information inputting person 10 is absent from the information input space, an attraction picture is always displayed on the display. Every time the information inputting person 10 comes to the information input space, the information processor 14 performs a series of processing acts starting with displaying the initial picture on the display 12.
  • Although, in the instruction determination processing shown in FIGS. 8 and 19, the click motion is defined as any motion of the hand of the information inputting person (for example, bending and turning the wrist, bending and extending a finger or the like), the present invention is not limited to these examples. A forward quick motion of the hand of the information inputting person [0172] 10 (see FIG. 22A, hereinafter referred to as a “forward click”) and a backward quick motion of the hand of the information inputting person 10 (see FIG. 22B, hereinafter referred to as a “backward click”) may be defined as the click motion. The above-described click motion can be detected by, for example, the instruction determination processing shown in FIG. 20 instead of the instruction determination processing shown in FIGS. 8 and 19.
  • Namely, in the instruction determination processing shown in FIG. 20, firstly, in [0173] step 310, in the same manner as step 152 of the flow chart of FIGS. 8A and 8B and step 232 of the flow chart of FIG. 19, whether or not the information inputting person 10 has arrived at (is present in) the information input space is determined. This determination can also be accomplished by the very simple determination of, for example, whether or not an image part having a high luminance and an area of a predetermined value or more is present in the images A and B. If a negative determination is made in step 310, the processing is delayed until an affirmative determination is made. When the information inputting person 10 arrives at the information input space, an affirmative determination is made in step 310, and then the processing proceeds to step 312. In step 312, a click motion speed setting processing is executed.
  • This click motion speed setting processing will now be described with reference to the flow chart of FIG. 21. In [0174] step 290, the information processor 14 is given an instruction to display on the display 12 a message to request the information inputting person 10 to make the click motion. The information processor 14 allows the message to be displayed on the display 12. When the message is displayed on the display 12, the information inputting person 10 bends or extends the arm and repeats the forward click motion or backward click motion.
  • In the [0175] next step 292, a reference point/feature point coordinates calculation processing (the same processing as in steps 154 through 176 of the flow chart of FIGS. 8A and 8B) is performed, whereby the three-dimensional coordinates of the reference point P0 and the feature point PX are determined. In step 294, whether or not the information inputting person 10 makes a pointing motion to point to the display 12 is determined. If a negative determination is made in step 294, the processing returns to step 292. Steps 292 and 294 are repeated until the information inputting person 10 makes the pointing motion. If an affirmative determination is made in step 294, the processing proceeds to step 296.
  • In [0176] step 296, a distance k between the reference point P0 and the feature point PX is calculated from the three-dimensional coordinates of the reference point P0 and the three-dimensional coordinates of the feature point PX which were determined in step 292. Step 296 is repeated; during the second and later repetitions, the rate of change of the distance k, that is, a velocity of change V (the moving speed of the position of the feature point PX relative to the reference point P0), is calculated based on the difference between the current value of the distance k and the previous value of the distance k. This calculation result is stored.
  • In the [0177] next step 298, whether or not a predetermined time passes after the message requesting the click motion is displayed on the display 12 is determined. If a negative determination is made in step 298, the processing is returned to step 292, and steps 292 through 298 are repeated. Therefore, until the predetermined time passes after the message of the request for the click motion is displayed, the calculation and storage of the velocity of change V of the distance k between the reference point P0 and the feature point PX are repeated.
  • If an affirmative determination is made in [0178] step 298, the processing proceeds to step 300. The previously calculated and stored velocity of change V is captured, and a click motion speed V0 is set and stored as the threshold value, based on the transition of the velocity of change V during a single click motion of the information inputting person 10. This click motion speed V0 is used as the threshold value for determining whether or not the information inputting person 10 is making the click motion in the processing described below. Thus, in order to determine with certainty that the information inputting person 10 is making a click motion, a click motion speed V0 can be set at, for example, a value which is slightly smaller than the average value of the velocity of change V during a single click motion of the information inputting person 10. Alternatively, the click motion speed V0 may be set at a minimum value of the velocity of change V during a single click motion of the information inputting person 10.
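  • A minimal sketch of step 300: derive the click motion speed V0 from the velocities of change V recorded during the rehearsed click motion, either slightly below their average or at their minimum. The 0.9 scaling factor is an assumed example of "slightly smaller than the average".

```python
import numpy as np

def set_click_motion_speed(velocities, margin=0.9, use_minimum=False):
    """Step 300 (sketch): set the threshold V0 from the stored velocities of
    change V of the distance k between the reference point and the feature point."""
    velocities = np.asarray(velocities, dtype=float)
    if use_minimum:
        return float(velocities.min())            # minimum observed velocity of change
    return float(margin * velocities.mean())      # slightly below the average velocity
```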
  • When the [0179] information inputting person 10 bends or extends an arm so as to thereby make the click motion, the moving speed (the velocity of change V) of the feature point PX varies depending on the information inputting person 10. However, the above-described click motion speed setting processing is executed every time an information inputting person 10 arrives at the information input space. Therefore, when a new information inputting person 10 arrives at the information input space, an appropriate new value is set as the click motion speed V0 in response to the physique, muscular strength, or the like of the new information inputting person 10.
  • When the above-described click motion speed setting processing is completed, the processing proceeds to step [0180] 314 of the instruction determination processing (FIG. 20). In step 314, the reference point/feature point coordinates calculation processing (the same processing as in steps 154 through 176 of the flow chart of FIGS. 8A and 8B) is performed, whereby the three-dimensional coordinates of the reference point P0 and the feature point PX are determined. In the next step 316, whether or not the information inputting person 10 is making the pointing motion is determined based on the three-dimensional coordinates of the reference point P0 and the feature point PX determined in step 314.
  • If a negative determination is made in [0181] step 316, the processing proceeds to step 334. In step 334, whether or not the information inputting person 10 has left the information input space is determined. In the same manner as step 310 described above, this determination can also be accomplished by the very simple determination of, for example, whether or not the image part having a high luminance and an area of a predetermined value or more is absent from the images A and B. If a negative determination is made, the processing returns to step 314. Steps 314, 316 and 334 are repeated until the information inputting person 10 makes the pointing motion.
  • If an affirmative determination is made in [0182] step 316, the processing proceeds to step 318. In step 318, based on the three-dimensional coordinates of the reference point P0 and the feature point PX calculated in step 314, and in the same manner as step 176 of the flow chart of FIGS. 8A and 8B, the coordinates of the intersection point of the plane including the display surface of the large-screen display 12 and the virtual line connecting the reference point and the feature point are calculated in order to determine the position pointed to by the information inputting person 10. In the next step 320, whether or not the information inputting person 10 points to the display surface of the large-screen display 12 is determined based on the coordinates calculated in step 318.
  • If a negative determination is made in [0183] step 320, the processing proceeds to step 334 without any processing. On the other hand, if an affirmative determination is made in step 320, in step 322, the coordinates calculated in step 318 are output to the information processor 14, whereby the information processor 14 is given the instruction to display the cursor. Thus, the information processor 14 performs the processing allowing the cursor to be displayed on a predetermined position, which is judged to be the position pointed to by the information inputting person 10, on the display surface of the display 12.
  • In the [0184] next step 324, the distance k between the reference point P0 and the feature point PX is calculated based on their three-dimensional coordinates, and whether or not the distance k has changed is determined. Step 324 is repeated while the information inputting person 10 points to the display surface of the display 12 (that is, while an affirmative determination is made in step 320). Since whether or not the distance k has changed cannot be determined when the distance k is calculated for the first time in step 324, a negative determination is unconditionally made in that case.
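The distance k used in step 324 and in the steps that follow is the ordinary Euclidean distance between the two three-dimensional points. A minimal sketch, with assumed names and an assumed tolerance for deciding that k has changed:

```python
import math

def distance_k(reference_point, feature_point):
    """Euclidean distance k between the reference point P0 and the feature
    point PX (both given as (x, y, z) tuples in the same coordinate system)."""
    return math.dist(reference_point, feature_point)

# A change in k between successive frames is what step 324 looks for.
k_prev = distance_k((0.2, 1.4, 1.5), (0.4, 1.3, 1.1))
k_curr = distance_k((0.2, 1.4, 1.5), (0.5, 1.3, 0.9))
k_changed = abs(k_curr - k_prev) > 1e-6   # tolerance is an assumption
```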
  • On the other hand, if an affirmative determination is made in [0185] step 324, the processing proceeds to step 326. In step 326, the velocity of change V of the distance k is calculated, and whether or not the calculated velocity of change V is equal to or greater than the threshold value (the click motion speed V0 set by the click motion speed setting processing) is determined. Since the velocity of change V of the distance k cannot be determined when the distance k is calculated for the first time in step 324, a negative determination is also unconditionally made in step 326 in that case. If a negative determination is made in step 324 or 326, it is determined that the information inputting person 10 is not making a click motion, and the processing proceeds to step 334 without any further processing.
  • If an affirmative determination is made in [0186] step 326, it is determined that the information inputting person 10 is making a click motion. In step 328, the direction of the change in the distance k is determined, and the processing branches according to the result of the determination. When the distance k has increased, it can be determined that the information inputting person 10 is making the forward click motion by quickly extending an arm, so the processing proceeds to step 330. In step 330, information indicating that the forward click has been detected is output to the information processor 14, and then the processing proceeds to step 334. On the other hand, when the distance k has decreased, it can be determined that the information inputting person 10 is making the backward click motion by quickly bending the arm, so the processing proceeds to step 332. In step 332, information indicating that the backward click has been detected is output to the information processor 14, and then the processing proceeds to step 334.
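Steps 326 through 332 amount to a rate test followed by a sign test on the change in k. The sketch below assumes the distance samples arrive at a known frame interval dt; the function name and the returned labels are illustrative only, not the patent's terminology.

```python
def classify_click(k_prev, k_curr, dt, v0):
    """Return 'forward', 'backward', or None for one pair of distance samples.

    k_prev, k_curr -- distance k between the reference point and the feature
                      point in two successive frames.
    dt             -- time between the frames (an assumed, known constant).
    v0             -- click motion speed threshold set beforehand.
    """
    velocity_of_change = abs(k_curr - k_prev) / dt
    if velocity_of_change < v0:          # too slow: not a click motion
        return None
    if k_curr > k_prev:                  # arm quickly extended
        return "forward"
    return "backward"                    # arm quickly bent

# Example: with V0 = 0.2 and dt = 1/30 s, a rapid increase in k is a forward click.
result = classify_click(0.42, 0.46, 1 / 30, 0.2)   # -> "forward"
```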
  • When the information indicating that the forward or backward click has been detected is input to the [0187] information processor 14, the information processor 14 determines that the current position pointed to by the information inputting person 10 is clicked. If the forward click is detected, a first processing corresponding to the current position pointed to is performed. If the backward click is detected, a second processing (differing from the first processing) corresponding to the current position pointed to is performed. When the information inputting person 10 goes out of the information input space, an affirmative determination is made in step 334, and the processing returns to step 310.
  • Since the click motion in this instruction determination processing is a very natural motion for pointing to and selecting a specific position on the display surface of the [0188] display 12, the person to be recognized can make the click motion without feeling uncomfortable. Moreover, since whether or not the click motion is performed, and whether the performed click motion is the forward click motion or the backward click motion, can be determined on the basis of the change in the distance k between the reference point and the feature point, the click motion can be detected in a short time. Since two types of click motion (the forward click motion and the backward click motion) are defined, the information inputting person can selectively execute the first processing and the second processing.
  • The natural movement of a person's hand after performing the forward click motion or the backward click motion is to try to return to the position (neutral position) it occupied before the click motion. Therefore, in order to prevent the motion of the hand returning to the neutral position after the forward click motion from being mistaken for the backward click motion, and to prevent the motion of the hand returning to the neutral position after the backward click motion from being mistaken for the forward click motion, it is desirable that the motion of the hand returning to the neutral position be ignored after the forward or backward click motion is detected. This can be accomplished by, for example, stopping the detection of the click motion for a predetermined time after the forward or backward click motion is detected. Alternatively, it can also be accomplished in the following manner. That is, the value of the distance k before the forward or backward click motion is detected is stored in advance as the value corresponding to the neutral position. Then, the detection of the click motion is stopped until the value of the distance k returns to the value corresponding to the neutral position after the forward or backward click motion is detected. [0189]
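Both suppression strategies described above (a fixed dead time, or waiting until k returns to the stored neutral value) fit in a few lines. The following sketch implements the second strategy under assumed names; the tolerance value is an arbitrary illustrative choice.

```python
class ClickSuppressor:
    """Ignore the hand's return to the neutral position after a click.

    Before a click is detected, the current distance k is stored as the
    neutral value; after a detection, further clicks are suppressed until
    k comes back near that stored value.
    """

    def __init__(self, tolerance=0.01):         # tolerance is an assumption
        self.neutral_k = None
        self.suppressed = False
        self.tolerance = tolerance

    def update(self, k, click_detected):
        if self.suppressed:
            # Wait for the hand to return to the neutral position.
            if self.neutral_k is not None and abs(k - self.neutral_k) <= self.tolerance:
                self.suppressed = False
            return False                         # discard clicks meanwhile
        if click_detected:
            self.suppressed = True
            return True                          # report the click once
        self.neutral_k = k                       # keep tracking the neutral value
        return False
```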
  • In the above-mentioned instruction determination processing, in the same manner as the instruction determination processing shown in FIG. 19, when the information inputting person lowers the arm, the cursor may, needless to say, remain displayed at the position at which it was displayed before the arm was lowered. [0190]
  • Although, in the above description, the position pointed to by the information inputting person is calculated on the basis of the three-dimensional coordinates of the reference point and the feature point of the information inputting person, the present invention is not limited to this example. As shown in FIG. 23, an [0191] image part 72 corresponding to the full-length image of the information inputting person 10 is extracted from the image picked up by the video camera, and the height h and the position on the floor surface of the information inputting person 10 are calculated. Furthermore, after other parameters concerning the information inputting person 10, such as their shape, have been determined, the full-length image of the information inputting person is converted into a dummy model 74 on the basis of various parameters including the height h. Various motions of the information inputting person, including the motion of pointing to the display surface of the display 12, may then be recognized on the basis of this dummy model.
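The conversion into the dummy model of FIG. 23 is not given in code in the specification. The sketch below only illustrates its first part, deriving the height h and a foot position from a binary full-length silhouette mask, under the simplifying (and assumed) calibration that image rows map to real heights by a constant factor; the patent itself derives such quantities from the camera geometry.

```python
import numpy as np

def silhouette_metrics(mask, metres_per_pixel):
    """Estimate the height h and the foot position of the person from a binary
    full-length silhouette mask (True where the image part 72 is present).

    metres_per_pixel -- assumed calibration factor mapping image rows to real
                        heights; an illustrative stand-in for the camera model.
    """
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    top, bottom = rows.min(), rows.max()
    height_h = (bottom - top) * metres_per_pixel
    foot_column = int(np.median(cols[rows == bottom]))   # position on the floor line
    return height_h, (bottom, foot_column)
```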
  • As described above, when the dummy model is used, it is also possible to recognize a motion, such as waving the hand, which is difficult to recognize from the full-length image of the information inputting person. For example, if the motion in which the information inputting person waves the hand is defined as the motion indicating "cancel", then when the information inputting person waves the hand, the processing executed in response to the previously recognized motion of the information inputting person can be stopped. [0192]
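As one hedged illustration of how such a waving ("cancel") motion could be read off the dummy model, the sketch below counts direction reversals of the hand node's horizontal coordinate over a short window; the notion of a hand node, the window length, and both thresholds are assumptions, not details from the specification.

```python
def is_waving(hand_x_history, min_reversals=3, min_amplitude=0.05):
    """Detect a hand-waving ('cancel') motion from the horizontal coordinate
    of the dummy model's hand node sampled over the last few frames.

    min_reversals  -- how many left/right direction changes count as a wave.
    min_amplitude  -- minimum horizontal travel (same units as the model).
    Both thresholds are illustrative assumptions.
    """
    if len(hand_x_history) < 3:
        return False
    if max(hand_x_history) - min(hand_x_history) < min_amplitude:
        return False
    deltas = [b - a for a, b in zip(hand_x_history, hand_x_history[1:])]
    reversals = sum(1 for d1, d2 in zip(deltas, deltas[1:]) if d1 * d2 < 0)
    return reversals >= min_reversals
```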
  • Although the above description is provided for an example of a mode in which the information inputting person points to an optional point on the display surface of the display, the subject to be pointed to by the information inputting person is not limited to the display. The information inputting person may point to an optional direction or to an optional object positioned at an unfixed distance from the information inputting person. [0193]
  • When the information inputting person points to an optional direction, in the instruction determination processing (for example, in [0194] step 176 of the flow chart of FIGS. 8A and 8B), the direction in which the virtual line connecting the reference point and the feature point of the information inputting person extends is determined, whereby the direction pointed to by the information inputting person can be determined. When the information inputting person points to an optional object positioned at an unfixed distance from the information inputting person, in the above step 176, the extending direction of the virtual line is determined, and then the object lying on the extension of the virtual line is determined, whereby the object pointed to by the information inputting person can be determined.
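A hedged sketch of both cases discussed above: the pointed direction is simply the normalized vector from the reference point to the feature point, and the pointed object can be found by testing which object's bounding sphere the extended virtual line passes through first. The object representation (name, centre, radius) and all names are illustrative assumptions.

```python
import numpy as np

def pointing_direction(p0, px):
    """Unit vector along the virtual line from the reference point P0 to the
    feature point PX, i.e. the direction pointed to."""
    d = np.asarray(px, float) - np.asarray(p0, float)
    return d / np.linalg.norm(d)

def pointed_object(p0, px, objects):
    """Return the nearest object whose bounding sphere the extended virtual
    line passes through, or None.  `objects` is an assumed list of
    (name, centre, radius) tuples."""
    origin = np.asarray(p0, float)
    direction = pointing_direction(p0, px)
    best = None
    for name, centre, radius in objects:
        oc = np.asarray(centre, float) - origin
        t = oc.dot(direction)            # distance along the line to the point
        if t <= 0:                       # of closest approach; behind P0 is ignored
            continue
        miss = np.linalg.norm(oc - t * direction)
        if miss <= radius and (best is None or t < best[0]):
            best = (t, name)
    return best[1] if best else None
```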
  • The information inputting person may point to an optional direction in the following application. For example, in a theater or the like, the direction of emission of a spotlight and the directions of acoustic beams generated by a multiplicity of speakers in an array arrangement might be oriented in the direction pointed to by the operator (information inputting person). [0195]
  • The information inputting person may point to an optional object positioned at an unfixed distance from the information inputting person in the following application. For example, on a building site, in a factory, or the like, a crane and other machines might be operated in response to instructions from the operator (information inputting person). Furthermore, the information inputting person might give various instructions to various devices in home automation. [0196]
  • Although, in the above description, a [0197] single video camera 36 or two video cameras 36A and 36B are mainly provided, the present invention is not limited to this example. The image of the information input space may be picked up by a larger number of video cameras, whereby the instruction from the information inputting person is determined.

Claims (18)

What is claimed is:
1. A hand pointing apparatus comprising:
illuminating means for illuminating a person to be recognized;
a plurality of image pickup means, located in different positions, wherein an image pickup range is adjusted for each image so that said person to be recognized, who is illuminated by said illuminating means, may be within the image pickup range and an illuminated range on a floor surface, which is illuminated by said illuminating means, may be out of the image pickup range; and
determining means for extracting an image part corresponding to said person to be recognized from a plurality of images based on a plurality of images of situations picked up by said plurality of image pickup means, the situations being indicative of said person to be recognized pointing to either a specific position or a specific direction, and for determining either the position or the direction pointed to by said person to be recognized.
2. A hand pointing apparatus comprising:
a plurality of illuminating means for illuminating a person to be recognized from different directions;
a plurality of image pickup means, located in different positions corresponding to each of said plurality of illuminating means, wherein an image pickup range is adjusted so that said person to be recognized, who is illuminated by said corresponding illuminating means, may be within the image pickup range and an illuminated range on a floor surface, which is illuminated by said corresponding illuminating means, may be out of the image pickup range;
controlling means for switching on/off said plurality of illuminating means one by one in sequence, and for controlling so as to pick up an image of said person to be recognized pointing to either a specific position or a specific direction by said image pickup means corresponding to said switched-on illuminating means; and
determining means for extracting an image part corresponding to said person to be recognized from a plurality of images based on a plurality of images picked up by said plurality of image pickup means, and for determining either the position or the direction pointed to by said person to be recognized.
3. A hand pointing apparatus comprising:
a plurality of illuminating means for illuminating a person to be recognized from different directions;
at least one image pickup means for picking up an image of said person to be recognized, who is illuminated by said illuminating means;
discriminating means for switching on/off said plurality of illuminating means one by one in sequence, for comparing a plurality of images of said person to be recognized pointing to either a specific position or a specific direction picked up by the same image pickup means during the switching on of said plurality of illuminating means, and for discriminating between an image part corresponding to said person to be recognized and an image part other than the image part corresponding to said person to be recognized in said plurality of images for at least one image pickup means; and
determining means for extracting the image part corresponding to said person to be recognized from said plurality of images picked up by said image pickup means based on a result of a discrimination by said discriminating means, and for determining either the position or the direction pointed to by said person to be recognized.
4. A hand pointing apparatus comprising:
illuminating means for illuminating a person to be recognized;
a plurality of image pickup means for picking up an image of said person to be recognized, who is illuminated by said illuminating means from different directions;
determining means for extracting an image part corresponding to said person to be recognized from a plurality of images based on a plurality of images of situations picked up by said plurality of image pickup means, the situations being indicative of said person to be recognized pointing to either a specific position or a specific direction, and for determining either the position or the direction pointed to by said person to be recognized; and
preventing means for preventing an object which is not the subject to be recognized from remaining on the floor surface around said person to be recognized.
5. A hand pointing apparatus comprising:
illuminating means for illuminating a person to be recognized who arrives at a predetermined place;
a plurality of image pickup means for picking up an image of said person to be recognized, who is illuminated by said illuminating means from different directions;
storing means for storing information for corresponding three-dimensional coordinates of a plurality of virtual points, positioned near said predetermined place, to the positions of said plurality of virtual points on said plurality of images picked up by said plurality of image pickup means; and
determining means: for extracting an image part corresponding to said person to be recognized from a plurality of images based on a plurality of images of situations picked up by said plurality of image pickup means, the situations being indicative of said person to be recognized pointing to either a specific position or a specific direction; for determining the position of a feature point of said person to be recognized in each of said images; for determining the three-dimensional coordinates of the feature point based on the determined position of the feature point and the information stored in said storing means; and for determining either the position or the direction pointed to by said person to be recognized based on the determined three-dimensional coordinates of the feature point.
6. A hand pointing apparatus according to
claim 5
, wherein said storing means stores the information for corresponding the three-dimensional coordinates of a multiplicity of virtual points uniformly spaced in a lattice arrangement near said predetermined place, to the positions of said multiplicity of virtual points on said plurality of images picked up by said plurality of image pickup means.
7. A hand pointing apparatus according to
claim 6
, wherein said determining means determines the position of the feature point of said person to be recognized in said images, extracts from said images the virtual points positioned in a region within a predetermined range including said feature point on said images, and determines the three-dimensional coordinates of said feature point based on the three-dimensional coordinates of the common virtual points extracted from said images.
8. A hand pointing apparatus according to
claim 5
further comprising:
generating means for allowing said plurality of image pickup means to pick up images of the situations where markers are positioned in the positions of said virtual points, for generating the information for corresponding the three-dimensional coordinates of said virtual points to the positions of said virtual points on said images, based on the three-dimensional coordinates of said virtual points and the marker positions on said images picked up by said plurality of image pickup means, and for allowing said storing means to store the generated information.
9. A hand pointing apparatus according to
claim 6
further comprising:
generating means for allowing said plurality of image pickup means to pick up images of the situations where markers are positioned in the positions of said virtual points, for generating the information for corresponding the three-dimensional coordinates of said virtual points to the positions of said virtual points on said images, based on the three-dimensional coordinates of said virtual points and the marker positions on said images picked up by said plurality of image pickup means, and for allowing said storing means to store the generated information.
10. A hand pointing apparatus comprising:
illuminating means for illuminating a person to be recognized;
a plurality of image pickup means for picking up images of said person to be recognized, who is illuminated by said illuminating means from different directions;
determining means for extracting an image part corresponding to said person to be recognized from a plurality of images based on a plurality of images of situations picked up by said plurality of image pickup means, the situations being indicative of said person to be recognized pointing to either a specific position or a specific direction, and for determining either the position or the direction pointed to by said person to be recognized;
first detecting means for extracting the image part corresponding to a predetermined part of a body of said person to be recognized from said plurality of images, and for detecting a change in any one of either an area of the extracted image part, an outline of the extracted image part, or a length of an outline of the extracted image part; and
processing means for executing a predetermined processing when said change is detected by said first detecting means.
11. A hand pointing apparatus comprising:
illuminating means for illuminating a person to be recognized;
a plurality of image pickup means for picking up images of said person to be recognized, who is illuminated by said illuminating means from different directions;
determining means for extracting an image part corresponding to said person to be recognized from a plurality of images based on a plurality of images of situations picked up by said plurality of image pickup means, the situations being indicative of said person to be recognized pointing to either a specific position or a specific direction, for determining the three-dimensional coordinates of the feature point whose position is changed when said person to be recognized bends or extends an arm, and the three-dimensional coordinates of a reference point whose position is not changed even if said person to be recognized bends and extends an arm, and for determining either the position or the direction pointed to by said person to be recognized based on the three-dimensional coordinates of the feature point and the three-dimensional coordinates of the reference point; and
processing means for calculating the distance between said reference point and said feature point and for executing a predetermined processing based on the change in the distance between said reference point and said feature point.
12. A hand pointing apparatus according to
claim 11
, wherein said processing means performs a first predetermined processing when the distance between said reference point and said feature point is increased, and performs a second predetermined processing differing from said first predetermined processing when the distance between said reference point and said feature point is reduced.
13. A hand pointing apparatus according to
claim 11
, wherein said processing means detects a rate of change in the distance between said reference point and said feature point, and executes a predetermined processing when the detected rate of change is a threshold value or more.
14. A hand pointing apparatus according to
claim 12
, wherein said processing means detects a rate of change in the distance between said reference point and said feature point, and executes a predetermined processing when the detected rate of change is a threshold value or more.
15. A hand pointing apparatus according to
claim 13
further comprising:
threshold value setting means for requesting said person to be recognized to bend or extend an arm in order to allow said processing means to perform a predetermined processing, and for setting said threshold value based on the rate of the change in the distance between said reference point and said feature point when said person to be recognized bends or extends an arm.
16. A hand pointing apparatus according to
claim 14
further comprising threshold value setting means for requesting said person to be recognized to bend or extend an arm in order to allow said processing means to perform a predetermined processing, and for setting said threshold value based on the rate of the change in the distance between said reference point and said feature point when said person to be recognized bends or extends an arm.
17. A hand pointing apparatus according to
claim 10
further comprising:
second detecting means for extracting the image part corresponding to the arm of said person to be recognized from said plurality of images, and for detecting whether or not the arm of said person to be recognized is lowered, wherein said processing means continues the current state when said second detecting means detects that the arm of said person to be recognized is lowered.
18. A hand pointing apparatus according to
claim 11
further comprising:
second detecting means for extracting the image part corresponding to the arm of said person to be recognized from said plurality of images, and for detecting whether or not the arm of said person to be recognized is lowered, wherein said processing means continues the current state when said second detecting means detects that the arm of said person to be recognized is lowered.
US09/040,436 1997-03-21 1998-03-18 Hand pointing device Expired - Fee Related US6385331B2 (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
JP9-068602 1997-03-21
JP6860297 1997-03-21
JP9-68602 1997-03-21
JP9-369628 1997-12-29
JP36962897A JP3749369B2 (en) 1997-03-21 1997-12-29 Hand pointing device

Publications (2)

Publication Number Publication Date
US20010043719A1 true US20010043719A1 (en) 2001-11-22
US6385331B2 US6385331B2 (en) 2002-05-07

Family

ID=26409810

Family Applications (1)

Application Number Title Priority Date Filing Date
US09/040,436 Expired - Fee Related US6385331B2 (en) 1997-03-21 1998-03-18 Hand pointing device

Country Status (5)

Country Link
US (1) US6385331B2 (en)
EP (3) EP0866419B1 (en)
JP (1) JP3749369B2 (en)
KR (1) KR100328648B1 (en)
DE (3) DE69824225T2 (en)

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003054683A2 (en) * 2001-12-07 2003-07-03 Canesta Inc. User interface for electronic devices
EP1336916A2 (en) * 2002-02-18 2003-08-20 Canon Kabushiki Kaisha Position-direction measuring apparatus and information processing method
US20030156756A1 (en) * 2002-02-15 2003-08-21 Gokturk Salih Burak Gesture recognition system using depth perceptive sensors
US6711280B2 (en) * 2001-05-25 2004-03-23 Oscar M. Stafsudd Method and apparatus for intelligent ranging via image subtraction
US20040066500A1 (en) * 2002-10-02 2004-04-08 Gokturk Salih Burak Occupancy detection and measurement system and method
US20040153229A1 (en) * 2002-09-11 2004-08-05 Gokturk Salih Burak System and method for providing intelligent airbag deployment
US20050070024A1 (en) * 2003-09-30 2005-03-31 Nguyen Hoa Duc Method of analysis of alcohol by mass spectrometry
US20050151838A1 (en) * 2003-01-20 2005-07-14 Hironori Fujita Monitoring apparatus and monitoring method using panoramic image
US7151530B2 (en) 2002-08-20 2006-12-19 Canesta, Inc. System and method for determining an input selected by a user through a virtual interface
WO2007107874A3 (en) * 2006-03-22 2007-12-21 Home Focus Dev Ltd Interactive playmat
US20080013826A1 (en) * 2006-07-13 2008-01-17 Northrop Grumman Corporation Gesture recognition interface system
US20080028325A1 (en) * 2006-07-25 2008-01-31 Northrop Grumman Corporation Networked gesture collaboration system
US20080043106A1 (en) * 2006-08-10 2008-02-21 Northrop Grumman Corporation Stereo camera intrusion detection system
US20080244468A1 (en) * 2006-07-13 2008-10-02 Nishihara H Keith Gesture Recognition Interface System with Vertical Display
WO2008131565A1 (en) * 2007-04-26 2008-11-06 Roberto Rosso Communications control bus and apparatus for controlling multiple electronic hardware devices
US20090103780A1 (en) * 2006-07-13 2009-04-23 Nishihara H Keith Hand-Gesture Recognition Method
US20090115721A1 (en) * 2007-11-02 2009-05-07 Aull Kenneth W Gesture Recognition Light and Video Image Projector
US20090116742A1 (en) * 2007-11-01 2009-05-07 H Keith Nishihara Calibration of a Gesture Recognition Interface System
US20090316952A1 (en) * 2008-06-20 2009-12-24 Bran Ferren Gesture recognition interface system with a light-diffusive screen
GB2462709A (en) * 2008-08-22 2010-02-24 Northrop Grumman Space & Msn A method for determining compound gesture input
US20100110384A1 (en) * 2007-03-30 2010-05-06 Nat'l Institute Of Information & Communications Technology Floating image interaction device and its program
US7746321B2 (en) 2004-05-28 2010-06-29 Erik Jan Banning Easily deployable interactive direct-pointing system and presentation control system and calibration method therefor
US20100328682A1 (en) * 2009-06-24 2010-12-30 Canon Kabushiki Kaisha Three-dimensional measurement apparatus, measurement method therefor, and computer-readable storage medium
US20120119988A1 (en) * 2009-08-12 2012-05-17 Shimane Prefectural Government Image recognition apparatus, operation determining method and computer-readable medium
CN102968208A (en) * 2012-09-05 2013-03-13 广东威创视讯科技股份有限公司 Method and system for selecting adjustment mode of effective area of area array camera positioning image
US20130343607A1 (en) * 2012-06-20 2013-12-26 Pointgrab Ltd. Method for touchless control of a device
JP2014142695A (en) * 2013-01-22 2014-08-07 Ricoh Co Ltd Information processing apparatus, system, image projector, information processing method, and program
WO2014129683A1 (en) * 2013-02-21 2014-08-28 엘지전자 주식회사 Remote pointing method
US8933882B2 (en) * 2012-12-31 2015-01-13 Intentive Inc. User centric interface for interaction with visual display that recognizes user intentions
US8938124B2 (en) 2012-05-10 2015-01-20 Pointgrab Ltd. Computer vision based tracking of a hand
US9124761B2 (en) 2011-09-05 2015-09-01 Panasonic Intellectual Property Management Co., Ltd. Television communication system, terminal, and method
US9165368B2 (en) 2005-02-08 2015-10-20 Microsoft Technology Licensing, Llc Method and system to segment depth images and to detect shapes in three-dimensionally acquired data
US9285897B2 (en) 2005-07-13 2016-03-15 Ultimate Pointer, L.L.C. Easily deployable interactive direct-pointing system and calibration method therefor
CN106383500A (en) * 2016-09-05 2017-02-08 湖北工业大学 Intelligent building door and window curtain wall system
US10201900B2 (en) * 2015-12-01 2019-02-12 Seiko Epson Corporation Control device, robot, and robot system
US10242255B2 (en) 2002-02-15 2019-03-26 Microsoft Technology Licensing, Llc Gesture recognition system using depth perceptive sensors
US20200371597A1 (en) * 2017-04-18 2020-11-26 Kyocera Corporation Electronic device
CN113891526A (en) * 2020-07-01 2022-01-04 丰田自动车株式会社 Server device, information processing system, and method for operating system

Families Citing this family (97)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6256046B1 (en) * 1997-04-18 2001-07-03 Compaq Computer Corporation Method and apparatus for visual sensing of humans for active public interfaces
JP3795647B2 (en) 1997-10-29 2006-07-12 株式会社竹中工務店 Hand pointing device
DE19806024A1 (en) * 1998-02-13 1999-08-19 Siemens Nixdorf Inf Syst Method for monitoring a device operation process and self-service device monitored with it
US6690357B1 (en) 1998-10-07 2004-02-10 Intel Corporation Input device using scanning sensors
TW464800B (en) * 1998-10-07 2001-11-21 Intel Corp A method for inputting data to an electronic device, an article comprising a medium for storing instructions, and an image processing system
JP2000181601A (en) * 1998-12-18 2000-06-30 Fujitsu General Ltd Information display system
JP2000222117A (en) * 1999-01-29 2000-08-11 Takenaka Komuten Co Ltd Hand pointing device, indicated position display method and recording medium
WO2000075860A1 (en) * 1999-06-08 2000-12-14 Soffix, S.R.L. Electronic writing/display apparatus and respective method of operation
EP1074943A3 (en) * 1999-08-06 2004-03-24 Canon Kabushiki Kaisha Image processing method and apparatus
US6738041B2 (en) 1999-10-29 2004-05-18 Intel Corporation Using video information to control cursor position
WO2001088681A1 (en) * 2000-05-17 2001-11-22 Koninklijke Philips Electronics N.V. Apparatus and method for indicating a target by image processing without three-dimensional modeling
US6531999B1 (en) 2000-07-13 2003-03-11 Koninklijke Philips Electronics N.V. Pointing direction calibration in video conferencing and other camera-based system applications
JP2004524624A (en) * 2001-03-01 2004-08-12 ヴォリューム・インタラクションズ・プライヴェート・リミテッド Display device
US7274800B2 (en) * 2001-07-18 2007-09-25 Intel Corporation Dynamic gesture recognition from stereo sequences
JP2003084229A (en) * 2001-09-14 2003-03-19 Takenaka Komuten Co Ltd Device and method for displaying pointed position
US6973579B2 (en) * 2002-05-07 2005-12-06 Interdigital Technology Corporation Generation of user equipment identification specific scrambling code for the high speed shared control channel
US7165029B2 (en) 2002-05-09 2007-01-16 Intel Corporation Coupled hidden Markov model for audiovisual speech recognition
US20030212552A1 (en) * 2002-05-09 2003-11-13 Liang Lu Hong Face recognition procedure useful for audiovisual speech recognition
US7209883B2 (en) * 2002-05-09 2007-04-24 Intel Corporation Factorial hidden markov model for audiovisual speech recognition
JP4149213B2 (en) * 2002-07-12 2008-09-10 本田技研工業株式会社 Pointed position detection device and autonomous robot
DE60215504T2 (en) 2002-10-07 2007-09-06 Sony France S.A. Method and apparatus for analyzing gestures of a human, e.g. for controlling a machine by gestures
US7171043B2 (en) 2002-10-11 2007-01-30 Intel Corporation Image recognition using hidden markov models and coupled hidden markov models
US7472063B2 (en) * 2002-12-19 2008-12-30 Intel Corporation Audio-visual feature fusion and support vector machine useful for continuous speech recognition
US7203368B2 (en) * 2003-01-06 2007-04-10 Intel Corporation Embedded bayesian network for pattern recognition
JP4286556B2 (en) * 2003-02-24 2009-07-01 株式会社東芝 Image display device
US7618323B2 (en) * 2003-02-26 2009-11-17 Wms Gaming Inc. Gaming machine system having a gesture-sensing mechanism
FR2853423B1 (en) * 2003-03-20 2005-07-15 Simag Dev INPUT DEVICE COMPRISING AN OPTICAL SENSOR FOLLOWED BY A FILTERING MEANS
EP1477924B1 (en) * 2003-03-31 2007-05-02 HONDA MOTOR CO., Ltd. Gesture recognition apparatus, method and program
US20080273764A1 (en) * 2004-06-29 2008-11-06 Koninklijke Philips Electronics, N.V. Personal Gesture Signature
WO2006027423A1 (en) * 2004-08-09 2006-03-16 Simag Developpement Input device comprising an optical sensor and a filter means
JP5631535B2 (en) 2005-02-08 2014-11-26 オブロング・インダストリーズ・インコーポレーテッド System and method for a gesture-based control system
KR100687737B1 (en) * 2005-03-19 2007-02-27 한국전자통신연구원 Apparatus and method for a virtual mouse based on two-hands gesture
JP4654773B2 (en) * 2005-05-31 2011-03-23 富士フイルム株式会社 Information processing apparatus, moving image encoding apparatus, information processing method, and information processing program
KR100815159B1 (en) 2005-12-08 2008-03-19 한국전자통신연구원 3D input apparatus by hand tracking using multiple cameras and its method
US8537112B2 (en) * 2006-02-08 2013-09-17 Oblong Industries, Inc. Control system for navigating a principal dimension of a data space
US9910497B2 (en) * 2006-02-08 2018-03-06 Oblong Industries, Inc. Gestural control of autonomous and semi-autonomous systems
US8531396B2 (en) 2006-02-08 2013-09-10 Oblong Industries, Inc. Control system for navigating a principal dimension of a data space
US9075441B2 (en) * 2006-02-08 2015-07-07 Oblong Industries, Inc. Gesture based control using three-dimensional information extracted over an extended depth of field
US9823747B2 (en) 2006-02-08 2017-11-21 Oblong Industries, Inc. Spatial, multi-modal control device for use with spatial operating system
US8537111B2 (en) 2006-02-08 2013-09-17 Oblong Industries, Inc. Control system for navigating a principal dimension of a data space
US8370383B2 (en) 2006-02-08 2013-02-05 Oblong Industries, Inc. Multi-process interactive systems and methods
JP2008052029A (en) * 2006-08-24 2008-03-06 Takata Corp Photographing system, vehicle crew detection system, operation device controlling system, and vehicle
KR100799766B1 (en) 2006-10-25 2008-02-01 엠텍비젼 주식회사 Apparatus for controlling moving of pointer
US8356254B2 (en) * 2006-10-25 2013-01-15 International Business Machines Corporation System and method for interacting with a display
KR100814289B1 (en) 2006-11-14 2008-03-18 서경대학교 산학협력단 Real time motion recognition apparatus and method
KR100851977B1 (en) * 2006-11-20 2008-08-12 삼성전자주식회사 Controlling Method and apparatus for User Interface of electronic machine using Virtual plane.
EP2163987A3 (en) * 2007-04-24 2013-01-23 Oblong Industries, Inc. Processing of events in data processing environments
JP5120754B2 (en) * 2008-03-28 2013-01-16 株式会社国際電気通信基礎技術研究所 Motion detection device
US9740922B2 (en) 2008-04-24 2017-08-22 Oblong Industries, Inc. Adaptive tracking system for spatial input devices
US9740293B2 (en) 2009-04-02 2017-08-22 Oblong Industries, Inc. Operating environment with gestural control and multiple client devices, displays, and users
US9952673B2 (en) 2009-04-02 2018-04-24 Oblong Industries, Inc. Operating environment comprising multiple client devices, multiple displays, multiple users, and gestural control
US9495013B2 (en) 2008-04-24 2016-11-15 Oblong Industries, Inc. Multi-modal gestural interface
US8723795B2 (en) 2008-04-24 2014-05-13 Oblong Industries, Inc. Detecting, representing, and interpreting three-space input: gestural continuum subsuming freespace, proximal, and surface-contact modes
US10642364B2 (en) 2009-04-02 2020-05-05 Oblong Industries, Inc. Processing tracking and recognition data in gestural recognition systems
US9684380B2 (en) 2009-04-02 2017-06-20 Oblong Industries, Inc. Operating environment with gestural control and multiple client devices, displays, and users
CN102112945B (en) * 2008-06-18 2016-08-10 奥布隆工业有限公司 Control system based on attitude for vehicle interface
US20100066673A1 (en) * 2008-09-16 2010-03-18 Shang Tai Yeh Laser pointer capable of detecting a gesture associated therewith and representing the gesture with a function
JP5183398B2 (en) * 2008-09-29 2013-04-17 株式会社日立製作所 Input device
EP2352078B1 (en) * 2008-10-01 2022-09-07 Sony Interactive Entertainment Inc. Information processing apparatus, information processing method, information recording medium, and program
US8788977B2 (en) 2008-11-20 2014-07-22 Amazon Technologies, Inc. Movement recognition as input mechanism
US9317128B2 (en) 2009-04-02 2016-04-19 Oblong Industries, Inc. Remote devices used in a markerless installation of a spatial operating environment incorporating gestural control
US10824238B2 (en) 2009-04-02 2020-11-03 Oblong Industries, Inc. Operating environment with gestural control and multiple client devices, displays, and users
JP2010257089A (en) 2009-04-22 2010-11-11 Xiroku:Kk Optical position detection apparatus
KR100969927B1 (en) * 2009-08-17 2010-07-14 (주)예연창 Apparatus for touchless interactive display with user orientation
US8970669B2 (en) * 2009-09-30 2015-03-03 Rovi Guides, Inc. Systems and methods for generating a three-dimensional media guidance application
US9971807B2 (en) 2009-10-14 2018-05-15 Oblong Industries, Inc. Multi-process interactive systems and methods
US9933852B2 (en) 2009-10-14 2018-04-03 Oblong Industries, Inc. Multi-process interactive systems and methods
JP2011166332A (en) * 2010-02-08 2011-08-25 Hitachi Consumer Electronics Co Ltd Information processor
US8878773B1 (en) 2010-05-24 2014-11-04 Amazon Technologies, Inc. Determining relative motion as input
US9372618B2 (en) 2010-10-01 2016-06-21 Z124 Gesture based application management
KR20120051212A (en) * 2010-11-12 2012-05-22 엘지전자 주식회사 Method for user gesture recognition in multimedia device and multimedia device thereof
US9440144B2 (en) 2011-04-21 2016-09-13 Sony Interactive Entertainment Inc. User identified to a controller
US9123272B1 (en) 2011-05-13 2015-09-01 Amazon Technologies, Inc. Realistic image lighting and shading
KR101235432B1 (en) * 2011-07-11 2013-02-22 김석중 Remote control apparatus and method using virtual touch of electronic device modeled in three dimension
US9041734B2 (en) 2011-07-12 2015-05-26 Amazon Technologies, Inc. Simulating three-dimensional features
US10088924B1 (en) 2011-08-04 2018-10-02 Amazon Technologies, Inc. Overcoming motion effects in gesture recognition
US8947351B1 (en) 2011-09-27 2015-02-03 Amazon Technologies, Inc. Point of view determinations for finger tracking
US20130080932A1 (en) 2011-09-27 2013-03-28 Sanjiv Sirpal Secondary single screen mode activation through user interface toggle
US9223415B1 (en) 2012-01-17 2015-12-29 Amazon Technologies, Inc. Managing resource usage for task performance
US8884928B1 (en) 2012-01-26 2014-11-11 Amazon Technologies, Inc. Correcting for parallax in electronic displays
US9063574B1 (en) 2012-03-14 2015-06-23 Amazon Technologies, Inc. Motion detection systems for electronic devices
US9285895B1 (en) 2012-03-28 2016-03-15 Amazon Technologies, Inc. Integrated near field sensor for display devices
US9587804B2 (en) 2012-05-07 2017-03-07 Chia Ming Chen Light control systems and methods
US9423886B1 (en) 2012-10-02 2016-08-23 Amazon Technologies, Inc. Sensor connectivity approaches
US9524028B2 (en) 2013-03-08 2016-12-20 Fastvdo Llc Visual language for human computer interfaces
US9035874B1 (en) 2013-03-08 2015-05-19 Amazon Technologies, Inc. Providing user input to a computing device with an eye closure
US9829984B2 (en) 2013-05-23 2017-11-28 Fastvdo Llc Motion-assisted visual language for human computer interfaces
US9717118B2 (en) 2013-07-16 2017-07-25 Chia Ming Chen Light control systems and methods
US9269012B2 (en) 2013-08-22 2016-02-23 Amazon Technologies, Inc. Multi-tracker object tracking
US11199906B1 (en) 2013-09-04 2021-12-14 Amazon Technologies, Inc. Global user input management
US10055013B2 (en) 2013-09-17 2018-08-21 Amazon Technologies, Inc. Dynamic object tracking for user interfaces
US9367203B1 (en) 2013-10-04 2016-06-14 Amazon Technologies, Inc. User interface techniques for simulating three-dimensional depth
US9990046B2 (en) 2014-03-17 2018-06-05 Oblong Industries, Inc. Visual collaboration interface
EP3146262A4 (en) 2014-04-29 2018-03-14 Chia Ming Chen Light control systems and methods
US10088971B2 (en) 2014-12-10 2018-10-02 Microsoft Technology Licensing, Llc Natural user interface camera calibration
JP6586891B2 (en) * 2016-01-13 2019-10-09 セイコーエプソン株式会社 Projector and projector control method
US10529302B2 (en) 2016-07-07 2020-01-07 Oblong Industries, Inc. Spatially mediated augmentations of and interactions among distinct devices and applications via extended pixel manifold

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE8915535U1 (en) * 1989-03-02 1990-10-25 Fa. Carl Zeiss, 7920 Heidenheim, De
JPH0474285A (en) 1990-07-17 1992-03-09 Medama Kikaku:Kk Position detecting and display device for specific person or object
JP3114813B2 (en) * 1991-02-27 2000-12-04 日本電信電話株式会社 Information input method
JPH0519957A (en) * 1991-07-15 1993-01-29 Nippon Telegr & Teleph Corp <Ntt> Information inputting method
DE4142614A1 (en) 1991-10-14 1993-04-15 Tropf Hermann Dr Ing Object recognition appts. detecting surface irregularities - controls operation of two cameras and light sources arranged at set angle
JPH05324181A (en) 1992-05-26 1993-12-07 Takenaka Komuten Co Ltd Hand pointing type input device
DE571702T1 (en) * 1992-05-26 1994-04-28 Takenaka Corp Handheld input device and wall computer unit.
IT1264225B1 (en) * 1993-09-24 1996-09-23 Sintecna S R L DEVICE FOR POINTING THE CURSOR ON THE SCREEN OF INTERACTIVE SYSTEMS
JPH07160412A (en) * 1993-12-10 1995-06-23 Nippon Telegr & Teleph Corp <Ntt> Pointed position detecting method
JP2552427B2 (en) * 1993-12-28 1996-11-13 コナミ株式会社 Tv play system
US5900863A (en) * 1995-03-16 1999-05-04 Kabushiki Kaisha Toshiba Method and apparatus for controlling computer without touching input device
JPH08329222A (en) 1995-06-02 1996-12-13 Takenaka Komuten Co Ltd Three-dimensional position recognition device
JPH08328735A (en) * 1995-06-02 1996-12-13 Takenaka Komuten Co Ltd Hand pointing input device

Cited By (81)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6711280B2 (en) * 2001-05-25 2004-03-23 Oscar M. Stafsudd Method and apparatus for intelligent ranging via image subtraction
US20030165048A1 (en) * 2001-12-07 2003-09-04 Cyrus Bamji Enhanced light-generated interface for use with electronic devices
WO2003054683A3 (en) * 2001-12-07 2003-12-31 Canesta Inc User interface for electronic devices
WO2003054683A2 (en) * 2001-12-07 2003-07-03 Canesta Inc. User interface for electronic devices
US20030156756A1 (en) * 2002-02-15 2003-08-21 Gokturk Salih Burak Gesture recognition system using depth perceptive sensors
US10242255B2 (en) 2002-02-15 2019-03-26 Microsoft Technology Licensing, Llc Gesture recognition system using depth perceptive sensors
US7340077B2 (en) 2002-02-15 2008-03-04 Canesta, Inc. Gesture recognition system using depth perceptive sensors
EP1336916A2 (en) * 2002-02-18 2003-08-20 Canon Kabushiki Kaisha Position-direction measuring apparatus and information processing method
EP1336916A3 (en) * 2002-02-18 2004-03-17 Canon Kabushiki Kaisha Position-direction measuring apparatus and information processing method
US7151530B2 (en) 2002-08-20 2006-12-19 Canesta, Inc. System and method for determining an input selected by a user through a virtual interface
US20040153229A1 (en) * 2002-09-11 2004-08-05 Gokturk Salih Burak System and method for providing intelligent airbag deployment
US7526120B2 (en) 2002-09-11 2009-04-28 Canesta, Inc. System and method for providing intelligent airbag deployment
US20040066500A1 (en) * 2002-10-02 2004-04-08 Gokturk Salih Burak Occupancy detection and measurement system and method
US20050151838A1 (en) * 2003-01-20 2005-07-14 Hironori Fujita Monitoring apparatus and monitoring method using panoramic image
US20050070024A1 (en) * 2003-09-30 2005-03-31 Nguyen Hoa Duc Method of analysis of alcohol by mass spectrometry
US11073919B2 (en) 2004-05-28 2021-07-27 UltimatePointer, L.L.C. Multi-sensor device with an accelerometer for enabling user interaction through sound or image
US9785255B2 (en) 2004-05-28 2017-10-10 UltimatePointer, L.L.C. Apparatus for controlling contents of a computer-generated image using three dimensional measurements
US11409376B2 (en) 2004-05-28 2022-08-09 UltimatePointer, L.L.C. Multi-sensor device with an accelerometer for enabling user interaction through sound or image
US11402927B2 (en) 2004-05-28 2022-08-02 UltimatePointer, L.L.C. Pointing device
US8049729B2 (en) 2004-05-28 2011-11-01 Erik Jan Banning Easily deployable interactive direct-pointing system and presentation control system and calibration method therefor
US11755127B2 (en) 2004-05-28 2023-09-12 UltimatePointer, L.L.C. Multi-sensor device with an accelerometer for enabling user interaction through sound or image
US8866742B2 (en) 2004-05-28 2014-10-21 Ultimatepointer, Llc Easily deployable interactive direct-pointing system and presentation control system and calibration method therefor
US9063586B2 (en) 2004-05-28 2015-06-23 Ultimatepointer, Llc Easily deployable interactive direct-pointing system and presentation control system and calibration method therefor
US7746321B2 (en) 2004-05-28 2010-06-29 Erik Jan Banning Easily deployable interactive direct-pointing system and presentation control system and calibration method therefor
US11416084B2 (en) 2004-05-28 2022-08-16 UltimatePointer, L.L.C. Multi-sensor device with an accelerometer for enabling user interaction through sound or image
US9411437B2 (en) 2004-05-28 2016-08-09 UltimatePointer, L.L.C. Easily deployable interactive direct-pointing system and presentation control system and calibration method therefor
US9311715B2 (en) 2005-02-08 2016-04-12 Microsoft Technology Licensing, Llc Method and system to segment depth images and to detect shapes in three-dimensionally acquired data
US9165368B2 (en) 2005-02-08 2015-10-20 Microsoft Technology Licensing, Llc Method and system to segment depth images and to detect shapes in three-dimensionally acquired data
US9285897B2 (en) 2005-07-13 2016-03-15 Ultimate Pointer, L.L.C. Easily deployable interactive direct-pointing system and calibration method therefor
US10372237B2 (en) 2005-07-13 2019-08-06 UltimatePointer, L.L.C. Apparatus for controlling contents of a computer-generated image using 3D measurements
US11841997B2 (en) 2005-07-13 2023-12-12 UltimatePointer, L.L.C. Apparatus for controlling contents of a computer-generated image using 3D measurements
WO2007107874A3 (en) * 2006-03-22 2007-12-21 Home Focus Dev Ltd Interactive playmat
US20090103780A1 (en) * 2006-07-13 2009-04-23 Nishihara H Keith Hand-Gesture Recognition Method
US8589824B2 (en) 2006-07-13 2013-11-19 Northrop Grumman Systems Corporation Gesture recognition interface system
US20080244468A1 (en) * 2006-07-13 2008-10-02 Nishihara H Keith Gesture Recognition Interface System with Vertical Display
US8180114B2 (en) 2006-07-13 2012-05-15 Northrop Grumman Systems Corporation Gesture recognition interface system with vertical display
US20080013826A1 (en) * 2006-07-13 2008-01-17 Northrop Grumman Corporation Gesture recognition interface system
US9696808B2 (en) 2006-07-13 2017-07-04 Northrop Grumman Systems Corporation Hand-gesture recognition method
US8234578B2 (en) 2006-07-25 2012-07-31 Northrop Grumman Systems Corporatiom Networked gesture collaboration system
US20080028325A1 (en) * 2006-07-25 2008-01-31 Northrop Grumman Corporation Networked gesture collaboration system
US20080043106A1 (en) * 2006-08-10 2008-02-21 Northrop Grumman Corporation Stereo camera intrusion detection system
US8432448B2 (en) 2006-08-10 2013-04-30 Northrop Grumman Systems Corporation Stereo camera intrusion detection system
US20100110384A1 (en) * 2007-03-30 2010-05-06 Nat'l Institute Of Information & Communications Technology Floating image interaction device and its program
US8985774B2 (en) 2007-03-30 2015-03-24 National Institute Of Information And Communication Technology Floating image interaction device and its program
US8583843B2 (en) 2007-04-26 2013-11-12 Roberto Rosso Communications control bus and apparatus for controlling multiple electronic hardware devices
US20110106996A1 (en) * 2007-04-26 2011-05-05 Roberto Rosso Communications control bus and apparatus for controlling multiple electronic hardware devices
WO2008131565A1 (en) * 2007-04-26 2008-11-06 Roberto Rosso Communications control bus and apparatus for controlling multiple electronic hardware devices
EP2145238A4 (en) * 2007-04-26 2011-10-12 Roberto Rosso Communications control bus and apparatus for controlling multiple electronic hardware devices
EP2145238A1 (en) * 2007-04-26 2010-01-20 Roberto Rosso Communications control bus and apparatus for controlling multiple electronic hardware devices
US8139110B2 (en) 2007-11-01 2012-03-20 Northrop Grumman Systems Corporation Calibration of a gesture recognition interface system
US20090116742A1 (en) * 2007-11-01 2009-05-07 H Keith Nishihara Calibration of a Gesture Recognition Interface System
US20090115721A1 (en) * 2007-11-02 2009-05-07 Aull Kenneth W Gesture Recognition Light and Video Image Projector
US9377874B2 (en) 2007-11-02 2016-06-28 Northrop Grumman Systems Corporation Gesture recognition light and video image projector
US20090316952A1 (en) * 2008-06-20 2009-12-24 Bran Ferren Gesture recognition interface system with a light-diffusive screen
US8345920B2 (en) 2008-06-20 2013-01-01 Northrop Grumman Systems Corporation Gesture recognition interface system with a light-diffusive screen
US8972902B2 (en) 2008-08-22 2015-03-03 Northrop Grumman Systems Corporation Compound gesture recognition
DE102009034413B4 (en) * 2008-08-22 2012-02-02 Northrop Grumman Space & Mission Systems Corporation Recognition of compound gestures
GB2462709A (en) * 2008-08-22 2010-02-24 Northrop Grumman Space & Msn A method for determining compound gesture input
GB2462709B (en) * 2008-08-22 2012-11-14 Northrop Grumman Systems Corp Compound gesture recognition
US20100050133A1 (en) * 2008-08-22 2010-02-25 Nishihara H Keith Compound Gesture Recognition
DE102009043798B4 (en) * 2008-12-17 2014-07-24 Northrop Grumman Space & Mission Systems Corporation Method for detecting hand gestures
US9025857B2 (en) * 2009-06-24 2015-05-05 Canon Kabushiki Kaisha Three-dimensional measurement apparatus, measurement method therefor, and computer-readable storage medium
US20100328682A1 (en) * 2009-06-24 2010-12-30 Canon Kabushiki Kaisha Three-dimensional measurement apparatus, measurement method therefor, and computer-readable storage medium
US8890809B2 (en) * 2009-08-12 2014-11-18 Shimane Prefectural Government Image recognition apparatus, operation determining method and computer-readable medium
US20120119988A1 (en) * 2009-08-12 2012-05-17 Shimane Prefectural Government Image recognition apparatus, operation determining method and computer-readable medium
US9535512B2 (en) 2009-08-12 2017-01-03 Shimane Prefectural Government Image recognition apparatus, operation determining method and computer-readable medium
US9124761B2 (en) 2011-09-05 2015-09-01 Panasonic Intellectual Property Management Co., Ltd. Television communication system, terminal, and method
US8938124B2 (en) 2012-05-10 2015-01-20 Pointgrab Ltd. Computer vision based tracking of a hand
US20130343607A1 (en) * 2012-06-20 2013-12-26 Pointgrab Ltd. Method for touchless control of a device
CN102968208A (en) * 2012-09-05 2013-03-13 广东威创视讯科技股份有限公司 Method and system for selecting adjustment mode of effective area of area array camera positioning image
US8933882B2 (en) * 2012-12-31 2015-01-13 Intentive Inc. User centric interface for interaction with visual display that recognizes user intentions
JP2014142695A (en) * 2013-01-22 2014-08-07 Ricoh Co Ltd Information processing apparatus, system, image projector, information processing method, and program
WO2014129683A1 (en) * 2013-02-21 2014-08-28 엘지전자 주식회사 Remote pointing method
US9734582B2 (en) 2013-02-21 2017-08-15 Lg Electronics Inc. Remote pointing method
US10201900B2 (en) * 2015-12-01 2019-02-12 Seiko Epson Corporation Control device, robot, and robot system
CN106383500A (en) * 2016-09-05 2017-02-08 湖北工业大学 Intelligent building door and window curtain wall system
US20200371597A1 (en) * 2017-04-18 2020-11-26 Kyocera Corporation Electronic device
CN113891526A (en) * 2020-07-01 2022-01-04 丰田自动车株式会社 Server device, information processing system, and method for operating system
US20220007483A1 (en) * 2020-07-01 2022-01-06 Toyota Jidosha Kabushiki Kaisha Server apparatus, information processing system, and operating method for system
US11553575B2 (en) * 2020-07-01 2023-01-10 Toyota Jidosha Kabushiki Kaisha Server apparatus, information processing system, and operating method for system
JP7380453B2 (en) 2020-07-01 2023-11-15 トヨタ自動車株式会社 Server device, information processing system, and system operating method

Also Published As

Publication number Publication date
EP1329839A3 (en) 2003-07-30
DE69829424T2 (en) 2006-05-04
EP1329839B1 (en) 2005-03-16
EP1329838A2 (en) 2003-07-23
JP3749369B2 (en) 2006-02-22
US6385331B2 (en) 2002-05-07
DE69824225D1 (en) 2004-07-08
DE69829424D1 (en) 2005-04-21
KR100328648B1 (en) 2002-07-31
KR19980080509A (en) 1998-11-25
EP0866419B1 (en) 2004-06-02
DE69824225T2 (en) 2005-06-23
DE69828912D1 (en) 2005-03-10
EP1329838A3 (en) 2003-08-06
EP1329839A2 (en) 2003-07-23
EP1329838B1 (en) 2005-02-02
EP0866419A2 (en) 1998-09-23
JPH10326148A (en) 1998-12-08
DE69828912T2 (en) 2005-07-28
EP0866419A3 (en) 2001-05-23

Similar Documents

Publication Publication Date Title
US6385331B2 (en) Hand pointing device
JP3795647B2 (en) Hand pointing device
KR101650799B1 (en) Method for the real-time-capable, computer-assisted analysis of an image sequence containing a variable pose
US5912721A (en) Gaze detection apparatus and its method as well as information display apparatus
JP3660492B2 (en) Object detection device
JP2988178B2 (en) Gaze direction measuring device
EP2945038B1 (en) Method of controlling a cleaner
JP2006346767A (en) Mobile robot, marker, method of computing position and posture of mobile robot, and autonomous traveling system for mobile robot
JPH0844490A (en) Interface device
JPH0889481A (en) Eye direction measurement device for vehicle
JP3792907B2 (en) Hand pointing device
US10606372B2 (en) System and method for interaction with a computer implemented interactive application using a depth aware camera
JP2016167208A (en) Action support device
CN113946223A (en) Tourism interaction method adopting display screen
JP3138145U (en) Brain training equipment
KR101536753B1 (en) Method and system for image processing based on user's gesture recognition
JP2003076488A (en) Device and method of determining indicating position
JP2000222098A (en) Hand pointing device, instruction position displaying method and recording medium
KR20150073146A (en) Method and system for image processing based on user's gesture recognition
WO2016151958A1 (en) Information processing device, information processing system, information processing method, and program
JP2000222117A (en) Hand pointing device, indicated position display method and recording medium
KR20240021098A (en) Device and method for obtaining location information indoor space
JP2003084229A (en) Device and method for displaying pointed position
JP2005100466A (en) Pattern recognition device
JPWO2008126419A1 (en) Exercise support method

Legal Events

Date Code Title Description
AS Assignment

Owner name: TAKENAKA CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HARAKAWA, KENICHI;UNNO, KENICHI;IGAWA, NORIO;REEL/FRAME:009069/0807

Effective date: 19980313

REMI Maintenance fee reminder mailed
FEPP Fee payment procedure

Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

SULP Surcharge for late payment
REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20100507

FEPP Fee payment procedure

Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY