WO2008048664A2 - Image management through lexical representations - Google Patents
- Publication number
- WO2008048664A2 (PCT/US2007/022226)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- graphical representations
- representations
- image
- lexical
- images
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5838—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
- G06F16/5854—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using shape and object relationship
Definitions
- FIG. 1 shows a block diagram of an image management system, which may employ various examples of text-based image database creation and image retrieval processes disclosed herein, according to an embodiment of the invention
- FIG. 2 shows a block diagram of an original image and a morpho-lexical histogram derived from the original image, according to an embodiment of the invention
- FIG. 3A shows a flow diagram of a method for creating a database of images, according to an embodiment of the invention
- FIG. 3B shows a more detailed flow diagram of various steps performed with a morpho-lexical process step discussed in FIG. 3A, according to an embodiment of the invention.
- FIG. 4 shows a flow diagram of a method for searching for images on a database created through implementation of the method depicted in FIG. 3A, according to an embodiment of the invention
- various characteristics of a plurality of images may be represented through human readable lexicon. These representations include the relationships between various objects in the images, which may also be represented through human readable lexicon.
- a database of the images that is searchable through textual terms defining various characteristics of the images may be created.
- desired images may be retrieved through a search of the database using textual search terms or through a comparison with an input image.
- a user may access and search the database for one or more images in manners that are similar to searches performed for textual documents.
- the methods and systems disclosed herein may afford users a relatively more intuitive manner of searching for images.
- FIG. 1 there is shown a block diagram of an image management system 100 which may employ various examples of the text- based image database creation and image retrieval processes disclosed herein, according to an example.
- the image management system 100 is depicted as including a communications interface 102, processing circuitry 104, storage circuitry 106, a user interface 108, an image input device 110, and a database 120.
- the image management system 100 may include additional components and some of the components described herein may be removed and/or modified without departing from a scope of the image management system 100.
- the communications interface 102 is arranged to implement communications of the image management system 100, which may be embodied in a computing device, with respect to external devices, which are not shown. For instance, the communications interface 102 may be arranged to communicate information bi-directionally with respect to another computing device.
- the communications interface 102 may be implemented as a network interface card (NIC), serial or parallel connection, USB port, Firewire interface, flash memory interface, floppy disk drive, or any other suitable arrangement for communicating with respect to the image management system 100.
- the processing circuitry 104 is arranged to process data, control data access and storage, issue commands, and control other desired operations.
- the processing circuitry 104 may include circuitry configured to implement desired programming provided by appropriate media in at least one example, such as the methods disclosed herein below.
- the processing circuitry 104 may be implemented as one or more of a processor and other structure configured to execute executable instructions including, for example, software, firmware, and/or hardware circuitry instructions.
- the processing circuitry 104 may thus include, for instance, hardware logic, PGA, FPGA, ASIC, state machines, or other structures alone or in combination with a processor.
- the storage circuitry 106 is configured to store programming such as executable code or instructions (for instance, software, firmware, or both), electronic data, image data, meta data associated with image data, databases, or other digital information and may include processor-usable media.
- Processor- usable media may be embodied in any computer program product(s) or article of manufacture(s) which may contain, store, or maintain programming, data and/or digital information for use by or in connection with an instruction execution system including the processing circuitry 104.
- the processor-usable media may include any one of physical media such as electronic, magnetic, optical, electromagnetic, infrared or semiconductor media.
- processor-usable media include, for instance, a portable magnetic computer diskette, such as a floppy diskette, zip disk, hard drive, random access memory, read only memory, flash memory, cache memory, and other configurations capable of storing programming, data, or other digital information.
- At least some of the examples or aspects described herein may be implemented using programming stored within appropriate storage circuitry 106 described above and/or communicated through a network or other transmission media and configured to control appropriate processing circuitry.
- programming may be provided through appropriate media including, for instance, embodied within articles of manufacture 112, embodied within a data signal, for instance, modulated carrier wave, data packets, digital representations, etc., communicated through an appropriate transmission medium, such as a communication network, for instance, the Internet, a private network, or both, wired electrical connection, optical connection, electromagnetic energy, for instance, through a communications interface, or provided using other appropriate communication structure or medium.
- programming including processor-usable code may be communicated as a data signal embodied in a carrier wave.
- the storage circuitry 106 may further be in communication with the database 120, which may be created by the processing circuitry 104 to store images, morpho-lexical representations of the images, or both.
- the database 120 may be created to generally enable search and retrieval of images through text-based search queries, similar to those used for text document search and retrieval.
- the user interface 108 is configured to interact with a user including conveying data to a user by, for instance, displaying data for observation by the user, audibly communicating data to a user, etc., as well as receiving inputs from the user, for instance, tactile input, voice instruction, etc.
- the user interface 108 may include a display 114, for instance, a cathode ray tube, LCD, etc., configured to depict visual information and a keyboard, mouse, and/or other suitable input device 116 for enabling user-interaction with the image management system 100.
- a user may employ the user interface 108 to input search terms into the image management system 100, which may be similar to search terms used for text based searches.
- the image input device 110 may be implemented as any suitable device configured to provide electronic image data corresponding to an image, such as a photograph, a frame of a video capture, etc., provided to the image management system 100.
- the image input device 110 may include, for instance, a scanning device, such as a flatbed color photograph scanner, a digital camera, a digital video camera, another image management system, etc.
- the image input device 110 may additionally be implemented to input search criteria into the image management system 100.
- an image may be scanned into the image management system 100 through the image input device 110 and the image may be morpho-lexically processed as discussed below.
- the characteristics of the morpho-lexically processed image may then be compared with the characteristics of the morpho-lexically processed images stored in the database 120 to, for instance, find images in the database 120 that are similar to the scanned image.
- the processing circuitry 104 may quantize the image data, which may include, for instance, RGB, Lab, etc., of a plurality of image forming elements, for instance, pixels, to identify areas of the images having a consistent or common characteristic.
- the consistent or common characteristic may include, for instance, contiguous areas in the images having the same colors.
- the quantized image data may be further morpho-lexically processed to thereby translate the image data into human readable lexicon, as described in greater detail herein below.
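As a minimal sketch of the quantization step described above, each image forming element can be mapped to the nearest of a small set of named colors. The palette anchors, threshold-free nearest-neighbor rule, and function name below are illustrative assumptions, not values specified in the patent:

```python
import math

# Hypothetical palette: lexical color names mapped to representative RGB
# anchors. The patent does not specify a palette; these values are examples.
PALETTE = {
    "black":  (0, 0, 0),
    "white":  (255, 255, 255),
    "red":    (255, 0, 0),
    "green":  (0, 128, 0),
    "blue":   (0, 0, 255),
    "yellow": (255, 255, 0),
    "orange": (255, 165, 0),
}

def lexical_quantize(pixel):
    """Assign a pixel the lexical color name of its nearest palette anchor."""
    r, g, b = pixel
    return min(
        PALETTE,
        key=lambda name: math.dist((r, g, b), PALETTE[name]),
    )

print(lexical_quantize((250, 10, 5)))   # nearest anchor is "red"
print(lexical_quantize((10, 10, 12)))   # nearest anchor is "black"
```

Contiguous runs of pixels that quantize to the same name then form the areas having a "consistent or common characteristic" discussed above.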
- morpho-lexical processing may be defined to include processes in which one or more characteristics of the various areas in the images are identified and labeled using human readable lexicon.
- the one or more characteristics may include, for instance, the locations of the various areas with respect to each other, the colors of the various areas, the sizes of the various areas, etc.
- the one or more characteristics may include the relationships between the various areas with respect to each other and the borders of the images. In other words and as described in greater detail herein below, morphological processing may be performed upon images which have been lexically quantized.
- lexical quantization may be defined to include the use of human comprehensible words of a human readable lexicon, for instance, words of the English language or other language, to describe visual characteristics of the contents or objects in an image.
- the human comprehensible words may be associated with the image data and may be used to assist with or facilitate management of the images, such as, in the creation of the searchable database 120 of images.
- the human comprehensible words may also assist with or facilitate in the retrieval of images from the searchable database 120 of images.
- the human comprehensible words may describe characteristics, for instance, colors, gray scaling, or both, of contents of the images in a natural language, which may readily be understood by average humans.
- the human comprehensible words may include, for instance, lexical color names present within a human readable and comprehensible lexicon, for instance, content readily read and understood by humans as part of a human language, as distinguished from machine language or code, which may also be understood by programmers but typically requires some type of mapping or understanding of mathematical relationships to color. Examples of lexical color names readily recognizable to humans include black, red, blue, green, yellow, orange, etc.
- the human comprehensible words may also include quantized lexical size designations present within a human readable and comprehensible lexicon. Examples of quantized lexical size designations readily recognizable to humans include, for instance, very small, small, medium, large, very large, etc. As should be clearly understood, the lexical size designations may include a plethora of other size designations depending upon the desired level of granularity in describing the sizes of the objects contained in the images relative to each other or otherwise relative to some other feature.
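The quantized lexical size designations above can be produced by binning an area's fraction of the image; the cutoff fractions below are assumptions chosen only to illustrate the idea, since the patent leaves the granularity to the implementer:

```python
# Illustrative thresholds; each entry is (upper bound on area fraction, label).
SIZE_BINS = [
    (0.01, "very small"),
    (0.05, "small"),
    (0.20, "medium"),
    (0.50, "large"),
]

def lexical_size(area_pixels, image_pixels):
    """Map an area's fraction of the image to a quantized lexical size label."""
    fraction = area_pixels / image_pixels
    for upper, label in SIZE_BINS:
        if fraction < upper:
            return label
    return "very large"

print(lexical_size(300, 100_000))     # 0.3% of the image -> "very small"
print(lexical_size(60_000, 100_000))  # 60% of the image -> "very large"
```

Finer granularity is obtained simply by adding more bins.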
- the human comprehensible words may further include lexical relative position labels present within a human readable and comprehensible lexicon.
- the lexical relative position labels may, for instance, denote the location of a first object with respect to the location of a second object, the location of the first or second object with respect to the image, the location of the first or second object with respect to one or more borders of the image, etc.
- the lexical relative position labels may denote whether the first object is in contact with the second or other object.
- examples of lexical relative position labels that are readily recognizable to humans include north, south, east, west, left, right, center, upper, lower, etc.
- the lexical relative position labels may be as detailed or as broad as desired, depending upon, for instance, the desired level of granularity in describing the relative positions of the objects in an image.
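One simple way to derive a compass-style relative position label is to compare region centroids along each axis; the dominant-axis tie-breaking rule below is an assumption, as the patent does not fix one:

```python
def relative_position(centroid_a, centroid_b):
    """Lexical label for where region B lies relative to region A.

    Centroids are (x, y) with y increasing downward, as in image coordinates.
    The compass vocabulary follows the examples in the text; picking the
    dominant axis when both differ is an illustrative choice.
    """
    dx = centroid_b[0] - centroid_a[0]
    dy = centroid_b[1] - centroid_a[1]
    if abs(dx) >= abs(dy):
        return "east" if dx > 0 else "west"
    return "south" if dy > 0 else "north"

print(relative_position((50, 50), (90, 55)))  # mostly rightward -> "east"
print(relative_position((50, 50), (48, 10)))  # mostly upward -> "north"
```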
- the lexical color names corresponding to quantization bins may be generated by an aggregation of definitions of a relatively large population of humans. Accordingly, in some examples, words that describe ranges of frequencies of the electromagnetic visible spectrum and which are readily understood words of the human lexicon as distinguished from technical terms for identifying the electromagnetic energy and perhaps only familiar to technical persons educated with respect to such technical terms, are employed in at least one implementation. Words may refer to any meaning bearing sequences of symbols of a lexicon, and may include abbreviations and lemmas, as examples.
- the number of categories or bins for lexical quantization is determined according to the number of color names used to characterize images. Once the images are lexically quantized, words indicative of the content of the images, for instance, lexical color names, are associated with image forming elements of the images. Additional details of lexical quantization are discussed in detail below.
- the morphological processing described herein may be performed upon an image which has been lexically quantized as mentioned above. That is, an appropriate one of the lexical color names is associated with each of the image forming elements of the image corresponding to the color contents of the image forming elements. Generally speaking, the morphological processing identifies plural areas of the images having a consistent or common characteristic. In a more specific example, areas of an image are identified, and one of the lexical color names is associated with each of the areas and corresponds to the color of the respective area. Morphological processing may include filtering of image forming elements of a given area which do not have the common characteristic and changing the content of such elements to the common characteristic. The filtering may be provided in different resolutions as discussed below.
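The filtering step described above, in which inconsistent elements of an area are changed to the area's common characteristic, can be sketched as a simple majority vote; the function name and flat label-list representation are illustrative assumptions:

```python
from collections import Counter

def filter_to_common(area_labels):
    """Change inconsistent image forming elements of an area to the area's
    common lexical color name, determined here by majority vote.

    area_labels: list of lexical color names for the elements of one area.
    """
    common = Counter(area_labels).most_common(1)[0][0]
    return [common] * len(area_labels)

# A mostly-red area with one spurious "orange" element becomes uniformly red.
print(filter_to_common(["red", "red", "orange", "red", "red"]))
```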
- Information regarding the resultant areas may be associated with the initial image data, for instance, the image data of the images before lexical quantization and morphological processing, which is useable to reproduce faithful reproductions of the images, and stored, for example, as metadata of the images using the storage circuitry 106.
- the metadata may be used to identify and retrieve desired initial image data of respective images in one example.
- Information regarding the resultant areas may include a lexical color name indicative of the color of the image forming elements of the area.
- the area information may additionally include mass information, for instance, the quantity of image forming elements of the areas in number of pixels or a percentage to the total, as well as location information of the area.
- the location information may identify a centroid of the respective region corresponding to the average x and y locations of all image forming elements of the area, in one example.
- the mass information may be represented by lexical mass designations and the location information may be represented by lexical relative position labels.
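The mass and centroid computations described above can be sketched directly from a list of an area's pixel coordinates; the dictionary field names are illustrative:

```python
def area_info(pixels, image_width, image_height):
    """Compute mass (pixel count and percentage of the image) and centroid
    (average x and y of all image forming elements) of one area.

    pixels: list of (x, y) coordinates belonging to the area.
    """
    n = len(pixels)
    total = image_width * image_height
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    return {
        "mass": n,
        "mass_pct": 100.0 * n / total,
        "centroid": (cx, cy),
    }

info = area_info([(0, 0), (2, 0), (0, 2), (2, 2)], 10, 10)
print(info["centroid"])  # (1.0, 1.0)
print(info["mass_pct"])  # 4.0
```

The resulting mass and centroid values are what the lexical size designations and relative position labels are then derived from.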
- FIG. 2 depicts an original image 200 and a morpho-lexical histogram 220 derived from the original image 200.
- the original image 200 may comprise a segmented image generated using a quantization method discussed in greater detail herein below. Other methods for segmenting images, for instance, generating multiple segments of an image and each of the segments being assigned a single color, may be used in other examples.
- the original image 200 includes a plurality of objects, including, a jug 202 having an exterior color 204 and an interior color 206. Within a handle portion of the jug 202 is a third color 208, which may be darker than the surrounding regions of the jug 202 due to shadows.
- various objects surrounding the jug 202 are labeled as 210a-210c.
- the morpho-lexical histogram 220 may be created.
- the morpho-lexical histogram 220 of FIG. 2 includes graphical representations of the various areas in the original image 200. More particularly, the morpho-lexical histogram 220 generally depicts graphical representations of the various objects according to their sizes, colors, respective locations, and morphologies. As such, the morpho-lexical histogram 220 graphically depicts the various sections 204-208 of the jug 202 and the areas 210a-210c surrounding the jug 202 according to their centroids, sizes, and colors.
- the exterior color 204 of the jug 202 is graphically represented as a relatively large circle 204' having the exterior color 204, the centroid of which is positioned near the center of the morpho-lexical histogram 220.
- the graphical representation 204' of the area having the exterior color 204 is depicted as being in contact with graphical representations 206' and 208' of the areas having the interior color 206 and the third color 208, respectively, through lines 222.
- the graphical representations 204'-208' are also depicted as being in contact with graphical representations 210a'-210c' of the areas 210a-210c surrounding the jug 202.
- the processing circuitry 104 may employ the morpho-lexical histogram 220 to derive lexical representations of the objects in the original image 200.
- the processing circuitry 104 may determine the colors of the graphical representations 204'-210c' and may assign lexical color names to the graphical representations 204'-210c' as described above.
- the processing circuitry 104 may determine that the graphical representation 204' is very large, that the graphical representations 210b' and 210c' are medium, that the graphical representations 206' and 208' are small, and that the graphical representation 210a' is very small.
- the processing circuitry 104 may assign lexical size designations according to the determined sizes.
- the processing circuitry 104 may further determine the relative positions of the graphical representations 204'-210c' with respect to each other, the boundaries of the image 200, or both.
- the processing circuitry 104 may also assign lexical relative position labels for the graphical representations 204'-210c'. For instance, the processing circuitry 104 may store an indication that the graphical representation 206' is located above the graphical representation 204' and the graphical representation 210c' is located to the right of the graphical representation 204'.
- the processing circuitry 104 may divide the morpho-lexical histogram 220 into a plurality of virtual swaths.
- the morpho-lexical histogram 220 may be divided into 3 equal virtual swaths, which extend horizontally across the morpho-lexical histogram 220.
- each of the swaths may be divided into a number of regions. The number of swaths and regions into which the morpho-lexical histogram 220 is divided may be based, for instance, on the densities of the various regions.
- the processing circuitry 104 may divide the morpho-lexical histogram 220 into a greater number of regions if there is a greater density of regions.
- the regions may be identified as the swaths are traversed, to thereby enable lexical representations of the regions, and the graphical representations contained in the regions, to be generated.
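The swath traversal above can be sketched by bucketing region centroids into equal horizontal swaths and scanning each swath left to right; the fixed three-swath split and function names are illustrative assumptions (the text notes the counts may instead depend on region density):

```python
def assign_swath(centroid_y, image_height, num_swaths=3):
    """Assign a region centroid to one of several equal horizontal swaths."""
    swath = int(centroid_y * num_swaths / image_height)
    return min(swath, num_swaths - 1)  # clamp the bottom edge into the last swath

def order_regions(regions, image_height, num_swaths=3):
    """regions: list of (name, (cx, cy)); return names in swath scan order,
    top swath first, left to right within each swath."""
    return [
        name
        for name, (cx, cy) in sorted(
            regions,
            key=lambda r: (assign_swath(r[1][1], image_height, num_swaths), r[1][0]),
        )
    ]

regions = [("A", (50, 10)), ("B", (20, 80)), ("C", (70, 45)), ("D", (10, 40))]
print(order_regions(regions, 90))  # ['A', 'D', 'C', 'B']
```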
- FIG. 3A there is shown a flow diagram of a method 300 of creating a database of images, where the database is searchable through human readable lexicon, according to an example.
- the method 300 may be performed using the processing circuitry 104.
- other methods may include more, less and/or alternative steps in other examples.
- the processing circuitry 104 may initiate the method 300 through receipt of a command from a user, at a predetermined period of time, automatically, etc. Once initiated, the processing circuitry 104 may access image data of an image to be processed, at step 304.
- the image data may include RGB data for a plurality of image forming elements, for instance, pixels.
- the processing circuitry 104 may operate to convert the image data to a desired color space, such as Lab.
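The conversion to a Lab color space can be sketched with the standard sRGB-to-CIELAB formulas (D65 white point); this is one common conversion, assumed here since the patent names Lab only as an exemplary color space:

```python
def srgb_to_lab(r, g, b):
    """Convert an 8-bit sRGB pixel to CIE Lab (D65 reference white)."""
    # Linearize the sRGB components
    def lin(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    rl, gl, bl = lin(r), lin(g), lin(b)
    # Linear RGB -> XYZ using the sRGB matrix
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    # Normalize by the D65 white point
    xn, yn, zn = x / 0.95047, y / 1.0, z / 1.08883
    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116
    fx, fy, fz = f(xn), f(yn), f(zn)
    L = 116 * fy - 16
    a = 500 * (fx - fy)
    bb = 200 * (fy - fz)
    return L, a, bb

L, a, b = srgb_to_lab(255, 255, 255)
print(round(L), round(a), round(b))  # white -> 100 0 0
```

Working in Lab is convenient here because Euclidean distances in Lab approximate perceived color differences, which matters for the region-merging comparison discussed later.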
- the processing circuitry 104 may morpho-lexically process the image data, as indicated at step 306.
- the image data may be morpho- lexically processed as described above with respect to FIG. 2 to yield human readable lexical representations of the various regions contained in the images.
- One manner in which the image data is morpho-lexically processed is discussed in greater detail herein below with respect to FIG. 3B.
- the image data may be morphologically processed at step 306 through a series of morphological operations at multiple resolutions allowing spurious colors to be removed from homogeneous color regions in the image.
- images may be filtered morphologically to represent the images as graphical representations individually comprising a single consistent color.
- graphical representations are defined wherein a majority of the image forming elements have a consistent or common characteristic (a common lexical color name resulting from the lexical quantization) and other inconsistent image forming elements of the graphical representations may be changed or filtered to the consistent characteristic. More detailed descriptions of various manners in which the image data may be morphologically processed are provided in U.S. Patent Application Serial No.
- TBD (Attorney Docket No. 200408243-1), entitled "Image Processing Methods, Image Management Systems, and Articles of Manufacture", filed on July 27, 2006
- U.S. Patent Application Serial No. TBD (Attorney Docket No. 200408244-1), entitled "Image Management Methods, Image Management Systems, and Articles of Manufacture", filed on July 27, 2006.
- the disclosures of both of the above-identified Applications are hereby incorporated by reference in their entireties.
- the processing circuitry 104 may control the storage circuitry 106 to store the lexical representations in the database 120 as human readable lexicon, as indicated at step 308.
- the database 120 may be searchable through textual queries as described in greater detail herein below.
- the images may be stored on the database and the human readable lexical representations of the regions in the image may be stored in the metadata of the image.
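A minimal sketch of such a text-searchable store is an inverted index from the words of each image's lexical representation to image identifiers; the class and method names below are assumptions, not structures from the patent:

```python
from collections import defaultdict

class LexicalImageDB:
    """Toy database mapping lexical-description words to image ids."""

    def __init__(self):
        self.index = defaultdict(set)  # word -> set of image ids
        self.metadata = {}             # image id -> lexical description

    def add(self, image_id, lexical_description):
        self.metadata[image_id] = lexical_description
        for word in lexical_description.lower().split():
            self.index[word].add(image_id)

    def search(self, query):
        """Return ids of images whose descriptions contain every query word."""
        words = query.lower().split()
        if not words:
            return set()
        result = set(self.index.get(words[0], set()))
        for word in words[1:]:
            result &= self.index.get(word, set())
        return result

db = LexicalImageDB()
db.add("img1", "large blue region north of small red region")
db.add("img2", "medium green region center")
print(db.search("blue north"))  # {'img1'}
print(db.search("green"))       # {'img2'}
```

Because the stored descriptions are ordinary words, the same query machinery used for text documents applies unchanged, which is the point of the lexical representation.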
- the processing circuitry 104 may determine whether the method 300 is to be continued. The processing circuitry 104 may determine that the method 300 is to be continued, for instance, in order to create and store lexical representations of any additional images. If, however, there are no additional images, the processing circuitry 104 may end the method 300 at step 312.
- FIG. 3B illustrates the steps of morpho-lexically processing an image, according to an example.
- the processing circuitry 104 may generate graphical representations of various objects or areas in the image, such as the image 200. More particularly, for instance, the processing circuitry 104 may associate individual image forming elements of a quantized image with one of a plurality of respective graphical representations. Quantization of the image allows for a discrete outcome permitting filtering of non-consistent colors within a graphical representation.
- the objects may be defined into respective graphical representations through identification of which areas in the image contain consistent or common characteristics.
- the consistent or common characteristic may include, for instance, contiguous areas in the images having the same colors.
- some areas may be merged if a plurality of areas are identified as corresponding to a single portion or object of an original image, for instance, due to a color gradient occurring in the portion or object causing the lexical quantization of the portion or object to be classified into plural areas.
- the respective graphical representations may include the exterior color 204, the interior color 206, the third color 208, and the areas 210a-210c surrounding the jug 202.
- the processing circuitry 104 may analyze the respective subject graphical representation with respect to other graphical representations which touch or border the respective subject graphical representation, and if certain criteria are met, the processing circuitry 104 may merge appropriate graphical representations. Once the regions which border a subject graphical representation are identified, the processing circuitry 104 may access initial image data of the image, for instance, the content of the image data prior to lexical or morphological processing, corresponding to the subject graphical representations and the bordering graphical representations and may calculate respective average values, for instance, average luminance and chrominance L, a, and b values of an exemplary Lab color space, of the graphical representations using the initial image data. The average values of the subject graphical representation may be compared with each of the average values of the respective bordering graphical representations, for example using a Euclidean metric:

  d = √((x_L − y_L)² + (x_a − y_a)² + (x_b − y_b)²)

- where the x values correspond to average L, a, b values of the subject region and the y values correspond to average L, a, b values of the bordering region being analyzed.
- if the computed distance falls below a selected threshold, the two graphical representations may be merged with one another.
- the threshold may be selected to distinguish between graphical representations that are so similar in the original image that they should be merged, for instance, plural similar graphical representations whose colors fell near a border between quantization bins, and graphical representations which clearly include content of different colors and should therefore remain separate.
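The merge test described above can be sketched as follows; the threshold value of 10.0 and the function name are illustrative assumptions, since the patent leaves the threshold tunable:

```python
import math

def should_merge(avg_lab_subject, avg_lab_border, threshold=10.0):
    """Merge two bordering graphical representations when the Euclidean
    distance between their average (L, a, b) values falls below a threshold."""
    return math.dist(avg_lab_subject, avg_lab_border) < threshold

# Two near-identical grays straddling a quantization bin border: merge.
print(should_merge((52.0, 10.0, 4.0), (54.0, 11.0, 5.0)))    # True
# Clearly different colors: keep separate.
print(should_merge((52.0, 10.0, 4.0), (70.0, -30.0, 20.0)))  # False
```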
- the analysis may be repeated for the other graphical representations which border the subject graphical representation in one embodiment.
- the merged graphical representations may represent a single object of the image using a single image graphical representation in one embodiment.
- the graphical representation information including lexical color name, mass and location of each of the graphical representations may be associated with the respective image, for example, as meta data.
- manners in which lexical representations of the graphical representation information are determined and assigned are discussed in greater detail herein below.
- the processing circuitry 104 may determine the centroids and the sizes of the various graphical representations. The processing circuitry 104 may determine these characteristics of the graphical representations based upon the arrangement of the image forming elements or objects in the image. In any event, at step 324 the processing circuitry 104 may plot the graphical representations onto a morpho-lexical histogram, for instance, in a manner similar to the morpho-lexical histogram 220 depicted in FIG. 2.
- the processing circuitry 104 may determine lexical color names of the various graphical representations. As described above, the processing circuitry 104 may determine human comprehensible words to describe the lexical color names of the various regions in the image, such as, black, red, blue, green, yellow, orange, etc.
- the processing circuitry 104 may determine lexical size designations for the various graphical representations.
- the lexical size designations may include, for instance, very small, small, medium, large, very large, etc.
- the processing circuitry 104 may compare the sizes of the graphical representations with respect to each other to determine the lexical size designations.
- the processing circuitry 104 may compare the sizes of the graphical representations with preset standards to determine the lexical size designations. In this example, for instance, the processing circuitry may determine that a graphical representation is small if that graphical representation is below a predetermined percentage of the overall image.
- the processing circuitry 104 may morphologically process the quantized image to determine the relationships between the various graphical representations. More particularly, for instance, the processing circuitry 104 may determine the positions of the graphical representations with respect to each other and their respective positions on the image itself.
- the morphological processing of step 330 may include one or more levels of morphological processing (filtering) at different resolutions. Additional details of processing of plural stages in one example are discussed in Obrador, Pere, "Multiresolution Color Patch Extraction," published in SPIE Visual Communications and Image Processing, January 15-19, 2006, San Jose, California, the teachings of which are incorporated herein by reference in their entirety.
- the processing circuitry 104 may use a plurality of morphological filters to generate abstract representations of the image at multiple resolution levels.
- the morphological filters may be used to vary the amount of detail retained in the abstract representations of the image. For instance, at lower resolution levels, the smaller graphical representations are eliminated, leaving a very coarse abstract representation and lexical representation of the image. In contrast, at higher resolution levels, a greater level of detail is retained: relatively smaller graphical representations remain in the abstract representation, and a more detailed lexical representation of the image is provided.
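The effect of the multiresolution filtering can be sketched as an area-threshold filter applied per resolution level, in the spirit of a morphological area opening. The region names and thresholds are hypothetical:

```python
def abstract_representations(regions, level_thresholds):
    """Produce one abstract representation per resolution level by
    discarding regions smaller than that level's area threshold,
    mimicking a morphological (area-opening style) filter.

    `regions` maps a region name to its pixel count; coarser levels use
    larger thresholds and therefore keep fewer, larger regions."""
    return [
        {name: size for name, size in regions.items() if size >= t}
        for t in level_thresholds
    ]

# Coarse -> fine: the coarsest level keeps only the two dominant regions.
levels = abstract_representations(
    {'sky': 5000, 'sand': 4000, 'umbrella': 120, 'shell': 8},
    level_thresholds=[1000, 100, 1],
)
```

At the coarsest level only `sky` and `sand` survive, giving the very coarse lexical representation described above; the finest level retains all four regions.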
- the processing circuitry 104 may determine which of the graphical representations is in contact with which of the other graphical representations, which of the graphical representations are in contact with the borders of the image, etc. In addition, or alternatively, the processing circuitry 104 may divide the image or the morpho-lexical histogram representation of the image into a plurality of virtual swaths, with each swath containing zero or more virtual regions. In this instance, the processing circuitry 104 may, for example, scan across the virtual swaths to determine the relative positions of the virtual regions with respect to each other. For instance, the processing circuitry 104 may identify that a region A, located in a center of the image, has a neighbor region B, which is located to the North of region A, and a neighbor region C, which is located to the East of region A.
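The relative-position determination (region B to the North of region A, region C to the East, and so on) can be sketched from region centroids alone. This assumes image coordinates in which row numbers grow downwards, so a smaller row number means further North:

```python
def compass_relation(centroid_a, centroid_b):
    """Describe where region B lies relative to region A, using image
    coordinates in which row numbers grow downwards (so a smaller row
    value means further North)."""
    (ra, ca), (rb, cb) = centroid_a, centroid_b
    dr, dc = rb - ra, cb - ca
    if abs(dr) >= abs(dc):
        return 'North' if dr < 0 else 'South'
    return 'East' if dc > 0 else 'West'
```

With region A centered at (5, 5), a neighbor centered at (1, 5) is reported as `'North'` and one centered at (5, 9) as `'East'`, matching the example in the text.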
- the processing circuitry 104 may assign human readable lexical representations to the virtual regions.
- the processing circuitry 104 may determine that a first region, denoted by dashed lines 224 and taken from the top left corner of the histogram 220, includes the graphical representations 210a' and 208'.
- the processing circuitry 104 may assign the first region 224 with human readable lexical representations that indicate a very small light gray graphical representation is located to the left of a small dark gray graphical representation.
- the processing circuitry 104 may assign a centrally located region 226 with human readable lexical representations that indicate a very large orange graphical representation. The processing circuitry 104 may further determine that the first region 224 is to the Northwest of the central region 226. In addition, the processing circuitry 104 may determine that a very large orange graphical representation is located beneath and to the right of a small dark gray color patch, which is in turn located to the right of a very small light gray patch, and so forth.
- the processing circuitry 104 may repeat this process with the remaining regions to thereby identify and assign human readable lexical representations of the remaining graphical representations.
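Assembling the human readable lexical representation for a region amounts to joining its lexical size and color names with its relation to a neighboring region. A minimal sketch, with hypothetical function and parameter names:

```python
def lexical_phrase(size_name, color_name, relation=None, other=None):
    """Join a region's lexical size and color names, optionally with its
    position relative to a neighbouring region, into a single
    human-readable, text-searchable phrase."""
    phrase = f'{size_name} {color_name} region'
    if relation and other:
        phrase += f' {relation} of {other}'
    return phrase

# The first-region example from the text:
desc = lexical_phrase('very small', 'light gray',
                      relation='to the left',
                      other='a small dark gray region')
```

The resulting phrase, "very small light gray region to the left of a small dark gray region", is exactly the kind of string that can be stored in a text-based searchable database.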
- the human readable lexical representations may be stored in the text-based searchable database as indicated at step 308.
- FIG. 4 shows a flow diagram of a method 400 for retrieving images from the database 120 created through implementation of the method 300.
- the method 400 generally depicts a manner in which the database 120 may be queried such that one or more desired images may be retrieved. More particularly, the method 400 enables text-based queries, similar to those used to retrieve text documents, to be employed in finding and retrieving image documents.
- the database 120 may be accessed by the processing circuitry 104.
- the processing circuitry 104 may receive a search query.
- the search query may be received through the user interface 108 as one or more search terms.
- the processing circuitry 104 may parse the one or more search terms to determine various characteristics of the one or more search terms. For instance, if the search query includes the term "beach", the processing circuitry 104 may determine that images that may match the desired term include, on a high level, a very large blue area above a very large beige area. On a more detailed level, the processing circuitry 104 may determine that images containing a blue sky and beige sand are matching characteristics for the term "beach". In either case, the processing circuitry 104 may determine the relative positions of the different regions associated with the search query term(s).
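The term-to-layout expansion above can be sketched as a lookup table from query terms to expected region characteristics. The `TERM_LAYOUTS` mapping below is entirely hypothetical; the description does not enumerate such a lexicon:

```python
# Hypothetical term-to-layout lexicon for illustration only.
TERM_LAYOUTS = {
    'beach':  [('very large', 'blue', 'top'),
               ('very large', 'beige', 'bottom')],
    'forest': [('very large', 'green', 'center')],
}

def parse_query(terms):
    """Expand each search term into the region characteristics
    (lexical size, color name, position) a matching image should
    exhibit; unknown terms contribute no regions."""
    regions = []
    for term in terms:
        regions.extend(TERM_LAYOUTS.get(term.lower(), []))
    return regions
```

Querying for "beach" thus yields the high-level layout described in the text: a very large blue area above a very large beige area.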
- the processing circuitry 104 may receive a request for desired images via input 108 using search criteria including characteristics, such as lexical color names, mass and/or location information of one or more regions within the desired images to be retrieved.
- the request may specify one or more regions of images to be retrieved, for instance, "Locate images having a large blue region center top, a medium red region center, and a yellow region center bottom". The processing circuitry 104 may search images stored in the storage circuitry 106 using the search criteria and the region information associated with the stored images, as indicated at step 406, and may rank the stored images according to how closely they match the search criteria.
- the processing circuitry 104 may create a search representation from the inputted text search request that represents the specified lexical color name, mass, and/or location information, which may be used to search the stored images.
- the search representation may, for instance, be in the form of three vectors corresponding to color, mass and location.
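The three-vector search representation can be sketched by splitting the parsed region characteristics into parallel color, mass, and location vectors. The tuple layout is an assumption carried over from the parsing sketch above:

```python
def search_representation(regions):
    """Split parsed region characteristics, given as
    (size, color, location) tuples, into the three parallel vectors
    (color, mass, location) of the search representation."""
    colors = [color for _, color, _ in regions]
    masses = [size for size, _, _ in regions]
    locations = [loc for _, _, loc in regions]
    return colors, masses, locations

# The example request from the text, as parsed regions:
colors, masses, locations = search_representation(
    [('large', 'blue', 'center top'),
     ('medium', 'red', 'center'),
     ('medium', 'yellow', 'center bottom')])
```

Each index across the three vectors then describes one requested region, which keeps the representation easy to compare against stored images region by region.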
- the search query may be received through the image input device 110 as, for instance, a scanned image.
- the processing circuitry 104 may morpho-lexically process the scanned image as described above with respect to FIGS. 3A and 3B to obtain lexical representations of the scanned image.
- the processing circuitry 104 may determine that the scanned image contains a very large blue area above a very large beige area.
- the processing circuitry 104 may rank the images stored in the database 120 according to their respective similarities to the scanned image.
- the processing circuitry 104 may create region information of at least one region of the search image to create a search representation and use the search representation to search the stored images using the region information associated with respective ones of the stored images. More particularly, for instance, the processing circuitry 104 may access the database 120 to retrieve one or more images that are responsive to the search query, at step 406.
- the processing circuitry 104 may retrieve all of the images that have a very large blue area above a very large beige area.
- the processing circuitry 104 may access region information of the stored images and compare the search criteria with respect to the region information of the regions of the stored images in an attempt to identify desired images.
- the processing circuitry 104 may use the lexical color name, mass and location information to perform comparison operations.
- the lexical color name, mass and location information may be used to calculate distances of at least one region of the search criteria with respect to a region of each of the stored images.
- the processing circuitry 104 may be configured to rank the similarity of the search criteria with respect to each of the stored images as a relationship directly proportional to the sizes of the graphical representations, inversely proportional to the distance between the centroids of the graphical representations, and inversely proportional to the color differences between the graphical representations. For example, for calculating a distance between two images 1 and 2, the following formula may be used:
- Dist(colorPatch1, colorPatch2) ∝ (centroidDistance(colorPatch1, colorPatch2) × colorDifference(colorPatch1, colorPatch2)) / (size(colorPatch1) × size(colorPatch2))
- the processing circuitry 104 may provide information indicative of similarities of the images being compared responsive to similarities of the regions of the images as indicated by the calculated distances corresponding to the respective regions. For example, the stored images may be ranked from closest, or most similar, to farthest, or most dissimilar.
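The ranking relationship can be sketched as a similarity score that grows with patch size and shrinks with centroid distance and color difference, with stored images sorted most-similar first. The patch tuple layout and the small epsilons (which avoid division by zero for identical patches) are assumptions of this sketch:

```python
import math

def patch_similarity(a, b):
    """Score two color patches, each given as (size, centroid, rgb):
    similarity grows with the patch sizes and shrinks with the centroid
    distance and the color difference, per the ranking relationship
    described in the text."""
    size_a, (ra, ca), rgb_a = a
    size_b, (rb, cb), rgb_b = b
    centroid_dist = math.hypot(ra - rb, ca - cb)
    color_diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(rgb_a, rgb_b)))
    return (size_a * size_b) / ((centroid_dist + 1e-6) * (color_diff + 1e-6))

def rank_images(query_patch, stored):
    """Order stored images (name -> patch) from most to least similar."""
    return sorted(stored,
                  key=lambda name: patch_similarity(query_patch, stored[name]),
                  reverse=True)

# A blue query patch matches a nearby blue patch before a distant red one.
query_patch = (100, (5, 5), (0, 0, 255))
stored = {'near_blue': (100, (5, 6), (0, 0, 250)),
          'far_red':   (100, (50, 50), (255, 0, 0))}
ranking = rank_images(query_patch, stored)
```

Because similarity is the reciprocal of the distance relationship, sorting by descending similarity is equivalent to ranking the stored images from closest to farthest.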
- the processing circuitry 104 may depict the search results using display 116 and the user may select desired images for viewing, as indicated at step 408. Initial image data of selected images may be retrieved from the storage circuitry 106 and displayed using the display 116.
- the processing circuitry 104 may initially compare the largest graphical representation of the search representation with respect to the largest graphical representation of the stored images, and subsequently proceed to analyze the smaller size graphical representations if the larger graphical representations are found to be sufficiently similar.
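The largest-first comparison strategy can be sketched as a coarse-to-fine loop with an early exit: if the large graphical representations are already too dissimilar, the smaller ones need never be examined. The `compare` callable and `threshold` are hypothetical parameters of this sketch:

```python
def coarse_to_fine_match(query_patches, image_patches, compare, threshold):
    """Compare patches largest-first and stop early when the large
    patches already disagree. `compare` returns a similarity score and
    `threshold` is the minimum score needed to continue refining.
    Patches are (size, feature) tuples sorted by descending size."""
    q = sorted(query_patches, key=lambda p: p[0], reverse=True)
    s = sorted(image_patches, key=lambda p: p[0], reverse=True)
    scores = []
    for qp, sp in zip(q, s):
        score = compare(qp, sp)
        if score < threshold:  # large patches too dissimilar: give up early
            return False, scores
        scores.append(score)
    return True, scores

# Toy comparison: patches match only when their feature labels agree.
same_color = lambda a, b: 1.0 if a[1] == b[1] else 0.0
ok, _ = coarse_to_fine_match([(100, 'blue'), (10, 'red')],
                             [(90, 'blue'), (12, 'red')],
                             same_color, 0.5)
```

A mismatch on the largest patch short-circuits the comparison immediately, which is what makes the strategy cheap when most stored images are poor matches.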