US20170091590A1 - Computer vision as a service - Google Patents
- Publication number: US20170091590A1 (U.S. patent application Ser. No. 15/285,679)
- Authority: US (United States)
- Prior art keywords: machine learning, platform, algorithm, performance, task
- Prior art date
- Legal status: Abandoned (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06K9/6262
- G06V10/96—Management of image or video recognition tasks
- G06F18/217—Validation; performance evaluation; active pattern learning techniques
- G06F18/23—Clustering techniques
- G06F18/40—Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor
- G06K9/00979
- G06K9/6218
- G06K9/66
- G06V10/945—User interactive design; environments; toolboxes
- G06V10/95—Hardware or software architectures specially adapted for image or video understanding structured as a network, e.g. client-server architectures
Abstract
A computer vision service includes technologies to, among other things, analyze computer vision or learning tasks requested by computer applications, select computer vision or learning algorithms to execute the requested tasks based on one or more performance capabilities of the computer vision or learning algorithms, perform the computer vision or learning tasks for the computer applications using the selected algorithms, and expose the results of performing the computer vision or learning tasks for use by the computer applications.
Description
- This application is a continuation of U.S. patent application Ser. No. 14/212,237, filed Mar. 14, 2014, which claims the benefit of and priority to U.S. Provisional Patent Application Ser. No. 61/787,254, filed Mar. 15, 2013, each of which is incorporated herein by this reference in its entirety.
- This invention was made in part with government support under contract number FA8750-12-C-0103, awarded by the Air Force Research Laboratory. The United States Government has certain rights in this invention.
- In computer vision, mathematical techniques are used to detect the presence of and recognize various elements of the visual scenes that are depicted in digital images. Localized portions of an image, known as features, may be used to analyze and classify an image. Low-level features, such as interest points and edges, may be computed from an image and used to detect, for example, people, objects, and landmarks that are depicted in the image. Machine learning algorithms are often used for image recognition.
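The pipeline described above, in which low-level features computed from pixels feed a recognition decision, can be sketched minimally as follows. This is an illustrative toy rather than an algorithm from this disclosure; the image values and the threshold are invented for illustration.

```python
# Toy sketch of low-level feature extraction feeding a decision: the mean
# absolute horizontal gradient serves as a crude edge-strength feature,
# and a fixed threshold stands in for a trained classifier.
def edge_strength(image):
    """Mean absolute horizontal pixel difference across all rows."""
    diffs = [abs(row[i + 1] - row[i])
             for row in image
             for i in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def has_strong_edges(image, threshold=50.0):
    # The threshold is an invented stand-in for a learned decision rule.
    return edge_strength(image) >= threshold

flat = [[10, 10, 10], [10, 10, 10]]   # uniform region: no edges
edged = [[0, 255, 0], [0, 255, 0]]    # sharp transitions: strong edges
```

A real detector would use richer features (e.g., oriented gradient histograms) and a learned classifier, as the passage notes.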
- Computer vision capabilities can be provided by self-contained (e.g., vertical or shrink-wrapped) software applications, such as Goggles by GOOGLE, Inc., Image Search by GOOGLE, Inc., Bing Image Search by Microsoft Corp., and Kooaba's Image Recognition. Some computer vision applications utilize open source computer vision algorithm libraries, such as OpenCV.
- This disclosure is illustrated by way of example and not by way of limitation in the accompanying figures. The figures may, alone or in combination, illustrate one or more embodiments of the disclosure. Elements illustrated in the figures are not necessarily drawn to scale. Reference labels may be repeated among the figures to indicate corresponding or analogous elements.
- FIG. 1 is a simplified module diagram of at least one embodiment of a computing system for providing a computer vision service as disclosed herein;
- FIG. 2 is a simplified module diagram of at least one embodiment of the computer vision service of FIG. 1;
- FIG. 3 is a simplified module diagram of at least one embodiment of the capabilities interface of FIG. 2;
- FIG. 4 is a simplified module diagram of at least one embodiment of the task performance interface of FIG. 2;
- FIG. 5 is a simplified flow diagram of at least one embodiment of a method by which the computing system of FIG. 1 may handle a computer vision or machine learning task requested by a computer application; and
- FIG. 6 is a simplified block diagram of an exemplary computing environment in connection with which at least one embodiment of the computing system of FIG. 1 may be implemented.
- While the concepts of the present disclosure are susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and are described in detail below. It should be understood that there is no intent to limit the concepts of the present disclosure to the particular forms disclosed. On the contrary, the intent is to cover all modifications, equivalents, and alternatives consistent with the present disclosure and the appended claims.
- Existing computer vision tools and applications are not generally accessible enough to address a broad range of computer application needs and performance requirements. Visual processing of, and visual learning from, large repositories of digital content (e.g., images, video, and associated metadata) is increasingly needed. Unprecedented quantities of visual data are available online through Internet services (such as YOUTUBE, FACEBOOK, news sites, and blogs), and the proliferation of such data is facilitated by mobile device cameras. This visual data can be a valuable tool for many different types of computer applications, including, for example, entertainment, travel, education, security, and disaster recovery applications. However, in order to expose the richness embedded in this visual data to computer applications "at large," many types of computer vision algorithms, machine learning algorithms, indexing algorithms, "big data" handling algorithms, and associated architectures need to be organized and optimized for various computer vision and learning tasks. This undertaking is large and complex, and thus extremely difficult for traditional, self-contained applications to handle.
- Referring now to FIG. 1, an embodiment of a vision and learning algorithm services platform 140 is embodied in a computing system 100. The illustrative platform 140 exposes the features and capabilities of a wide variety of computer vision, machine learning, and big data processing algorithms 142 for use by many different types of computer applications 130 at different levels of sophistication, depending on the needs of the particular application 130. Embodiments of the platform 140 can intelligently provide the needed vision and learning algorithm services to, for example, back-end, middleware, and/or customer-oriented computer applications. The inventors have further described some embodiments of the illustrative platform 140 in Sawhney, Harpreet S., "Tools for 'Democratizing' Computer Vision," and "Performance Characterization of Detection Algorithms: Enabling Parameter Selection, Probabilistic and Content-specific Reasoning," presented at the IEEE Conference on Computer Vision and Pattern Recognition ("CVPR'2013") Vision, Industry & Entrepreneurship Workshop, dated Jun. 24, 2013, and in Sawhney et al., "Image Content Based Algorithm Performance Characterization and Recommendation," draft dated Feb. 7, 2014, both of which are incorporated herein by reference. As used herein, "user-oriented application" may refer to, among other things, any of these types of computer applications, whether back-end, middleware, and/or customer-oriented, that has a need for computer vision, machine learning, big data processing, or similar tasks, but whose main focus or objective may be something other than performing those tasks. Such applications may include, for example, online travel and tourism services, photo recommenders, home furnishing recommenders, data-driven image and/or video content recommenders (e.g., for focused advertising), text recognition applications (e.g., for reading text in images and videos), and/or others. As used herein, "application" or "computer application" may refer to, among other things, any type of computer program or group of computer programs, whether implemented in software, hardware, or a combination thereof, and includes self-contained, vertical, and/or shrink-wrapped software applications, distributed and cloud-based applications, and/or others. Portions of a computer application may be embodied as firmware, as one or more components of an operating system, a runtime library, an application programming interface (API), as a self-contained software application, or as a component of another software application, for example.
- In operation, the computer application 130 interfaces with a person, such as an end user or application developer, e.g., by a user interface subsystem 622 (shown in FIG. 6, described below) of the computing system 100. From time to time, the computer application 130 receives or accesses user content 110, which is stored electronically (e.g., as digital files stored in a memory or data storage device 618, 668). The user content 110 may include, for example, structured (e.g., meta tags) or unstructured (e.g., natural language) text 112, audio 114 (e.g., sounds and/or spoken dialog), video 116, and/or images 118 (e.g., a single image or a group or sequence of images). The computer application 130 may determine, e.g., by executing computer logic to identify the user content 110 or a characteristic thereof, that a computer vision or learning task needs to be performed. If so, the computer application 130 formulates and submits the vision or learning task 132 to the platform 140. As used herein, a "task" may refer to, among other things, an activity, such as a vision or learning operation, to be performed by the computing system 100 on specified content 110. As such, a task can have parameters 134 that relate to, among other things, the specific content 110 to be processed by the computing system 100 in performing the task 132. In some embodiments, the task 132 involves executing one or more vision or learning algorithms on the user content 110 and returning a result of the algorithm execution to the requesting application 130. Alternatively or in addition, the task 132 may include a request to select an appropriate algorithm for use in processing particular content 110 and/or a request to determine an appropriate set of parameters to use with a particular algorithm in processing certain content 110.
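The task-plus-parameters pattern just described can be sketched as a small request object. Every name here (CVTask, submit_task) is a hypothetical illustration, not an interface defined by this disclosure.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a vision/learning task such as task 132: it names
# an operation and carries parameters (cf. parameters 134) describing the
# content to process and any constraints. All names are illustrative only.
@dataclass
class CVTask:
    task_type: str                 # e.g., "face_recognition"
    content_refs: list             # references to user content (cf. content 110)
    parameters: dict = field(default_factory=dict)

def submit_task(task: CVTask) -> dict:
    """Validate a task and hand it to the platform (stubbed here)."""
    if not task.content_refs:
        raise ValueError("a task must reference at least one content item")
    # A real platform would now select and run an algorithm; this stub
    # simply echoes what would be dispatched.
    return {"task_type": task.task_type, "n_items": len(task.content_refs)}

request = submit_task(CVTask("face_recognition",
                             ["img_001.jpg", "img_002.jpg"],
                             {"min_confidence": 0.9}))
```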
For example, portions of the platform 140 may function mainly as a developer's tool, while other portions may operate as an online service directly accessible by applications 130 in operation (e.g., at load time or runtime).
- Based on the task 132 and/or one or more parameters 134 relating to the task 132 (which may be supplied to the platform 140 by the application 130 as part of the task 132 or separately), the platform 140 selects one or more of the vision and learning algorithms 142 to perform the task 132. As described in more detail below, the platform 140 may access reference data 144 in order to inform its algorithm selection process and/or to perform the requested task 132. For instance, the platform 140 may utilize reference data 144 to characterize and predict the capabilities of one or more of the algorithms 142 in relation to the particular task 132. The platform 140 executes or initiates the execution of the selected algorithm(s) 142 to perform the task 132 with the requisite algorithm parameters 146, receives algorithm results 148 (e.g., the output of the execution of the selected algorithm 142), and exposes platform output 136 (e.g., the algorithm results 148 and/or an "application friendly" version of the algorithm results 148) for use by the computer application 130.
- In turn, the application 130 may process the platform output 136 according to the needs of the application 130 and, as a result, present application output 120. For example, if the task 132 requested by the application 130 is "recognize all of the faces in all of these images," the platform 140 may select an algorithm 142 based on the parameters 134, where the parameters 134 may include the number of images 118 to which the task 132 relates, the quality and/or content of the images 118, the processing power of the available computing resources (e.g., mobile device or server), and/or the task type (e.g., face, object, scene, or activity recognition). The selected algorithm 142 performs the task 132 by, for instance, algorithmically extracting useful information from the images 118 and comparing that information for each of the images 118 to a portion of the reference data 144. The platform 140 may supply the matching images or information relating to the matching images (such as the name of each person recognized in each of the images 118, or a computer storage location at which the matching images can be accessed) to the application 130 as platform output 136. The application 130 may formulate the platform output 136 for presentation to an end user of the application 130. For instance, the application 130 may place the recognized persons' names on or adjacent to each image 118 and display the image 118 and recognized name on a display device of the computing system 100 (e.g., a display of the user interface subsystem 622, shown in FIG. 6 and described below), or the application 130 may invoke a text-to-speech processor to output the recognized persons' names as machine-generated speech audio. Alternatively or in addition, the application 130 may use the name information for a subsequent task, such as to perform a background check or to query other information relating to the person.
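The selection step described above, in which an algorithm 142 is chosen from parameters 134 such as accuracy needs and available compute, can be sketched as a scoring loop over candidate profiles. The candidates, fields, and numbers below are invented stand-ins for performance characterizations derived from reference data 144, not values from this disclosure.

```python
# Hypothetical sketch of algorithm selection: each candidate carries a
# performance profile, and the platform returns the cheapest candidate
# that still meets the requested accuracy within the compute budget.
CANDIDATES = [
    {"name": "fast_face_rec", "task_type": "face_recognition",
     "accuracy": 0.82, "cost": 1.0},   # suited to a mobile device
    {"name": "deep_face_rec", "task_type": "face_recognition",
     "accuracy": 0.95, "cost": 8.0},   # suited to a server
]

def select_algorithm(task_type, min_accuracy, cost_budget):
    eligible = [c for c in CANDIDATES
                if c["task_type"] == task_type
                and c["accuracy"] >= min_accuracy
                and c["cost"] <= cost_budget]
    if not eligible:
        return None  # no algorithm satisfies the request
    return min(eligible, key=lambda c: c["cost"])["name"]
```

For instance, a request with a small compute budget would receive the cheaper candidate, while a request demanding higher accuracy would receive the larger model, mirroring the parameter-driven selection the passage describes.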
The manner in which the application 130 ultimately utilizes the platform output 136 can vary depending on the requirements or design of the particular application 130.
- Embodiments of the platform 140 ensure that the requested task 132 is completed within the requisite accuracy, quality, efficiency, and/or other parameters that may be specified by the application 130 as part of the task 132 and/or the parameters 134. As described in more detail below, any application 130 that has a need for vision or learning algorithm services can benefit from the use of the platform 140. In some implementations, revenue may be generated in response to the use of the platform 140 and/or the underlying algorithms 142 by various applications 130. For example, vendors of different vision and learning algorithms 142 may "compete" to have their algorithm selected by the platform 140 or by applications 130, and the selection of an algorithm may trigger a revenue event.
- Referring now to
FIG. 2, an embodiment of the vision and learning algorithm services platform 140 is shown in more detail, in the context of an environment that may be created during the operation of the computing system 100 (e.g., an execution or "runtime" environment). The illustrative platform 140 and portions thereof are embodied as a number of computer-executable modules, components, and/or data structures, including application-algorithm interfaces 200 and vision and learning services 220. The interfaces 200 can individually or collectively communicate with the application 130 at an "application" level and can communicate with one or more of the services 220 at an "algorithm" level (where the application level is typically a higher level of abstraction than the algorithm level). Some embodiments of the platform 140 may be implemented as an application programming interface (API) or as a collection of APIs, which is made available to applications 130 (or application developers) as an Internet-based service (e.g., a web service). The illustrative interfaces 200 include an algorithm capabilities interface 210, a task performance interface 212, a computational performance interface 214, and a database interface 216. The illustrative services 220 include a hardware optimization service 222, an algorithm organization service 230, and a data organization service 250. The interfaces 200 and the services 220 are described in more detail below.
- The computer vision and learning algorithms 142 can have tens or sometimes hundreds of internal parameters 146 that control the flow and performance of the algorithms 142. Input and output data types for the algorithms 142 are often complex and heterogeneous. Such complex interfaces burden the algorithm user by requiring knowledge of the meaning and appropriate use of the algorithm parameters 146. The illustrative interface layer 200 functions as an intermediary between the applications 130 and the services 220. The interfaces 200 are designed to support the applications 130 in intelligently selecting one or more of the available algorithms 142 that is capable of performing a particular task 132 given the parameters 134, based on, for example, the algorithms' capabilities, task performance, computational performance, and/or applicable databases. The algorithm capabilities interface 210 provides the applications 130 with access to one or a combination of the algorithms 142, the functional capabilities of each of the algorithms 142, and/or a means by which to control the algorithms 142 via parameters that are meaningful in terms of "real-world" characteristics (where such parameters may be referred to herein as "physical parameters" or "application parameters," e.g., the parameters 134), rather than algorithm-centric parameters (e.g., the algorithm parameters 146). As an example, the Ground Sampling Distance may be considered a physical parameter/application parameter 134, while pixel resolution may be considered an algorithm parameter 146 that corresponds to the Ground Sampling Distance application parameter 134. In some cases, algorithm parameters 146 may be used to calculate application parameters 134, or vice versa.
- The algorithm capabilities interface 210 enables these application-algorithm interactions through computerized abstractions that are designed to remain consistent, or change minimally, over the algorithm development cycles and across the algorithm implementations of different applications 130. To do this, the capabilities interface 210 utilizes APIs that have templated algorithms and abstract input and output data types. The APIs for the algorithms 142 may be developed using, for example, polymorphism, inheritance, and template representations that allow a container type to hold multiple object types and enable algorithm implementations to be independent of the container and object type, with a fixed API. Additionally, as described in connection with FIG. 3, an algorithm parameter mapping module 316 automatically associates the "real world" data characteristics of the application parameters 134 with the algorithm parameters 146 so that the algorithms 142 can be executed with the algorithm parameters 146 to produce algorithm results 148 for the requested task 132 and parameters 134.
- The published performance characteristics of many existing vision algorithms are limited in that they are the result of the algorithms 142 having been run against certain specific public datasets. These datasets are often not tactically relevant for a wide variety of applications 130. Further, existing performance evaluations rarely characterize the algorithms 142 against fuzzy, compressed, and/or low-quality data, which is common on the Internet and is likely to be a common type of content 110 in many applications 130. The illustrative task performance interface 212 exposes the performance characteristics of an algorithm 142 in terms of the accuracy, precision, confidence, uncertainty, etc., of the algorithm results 148 for a specific task 132 and parameters 134. The platform 140 can customize the performance characteristics for a particular type of algorithm 142. The algorithm type may be defined by the functionality performed by the algorithm 142, such as object detection, face detection, scene detection, activity detection, event detection, vehicle detection, facial recognition, etc. The algorithm type may, alternatively or in addition, refer to a level of abstraction associated with the algorithm 142 (e.g., pixel level, feature level, semantic level, etc.). As an example, the performance of object detection algorithms can be characterized by metrics such as the probability of detection, the probability of false alarm, and receiver operating characteristic (ROC) curves (resulting from the distribution of scores with respect to positive and negative instances). Similarly, the performance of localization algorithms can be characterized by uni- or multi-modal distributions of their output parameters, such as the location and orientation of the camera. A computer vision algorithmic service as disclosed herein incorporates programming interfaces for these and all other types of algorithmic performance characterizations.
- As described below with reference to FIG. 4, embodiments of the task performance interface 212 can provide performance estimates 426 at multiple levels of performance characterization, including: (1) a projected performance characterization that quantifies the algorithm's performance against known datasets; (2) a "probe," which rapidly analyzes input without running the full algorithm, e.g., to determine the algorithm's suitability for a particular task 132; and (3) a "diagnostic" characterization that characterizes the uncertainty of specific point solutions obtained by executing the algorithm 142. In this way, the task performance interface 212 can provide both coarse and fine performance measures on algorithms 142 to, among other things, aid in dynamic workflow optimization. The interface 212 may expose one or more of the estimates 426 for use by the platform 140 or the application 130. The computational performance interface 214 exposes information about the time taken and computing resources needed (e.g., hardware processors, cloud clusters, etc.) to execute an algorithm 142. Such information may be obtained from, for example, the hardware optimization service 222, described below. The database interface 216 enables applications 130 to access the indexed content 144 that is applicable to a given task 132 and parameters 134, in a rapid and efficient manner. Additional details of the interfaces 200 are described below.
- Referring now to the vision and learning algorithm services 220, the algorithm organization service 230 is embodied as a computerized framework (e.g., as middleware) for the layered representation of vision and learning algorithms 142. The illustrative framework of algorithm layers 232 organizes the algorithms 142 into a pixel-level class of algorithms 238, a feature-level class of algorithms 236, and a semantic-level class of algorithms 234, where the feature-level algorithms 236 are implemented at a higher level of abstraction than the pixel-level algorithms 238, and the semantic-level algorithms 234 are implemented at a higher level of abstraction than both the feature-level algorithms 236 and the pixel-level algorithms 238. For example, the illustrative pixel-level vision algorithms 238 may produce enhanced versions of input images and may extract camera characteristics; the feature-level algorithms 236 may process images to combine feature aggregates and geometry for recognition; and the semantic-level algorithms 234 may ingest feature and pixel data to produce decisions, labels or tags, and semantic output. Additional examples of the layered framework 232 are shown in TABLE 1 below.
TABLE 1: Examples of algorithm layers.

Semantic-level Algorithms
- WHERE-centric: Scene and terrain recognition; semantic fingerprint-based recognition for geo-registration; skyline matching and location recognition
- WHAT-centric: Recognition of vehicles, humans, weapons; parts-based object recognition; human pose-based object recognition; vehicle fingerprinting
- WHO-centric: Face recognition; soft biometrics: gait, ethnicity, clothing
- WHEN-centric: Time estimation: sun angle analysis, shadow analysis; season estimation: plant/foliage recognition, plant state analysis

Feature-level Algorithms
- Features: Oriented histograms; textons; color histograms; spin-images; shape-context; Space-Time Interest Points (STIP)
- Geometry: 3D pose recovery; parts-based geometry estimation
- Matching: Feature-based image matching; point cloud matching; image descriptor matching; self-similarity
- Trackers: Multi-hypothesis tracking; Joint Probabilistic Data Association Filters (PDAF); Extended Kalman Filters; flow tracking; condensation
- Classifier (untrained) implementations: Support Vector Machines (SVMs); random forests; decision trees

Pixel-level Algorithms
- Image enhancement: Local contrast normalization; noise reduction; blur estimation and removal; sharpening; color correction; color enhancement; image de-compression; image filtering; pyramids
- Image quality: Measures of image energy, texture, gradient strength
- Camera calibration/parameters: Vanishing lines; multi-frame calibration
- Alignment: 2-dimensional (2D) direct motion analysis; motion layers extraction; flow estimation

- As shown in TABLE 1, each of the algorithm layers 232 is further organized by algorithm type. As such, the platform 140 or the application 130 can access any of the algorithms 142 by its level of abstraction and/or its type. Current open source libraries like OpenCV and VXL are organized around "textbook level" core image libraries, and tend to be very project-specific. Among other things, the layered architecture 232 can handle content diversity, can abstract computations and data structures for algorithm families rather than per atomic module, and can handle new versions of the algorithms 142 without changing the interfaces 200.
- The illustrative integration service 240 integrates the algorithms 142 into the layered architecture 232. The illustrative capability analysis 242 evaluates the functional capabilities of new algorithms 142 or new versions of algorithms 142, and the illustrative evolving algorithms service 244 maps the new algorithms 142 or new versions of algorithms 142 to an appropriate level of the architecture 232.
- The reference data 144 may include a wide variety of different content, including the user content 110, which may be stored in databases, files, and/or other electronic data stores (which may be referred to herein simply as "data" or "databases" for ease of discussion). The reference data 144 may include private or protected data 260 and/or public or unprotected data 262. Referring now to the database interface 216 and the data organization service 250, the illustrative database interface 216 exposes information regarding the content databases of the reference data 144 that are available, what type of data each of the databases includes (e.g., area, sensors, biometric data), and the functional characteristics of each of the databases. Some other examples of reference data 144 include: (1) LIDAR (light detection and ranging) data, aerial and ground level imagery (with and without any associated metadata); (2) polygonal and point cloud models of objects; (3) biometric and object databases; and (4) architectural, botanical, geological and meteorological databases. In general, the reference data 144 is stored in any suitable computer readable media, such as the data storage device 668 shown in FIG. 6 and described below.
- The illustrative data organization service 250 is embodied as an extensible framework (e.g., middleware) for indexing algorithms to comprehensively and efficiently handle the diversity of databases and data stores of the reference data 144 that may be applicable to different applications 130. To do this, the data organization service 250 creates the reference data 144 by, for example, ingesting data from a large variety of databases and data stores (e.g., Internet sources such as YOUTUBE, FLICKR, FACEBOOK, etc.), where the "ingesting" may be performed by the computing system 100 as an automated (e.g., background) process or by interfacing with an end user, for example. The data organization service 250 automatically indexes this data and provides database access interfaces 256 for the applications. For example, the database indexing module 254 of the data organization service 250 may create a reference data index 252 to index a collection of invariant two-dimensional and/or three-dimensional features that have been demonstrated for accuracy and efficiency. The database access module 256 may specify and/or verify appropriate permissions and/or access levels for the applications 130 to access the reference data 144, e.g., the private data 260. The reference data index 252 automatically indexes visual data and metadata from structured, semi-structured, and unstructured data stores. In this way, the data organization service 250 can expose a large number of heterogeneous databases and/or data stores for use by the platform 140 or the application 130.
- Some embodiments of the database interface 216 provide multiple types of abstracted APIs that allow unified access to different categories of indexed data, such as: (i) imagery data (e.g., Electro-Optic (EO), Multi-/Hyper-spectral Imagery (MSI/HSI), etc.); (ii) three-dimensional data (e.g., Light Detection and Ranging (LIDAR), Digital Elevation Maps (DEM)); (iii) attributes (e.g., color, class type, scene type); (iv) features (e.g., Histogram of Oriented Gradients (HoG), spin-image); and (v) object, location, scene, action, event, and other such database entities, for instance, faces from a biometric imagery database, locations from a geo-organized imagery database, and actions from an ingested and indexed action database from imagery. For each specific database 144, the database interface 216 exposes certain database information (e.g., as a summary table that contains, among other fields: data type, spatial coverage, time interval (if available), and the number of elements in the database). Some embodiments of the database interface 216 include an interface to an index of high-dimensional visual and metadata features that enable rapid (typically logarithmic/sublinear) access to the categories of data types described above. The summary table can be used (e.g., by the platform 140 or an application 130) to quickly poll for types of data that are available within a given subject area, in order to create a workflow that is suitable for a particular task 132. An example of a database interface 216 is shown in TABLE 2 below.
TABLE 2 Database interface for an example location-matching query. Function Input Output Query for Location - latitude, longitude, (Boolean) Yes/No - Geo-referenced availability of altitude dataset available specific feature Area of coverage - M × N meter2 (Boolean) Yes/No - Desired feature set for a given Feature Type - Histogram of set available location Oriented Gradients Retrieve geo- Location - latitude, longitude, (Boolean) Yes/No - Data available referenced data altitude Image - <Type> Geo-referenced image for visualization Ground Sampling Distance - Nm/ pixel - In addition to high-level information about the indexed data, the
database APIs 216 allow applications or the platform 140 to poll the database 144 to retrieve stored data according to one or more criteria, such as: (i) spatial or volumetric (e.g., return all the images or skylines within a region of interest); (ii) attribute-based (e.g., return all the locations containing a desert scene); (iii) feature similarity (e.g., return all the skylines similar to a query input); and (iv) temporal access. More complex queries can be formed by creating combinations of the different query types. - In some implementations, the
APIs 216 for image data use templated image classes to handle different pixel data types (e.g., color, gray scale, etc.). For three-dimensional (3D) data, abstracted 3D point interfaces similar to the open-source Point Cloud Library (PCL) from Willow Garage may be used. For accessing derived attributes and features, the interface 216 may be based on a templated class that comprises several fields, such as: feature identifier, 2D/3D feature location, origin identifier, and/or descriptor vector. More complex feature types can be constructed using, e.g., inheritance and polymorphism mechanisms to allow the APIs to remain fixed. - Referring now to the
hardware optimization service 222, the service 222 maps and optimizes the vision and learning algorithms 142 for a variety of computing platforms. The service 222 contains implementations of the algorithms 142 that are optimized for the different hardware platforms, e.g., GPU cores (e.g., CUDA, the Compute Unified Device Architecture), multi-threaded CPU cores, and parallel/distributed cloud architectures. - Referring now to
FIG. 3, an embodiment of the algorithm capabilities interface 210 is shown in more detail. The interface 210 includes a number of application parameters APIs 312 and performance parameters APIs 314. The APIs 312, 314 can be used to automatically determine a set of "optimal" algorithm parameters 146 for a particular task 132 and parameters 134. The physical or user-oriented parameters 134 may be computed by the application 130 or by the platform 140 from the task 132 (e.g., a query string) or may be specified by a user of the application 130 (e.g., an end user) or a user of the platform 140 (e.g., an application developer). As used herein, "optimal" may refer to, among other things, a combination of algorithm parameters 146 that is algorithmically determined to have a high probability of performing the task 132 according to the specified parameters 134, accuracy criteria, performance criteria, and/or other criteria. - The illustrative algorithm capabilities interface 210 is embodied as one or more APIs that encode: (i) the parameters 134 (e.g., physical input parameters for the
task 132, such as parameters with real-world meaning rather than algorithm-specific meaning, e.g., Ground Sampling Distance (GSD), focal length, and blur level), and (ii) input and output parameters that specify a desired performance level for the algorithms 142 executing the task 132 or implementing the application 130. These APIs are abstracted and thus do not require the applications 130 to provide algorithm parameters 146. For example, an application 130 may provide only query image properties and desired performance parameters, and the platform 140 can automatically select an algorithm 142, determine the algorithm parameters 146, and produce the platform output 136 in response to those parameters, without requiring any additional information from the application 130. - To do this, the algorithms
parameter mapping module 316 translates the parameters 134 (e.g., physical input parameters and input/output (I/O) performance parameters) into a set of I/O APIs 318 for interacting directly with the semantic-, feature- and pixel-level algorithms. The module 316 uses the task performance interface 212 to map the parameters 134 to algorithm performance parameters 146. In doing this, the module 316 identifies a set of input parameters 146 that are mathematically determined to be likely to achieve a specific level of performance, as determined through systematic testing and evaluation of the algorithm 142's output over time. As an illustration, consider an interface for vehicle recognition. An application 130 needs a vehicle recognition service (e.g., the task 132 may be a query such as "find all vehicles in these images"). The platform 140 or the application 130 uses a combination of algorithms 142 to compute or otherwise determine the physical parameters of the task 132. The physical parameters can be provided by the application 130 (e.g., distance to an object) or computed by a vision or learning algorithm 142 based on one or more of the algorithm parameters 146. The platform 140 or the application 130 invokes the capabilities interface 210 with these parameters 134. The algorithms parameter mapping module 316 maps these parameters to a complete set of (i) semantic-level parameters (e.g., classifier type, classifier parameters); and (ii) feature-level parameters (e.g., feature type). The platform 140 selects and executes a semantic-level algorithm 234, which produces results 148 using the semantic-level parameters, and the platform 140 passes the results 148 to the application 130 as platform output 136. Additionally, the platform 140 may select and execute a feature-level algorithm 236, which produces results 148 using the feature-level parameters, and the platform 140 passes the results 148 to the application 130 as platform output 136.
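As a rough sketch of this mapping step, the example below translates physical, real-world parameters (GSD, blur level) into semantic- and feature-level algorithm parameters. The function name, threshold rules, and parameter values here are invented for illustration only; in the platform, the actual mapping is derived from systematic testing of each algorithm's output over time.

```python
def map_parameters(physical: dict) -> dict:
    """Translate physical/application parameters (such as the parameters 134)
    into semantic- and feature-level algorithm parameters (such as the
    parameters 146). All rules below are illustrative stand-ins."""
    gsd = physical.get("gsd_m_per_pixel", 0.5)
    blur = physical.get("blur_level", 0.0)
    # Feature-level: coarser imagery (larger GSD) -> larger descriptor window.
    window = 10 if gsd < 1.0 else 20
    # Semantic-level: pick a classifier configuration tolerant of blur.
    classifier = {"type": "svm", "kernel": "rbf" if blur > 0.3 else "linear"}
    return {
        "semantic": {"classifier": classifier},
        "feature": {"feature_type": "HoG", "window_size_px": window},
    }

params = map_parameters({"gsd_m_per_pixel": 0.3, "blur_level": 0.1})
print(params["feature"]["window_size_px"])         # 10
print(params["semantic"]["classifier"]["kernel"])  # linear
```

The point of the sketch is the shape of the interface: the caller supplies only parameters with real-world meaning and receives a complete set of algorithm-level parameters, mirroring the abstraction described above.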
Some illustrative examples of algorithm-parameter mappings that may be performed by the algorithm capabilities interface 210 are shown in TABLE 3 below. -
TABLE 3
Capability interface for example WHERE-centric location-matching algorithms.

| Algorithm Organization | Function (e.g., task 132) | Input Parameters 134 | Output Parameters 136 |
|---|---|---|---|
| Semantic | Retrieve Similar Images | <vector> Histogram of Oriented Gradients (HoG) feature; (double) GSD, zoom level | Pointers to locations of reference images with similar HoG features, and at the desired resolution (zoom) |
| Feature | Compute HoG features at building corners | Image Edge Map; Feature Type - "HoG"; Window size - 10 (pixels) | <vector> HoG features |
| Pixel | Edge detection | Image query image; Edge algorithm "Canny"; Window size - 3 (pixels) | Image - Edge Map |

- Referring now to
FIG. 4, an embodiment of the task performance interface 212 is shown in more detail. The illustrative task performance interface 212 provides algorithm performance estimates that enable the platform 140 or the application 130 to obtain an "optimal" result 148 in response to a task 132 and parameters 134 (e.g., an answer to a query). To do this, the platform 140 constructs a workflow for accomplishing the task 132 according to applicable task performance constraints (e.g., image quality and/or computational constraints), which may be supplied by the task performance interface 212. The interface 212 determines performance estimates based on characteristics of the known datasets 144 and the operating range of the parameters 134 that are specific to the task 132 (e.g., an image query). Furthermore, the interface 212 enables the platform 140 or the application 130 to perform a quick "triage" of the task 132 or the associated content 110 to assess the landscape of potential solutions for the task 132. As a result of the triage, the platform 140 or the application 130 can select one or more algorithms 142 for full execution that output confidences in the specific solution produced. In other words, once an algorithm is picked and the solution is obtained, the quality of the solution is characterized (as a performance characterization) and provided to the platform 140 or the application 130, as needed. - Some embodiments of the
task performance interface 212 describe performance characteristics as functions of properties of the content 110 (e.g., image and scene properties, such as ground sample distance (GSD), image blur, and contrast) that are associated with a task 132. To do this, the content properties (e.g., scene and image properties) are extracted from the content 110 (e.g., query images). One or more of the extracted content properties are used to characterize the algorithm 142's performance and to recommend parameters for improved performance. To do this, the algorithm 142's performance is learned over a data cluster that is based on the extracted image feature content and attributes. Based on the learning, a set of parameters is recommended for performing the algorithm on a new image (e.g., an image that is not already represented in the cluster), based on the parameters' performance in the cluster. Local image features are computed for the image, to capture the image content, and then all of the images in a dataset of images are clustered using the feature similarity. A ratings matrix may be computed to estimate the algorithm's performance over the clusters for each combination of parameters (where a combination of parameters may be referred to as an "operating point"). In this way, a new image can be related to a cluster having a similar feature distribution, and the predicted best operating point of the matching cluster can be used as the algorithm parameters 146. To perform the clustering, the images in the dataset are clustered according to similar content and attributes using, for example, a k-means clustering algorithm. - As shown in
FIG. 4, the illustrative task performance interface gives the platform 140 or the application 130 multiple ways to estimate the performance of an algorithm 142, and thereby helps the platform 140 or an application 130 in selecting candidate algorithms 142 for the task 132. A projected performance measure 414 gives a global estimate in response to parameters 410 and the algorithm type 412 (which may be determined by the capabilities interface 210 as described above, where the parameters 410 may include the application parameters 134, the algorithm parameters 146, or a combination of the parameters 134, 146, and the algorithm type 412 may be determined according to the algorithm layers framework 232). The projected measure 414 does not use any inputs from the given content 110 (e.g., a query image 416). As such, the projected measure 414 quantifies performance against known datasets and the resulting estimate 426 is not particular to the content 110. - A probe performance measure 420 uses the actual content 110 (e.g., an image 416) to predict uncertainty relating to the algorithm's performance of the
task 132 using coarse feature computations 418. The probe measure 420 can rapidly assess the space of possible solutions (e.g., a set of candidate algorithms 142). A diagnostic performance measure 424 assesses the final results of executing the algorithm 142 (selected based on the algorithm type 412) on the content 110 (e.g., an image 416) to perform the task 132 using the parameters 410. In this way, the task performance interface 212 enables multiple-level, data-driven performance characterization of algorithms 142 for a specific task 132 submitted by an application 130. As a result, applications 130 or the platform 140 can obtain both coarse and fine performance measures to dynamically optimize their processing pipelines by evaluating multiple algorithms (e.g., a set of candidate algorithms 142). - As an example, for a given
skyline matching algorithm 142, the projected measure 414 provides the ranges of achievable location uncertainty with respect to parameters computed from images in the existing datasets. The projected measures 414 are pre-computed for the respective algorithms 142 as part of the training and testing stages of the algorithms 142. For a task 132 involving a "where" query, given skyline and building outlines, without performing detailed matching, the probe measure 420 informs the platform 140 or the application 130 as to whether the achievable geo-location uncertainty is, e.g., 10,000 sq. km or 100 sq. km. The probe measure 420 produces estimates using global properties of the content 110 (e.g., a query image 416), such as scene type (e.g., desert, shoreline, rural), resolution, contrast, and blur, and also coarse-level properties of features related to the specific task 132 or the particular application 130. For example, the jaggedness of a skyline is a good indication of how well a skyline-based "where" algorithm is likely to work. Similarly, the number of distinctive corners available in an urban image will determine the expected uncertainties in its location. The probe measure 420 mimics human-like abilities by training the system 100 to make high-level estimates from global and application-specific features of the task 132 (e.g., query 416). - An embodiment of the probe measure 420 can be implemented in multiple stages, including (i) an offline technique for learning the mapping between the image properties and the estimates obtained by running the respective algorithms on labeled examples, and (ii) an incremental technique that can refine its mapping by adding a new image, its properties, and its results. The probe measure 420 can dynamically update its predictions by incorporating the results for images previously analyzed by the
platform 140 or by an application 130. Given the learned mapping, the probe module 420 computes the global and application-specific features of a test image and applies the mapping to generate estimates of the expected accuracy and confidence. - The
estimates 426 produced by the diagnostic measure 424 are computed from the complete configuration of matching features in module 422, and determined by combining their confidences and accuracies into summary estimates in module 424. The diagnostic measure 424 analyzes the results of component algorithms 142 to estimate a confidence and accuracy for the whole task 132. This analysis is different for each task 132 or for each application 130. For example, when a semantic fingerprint is used to answer a "where" task (e.g., query) with respect to an image 416, the diagnostic prediction method 424 combines the confidences and accuracies associated with each semantic feature detected in the image 416 (such as a building, road, or mountain), and forms a confidence and accuracy for the whole image 416. When a face detection algorithm is used, for example, to answer a "who" task (e.g., query), the diagnostic measure 424 reports the expected recall, precision, and confidence of the selected algorithm 142 for the database used. TABLE 4 below summarizes typical inputs used and output interfaces for the task performance measures 414, 420, 424. -
TABLE 4
Summary of task performance measures and their input/output interfaces.

| Measure | Typical Inputs Used | Typical Output Interface |
|---|---|---|
| Projected | Global Properties, Stored Datasets (Pre-computed) | <PD, PFA, Location Uncertainties, . . . > (PD, PFA → Detection & False Alarm Probabilities) |
| Probe | Global and App-specific features computed from the Query Image | Ranges of <Recall, Precision, Location Uncertainties, . . . > |
| Diagnostic | Detailed Query Image Properties | <PD, PFA, Confidences, Location Uncertainties, . . . > at the Solution Points |

- Referring back to
FIG. 2, an embodiment of the computational performance interface 214 provides estimates of the computational resources required by each algorithm 142 to help construct a pipeline that meets the resource constraints specified by the application 130 (or the application developer). For instance, the interface 214 can provide estimates of the computational requirements for the algorithms when they are executed on different processing configurations, such as cloud computers with GPUs, or enable profiling techniques to recognize when two or more algorithms 142 perform the same computations, so that an analysis can be performed once and shared, in order to save time or for other reasons. - Traditionally, computational complexity is characterized in terms of algorithmic complexity. While useful, this characterization is not sufficient for the
platform 140 or the application 130 to make an informed decision about resource allocation, because it does not lead to an accurate estimate of processing time in a cloud architecture, which has diverse computational resources, including different types of CPUs and GPUs. The computational performance interface 214 provides a variety of computation-specific information for each algorithm 142, as shown by the example in TABLE 5 below. -
TABLE 5
Computational performance measures and their input/output interfaces.

| Measure | Typical Inputs Used | Typical Output Interface |
|---|---|---|
| Projected | Global Properties, Stored Datasets (Pre-computed), Resources (CPUs, GPUs) | (CPU, GPU) Average execution time; Memory usage. (Cloud) Average execution time; Memory usage; Latency |
| Diagnostic | Image Properties; Resources (CPUs, GPUs, Datasets/Databases) | (CPU, GPU) Actual execution time; Memory usage. (Cloud) Actual execution time; Memory usage; Latency |

- The
illustrative interface 214 lists the processing (e.g., CPU, GPU) and memory requirements to execute an algorithm 142 at different levels. For example, a projected level may characterize the algorithm's processing time with respect to known datasets 144, and a diagnostic level may characterize the computational resources used by the executed algorithm. In some embodiments, components of the interface 214 are implemented using a framework for managing processing flows, such as a Hadoop/UIMA-based infrastructure or any of the standard cloud-based distributed/parallel processing infrastructures. The interface 214 can use the UIMA (Unstructured Information Management Architecture) to define processing pipelines for an algorithm 142, each of which may involve multiple types of computing resources (e.g., hardware). The interface 214 can use Hadoop to replicate these function blocks and to distribute data to them. With this framework, when it is time to execute an algorithm, the platform 140 or the application 130 can specify resources and/or a processing graph using XML (Extensible Markup Language)-based configurations. The interface 214 can, for example, choose among multiple processing options, replicate function blocks, and distribute computation across the cloud at runtime. - As an example, consider a feature extraction algorithm that uses the Harris corner detector and Histogram-of-Oriented-Gradients (HoG) feature descriptor. Through the
interface 214, the platform 140 can determine that the algorithm can be executed using single-core CPU, multi-core CPU or GPU hardware, and can analyze the time taken for each. Since both Harris corners and HoG compute first-order gradients, this computation only needs to be done once for the feature extraction algorithm. The benefit of this optimization, compared to independently using Harris detectors and HoG descriptors, is exposed through the computational performance interface 214. - Referring now to
FIG. 5, an illustrative method 500 by which the computing system 100 may perform vision or learning algorithm services for an application 130 on content 110 is shown. The method 500 may be embodied as computerized programs, routines, logic and/or instructions executed by the computing system 100, for example by the platform 140. At block 510, the system 100 receives a computer vision or machine learning task 132 from a requesting application 130. To do this, the requesting application 130 may submit the task 132 using one or more of the application-algorithm interfaces 200 (e.g., the capabilities interface 210), or may use one or more of the APIs described above, depending on the design of the application 130. At block 512, the computing system 100 determines whether the task 132 requires the use of one of the algorithms 142. If the requested task does not require the use of an algorithm 142 (as may be the case if, for example, the task 132 cannot be handled by any of the available algorithms 142), the system returns to block 510 and monitors for another task 132 (which may be received from the same application 130 or a different application 130, depending on the design or implementation of the platform 140). If the task 132 requires the use of an algorithm 142, the system 100 determines the application parameters 134 for the requested task 132, at block 514. To do this, the system 100 may, for example, extract the parameters 134 from a query string of the requested task 132. At block 516, the system 100 identifies one or more candidate algorithms to perform the requested task based on the application parameters 134 determined at block 514, where the candidate algorithms comprise a subset of the library of algorithms 142. To do this, the computing system 100 may analyze the parameters 134 to determine an appropriate level of algorithm abstraction using the algorithm organization service 230. In some embodiments, the system 100 may simply return a list of the candidate algorithms 142, depending on the needs of the requesting application 130.
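The overall control flow of the method 500 can be condensed into a sketch like the following. The library contents, task names, and parameter values are hypothetical stand-ins for illustration, not part of the specification:

```python
from typing import Optional

# Hypothetical algorithm library: task -> a candidate algorithm entry.
LIBRARY = {
    "find_vehicles": {"algorithm": "vehicle_detector"},
    "find_faces": {"algorithm": "face_detector"},
}

def handle_task(task: str, content) -> Optional[dict]:
    """Condensed sketch of the method-500 flow (blocks 510-532)."""
    if task not in LIBRARY:                 # block 512: no suitable algorithm
        return None                         # -> return to monitoring (block 510)
    app_params = {"query": task}            # block 514: application parameters
    candidate = LIBRARY[task]               # blocks 516-526: evaluate and select
    algo_params = {"threshold": 0.5}        # block 528: "optimal" algorithm params
    return {                                # blocks 530-532: execute and expose
        "algorithm": candidate["algorithm"],
        "params": {**app_params, **algo_params},
        "results": [],                      # placeholder for the results
    }

print(handle_task("find_vehicles", content=None)["algorithm"])  # vehicle_detector
print(handle_task("count_stars", content=None))                 # None
```

In the platform itself, the selection and parameter-determination steps are the interface-driven evaluations described in the surrounding text rather than the single dictionary lookup and fixed values shown here.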
In other embodiments, the system 100 proceeds to intelligently analyze the capabilities of the candidate algorithms vis-à-vis the requested task 132, as described below. - The algorithm and performance capabilities of the
candidate algorithms 142 identified at block 516 are evaluated for the requested task 132, at block 518. To do this, the computing system 100 may utilize the capabilities interface 210 (at block 520), the task performance interface 212 (at block 522), the computational performance interface 214 (at block 524), and/or one or more of the services 220, as described above. At block 526, the computing system 100 compares the results of the evaluation of the algorithm and performance capabilities of the candidate algorithms 142 and selects one or more of the algorithms 142 to perform (e.g., fully execute) the task 132 on the content 110. In some embodiments, the system 100 may simply return the selected algorithm(s) 142, or a list of the selected algorithm(s), depending on the needs of the requesting application 130. In other embodiments, the system 100 proceeds to intelligently determine a set of parameters for the selected algorithm(s) 142 in view of the requested task 132, as described below. - At
block 528, the system 100 determines the "optimal" algorithm parameters 146 to execute the algorithm(s) 142 selected at block 526 on the particular content 110 that is the subject of the task 132. For example, the system 100 may perform content-based performance characterization, wherein attributes of the content 110 may be extracted and clustered with a dataset of previously-analyzed content, to identify the algorithm parameters 146. In some embodiments, the system 100 may simply return the algorithm parameters 146, depending on the needs of the requesting application 130. In other embodiments, the system 100 proceeds to execute the selected algorithm(s) 142 using the parameters 146, as described below. - At
block 530, the system 100 executes the task 132 using the selected algorithms 142 and the algorithm parameters 146 determined at block 528, and obtains the algorithm results 148. To do this, the system 100 may initiate the execution of the algorithms 142 through an API, such as one or more of the APIs described above. At block 532, the system 100 communicates the results 148 of performing the task 132 with the selected algorithms 142 as output 136 to the requesting application 130. To do this, the system 100 may expose the results 148 as output 136 through one or more of the interfaces 200 or through one or more of the APIs described above to the application 130. As used herein, "expose" may refer to the action of making information or computer functionality available for use by other applications, by some computerized mechanism (e.g., through an API, or through a message communication mechanism). - Among other things, the
platform 140 can be used to combine existing vision and/or learning techniques in new ways to adapt to evolving algorithmic and software needs. For example, the platform 140 can combine shadow detection, object recognition, and an ephemeris module to create a technique for estimating the time at which a picture was taken, by using detected objects as sundials. The platform 140 also has a number of military, intelligence and commercial applications. - Alternatively or in addition, the
platform 140 can be used to provide performance characteristics of algorithms 142 for ranges of operating parameters that cover image quality, scene complexity and clutter, and object- and scene-oriented parameters. Accordingly, the platform 140 can help developers and practitioners to create objective assessments of algorithms 142 and operating ranges for the algorithms 142 to be used in real time by mainstream applications. - Referring now to
FIG. 6, a simplified block diagram of an embodiment 600 of the computing system 100 is shown. While the illustrative computing system 600 is shown as involving multiple components and devices, it should be understood that in some embodiments, the computing system 600 may constitute a single computing device, alone or in combination with other devices. The computing system 600 includes a user computing device 610, which may be in communication with one or more other computing systems or devices 660 via one or more networks 650. The vision and learning services platform 140 or portions thereof may be distributed across multiple computing devices that are connected to the network(s) 650 as shown. In other embodiments, however, the platform 140 may be located entirely on the computing device 610. In some embodiments, portions of the platform 140 may be incorporated into other systems or computer applications. Such applications or systems may include, for example, operating systems, middleware or framework software, and/or applications software. For example, portions of the platform 140 may be incorporated into or accessed by search engines or intelligent assistance applications. - The illustrative computing device 610 includes at least one processor 612 (e.g., a microprocessor, microcontroller, digital signal processor, etc.),
memory 614, and an input/output (I/O) subsystem 616. The computing device 610 may be embodied as any type of computing device capable of performing the functions described herein, such as a personal computer (e.g., desktop, laptop, tablet, smart phone, body-mounted device, etc.), a server, an enterprise computer system, a network of computers, a combination of computers and other electronic devices, or other electronic devices. Although not specifically shown, it should be understood that the I/O subsystem 616 typically includes, among other things, an I/O controller, a memory controller, and one or more I/O ports. The processor 612 and the I/O subsystem 616 are communicatively coupled to the memory 614. The memory 614 may be embodied as any type of suitable computer memory device (e.g., volatile memory such as various forms of random access memory). - The I/
O subsystem 616 is communicatively coupled to a number of hardware components and/or other computing systems including the computer application(s) 130, the platform 140, and the user interface subsystem 622, which includes one or more user input devices (e.g., a touchscreen, keyboard, virtual keypad, microphone, etc.) and one or more output devices (e.g., speakers, displays, LEDs, etc.). The I/O subsystem 616 is also communicatively coupled to one or more storage media 618, one or more video and/or still image capture devices 620 (e.g., cameras), and a communication subsystem 624. It should be understood that each of the foregoing components and/or systems may be integrated with the computing device 610 or may be a separate component or system that is in communication with the I/O subsystem 616 (e.g., over a network 650 or a serial bus connection). - The
storage media 618 may include one or more hard drives or other suitable data storage devices (e.g., flash memory, memory cards, memory sticks, and/or others). In some embodiments, portions of the application(s) 130, the platform 140, the user content 110, the application output 120, the task 132, the platform output 136, and/or other data reside at least temporarily in the storage media 618. Portions of the application(s) 130, the platform 140, the user content 110, the application output 120, the task 132, the platform output 136, and/or other data may be copied to the memory 614 during operation of the computing device 610, for faster processing or other reasons. - The
communication subsystem 624 may communicatively couple the computing device 610 to one or more communication networks, e.g., a local area network, wide area network, personal cloud, enterprise cloud, public cloud, and/or the Internet, for example. Accordingly, the network interfaces 632 may include one or more wired or wireless network interface software, firmware, or hardware, for example, as may be needed pursuant to the specifications and/or design of the particular computing system 600. - The server computing device(s) 660 may be embodied as any suitable type of computing device capable of performing the functions described herein, such as any of the aforementioned types of devices or other electronic devices. For example, in some embodiments, the server computing device(s) 660 may include one or more server computers including
storage media 668, which may be used to store portions of the vision and learning algorithms 142, the reference data 144, the application(s) 130, the platform 140, the user content 110, the application output 120, the task 132, the platform output 136, and/or other data. The illustrative server computing device 660 also includes a user interface subsystem 670 and a communication subsystem 672, which may be embodied similarly to the corresponding components of the computing device 610, described above. The computing system 600 may include other components, sub-components, and devices not illustrated in FIG. 6 for clarity of the description. In general, the components of the computing system 600 are communicatively coupled as shown in FIG. 6 by signal paths, which may be embodied as any type of wired or wireless signal paths capable of facilitating communication between the respective devices and components. - Illustrative examples of the technologies disclosed herein are provided below. An embodiment of the technologies may include any one or more, and any combination of, the examples described below.
- In an Example 1, a platform for providing computer vision algorithm services is embodied in one or more machine accessible storage media and includes an application-algorithm interface to: receive a computer vision task from a computer application, the computer vision task to be performed on one or more digital images accessed by the computer application; determine one or more parameters relating to the performing of the computer vision task on the one or more digital images; select one or more computer vision algorithms from a library of computer vision algorithms based on capabilities of the computer vision algorithms that have a high level of performance in comparison to the capabilities of the other computer vision algorithms in the library of computer vision algorithms to perform the computer vision task on the one or more digital images with the one or more parameters; and expose, for use by the computer application, output of the selected computer vision algorithm performing the computer vision task on the one or more digital images using the one or more parameters.
- An Example 2 includes the platform of Example 1, wherein the platform is to execute one or more algorithm performance characterization techniques to determine the one or more parameters relating to the performing of the computer vision task. An Example 3 includes the platform of Example 2, wherein the platform is to map one or more application parameters supplied by the computer application to one or more algorithm parameters of the selected computer vision algorithms based on the executing of the one or more algorithm performance characterization techniques. An Example 4 includes the platform of any of Examples 1-3, wherein the platform is to determine a content characteristic or an application-defined parameter of the one or more digital images, wherein the application-defined parameter is defined by the computer application, and select the computer vision algorithm based on the content characteristic or the application-defined parameter of the one or more digital images. An Example 5 includes the platform of Example 4, wherein the platform is to analyze the performance of the selected computer vision algorithm on the computer vision task based on the content characteristic or the application-defined parameter of the one or more digital images, and determine a set of algorithm parameters for the selected computer vision algorithm based on analyzing the performance of the selected computer vision algorithm. An Example 6 includes the platform of any of Examples 1-5, wherein the platform is to organize the library of computer vision algorithms according to a plurality of different levels of abstraction, and determine a level of abstraction at which to select the computer vision algorithm based on a characteristic of the computer vision task or a characteristic of the computer application. 
An Example 7 includes the platform of any of Examples 1-6, wherein the platform is to select a combination of different computer vision algorithms to perform the computer vision task, execute the combination of different computer vision algorithms on the one or more digital images using the parameter, and expose, for use by the computer application, output of the combination of computer vision algorithms executing the computer vision task. An Example 8 includes the platform of any of Examples 1-7, wherein the platform is to determine a hardware computing resource available to perform the computer vision task, estimate the computational performance of the selected computer vision algorithm on the hardware computing resource, and determine an algorithm parameter for use with the selected computer vision algorithm based on the computational performance of the selected computer vision algorithm on the computing resource. An Example 9 includes the platform of any of Examples 1-8, wherein the platform is to select a data store of a plurality of available data stores for use in performing the computer vision task, wherein each of the data stores includes reference data usable to perform the computer vision task, and the platform selects the selected data store based on a content characteristic or an application-defined parameter of the one or more digital images, wherein the application-defined parameter is defined by the computer application. An Example 10 includes the platform of any of Examples 1-9, wherein the platform is to compute a plurality of different performance measures for the selected computer vision algorithm, and determine an algorithm parameter for use with the selected computer vision algorithm based on the plurality of different performance measures. 
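Example 8's hardware-aware parameter determination — estimate the algorithm's computational performance on the available resource, then pick an algorithm parameter accordingly — might look like the following sketch. The linear pixel-count cost model, the function names, and the scale-halving strategy are invented for illustration and are not the patent's method.

```python
# Hypothetical sketch of Example 8: derive an image-scale parameter that keeps
# the estimated runtime within a latency budget on the available hardware.
def estimate_runtime_ms(pixels, gflops_available, cost_per_pixel_flops=2000):
    """Crude cost model: FLOPs assumed proportional to pixel count."""
    flops = pixels * cost_per_pixel_flops
    return flops / (gflops_available * 1e9) * 1000.0

def choose_scale(width, height, gflops_available, budget_ms):
    """Halve the image scale until the estimated runtime fits the budget."""
    scale = 1.0
    while scale > 0.1:
        px = int(width * scale) * int(height * scale)
        if estimate_runtime_ms(px, gflops_available) <= budget_ms:
            return scale
        scale /= 2
    return scale
```

For a 1920x1080 input on a hypothetical 100-GFLOP/s resource, a 50 ms budget permits full resolution, while a 10 ms budget forces the scale down to a quarter.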
An Example 11 includes the platform of any of Examples 1-10, wherein the platform is to determine an algorithm type of the selected computer vision algorithm, and customize the performance measures for the selected computer vision algorithm based on the algorithm type. An Example 12 includes the platform of any of Examples 1-11, wherein the platform is to predictively characterize the performance of the selected one or more computer vision algorithms without executing the algorithm. An Example 13 includes the platform of Example 12, wherein the platform is to predictively characterize the performance of the selected one or more computer vision algorithms based on one or more content characteristics of the one or more digital images and/or one or more application-defined parameters supplied by the computer application. An Example 14 includes the platform of any of Examples 1-13, wherein the platform is to characterize the performance of the selected one or more computer vision algorithms by executing the selected one or more computer vision algorithms on the one or more digital images and analyzing the output of the selected one or more computer vision algorithms.
- In an Example 15, a platform for providing computer vision and learning algorithm services to user-oriented computer applications includes, embodied in one or more machine accessible storage media: an application-algorithm interface to determine application parameters to perform a computer vision or learning task on digital content, the computer vision or learning task received from a computer application, at least one of the application parameters indicating a characteristic of the digital content; an algorithm capabilities interface to identify, based on the application parameters, candidate computer vision or learning algorithms to perform the computer vision or learning task on the digital content; and a performance interface to evaluate a capability of each of the candidate computer vision or learning algorithms to perform the computer vision or learning task on the digital content, the performance capability determined at least in part by the characteristic of the digital content. An Example 16 includes the platform of Example 15, wherein the platform is to select a computer vision or learning algorithm of the candidate computer vision or learning algorithms based on the evaluating of the capability of the selected computer vision or learning algorithm to perform the computer vision or learning task on the digital content. An Example 17 includes the platform of Example 16, including an algorithm parameter mapping module to map the application parameters to one or more algorithm parameters to use with the selected computer vision or learning algorithm to perform the computer vision or learning task on the digital content. An Example 18 includes the platform of Example 17, wherein the platform is to perform the computer vision or learning task on the digital content by executing the selected computer vision or learning algorithm using the one or more algorithm parameters.
An Example 19 includes the platform of Example 18, wherein the platform is to communicate a result of executing the selected computer vision or learning algorithm to the computer application. An Example 20 includes the platform of any of Examples 15-19, including an algorithm organization framework to organize the candidate computer vision or learning algorithms according to a plurality of different levels of abstraction, and wherein the platform is to select a level of abstraction based on the computer vision or learning task. An Example 21 includes the platform of any of Examples 15-20, including a computational performance interface to determine a hardware computing resource available to perform the computer vision or learning task, and estimate the computational performance of each of the candidate computer vision or learning algorithms on the hardware computing resource. An Example 22 includes the platform of any of Examples 15-21, including a task performance interface to compute a plurality of different performance measures for each of the candidate computer vision or learning algorithms, and select the computer vision or learning algorithm based on the plurality of different performance measures. An Example 23 includes the platform of any of Examples 15-22, including a data organization service to index a plurality of reference data for use by the platform in executing the computer vision or learning algorithms. An Example 24 includes the platform of any of Examples 15-23, including a plurality of application programming interfaces to expose computer vision or learning algorithms for use at a plurality of different levels of abstraction.
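The interplay of the interfaces in Examples 15-19 (determine application parameters, identify candidates, evaluate each against the content, map parameters, execute, and return the result) might be wired together as in this minimal sketch. Every name, the toy library, and the evaluation and mapping callables are hypothetical.

```python
# Hypothetical end-to-end wiring of the interfaces in Examples 15-19.
def run_task(task, content, library, evaluate, map_params):
    # application-algorithm interface: determine application parameters
    app_params = {"task": task, "content_type": content["type"]}
    # algorithm capabilities interface: identify candidates for the task
    candidates = [a for a in library if task in a["tasks"]]
    # performance interface: evaluate each candidate against the content
    scored = [(evaluate(a, content), a) for a in candidates]
    score, chosen = max(scored, key=lambda pair: pair[0])
    # parameter mapping: application parameters -> algorithm parameters
    algo_params = map_params(app_params, chosen)
    # execute and communicate the result back to the application
    return chosen["run"](content, algo_params)

library = [
    {"name": "edges", "tasks": {"segment"},
     "run": lambda c, p: f"edges:{p['level']}"},
    {"name": "regions", "tasks": {"segment", "classify"},
     "run": lambda c, p: f"regions:{p['level']}"},
]
result = run_task(
    "segment", {"type": "photo"}, library,
    evaluate=lambda a, c: 0.9 if a["name"] == "regions" else 0.5,
    map_params=lambda app, algo: {"level": "fine"},
)
```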
- In an Example 25, a method for providing machine learning algorithm services to computer applications includes, with at least one computing device: determining a parameter relating to a machine learning task of a computer application; evaluating a capability of a plurality of machine learning algorithms to perform the machine learning task with the parameter; selecting a machine learning algorithm of the plurality of machine learning algorithms based on the evaluating of the capability of the machine learning algorithms to perform the machine learning task with the parameter; performing the machine learning task by executing the selected machine learning algorithm with the parameter; and communicating a result of the executing of the machine learning algorithm to the computer application.
- An Example 26 includes the method of Example 25, wherein the machine learning task includes analyzing digital content, and the method includes determining the parameter based on an attribute of the digital content. An Example 27 includes the method of Example 26, including using the attribute of the digital content to determine a performance characteristic of the machine learning algorithm. An Example 28 includes the method of Example 27, including determining the performance characteristic of the machine learning algorithm by executing a content clustering technique based on the attribute of the digital content. An Example 29 includes the method of any of Examples 25-28, wherein the machine learning task includes analyzing one or more digital images, and the method includes algorithmically extracting a content characteristic from a digital image, computing distances between the digital image and each of a plurality of clusters of stored digital images, wherein the distances are computed based on the extracted content characteristic, selecting a cluster having a shortest computed distance to the digital image, and executing the selected machine learning algorithm on the digital image using a parameter associated with the selected cluster. An Example 30 includes the method of any of Examples 25-29, wherein the machine learning algorithms are organized according to a plurality of different levels of abstraction, and the method includes selecting a level of abstraction based on the machine learning task and/or the parameter. An Example 31 includes an Internet-based service including instructions embodied in one or more machine accessible storage media, the instructions executable by one or more processors to perform the method of any of Examples 25-30. An Example 32 includes one or more machine accessible storage media having embodied therein a plurality of instructions executable by a processor to perform the method of any of Examples 25-30.
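Example 29's cluster-based parameter selection — extract a content characteristic, compute distances to stored clusters, pick the nearest, and reuse its associated parameter — could look roughly like this sketch. Mean pixel intensity as the extracted characteristic and the hand-made cluster table are illustrative assumptions; the disclosure does not fix a particular feature or distance measure.

```python
# Hypothetical sketch of Example 29: nearest-cluster parameter lookup.
def extract_characteristic(image):
    """Average pixel intensity as a simple 1-D content characteristic."""
    flat = [px for row in image for px in row]
    return sum(flat) / len(flat)

def nearest_cluster(feature, clusters):
    """clusters: {name: (centroid, params)} -> name at smallest distance."""
    return min(clusters, key=lambda name: abs(feature - clusters[name][0]))

clusters = {
    "dark":   (40.0,  {"threshold": 0.2}),
    "bright": (200.0, {"threshold": 0.7}),
}
image = [[30, 50], [40, 60]]           # mean intensity 45 -> "dark" cluster
chosen = nearest_cluster(extract_characteristic(image), clusters)
params = clusters[chosen][1]           # parameter associated with that cluster
```

The selected machine learning algorithm would then be executed on the image with `params`.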
- In the foregoing description, numerous specific details, examples, and scenarios are set forth in order to provide a more thorough understanding of the present disclosure. It will be appreciated, however, that embodiments of the disclosure may be practiced without such specific details. Further, such examples and scenarios are provided for illustration, and are not intended to limit the disclosure in any way. Those of ordinary skill in the art, with the included descriptions, should be able to implement appropriate functionality without undue experimentation.
- References in the specification to “an embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is believed to be within the knowledge of one skilled in the art to affect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly indicated.
- Embodiments in accordance with the disclosure may be implemented in hardware, firmware, software, or any combination thereof. Embodiments may also be implemented as instructions stored using one or more machine-readable media, which may be read and executed by one or more processors. A machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computing device or a “virtual machine” running on one or more computing devices). For example, a machine-readable medium may include any suitable form of volatile or non-volatile memory.
- Modules, data structures, blocks, and the like are referred to as such for ease of discussion, and are not intended to imply that any specific implementation details are required. For example, any of the described modules and/or data structures may be combined or divided into sub-modules, sub-processes or other units of computer code or data as may be required by a particular design or implementation. In the drawings, specific arrangements or orderings of schematic elements may be shown for ease of description. However, the specific ordering or arrangement of such elements is not meant to imply that a particular order or sequence of processing, or separation of processes, is required in all embodiments. In general, schematic elements used to represent instruction blocks or modules may be implemented using any suitable form of machine-readable instruction, and each such instruction may be implemented using any suitable programming language, library, application-programming interface (API), and/or other software development tools or frameworks. Similarly, schematic elements used to represent data or information may be implemented using any suitable electronic arrangement or data structure. Further, some connections, relationships or associations between elements may be simplified or not shown in the drawings so as not to obscure the disclosure. This disclosure is to be considered as exemplary and not restrictive in character, and all changes and modifications that come within the spirit of the disclosure are desired to be protected.
Claims (26)
1-32. (canceled)
33. A platform for providing machine learning algorithm services to user-oriented computer applications, the platform comprising:
a performance interface to evaluate a capability of each of a plurality of candidate machine learning algorithms to perform a machine learning task received from a computer application on digital content, the performance capability being determined at least in part by a characteristic of the digital content indicated by at least one application parameter of a plurality of application parameters determined by an application-algorithm interface, wherein identification of the plurality of candidate machine learning algorithms is based on the application parameters; and
an algorithm organization framework to organize the plurality of candidate machine learning algorithms according to a plurality of different levels of abstraction, wherein the platform is to select a level of abstraction based on the machine learning task,
wherein the platform is to select a machine learning algorithm from the plurality of candidate machine learning algorithms based on the evaluation performed by the performance interface.
34. The platform of claim 33 , further comprising an algorithm parameter mapping module to map the application parameters to one or more algorithm parameters to use with the selected machine learning algorithm to perform the machine learning task on the digital content.
35. The platform of claim 34 , wherein the platform is to perform the machine learning task on the digital content by executing the selected machine learning algorithm using the one or more algorithm parameters.
36. The platform of claim 35 , wherein the platform is to communicate a result of executing the selected machine learning algorithm to the computer application.
37. The platform of claim 33 , further comprising a data organization service to index a plurality of reference data for use by the platform in executing the machine learning algorithms.
38. The platform of claim 37 , wherein the data organization service creates the plurality of reference data by analyzing and indexing data from a plurality of databases and data stores.
39. The platform of claim 38 , wherein the data organization service provides a database access interface to the plurality of reference data for the computer application.
40. The platform of claim 39 , wherein the database access interface enables the user-oriented computer applications or the platform to poll or retrieve stored data according to one or more criteria selected from a group including a spatial or volumetric characteristic, an attribute-based characteristic, feature similarity, and temporal access.
41. The platform of claim 38 , wherein a reference data index automatically indexes visual data and metadata from structured, semi-structured and unstructured data stores.
42. The platform of claim 33 , wherein the performance interface is to provide performance estimates of each of the plurality of candidate machine learning algorithms at multiple levels of performance characterization.
43. The platform of claim 42 , wherein the multiple levels of performance characterization include a level that performs a projected performance characterization that quantifies each of the plurality of candidate machine learning algorithms' performance against known datasets.
44. The platform of claim 42 , wherein the multiple levels of performance characterization include a level that performs rapid analysis of input without fully running each of the plurality of candidate machine learning algorithms.
45. The platform of claim 42 , wherein the multiple levels of performance characterization include a level that determines each of the plurality of candidate machine learning algorithms' suitability for a particular task.
46. The platform of claim 42 , wherein the multiple levels of performance characterization include a level that performs a diagnostic characterization that characterizes the uncertainty of specific point solutions obtained by executing each of the plurality of candidate machine learning algorithms.
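The characterization levels described in claims 42-46 (projected performance against known datasets, rapid input analysis without a full run, and diagnostic uncertainty of a point solution) could be dispatched as in the sketch below. The concrete scoring rules are invented for illustration, not taken from the disclosure.

```python
# Hypothetical dispatcher over the performance-characterization levels of
# claims 42-46; each branch implements one level with a made-up scoring rule.
def characterize(algorithm, level, context):
    if level == "projected":
        # projected characterization: mean score across known benchmark datasets
        scores = algorithm["benchmarks"].values()
        return sum(scores) / len(scores)
    if level == "rapid":
        # rapid analysis: cheap input check without fully running the algorithm,
        # e.g. fraction of the input's fields the algorithm can consume
        handled = context["input_fields"] & algorithm["accepts"]
        return len(handled) / len(context["input_fields"])
    if level == "diagnostic":
        # diagnostic characterization: uncertainty of a specific point solution
        return 1.0 - context["solution_confidence"]
    raise ValueError(f"unknown characterization level: {level}")

algo = {"benchmarks": {"set_a": 0.8, "set_b": 0.9},
        "accepts": {"image", "depth"}}
projected = characterize(algo, "projected", {})
rapid = characterize(algo, "rapid", {"input_fields": {"image", "audio"}})
```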
47. A method for providing machine learning algorithm services to computer applications, the method comprising, with at least one computing device:
evaluating a capability of each of a plurality of candidate machine learning algorithms to perform a machine learning task on digital content, the performance capability being determined at least in part by a characteristic of the digital content indicated by at least one application parameter of a plurality of application parameters, wherein identification of the plurality of candidate machine learning algorithms is based on the application parameters;
organizing the plurality of candidate machine learning algorithms, using an algorithm organization framework, according to a plurality of different levels of abstraction;
selecting a level of abstraction based on the machine learning task;
selecting a machine learning algorithm from the plurality of candidate machine learning algorithms based on the evaluation of the capability of each of the plurality of candidate machine learning algorithms;
performing the machine learning task by executing the selected machine learning algorithm with the parameter; and
communicating a result of the executing of the machine learning algorithm to the computer application.
48. The method of claim 47 , wherein the machine learning task comprises analyzing digital content, and the method further comprises determining the parameter based on an attribute of the digital content.
49. The method of claim 48 , further comprising using the attribute of the digital content to determine a performance characteristic of the machine learning algorithm.
50. The method of claim 49 , further comprising determining the performance characteristic of the machine learning algorithm by executing a content clustering technique based on the attribute of the digital content.
51. The method of claim 47 , wherein the evaluation of the capability of each of the plurality of candidate machine learning algorithms is to provide performance estimates of each of the plurality of candidate machine learning algorithms at multiple levels of performance characterization.
52. The method of claim 51 , wherein the multiple levels of performance characterization include a level that performs a projected performance characterization that quantifies each of the plurality of candidate machine learning algorithms' performance against known datasets.
53. The method of claim 51 , wherein the multiple levels of performance characterization include a level that performs rapid analysis of input without fully running each of the plurality of candidate machine learning algorithms.
54. The method of claim 51 , wherein the multiple levels of performance characterization include a level that determines each of the plurality of candidate machine learning algorithms' suitability for a particular task.
55. The method of claim 51 , wherein the multiple levels of performance characterization include a level that performs a diagnostic characterization that characterizes the uncertainty of specific point solutions obtained by executing each of the plurality of candidate machine learning algorithms.
56. An Internet-based service comprising instructions embodied in one or more non-transitory machine accessible storage media, the instructions executable by one or more processors to perform the method of claim 47 .
57. One or more non-transitory machine accessible storage media having embodied therein a plurality of instructions executable by a processor to perform the method of claim 47 .
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/285,679 US20170091590A1 (en) | 2013-03-15 | 2016-10-05 | Computer vision as a service |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361787254P | 2013-03-15 | 2013-03-15 | |
US14/212,237 US9152870B2 (en) | 2013-03-15 | 2014-03-14 | Computer vision as a service |
US14/849,423 US9466013B2 (en) | 2013-03-15 | 2015-09-09 | Computer vision as a service |
US15/285,679 US20170091590A1 (en) | 2013-03-15 | 2016-10-05 | Computer vision as a service |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/849,423 Continuation US9466013B2 (en) | 2013-03-15 | 2015-09-09 | Computer vision as a service |
Publications (1)
Publication Number | Publication Date |
---|---|
US20170091590A1 true US20170091590A1 (en) | 2017-03-30 |
Family
ID=51527306
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/212,237 Expired - Fee Related US9152870B2 (en) | 2013-03-15 | 2014-03-14 | Computer vision as a service |
US14/849,423 Active US9466013B2 (en) | 2013-03-15 | 2015-09-09 | Computer vision as a service |
US15/285,679 Abandoned US20170091590A1 (en) | 2013-03-15 | 2016-10-05 | Computer vision as a service |
Family Applications Before (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/212,237 Expired - Fee Related US9152870B2 (en) | 2013-03-15 | 2014-03-14 | Computer vision as a service |
US14/849,423 Active US9466013B2 (en) | 2013-03-15 | 2015-09-09 | Computer vision as a service |
Country Status (1)
Country | Link |
---|---|
US (3) | US9152870B2 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160378685A1 (en) * | 2015-06-27 | 2016-12-29 | Mcafee, Inc. | Virtualized trusted storage |
CN109961045A (en) * | 2019-03-25 | 2019-07-02 | 联想(北京)有限公司 | A kind of location information prompt method, device and electronic equipment |
US11120150B2 (en) | 2018-02-13 | 2021-09-14 | International Business Machines Corporation | Dynamic access control for knowledge graph |
Families Citing this family (53)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009091845A1 (en) | 2008-01-14 | 2009-07-23 | Isport, Llc | Method and system of enhancing ganglion cell function to improve physical performance |
US9488492B2 (en) | 2014-03-18 | 2016-11-08 | Sri International | Real-time system for multi-modal 3D geospatial mapping, object recognition, scene annotation and analytics |
FR3010217A1 (en) * | 2013-08-30 | 2015-03-06 | Thomson Licensing | METHOD AND DEVICE FOR PROCESSING NON-ALIGNED IMAGES, AND CORRESPONDING COMPUTER PROGRAM |
KR102016545B1 (en) * | 2013-10-25 | 2019-10-21 | 한화테크윈 주식회사 | System for search and method for operating thereof |
JP6354178B2 (en) * | 2014-01-31 | 2018-07-11 | オムロン株式会社 | Image processing apparatus, management system, and management method |
US10708550B2 (en) | 2014-04-08 | 2020-07-07 | Udisense Inc. | Monitoring camera and mount |
CN113205015A (en) | 2014-04-08 | 2021-08-03 | 乌迪森斯公司 | System and method for configuring a baby monitor camera |
US10185894B2 (en) * | 2015-03-26 | 2019-01-22 | Beijing Kuangshi Technology Co., Ltd. | Picture management method and device, picture synchronization method and device |
CN104796611A (en) * | 2015-04-20 | 2015-07-22 | 零度智控(北京)智能科技有限公司 | Method and system for remotely controlling unmanned aerial vehicle to implement intelligent flight shooting through mobile terminal |
US11461368B2 (en) | 2015-06-23 | 2022-10-04 | Micro Focus Llc | Recommending analytic tasks based on similarity of datasets |
CN105677763B (en) * | 2015-12-29 | 2019-08-20 | 华南理工大学 | A kind of image quality measure system based on Hadoop |
WO2017125161A1 (en) * | 2016-01-21 | 2017-07-27 | Hewlett Packard Enterprise Development Lp | Resource allocation |
US10354912B2 (en) * | 2016-03-21 | 2019-07-16 | Qualcomm Incorporated | Forming self-aligned vertical interconnect accesses (VIAs) in interconnect structures for integrated circuits (ICs) |
US10209773B2 (en) * | 2016-04-08 | 2019-02-19 | Vizzario, Inc. | Methods and systems for obtaining, aggregating, and analyzing vision data to assess a person's vision performance |
USD854074S1 (en) | 2016-05-10 | 2019-07-16 | Udisense Inc. | Wall-assisted floor-mount for a monitoring camera |
US10664750B2 (en) | 2016-08-10 | 2020-05-26 | Google Llc | Deep machine learning to predict and prevent adverse conditions at structural assets |
US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
US10346762B2 (en) | 2016-12-21 | 2019-07-09 | Ca, Inc. | Collaborative data analytics application |
US10409367B2 (en) * | 2016-12-21 | 2019-09-10 | Ca, Inc. | Predictive graph selection |
WO2018130687A1 (en) * | 2017-01-13 | 2018-07-19 | Deutsche Telekom Ag | Method for an enhanced and user-oriented information search and information gathering, system, program and computer program product |
KR102572811B1 (en) * | 2017-02-09 | 2023-09-07 | 랭 오록 오스트레일리아 피티와이 엘티디 | System for identifying defined objects |
US10453165B1 (en) * | 2017-02-27 | 2019-10-22 | Amazon Technologies, Inc. | Computer vision machine learning model execution service |
JP6836068B2 (en) * | 2017-03-24 | 2021-02-24 | 富士通株式会社 | Learning method, learning device, learning program, search method, search device and search program |
US20190019107A1 (en) * | 2017-07-12 | 2019-01-17 | Samsung Electronics Co., Ltd. | Method of machine learning by remote storage device and remote storage device employing method of machine learning |
US11144896B1 (en) * | 2017-07-14 | 2021-10-12 | Giorgio Salvatore Frondoni | Image appliance vehicle toll transaction system and method for identifying a vehicle at an electronic toll for electronic toll collection |
US11467574B2 (en) * | 2017-07-20 | 2022-10-11 | Nuro, Inc. | Infrastructure monitoring system on autonomous vehicles |
USD855684S1 (en) | 2017-08-06 | 2019-08-06 | Udisense Inc. | Wall mount for a monitoring camera |
US10474926B1 (en) * | 2017-11-16 | 2019-11-12 | Amazon Technologies, Inc. | Generating artificial intelligence image processing services |
US10874332B2 (en) | 2017-11-22 | 2020-12-29 | Udisense Inc. | Respiration monitor |
US11551064B2 (en) | 2018-02-08 | 2023-01-10 | Western Digital Technologies, Inc. | Systolic neural network engine capable of forward propagation |
US11494582B2 (en) | 2018-02-08 | 2022-11-08 | Western Digital Technologies, Inc. | Configurable neural network engine of tensor arrays and memory cells |
US20210248514A1 (en) * | 2018-05-06 | 2021-08-12 | Strong Force TX Portfolio 2018, LLC | Artificial intelligence selection and configuration |
US11294949B2 (en) | 2018-09-04 | 2022-04-05 | Toyota Connected North America, Inc. | Systems and methods for querying a distributed inventory of visual data |
CN109345510A (en) * | 2018-09-07 | 2019-02-15 | 百度在线网络技术(北京)有限公司 | Object detecting method, device, equipment, storage medium and vehicle |
US20200134476A1 (en) * | 2018-10-24 | 2020-04-30 | International Business Machines Corporation | Generating code performance hints using source code coverage analytics, inspection, and unstructured programming documents |
CN109948414A (en) * | 2018-12-29 | 2019-06-28 | 中国科学院遥感与数字地球研究所 | Electric power corridor scene classification method based on LiDAR point cloud feature |
USD900430S1 (en) | 2019-01-28 | 2020-11-03 | Udisense Inc. | Swaddle blanket |
USD900429S1 (en) | 2019-01-28 | 2020-11-03 | Udisense Inc. | Swaddle band with decorative pattern |
USD900428S1 (en) | 2019-01-28 | 2020-11-03 | Udisense Inc. | Swaddle band |
USD900431S1 (en) | 2019-01-28 | 2020-11-03 | Udisense Inc. | Swaddle blanket with decorative pattern |
JP7258604B2 (en) * | 2019-03-05 | 2023-04-17 | キヤノン株式会社 | Image processing method, image processing device, program, and method for manufacturing learned model |
US10929058B2 (en) | 2019-03-25 | 2021-02-23 | Western Digital Technologies, Inc. | Enhanced memory device architecture for machine learning |
US11783176B2 (en) | 2019-03-25 | 2023-10-10 | Western Digital Technologies, Inc. | Enhanced storage device memory architecture for machine learning |
US11113037B2 (en) * | 2019-11-08 | 2021-09-07 | International Business Machines Corporation | Software performance modification |
US11182644B2 (en) * | 2019-12-23 | 2021-11-23 | Beijing Institute Of Technology | Method and apparatus for pose planar constraining on the basis of planar feature extraction |
US11797598B2 (en) * | 2020-10-30 | 2023-10-24 | Sitecore Corporation A/S | System and method to automatically create, assemble and optimize content into personalized experiences |
US11516311B2 (en) * | 2021-01-22 | 2022-11-29 | Avago Technologies International Sales Pte. Limited | Distributed machine-learning resource sharing and request routing |
EP4102404B1 (en) * | 2021-06-08 | 2023-12-20 | Tata Consultancy Services Limited | System and method for dynamically generating composable workflow for machine vision application-based environments |
US11798247B2 (en) | 2021-10-27 | 2023-10-24 | Meta Platforms Technologies, Llc | Virtual object structures and interrelationships |
WO2023072389A1 (en) * | 2021-10-27 | 2023-05-04 | Telefonaktiebolaget Lm Ericsson (Publ) | Controlling execution of a perception algorithm |
US11748944B2 (en) * | 2021-10-27 | 2023-09-05 | Meta Platforms Technologies, Llc | Virtual object structures and interrelationships |
CN114116520B (en) * | 2021-12-08 | 2023-05-26 | 抖音视界有限公司 | Algorithm evaluation method, device, gateway and storage medium |
US20240061693A1 (en) * | 2022-08-17 | 2024-02-22 | Sony Interactive Entertainment Inc. | Game platform feature discovery |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20010037324A1 (en) * | 1997-06-24 | 2001-11-01 | International Business Machines Corporation | Multilevel taxonomy based on features derived from training documents classification using fisher values as discrimination values |
US20120102050A1 (en) * | 2009-07-01 | 2012-04-26 | Simon James Button | Systems And Methods For Determining Information And Knowledge Relevancy, Relevent Knowledge Discovery And Interactions, And Knowledge Creation |
US8965104B1 (en) * | 2012-02-10 | 2015-02-24 | Google Inc. | Machine vision calibration with cloud computing systems |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7885906B2 (en) * | 2007-08-20 | 2011-02-08 | Raytheon Company | Problem solving system and method |
JP5321569B2 (en) * | 2010-12-02 | 2013-10-23 | コニカミノルタ株式会社 | Image processing system, image processing method, image processing server, image forming apparatus, and image processing program |
US8429103B1 (en) | 2012-06-22 | 2013-04-23 | Google Inc. | Native machine learning service for user adaptation on a mobile platform |
- 2014-03-14: US 14/212,237 granted as US9152870B2 (not active; Expired - Fee Related)
- 2015-09-09: US 14/849,423 granted as US9466013B2 (active)
- 2016-10-05: US 15/285,679 published as US20170091590A1 (not active; Abandoned)
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160378685A1 (en) * | 2015-06-27 | 2016-12-29 | Mcafee, Inc. | Virtualized trusted storage |
US10162767B2 (en) * | 2015-06-27 | 2018-12-25 | Mcafee, Llc | Virtualized trusted storage |
US10579544B2 (en) | 2015-06-27 | 2020-03-03 | Mcafee, Llc | Virtualized trusted storage |
US11120150B2 (en) | 2018-02-13 | 2021-09-14 | International Business Machines Corporation | Dynamic access control for knowledge graph |
CN109961045A (en) * | 2019-03-25 | 2019-07-02 | Lenovo (Beijing) Co., Ltd. | Location information prompt method, device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
US20140270494A1 (en) | 2014-09-18 |
US9152870B2 (en) | 2015-10-06 |
US20160004936A1 (en) | 2016-01-07 |
US9466013B2 (en) | 2016-10-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9466013B2 (en) | 2016-10-11 | Computer vision as a service |
US11205100B2 (en) | | Edge-based adaptive machine learning for object recognition |
US10846534B1 (en) | | Systems and methods for augmented reality navigation |
WO2019001481A1 (en) | | Vehicle appearance feature identification and vehicle search method and apparatus, storage medium, and electronic device |
CN108280477B (en) | | Method and apparatus for clustering images |
US20220147743A1 (en) | | Scalable semantic image retrieval with deep template matching |
Tran et al. | | On-device scalable image-based localization via prioritized cascade search and fast one-many RANSAC |
Zhang et al. | | Cognitive template-clustering improved LINEMOD for efficient multi-object pose estimation |
Lu et al. | | Indoor localization via multi-view images and videos |
Lu et al. | | Image-based indoor localization system based on 3D SfM model |
Li et al. | | DEPT: depth estimation by parameter transfer for single still images |
Zhang et al. | | A deep neural network-based vehicle re-identification method for bridge load monitoring |
Krylov et al. | | Object geolocation from crowdsourced street level imagery |
Kargah-Ostadi et al. | | Automated real-time roadway asset inventory using artificial intelligence |
CN113781493A (en) | | Image processing method, image processing apparatus, electronic device, medium, and computer program product |
Keyvanfar et al. | | Performance comparison analysis of 3D reconstruction modeling software in construction site visualization and mapping |
Zhai et al. | | JD-SLAM: joint camera pose estimation and moving object segmentation for simultaneous localization and mapping in dynamic scenes |
San Blas et al. | | A platform for swimming pool detection and legal verification using a multi-agent system and remote image sensing |
Sommer et al. | | Systematic evaluation of deep learning based detection frameworks for aerial imagery |
Chu et al. | | A news picture geo-localization pipeline based on deep learning and street view images |
Sun et al. | | Convolutional neural network-based coarse initial position estimation of a monocular camera in large-scale 3D light detection and ranging maps |
Venable | | Improving real-world performance of vision aided navigation in a flight environment |
Strotov et al. | | High-performance technology for indexing of high volumes of Earth remote sensing data |
Porzi et al. | | An automatic image-to-DEM alignment approach for annotating mountains pictures on a smartphone |
Ma et al. | | Multi-source fusion based geo-tagging for web images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: SRI INTERNATIONAL, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SAWHNEY, HARPEET SINGH;ELEDATH, JAYAKRISHAN;ALI, SAAD;AND OTHERS;SIGNING DATES FROM 20140314 TO 20160128;REEL/FRAME:039943/0773 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |