US20120296866A1 - System and method for implementing on demand cloud database

Publication number
US20120296866A1
Authority
US
United States
Prior art keywords
cloud database
database nodes
nodes
cloud
requests
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/562,374
Inventor
Shyam Kumar Doddavula
Abhishek Pratap Singh
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Infosys Ltd
Original Assignee
Infosys Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Infosys Ltd
Priority to US13/562,374
Publication of US20120296866A1
Assigned to Infosys Limited: change of name (see document for details). Assignor: Infosys Technologies Limited

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5019 Workload prediction

Definitions

  • the present invention relates to on-demand databases and, more specifically, to a system and method for implementing a cloud database that can be provisioned or removed on demand.
  • RDBMS Relational Database Management Systems
  • SQL Structured Query Language
  • a method for dynamic management of one or more cloud database nodes comprises, firstly, gathering information related to usage of one or more cloud database nodes. Secondly, the method comprises comparing time required by the one or more cloud database nodes for responding to one or more requests with a predetermined threshold. The method further comprises provisioning one or more new cloud database nodes or removing one or more new cloud database nodes based on at least one of: the gathered information, the comparison and a combination thereof.
  • gathering information related to usage of one or more cloud database nodes comprises gathering information on current usage of one or more cloud database nodes and gathering information on future usage of one or more cloud database nodes.
  • the current usage information comprises at least one of number of entities stored in the one or more cloud database nodes and number of requests received at the one or more cloud database nodes.
  • the future usage information comprises at least one of number of entities expected to be stored in the one or more cloud database nodes and number of requests expected to be received at the one or more cloud database nodes.
  • comparing time required by the cloud database nodes for responding to one or more requests with a predetermined threshold comprises determining if time required for responding to one or more requests by the cloud database nodes is less than a predetermined threshold.
  • provisioning one or more cloud database nodes based on the gathered information, comparison and combination thereof comprises adding one or more cloud database node instances using a cloud provider application API.
  • comparing time required by the cloud database nodes for responding to one or more requests with a predetermined threshold comprises determining if time required for responding to one or more requests by the cloud database nodes is more than the predetermined threshold.
  • removing one or more cloud database nodes based on the gathered information, comparison and combination thereof comprises deleting one or more cloud database node instances using a cloud provider application API.
  • comparing time required by the cloud database nodes for responding to one or more requests with a predetermined threshold comprises determining if time required for responding to one or more requests by the cloud database nodes is equal to the predetermined threshold.
  • the predetermined threshold comprises at least one of: response time threshold, throughput threshold and resource utilization threshold.
  • the one or more cloud database nodes comprise cloud database node instances and cloud database server instances.
  • the method further comprises updating a routing module with information related to the provisioning or removal of one or more cloud database nodes.
  • the method further comprises repartitioning entities across provisioned and existing cloud database node instances.
  • the repartitioning ascertains even distribution of entities across the node instances.
  • the method further comprises replicating entities across provisioned and existing cloud database node instances.
  • the method further comprises versioning entities across provisioned and existing cloud database node instances.
  • the method comprises, firstly, receiving one or more requests by a cloud database client. Secondly, the method comprises converting the one or more requests to a query language format. The method further comprises locating one or more cloud database server instances. Furthermore, the method comprises sending the converted data to at least one of the located cloud database server instances. Further, the method comprises processing the one or more requests converted to the query language format received at the cloud database server instance. The method further comprises sending the one or more requests to one or more cloud database node instances and processing one or more results received from the one or more cloud database node instances. Finally, the method comprises sending the one or more results to the cloud database client.
  • converting the one or more requests to a query language format comprises capturing the one or more requests in a query object.
  • locating one or more cloud database server instances comprises locating at least one of: existing cloud database server instances and provisioned cloud database server instances.
  • sending the converted data to at least one of the located cloud database server instances comprises serializing the converted data into a sequence of bits.
  • processing the one or more requests converted in the query language format received at the cloud database server instance comprises deserializing the sequence of bits to obtain the converted data, interpreting the one or more requests converted to the query language format and determining one or more cloud database node instances for processing the requests based on the interpretation.
  • the one or more cloud database node instances comprise provisioned or existing nodes.
  • sending the one or more requests to one or more cloud database node instances comprises at least one of: routing the one or more requests to a corresponding cloud database node instance and routing the one or more requests to one or more cloud database node instances via one or more other cloud database server instances.
  • processing results received from the one or more cloud database server instances comprises aggregating results corresponding to the one or more requests received from the one or more cloud database server instances.
  • a system for dynamic management of one or more cloud database nodes comprises a cloud database management console configured to facilitate a user to at least in part gather information related to usage of one or more cloud database node instances and to monitor response time of services provided by the one or more cloud database node instances.
  • the system further comprises a cloud infrastructure module configured to facilitate provisioning or removal of one or more cloud database node instances using the cloud database management console.
  • the cloud database management console comprises a web based user interface configured to receive one or more requests.
  • the cloud database management console comprises one or more software modules configured to facilitate addition and/or deletion of one or more cloud database node instances using a cloud provider Application Programming Interface (API).
  • the system comprises a cloud database client configured to receive and convert one or more requests to a query language format.
  • the system further comprises a cloud database server configured to receive the query language format from the cloud database client and process the query language format.
  • the system comprises one or more cloud database node instances configured to provide one or more results corresponding to the one or more requests via the cloud database server and/or other cloud database server instances.
  • the cloud database client comprises a query store client that receives and converts one or more requests to a query language format using a Java Persistence API (Application Programming Interface) Adaptor.
  • the cloud database client comprises a Serialization De-Serialization module configured to translate the converted data into a sequence of bits and send the sequence of bits to the cloud database server over a network.
  • the cloud database client comprises an Inconsistency Resolver configured to resolve inconsistencies in one or more results received from the cloud database server.
  • the Inconsistency Resolver comprises a Resolve by Identification (Id) module configured to resolve inconsistencies in one or more results by updating version identification numbers associated with entities stored in the one or more cloud database node instances.
  • the Inconsistency Resolver comprises a Resolve by Time module configured to resolve inconsistencies in one or more results by identifying latest version of entities stored in the one or more cloud database node instances based on modified time information associated with the entities.
  • the cloud database server is configured to send the one or more requests to one or more other cloud database server instances.
  • the cloud database server comprises a Query Processor configured to interpret the one or more requests converted to the query language format.
  • the Query Processor comprises Query by Identification module configured to interpret the one or more requests when identity of entities associated with the requests is predetermined.
  • the Query Processor comprises Query by Criteria module configured to interpret the one or more requests based on selection criteria created using one or more characteristics of the one or more requests.
  • the cloud database server comprises a routing module configured to repartition entities across one or more of the existing and provisioned cloud database node instances.
  • the cloud database server comprises a routing module configured to replicate entities across one or more of the existing and provisioned cloud database node instances.
  • a computer program product for dynamic management of one or more cloud database nodes comprises program instruction means for gathering information related to usage of one or more cloud database nodes.
  • the computer program product further comprises program instruction means for comparing time required by the one or more cloud database nodes for responding to one or more requests with a predetermined threshold.
  • the computer program product comprises program instruction means for provisioning one or more new cloud database nodes or removing one or more new cloud database nodes based on at least one of: the gathered information, the comparison and a combination thereof.
  • the computer program product comprises program instruction means for receiving one or more requests by a cloud database client.
  • the computer program product further comprises program instruction means for converting the one or more requests to a query language format.
  • the computer program product comprises program instruction means for locating one or more cloud database server instances.
  • the computer program product further comprises program instruction means for sending the converted data to at least one of the located cloud database server instances.
  • the computer program product comprises program instruction means for processing the one or more requests converted to the query language format received at the cloud database server instance.
  • the computer program product comprises program instruction means for sending the one or more requests to one or more cloud database node instances.
  • the computer program product comprises program instruction means for processing one or more results received from the one or more cloud database node instances and program instruction means for sending the one or more results to the cloud database client.
  • a system and method for implementing on demand cloud database is provided.
  • the invention provides for provisioning or removing various cloud database nodes based on magnitude of data.
  • the invention facilitates adding cloud database nodes when space is required for storing and quickly retrieving large amounts of data.
  • the invention facilitates distributing data across the various cloud database nodes and also replicating data across various cloud database nodes.
  • the invention also facilitates caching of data efficiently which in turn reduces latency in service response time.
  • FIG. 1 illustrates an architectural diagram of a cloud database solution in accordance with an embodiment of the present invention.
  • the cloud database solution architecture 100 comprises a cloud database management console 102 , a cloud infrastructure module 106 , a cloud database client 108 , one or more cloud database servers 110 , 112 and one or more cloud database nodes 114 , 116 , 118 , 120 .
  • the cloud database servers 110 , 112 and cloud database nodes 114 , 116 , 118 , 120 are served by the cloud infrastructure module 106 .
  • the cloud database management console 102 further comprises a cloud controller 104 .
  • the cloud database management console 102 provides a web based user interface for provisioning or removal of new instances of cloud database nodes and for expanding the storage capacity.
  • a graphical user interface is provided to enable a user to add cloud database node instances to an existing cloud database cluster or delete cloud database node instances from the existing cloud database cluster.
  • the cloud database management console 102 comprises a cloud controller 104 which is used to monitor consumption and availability of cloud database resources.
  • the cloud controller 104 is used to monitor response time of services provided by the cloud database resources.
  • the cloud database management console 102 facilitates handling on-time warnings and accurate capacity management of the cloud database cluster.
  • the cloud infrastructure module 106 comprises various software modules that facilitate provisioning or removal of new cloud database node instances which is carried out via the cloud database management console 102 .
  • the cloud infrastructure module 106 facilitates addition, deletion, initialization, and management of new cloud database node instances.
  • addition modules are used to append various cloud database nodes as and when need arises.
  • initialization modules are used to initialize various cloud database nodes whenever there is a need to store data after addition of nodes.
  • the deletion modules are used to delete various cloud database nodes as and when need arises.
  • the software modules form a part of a cloud provider Application Programming Interface (API).
  • the cloud provider API provides an abstraction over the cloud providers' internal APIs and uses the software modules for implementing infrastructure for the cloud database solution architecture 100.
  • the various software modules may be implemented by performing virtualization of physical hardware of the cloud database solution architecture 100 .
  • an open source cloud computing framework such as Eucalyptus may be used for performing virtualization of the physical hardware, employing the Xen virtualization platform.
  • the cloud database client 108 provides a standardized interface such as Java Persistence API (JPA) or Structured Query Language (SQL) interface to the cloud database.
  • the cloud database client 108 provides an interface for receiving requests to store entities in the cloud database nodes.
  • the cloud database client 108 provides an interface to update and delete entities stored in the cloud database nodes.
  • the cloud database client 108 provides an interface to retrieve entities stored in the cloud database nodes.
  • the cloud database client 108 converts the request to a standard query language format and sends it to the cloud database server 110 , 112 over a network.
  • the cloud database server 110 , 112 processes the request and interacts with the cloud database nodes to obtain results corresponding to the request.
  • the cloud database server 110 , 112 provides a mechanism for high availability and high performance of the cloud database nodes by replicating data across the various cloud database nodes.
  • the cloud database server 110, 112 further provides a mechanism for data distribution across various cloud database nodes, and for handling data storage in and data retrieval from the cloud database nodes.
  • the cloud database node 114 , 116 , 118 , 120 is a distributed node that provides underlying storage along with a processing engine to handle data storage and retrieval requests at the node level.
  • the nodes include a virtual machine with an appropriate operating system.
  • the nodes also include a software image along with software stack to enable virtual appliances that can be instantiated and can be turned off/shutdown on demand.
  • the cloud database nodes use a query engine (not shown), such as an off-the-shelf JPA implementation, to process various requests and provide corresponding results.
  • the query engine (not shown) is configured with an object-relational mapping module (not shown) to process various requests and provide corresponding results.
  • the open source Hibernate framework may be used as a query engine, which uses configuration files to map objects to database tables.
  • the cloud database node 114 , 116 , 118 , 120 also updates versioning information related to replicated data across the various cloud database nodes 114 , 116 , 118 , 120 .
  • FIG. 2 illustrates a detailed block diagram of the cloud database client in accordance with an embodiment of the present invention.
  • the cloud database client 202 comprises Query Store Client 204 , Serialization De-Serialization module 206 , and Entity Helper 208 .
  • the Query Store Client 204 further comprises a JPA Adaptor 210 and Inconsistency Resolver 212 .
  • the Inconsistency Resolver 212 further comprises a Resolve by Identification (Id) module 214 and Resolve by Time module 216 .
  • the Query Store Client 204 comprises a Java Persistence API (JPA) Adaptor 210 that provides a JPA query interface for processing request to store, retrieve, update and delete entities.
  • JPA query interface details are available as part of Java Specification Request (JSR).
  • the Query Store Client 204 receives requests from a user to store, update, delete or retrieve entities stored in various cloud database nodes and captures the request in a query object.
  • the Serialization De-Serialization module 206 serializes the query object for sending to a cloud database server over a network.
  • serialization is the process of translating an object into a sequence of bits so that it can be stored on a storage medium or transmitted across a network link to be used in another networking environment.
  • the Serialization De-Serialization module 206 enables distributed deployment of various modules of the cloud database client 202 .
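The round trip from query object to byte sequence and back can be sketched as follows. This is a minimal illustration only: the `QueryObject` class, its field names and the use of JSON as the wire format are assumptions made for the example and are not taken from the patent, which does not specify a serialization format.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import Optional


@dataclass
class QueryObject:
    """Hypothetical query object; the field names are illustrative only."""
    operation: str                      # "store", "retrieve", "update" or "delete"
    entity_type: str
    entity_id: Optional[str] = None
    payload: dict = field(default_factory=dict)


def serialize(query: QueryObject) -> bytes:
    """Translate the query object into a byte sequence for the network."""
    return json.dumps(asdict(query), sort_keys=True).encode("utf-8")


def deserialize(data: bytes) -> QueryObject:
    """Rebuild the query object on the receiving side."""
    return QueryObject(**json.loads(data.decode("utf-8")))


request = QueryObject("store", "Customer", "c-42", {"name": "Asha"})
wire_bytes = serialize(request)      # what the client would send
restored = deserialize(wire_bytes)   # what the server would reconstruct
```

Because both sides agree only on the byte format, the client and server modules can be deployed on separate machines, which is the property the description attributes to the Serialization De-Serialization module.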
  • the Inconsistency Resolver 212 within the Query Store Client 204 resolves inconsistencies in results retrieved from the cloud database server.
  • different versions of a particular entity may be provided by different cloud database nodes, and each of the versions is associated with a version identification number.
  • the Resolve by Id module 214 uses the version identification numbers to identify the latest version of the entity and replaces the copy on every cloud database node that holds an older version.
  • the Resolve by Id module 214, therefore, enables resolving inconsistencies by updating the version identification number associated with the cloud database.
  • the Resolve by Time module 216 uses last modified time information stored along with an entity to identify latest version when multiple inconsistent copies of an entity are stored among different cloud database nodes.
  • the Inconsistency Resolver 212 facilitates integrity and consistency of entities.
  • the Entity Helper 208 provides all configuration and additional information about an entity which otherwise cannot be stored in a primary storage repository of the entity.
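The two resolution strategies of the Inconsistency Resolver can be sketched as below. The `EntityVersion` record, the function names and the sample node labels are hypothetical; the sketch assumes each returned copy carries both a version identification number and a last-modified timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityVersion:
    """One copy of an entity as returned by a node (illustrative fields)."""
    node: str            # node that returned this copy
    version_id: int      # version identification number
    modified_at: float   # last-modified time, seconds since the epoch
    value: dict


def resolve_by_id(copies):
    """Pick the copy with the highest version identification number."""
    return max(copies, key=lambda c: c.version_id)


def resolve_by_time(copies):
    """Pick the copy with the most recent last-modified time."""
    return max(copies, key=lambda c: c.modified_at)


def stale_nodes(copies, latest):
    """Nodes whose copy is older than the chosen latest version."""
    return sorted(c.node for c in copies if c.version_id < latest.version_id)


copies = [
    EntityVersion("node-114", 3, 1000.0, {"name": "Asha"}),
    EntityVersion("node-116", 5, 1200.0, {"name": "Asha K"}),
    EntityVersion("node-118", 4, 1100.0, {"name": "Asha K."}),
]
latest = resolve_by_id(copies)
to_update = stale_nodes(copies, latest)   # nodes that need the latest copy
```

In this sample both strategies agree on the copy from `node-116`; they can disagree when clocks drift between nodes, which is one reason a deployment might prefer version numbers over timestamps.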
  • FIG. 3 illustrates a detailed block diagram of a cloud database server in accordance with an embodiment of the present invention.
  • the cloud database server 302 comprises Serialization De-Serialization module 304, Query Processor 306, and Entity Helper 308.
  • the Query Processor 306 further comprises a Query by Identification (Id) module 310, Query by Criteria module 312, Routing Module 314, Read Repair Module 316 and Failover Module 318.
  • the Serialization De-Serialization module 304 receives requests over the network from the cloud database client.
  • the requests are in a serialized format, which is deserialized for further processing. Further, the Serialization De-Serialization module 304 serializes result data corresponding to the request and sends it to the cloud database client.
  • the Query Processor 306 interprets the requests received and processes them in accordance with an embodiment of the present invention.
  • the requests provided by the users may be processed by the Query by Id module 310 .
  • Query by Id module 310 is used when identity of an entity associated with the query is known.
  • the queries provided by the users may be processed by the Query by Criteria module 312 .
  • Query by Criteria module 312 is used when the identity of an entity associated with the query is not known and only a few characteristics of the entity are known. Selection criteria are created from these characteristics, and the Query by Criteria module 312 operates on the group of entities selected by those criteria to identify and retrieve the requested entity.
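The difference between the two query modes can be illustrated with a small in-memory store. The function names, the dictionary-based store and the sample entities are assumptions for the example; a real node would delegate to its query engine.

```python
def query_by_id(store, entity_id):
    """Direct lookup when the entity's identity is known."""
    return store.get(entity_id)


def query_by_criteria(store, **criteria):
    """Scan for entities whose attributes match every given criterion."""
    return [
        entity
        for entity in store.values()
        if all(entity.get(name) == value for name, value in criteria.items())
    ]


# Hypothetical entities held by a single node, keyed by primary key.
store = {
    "c-1": {"id": "c-1", "name": "Asha", "city": "Pune"},
    "c-2": {"id": "c-2", "name": "Ravi", "city": "Mysore"},
    "c-3": {"id": "c-3", "name": "Meera", "city": "Pune"},
}
by_id = query_by_id(store, "c-2")
by_city = query_by_criteria(store, city="Pune")
```

A query by Id can be routed to a single node, whereas a query by criteria generally has to be evaluated on every node that may hold matching entities.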
  • the requests forwarded by the users are received by the Query Processor 306, which interprets them and provides them to a corresponding cloud database node or to various other cloud database node instances via corresponding cloud database server instances.
  • Routing Module 314 partitions entities across one or more instances of cloud database server nodes (existing and provisioned nodes) and routes requests to the appropriate cloud database server node. This reduces amount of data stored on a particular node and further reduces time required to process requests and provide results.
  • the first cloud database server node that receives the request routes the request to various other cloud database server nodes.
  • the first cloud database server node receives and aggregates the results received from the various cloud database server node instances.
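The fan-out and aggregation performed by the first server node can be sketched as a scatter-gather step. Modelling each server as a plain callable, and the names `scatter_gather`, `make_server` and the sample shards, are assumptions for illustration; the patent does not describe the transport or concurrency model.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(servers, request):
    """Send the request to every server and merge the partial results."""
    if not servers:
        return []
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        # pool.map preserves the order of `servers` in its results.
        partials = list(pool.map(lambda server: server(request), servers))
    return [row for partial in partials for row in partial]


def make_server(rows):
    """Hypothetical server stub that filters its local rows by city."""
    return lambda request: [r for r in rows if r["city"] == request["city"]]


servers = [
    make_server([{"id": "c-1", "city": "Pune"}, {"id": "c-2", "city": "Mysore"}]),
    make_server([{"id": "c-3", "city": "Pune"}]),
]
results = scatter_gather(servers, {"city": "Pune"})
```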
  • the Routing Module 314 may replicate an entity across multiple cloud database node instances to ensure that related entities are stored, thus, providing high availability of entities.
  • the Routing module 314 maintains data store constraint checks that may be executed to ensure consistency of relationships across the entities stored in the cloud database nodes.
  • a pluggable replication algorithm may be used for replicating the entities.
  • the entities may be versioned using known versioning techniques which in turn provide consistency and availability of entities.
  • the Read Repair Module 316 performs repairs on versioned entities when it finds inconsistencies in the replicated entities retrieved from the cloud database server instances.
  • the Read Repair Module 316 is used to update cloud database nodes which have outdated information on an entity.
  • the Failover Module 318 handles scenarios in which the cloud database server is turned off, shut down or restored while the query object is being processed.
  • the Entity Helper 308 is used to maintain information about the entity relationships so that related entities are routed to the same cloud database server node instances.
  • the Entity Helper 308 provides information about entity types that are related to each other and entities that are used by the Routing Module 314 to ensure that related entities are stored in same cloud database node.
  • FIG. 4 is a flowchart illustrating a method for dynamic management of one or more cloud database nodes and routing of requests to the one or more cloud database nodes, in accordance with an embodiment of the present invention.
  • information on current usage of one or more cloud database nodes is gathered.
  • information on current usage of the one or more cloud database nodes includes the number of entities stored in the one or more cloud databases.
  • the information on current usage of the one or more cloud database nodes also includes the number of requests received at the one or more cloud database nodes.
  • the requests may include requests to store entities and requests to retrieve entities.
  • information on future usage of the one or more cloud database nodes is gathered.
  • information on future usage of the one or more cloud database nodes includes a forecast of the number of entities that can be stored in the one or more cloud databases.
  • information on future usage of the one or more cloud database nodes also includes a forecast of the number of requests that are expected to be received at the one or more cloud database nodes.
  • the forecasting may be based on past information of number of entities stored, requests to retrieve or store entities. Forecasting techniques such as ARIMA (Auto-Regressive Integrated Moving Average) may be used to perform the predictions based on the past information.
  • forecasting and prediction enables optimizing distribution of entities across the different cloud database nodes based on considerations such as performance of retrieval, cost of storage etc. depending on forecasted usage.
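As a rough illustration of forecasting future usage from past observations, the sketch below fits a least-squares linear trend to a request-count history and extrapolates it. This is deliberately not ARIMA: a linear trend is swapped in as a dependency-free stand-in, and the function name and sample figures are invented for the example. A production system following the description would more likely use an ARIMA model from a statistics library.

```python
def forecast_linear(history, steps_ahead=1):
    """Extrapolate a least-squares straight line fitted to `history`.

    `history` holds one observation per equally spaced time step.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


# Hypothetical hourly request counts observed at the cluster.
requests_per_hour = [100, 120, 140, 160, 180]
next_hour = forecast_linear(requests_per_hour)
in_three_hours = forecast_linear(requests_per_hour, steps_ahead=3)
```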
  • a check is performed to determine if the time required for servicing one or more requests by the one or more cloud databases is less than or more than a predetermined threshold.
  • the predetermined threshold includes response time threshold.
  • the predetermined threshold includes throughput threshold.
  • the predetermined threshold includes resource utilization threshold.
  • techniques such as queuing theory may be used to determine the amount of time that is taken by the cloud databases to service one or more requests for storage or retrieval of entities.
  • queuing theory facilitates predicting computing resource requirements for different storage transactions. This information is used to optimize the allocation of computing resources to different cloud database nodes and the allocation of workloads to the different cloud database nodes.
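One simple way to apply queuing theory here is to treat each node as an M/M/1 queue, whose mean response time is 1 / (service rate - arrival rate). The sketch below uses that formula to estimate how many nodes keep the response time under a threshold. The M/M/1 model, the even split of load across nodes and the sample rates are all assumptions; the patent names queuing theory but does not say which model is used.

```python
import math


def mm1_response_time(arrival_rate, service_rate):
    """Mean time in system for an M/M/1 queue (infinite if overloaded)."""
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


def nodes_needed(total_arrival_rate, service_rate, max_response_time):
    """Smallest node count that meets the response-time threshold.

    Assumes load is split evenly and each node behaves as an M/M/1 queue:
    1 / (mu - lam / n) <= T  is equivalent to  lam / n <= mu - 1 / T.
    """
    per_node_capacity = service_rate - 1.0 / max_response_time
    if per_node_capacity <= 0:
        raise ValueError("threshold is shorter than the mean service time")
    return max(1, math.ceil(total_arrival_rate / per_node_capacity))


# Hypothetical figures: 450 requests/s overall, 100 requests/s per node,
# and a 50 ms response-time threshold.
n = nodes_needed(450.0, 100.0, 0.05)
per_node_latency = mm1_response_time(450.0 / n, 100.0)
```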
  • one or more new instances of cloud database nodes and cloud database servers are provisioned or removed using a cloud provider application programming interface (API).
  • the cloud provider application API uses the information related to number of entities that can be stored in the one or more cloud databases to provision the one or more new instances of cloud database nodes.
  • the cloud provider application API uses the number of requests for storage or retrieval of entities that are expected to be received at the one or more cloud database to provision the one or more new instances of cloud database nodes.
  • the cloud provider application API therefore facilitates addition and removal of cloud database nodes dynamically using the abovementioned information. Further, dynamic distribution of entities and workloads to different cloud database nodes enables optimizations in computing resource allocation and also provides ability to scale dynamically depending on usage of the cloud database nodes.
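The provisioning decision described above can be sketched as a function that combines the measured response time with forecast load. The patent does not state which side of the threshold triggers which action, so this sketch assumes nodes are added when response time exceeds the threshold or forecast load exceeds capacity, and removed only when response time is well below the threshold. `UsageSnapshot`, `scaling_decision`, `capacity_per_node` and `scale_in_margin` are hypothetical names; the returned delta would be handed to whatever cloud provider API is in use.

```python
import math
from dataclasses import dataclass


@dataclass
class UsageSnapshot:
    """Illustrative view of gathered usage information."""
    node_count: int
    avg_response_ms: float
    forecast_requests_per_s: float


def scaling_decision(usage, response_threshold_ms, capacity_per_node,
                     scale_in_margin=0.5):
    """Return nodes to add (positive), remove (negative) or 0 to hold."""
    needed = max(1, math.ceil(usage.forecast_requests_per_s / capacity_per_node))
    too_slow = usage.avg_response_ms > response_threshold_ms
    if too_slow or needed > usage.node_count:
        # Add at least one node when the threshold is breached.
        return max(needed, usage.node_count + 1) - usage.node_count
    comfortably_fast = usage.avg_response_ms < response_threshold_ms * scale_in_margin
    if comfortably_fast and needed < usage.node_count:
        return needed - usage.node_count
    return 0


scale_out = scaling_decision(UsageSnapshot(4, 250.0, 300.0), 200.0, 100.0)
scale_in = scaling_decision(UsageSnapshot(4, 50.0, 150.0), 200.0, 100.0)
hold = scaling_decision(UsageSnapshot(4, 150.0, 400.0), 200.0, 100.0)
```

The margin on the scale-in side is a design choice that avoids removing nodes the moment response time dips just under the threshold, which would otherwise cause the cluster to oscillate.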
  • routing module is updated with information related to the provisioning or removal of the one or more cloud database nodes.
  • addition and deletion of cloud database nodes at runtime facilitates dynamic routing. This enables the cloud database architecture to scale up and down easily which further enables cost optimizations as well as ability to meet service levels.
  • requests are routed to cloud database nodes via cloud database server instances based on the updated information.
  • entities stored in the one or more cloud database nodes are repartitioned into one or more segments and distributed by the routing module to existing and/or newly provisioned cloud database nodes.
  • the data is repartitioned appropriately to ascertain that the entities are evenly distributed across the various cloud database node instances.
  • consistent hashing of primary keys associated with the entities may be performed to evenly distribute entities across the various cloud database instances. Consistent hashing is a mechanism in which a hash of an entity's primary key is computed and used to identify the cloud database node to which the entity is routed.
  • the hash function is consistent, so any user who has the entity's primary key and the hash function can determine the cloud database node on which the entity with that primary key would be stored.
  • the routing module, therefore, need not maintain a mapping of each entity to its corresponding cloud database node.
  • information related to entity relationships is maintained so that related entities are routed to the same cloud database node instances.
  • one or more entities may be replicated across all the cloud database node instances in order to maintain consistency of the entity relationships in the cloud database node instances. For example, when there are two entity types that are related to each other, information regarding which entity's primary key should be used and how to retrieve the primary key of each of the entities is provided as input to the consistent hashing mechanism. This will result in both the entities getting routed to the same cloud database node. Further, the requests related to storage or retrieval of entities are routed to the existing and/or newly provisioned cloud databases suitably.
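  • The key-based routing described above can be sketched with a small hash ring. This is an illustrative sketch only: the `HashRing` class, the node names, the use of MD5 with virtual nodes, and the `parent_id` field used to co-locate related entities are assumptions, not details taken from the specification.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring; node names are hypothetical."""

    def __init__(self, nodes, vnodes=64):
        self._ring = []        # sorted list of (hash, node) points
        self._vnodes = vnodes  # virtual nodes per physical node
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, routing_key: str) -> str:
        """Return the node owning the first ring point at or after the key hash."""
        if not self._ring:
            raise ValueError("no nodes in ring")
        idx = bisect.bisect_left(self._ring, (self._hash(routing_key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]


def routing_key(entity: dict) -> str:
    """Related entities route by the parent's primary key (assumed field)."""
    return str(entity.get("parent_id", entity["id"]))
```

  • Because the node is derived from the key alone, any caller holding the key and the ring membership can compute the target node without a per-entity mapping table, and adding a node moves only the keys that the new node takes over.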
  • FIG. 5 is a flowchart illustrating a method for interacting with cloud database implementation that supports standard application programming interface (API), in accordance with an embodiment of the present invention.
  • request data is received.
  • request data is received by a cloud database client.
  • the request data may include requests for storing entities in the cloud database, retrieving entities stored in the cloud database, updating entities stored in the cloud database, and deleting entities stored in the cloud database.
  • the request data is converted to a standard query language format.
  • the request data is converted to a standard query language format by a query interface which is a standard Application Programming Interface (API).
  • the standard API may include a Java Persistence API (JPA) query interface.
  • JPA query interface captures the request data in a query object.
  • Query object is a query language standard for object-oriented databases.
  • one or more cloud database server instances are located.
  • the cloud database client locates one or more cloud database server instances.
  • the one or more cloud database server instances include existing and newly provisioned cloud database server instances as explained in FIG. 4 .
  • static configuration mechanisms may be employed to locate the one or more cloud database server instances.
  • static configuration may be a property file having location information of the cloud database server instances which is used by the cloud database client to locate the cloud database server instances.
  • dynamic discovery mechanisms such as multi-cast may be employed to locate the one or more cloud database server instances.
  • dynamic discovery mechanism includes sending a multi-cast request on a pre-defined port.
  • One or more cloud database server instances listening on the multi-cast port respond with details of their location and the cloud database client locates the one or more cloud database server instances.
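  • The static configuration mechanism can be illustrated with a small property-file parser. This is a sketch under stated assumptions: the property name `clouddb.servers` and the `host:port` list format are hypothetical, and the multicast discovery path is not shown.

```python
def parse_server_locations(properties_text: str, key: str = "clouddb.servers"):
    """Return (host, port) pairs from a 'key=host:port,host:port' property.

    The property name and value format are assumptions for illustration.
    """
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments and lines without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            servers = []
            for item in value.split(","):
                host, port = item.strip().rsplit(":", 1)
                servers.append((host, int(port)))
            return servers
    return []
```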
  • the converted data is sent to one of the cloud database server instances.
  • the converted data is sent to at least one of the cloud database server instances over a network.
  • the converted data is serialized to translate the converted data into a sequence of bits so that the data can be stored in a file or a memory buffer or transmitted across a network in a computing environment.
  • query object is interpreted and processed.
  • the serialized data sent over the network is received by the cloud database server instance and further the serialized data is deserialized to obtain the converted data.
  • the converted data includes the query object.
  • the cloud database server instance interprets the query object which is further processed.
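  • The serialize, transmit and deserialize round trip for a query object can be sketched as below. The `QueryObject` fields and the JSON wire format are assumptions chosen for illustration; the specification does not state which encoding is used.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class QueryObject:
    """Hypothetical query object captured by the client."""
    operation: str    # "store", "retrieve", "update" or "delete"
    entity_type: str
    criteria: dict = field(default_factory=dict)


def serialize(query: QueryObject) -> bytes:
    """Client side: turn the query object into bytes for the network."""
    return json.dumps(asdict(query), sort_keys=True).encode("utf-8")


def deserialize(payload: bytes) -> QueryObject:
    """Server side: rebuild the query object from the received bytes."""
    return QueryObject(**json.loads(payload.decode("utf-8")))
```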
  • the query object is processed to determine one or more cloud database node instances where one or more entities can be stored or retrieved and route the request data to the appropriate cloud database node through corresponding cloud database server instances.
  • the cloud database server instances may include both existing and provisioned cloud database server instances (as explained in FIG. 4 ).
  • the cloud database server instance uses configuration information to replicate the one or more entities across one or more cloud database node instances.
  • the configuration information may include a property file that provides information regarding the number of times a particular entity has to be replicated among different cloud database nodes. This ascertains that the entity data is not lost even if one or more of the cloud database node instances does not function.
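  • One simple way to pick the nodes that hold the configured number of replicas is sketched below. Note that this uses plain modulo placement over a sorted node list rather than a hash ring, purely to keep the example short; the function name and the placement rule are assumptions.

```python
import hashlib


def replica_nodes(primary_key: str, nodes: list, replication_count: int) -> list:
    """Pick up to replication_count distinct nodes for an entity.

    Assumption: replicas go to consecutive nodes in sorted order,
    starting at a position derived from the primary key hash.
    """
    if not nodes:
        return []
    ordered = sorted(nodes)
    count = min(replication_count, len(ordered))
    digest = hashlib.sha256(primary_key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(ordered)
    return [ordered[(start + i) % len(ordered)] for i in range(count)]
```

  • With a replication count of 3, an entity stays readable as long as at least one of its three chosen nodes is still running.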
  • the cloud database server instance sends the query object to its corresponding cloud database node instance or to other cloud database node instances through corresponding cloud database server instances.
  • the cloud database node instances process the query and send the results of the query to the cloud database server instance.
  • the cloud database server instance receives results from various other cloud database server instances and then processes the results.
  • the cloud database server instance processes the results to amend problems which may arise due to inconsistent versions of entity data across various cloud database node instances.
  • the cloud database server instance also handles scenarios when the cloud database server is turned off/shut down and also when the cloud database server is restored.
  • the results are sent to the cloud database client.
  • the cloud database server instance sends the results to the cloud database client.
  • the cloud database server and the cloud database client resolve inconsistencies which may arise due to replicated entities across the one or more cloud database node instances using techniques such as, but not limited to, versioning.
  • the present invention may be implemented in numerous ways including as an apparatus, method, or a computer program product such as a computer readable storage medium or a computer network wherein programming instructions are communicated from a remote location.

Abstract

A method for dynamic management of one or more cloud database nodes is provided. The method enables gathering information related to usage of one or more cloud database nodes. The method further enables comparing time required by the one or more cloud database nodes for responding to one or more requests with a predetermined threshold. Furthermore, the method enables provisioning one or more new cloud database nodes or removing one or more new cloud database nodes based on at least one of: the gathered information, the comparison and a combination thereof.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application is a divisional of U.S. patent application Ser. No. 12/902,298 filed on 12 Oct. 2010 which claims priority under 35 U.S.C. 119 to Indian Patent Application Number 2386/CHE/2010 filed on 19 Aug. 2010, where the contents of said applications are herein incorporated by reference in their entirety.
  • FIELD OF THE INVENTION
  • The present invention relates to on demand databases and more specifically to a system and method for implementing a cloud database that can be provisioned or removed on demand.
  • BACKGROUND OF THE INVENTION
  • The advent and progression of information technology has resulted in proliferation of data and there is a continuous demand to store and retrieve content. Various applications such as user generated content in community portals, emails and other web applications, call log files generated at call centers, inventory management systems, Enterprise Resource Planning (ERP) systems etc. result in huge amounts of data. The data generated by the abovementioned processes needs to be stored in an efficient way so that it can be quickly retrieved whenever there is a need. As a result, there is an increasing demand for storing, processing, and retrieving huge amounts of data.
  • Various conventional Relational Database Management Systems (RDBMS) are used to cater to the increasing demand for data, but due to inherent limitations these databases cannot be scaled horizontally and therefore, there is a need to store data on demand. Various key value data stores which are used to deal with the abovementioned limitation of horizontal scalability work on the premise that rather than providing a Structured Query Language (SQL) based interface, a map interface should be provided so that traditional enterprise applications can be served.
  • In light of the abovementioned disadvantages, there is a need for a system and method for implementing a database that can be provisioned or removed on demand. In addition, there is a need for a database that can be scaled horizontally. Also, a system and method for implementing a cloud database is required which provides high scalability, availability and performance and has low latency.
  • SUMMARY OF THE INVENTION
  • A method for dynamic management of one or more cloud database nodes is provided. In various embodiments of the present invention, the method comprises, firstly, gathering information related to usage of one or more cloud database nodes. Secondly, the method comprises comparing time required by the one or more cloud database nodes for responding to one or more requests with a predetermined threshold. The method further comprises provisioning one or more new cloud database nodes or removing one or more new cloud database nodes based on at least one of: the gathered information, the comparison and a combination thereof.
  • In an embodiment of the present invention, gathering information related to usage of one or more cloud database nodes comprises gathering information on current usage of one or more cloud database nodes and gathering information on future usage of one or more cloud database nodes. In an embodiment of the present invention, the current usage information comprises at least one of number of entities stored in the one or more cloud database nodes and number of requests received at the one or more cloud database nodes. In another embodiment of the present invention, the future usage information comprises at least one of number of entities expected to be stored in the one or more cloud database nodes and number of requests expected to be received at the one or more cloud database nodes.
  • In an embodiment of the present invention, comparing time required by the cloud database nodes for responding to one or more requests with a predetermined threshold comprises determining if time required for responding to one or more requests by the cloud database nodes is less than a predetermined threshold. In an embodiment of the present invention, provisioning one or more cloud database nodes based on the gathered information, comparison and combination thereof comprises adding one or more cloud database node instances using a cloud provider application API. In another embodiment of the present invention, comparing time required by the cloud database nodes for responding to one or more requests with a predetermined threshold comprises determining if time required for responding to one or more requests by the cloud database nodes is more than the predetermined threshold. In yet another embodiment of the present invention, removing one or more cloud database nodes based on the gathered information, comparison and combination thereof comprises deleting one or more cloud database node instances using a cloud provider application API. In another embodiment of the present invention, comparing time required by the cloud database nodes for responding to one or more requests with a predetermined threshold comprises determining if time required for responding to one or more requests by the cloud database nodes is equal to the predetermined threshold. In an embodiment of the present invention, the predetermined threshold comprises at least one of: response time threshold, throughput threshold and resource utilization threshold.
  • In an embodiment of the present invention, the one or more cloud database nodes comprises cloud database node instances and cloud database server instances. In an embodiment of the present invention, the method further comprises updating a routing module with information related to the provisioning or removal of one or more cloud database nodes.
  • In an embodiment of the present invention, the method further comprises repartitioning entities across provisioned and existing cloud database node instances. The repartitioning ascertains even distribution of entities across the node instances. In another embodiment of the present invention, the method further comprises replicating entities across provisioned and existing cloud database node instances. In yet another embodiment of the present invention, the method further comprises versioning entities across provisioned and existing cloud database node instances.
  • A method for interacting with dynamically scaled cloud database architecture is provided. In various embodiments of the present invention, the method comprises, firstly, receiving one or more requests by a cloud database client. Secondly, the method comprises converting the one or more requests to a query language format. The method further comprises locating one or more cloud database server instances. Furthermore, the method comprises sending the converted data to at least one of the located cloud database server instances. Further, the method comprises processing the one or more requests converted to the query language format received at the cloud database server instance. The method further comprises sending the one or more requests to one or more cloud database node instances and processing one or more results received from the one or more cloud database node instances. Finally, the method comprises sending the one or more results to the cloud database client.
  • In an embodiment of the present invention, converting the one or more requests to a query language format comprises capturing the one or more requests in a query object. In another embodiment of the present invention, locating one or more cloud database server instances comprises locating at least one of: existing cloud database server instances and provisioned cloud database server instances. In yet another embodiment of the present invention, sending the converted data to at least one of the located cloud database server instances comprises serializing the converted data into a sequence of bits. In an embodiment of the present invention, processing the one or more requests converted to the query language format received at the cloud database server instance comprises deserializing the sequence of bits to obtain the converted data, interpreting the one or more requests converted to the query language format and determining one or more cloud database node instances for processing the requests based on the interpretation. The one or more cloud database node instances comprise provisioned or existing nodes.
  • In an embodiment of the present invention, sending the one or more requests to one or more cloud database node instances comprises at least one of: routing the one or more requests to a corresponding cloud database node instance and routing the one or more requests to one or more cloud database node instances via one or more other cloud database server instances. In another embodiment of the present invention, processing results received from the one or more cloud database server instances comprises aggregating results corresponding to the one or more requests received from the one or more cloud database server instances.
  • A system for dynamic management of one or more cloud database nodes is provided. In various embodiments of the present invention, the system comprises a cloud database management console configured to facilitate a user to at least in part gather information related to usage of one or more cloud database node instances and to monitor response time of services provided by the one or more cloud database node instances. The system further comprises a cloud infrastructure module configured to facilitate provisioning or removal of one or more cloud database node instances using the cloud database management console.
  • In an embodiment of the present invention, the cloud database management console comprises a web based user interface configured to receive one or more requests. In another embodiment of the present invention, the cloud database management console comprises one or more software modules configured to facilitate addition and/or deletion of one or more cloud database node instances using a cloud provider Application Programming Interface (API).
  • A system for facilitating interaction with dynamically scaled cloud database architecture is provided. In various embodiments of the present invention, the system comprises a cloud database client configured to receive and convert one or more requests to a query language format. The system further comprises a cloud database server configured to receive the query language format from the cloud database client and process the query language format. Furthermore, the system comprises one or more cloud database node instances configured to provide one or more results corresponding to the one or more requests via the cloud database server and/or other cloud database server instances.
  • In an embodiment of the present invention, the cloud database client comprises a query store client that receives and converts one or more requests to a query language format using a Java Persistence API (Application Programming Interface) Adaptor. In another embodiment of the present invention, the cloud database client comprises a Serialization De-Serialization module configured to translate the converted data into a sequence of bits and send the sequence of bits to the cloud database server over a network.
  • In another embodiment of the present invention, the cloud database client comprises an Inconsistency Resolver configured to resolve inconsistencies in one or more results received from the cloud database server. In yet another embodiment of the present invention, the Inconsistency Resolver comprises a Resolve by Identification (Id) module configured to resolve inconsistencies in one or more results by updating version identification numbers associated with entities stored in the one or more cloud database node instances. In another embodiment of the present invention, the Inconsistency Resolver comprises a Resolve by Time module configured to resolve inconsistencies in one or more results by identifying latest version of entities stored in the one or more cloud database node instances based on modified time information associated with the entities.
  • In an embodiment of the present invention, the cloud database server is configured to send the one or more requests to one or more other cloud database server instances. In another embodiment of the present invention, the cloud database server comprises a Query Processor configured to interpret the one or more requests converted to the query language format. In an embodiment of the present invention, the Query Processor comprises Query by Identification module configured to interpret the one or more requests when identity of entities associated with the requests is predetermined. In another embodiment of the present invention, the Query Processor comprises Query by Criteria module configured to interpret the one or more requests based on selection criteria created using one or more characteristics of the one or more requests.
  • In an embodiment of the present invention, the cloud database server comprises a routing module configured to repartition entities across one or more of the existing and provisioned cloud database node instances. In another embodiment of the present invention, the cloud database server comprises a routing module configured to replicate entities across one or more of the existing and provisioned cloud database node instances.
  • A computer program product for dynamic management of one or more cloud database nodes is provided. In various embodiments of the present invention, the computer program product comprises program instruction means for gathering information related to usage of one or more cloud database nodes. The computer program product further comprises program instruction means for comparing time required by the one or more cloud database nodes for responding to one or more requests with a predetermined threshold. Furthermore, the computer program product comprises program instruction means for provisioning one or more new cloud database nodes or removing one or more new cloud database nodes based on at least one of: the gathered information, the comparison and a combination thereof.
  • A computer program product for interacting with dynamically scaled cloud database architecture is provided. In various embodiments of the present invention, the computer program product comprises program instruction means for receiving one or more requests by a cloud database client. The computer program product further comprises program instruction means for converting the one or more requests to a query language format. Further, the computer program product comprises program instruction means for locating one or more cloud database server instances. The computer program product further comprises program instruction means for sending the converted data to at least one of the located cloud database server instances. The computer program product comprises program instruction means for processing the one or more requests converted to the query language format received at the cloud database server instance. The computer program product comprises program instruction means for sending the one or more requests to one or more cloud database node instances. Further, the computer program product comprises program instruction means for processing one or more results received from the one or more cloud database node instances and program instruction means for sending the one or more results to the cloud database client.
  • DETAILED DESCRIPTION OF THE INVENTION
  • A system and method for implementing on demand cloud database is provided. The invention provides for provisioning or removing various cloud database nodes based on magnitude of data. The invention facilitates adding cloud database nodes when space is required for storing and quickly retrieving large amounts of data. The invention facilitates distributing data across the various cloud database nodes and also replicating data across various cloud database nodes. The invention also facilitates caching of data efficiently which in turn reduces latency in service response time.
  • The disclosure is provided in order to enable a person having ordinary skill in the art to practice the invention. Exemplary embodiments herein are provided only for illustrative purposes and various modifications will be readily apparent to persons skilled in the art. The general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the invention. The terminology and phraseology used herein is for the purpose of describing exemplary embodiments and should not be considered limiting. Thus, the present invention is to be accorded the widest scope encompassing numerous alternatives, modifications and equivalents consistent with the principles and features disclosed herein. For purpose of clarity, details relating to technical material that is known in the technical fields related to the invention have been briefly described or omitted so as not to unnecessarily obscure the present invention.
  • The present invention will now be discussed in the context of embodiments as illustrated in the accompanying drawings.
  • FIG. 1 illustrates an architectural diagram of a cloud database solution in accordance with an embodiment of the present invention. The cloud database solution architecture 100 comprises a cloud database management console 102, a cloud infrastructure module 106, a cloud database client 108, one or more cloud database servers 110, 112 and one or more cloud database nodes 114, 116, 118, 120. In an embodiment of the present invention, the cloud database servers 110, 112 and cloud database nodes 114, 116, 118, 120 are served by the cloud infrastructure module 106. The cloud database management console 102 further comprises a cloud controller 104.
  • The cloud database management console 102 provides a web based user interface for provisioning or removal of new instances of cloud database nodes and for expanding the storage capacity. In various embodiments of the present invention, a graphical user interface is provided to enable a user to add cloud database node instances to an existing cloud database cluster or delete cloud database node instances from the existing cloud database cluster. In an embodiment of the present invention, the cloud database management console 102 comprises a cloud controller 104 which is used to monitor consumption and availability of cloud database resources. In another embodiment of the present invention, the cloud controller 104 is used to monitor response time of services provided by the cloud database resources. The cloud database management console 102 facilitates handling on-time warnings and accurate capacity management of the cloud database cluster.
  • The cloud infrastructure module 106 comprises various software modules that facilitate provisioning or removal of new cloud database node instances which is carried out via the cloud database management console 102. In an embodiment of the present invention, the cloud infrastructure module 106 facilitates addition, deletion, initialization, and management of new cloud database node instances. In an embodiment of the present invention, addition modules are used to append various cloud database nodes as and when need arises. In another embodiment of the present invention, initialization modules are used to initialize various cloud database nodes whenever there is a need to store data after addition of nodes. In another embodiment of the present invention, the deletion modules are used to delete various cloud database nodes as and when need arises. In an embodiment of the present invention, the software modules form a part of a cloud provider Application Programming Interface (API). The cloud provider API provides an abstraction from the cloud provider's internal API and uses the software modules for implementing infrastructure for the cloud database solution architecture 100. In an embodiment of the present invention, the various software modules may be implemented by performing virtualization of physical hardware of the cloud database solution architecture 100. In an exemplary embodiment of the present invention, an open source cloud computing framework such as Eucalyptus may be used for performing virtualization of the physical hardware employing the Xen virtualization platform.
  • The cloud database client 108 provides a standardized interface such as Java Persistence API (JPA) or Structured Query Language (SQL) interface to the cloud database. In an embodiment of the present invention, the cloud database client 108 provides an interface for receiving requests to store entities in the cloud database nodes. In another embodiment of the present invention, the cloud database client 108 provides an interface to update and delete entities stored in the cloud database nodes. In yet another embodiment of the present invention, the cloud database client 108 provides an interface to retrieve entities stored in the cloud database nodes. The cloud database client 108 converts the request to a standard query language format and sends it to the cloud database server 110, 112 over a network.
  • The cloud database server 110, 112 processes the request and interacts with the cloud database nodes to obtain results corresponding to the request. In an embodiment of the present invention, the cloud database server 110, 112 provides a mechanism for high availability and high performance of the cloud database nodes by replicating data across the various cloud database nodes. The cloud database server 110, 112 further provides a mechanism for data distribution across various cloud database nodes, and for handling data storage and data retrieval from the cloud database nodes.
  • The cloud database node 114, 116, 118, 120 is a distributed node that provides underlying storage along with a processing engine to handle data storage and retrieval requests at the node level. In an embodiment of the present invention, the nodes include a virtual machine with an appropriate operating system. In another embodiment of the present invention, the nodes also include a software image along with software stack to enable virtual appliances that can be instantiated and can be turned off/shutdown on demand. In an embodiment of the present invention, the cloud database nodes use a query engine (not shown) such as an off-the-shelf JPA implementation to process various requests and provide corresponding results. In an embodiment of the present invention, the query engine (not shown) is configured with an object-relational mapping module (not shown) to process various requests and provide corresponding results. In an exemplary embodiment of the present invention, the open source Hibernate framework may be used as a query engine which uses configuration files to map objects to database tables. The cloud database node 114, 116, 118, 120 also updates versioning information related to replicated data across the various cloud database nodes 114, 116, 118, 120.
  • FIG. 2 illustrates a detailed block diagram of the cloud database client in accordance with an embodiment of the present invention. In various embodiments of the present invention, the cloud database client 202 comprises Query Store Client 204, Serialization De-Serialization module 206, and Entity Helper 208. The Query Store Client 204 further comprises a JPA Adaptor 210 and Inconsistency Resolver 212. The Inconsistency Resolver 212 further comprises a Resolve by Identification (Id) module 214 and Resolve by Time module 216.
  • The Query Store Client 204 comprises a Java Persistence API (JPA) Adaptor 210 that provides a JPA query interface for processing request to store, retrieve, update and delete entities. The JPA query interface details are available as part of Java Specification Request (JSR). In an embodiment of the present invention, the Query Store Client 204 receives requests from a user to store, update, delete or retrieve entities stored in various cloud database nodes and captures the request in a query object.
  • The Serialization De-Serialization module 206 serializes the query object for sending to a cloud database server over a network. In an embodiment of the present invention, serialization is the process of translating an object into a sequence of bits so that it can be stored on a storage medium or transmitted across a network link to be used in another networking environment. In another embodiment of the present invention, the Serialization De-Serialization module 206 enables distributed deployment of various modules of the cloud database client 202.
  • The Inconsistency Resolver 212 within the Query Store Client 204 resolves inconsistencies in results retrieved from the cloud database server. Different versions of a particular entity are provided by different cloud database nodes and each of the versions is associated with a version identification number. In an embodiment of the present invention, the Resolve by Id module 214 uses the version identification numbers to identify the latest version of the entity and replaces the copy on every cloud database node that holds an older version of the entity. The Resolve by Id module 214, therefore, enables resolving inconsistencies by updating the version identification numbers associated with entities stored in the cloud database nodes. The Resolve by Time module 216 uses last modified time information stored along with an entity to identify the latest version when multiple inconsistent copies of an entity are stored among different cloud database nodes. The Inconsistency Resolver 212 facilitates integrity and consistency of entities.
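  • The two resolution strategies can be sketched as follows. The dictionary layout of a replica copy (`node`, `version`, `modified`, `value`) and the function names are hypothetical and serve only to illustrate choosing the latest copy and finding the nodes that need repair.

```python
def resolve_by_id(copies: list) -> dict:
    """Pick the copy with the highest version identification number."""
    return max(copies, key=lambda c: c["version"])


def resolve_by_time(copies: list) -> dict:
    """Pick the copy with the most recent last-modified timestamp."""
    return max(copies, key=lambda c: c["modified"])


def stale_nodes(copies: list, latest: dict) -> list:
    """Nodes whose copy is older than the latest version and needs replacing."""
    return [c["node"] for c in copies if c["version"] < latest["version"]]
```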
  • The Entity Helper 208 provides all configuration and additional information about an entity which otherwise cannot be stored in a primary storage repository of the entity.
  • FIG. 3 illustrates a detailed block diagram of a cloud database server in accordance with an embodiment of the present invention. The cloud database server 302 comprises a Serialization De-Serialization module 304, a Query Processor 306, and an Entity Helper 308. The Query Processor 306 further comprises a Query by Identification (Id) module 310, a Query by Criteria module 312, a Routing module 314, a Read Repair module 316 and a Failover module 318.
  • The Serialization De-Serialization module 304 receives requests over the network from the cloud database client. The requests are in a serialized format, which is deserialized for further processing. Further, the Serialization De-Serialization module 304 serializes result data corresponding to the request and sends it to the cloud database client.
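  • The serialization round trip between the client-side module 206 and the server-side module 304 might look like the following Python sketch. The `QueryObject` fields and the choice of JSON as the wire format are assumptions; the specification only requires that the query object be translated into a sequence of bits and back.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class QueryObject:
    """Hypothetical query object; the field names are illustrative only."""
    operation: str                     # "store", "retrieve", "update" or "delete"
    entity_type: str
    entity_id: Optional[str] = None    # set when the identity is known
    criteria: dict = field(default_factory=dict)


def serialize(query: QueryObject) -> bytes:
    """Turn the query object into bytes suitable for a network link."""
    return json.dumps(asdict(query), sort_keys=True).encode("utf-8")


def deserialize(data: bytes) -> QueryObject:
    """Rebuild the query object on the receiving side."""
    return QueryObject(**json.loads(data.decode("utf-8")))


request = QueryObject("retrieve", "Customer", entity_id="c42")
wire_bytes = serialize(request)
restored = deserialize(wire_bytes)
```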
  • The Query Processor 306 interprets the requests received and processes them in accordance with an embodiment of the present invention. In an embodiment of the present invention, the requests provided by the users may be processed by the Query by Id module 310. The Query by Id module 310 is used when the identity of an entity associated with the query is known. In another embodiment of the present invention, the queries provided by the users may be processed by the Query by Criteria module 312. The Query by Criteria module 312 is used when the identity of an entity associated with the query is not known and only a few characteristics of the entity are known. Using these characteristics, selection criteria are created, and the Query by Criteria module 312 operates on a group of entities selected based on the selection criteria to identify and retrieve the requested entity. The requests forwarded by the users are received by the Query Processor 306, interpreted, and provided to a corresponding cloud database node or to various other cloud database node instances via corresponding cloud database server instances.
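  • The difference between querying by identity and querying by criteria can be sketched in Python as follows. The in-memory `store` dictionary and both function names are hypothetical stand-ins for a cloud database node and modules 310 and 312.

```python
def query_by_id(store, entity_id):
    """Direct lookup, usable when the entity's identity is known."""
    return store.get(entity_id)


def query_by_criteria(store, **criteria):
    """Scan the entities and keep those matching every given characteristic."""
    return [
        entity
        for entity in store.values()
        if all(entity.get(name) == value for name, value in criteria.items())
    ]


# Toy data standing in for the entities held by one node.
store = {
    "c1": {"id": "c1", "city": "Pune", "tier": "gold"},
    "c2": {"id": "c2", "city": "Pune", "tier": "silver"},
    "c3": {"id": "c3", "city": "Mysore", "tier": "gold"},
}

print(query_by_id(store, "c2"))
print(query_by_criteria(store, city="Pune", tier="gold"))
```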
  • The Routing Module 314 partitions entities across one or more instances of cloud database server nodes (existing and provisioned nodes) and routes requests to the appropriate cloud database server node. This reduces the amount of data stored on a particular node and further reduces the time required to process requests and provide results. In an embodiment of the present invention, when a request is received for multiple entities, the first cloud database server node that receives the request routes it to various other cloud database server nodes. The first cloud database server node then receives and aggregates the results from the various cloud database server node instances. In yet another embodiment of the present invention, the Routing Module 314 may replicate an entity across multiple cloud database node instances to ensure that related entities are stored, thus providing high availability of entities. In another embodiment of the present invention, the Routing Module 314 maintains data store constraint checks that may be executed to ensure consistency of relationships across the entities stored in the cloud database nodes. In an exemplary embodiment of the present invention, a pluggable replication algorithm may be used for replicating the entities. In another embodiment of the present invention, the entities may be versioned using known versioning techniques, which in turn provide consistency and availability of entities.
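  • The handling of a request for multiple entities, where the first server node fans the request out and aggregates the results, can be sketched in Python as below. The `locate` and `fetch` callables, the `placement` map and the toy node contents are all assumptions introduced for this example.

```python
from collections import defaultdict


def scatter_gather(entity_ids, locate, fetch):
    """Group ids by owning node, query each node once, merge the results."""
    ids_by_node = defaultdict(list)
    for entity_id in entity_ids:
        ids_by_node[locate(entity_id)].append(entity_id)

    merged = {}
    for node, ids in ids_by_node.items():
        merged.update(fetch(node, ids))
    return merged


# Toy cluster: which node owns which entity, and what each node stores.
placement = {"c1": "node-a", "c2": "node-b", "c3": "node-a"}
node_data = {
    "node-a": {"c1": "Asha", "c3": "Chen"},
    "node-b": {"c2": "Bilal"},
}


def locate(entity_id):
    return placement[entity_id]


def fetch(node, ids):
    return {i: node_data[node][i] for i in ids if i in node_data[node]}


result = scatter_gather(["c1", "c2", "c3"], locate, fetch)
print(result)
```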
  • The Read Repair Module 316 performs repairs on versioned entities when it finds inconsistencies in the replicated entities retrieved from the cloud database server instances. In an embodiment of the present invention, the Read Repair Module 316 is used to update cloud database nodes which have outdated information on an entity. The Failover Module 318 handles scenarios in which the cloud database server is turned off, shut down, or restored while the query object is being processed.
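  • A read repair pass of the kind attributed to the Read Repair Module 316 could be sketched in Python as follows. The replica layout (a dictionary per node mapping a key to a `(version, value)` pair) is an assumption, and the sketch further assumes that every replica passed in is expected to hold the key, so a missing copy is treated as outdated.

```python
def read_with_repair(replicas, key):
    """Read `key` from all replicas, return the newest value, fix stale copies.

    `replicas` maps a node name to that node's store, and each store maps a
    key to a (version, value) pair. Both layouts are hypothetical.
    """
    found = [store[key] for store in replicas.values() if key in store]
    if not found:
        return None, []

    latest = max(found, key=lambda pair: pair[0])
    repaired = []
    for name, store in replicas.items():
        current_version = store[key][0] if key in store else -1
        if current_version < latest[0]:
            store[key] = latest          # overwrite the outdated or missing copy
            repaired.append(name)
    return latest[1], sorted(repaired)


replicas = {
    "node-a": {"c1": (2, "old")},
    "node-b": {"c1": (3, "new")},
    "node-c": {},
}
value, repaired = read_with_repair(replicas, "c1")
print(value, repaired)
```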
  • The Entity Helper 308 is used to maintain information about the entity relationships so that related entities are routed to the same cloud database server node instances. The Entity Helper 308 provides information about related entity types and entities, which the Routing Module 314 uses to ensure that related entities are stored in the same cloud database node.
  • FIG. 4 is a flowchart illustrating a method for dynamic management of one or more cloud database nodes and routing of requests to the one or more cloud database nodes, in accordance with an embodiment of the present invention.
  • At step 402, information on current usage of one or more cloud database nodes is gathered. In an embodiment of the present invention, information on current usage of the one or more cloud database nodes includes the number of entities stored in the one or more cloud databases. In another embodiment of the present invention, the information on current usage of the one or more cloud database nodes includes the number of requests received at the one or more cloud database nodes. The requests may include requests to store entities and requests to retrieve entities.
  • At step 404, information on future usage of the one or more cloud database nodes is gathered. In an embodiment of the present invention, information on future usage of the one or more cloud database nodes includes a forecast of the number of entities that can be stored in the one or more cloud databases. In another embodiment of the present invention, information on future usage of the one or more cloud database nodes includes a forecast of the number of requests that are expected to be received at the one or more cloud database nodes. In an embodiment of the present invention, the forecasting may be based on past information on the number of entities stored and on requests to retrieve or store entities. Forecasting techniques such as ARIMA (Auto-Regressive Integrated Moving Average) may be used to perform the predictions based on the past information. In an embodiment of the present invention, forecasting and prediction enable optimizing the distribution of entities across the different cloud database nodes based on considerations such as performance of retrieval, cost of storage etc., depending on forecasted usage.
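  • Forecasting future usage from past observations can be illustrated with a short Python sketch. Note that this is not ARIMA: for brevity it swaps in a plain least-squares linear trend, and the function name and sample history are assumptions. A production system would more likely rely on a statistics library that implements ARIMA.

```python
def linear_trend_forecast(history, steps_ahead=1):
    """Extrapolate a least-squares straight line fitted to `history`.

    `history` holds equally spaced past observations, for example requests
    per hour. This is a simplified stand-in for ARIMA, not ARIMA itself.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two observations")

    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


requests_per_hour = [100, 120, 140, 160]
print(linear_trend_forecast(requests_per_hour))
```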
  • At step 406, a check is performed to determine if the time required for servicing one or more requests by the one or more cloud databases is less than or more than a predetermined threshold. In an embodiment of the present invention, the predetermined threshold includes a response time threshold. In another embodiment of the present invention, the predetermined threshold includes a throughput threshold. In yet another embodiment of the present invention, the predetermined threshold includes a resource utilization threshold. In an embodiment of the present invention, techniques such as queuing theory may be used to determine the amount of time that is taken by the cloud databases to service one or more requests for storage or retrieval of entities. In an embodiment of the present invention, queuing theory facilitates prediction of computing resource requirements for different storage transactions. This information is used to optimize the allocation of computing resources to different cloud database nodes and also the allocation of workloads to the different cloud database nodes.
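  • One way queuing theory could be applied to the threshold check of step 406 is sketched below in Python. It assumes each cloud database node behaves as an independent M/M/1 queue (Poisson arrivals, exponential service times, one server) and that load is split evenly across nodes; the specification names queuing theory but does not fix a model, so these are assumptions. Under M/M/1, the mean response time is 1 / (service rate - arrival rate).

```python
import math


def mm1_response_time(arrival_rate, service_rate):
    """Mean time in system for an M/M/1 queue; infinite if overloaded."""
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


def nodes_needed(total_arrival_rate, service_rate, threshold_s, max_nodes=1000):
    """Smallest node count whose per-node response time meets the threshold."""
    for count in range(1, max_nodes + 1):
        per_node_rate = total_arrival_rate / count
        if mm1_response_time(per_node_rate, service_rate) <= threshold_s:
            return count
    raise ValueError("threshold cannot be met within max_nodes")


# 900 requests/s overall, each node serves 100 requests/s, 50 ms target.
print(nodes_needed(900.0, 100.0, 0.05))
```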
  • At step 408, one or more new instances of cloud database nodes and cloud database servers are provisioned or removed using a cloud provider application programming interface (API). In an embodiment of the present invention, the cloud provider API uses the information related to the number of entities that can be stored in the one or more cloud databases to provision the one or more new instances of cloud database nodes. In another embodiment of the present invention, the cloud provider API uses the number of requests for storage or retrieval of entities that are expected to be received at the one or more cloud databases to provision the one or more new instances of cloud database nodes. The cloud provider API therefore facilitates addition and removal of cloud database nodes dynamically using the abovementioned information. Further, dynamic distribution of entities and workloads to different cloud database nodes enables optimizations in computing resource allocation and also provides the ability to scale dynamically depending on usage of the cloud database nodes.
  • At step 410, the routing module is updated with information related to the provisioning or removal of the one or more cloud database nodes. In an embodiment of the present invention, addition and deletion of cloud database nodes at runtime facilitates dynamic routing. This enables the cloud database architecture to scale up and down easily, which further enables cost optimizations as well as the ability to meet service levels.
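  • Steps 406 to 410 can be tied together in a small Python sketch. The `InMemoryProvider` class stands in for a real cloud provider API, and `Router`, `rebalance` and the `scale_in_ratio` parameter are all hypothetical. The sketch assumes that a node is added when the observed response time exceeds the threshold and removed when it falls well below it; the specification leaves the exact policy open.

```python
class InMemoryProvider:
    """Stand-in for a cloud provider API (hypothetical, not a real SDK)."""

    def __init__(self):
        self._created = 0
        self.removed = []

    def provision_node(self):
        self._created += 1
        return f"node-{self._created}"

    def remove_node(self, node_id):
        self.removed.append(node_id)


class Router:
    """Keeps the list of nodes that requests may be routed to."""

    def __init__(self):
        self.nodes = []

    def update(self, nodes):
        self.nodes = list(nodes)


def rebalance(provider, router, observed_s, threshold_s,
              min_nodes=1, scale_in_ratio=0.5):
    """Add or remove one node, then publish the new node list to the router."""
    nodes = list(router.nodes)
    if len(nodes) < min_nodes or observed_s > threshold_s:
        nodes.append(provider.provision_node())
    elif observed_s < scale_in_ratio * threshold_s and len(nodes) > min_nodes:
        provider.remove_node(nodes.pop())
    router.update(nodes)
    return nodes


provider, router = InMemoryProvider(), Router()
print(rebalance(provider, router, observed_s=0.0, threshold_s=0.5))
print(rebalance(provider, router, observed_s=0.9, threshold_s=0.5))
print(rebalance(provider, router, observed_s=0.1, threshold_s=0.5))
```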
  • At step 412, requests are routed to cloud database nodes via cloud database server instances based on the updated information. In various embodiments of the present invention, entities stored in the one or more cloud database nodes are repartitioned into one or more segments and distributed by the routing module to existing and/or newly provisioned cloud database nodes. The data is repartitioned appropriately to ensure that the entities are evenly distributed across the various cloud database node instances. In an exemplary embodiment of the present invention, consistent hashing of primary keys associated with the entities may be performed to evenly distribute entities across the various cloud database instances. Consistent hashing is a mechanism in which a hash of an entity's primary key is computed and used to identify the cloud database node to which the entity is routed. The hash function is consistent, so any user who has the entity's primary key and the hash function can determine the cloud database node on which the entity with that primary key would be stored. The routing module, therefore, need not maintain a mapping of the entity to corresponding cloud database nodes.
  • In an embodiment of the present invention, information related to entity relationships is maintained so that related entities are routed to the same cloud database node instances. In another embodiment of the present invention, one or more entities may be replicated across all the cloud database node instances in order to maintain consistency of the entity relationships in the cloud database node instances. For example, when there are two entity types that are related to each other, information regarding which entity's primary key should be used and how to retrieve the primary key of each of the entities is provided as input to the consistent hashing mechanism. This results in both entities being routed to the same cloud database node. Further, the requests related to storage or retrieval of entities are routed to the existing and/or newly provisioned cloud databases suitably.
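  • The routing of entities by consistent hashing, including routing related entities by a shared key, can be sketched in Python as follows. The hash ring with several points per node is one common way to realize consistent hashing and is an assumption here, as are the `ROUTE_BY` configuration, the entity layout and all names; the specification only states that a hash of the primary key selects the node.

```python
import bisect
import hashlib


def stable_hash(text):
    """Hash that gives the same result on every machine and every run."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), points_per_node=64):
        self._points_per_node = points_per_node
        self._ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self._points_per_node):
            bisect.insort(self._ring, (stable_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [point for point in self._ring if point[1] != node]

    def node_for(self, key):
        """Return the node owning the first ring point at or after the key."""
        if not self._ring:
            raise LookupError("no cloud database nodes registered")
        index = bisect.bisect_left(self._ring, (stable_hash(key), ""))
        return self._ring[index % len(self._ring)][1]


# Hypothetical relationship information: orders are routed by customer key.
ROUTE_BY = {"Order": "customer_id"}


def routing_key(entity):
    return str(entity[ROUTE_BY.get(entity["type"], "id")])


ring = HashRing(["node-a", "node-b", "node-c"])
customer = {"type": "Customer", "id": "c42"}
order = {"type": "Order", "id": "o7", "customer_id": "c42"}
print(ring.node_for(routing_key(customer)), ring.node_for(routing_key(order)))
```

  • Because both entities hash the same key, they land on the same node without the routing module keeping a per-entity mapping. With a ring of this kind, adding a node only moves keys onto the new node, which limits repartitioning when nodes are provisioned.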
  • FIG. 5 is a flowchart illustrating a method for interacting with a cloud database implementation that supports a standard application programming interface (API), in accordance with an embodiment of the present invention.
  • At step 502, request data is received. In various embodiments of the present invention, request data is received by a cloud database client. The request data may include requests for storing entities in the cloud database, retrieving entities stored in the cloud database, updating entities stored in the cloud database, and deleting entities stored in the cloud database.
  • At step 504, the request data is converted to a standard query language format. In various embodiments of the present invention, the request data is converted to a standard query language format by a query interface which is a standard Application Programming Interface (API). In an embodiment of the present invention, the standard API may include a Java Persistence API (JPA) query interface. The JPA query interface captures the request data in a query object. Query object is a query language standard for object orientated databases.
  • At step 506, one or more cloud database server instances are located. In various embodiments of the present invention, the cloud database client locates one or more cloud database server instances. The one or more cloud database server instances include existing and newly provisioned cloud database server instances as explained in FIG. 4. In an embodiment of the present invention, static configuration mechanisms may be employed to locate the one or more cloud database server instances. In an exemplary embodiment of the present invention, static configuration may be a property file having location information of the cloud database server instances which is used by the cloud database client to locate the cloud database server instances. In another embodiment of the present invention, dynamic discovery mechanisms such as multi-cast may be employed to locate the one or more cloud database server instances. In an embodiment of the present invention, dynamic discovery mechanism includes sending a multi-cast request on a pre-defined port. One or more cloud database server instances listening on the multi-cast port respond with details of their location and the cloud database client locates the one or more cloud database server instances.
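  • The static configuration option of step 506 can be illustrated with a small Python sketch that reads server locations from property-file text. The property name `clouddb.servers` and the `host:port` list format are assumptions; the specification only says that a property file holds the location information. The sketch expects every entry to contain a port.

```python
def parse_server_locations(properties_text, key="clouddb.servers"):
    """Return [(host, port), ...] read from `key` in property-file text."""
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        if name.strip() != key:
            continue
        locations = []
        for item in value.split(","):
            host, _, port = item.strip().rpartition(":")
            locations.append((host, int(port)))
        return locations
    return []


config_text = """
# hypothetical client configuration
clouddb.servers = 10.0.0.1:9000, 10.0.0.2:9001
"""
print(parse_server_locations(config_text))
```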
  • At step 508, the converted data is sent to one of the cloud database server instances. In various embodiments of the present invention, the converted data is sent to at least one of the cloud database server instances over a network. In an embodiment of the present invention, the converted data is serialized to translate the converted data into a sequence of bits so that the data can be stored in a file or a memory buffer or transmitted across a network in a computing environment.
  • At step 510, the query object is interpreted and processed. In various embodiments of the present invention, the serialized data sent over the network is received by the cloud database server instance and deserialized to obtain the converted data. The converted data includes the query object. The cloud database server instance then interprets and processes the query object. In an embodiment of the present invention, the query object is processed to determine one or more cloud database node instances where one or more entities can be stored or retrieved, and the request data is routed to the appropriate cloud database node through corresponding cloud database server instances. The cloud database server instances may include both existing and provisioned cloud database server instances (as explained in FIG. 4). In an embodiment of the present invention, the cloud database server instance uses configuration information to replicate the one or more entities across one or more cloud database node instances. In an embodiment of the present invention, the configuration information may include a property file that provides information regarding the number of times a particular entity has to be replicated among different cloud database nodes. This ensures that the entity data is not lost even if one or more of the cloud database node instances do not function.
  • At step 512, the cloud database server instance sends the query object to its corresponding cloud database node instance or to other cloud database node instances through corresponding cloud database server instances. The cloud database node instances process the query and send the results of the query to the cloud database server instance.
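  • Choosing where to place the configured number of replicas could look like the following Python sketch. The placement rule (hash the primary key to pick a starting node in a sorted list, then take the following nodes) is an assumption chosen for brevity, as are the function and parameter names; the specification leaves the replication algorithm pluggable.

```python
import hashlib


def replica_nodes(primary_key, nodes, replication_factor):
    """Pick `replication_factor` distinct nodes for an entity's copies."""
    if not nodes:
        raise ValueError("no cloud database nodes available")
    ordered = sorted(nodes)
    count = min(replication_factor, len(ordered))
    digest = hashlib.sha256(primary_key.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(ordered)
    # Walk forward from the starting node, wrapping around the list.
    return [ordered[(start + i) % len(ordered)] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d"]
print(replica_nodes("c42", nodes, replication_factor=2))
```

  • With a replication factor of 2, the entity survives the loss of either chosen node, which matches the stated goal of not losing entity data when a node instance stops functioning.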
  • At step 514, the cloud database server instance receives results from various other cloud database server instances and then processes the results. In an embodiment of the present invention, the cloud database server instance processes the results to amend problems which may arise due to inconsistent versions of entity data across various cloud database node instances. In another embodiment of the present invention, the cloud database server instance also handles scenarios when the cloud database server is turned off/shut down and also when the cloud database server is restored.
  • At step 516, the results are sent to the cloud database client. In various embodiments of the present invention, the cloud database server instance sends the results to the cloud database client. In an embodiment of the present invention, the cloud database server and the cloud database client resolve inconsistencies which may arise due to replicated entities across the one or more cloud database node instances using techniques such as, but not limited to, versioning.
  • The present invention may be implemented in numerous ways including as an apparatus, method, or a computer program product such as a computer readable storage medium or a computer network wherein programming instructions are communicated from a remote location.
  • While the exemplary embodiments of the present invention are described and illustrated herein, it will be appreciated that they are merely illustrative. It will be understood by those skilled in the art that various modifications in form and detail may be made therein without departing from or offending the spirit and scope of the invention as defined by the appended claims.

Claims (19)

1. A method for dynamic management of one or more cloud database nodes, the method comprising:
gathering information related to usage of one or more cloud database nodes;
comparing time required by the one or more cloud database nodes for responding to one or more requests with a predetermined threshold; and
provisioning one or more new cloud database nodes or removing one or more new cloud database nodes based on at least one of: the gathered information, the comparison and a combination thereof.
2. The method of claim 1, wherein gathering information related to usage of one or more cloud database nodes comprises:
gathering information on current usage of one or more cloud database nodes; and
gathering information on future usage of one or more cloud database nodes.
3. The method of claim 2, wherein the current usage information comprises at least one of: number of entities stored in the one or more cloud database nodes and number of requests received at the one or more cloud database nodes.
4. The method of claim 2, wherein the future usage information comprises at least one of: number of entities expected to be stored in the one or more cloud database nodes and number of requests expected to be received at the one or more cloud database nodes.
5. The method of claim 1, wherein comparing time required by the cloud database nodes for responding to one or more requests with a predetermined threshold comprises determining if time required for responding to one or more requests by the cloud database nodes is less than a predetermined threshold.
6. The method of claim 5, wherein provisioning one or more cloud database nodes based on the gathered information, comparison and combination thereof comprises adding one or more cloud database node instances using a cloud provider application API.
7. The method of claim 1, wherein comparing time required by the cloud database nodes for responding to one or more requests with a predetermined threshold comprises determining if time required for responding to one or more requests by the cloud database nodes is more than the predetermined threshold.
8. The method of claim 7, wherein removing one or more cloud database nodes based on the gathered information, comparison and combination thereof comprises deleting one or more cloud database node instances using a cloud provider application API.
9. The method of claim 1, wherein comparing time required by the cloud database nodes for responding to one or more requests with a predetermined threshold comprises determining if time required for responding to one or more requests by the cloud database nodes is equal to the predetermined threshold.
10. The method of claim 1, wherein the predetermined threshold comprises at least one of: response time threshold, throughput threshold and resource utilization threshold.
11. The method of claim 1, wherein the one or more cloud database nodes comprise cloud database node instances and cloud database server instances.
12. The method of claim 1 further comprising updating a routing module with information related to the provisioning or removal of one or more cloud database nodes.
13. The method of claim 1 further comprising repartitioning entities across provisioned and existing cloud database node instances, wherein the repartitioning ascertains even distribution of entities across the node instances.
14. The method of claim 1 further comprising replicating entities across provisioned and existing cloud database node instances.
15. The method of claim 1 further comprising versioning entities across provisioned and existing cloud database node instances.
16. A system for dynamic management of one or more cloud database nodes, the system comprising:
a cloud database management console configured to facilitate a user to at least in part gather information related to usage of one or more cloud database node instances and to monitor response time of services provided by the one or more cloud database node instances; and
a cloud infrastructure module configured to facilitate provisioning or removal of one or more cloud database node instances using the cloud database management console.
17. The system of claim 16, wherein the cloud database management console comprises a web based user interface configured to receive one or more requests.
18. The system of claim 16, wherein the cloud database management console comprises one or more software modules configured to facilitate addition and/or deletion of one or more cloud database node instances using a cloud provider Application Programming Interface (API).
19. A computer program product for dynamic management of one or more cloud database nodes, the computer program product comprising:
program instruction means for gathering information related to usage of one or more cloud database nodes;
program instruction means for comparing time required by the one or more cloud database nodes for responding to one or more requests with a predetermined threshold; and
program instruction means for provisioning one or more new cloud database nodes or removing one or more new cloud database nodes based on at least one of: the gathered information, the comparison and a combination thereof.
US13/562,374 2010-08-19 2012-07-31 System and method for implementing on demand cloud database Abandoned US20120296866A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/562,374 US20120296866A1 (en) 2010-08-19 2012-07-31 System and method for implementing on demand cloud database

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
IN2386CH2010 2010-08-19
IN2386/CHE/2010 2010-08-19
US12/902,298 US8832130B2 (en) 2010-08-19 2010-10-12 System and method for implementing on demand cloud database
US13/562,374 US20120296866A1 (en) 2010-08-19 2012-07-31 System and method for implementing on demand cloud database

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US12/902,298 Division US8832130B2 (en) 2010-08-19 2010-10-12 System and method for implementing on demand cloud database

Publications (1)

Publication Number Publication Date
US20120296866A1 true US20120296866A1 (en) 2012-11-22

Family

ID=45594860

Family Applications (2)

Application Number Title Priority Date Filing Date
US12/902,298 Active US8832130B2 (en) 2010-08-19 2010-10-12 System and method for implementing on demand cloud database
US13/562,374 Abandoned US20120296866A1 (en) 2010-08-19 2012-07-31 System and method for implementing on demand cloud database

Country Status (1)

Country Link
US (2) US8832130B2 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120144407A1 (en) * 2010-12-07 2012-06-07 Nec Laboratories America, Inc. System and method for cloud infrastructure data sharing through a uniform communication framework
US20120246030A1 (en) * 2011-03-25 2012-09-27 Fujitsu Limited Information providing device, method, and system
CN103631602A (en) * 2013-12-12 2014-03-12 叶宁 Cloud support system suitable for ERP (Enterprise Resource Planning) software of middle and small-sized enterprises
US20140222866A1 (en) * 2013-02-01 2014-08-07 Google Inc. Accessing objects in hosted storage
CN104753968A (en) * 2013-12-25 2015-07-01 中国电信股份有限公司 Cloud computing cross-region multiple data centers and dispatching management method thereof
US20150288564A1 (en) * 2014-04-02 2015-10-08 Aria Solutions, Inc. Configurable cloud-based routing
US20220006859A1 (en) * 2019-11-01 2022-01-06 Uber Technologies, Inc. Dynamically computing load balancer subset size in a distributed computing system
US11620313B2 (en) * 2016-04-28 2023-04-04 Snowflake Inc. Multi-cluster warehouse
US11956308B2 (en) 2023-05-17 2024-04-09 Uber Technologies, Inc. Dynamically computing load balancer subset size in a distributed computing system

Families Citing this family (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120310875A1 (en) * 2011-06-03 2012-12-06 Prashanth Prahlad Method and system of generating a data lineage repository with lineage visibility, snapshot comparison and version control in a cloud-computing platform
EP2817727A4 (en) * 2012-02-23 2015-10-28 Ajay Jadhav Persistent node framework
KR101930263B1 (en) 2012-03-12 2018-12-18 삼성전자주식회사 Apparatus and method for managing contents in a cloud gateway
US8965921B2 (en) * 2012-06-06 2015-02-24 Rackspace Us, Inc. Data management and indexing across a distributed database
US8805989B2 (en) * 2012-06-25 2014-08-12 Sungard Availability Services, Lp Business continuity on cloud enterprise data centers
US9053161B2 (en) 2012-08-30 2015-06-09 International Business Machines Corporation Database table format conversion based on user data access patterns in a networked computing environment
GB2507338A (en) 2012-10-26 2014-04-30 Ibm Determining system topology graph changes in a distributed computing system
US10394611B2 (en) * 2012-11-26 2019-08-27 Amazon Technologies, Inc. Scaling computing clusters in a distributed computing system
US9563628B1 (en) * 2012-12-11 2017-02-07 EMC IP Holding Company LLC Method and system for deletion handling for incremental file migration
US20140195672A1 (en) * 2013-01-09 2014-07-10 Microsoft Corporation Automated failure handling through isolation
US10026059B2 (en) * 2013-03-04 2018-07-17 Avaya Inc. Systems and methods for managing reporting data on a hosted on-demand reporting system
CN103530335B (en) * 2013-09-30 2017-03-22 广东电网公司汕头供电局 In-stockroom operation method and device of electric power measurement acquisition system
CN103593422B (en) * 2013-11-01 2017-02-15 国云科技股份有限公司 Virtual access management method of heterogeneous database
US20150163721A1 (en) 2013-12-11 2015-06-11 Jdsu Uk Limited Method and apparatus for processing data
US9668172B2 (en) * 2013-12-11 2017-05-30 Viavi Solutions Uk Limited Method and apparatus for enabling near real time data analysis
US9207966B2 (en) 2013-12-19 2015-12-08 Red Hat, Inc. Method and system for providing a high-availability application
US10108686B2 (en) 2014-02-19 2018-10-23 Snowflake Computing Inc. Implementation of semi-structured data as a first-class database element
GB2524951A (en) * 2014-03-13 2015-10-14 Vodafone Ip Licensing Ltd Management of resource allocation in a mobile telecommunication network
US20160034835A1 (en) * 2014-07-31 2016-02-04 Hewlett-Packard Development Company, L.P. Future cloud resource usage cost management
US9800549B2 (en) * 2015-02-11 2017-10-24 Cisco Technology, Inc. Hierarchical clustering in a geographically dispersed network environment
JP6498844B2 (en) * 2015-07-22 2019-04-10 華為技術有限公司Huawei Technologies Co.,Ltd. Computer device and method for reading / writing data by computer device
US9667657B2 (en) * 2015-08-04 2017-05-30 AO Kaspersky Lab System and method of utilizing a dedicated computer security service
US10063256B1 (en) * 2015-11-02 2018-08-28 Cisco Technology, Inc. Writing copies of objects in enterprise object storage systems
US10491477B2 (en) * 2015-12-18 2019-11-26 Privops Llc Hybrid cloud integration fabric and ontology for integration of data, applications, and information technology infrastructure
CN107154960B (en) * 2016-03-02 2020-10-27 阿里巴巴集团控股有限公司 Method and apparatus for determining service availability information for distributed storage systems
US10642860B2 (en) * 2016-06-03 2020-05-05 Electronic Arts Inc. Live migration of distributed databases
WO2018076017A1 (en) * 2016-10-23 2018-04-26 Norman Myers Multi-cloud user interface
KR20180044696A (en) * 2016-10-24 2018-05-03 삼성에스디에스 주식회사 Method and system for storing query result in distributed server
US10366103B2 (en) 2016-11-01 2019-07-30 Sap Se Load balancing for elastic query service system
US20180255123A1 (en) * 2017-03-03 2018-09-06 International Business Machines Corporation Distributed resource allocation in a federated cloud environment
US10581969B2 (en) 2017-09-14 2020-03-03 International Business Machines Corporation Storage system using cloud based ranks as replica storage
US10721304B2 (en) 2017-09-14 2020-07-21 International Business Machines Corporation Storage system using cloud storage as a rank
US10372371B2 (en) 2017-09-14 2019-08-06 International Business Machines Corporation Dynamic data relocation using cloud based ranks
US10372363B2 (en) * 2017-09-14 2019-08-06 International Business Machines Corporation Thin provisioning using cloud based ranks
CN109783577B (en) * 2019-01-05 2021-10-08 咪付(广西)网络技术有限公司 Strategy-based cloud database elastic expansion method
CN111435320B (en) * 2019-01-14 2023-04-11 阿里巴巴集团控股有限公司 Data processing method and device
CN111538718B (en) * 2020-04-22 2023-10-27 杭州宇为科技有限公司 Entity id generation and positioning method, capacity expansion method and equipment of distributed system
US10855660B1 (en) * 2020-04-30 2020-12-01 Snowflake Inc. Private virtual network replication of cloud databases

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050267929A1 (en) * 2004-06-01 2005-12-01 Hitachi, Ltd. Method of dynamically balancing workload of a storage system
US20060136448A1 (en) * 2004-12-20 2006-06-22 Enzo Cialini Apparatus, system, and method for database provisioning
US20090100180A1 (en) * 2003-08-14 2009-04-16 Oracle International Corporation Incremental Run-Time Session Balancing In A Multi-Node System
US20100088150A1 (en) * 2008-10-08 2010-04-08 Jamal Mazhar Cloud computing lifecycle management for n-tier applications

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7200666B1 (en) * 2000-07-07 2007-04-03 International Business Machines Corporation Live connection enhancement for data source interface
US20050125414A1 (en) * 2003-10-16 2005-06-09 Navas Julio C. System and method for facilitating asynchronous disconnected operations for data access over a network
US7523130B1 (en) 2004-01-28 2009-04-21 Mike Meadway Storing and retrieving objects on a computer network in a distributed database
JP4352079B2 (en) 2007-03-28 2009-10-28 株式会社東芝 System, apparatus, and method for retrieving information from a distributed database
ES2387625T3 (en) 2007-12-17 2012-09-27 Nokia Siemens Networks Oy Query routing in a distributed database system
US20110078303A1 (en) * 2009-09-30 2011-03-31 Alcatel-Lucent Usa Inc. Dynamic load balancing and scaling of allocated cloud resources in an enterprise network
US8315977B2 (en) * 2010-02-22 2012-11-20 Netflix, Inc. Data synchronization between a data center environment and a cloud computing environment
US8504689B2 (en) * 2010-05-28 2013-08-06 Red Hat, Inc. Methods and systems for cloud deployment analysis featuring relative cloud resource importance

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120144407A1 (en) * 2010-12-07 2012-06-07 Nec Laboratories America, Inc. System and method for cloud infrastructure data sharing through a uniform communication framework
US8613004B2 (en) * 2010-12-07 2013-12-17 Nec Laboratories America, Inc. System and method for cloud infrastructure data sharing through a uniform communication framework
US20120246030A1 (en) * 2011-03-25 2012-09-27 Fujitsu Limited Information providing device, method, and system
US8478654B2 (en) * 2011-03-25 2013-07-02 Fujitsu Limited Information providing device, method, and system
US10956376B2 (en) 2013-02-01 2021-03-23 Google Llc Accessing objects in hosted storage
US20140222866A1 (en) * 2013-02-01 2014-08-07 Google Inc. Accessing objects in hosted storage
CN103631602A (en) * 2013-12-12 2014-03-12 叶宁 Cloud support system suitable for ERP (Enterprise Resource Planning) software of middle and small-sized enterprises
CN104753968A (en) * 2013-12-25 2015-07-01 中国电信股份有限公司 Cloud computing cross-region multiple data centers and dispatching management method thereof
US20150288564A1 (en) * 2014-04-02 2015-10-08 Aria Solutions, Inc. Configurable cloud-based routing
US9860124B2 (en) * 2014-04-02 2018-01-02 Aria Solutions, Inc. Configurable cloud-based routing
US11620313B2 (en) * 2016-04-28 2023-04-04 Snowflake Inc. Multi-cluster warehouse
US11630850B2 (en) * 2016-04-28 2023-04-18 Snowflake Inc. Multi-cluster warehouse
US20220006859A1 (en) * 2019-11-01 2022-01-06 Uber Technologies, Inc. Dynamically computing load balancer subset size in a distributed computing system
US11695827B2 (en) * 2019-11-01 2023-07-04 Uber Technologies, Inc. Dynamically computing load balancer subset size in a distributed computing system
US11956308B2 (en) 2023-05-17 2024-04-09 Uber Technologies, Inc. Dynamically computing load balancer subset size in a distributed computing system

Also Published As

Publication number Publication date
US20120047107A1 (en) 2012-02-23
US8832130B2 (en) 2014-09-09

Similar Documents

Publication Publication Date Title
US8832130B2 (en) System and method for implementing on demand cloud database
US10896172B2 (en) Batch data ingestion in database systems
US10394847B2 (en) Processing data in a distributed database across a plurality of clusters
EP3667500B1 (en) Using a container orchestration service for dynamic routing
US9304815B1 (en) Dynamic replica failure detection and healing
US20180189367A1 (en) Data stream ingestion and persistence techniques
US9081837B2 (en) Scoped database connections
US7490265B2 (en) Recovery segment identification in a computing infrastructure
KR102338208B1 (en) Method, apparatus and system for processing data
US20120047239A1 (en) System and Method for Installation and Management of Cloud-Independent Multi-Tenant Applications
CN111597148B (en) Distributed metadata management method for distributed file system
US10812322B2 (en) Systems and methods for real time streaming
CN110516076B (en) Knowledge graph-based cloud computing management method and system
WO2019153880A1 (en) Method for downloading mirror file in cluster, node, and query server
CN116701330A (en) Logistics information sharing method, device, equipment and storage medium
CN115640110A (en) Distributed cloud computing system scheduling method and device
US20210318994A1 (en) Extensible streams for operations on external systems
US20220342888A1 (en) Object tagging
EP3818453A1 (en) System for optimizing storage replication in a distributed data analysis system using historical data access patterns
CN107276914B (en) Self-service resource allocation scheduling method based on CMDB
US8386732B1 (en) Methods and apparatus for storing collected network management data
CN112347794A (en) Data translation method, device, equipment and computer storage medium
JP6568232B2 (en) Computer system and device management method
US20210286819A1 (en) Method and System for Operation Objects Discovery from Operation Data
US11388235B1 (en) Decoupling volume deployment requirements and code implementation

Legal Events

Date Code Title Description
AS Assignment

Owner name: INFOSYS LIMITED, INDIA

Free format text: CHANGE OF NAME; ASSIGNOR: INFOSYS TECHNOLOGIES LIMITED; REEL/FRAME: 030039/0819

Effective date: 20110616

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION