WO2001040932A2 - A system and method for controlling a plurality of servers - Google Patents

A system and method for controlling a plurality of servers

Info

Publication number
WO2001040932A2
Authority
WO
WIPO (PCT)
Prior art keywords
request
resolver
gateway
thread
parcel
Prior art date
Application number
PCT/GB2000/003732
Other languages
French (fr)
Other versions
WO2001040932A3 (en)
Inventor
Bruce Jackson
Rod French
Paul Colin Chapman
Original Assignee
Elata (Holdings) Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Elata (Holdings) Limited filed Critical Elata (Holdings) Limited
Priority to AU74377/00A priority Critical patent/AU7437700A/en
Publication of WO2001040932A2 publication Critical patent/WO2001040932A2/en
Publication of WO2001040932A3 publication Critical patent/WO2001040932A3/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5055Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/544Buffers; Shared memory; Pipes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004Server selection for load balancing
    • H04L67/1008Server selection for load balancing based on parameters of servers, e.g. available memory or workload
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004Server selection for load balancing
    • H04L67/1023Server selection for load balancing based on a hash applied to IP addresses or costs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5018Thread allocation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1001Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/10015Access to distributed or replicated servers, e.g. using brokers

Definitions

  • the present invention relates to a system and method for controlling a plurality of servers to process requests received from clients. Description of the Prior Art
  • a plurality of servers may be grouped together to service one or more web sites accessible by clients over the Internet.
  • a plurality of LDAP servers could be provided to provide speedy directory access for multiple concurrent users.
  • A common current solution used for web sites is to use a Domain Name Server (DNS) to perform load balancing.
  • the DNS is arranged to provide more than one IP address in response to a request quoting the domain name of the web site, these addresses to be supplied in a predetermined sequence such that the address of each of the plurality of web servers is returned an equal number of times over many requests.
  • This approach successfully equalises the traffic between the plurality of servers but completely fails to take into account any differences in the performance or capabilities of those servers.
  • a hardware solution used for web sites might involve adding an "Enterprise Clustering Card" to each of the plurality of web servers, the effect of which is to create a machine which effectively shares the memory, disk and processor power resources of the individual machines.
  • Such a system is very powerful and load balances well but the entry cost and the incremental expansion cost is very high and being so highly integrated such a system would be limited to one hardware and software vendor.
  • the present invention provides a system for controlling a plurality of servers to process requests received from clients, comprising: a gateway arranged to be responsive to receipt of a request from a client to generate a request parcel from said request, the request parcel containing the request and predetermined control information used to identify the request; a store for receiving each request parcel generated by the gateway; a plurality of resolvers, each resolver being associated with one of said plurality of servers, each resolver being arranged to request from the store a request parcel, and upon receipt of a request parcel from the store being arranged to extract the request and control information from the request parcel, to pass the request to the associated server for processing, and to pass the control information to the gateway to enable data generated by the associated server to be routed to the client.
  • a gateway that, for each request received, generates a request parcel containing the request and predetermined control information used to identify the request. Each request parcel is then placed in a store by the gateway.
  • the gateway itself has no responsibility for allocating the requests to particular servers.
  • a plurality of resolvers are provided, each resolver being associated with one of the plurality of servers.
  • a resolver associated with a particular server is arranged to request from the store a request parcel, and upon receipt of a request parcel from the store is then arranged to extract the request and pass it to the associated server for processing, and to extract the control information and to pass it to the gateway.
  • the gateway uses the control information to enable the data generated by the associated server to be routed to the client that had initiated that request.
  • the present invention decouples the initial handling of incoming client requests from the actual allocation of those requests to individual servers. Instead, each individual request is packaged in a predetermined format to enable the request to be identified later, and then is allocated to the store for temporary storage. Each request parcel in the store can then be handled by any of the servers able to process the request represented by the request parcel, the actual allocation of each request parcel to a server preferably being dependent on when the associated resolvers ask for request parcels from the store and on the exact nature of the request. For example, the store may operate on a first-in-first-out (FIFO) basis, and each server may be able to handle any request. Accordingly, when a resolver requests a request parcel from the store, the store will provide to the resolver the request parcel that has been in the store the longest.
  • FIFO: first-in-first-out.
  • the store may not need to operate on a simple FIFO basis, and other factors such as priorities assigned to particular requests may be factored in to the process of determining which request to return to a resolver.
  • the request parcels may incorporate information identifying the type of the request, and the resolver may request from the store request parcels of one or more predetermined types.
  • A number of different techniques may be used to determine when each resolver should request a request parcel from the store. For example, a simple approach would be to have each resolver request request parcels from the store at predetermined intervals. However, in preferred embodiments, the resolver is arranged to request request parcels from the store dependent on the loading of the associated server.
  • the resolver provides a number of resolver threads, each resolver thread being arranged to request from the store a request parcel for processing by the associated server, and the resolver being arranged to be responsive to predetermined parameters indicative of the loading of the associated server to control the number of resolver threads.
  • each resolver thread can be arranged to request a request parcel from the store as soon as it has finished handling a current request, and the number of resolver threads is then controlled to take account of the loading of the associated server.
  • the data to be passed between the associated server and the gateway during processing of a request is preferably routed via the corresponding resolver thread, and said predetermined parameters comprise throughput information identifying the throughput of data through each resolver thread. Based on such throughput information, the resolver may be arranged to determine weight factors indicative of whether the number of resolver threads should be increased, decreased, or left the same.
  • the resolver may be arranged to monitor the time between each resolver thread requesting a request parcel and receiving the request parcel, and to control the number of resolver threads dependent on that time information.
  • the resolver may be arranged to terminate some resolver threads to avoid having more resolver threads than necessary.
  • predetermined techniques may be used to determine whether it is appropriate to begin to increase the number of resolver threads again.
  • the earlier described technique for controlling the number of resolver threads based on the loading of the associated server may be used.
  • a hunting algorithm may be used which, when the resolver thread(s) are not waiting for any requests, attempts to increase the number of threads if this is likely to increase overall throughput through the server.
  • When one of said resolver threads receives a request parcel from the store, it is arranged to create a first socket and to pass the request to the associated server via that first socket, and to create a second socket and to pass the control information to the gateway via the second socket, a connection between the associated server and the gateway being formed by the resolver thread passing data between the first and second sockets.
  • a socket is an identifiable endpoint of a data communication connection.
  • a socket is such an endpoint characterised by an IP address and a TCP port number.
  • the resolver thread is arranged to close the other socket thereby terminating the connection.
  • the resolver thread will then request another request object from the store, the data flow continuing autonomously until the connection is terminated as defined earlier.
  • the gateway is arranged to create a request handler thread for each request received by the gateway, the request handler thread being arranged to generate the request parcel for the associated request.
  • control information incorporated into the request parcel by the request handler thread can take a variety of forms, as long as it serves to uniquely identify the request.
  • the control information is arranged to uniquely identify the particular request handler thread responsible for handling that request. Accordingly, in such preferred embodiments, when control information from a request parcel is passed from one of said resolvers to the gateway, the gateway is arranged to use the control information to identify the request handler thread responsible for the associated request, and to cause a connection to be established between that request handler thread and that resolver.
  • the gateway may be arranged to handle incoming calls from the resolvers providing control information.
  • the gateway comprises a gateway reply thread arranged to receive incoming calls from the resolvers, each time an incoming call is received, the gateway reply thread being arranged to create a resolver socket to handle the incoming call, to determine from the control information the relevant request handler thread, and to pass the resolver socket to that request handler thread to effect the connection between the resolver and that request handler thread.
  • the gateway reply thread is preferably arranged to receive the incoming call from one of said resolver threads via the second socket, whereby once the resolver socket has been passed to the appropriate request handler thread, the request handler thread is able to receive data from, and send data to, the associated server.
  • the gateway is arranged to create a client socket to provide a connection with the client, and to pass the client socket to the request handler thread allocated to that request, a connection between the resolver and the client hence being formed by the request handler thread passing data between the resolver and client sockets.
  • the request handler thread is arranged to close the other socket thereby terminating the connection. Since at this point, it is deemed that the handling of the request is terminated, then the request handler thread is also arranged to terminate, since that request handler thread was generated by the gateway specifically to handle that request.
  • the present invention may be used in any scenario where it is appropriate to provide a plurality of servers to collectively provide a service to clients.
  • the plurality of servers are web servers accessible via the Internet, and the requests received from clients are http requests.
  • the system is able to control a collection of web servers to service one or more web sites as if they were a single powerful web server.
  • the request parcels are generated as request objects in an object oriented programming (OOP) environment
  • the store is an object store arranged to store OOP objects
  • the gateway and plurality of resolvers are computer programs operating on the Java platform
  • the object store is formed as a JavaSpace. This approach removes the requirement for specialised hardware to control the plurality of servers, thereby reducing complexity and expense.
  • The ubiquity of Java across multiple different platforms enables the plurality of servers controlled by the system to be of a variety of different types, for example PCs, Sun Workstations, Macs, etc., running different operating systems and different web server software.
  • the computer programs for the gateway and the resolvers could run on systems ranging from a single machine to many machines without extensive reconfiguring. Indeed, additional machines can be readily added to the system to increase the overall throughput of the system.
  • the present invention provides a method of controlling a plurality of servers to process requests received from clients, comprising the steps of: (i) upon receipt of a request from a client, arranging a gateway to generate a request parcel from said request, the request parcel containing the request and predetermined control information used to identify the request; (ii) storing each request parcel generated by the gateway in a store; (iii) providing a plurality of resolvers, each resolver being associated with one of said plurality of servers, each resolver being arranged to perform the steps of: (a) requesting from the store a request parcel; (b) upon receipt of a request parcel from the store, extracting the request and control information from the request parcel; (c) passing the request to the associated server for processing; and (d) passing the control information to the gateway to enable data generated by the associated server to be routed to the client.
  • the present invention provides a computer program for controlling a computer to act as a resolver in a system in accordance with the first aspect of the present invention, the computer program being operable to configure the computer to implement the steps of: (a) requesting from the store a request parcel; (b) upon receipt of a request parcel from the store, extracting the request and control information from the request parcel; (c) passing the request to the associated server for processing; and (d) passing the control information to the gateway to enable data generated by the associated server to be routed to the client.
  • the present invention provides a computer program for controlling a computer to act as a gateway in a system in accordance with the first aspect of the present invention, the computer program being operable to configure the computer to implement the steps of: (a) upon receipt of a request from a client, generating a request parcel from said request, the request parcel containing the request and predetermined control information used to identify the request; and (b) sending the request parcel to a store for storage.
  • the present invention provides a computer program product comprising a recordable medium having recorded thereon a computer program in accordance with the third or fourth aspects of the present invention.
  • Figure 1 is a block diagram schematically illustrating a system in accordance with preferred embodiments of the present invention
  • Figure 2 is a flow diagram illustrating the start up sequence for the system in accordance with preferred embodiments of the present invention
  • Figures 3 to 5 are flow diagrams illustrating various program threads executing within the gateway in accordance with preferred embodiments of the present invention
  • Figure 6 is a flow diagram illustrating a resolver manager program thread executing within each resolver in accordance with preferred embodiments of the present invention
  • Figure 7 is a flow diagram illustrating how the weight factors used in Figure 6 are calculated in preferred embodiments of the present invention.
  • Figures 8 and 8a are flow diagrams illustrating the process performed by a resolver program thread in accordance with preferred embodiments of the present invention.
  • Figure 9 is a block diagram illustrating in more detail the operation of the system in accordance with preferred embodiments of the present invention. Description of a Preferred Embodiment
  • FIG. 1 is a block diagram illustrating the arrangement of the system in accordance with preferred embodiments of the present invention
  • the servers controlled by the system of preferred embodiments of the present invention are web servers accessible by clients over the Internet.
  • the plurality of web servers are grouped together under the control of the system illustrated in Figure 1 in order to service one or more web sites as if they were a single powerful web server.
  • Each time a request is received by the gateway, the gateway is arranged to create a request object incorporating the request and control information used to identify the request, and is then arranged to transmit that request object to the object store 20 for storage.
  • the resolvers are responsible for retrieving request objects from the object store, passing the requests to their associated server for processing, and then returning the response to the gateway 10 for forwarding on to the appropriate client responsible for those requests.
  • The resolvers are arranged such that they only take requests from the object store at the rate their associated server can handle them, and thus the resolvers effectively balance their associated server's load to its capabilities. The process performed by the resolvers will be discussed in more detail later.
  • The gateway 10, resolvers 30 and object store 20 are software entities developed using object oriented programming (OOP) techniques. Although it would be possible to run all of these software entities on a single machine, that would prove rather inefficient, and would limit the benefits available through use of the system illustrated in Figure 1. Accordingly, the software entities can run on a number of different machines, with the elements of the system being able to identify each other via the Lookup Schema 40.
  • the gateway and resolvers are computer programs operating on the Java platform, and the object store is formed as a JavaSpace.
  • The ubiquity of Java across multiple different platforms enables the plurality of servers controlled by the system to be of a variety of different types, for example PCs, Sun Workstations, Macs, etc., running different operating systems and different web server software.
  • For more details of the Java platform, the reader is referred to the web page http://www.sun.com/java/, whilst for more details of JavaSpaces, the reader is referred to the web page http://www.sun.com/jini/specs/js-spec.html.
  • the object store 20 formed as a JavaSpace provides a network available repository of objects. Objects may be put into the store and then retrieved based on type of object, and optionally on contents of object.
  • the system of preferred embodiments uses the object store 20 to contain request objects generated by the gateway, and also to make system wide data (for example system wide setup information) available to all components of the system of preferred embodiments.
  • known techniques are used to protect the object store against concurrent access, and indeed the object store may be accessed via a suitable transaction processing system.
  • the transaction processing system may be arranged to expire objects that have been in the store for too long, or to return objects to the store if the transaction is not completed within a specified time of the object being removed.
  • transaction processing can optionally be used as part of the system of preferred embodiments to provide resilience in the event of failures. For example, if a resolver takes a request object from the object store, and then suffers a power failure, the transaction processing system will ensure that the request object reappears in the object store after a predetermined time, to enable that request object to be retrieved by another resolver.
  • The lookup schema 40 is formed as a Jini lookup service, Jini being a software infrastructure that allows services to be made available across a network, the reader being referred to the web site http://www.sun.com/jini/ for more details. More particularly, the lookup schema provides a register of services, which allows objects to identify each other across the network based on a range of criteria. These criteria include service names and the interfaces implemented by the service. All components of the system illustrated in Figure 1 can identify each other because they are all tagged with a particular label. Further, if multiple instances of the system were to be run on the same network, these may be distinguished by adding a "Name" tag specific to each instance.
  • The Jini lookup service holds the components of the system together, allowing them to identify each other and monitor each other's operating status. Individual components may be created or terminated, and the other components can use the notification capabilities of the Jini lookup service to take this in their stride, or to reconfigure to replace any missing components.
  • the network file system 50 is a common file store that allows all of the web servers controlled by the system in Figure 1 to serve the web site, and also contains any necessary databases or other forms of information to enable the web servers to persist state across requests. Such information could include items such as log-in details, session IDs, shopping basket contents, etc. It will be appreciated that sharing of such information is required, since different servers may handle related requests from the same client, and hence each server may need to be aware of this.
  • Such a network file system 50 will be familiar to those skilled in the art, and accordingly the network file system 50 will not be described in any further detail herein, except to state that it must support a filing system that is compatible with all the different types of machine and operating system that are being used to host resolvers and the gateway. Appropriate client software must also be available for each of the platforms used.
  • Figure 2 illustrates the basic start up sequence for the system of preferred embodiments of the present invention.
  • the process starts at step 200, and proceeds to step 210, where the object store 20 is initialised.
  • the preferred embodiment of the object store makes use of a Sun Microsystems implementation of JavaSpace.
  • the JavaSpace is created, and a name is allocated to it.
  • the process then proceeds to step 220, where within each resolver 30 a resolver manager thread is started. This process will be discussed in more detail later with reference to Figure 6.
  • The basic job of each resolver manager thread is to establish resolver threads to retrieve request objects from the object store, and then to manage the number of resolver threads based on throughput information. The process then proceeds to step 230, where the gateway main thread is started. This process is described later with reference to Figure 3. The basic job of the gateway main thread is to set up a gateway reply thread to handle incoming calls from resolvers, and to then set up request handler threads to handle incoming calls from clients. Once the gateway main thread has been started, the process then proceeds to step 240, where the start up sequence ends.
  • FIG. 3 illustrates in more detail the operation of the gateway main thread.
  • The gateway main thread is started, and then at step 310 a gateway reply thread is created. This process will be described in more detail later with reference to Figure 4. Once the gateway reply thread has been created, the process proceeds to the next step, where a server socket is created to handle incoming requests from clients.
  • a server socket is a socket that is made available to answer incoming calls on a specified address.
  • A new instance of a socket is spawned to handle each incoming call, while the server socket remains in place waiting for further incoming calls on the same address. The process then proceeds to step 330, to await an incoming call from a client.
  • The process stays at step 330 until an incoming call is received, at which point the process proceeds to step 340, where a client socket is spawned to handle the incoming call.
  • At step 350, a request handler thread is created to handle the incoming call, the request handler thread being discussed in more detail later with reference to Figure 5.
  • The operation of a request handler thread will now be discussed in more detail with reference to Figure 5.
  • the process starts at step 500, and at step 505 the client socket is received from the gateway main thread.
  • This client socket represents the incoming call received at step 330 in Figure 3, and at step 510 the request handler thread is arranged to read a request from the client socket.
  • a request object is then generated which includes the request and predetermined control information.
  • the predetermined control information can be any information which serves to uniquely identify the request. Since in preferred embodiments a dedicated request handler thread is provided for each request, the control information is arranged to uniquely identify the particular request handler thread generating the request object.
  • the request handler thread posts the request object to the object store 20 at step 520, and then the request handler thread waits for a short time or until it is activated at step 525, before proceeding to step 530 to determine whether it has been activated. If not, the process returns to step 525.
  • the request handler thread will only be activated again once the request object has been retrieved by a particular resolver thread for handling by an associated server. Accordingly, the remainder of the process of Figure 5 will be discussed later once the process performed by the resolver threads has been described.
  • Figures 6 and 7 describe in detail the process performed within a resolver manager thread in accordance with preferred embodiments to manage the number of resolver threads to be used. Figures 6 and 7 will be discussed later, but for the time being it is sufficient to note that the resolver manager thread will always maintain at least one resolver thread active. The process performed by a resolver thread will now be discussed in detail with reference to Figures 8 and 8a.
  • the process starts at step 800, and then at 810, it is determined whether this thread should be terminated. This will depend on whether the particular resolver thread has received a signal from the resolver manager thread to indicate that it should terminate. If the resolver thread is to be terminated, the process branches to step 815 where the process exits. However, assuming that the thread is not to be terminated, the process proceeds to step 820, where it is determined whether there is a request object in the object store for that resolver thread.
  • each web server can preferably handle any requests received by the gateway, and accordingly a resolver thread can take any request object from the object store. Hence, at step 820, it is determined whether there is any request object in the object store awaiting processing.
  • particular web servers may only be able to handle particular types of requests, for example a subset of the total number of web servers may service one web site, whilst another subset of the servers may service another web site.
  • the request objects preferably incorporate additional information identifying the type of request, and the resolver thread will at step 820 request from the store a request parcel of one or more predetermined types that its associated server is able to handle.
  • If there is no suitable request object in the store at that time, then the process proceeds to step 823, where it waits for a suitable request object to become available or a predetermined timeout period to elapse. On arrival at step 826, if a suitable object is still not available, the process returns to step 810. However, assuming that there is a suitable request object in the object store, either immediately or after waiting up to the predetermined timeout period, then that request object is provided to the resolver thread, and the process then proceeds to step 830, where a socket is created to enable a call to be made to the associated web server. Then, at step 840, the request is extracted from the request object and sent via the socket to the web server for processing.
  • an additional socket is created to enable a call to be made to the server socket of the gateway reply thread.
  • the control information is then extracted from the request object and sent to the gateway reply thread via the socket created at step 850.
  • the connection between the associated server and the gateway reply thread is formed by the web server socket and gateway reply socket generated at steps 830 and 850, respectively.
  • the process then proceeds to step 865, where the data transfer thread illustrated in Figure 8a is started.
  • the resolver thread then waits at step 867 until data has started to flow from the web server before returning to step 810, whereby assuming this thread is not to be terminated, the thread then makes a request for a further request object from the object store.
  • the data transfer thread is described in Figure 8a.
  • The thread starts at step 869, and proceeds to step 870, where it is determined whether there is data to be transferred between the web server and gateway reply sockets. If so, the process proceeds to step 875, where the data is transferred, the process then returning to step 870. If there is no data to transfer, then it is determined at step 880 whether either of the sockets has been closed. Either the client or the web server may cause the data pipe to close. If the web server closes its socket, this will cause the resolver thread's web server socket to close.
  • When either socket has been closed, execution passes to step 895, where the data transfer thread terminates.
  • the gateway reply thread is created by the gateway main thread, and once started at step 400 proceeds to step 410 to create a server socket for receipt of replies from resolver threads.
  • step 420 it is determined whether an incoming call has been received from a resolver thread.
  • the process waits at step 420 until such an incoming call is received, at which point the process proceeds to step 430, where a resolver socket is spawned to handle that incoming call.
  • At step 440, the control information provided in the incoming call from the resolver thread (see step 860 of Figure 8) is read from the resolver socket.
  • the control information uniquely identifies the request handler thread responsible for the original request received from the client. Accordingly, at step 450, the gateway reply thread identifies from the control information the request handler thread responsible for that request, and then activates that request handler thread. The process then proceeds to step 460, where the resolver socket is provided to the request handler thread, the process then returning to step 420 to await a further incoming call from a resolver thread.
  • the request handler thread waits at step 530 until it has been activated.
  • the request handler thread is activated by the gateway reply thread following receipt by the gateway reply thread of the control information initially inserted into the request object by the request handler thread at step 515.
  • the process proceeds to step 535, where the resolver socket is obtained from the gateway reply thread.
  • any data passed to the gateway reply thread via the gateway reply socket of a resolver thread will appear on the resolver socket passed to the request handler thread.
  • step 540 it is determined whether there is data to transfer between the resolver and client socket. This will be the case, for example, if data retrieved by the web server has been passed at step 875 of Figure 8a to the gateway reply socket, and has hence appeared at the resolver socket held by the request handler thread. Further, this will also be the case if the client subsequently sends any further request data via the client socket. If there is data to be transferred, the process proceeds to step 545, where the data is then transferred. If, however, there is no data to be transferred, the process proceeds to step 550, where it is determined whether either socket has been closed.
  • the client or the web server may cause the data pipe to close, and the resolver and gateway will then close sockets in response to such action by the client or web server. Assuming neither socket has been closed, the process returns to step 540, since this indicates that there may at some point be data to be transferred. However, if either socket has been closed at step 550, this indicates that there is no longer any data to be transferred, and accordingly the process proceeds to step 555, where the other socket is closed. At this point, the process then proceeds to step 560, where the request handler thread is terminated.
  • There is one resolver manager thread for each resolver 30.
  • the resolver manager thread is started at step 220 of the initial start up sequence, and proceeds from step 600 to 605, where a new resolver thread is created. Hence, during the initial iteration of the process, a resolver thread is created.
  • the process then proceeds to step 610 where the resolver manager thread waits for a predetermined time, for example one minute. Then, at step 620, a parameter "t" is set equal to the current number of threads.
  • As each resolver thread executes, it sends certain predetermined information back to the resolver manager thread. For example, each resolver thread sends information back to the resolver manager thread indicating whether it has waited for any requests, and also sends information concerning the current throughput of data through that resolver thread. Accordingly, at step 630, the resolver manager thread determines whether any of the resolver threads had to wait for any requests. This will depend on whether the resolver thread had to branch to step 823 before proceeding to step 830 as illustrated in Figure 8. At step 630, the resolver manager thread is basically trying to determine how busy the resolver is by determining how much it waited for request objects to become available.
  • If at step 630 it is determined that the resolver threads did wait for at least some requests, then the process proceeds to step 635, where it is determined whether the resolver threads waited for every request. If not, then the process merely returns to step 610 without any adjustment to the number of resolver threads, or any update of the throughput information. However, assuming that the resolver threads did wait for every request, then at step 645 it is determined whether there is currently more than one resolver thread. If not, then again the process returns to step 610. However, assuming that there is more than one resolver thread, then the process proceeds to step 655, where one of the resolver threads is terminated.
  • Following the decision to terminate a thread, the next resolver thread to execute step 810 will terminate, and no others will terminate until a fresh decision to terminate has been made. The process then returns to step 610.
  • the resolver manager will take action to reduce the total number of resolver threads (assuming there are more than one) if the resolver threads have had to wait for every request.
  • The process starts at step 700, and proceeds to step 710, where the weight factor W(t-1) is set equal to the average of the previous P(t-1) values plus the latest P(t-1) value. Up to ten previous values may in preferred embodiments be stored. Then, at step 720, the weight factor W(t) is set equal to the average of the previous P(t) values plus the latest P(t) value.
  • At step 730, it is determined whether there are any statistics for P(t+1). This would for example be the case if the resolver manager thread had previously terminated a resolver thread, such that there are currently fewer resolver threads than there had been previously. If there are statistics for P(t+1), then the process branches to step 735, where the weight factor W(t+1) is set equal to the average of the previous P(t+1) values plus the latest P(t+1) value. The process then proceeds to step 745, where the weight factor W(t+1) is altered by a predetermined factor, in preferred embodiments this alteration involving multiplying the value by 0.8. The process then proceeds to step 750.
  • If, at step 730, it was determined that there were no statistics for P(t+1), then the process proceeds to step 740, where the weight factor W(t+1) is set equal to W(t)*2 - W(t-1). The process then proceeds to step 750.
  • At step 750, the parameter X is set equal to the minimum of W(t-1), W(t) and W(t+1), multiplied by a factor of 0.666. Then, at step 760, each of the weight factors W(t), W(t-1) and W(t+1) is altered by subtracting X from the values determined earlier in the process. At this point, the process then terminates at step 770. Returning to Figure 6, the process then proceeds to step 670, where a random number R is generated between zero and a maximum value of W(t-1) + W(t) + W(t+1). At step 680, it is then determined whether R is less than W(t-1). (A code sketch of this weight-factor calculation is given at the end of this Definitions section.)
  • If R is less than W(t-1), the process proceeds to step 655, where one of the resolver threads is terminated. Following this, the process then returns to step 610.
  • If R is not less than W(t-1), the process proceeds to step 690, where it is determined whether R is less than W(t-1) + W(t). If it is not, then the process branches to step 605, where a new resolver thread is created, whilst if it is, the process returns directly to step 610 without creating a new resolver thread.
  • waiting for requests causes the number of threads to be reduced, whilst not waiting for requests offers a probability of the number of threads being increased.
  • If at step 630 it is determined that the resolver threads did not wait for any requests, the process proceeds down the "No" path to step 640.
  • If step 650 goes down the "Yes" path, a second thread is created immediately.
  • If step 650 goes down the "No" path, the calculation at step 660 is performed.
  • the resolver manager thread basically implements a random controlled hunting algorithm with the aim of ensuring that the optimum number of threads are chosen dynamically, even in the event of varying operating and load conditions.
  • the number of threads run by the resolver is important for maximising throughput while minimising delay and optimising the load sharing capabilities of the system.
  • the process illustrated with reference to Figures 6 and 7 seeks to achieve this by monitoring the data throughput whilst adjusting the number of threads.
  • Figure 9 is a block diagram providing an overall illustration of the operation of the system of preferred embodiments, paying particular attention to the resolver thread detail.
  • client devices 930 may communicate with the gateway 10 over, for example, the Internet, the gateway 10 then generating a request object for each request received from a client device 930, and posting that request object to the object store 20.
  • a number of resolvers are then provided, each resolver consisting of a resolver manager 900 and a number of resolver threads 910, the number of resolver threads being managed by the resolver manager 900.
  • Each resolver thread is arranged to retrieve request objects from the object store 20, and pass them to an associated web server 920 for processing.
  • a uni-directional or bi-directional connection can then be established between the web server 920 and the client 930 via the relevant resolver thread 910 and the gateway 10 to allow HTML request and response data to be transferred.
  • each resolver thread 910 is arranged to provide throughput data to the resolver manager 900 to enable the resolver manager to effectively manage the number of resolver threads.
  • the resolver manager 900 is then able to create and destroy resolver threads as required. It will be appreciated that the system of preferred embodiments may be used in any situation where it is useful to manage a plurality of servers to effectively work together as if they were a single more powerful server.
  • a particularly advantageous application of the system of preferred embodiments is in situations where the data rates over the network are not huge, but the server side processor power is being severely tested.
  • Such applications include e-commerce systems where the web server is having to validate credit card details, perform database searches to maintain shopping basket information, etc, or search engines which perform lengthy searches and generate the reply HTML page on the fly.
  • the first symptom may be that the gateway begins to run out of processor power.
  • a solution to this would be to run the object store on a separate machine to the gateway. Again the lookup schema allows the resolvers and the gateway to locate the object store, but now the processor power needed to drive the object store is not impeding the ability of the gateway to handle the volume of incoming transactions and their responses.
  • the system of preferred embodiments of the present invention provides a very flexible system for managing a plurality of servers. It will also be appreciated that the system can be replicated if desired. For example, it would be possible to include a firewall using IP address translation to send requests to more than one gateway. The gateways would each post requests in their own object store, and a group of associated resolvers would be clustered around each object store.
  • the system of preferred embodiments provides a distributed, scalable technique for managing web servers, and is particularly suitable for server intensive web site implementations.
  • the system is able to control and co-ordinate a heterogeneous collection of web servers to service one or more web sites as if they were a single powerful web server.
  • the system is software based, it does not require the use of expensive clustering hardware.
  • the system allocates requests to multiple web servers based on the actual loading of each server, this occurring via the monitoring and maintaining of a suitable number of resolver threads.
  • The system is based on the Java platform, and so it allows the web server cluster to comprise a variety of different devices, for example PCs, Sun Workstations and Macs running different operating systems and different web servers. Further, as discussed, the system could run on systems ranging from a single machine to many machines without extensive reconfiguring. Indeed, in many applications, increasing the throughput of the system would merely involve adding another machine to the network, loading the appropriate software component and arranging for it to be run. Further, through the use of a Jini lookup service in preferred embodiments, the system can be tolerant to the failure of individual components, since the Jini lookup service can distribute notifications about the disappearance of any of the system components to all of the other components, thereby enabling them to take appropriate action.
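
The thread-count hunting algorithm of Figures 6 and 7, referred to in the bullets above, can be sketched as the following Java fragment. The constants 0.8 and 0.666 and the ten-value history follow the description; the class name, method names and the way the throughput histories are supplied are illustrative assumptions rather than the patent's own code.

```java
import java.util.Deque;
import java.util.Random;

/**
 * Sketch of the hunting algorithm of Figures 6 and 7.  P(n) is the recorded throughput when
 * n resolver threads were running; W(t-1), W(t) and W(t+1) are the weight factors for reducing,
 * keeping or increasing the current thread count t.  The caller keeps at most HISTORY previous
 * P values per thread count.
 */
class ThreadCountHunter {
    static final int HISTORY = 10;              // up to ten previous values are stored by the caller
    private final Random random = new Random();

    /** Average of the stored previous values plus the latest value (steps 710, 720 and 735). */
    static double weight(Deque<Double> previous, double latest) {
        double sum = 0;
        for (double v : previous) sum += v;
        double average = previous.isEmpty() ? 0 : sum / previous.size();
        return average + latest;
    }

    /** Returns -1 to terminate a thread, 0 to leave the count unchanged, +1 to create a thread. */
    int decide(Deque<Double> prevMinus, double latestMinus,   // P(t-1) history and latest value
               Deque<Double> prevSame,  double latestSame,    // P(t)   history and latest value
               Deque<Double> prevPlus,  Double latestPlus) {  // P(t+1); null if no statistics yet

        double wMinus = weight(prevMinus, latestMinus);                 // W(t-1), step 710
        double wSame  = weight(prevSame, latestSame);                   // W(t),   step 720
        double wPlus  = (latestPlus != null)
                ? weight(prevPlus, latestPlus) * 0.8                    // W(t+1), steps 735 and 745
                : wSame * 2 - wMinus;                                   // W(t+1) = W(t)*2 - W(t-1), step 740

        double x = Math.min(wMinus, Math.min(wSame, wPlus)) * 0.666;    // step 750
        wMinus -= x; wSame -= x; wPlus -= x;                            // step 760

        double r = random.nextDouble() * (wMinus + wSame + wPlus);      // random number R, step 670
        if (r < wMinus)         return -1;  // step 680: terminate a resolver thread (step 655)
        if (r < wMinus + wSame) return 0;   // step 690: leave the number of threads unchanged
        return +1;                          // otherwise create a new resolver thread (step 605)
    }
}
```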

Abstract

The present invention provides a system and method for controlling a plurality of servers to process requests received from clients. The system comprises a gateway arranged to be responsive to receipt of a request from a client to generate a request parcel from said request, the request parcel containing the request and predetermined control information used to identify the request. A store is provided for receiving each request parcel generated by the gateway. Further, the system includes a plurality of resolvers, each resolver being associated with one of said plurality of servers. Each resolver is arranged to request from the store a request parcel, and upon receipt of a request parcel from the store is arranged to extract the request and control information from the request parcel, to pass the request to the associated server for processing, and to pass the control information to the gateway to enable data generated by the associated server to be routed to the client.

Description

A SYSTEM AND METHOD FOR CONTROLLING A PLURALITY OF SERVERS Field of the Invention
The present invention relates to a system and method for controlling a plurality of servers to process requests received from clients. Description of the Prior Art
It is known to group together a plurality of servers to collectively provide a service to clients. Such an approach is often taken in situations where significant processing power is required to provide the service. For example, a plurality of servers may be grouped together to service one or more web sites accessible by clients over the Internet. As another example, a plurality of LDAP servers could be provided to provide speedy directory access for multiple concurrent users.
Such an approach is particularly useful where a large number of requests have to be handled by the server side of the system, and/or where the data rates to be handled by the server side of the system are large, and/or the processing power required to fulfil each request is large. However, to enable a plurality of servers to effectively work together as if they were a single more powerful server, it is apparent that a suitable control system is required to manage the distribution of tasks to the various servers. Typical known control systems are complex and expensive, as they incorporate complex techniques to determine when to distribute requests to particular servers. Other systems are over-simplistic (for example applying a simple round-robin approach for distribution of requests to servers), and thus fail to achieve optimal load balancing. Further, many solutions involve dedicated hardware units developed specifically to enable the management of large data bandwidths. A common current solution used for web sites is to use a Domain Name
Server (DNS) to perform load balancing. The DNS is arranged to provide more than one IP address in response to a request quoting the domain name of the web site, these addresses to be supplied in a predetermined sequence such that the address of each of the plurality of web servers is returned an equal number of times over many requests. This approach successfully equalises the traffic between the plurality of servers but completely fails to take into account any differences in the performance or capabilities of those servers. A hardware solution used for web sites might involve adding an "Enterprise Clustering Card" to each of the plurality of web servers, the effect of which is to create a machine which effectively shares the memory, disk and processor power resources of the individual machines. Such a system is very powerful and load balances well but the entry cost and the incremental expansion cost is very high and being so highly integrated such a system would be limited to one hardware and software vendor.
Accordingly, it is an object of the present invention to provide an improved technique for controlling a plurality of servers. Summary of the Invention
Viewed from a first aspect, the present invention provides a system for controlling a plurality of servers to process requests received from clients, comprising: a gateway arranged to be responsive to receipt of a request from a client to generate a request parcel from said request, the request parcel containing the request and predetermined control information used to identify the request; a store for receiving each request parcel generated by the gateway; a plurality of resolvers, each resolver being associated with one of said plurality of servers, each resolver being arranged to request from the store a request parcel, and upon receipt of a request parcel from the store being arranged to extract the request and control information from the request parcel, to pass the request to the associated server for processing, and to pass the control information to the gateway to enable data generated by the associated server to be routed to the client.
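To make the division of responsibilities concrete, the three roles described above can be expressed as plain Java interfaces. This is an illustrative sketch only: the names, field types and method signatures are assumptions, not the patent's API.

```java
import java.net.Socket;

/** Illustrative roles only; names and signatures are assumptions, not the patent's API. */
final class ArchitectureSketch {

    /** The unit placed in the store: the client's request plus identifying control information. */
    static final class RequestParcel {
        final byte[] request;            // the client's request data
        final String controlInformation; // uniquely identifies the request
        RequestParcel(byte[] request, String controlInformation) {
            this.request = request;
            this.controlInformation = controlInformation;
        }
    }

    /** The store: receives parcels from the gateway and hands them out to resolvers on demand. */
    interface ParcelStore {
        void put(RequestParcel parcel) throws InterruptedException;
        RequestParcel take() throws InterruptedException;   // blocks until a parcel is available
    }

    /** The gateway: wraps client requests into parcels, and later routes server data back. */
    interface Gateway {
        void onClientRequest(byte[] request);                                    // wrap and put into the store
        void onResolverCallback(String controlInformation, Socket resolverSide); // route the server's reply
    }

    /** A resolver: tied to one server, pulls parcels at the rate that server can handle. */
    interface Resolver {
        void serve(ParcelStore store, Gateway gateway);
    }
}
```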
In accordance with the present invention, a gateway is provided that, for each request received, generates a request parcel containing the request and predetermined control information used to identify the request. Each request parcel is then placed in a store by the gateway. The gateway itself has no responsibility for allocating the requests to particular servers.
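Since the preferred embodiment described later generates the request parcels as request objects and holds them in a JavaSpace, a request parcel might look like the following JavaSpaces entry, written to the space by the gateway. The field names, the type tag and the lease choice are assumptions made for illustration.

```java
import net.jini.core.entry.Entry;
import net.jini.core.lease.Lease;
import net.jini.space.JavaSpace;

/** Illustrative request parcel as a JavaSpaces entry; field names are assumptions. */
public class RequestEntry implements Entry {
    public String requestType;   // optional tag identifying the kind of request (e.g. the site served)
    public String controlInfo;   // uniquely identifies the request / its request handler thread
    public String request;       // the client's request data

    public RequestEntry() { }    // JavaSpaces entries require a public no-argument constructor

    public RequestEntry(String requestType, String controlInfo, String request) {
        this.requestType = requestType;
        this.controlInfo = controlInfo;
        this.request = request;
    }
}

/** Gateway-side helper: wrap an incoming request and place it in the store. */
class GatewayWriteSketch {
    static void postToStore(JavaSpace space, String controlInfo, String rawRequest) throws Exception {
        RequestEntry parcel = new RequestEntry("web", controlInfo, rawRequest);
        // No transaction and an unbounded lease, purely to keep the sketch short;
        // the described system may instead use transactions and leases for resilience.
        space.write(parcel, null, Lease.FOREVER);
    }
}
```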
To enable the requests to be allocated to particular servers, a plurality of resolvers are provided, each resolver being associated with one of the plurality of servers. A resolver associated with a particular server is arranged to request from the store a request parcel, and upon receipt of a request parcel from the store is then arranged to extract the request and pass it to the associated server for processing, and to extract the control information and to pass it to the gateway. The gateway then uses the control information to enable the data generated by the associated server to be routed to the client that had initiated that request.
By this approach, it can be seen that the present invention decouples the initial handling of incoming client requests from the actual allocation of those requests to individual servers. Instead, each individual request is packaged in a predetermined format to enable the request to be identified later, and then is allocated to the store for temporary storage. Each request parcel in the store can then be handled by any of the servers able to process the request represented by the request parcel, the actual allocation of each request parcel to a server preferably being dependent on when the associated resolvers ask for request parcels from the store and on the exact nature of the request. For example, the store may operate on a first-in-first-out (FIFO) basis, and each server may be able to handle any request. Accordingly, when a resolver requests a request parcel from the store, the store will provide to the resolver the request parcel that has been in the store the longest.
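As a minimal illustration of the FIFO behaviour just described (and not of the JavaSpace-based store of the preferred embodiment), a store that always hands out the parcel that has been waiting longest could be as simple as:

```java
import java.util.concurrent.LinkedBlockingQueue;

/** Plain in-memory sketch of a FIFO parcel store; the oldest parcel is handed out first. */
public class FifoParcelStore<P> {
    private final LinkedBlockingQueue<P> parcels = new LinkedBlockingQueue<>();

    public void put(P parcel) throws InterruptedException {
        parcels.put(parcel);       // the gateway adds newly generated parcels at the tail
    }

    public P take() throws InterruptedException {
        return parcels.take();     // a resolver receives the oldest parcel, blocking if the store is empty
    }
}
```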
It will be appreciated that it is not necessary that the store operate on a simple FIFO basis, and other factors such as priorities assigned to particular requests may be factored in to the process of determining which request to return to a resolver. Further, it will be appreciated that not all of the servers have to be able to handle all types of requests. Instead, the request parcels may incorporate information identifying the type of the request, and the resolver may request from the store request parcels of one or more predetermined types.
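Where servers only handle particular request types, a resolver can restrict what it takes from the store. With a JavaSpace as the store, matching is done against a template entry whose null fields act as wildcards; the sketch below, reusing the illustrative RequestEntry above, is an assumption about how such a typed take might look rather than the patent's code.

```java
import net.jini.space.JavaSpace;

/** Hedged sketch: take only parcels whose requestType matches the type this resolver serves. */
class TypedTakeSketch {
    static RequestEntry takeParcelOfType(JavaSpace space, String type) throws Exception {
        RequestEntry template = new RequestEntry();
        template.requestType = type;             // e.g. an identifier for the web site this server handles
        // Block for up to ten seconds waiting for a matching parcel, with no transaction.
        return (RequestEntry) space.take(template, null, 10_000L);
    }
}
```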
It will be appreciated that a number of different techniques may be used to determine when each resolver should request a request from the store. For example, a simple approach would be to have each resolver request request parcels from the store at predetermined intervals. However, in preferred embodiments, the resolver is arranged to request request parcels from the store dependent on the loading of the associated server.
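For contrast, the simple fixed-interval approach mentioned above could be sketched as follows; the 250 ms period and the pollStoreOnce hook are arbitrary illustrative choices and not part of the described system.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Poll the store at a fixed interval regardless of server load (the non-preferred alternative). */
class FixedIntervalPollerSketch {
    static void start(Runnable pollStoreOnce) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(pollStoreOnce, 0, 250, TimeUnit.MILLISECONDS);
    }
}
```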
More particularly, in preferred embodiments, the resolver provides a number of resolver threads, each resolver thread being arranged to request from the store a request parcel for processing by the associated server, and the resolver being arranged to be responsive to predetermined parameters indicative of the loading of the associated server to control the number of resolver threads. By this approach, each resolver thread can be arranged to request a request parcel from the store as soon as it has finished handling a current request, and the number of resolver threads is then controlled to take account of the loading of the associated server. In such preferred embodiments, the data to be passed between the associated server and the gateway during processing of a request is preferably routed via the corresponding resolver thread, and said predetermined parameters comprise throughput information identifying the throughput of data through each resolver thread. Based on such throughput information, the resolver may be arranged to determine weight factors indicative of whether the number of resolver threads should be increased, decreased, or left the same.
As an alternative approach of controlling the number of resolver threads, or in addition to the above identified techniques, the resolver may be arranged to monitor the time between each resolver thread requesting a request parcel and receiving the request parcel, and to control the number of resolver threads dependent on that time information. By this approach, when the frequency of requests received by the system is lower than that which can be handled by the system, the resolver may be arranged to terminate some resolver threads to avoid having more resolver threads than necessary. Clearly, when each resolver thread no longer needs to wait more than a predetermined time before receiving a request parcel from the store, then predetermined techniques may be used to determine whether it is appropriate to begin to increase the number of resolver threads again. For example, the earlier described technique for controlling the number of resolver threads based on the loading of the associated server may be used. For example, a hunting algorithm may be used which, when the resolver thread(s) are not waiting for any requests, attempts to increase the number of threads if this is likely to increase overall throughput through the server.
In preferred embodiments, when one of said resolver threads receives a request parcel from the store, it is arranged to create a first socket and to pass the request to the associated server via that first socket, and to create a second socket and to pass the control information to the gateway via the second socket, a connection between the associated server and the gateway being formed by the resolver thread passing data between the first and second sockets.
As will be appreciated by those skilled in the art, a socket is an identifiable endpoint of a data communication connection. For example, in TCP/IP terms, a socket is such an endpoint characterised by an IP address and a TCP port number.
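Purely by way of illustration, and using only the standard java.net API rather than any code of the preferred embodiment, the following sketch shows how each end of an established TCP connection is characterised by an IP address and a port number; the host name and port used are arbitrary placeholders.

    import java.net.Socket;

    public class SocketEndpointExample {
        public static void main(String[] args) throws Exception {
            // Connect to a placeholder host and port; any reachable TCP service would do.
            Socket socket = new Socket("example.com", 80);
            // Each end of the connection is an identifiable endpoint: an IP address plus a port number.
            System.out.println("Local endpoint:  " + socket.getLocalAddress() + ":" + socket.getLocalPort());
            System.out.println("Remote endpoint: " + socket.getInetAddress() + ":" + socket.getPort());
            socket.close();
        }
    }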
In preferred embodiments, if either the first or the second socket is closed, the resolver thread is arranged to close the other socket thereby terminating the connection. Preferably, following the commencement of data flow from the server, the resolver thread will then request another request object from the store, the data flow continuing autonomously until the connection is terminated as defined earlier.
In preferred embodiments, the gateway is arranged to create a request handler thread for each request received by the gateway, the request handler thread being arranged to generate the request parcel for the associated request.
It will be appreciated that the control information incorporated into the request parcel by the request handler thread can take a variety of forms, as long as it serves to uniquely identify the request. In preferred embodiments, since a dedicated request handler thread is provided for each request, the control information is arranged to uniquely identify the particular request handler thread responsible for handling that request. Accordingly, in such preferred embodiments, when control information from a request parcel is passed from one of said resolvers to the gateway, the gateway is arranged to use the control information to identify the request handler thread responsible for the associated request, and to cause a connection to be established between that request handler thread and that resolver.
It will be appreciated that there are a number of ways in which the gateway may be arranged to handle incoming calls from the resolvers providing control information. However, in preferred embodiments, the gateway comprises a gateway reply thread arranged to receive incoming calls from the resolvers, each time an incoming call is received, the gateway reply thread being arranged to create a resolver socket to handle the incoming call, to determine from the control information the relevant request handler thread, and to pass the resolver socket to that request handler thread to effect the connection between the resolver and that request handler thread. In preferred embodiments where the resolver provides a number of resolver threads, and each resolver thread is arranged to create first and second sockets providing a connection between the associated server and the gateway, the gateway reply thread is preferably arranged to receive the incoming call from one of said resolver threads via the second socket, whereby once the resolver socket has been passed to the appropriate request handler thread, the request handler thread is able to receive data from, and send data to, the associated server.
Further, in preferred embodiments, each time a request is received by the gateway, the gateway is arranged to create a client socket to provide a connection with the client, and to pass the client socket to the request handler thread allocated to that request, a connection between the resolver and the client hence being formed by the request handler thread passing data between the resolver and client sockets. By this approach, it will be appreciated that there is now a connection established between the associated server and the client to enable data to be transferred between the server and client.
Preferably, if either the resolver socket or the client socket is closed, the request handler thread is arranged to close the other socket thereby terminating the connection. Since at this point, it is deemed that the handling of the request is terminated, then the request handler thread is also arranged to terminate, since that request handler thread was generated by the gateway specifically to handle that request.
It will be appreciated that the present invention may be used in any scenario where it is appropriate to provide a plurality of servers to collectively provide a service to clients. However, in preferred embodiments, the plurality of servers are web servers accessible via the Internet, and the requests received from clients are http requests. Hence, in such preferred embodiments, the system is able to control a collection of web servers to service one or more web sites as if they were a single powerful web server.
In preferred embodiments, the request parcels are generated as request objects in an object oriented programming (OOP) environment, and the store is an object store arranged to store OOP objects. Preferably, the gateway and plurality of resolvers are computer programs operating on the Java platform, and the object store is formed as a JavaSpace. This approach removes the requirement for specialised hardware to control the plurality of servers, thereby reducing complexity and expense. The ubiquity of Java across multiple different platforms enables the plurality of servers controlled by the system to be of a variety of different types, for example PCs, Sun Workstations, Macs, etc., running different operating systems and different web server software.
Further, it will be appreciated that, given the network aware nature of the Java platform, the computer programs for the gateway and the resolvers could run on systems ranging from a single machine to many machines without extensive reconfiguring. Indeed, additional machines can be readily added to the system to increase the overall throughput of the system.
Viewed from a second aspect, the present invention provides a method of controlling a plurality of servers to process requests received from clients, comprising the steps of: (i) upon receipt of a request from a client, arranging a gateway to generate a request parcel from said request, the request parcel containing the request and predetermined control information used to identify the request; (ii) storing each request parcel generated by the gateway in a store; (iii) providing a plurality of resolvers, each resolver being associated with one of said plurality of servers, each resolver being arranged to perform the steps of: (a) requesting from the store a request parcel; (b) upon receipt of a request parcel from the store, extracting the request and control information from the request parcel; (c) passing the request to the associated server for processing; and (d) passing the control information to the gateway to enable data generated by the associated server to be routed to the client. Viewed from a third aspect, the present invention provides a computer program for controlling a computer to act as a resolver in a system in accordance with the first aspect of the present invention, the computer program being operable to configure the computer to implement the steps of: (a) requesting from the store a request parcel; (b) upon receipt of a request parcel from the store, extracting the request and control information from the request parcel; (c) passing the request to the associated server for processing; and (d) passing the control information to the gateway to enable data generated by the associated server to be routed to the client. Viewed from a fourth aspect, the present invention provides a computer program for controlling a computer to act as a gateway in a system in accordance with the first aspect of the present invention, the computer program being operable to configure the computer to implement the steps of: (a) upon receipt of a request from a client, generating a request parcel from said request, the request parcel containing the request and predetermined control information used to identify the request; and (b) sending the request parcel to a store for storage.
Viewed from a fifth aspect, the present invention provides a computer program product comprising a recordable medium having recorded thereon a computer program in accordance with the third or fourth aspects of the present invention.
The scope of the present invention also extends to the use of an object store to store request parcels generated by a system in accordance with the first aspect of the present invention.

Brief Description of the Drawings

The present invention will be described further, by way of example only, with reference to a preferred embodiment thereof as illustrated in the accompanying drawings, in which:
Figure 1 is a block diagram schematically illustrating a system in accordance with preferred embodiments of the present invention; Figure 2 is a flow diagram illustrating the start up sequence for the system in accordance with preferred embodiments of the present invention;
Figures 3 to 5 are flow diagrams illustrating various program threads executing within the gateway in accordance with preferred embodiments of the present invention; Figure 6 is a flow diagram illustrating a resolver manager program thread executing within each resolver in accordance with preferred embodiments of the present invention;
Figure 7 is a flow diagram illustrating how the weight factors used in Figure 6 are calculated in preferred embodiments of the present invention;
Figures 8 and 8a are flow diagrams illustrating the process performed by a resolver program thread in accordance with preferred embodiments of the present invention; and Figure 9 is a block diagram illustrating in more detail the operation of the system in accordance with preferred embodiments of the present invention.

Description of a Preferred Embodiment
Figure 1 is a block diagram illustrating the arrangement of the system in accordance with preferred embodiments of the present invention. For the sake of the following description, it will be assumed that the servers controlled by the system of preferred embodiments of the present invention are web servers accessible by clients over the Internet. In accordance with the preferred embodiments of the present invention, the plurality of web servers are grouped together under the control of the system illustrated in Figure 1 in order to service one or more web sites as if they were a single powerful web server.
With reference to Figure 1, http requests issued by clients 930, and relating to the one or more web sites managed by the system, are received at the gateway 10 over paths 15. The individual paths 15 illustrated in Figure 1 represent individual socket connections established by the gateway to individual clients making requests.
Each time a request is received by the gateway, the gateway is arranged to create a request object incorporating the request and control information used to identify the request, and is then arranged to transmit that request object to the object store 20 for storage. The process performed by the gateway will be discussed in more detail later. In accordance with preferred embodiments of the present invention, a resolver 30 is provided for each web server 920 controlled by the system, and the resolvers are responsible for retrieving request objects from the object store, passing the requests to their associated server for processing, and then returning the response to the gateway 10 for forwarding on to the appropriate client responsible for those requests. In preferred embodiments, the resolvers are arranged such that they only take requests from the object store at the rate their associated server can handle them, and thus the resolvers effectively balance their associated server's load to its capabilities. The process performed by the resolvers will be discussed in more detail later.
In preferred embodiments, the gateway 10, resolvers 30 and object store 20 are software entities developed using object oriented programming (OOP) techniques. Although it would be possible to run all of these software entities on a single machine, that would prove rather inefficient, and would limit the benefits available through use of the system illustrated in Figure 1. Accordingly, the software entities can run on a number of different machines, with the elements of the system being able to identify each other via the Lookup Schema 40.
In particular preferred embodiments, the gateway and resolvers are computer programs operating on the Java platform, and the object store is formed as a JavaSpace. The ubiquity of Java across multiple different platforms enables the plurality of servers controlled by the system to be of a variety of different types, for example PCs, Sun Workstations, Macs, etc., running different operating systems and different web server software. For more details of Java, the reader is referred to the web page http://www.sun.com/java/, whilst for more details of JavaSpaces, the reader is referred to the web page http://www.sun.com/jini/specs/js-spec.html.
The object store 20 formed as a JavaSpace provides a network available repository of objects. Objects may be put into the store and then retrieved based on type of object, and optionally on contents of object. The system of preferred embodiments uses the object store 20 to contain request objects generated by the gateway, and also to make system wide data (for example system wide setup information) available to all components of the system of preferred embodiments.
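As a hedged illustration of how such an object store might be used, the following sketch assumes the standard JavaSpaces API (net.jini.space.JavaSpace); the RequestParcelEntry class and its fields are hypothetical stand-ins for the request objects described above, and do not reproduce the actual classes of the preferred embodiment.

    import net.jini.core.entry.Entry;
    import net.jini.core.lease.Lease;
    import net.jini.space.JavaSpace;

    // Hypothetical entry type standing in for the request objects; JavaSpaces entries are
    // matched on their public fields, and require a public no-argument constructor.
    class RequestParcelEntry implements Entry {
        public String requestType;   // optional category, allowing type-based retrieval
        public String controlInfo;   // uniquely identifies the originating request handler thread
        public String requestData;   // the raw http request text
        public RequestParcelEntry() {}
    }

    class ObjectStoreSketch {
        // Post a parcel into the space (the JavaSpace proxy would be located via the lookup service).
        static void post(JavaSpace space, RequestParcelEntry parcel) throws Exception {
            space.write(parcel, null, Lease.FOREVER);
        }

        // Take a parcel of the given type, waiting up to the given timeout; null fields in the
        // template act as wildcards, so passing a null type matches parcels of any type.
        static RequestParcelEntry fetch(JavaSpace space, String type, long timeoutMillis) throws Exception {
            RequestParcelEntry template = new RequestParcelEntry();
            template.requestType = type;
            return (RequestParcelEntry) space.take(template, null, timeoutMillis);
        }
    }

Entries written without a transaction and with an indefinite lease remain in the space until explicitly taken; in practice a finite lease, or the transactional approach discussed below, would normally be preferred.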
In preferred embodiments, known techniques are used to protect the object store against concurrent access, and indeed the object store may be accessed via a suitable transaction processing system. The transaction processing system may be arranged to expire objects that have been in the store for too long, or to return objects to the store if the transaction is not completed within a specified time of the object being removed.
Further, transaction processing can optionally be used as part of the system of preferred embodiments to provide resilience in the event of failures. For example, if a resolver takes a request object from the object store, and then suffers a power failure, the transaction processing system will ensure that the request object reappears in the object store after a predetermined time, to enable that request object to be retrieved by another resolver.
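A minimal sketch of this resilience mechanism, assuming the Jini transaction API (net.jini.core.transaction) and the hypothetical RequestParcelEntry class from the earlier sketch, might look as follows; the lease and timeout values are arbitrary, and the transaction manager would itself be located via the lookup service. The preferred embodiment's actual transaction processing system is not specified in this detail.

    import net.jini.core.transaction.Transaction;
    import net.jini.core.transaction.TransactionFactory;
    import net.jini.core.transaction.server.TransactionManager;
    import net.jini.space.JavaSpace;

    class ResilientTakeSketch {
        static RequestParcelEntry takeUnderTransaction(JavaSpace space, TransactionManager txnMgr) throws Exception {
            // Create a transaction whose lease expires after 60 seconds; if the resolver dies
            // before committing, the transaction is aborted and the entry reappears in the space.
            Transaction.Created created = TransactionFactory.create(txnMgr, 60000);
            Transaction txn = created.transaction;
            try {
                RequestParcelEntry parcel =
                        (RequestParcelEntry) space.take(new RequestParcelEntry(), txn, 10000);
                // ... pass the request to the associated server here ...
                txn.commit();    // the entry is now permanently removed from the space
                return parcel;
            } catch (Exception e) {
                txn.abort();     // the entry is returned to the space for another resolver
                throw e;
            }
        }
    }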
Further, in preferred embodiments, the lookup schema 40 is formed as a Jini lookup service, Jini being a software infrastructure that allows services to be made available across a network, the reader being referred to the web site http://www.sun.com/jini/ for more details. More particularly, the lookup schema provides a register of services, which allows objects to identify each other across the network based on a range of criteria. These criteria include service names and the interfaces implemented by the service. All components of the system illustrated in Figure 1 can identify each other because they are all tagged with a particular label. Further, if multiple incidences of the system were to be run on the same network, these may be distinguished by adding a "Name" tag specific to each instance. Hence, in summary, the Jini lookup service holds the components of the system together, allowing them to identify each other and monitor each other's operating status. Individual components may be created or terminated, and the other components can use the notification capabilities of the Jini lookup service to take this in their stride, or to reconfigure to replace any missing components.
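Purely as an illustration of this lookup step, and not as the actual code of the preferred embodiment, the following sketch uses the Jini lookup API to locate a JavaSpace carrying a particular "Name" attribute; the lookup host and instance name are placeholder values.

    import net.jini.core.discovery.LookupLocator;
    import net.jini.core.entry.Entry;
    import net.jini.core.lookup.ServiceRegistrar;
    import net.jini.core.lookup.ServiceTemplate;
    import net.jini.lookup.entry.Name;
    import net.jini.space.JavaSpace;

    class LookupSketch {
        // Locates a JavaSpace registered with a Jini lookup service on a known host, filtering
        // on a "Name" attribute so that multiple instances of the system can be told apart.
        static JavaSpace findObjectStore(String lookupHost, String instanceName) throws Exception {
            LookupLocator locator = new LookupLocator("jini://" + lookupHost);
            ServiceRegistrar registrar = locator.getRegistrar();
            ServiceTemplate template = new ServiceTemplate(
                    null,                                   // match any service ID
                    new Class[] { JavaSpace.class },        // must implement the JavaSpace interface
                    new Entry[] { new Name(instanceName) }  // and carry the matching "Name" tag
            );
            return (JavaSpace) registrar.lookup(template);
        }
    }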
The network file system 50 is a common file store that allows all of the web servers controlled by the system in Figure 1 to serve the web site, and also contains any necessary databases or other forms of information to enable the web servers to persist state across requests. Such information could include items such as log-in details, session IDs, shopping basket contents, etc. It will be appreciated that sharing of such information is required, since different servers may handle related requests from the same client, and hence each server may need to be aware of this. Such a network file system 50 will be familiar to those skilled in the art, and accordingly the network file system 50 will not be described in any further detail herein, except to state that it must support a filing system that is compatible with all the different types of machine and operating system that are being used to host resolvers and the gateway. Appropriate client software must also be available for each of the platforms used.
More details of the operation of the system of preferred embodiments, and in particular more details of the operation of the gateway and resolvers, will now be provided with reference to the flow diagrams of Figures 2 to 8. Figure 2 illustrates the basic start up sequence for the system of preferred embodiments of the present invention. The process starts at step 200, and proceeds to step 210, where the object store 20 is initialised. The preferred embodiment of the object store makes use of a Sun Microsystems implementation of JavaSpace. To initialise the object store, the JavaSpace is created, and a name is allocated to it. The process then proceeds to step 220, where within each resolver 30 a resolver manager thread is started. This process will be discussed in more detail later with reference to Figure 6. The basic job of each resolver manager thread is to establish resolver threads to retrieve request objects from the object store, and then to manage the number of resolver threads based on throughput information. The process then proceeds to step 230, where the gateway main thread is started, this process being described later with reference to Figure 3. The basic job of the gateway main thread is to set up a gateway reply thread to handle incoming calls from resolvers, and to then set up request handler threads to handle incoming calls from clients. Once the gateway main thread has been started, the process then proceeds to step 240, where the start up sequence ends.
Figure 3 illustrates in more detail the operation of the gateway main thread. At step 300, the gateway main thread is started, and then at step 310, a gateway reply thread is created. This process will be described in more detail later with reference to Figure 4. Once the gateway reply thread has been created, the process proceeds to step 320, where a server socket is created to handle incoming requests from clients. As will be appreciated by those skilled in the art, a server socket is a socket that is made available to answer incoming calls on a specified address. When an incoming call is accepted by the server socket, a new instance of a socket is spawned to handle that incoming call while the server socket remains in place waiting for further incoming calls on the same address. The process then proceeds to step 330 to await an incoming call from a client.

The process stays at step 330 until an incoming call is received, at which point the process proceeds to step 340, where a client socket is spawned to handle the incoming call. The process then proceeds to step 350, where a request handler thread is created to handle the incoming call, the request handler thread being discussed in more detail later with reference to Figure 5. Once the request handler thread has been created, the gateway main thread passes the client socket to the request handler thread at step 360, and then the process returns to step 330 to await receipt of a further incoming call.
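A minimal sketch of this accept loop, using only the standard java.net API, is given below. The listening port is an arbitrary example, the creation of the gateway reply thread at step 310 is omitted, and RequestHandlerThread is a stub standing in for the request handler thread of Figure 5.

    import java.net.ServerSocket;
    import java.net.Socket;

    class GatewayMainSketch {
        // Stub for the request handler thread of Figure 5; the real thread reads the request,
        // posts a request object to the store and waits to be activated.
        static class RequestHandlerThread extends Thread {
            private final Socket clientSocket;
            RequestHandlerThread(Socket clientSocket) { this.clientSocket = clientSocket; }
            public void run() { /* steps 505 onwards, see Figure 5 */ }
        }

        public static void main(String[] args) throws Exception {
            ServerSocket serverSocket = new ServerSocket(8080);       // step 320: listen for client requests
            while (true) {
                Socket clientSocket = serverSocket.accept();          // steps 330/340: wait, then spawn a socket
                RequestHandlerThread handler = new RequestHandlerThread(clientSocket);  // step 350
                handler.start();                                      // step 360: hand the client socket over
            }
        }
    }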
The operation of a request handler thread will now be discussed in more detail with reference to Figure 5. The process starts at step 500, and at step 505 the client socket is received from the gateway main thread. This client socket represents the incoming call received at step 330 in Figure 3, and at step 510 the request handler thread is arranged to read a request from the client socket. At step 515, a request object is then generated which includes the request and predetermined control information. The predetermined control information can be any information which serves to uniquely identify the request. Since in preferred embodiments a dedicated request handler thread is provided for each request, the control information is arranged to uniquely identify the particular request handler thread generating the request object.
Once the request object has been generated, the request handler thread posts the request object to the object store 20 at step 520, and then the request handler thread waits for a short time or until it is activated at step 525, before proceeding to step 530 to determine whether it has been activated. If not, the process returns to step 525. As will become apparent from the following description of the actions taken by the various resolvers, the request handler thread will only be activated again once the request object has been retrieved by a particular resolver thread for handling by an associated server. Accordingly, the remainder of the process of Figure 5 will be discussed later once the process performed by the resolver threads has been described.
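The first part of this process might be sketched as follows. This is an illustrative approximation only: the ObjectStore interface and RequestParcel class are hypothetical stand-ins for the JavaSpace-based store, the use of a random UUID as the control information is an assumption (any value that uniquely identifies the thread would do), and the request is read as a single line for brevity.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.Socket;
    import java.util.UUID;

    interface ObjectStore {
        void post(RequestParcel parcel) throws Exception;
    }

    class RequestParcel {
        final String controlInfo;
        final String requestData;
        RequestParcel(String controlInfo, String requestData) {
            this.controlInfo = controlInfo;
            this.requestData = requestData;
        }
    }

    class RequestHandlerSketch extends Thread {
        private final Socket clientSocket;
        private final ObjectStore store;
        private volatile Socket resolverSocket;   // delivered later by the gateway reply thread

        RequestHandlerSketch(Socket clientSocket, ObjectStore store) {
            this.clientSocket = clientSocket;
            this.store = store;
        }

        // Called by the gateway reply thread (compare step 460) to activate this handler.
        synchronized void activate(Socket resolverSocket) {
            this.resolverSocket = resolverSocket;
            notifyAll();
        }

        public void run() {
            try {
                BufferedReader in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
                String request = in.readLine();                        // step 510, simplified to one line
                String controlInfo = UUID.randomUUID().toString();     // step 515: unique thread identifier
                store.post(new RequestParcel(controlInfo, request));   // step 520
                synchronized (this) {
                    while (resolverSocket == null) {                   // steps 525/530: wait until activated
                        wait(100);
                    }
                }
                // ... steps 535 onwards: pass data between resolverSocket and clientSocket ...
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }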
Figures 6 and 7 describe in detail the process performed within a resolver manager thread in accordance with preferred embodiments to manage the number of resolver threads to be used. Figures 6 and 7 will be discussed later, but for the time being it is sufficient to note that the resolver manager thread will always maintain at least one resolver thread active. The process performed by a resolver thread will now be discussed in detail with reference to Figures 8 and 8a.
As shown in Figure 8, the process starts at step 800, and then at 810, it is determined whether this thread should be terminated. This will depend on whether the particular resolver thread has received a signal from the resolver manager thread to indicate that it should terminate. If the resolver thread is to be terminated, the process branches to step 815 where the process exits. However, assuming that the thread is not to be terminated, the process proceeds to step 820, where it is determined whether there is a request object in the object store for that resolver thread. In preferred embodiments, each web server can preferably handle any requests received by the gateway, and accordingly a resolver thread can take any request object from the object store. Hence, at step 820, it is determined whether there is any request object in the object store awaiting processing. However, in alternative embodiments, particular web servers may only be able to handle particular types of requests, for example a subset of the total number of web servers may service one web site, whilst another subset of the servers may service another web site. In such cases, the request objects preferably incorporate additional information identifying the type of request, and the resolver thread will at step 820 request from the store a request parcel of one or more predetermined types that its associated server is able to handle.
If there is no suitable request object in the store at that time, then the process proceeds to step 823 where it waits for a suitable request object to become available or a predetermined timeout period to elapse. On arrival at step 826, if a suitable object is still not available, the process returns to step 810. However, assuming that there is a suitable request object in the object store, either immediately or after waiting up to the predetermined timeout period, then that request object is provided to the resolver thread, and the process then proceeds to step 830, where a socket is created to enable a call to be made to the associated web server. Then, at step 840, the request is extracted from the request object and sent via the socket to the web server for processing.
Further, at step 850, an additional socket is created to enable a call to be made to the server socket of the gateway reply thread. At step 860, the control information is then extracted from the request object and sent to the gateway reply thread via the socket created at step 850. At this stage, the connection between the associated server and the gateway reply thread is formed by the web server socket and gateway reply socket generated at steps 830 and 850, respectively. The process then proceeds to step 865, where the data transfer thread illustrated in Figure 8a is started. The resolver thread then waits at step 867 until data has started to flow from the web server before returning to step 810, whereby assuming this thread is not to be terminated, the thread then makes a request for a further request object from the object store.
The data transfer thread is described in Figure 8a. The thread starts at step 869, and proceeds to step 870, where it is determined whether there is data to be transferred between the web server and gateway reply sockets. If so, the process proceeds to step 875, where the data is transferred, the process then returning to step 870. If there is no data to transfer, then it is determined at step 880 whether either of the sockets has been closed. Either the client or the web server may cause the data pipe to close. If the web server closes its socket, this will cause the resolver thread's web server socket to close. If the client closes its socket, this will cause the request handler thread's client socket to close, this causing the request handler thread to close its resolver socket, which in turn causes the resolver thread's gateway reply socket to close. If neither of the sockets has been closed, then the process returns to step 870, since this indicates that there may still be some data to be transferred. However, if either of the sockets is determined to be closed at step 880, then this indicates that the transfer of data is complete, and accordingly the process proceeds to step 890, where the other socket is closed. At this point, execution passes to step 895, where the data transfer thread terminates.
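The following sketch combines the resolver thread steps 820 to 867 with the Figure 8a data transfer thread. It is illustrative only: the ParcelSource interface and Parcel class are hypothetical stand-ins for the object store and request objects, the host names and port numbers are placeholders, and the pump is shown in the server-to-gateway direction only, whereas the preferred embodiment also supports data flowing in the opposite direction.

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.Socket;

    class ResolverThreadSketch {

        interface ParcelSource {
            Parcel take(long timeoutMillis) throws Exception;   // blocks up to the timeout, may return null
        }

        static class Parcel {
            final String controlInfo;
            final String requestData;
            Parcel(String controlInfo, String requestData) {
                this.controlInfo = controlInfo;
                this.requestData = requestData;
            }
        }

        static void handleOneParcel(ParcelSource store, String webServerHost, String gatewayHost) throws Exception {
            Parcel parcel = store.take(5000);                          // steps 820/823: wait up to a timeout
            if (parcel == null) return;                                // step 826: nothing available yet

            Socket webServerSocket = new Socket(webServerHost, 80);    // step 830: call the associated server
            webServerSocket.getOutputStream().write(parcel.requestData.getBytes());        // step 840
            webServerSocket.getOutputStream().flush();

            Socket gatewaySocket = new Socket(gatewayHost, 9090);      // step 850: call the gateway reply thread
            gatewaySocket.getOutputStream().write((parcel.controlInfo + "\n").getBytes()); // step 860
            gatewaySocket.getOutputStream().flush();

            // Step 865: start the Figure 8a data transfer thread; this resolver thread is then
            // free to return to step 810 and request another parcel from the store.
            final Socket a = webServerSocket;
            final Socket b = gatewaySocket;
            new Thread(new Runnable() {
                public void run() { pump(a, b); }
            }).start();
        }

        // Figure 8a: copy data from the web server socket to the gateway reply socket until either
        // side closes, then close the other (steps 870 to 895).
        static void pump(Socket from, Socket to) {
            try {
                InputStream in = from.getInputStream();
                OutputStream out = to.getOutputStream();
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {   // steps 870/875: transfer while data is flowing
                    out.write(buffer, 0, n);
                    out.flush();
                }
            } catch (Exception closedOrFailed) {
                // one side has gone away; fall through and close the other (steps 880/890)
            } finally {
                try { from.close(); } catch (Exception ignored) {}
                try { to.close(); } catch (Exception ignored) {}
            }
        }
    }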
Having described how the resolver thread operates to retrieve a request object from the object store, and to then enable the transfer of data to the gateway reply thread, the process performed by the gateway reply thread will now be described in more detail with reference to Figure 4. As will be recalled from the description of Figure 3, the gateway reply thread is created by the gateway main thread, and once started at step 400 proceeds to step 410 to create a server socket for receipt of replies from resolver threads.
The process then proceeds to step 420, where it is determined whether an incoming call has been received from a resolver thread. The process waits at step 420 until such an incoming call is received, at which point the process proceeds to step 430, where a resolver socket is spawned to handle that incoming call. At step 440, the control information provided in the incoming call from the resolver thread (see step 860 of Figure 8) is read from the resolver socket.
As discussed earlier, the control information uniquely identifies the request handler thread responsible for the original request received from the client. Accordingly, at step 450, the gateway reply thread identifies from the control information the request handler thread responsible for that request, and then activates that request handler thread. The process then proceeds to step 460, where the resolver socket is provided to the request handler thread, the process then returning to step 420 to await a further incoming call from a resolver thread. Returning to Figure 5, which illustrates the process performed by a request handler thread, as discussed previously the request handler thread waits at step 530 until it has been activated. Given the previous description of Figure 4, it will be appreciated that the request handler thread is activated by the gateway reply thread following receipt by the gateway reply thread of the control information initially inserted into the request object by the request handler thread at step 515. Once the request handler thread has been activated, the process proceeds to step 535, where the resolver socket is obtained from the gateway reply thread. At this point, any data passed to the gateway reply thread via the gateway reply socket of a resolver thread will appear on the resolver socket passed to the request handler thread.
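The gateway reply thread of Figure 4 might be sketched as follows, again purely as an illustration: the handler registry, the Activatable interface and the listening port are assumptions rather than details taken from the preferred embodiment, and in a real implementation care would be needed when reading the control information so that the reader does not also consume reply data arriving on the same socket.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    class GatewayReplySketch extends Thread {

        // Hypothetical hook implemented by request handler threads (compare steps 450/460).
        interface Activatable {
            void activate(Socket resolverSocket);
        }

        // Hypothetical registry, populated by the gateway when each request handler thread is
        // created, mapping control information to the handler responsible for that request.
        private final ConcurrentMap<String, Activatable> handlers = new ConcurrentHashMap<String, Activatable>();

        void register(String controlInfo, Activatable handler) {
            handlers.put(controlInfo, handler);
        }

        public void run() {
            try {
                ServerSocket replySocket = new ServerSocket(9090);                  // step 410
                while (true) {
                    Socket resolverSocket = replySocket.accept();                   // steps 420/430
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(resolverSocket.getInputStream()));
                    String controlInfo = in.readLine();                             // step 440
                    Activatable handler = handlers.remove(controlInfo);             // step 450
                    if (handler != null) {
                        handler.activate(resolverSocket);                           // step 460
                    } else {
                        resolverSocket.close();    // no matching handler; drop the call
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }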
The process then proceeds to step 540, where it is determined whether there is data to transfer between the resolver and client sockets. This will be the case, for example, if data retrieved by the web server has been passed at step 875 of Figure 8a to the gateway reply socket, and has hence appeared at the resolver socket held by the request handler thread. Further, this will also be the case if the client subsequently sends any further request data via the client socket. If there is data to be transferred, the process proceeds to step 545, where the data is then transferred. If, however, there is no data to be transferred, the process proceeds to step 550, where it is determined whether either socket has been closed. As discussed previously with reference to Figure 8a, the client or the web server may cause the data pipe to close, and the resolver and gateway will then close sockets in response to such action by the client or web server. Assuming neither socket has been closed, the process returns to step 540, since this indicates that there may at some point be data to be transferred. However, if either socket has been closed at step 550, this indicates that there is no longer any data to be transferred, and accordingly the process proceeds to step 555, where the other socket is closed. At this point, the process then proceeds to step 560, where the request handler thread is terminated.

The above description has indicated how individual request objects are created and then processed by the system of preferred embodiments, and has described in detail how the data is then routed from the associated web server back to the client socket for distribution to the client. In many embodiments, this process will involve only the unidirectional routing of data back from the server to the client in reply to the initial http request. However, the process supports bidirectional flow of data, i.e. data flowing from the client back to the web server in addition to from the web server to the client, and hence supports HTTP 1.1.

Having described in detail the process performed by each resolver thread, the process performed by a resolver manager thread in accordance with preferred embodiments of the present invention in order to manage the number of resolver threads used by each resolver will now be described in detail with reference to Figures 6 and 7. As discussed earlier, in preferred embodiments there is one resolver manager thread for each resolver 30. The resolver manager thread is started at step 220 of the initial start up sequence, and proceeds from step 600 to 605, where a new resolver thread is created. Hence, during the initial iteration of the process, a resolver thread is created. The process then proceeds to step 610 where the resolver manager thread waits for a predetermined time, for example one minute. Then, at step 620, a parameter "t" is set equal to the current number of threads.
As each resolver thread executes, it sends certain predetermined information back to the resolver manager thread. For example, each resolver thread sends information back to the resolver manager thread, indicating whether it has waited for any requests, and also sends information concerning the current throughput of data through that resolver thread. Accordingly, at step 630, the resolver manager thread determines whether any of the resolver threads had to wait for any requests. This will depend on whether the resolver thread had to branch to step 823 before proceeding to step 830 as illustrated in Figure 8. At step 630, the resolver manager thread is basically trying to determine how busy the resolver is by determining how much it waited for request objects to become available. If at step 630 it is determined that the resolver threads did wait for at least some requests, then the process proceeds to step 635, where it is determined whether the resolver threads waited for every request. If not, then the process merely returns to step 610 without any adjustment to the number of resolver threads, or any update of the throughput information. However, assuming that the resolver threads did wait for every request, then at step 645 it is determined whether there is currently more than one resolver thread. If not, then again the process returns to step 610. However, assuming that there is more than one resolver thread, then the process proceeds to step 655, where one of the resolver threads is terminated. Following the decision to terminate a thread, the next resolver thread to execute step 810 will terminate and no others will terminate until a fresh decision to terminate has been made. The process then returns to step 610. Hence, by the above approach, it will be seen that the resolver manager will take action to reduce the total number of resolver threads (assuming there are more than one) if the resolver threads have had to wait for every request.
If at step 630, it is determined that the resolver threads did not wait for any requests, then the process proceeds to step 640, where current throughput information Pt is stored. Then, the process proceeds to step 650, where it is determined whether there is only one resolver thread. If there is, the process returns to step 605, where a new resolver thread is created. However, assuming that there is more than one resolver thread, the process proceeds to step 660, where a new set of weight factors Wx are calculated for x = t, t-1 and t+1. The calculation of the weight factors is discussed in detail with reference to Figure 7. The process starts at step 700, and proceeds to step 710, where the weight factor Wt-1 is set equal to the average of the previous Pt-1 values plus the latest Pt-1 value. Up to ten previous values may in preferred embodiments be stored. Then, at step 720, the weight factor Wt is set equal to the average of the previous Pt values plus the latest Pt value. Next, at step 730, it is determined whether there are any statistics for Pt+1. This would for example be the case if the resolver manager thread had previously terminated a resolver thread, such that there are currently fewer resolver threads than there had been previously. If there are statistics for Pt+1, then the process branches to step 735, where the weight factor Wt+1 is set equal to the average of the previous Pt+1 values plus the latest Pt+1 value. The process then proceeds to step 745, where the weight factor Wt+1 is altered by a predetermined factor, in preferred embodiments this alteration involving multiplying the value by 0.8. The process then proceeds to step 750. If, at step 730, it was determined that there were no statistics for Pt+1, then the process proceeds to step 740, where the weight factor Wt+1 is set equal to Wt*2 - Wt-1. The process then proceeds to step 750.
At step 750, the parameter X is set equal to the minimum of Wt-1, Wt and Wt+1, multiplied by a factor of 0.666. Then, at step 760, each of the weight factors Wt, Wt-1 and Wt+1 is altered by subtracting X from the values determined earlier in the process. At this point, the process then terminates at step 770.

Returning to Figure 6, the process then proceeds to step 670, where a random number R is generated between zero and a maximum value of Wt-1 + Wt + Wt+1. At step 680, it is then determined whether R is less than Wt-1, and if it is, then the process proceeds to step 655, where one of the resolver threads is terminated. Following this, the process returns to step 610. However, assuming that R is not less than Wt-1, the process proceeds to step 690, where it is determined whether R is less than Wt-1 + Wt. If it is not, then the process branches to step 605, where a new resolver thread is created, whilst if it is, the process returns directly to step 610 without creating a new resolver thread.

To summarise the operation of the process of Figure 6, waiting for requests causes the number of threads to be reduced, whilst not waiting for requests offers a probability of the number of threads being increased. Consider an example situation where the resolver had been working hard and had computed weight factors (which represent the data throughput achieved) for 1, 2 and 3 threads as 3, 12 and 20 respectively. Then the request rate falls away and the resolver decreases the number of resolver threads down to one thread. When the request frequency increases, the resolver thread never waits - there is always work as soon as it is ready for it. Hence, with reference to Figure 6, the process proceeds from step 630 down the "No" path to step 640, step 650 goes down the "Yes" path and accordingly a second thread is created immediately. During the next iteration (again assuming the threads never wait), the process proceeds from step 630 down the "No" path to step 640, step 650 goes down the "No" path and the calculation at step 660 is performed. There is then a probability of 20 divided by 35 (approximately 57%) that the number of threads will be increased, and a probability of 3 divided by 35 (approximately 9%) that the number of threads will be reduced. This random element forces values on each side of the current one to be tried from time to time (even if they have not been the best in the past), just to see whether their performance is better this time, since conditions may have changed.
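The decision made at steps 670 to 690, after the adjustment of steps 750 and 760, can be sketched as follows; the weight values are assumed to have been computed from the stored throughput history as described above, and the method names are illustrative only.

    import java.util.Random;

    class ResolverManagerSketch {
        private final Random random = new Random();

        // Returns -1 to terminate a resolver thread, +1 to create one, or 0 to leave the count alone.
        int decide(double wPrev, double wCurrent, double wNext) {
            // Steps 750/760: subtract two thirds of the smallest weight from all three weights,
            // which exaggerates the differences between them while keeping them non-negative.
            double x = Math.min(wPrev, Math.min(wCurrent, wNext)) * 0.666;
            wPrev -= x;
            wCurrent -= x;
            wNext -= x;

            // Steps 670 to 690: pick a random point in [0, wPrev + wCurrent + wNext) and see which
            // band it lands in, so better-performing thread counts are chosen more often while
            // every option retains a non-zero chance of being tried.
            double r = random.nextDouble() * (wPrev + wCurrent + wNext);
            if (r < wPrev) {
                return -1;                 // step 655: terminate one resolver thread
            }
            if (r < wPrev + wCurrent) {
                return 0;                  // keep the current number of threads
            }
            return +1;                     // step 605: create an additional resolver thread
        }
    }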
From the above description of Figures 6 and 7, it will be appreciated that the resolver manager thread basically implements a random controlled hunting algorithm with the aim of ensuring that the optimum number of threads are chosen dynamically, even in the event of varying operating and load conditions. The number of threads run by the resolver is important for maximising throughput while minimising delay and optimising the load sharing capabilities of the system. The process illustrated with reference to Figures 6 and 7 seeks to achieve this by monitoring the data throughput whilst adjusting the number of threads.
Figure 9 is a block diagram providing an overall illustration of the operation of the system of preferred embodiments, paying particular attention to the resolver thread detail. As shown in Figure 9, client devices 930 may communicate with the gateway 10 over, for example, the Internet, the gateway 10 then generating a request object for each request received from a client device 930, and posting that request object to the object store 20. A number of resolvers are then provided, each resolver consisting of a resolver manager 900 and a number of resolver threads 910, the number of resolver threads being managed by the resolver manager 900. Each resolver thread is arranged to retrieve request objects from the object store 20, and pass them to an associated web server 920 for processing. A uni-directional or bi-directional connection can then be established between the web server 920 and the client 930 via the relevant resolver thread 910 and the gateway 10 to allow HTML request and response data to be transferred. Further, each resolver thread 910 is arranged to provide throughput data to the resolver manager 900 to enable the resolver manager to effectively manage the number of resolver threads. The resolver manager 900 is then able to create and destroy resolver threads as required. It will be appreciated that the system of preferred embodiments may be used in any situation where it is useful to manage a plurality of servers to effectively work together as if they were a single more powerful server. However, a particularly advantageous application of the system of preferred embodiments is in situations where the data rates over the network are not huge, but the server side processor power is being severely tested. Such applications include e-commerce systems where the web server is having to validate credit card details, perform database searches to maintain shopping basket information, etc, or search engines which perform lengthy searches and generate the reply HTML page on the fly.
As mentioned previously, it would be possible though rather inefficient to run all of the entities of the system of preferred embodiments on a single machine. However, a more likely scenario would be for the gateway to run on a single fairly powerful machine, together with the object store, and for a pair of machines to then run a resolver and a web server to service the requests. It will be appreciated that adding more web servers to this system is then easy. The lookup schema allows another machine plugged into the network to locate the object store and begin processing requests. As already discussed, information associated with those requests gives the server an indication of the return path to be used for responses.
If the data rate itself were to become an issue, the first symptom may be that the gateway begins to run out of processor power. A solution to this would be to run the object store on a separate machine to the gateway. Again the lookup schema allows the resolvers and the gateway to locate the object store, but now the processor power needed to drive the object store is not impeding the ability of the gateway to handle the volume of incoming transactions and their responses.
From the above description, it will be appreciated that the system of preferred embodiments of the present invention provides a very flexible system for managing a plurality of servers. It will also be appreciated that the system can be replicated if desired. For example, it would be possible to include a firewall using IP address translation to send requests to more than one gateway. The gateways would each post requests in their own object store, and a group of associated resolvers would be clustered around each object store.
Accordingly, it will be appreciated that the system of preferred embodiments provides a distributed, scalable technique for managing web servers, and is particularly suitable for server intensive web site implementations. The system is able to control and co-ordinate a heterogeneous collection of web servers to service one or more web sites as if they were a single powerful web server. Further, since in preferred embodiments the system is software based, it does not require the use of expensive clustering hardware. In addition, in preferred embodiments, the system allocates requests to multiple web servers based on the actual loading of each server, this occurring via the monitoring and maintaining of a suitable number of resolver threads.
Since in preferred embodiments the system is based on the Java platform, it allows the web server cluster to comprise a variety of different devices, for example PCs, Sun Workstations and Macs running different operating systems and different web servers. Further, as discussed, the system could run on systems ranging from a single machine to many machines without extensive reconfiguring. Indeed, in many applications, increasing the throughput of the system would merely involve adding another machine to the network, loading the appropriate software component and arranging for it to be run. Further, through the use of a Jini lookup service in preferred embodiments, the system can be tolerant to the failure of individual components, since the Jini lookup service can distribute notifications about the disappearance of any of the system components to all of the other components, thereby enabling them to take appropriate action.
Although a particular embodiment of the invention has been described herewith, it will be apparent that the invention is not limited thereto, and that many modifications and additions may be made within the scope of the invention. For example, various combinations of the features of the following dependent claims could be made with the features of the independent claims without departing from the scope of the present invention.

Claims

1. A system for controlling a plurality of servers to process requests received from clients, comprising: a gateway arranged to be responsive to receipt of a request from a client to generate a request parcel from said request, the request parcel containing the request and predetermined control information used to identify the request; a store for receiving each request parcel generated by the gateway; a plurality of resolvers, each resolver being associated with one of said plurality of servers, each resolver being arranged to request from the store a request parcel, and upon receipt of a request parcel from the store being arranged to extract the request and control information from the request parcel, to pass the request to the associated server for processing, and to pass the control information to the gateway to enable data generated by the associated server to be routed to the client.
2. A system as claimed in Claim 1, wherein the resolver is arranged to request request parcels from the store dependent on the loading of the associated server.
3. A system as claimed in Claim 2, wherein the resolver provides a number of resolver threads, each resolver thread being arranged to request from the store a request parcel for processing by the associated server, and the resolver being arranged to be responsive to predetermined parameters indicative of the loading of the associated server to control the number of resolver threads.
4. A system as claimed in Claim 3, wherein data to be passed between the associated server and the gateway during processing of a request is routed via the corresponding resolver thread, and said predetermined parameters comprise throughput information identifying the throughput of data through each resolver thread.
5. A system as claimed in Claim 3 or Claim 4, wherein the resolver is arranged to monitor the time between each resolver thread requesting a request parcel and receiving the request parcel, and to control the number of resolver threads dependent on that time information.
6. A system as claimed in any of claims 3 to 5, wherein when one of said resolver threads receives a request parcel from the store, it is arranged to create a first socket and to pass the request to the associated server via that first socket, and to create a second socket and to pass the control information to the gateway via the second socket, a connection between the associated server and the gateway being formed by the resolver thread passing data between the first and second sockets.
7. A system as claimed in Claim 6, wherein if either the first or the second socket is closed, the resolver thread is arranged to close the other socket thereby terminating the connection.
8. A system as claimed in any preceding claim, wherein the gateway is arranged to create a request handler thread for each request received by the gateway, the request handler thread being arranged to generate the request parcel for the associated request.
9. A system as claimed in Claim 8, wherein when control information from a request parcel is passed from one of said resolvers to the gateway, the gateway is arranged to use the control information to identify the request handler thread responsible for the associated request, and to cause a connection to be established between that request handler thread and that resolver.
10. A system as claimed in Claim 9, wherein the gateway comprises a gateway reply thread arranged to receive incoming calls from the resolvers, each time an incoming call is received, the gateway reply thread being arranged to create a resolver socket to handle the incoming call, to determine from the control information the relevant request handler thread, and to pass the resolver socket to that request handler thread to effect the connection between the resolver and that request handler thread.
11. A system as claimed in Claim 10 when dependent on Claim 6, wherein the gateway reply thread is arranged to receive the incoming call from one of said resolver threads via the second socket, whereby once the resolver socket has been passed to the appropriate request handler thread, the request handler thread is able to receive data from, and send data to, the associated server.
12. A system as claimed in Claim 10 or Claim 11, wherein each time a request is received by the gateway, the gateway is arranged to create a client socket to provide a connection with the client, and to pass the client socket to the request handler thread allocated to that request, a connection between the resolver and the client hence being formed by the request handler thread passing data between the resolver and client sockets.
13. A system as claimed in Claim 12, wherein if either the resolver socket or the client socket is closed, the request handler thread is arranged to close the other socket thereby terminating the connection.
14. A system as claimed in any preceding claim, wherein the plurality of servers are web servers accessible via the Internet, and the requests received from clients are http requests.
15. A system as claimed in any preceding claim, wherein the request parcels are request objects, and the store is an object store.
16. A system as claimed in Claim 15, wherein the gateway and plurality of resolvers are computer programs operating on the Java platform, and the object store is formed as a JavaSpace.
17. A method of controlling a plurality of servers to process requests received from clients, comprising the steps of: (i) upon receipt of a request from a client, arranging a gateway to generate a request parcel from said request, the request parcel containing the request and predetermined control information used to identify the request; (ii) storing each request parcel generated by the gateway in a store; (iii) providing a plurality of resolvers, each resolver being associated with one of said plurality of servers, each resolver being arranged to perform the steps of:
(a) requesting from the store a request parcel;
(b) upon receipt of a request parcel from the store, extracting the request and control information from the request parcel; (c) passing the request to the associated server for processing; and
(d) passing the control information to the gateway to enable data generated by the associated server to be routed to the client.
18. A computer program for controlling a computer to act as a resolver in a system as claimed in any of claims 1 to 16, the computer program being operable to configure the computer to implement the steps of:
(a) requesting from the store a request parcel;
(b) upon receipt of a request parcel from the store, extracting the request and control information from the request parcel; (c) passing the request to the associated server for processing; and
(d) passing the control information to the gateway to enable data generated by the associated server to be routed to the client.
19. A computer program for controlling a computer to act as a gateway in a system as claimed in any of claims 1 to 16, the computer program being operable to configure the computer to implement the steps of:
(a) upon receipt of a request from a client, generating a request parcel from said request, the request parcel containing the request and predetermined control information used to identify the request; and (b) sending the request parcel to a store for storage.
20. A computer program product comprising a recordable medium having recorded thereon a computer program according to claim 18 or claim 19.
21. Use of an object store to store request parcels generated by a system as claimed in any of claims 1 to 16.
PCT/GB2000/003732 1999-11-30 2000-09-29 A system and method for controlling a plurality of servers WO2001040932A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU74377/00A AU7437700A (en) 1999-11-30 2000-09-29 A system and method for controlling a plurality of servers

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9928309.5 1999-11-30
GB9928309A GB2356947A (en) 1999-11-30 1999-11-30 Load balancing client requests amongst a plurality of servers

Publications (2)

Publication Number Publication Date
WO2001040932A2 true WO2001040932A2 (en) 2001-06-07
WO2001040932A3 WO2001040932A3 (en) 2001-11-01

Family

ID=10865449

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2000/003732 WO2001040932A2 (en) 1999-11-30 2000-09-29 A system and method for controlling a plurality of servers

Country Status (3)

Country Link
AU (1) AU7437700A (en)
GB (1) GB2356947A (en)
WO (1) WO2001040932A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114553764B (en) * 2020-11-24 2022-12-09 比亚迪股份有限公司 Automobile gateway route configuration system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5761647A (en) * 1996-05-24 1998-06-02 Harrah's Operating Company, Inc. National customer recognition system and method
US5796949A (en) * 1994-09-20 1998-08-18 Matsushita Electric Industrial Co., Ltd. Video data sending device and data high-rate returning device for sending a data through a computer network
US5857197A (en) * 1997-03-20 1999-01-05 Thought Inc. System and method for accessing data stores as objects
US5857188A (en) * 1996-04-29 1999-01-05 Ncr Corporation Management of client requests in a client-server environment
WO1999060459A2 (en) * 1998-05-19 1999-11-25 Sun Microsystems, Inc. Method and apparatus for effective traffic localization through domain name system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5774668A (en) * 1995-06-07 1998-06-30 Microsoft Corporation System for on-line service in which gateway computer uses service map which includes loading condition of servers broadcasted by application servers for load balancing
GB2320112B (en) * 1996-12-07 2001-07-25 Ibm High-availability computer server system
GB2320594A (en) * 1996-12-20 1998-06-24 Ibm Dispatching client method calls to parallel execution threads within a server

Also Published As

Publication number Publication date
AU7437700A (en) 2001-06-12
GB9928309D0 (en) 2000-01-26
GB2356947A (en) 2001-06-06
WO2001040932A3 (en) 2001-11-01

Similar Documents

Publication Publication Date Title
EP1116112B1 (en) Load balancing in a network environment
US8234650B1 (en) Approach for allocating resources to an apparatus
EP0479660B1 (en) Distributed configuration profile for computing system
US7703102B1 (en) Approach for allocating resources to an apparatus based on preemptable resource requirements
US7463648B1 (en) Approach for allocating resources to an apparatus based on optional resource requirements
US7822860B2 (en) Method and apparatus for dynamic reconfiguration of web services infrastructure
US8179809B1 (en) Approach for allocating resources to an apparatus based on suspendable resource requirements
JP3124955B2 (en) Server mapping method and arbitrator
KR100330936B1 (en) Workload management amongst server objects in a client/server network with distributed objects
US8032634B1 (en) Approach for allocating resources to an apparatus based on resource requirements
US8019870B1 (en) Approach for allocating resources to an apparatus based on alternative resource requirements
US8019835B2 (en) Automated provisioning of computing networks using a network database data model
US7152109B2 (en) Automated provisioning of computing networks according to customer accounts using a network database data model
CA2372092C (en) A queuing model for a plurality of servers
EP1242882B1 (en) A digital computer system and a method for responding to a request received over an external network
CN101124565B (en) Data traffic load balancing based on application layer messages
US5790809A (en) Registry communications middleware
US6611861B1 (en) Internet hosting and access system and method
US20070150602A1 (en) Distributed and Replicated Sessions on Computing Grids
WO2003058462A1 (en) System for optimizing the invocation of computer-based services deployed in a distributed computing environment
CN108933829A (en) A kind of load-balancing method and device
US20220318071A1 (en) Load balancing method and related device
US6922832B2 (en) Execution of dynamic services in a flexible architecture for e-commerce
Zhang et al. Creating Linux virtual servers
US7627650B2 (en) Short-cut response for distributed services

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP