Data Communications Networks, Systems and Methods
The present invention concerns systems and methods for distributing data ("content") from host systems/servers to clients/recipients via a data communications network such as the Internet. The invention is particularly concerned with the distribution of streaming media content (e.g. video or audio). In its preferred embodiments, the invention provides systems whereby streaming media content can be distributed from a server to one or more clients over the Internet using standard TCP/IP and HTTP.
The invention will be described with particular reference to the distribution of video content. However, it will be understood that the systems and methods provided are equally applicable to any other types of streaming media and potentially useful for the distribution of any type of data.
One object of the invention is to permit the efficient distribution of data to recipients having various mutually incompatible and possibly time-varying requirements. This is accomplished by transferring the 'intelligence' required within the system from the server to the recipients in a distributed fashion.
It can be desirable to make data available on a widespread basis, for instance using the Internet. Such data will, unless generated in real-time, be hosted at a server, which will respond to requests by transmitting the entire file. This is the system used, for instance, under HTTP (Hypertext Transfer Protocol). Such a distribution system, however, assumes that all potential recipients have equal access needs to the file contents and, indeed, equal permission to access the file contents. This can cause problems such as:
- Different recipients of the stream can be connected at any one of a wide range of bit rates according to the nature of their connection technology, rendering download time for the entire file contents excessively long for some of them.
- Different recipients may have different levels of permission for access to file contents. For example, some may have the right only of access to summary contents and/or data only relevant to one aspect of the subject matter of the file.
- Different recipients may have different possibilities to utilise the contents of the file, for instance different storage or processing abilities if further computation is necessary, or different means of displaying aspects of the file. This will be true for instance if the range of potential recipients includes both desktop and mobile devices.
Nevertheless the recipient will, in general, wish to receive data appropriate to their needs, permissions and facilities. The supplier must seek to satisfy each recipient on an individual basis.
As stated above, the data for distribution is hosted at some point under the supervision of a 'server'. In conventional streaming media systems, typically utilising RTSP (Real Time Streaming Protocol) and UDP/IP, the server has responsibility for controlling the distribution of the data to end recipients. The host will have available, or will be able to generate, multiple versions of the same content, typically encoded and compressed in any of a variety of standard formats such as MPEG-4, suitable for distribution via channels having differing bandwidths to a variety of types of recipient of differing complexity, processing capacity etc. The server has sufficient intelligence to be able to determine the appropriate degree and form of distribution for each recipient individually. This intelligence is used in responding to requests from recipients and requires finite computation by the server in responding to each request.
The information necessary for the computation may be acquired by any of a number of routes that include, but are not necessarily restricted to:
- By enquiry of the recipient during initiation of the service.
- In the case of distribution within a Java environment, use of a 'script' which detects, for instance, the bandwidth of the connection or the capacity of the recipient device in any one or more relevant aspects (e.g. the sophistication of the decoder used by the recipient for decoding the content), on initial request for the transfer and informs the server of the values detected such that the server thereafter transmits data in an appropriate form.
- By the supply direct to the server from an authorised third party of the capacities and permissions of all authorised recipients, for all permitted end devices and connection technologies and speeds.
Conventional systems of this type have a number of disadvantages, as follows:
- Each of these techniques requires the server to maintain "knowledge of the recipient", which, if there are potentially many simultaneous recipients, can be a complex process.
- The convenience of access for authorised recipients can be materially impaired. For instance, access from a new device or from a device accessed on an opportunistic basis may be refused until a complex upgrading of permissions has been performed.
- It is not a simple process to respond dynamically to changes in the recipient's requirements during the period of data access. For instance, changes in connection bandwidth can be important under some conditions of use.
The present invention arises from the recognition that the units best able to define the desirable characteristics of a received data file are the recipients themselves. The invention operates by transferring complete control of the assembly and delivery of the received data to those recipients. In contrast to the state of the art, adoption of this control structure relieves the 'server' of the need for any knowledge of the recipient's characteristics and requirements whatsoever and therefore eliminates the server-side complexity otherwise necessary to provide appropriate data to each recipient.
The invention in its various aspects is defined in the claims appended hereto. Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
Fig. 1 is a diagram illustrating the operation of a data distribution system in accordance with one embodiment of the invention.
Fig. 2 is a diagram illustrating the structure of a data file in accordance with one embodiment of the invention.
Referring now to the drawings, Fig. 1 illustrates the operation of one preferred embodiment of the invention, which will be described with reference to streaming video content. A server or host system 10 hosts video content data to be distributed to one or more clients or recipients 12. In the most preferred embodiment, the host 10 is connected to the client 12 via the Internet and all transactions between the server and client are by means of TCP/IP and HTTP. The server 10 stores a number of different versions of the content data, the different versions being suited to different channel bandwidths, different recipient devices etc., as described in more detail below. Apart from the actual content data, the server 10 also stores data about the content that is available from the server. The content data and the data about the content are stored in a data structure or format that is known or can be made known to the client 12.
When the client wishes to receive particular content, it begins by sending a content instruction identifying the content of interest to the server, which responds by sending to the client the data about the content specified in the content instruction. As shall be described in more detail below, the data about the content includes details of the different available versions of the content and their locations within the data structure. On the basis of the data about the content received from the server, the client decides what version of the content it wishes to retrieve from the server and sends a transfer instruction to the server that identifies that portion of the content data that corresponds to the selected version. In the preferred embodiments, the transfer instruction points directly to the beginning of the required content data within the data structure; i.e. the content data transfer instruction identifies a specific set of bits within the content data such that the server need exercise no discretion in selecting the content data for transmission to the client in response to the instruction. In other words, the client has "random access" capability in relation to the data structure, however the data is arranged within that structure. The server then begins transmitting the data corresponding to the selected version to the client; i.e. the action of the client is effectively to perform a remote file access into the content data to obtain the content which it currently requires.
The invention thus allows the client to randomly access the content held by the server in the light of what is available from the host and the client's own current capabilities and needs, thereby avoiding the need for complex server functions and associated problems that arise when the server is required to determine what particular content to send to a particular client at any given time.
In the preferred embodiment, the client monitors one or more parameters relating to the receipt and/or processing of the content data. Typically, the client includes a frame buffer for temporary storage of incoming frames while they are decoded. If the incoming data transfer rate is too high or the data is too complex, the frame buffer will begin to fill up; if the frame rate is too low or the data too simple, the frame buffer will be under-utilised. By monitoring the content of the frame buffer (e.g. in terms of number of bits or in terms of number of frames), the client can determine whether the current version of the data content being retrieved from the server is appropriate, or whether another version (e.g. better or poorer quality) should be used. Alternatively, the client may monitor the data transfer rate or other useful parameter(s) by other means. The client may then send another transfer instruction, identifying a different version of the content data, and the server then begins sending the data of the newly selected version. While the new version may be a different version of the same content data as encoded in the first version, in the case of streaming media such as video the new version will generally comprise the next segment in a sequence of segments, selected so as to correct the parameter(s) by means of which the appropriateness of the selection is monitored. In this way, the client seeks to maintain the content of the buffer within predetermined limits, e.g. substantially constant on average over a given time interval, or at least to ensure that the buffer never becomes empty.
The approach described above enables streaming media distribution services and the like to be implemented using conventional TCP/IP + HTTP web server technology, instead of the specialised (relatively complex and expensive) technology required for conventional streaming media servers using UDP/IP + RTSP or the like. The methods of the present invention require specific action during data assembly (i.e. when setting up the data structures on the host system) and at the recipient device. For example, in one embodiment:
(A) During data assembly:
- Construct a description (the "data about the content") of the content file which identifies such aspects as:
  - The nature of the classes (types) of information contained;
  - The levels of abstraction to which the information is prepared (e.g. to suit different levels of access permission);
  - The levels of access to which different aspects/portions of the file contents are tied;
  - The positions ("offsets") within the file at which different aspects of the data may be found, together with the number of bits used to represent each such aspect.
It should be noted that this list is for example only and that specific implementations may use additional and/or other aspects.
- The data about the content and the content data itself are embodied in a data structure of some convenient type (which may vary from file to file, subject only to the restriction that individual recipients can be made aware of the structure and meaning of the data structure at least to a level appropriate to their level of access permission; e.g. the data about the content may include data about the data structure).
It will be understood that the data about the content may be incorporated as part of the content data file or be stored as a separate file associated with the content data file. The data about the content can be accessed by the client from the server or another source at any appropriate time.
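By way of illustration only, the description constructed during data assembly might be represented as in the following minimal Python sketch. The names `PortionEntry`, `ContentDescription` and `portions_for`, the integer scale used for access levels and the sample values are all assumptions made for this example; the specification does not prescribe any particular representation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PortionEntry:
    """One entry of the data about the content (hypothetical layout)."""
    info_class: str    # class (type) of information, e.g. "video"
    abstraction: str   # level of abstraction, e.g. "summary" or "full"
    access_level: int  # minimum access level required (assumed integer scale)
    offset_bits: int   # position of the portion within the content file
    length_bits: int   # number of bits used to represent the portion


@dataclass
class ContentDescription:
    """The data about the content: an index of the portions of one file."""
    entries: List[PortionEntry] = field(default_factory=list)

    def portions_for(self, access_level: int) -> List[PortionEntry]:
        """Return the portions a recipient at this access level may request."""
        return [e for e in self.entries if e.access_level <= access_level]


# Sample description with three portions laid out back to back.
description = ContentDescription([
    PortionEntry("text", "summary", 0, 0, 8_000),
    PortionEntry("text", "full", 1, 8_000, 64_000),
    PortionEntry("video", "full", 2, 72_000, 4_000_000),
])

for entry in description.portions_for(1):
    print(entry.info_class, entry.abstraction, entry.offset_bits, entry.length_bits)
```

In this sketch a recipient holding access level 1 would see only the two text portions, and could request either of them by offset and length without the server needing to know anything about the recipient.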
(B) Server/recipient activity:
- On initiating a content data transfer, some or all of the data about the content is read from the server by the recipient device.
- The recipient process interprets the received data about the content to identify an appropriate section of the content data file to receive and sends a transfer instruction to the server to transmit the relevant portion of the content data file.
- The server responds by supplying that portion of the content data file that has been identified in the transfer instruction. In doing so the server exercises no discretion and need perform no checks, all such decision making having effectively been executed at the recipient.
- Thereafter, the recipient instructs transfer of additional portions of the data file according to the needs and access permissions of the user.
In this way the data transferred is limited to that required by and appropriate to the recipient at the time that the transfer is taking place. It will be recognised that this mechanism includes, but is not limited to, files that are interpreted as they are received. For instance, files representing compressed video and/or audio may commence to be played before completion of the file transfer from the 'host server' to the recipient.
The preferred implementation of the invention comprises a combination of hardware and software involving a host system equipped with a simple server software package providing file transfer functionality, to which a recipient device is connected by means of a duplex channel. The protocol used to implement the file transfer functionality must provide random access to the file or files held on the host machine; e.g. HTTP.
By way of example, and with reference to Fig. 2 (though the invention is not limited to this application), the host system is furnished with a file comprising concatenated data segments which represent the following types of data:
(a) Multiple versions of segments of video. Let these segments be denoted by the upper case letters A, B, C, ..., Z, where the source versions of the video segments, if concatenated in the order A -> B -> C -> ... -> Z, would form the complete video sequence.
Each segment is compressed by different factors such that there are multiple compressed versions of each segment, (A1, A2, ..., An) etc., of different sizes (expressed in bits) and/or complexities (expressed in MIPS) necessary for decode.
(b) Data about the content, comprising a number of fields equal to the number of source segments, each field identifying:
- The position of the segment within the total sequence, i.e. its status as 'first', 'second', etc.;
- The number of versions of the segment;
- For each version of each segment, its size in bits and the position (offset) within the file of the first bit of its representation, together with the number of frames of video that it represents.
(c) Identifier data of n bytes length, which will be in the same form for all possible files of the type discussed, which identifies:
- The size of the data about the content described in (b) above, expressed in bits;
- A code pattern which identifies that the total file is for compressed video and that its data structure has the form described in (b), i.e. the recipient knows the field arrangement of the data structure.
These three units are concatenated within the entire file such that the n-byte identifier is first, followed by the data about the content, followed by the content data.
The recipient device is furnished with a software process that can execute the following functions:
- Buffer content data received from the host in a software buffer until it is passed to a process for decoding;
- Monitor the bandwidth in bits per second of its input channel from the host, e.g. by observing the contents of the input buffer as a function of time;
- Decode and display video drawn from the input buffer;
- Monitor the decode complexity necessary to decode at the full appropriate rate;
- By reference to a copy, held by the recipient, of the data about the content held by the host, instruct transfer of those segments of the content data file held at the host, identified both by position of first bit and number of bits, which correspond to the next segment of video that is required, the selection being on the basis of the number of frames per second required for display compared to the number of frames per second in the data stream for that segment, by reference to the size of the version in bits, the number of frames in the segment and the currently observed channel bit rate;
- By reference to a copy, held by the recipient, of the data about the content held by the host, instruct transfer of those segments of the content data file held at the host, identified both by position of first bit and number of bits, which correspond to the next sub-sequence of video that is required, the selection being on the basis of the decode complexity necessary to maintain the appropriate display rate.
The procedure executed by the recipient process on commencing the transfer of a stream from the host is:
- As a first action, to instruct the transfer of the first n bytes of the file, i.e. the n-byte identifier which defines the size and format of the data about the content;
- As a second action, to instruct the transfer of the next m bits of the file, m being the number of bits comprising the data about the content as defined by the n-byte identifier;
- As a series of subsequent actions, to instruct transfer of selected versions of the successive segments making up the stream, these being selected from those available within the content data file at the host by reference to the data about the content and the observed current channel conditions.
Summarising the key features of the preferred embodiment of the invention:
- A duplex communication channel exists between the server and the client.
- Data assembly on the server is to a form of content data that can be subdivided according to access conditions appropriate to that data, and data about the content indexing those subdivisions according (at least) to type, position and size is constructed.
- The recipient process receives the data about the content such that it is aware of all the aspects of the data that are available to it, and their positions and sizes within the content data file.
- The recipient process instructs which segment from the content data file is to be transferred by the server. This is repeated for each required segment until all necessary content data has been received.
In the preferred embodiment, as just described, all of the content data for the different versions of the content and data about the content (including, where necessary, data about the data structure) are included in a single file. Alternatively, the content data could be divided into a number of files. For example, the various segments or other subsets of the data could be stored as separate files (all versions of one segment might be in one file) or different subsets of segments might be concatenated in a number of separate files, etc. Also, the data about the content could be stored separately from the content data in one or more separate files or as entries in a database, on the server or at another source.
Where the data structure is fixed/predetermined, the client can be assumed to have knowledge of the data structure. Otherwise, the data about the data structure can be included in or stored separately from the data about the content.
As used herein, "content" means any type of data of interest to the client/end-user, and "segment" means any subset of the content data defined within the data structure.
Access control in relation to the content data and/or the data about the content may be implemented in any of a number of ways, including but not limited to:
- On the basis of permissions granted by a control authority.
- By the ability of the client to interpret relevant portions of the content data and/or the data about the content.
- By means of a key allocated to a user that controls the functionality of the client in accessing content data and/or data about the content.
- By means of a process at the server which validates instructions received from a client in terms of the validity of those instructions against permission definitions held locally, the instruction being accompanied by a password or other identification pattern to specify the permissions held by the current end user.
- By reference to those portions or aspects of the content data that are suited to the capability of the client or to the bandwidth of the channel to the host device in the light of current needs.
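The start-up procedure and segment selection described above for the Fig. 2 file layout can be sketched, purely as an illustration, in Python. Everything concrete here is an assumption made for the example rather than part of the specification: the code pattern `VSEG`, the big-endian 32-bit field encoding, the one-byte version count, the function names, the rule that segment order in the description gives its position in the sequence, byte-aligned offsets (so that an HTTP `Range` header, which counts bytes, can express them), and the particular selection rule, which picks the largest version able to arrive within its own playing time.

```python
import struct
from typing import Callable, List, Tuple

MAGIC = b"VSEG"                  # assumed code pattern identifying the file type
IDENT = struct.Struct(">4sI")    # identifier: code pattern + description size in bits
VERSION = struct.Struct(">III")  # per version: offset (bits), size (bits), frames

Version = Tuple[int, int, int]


def build_file(segments: List[List[Tuple[bytes, int]]]) -> bytes:
    """Data assembly: identifier, then data about the content, then content."""
    meta_len = sum(1 + VERSION.size * len(s) for s in segments)
    pos = IDENT.size + meta_len          # byte offset of the first content byte
    meta, body = bytearray(), bytearray()
    for versions in segments:
        meta.append(len(versions))       # number of versions of this segment
        for data, frames in versions:
            meta += VERSION.pack(pos * 8, len(data) * 8, frames)
            body += data
            pos += len(data)
    return IDENT.pack(MAGIC, meta_len * 8) + bytes(meta) + bytes(body)


def read_description(fetch: Callable[[int, int], bytes]) -> List[List[Version]]:
    """First and second actions: fetch the identifier, then the description."""
    magic, meta_bits = IDENT.unpack(fetch(0, IDENT.size))
    if magic != MAGIC:
        raise ValueError("unrecognised file type")
    meta = fetch(IDENT.size, meta_bits // 8)
    segments, i = [], 0
    while i < len(meta):
        count = meta[i]
        i += 1
        segments.append([VERSION.unpack_from(meta, i + k * VERSION.size)
                         for k in range(count)])
        i += count * VERSION.size
    return segments


def choose_version(versions: List[Version], channel_bps: float, fps: float) -> Version:
    """Pick the largest version whose transfer time fits its playing time."""
    ok = [v for v in versions if v[1] / channel_bps <= v[2] / fps]
    if ok:
        return max(ok, key=lambda v: v[1])
    return min(versions, key=lambda v: v[1])  # nothing fits: take the smallest


def range_header(version: Version) -> str:
    """HTTP Range value for one version (byte-aligned offsets assumed)."""
    first = version[0] // 8
    return f"bytes={first}-{first + version[1] // 8 - 1}"


# Two segments, each in a small and a large version of 25 frames.
blob = build_file([[(b"a" * 100, 25), (b"A" * 400, 25)],
                   [(b"b" * 100, 25), (b"B" * 400, 25)]])


def fetch(start: int, length: int) -> bytes:
    """Stand-in for a random-access transfer such as an HTTP range request."""
    return blob[start:start + length]


description = read_description(fetch)
stream = b""
# Subsequent actions: one transfer per segment, under a falling channel rate.
for versions, bps in zip(description, (4000.0, 500.0)):
    chosen = choose_version(versions, bps, fps=25.0)
    stream += fetch(chosen[0] // 8, chosen[1] // 8)
```

The `fetch` function is the only point of contact with the host, and it carries nothing but a position and a length, which reflects the point that the server exercises no discretion. In a deployment over HTTP, `range_header` shows how the same position and length might be expressed in a request.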
Many other variations or modifications of the embodiments described above are possible within the scope of the invention, including but not limited to:
- The user device could equally seek to keep a local buffer full or at some preset level.
- The set of content data segments could be related to some subscription related aspect such that access to quality is limited to some maximum level.
- There could be transmitted, in association with each compressed data content segment, an indication of the properties of the set of encodings of the next segment as an aid to the decoder in deciding which version to request next.
- There can be forward and backward seeks within a remote file if the data about the content includes information of where versions of particular frames can be found.
- The recipient process could report on quality actually delivered, not only in terms of effective bandwidth consumed but also on delivered quality such as number of frames dropped or minimum bandwidth actually delivered.
- The individual segments need not be entirely self-contained in the sense of commencing with a key frame. A non-key frame headed segment could use the reference frame(s) for the previously decoded segment. In the event of bandwidth step-down this would propagate high quality into the continuing sequence; in the event of bandwidth step-up it would propagate lower quality but, without the overhead of a key frame, quality should build faster. Note that key frames could anyway be introduced at regular intervals, preferably at the start of a segment.
- The system could equally well be applied to the delivery of any other compressed streaming media. For instance, audio streams could be compressed and delivered in this way.
- Where the scheme is applied to combined video and audio then, in the case of a diminution of bandwidth, there could be a choice, possibly guided or controlled by the user, between preferentially degrading the quality of the audio or of the video to accommodate the reduced bandwidth.
Other improvements and/or modifications may be incorporated without departing from the scope of the invention as defined in the appended claims.