WO2002010905A2

WO2002010905A2 - Method and system for accelerated content download over a data network

Info

Publication number: WO2002010905A2
Application number: PCT/IL2001/000422
Authority: WO
Inventors: Ilana Nir
Original assignee: Intra Inc.
Priority date: 2000-08-01
Filing date: 2001-05-11
Publication date: 2002-02-07
Also published as: AU2001256636A1; IL141108A0; WO2002010905A3

Abstract

In a data network, a method and system for processing files held on a server. References to external files within primary files are replaced by references to a program. A list is created by parsing the primary files and storing therein the links connecting the primary files with the associated external files. The set of external files referenced by the primary file is packaged into an archive along with the program. A list is created by parsing the logically high level primary file and storing the links therein connecting logically low level primary files referenced by a logically high level primary file. A list of archives associated with the logically lower level primary files is established and inserted into the logically higher level primary file. The modified primary file and associated archives are transmitted to a client. A program implemented on the client downloads, extracts, and preprocesses the archives.

Description

METHOD AND SYSTEM FOR ACCELERATED CONTENT DOWNLOAD

OVER A DATA NETWORK

BACKGROUND OF THE INVENTION

FIELD OF THE INVENTION

In general the present invention relates to a system and method for enhancing the performance of a content provider server operating on a computer platform in a data communication network. More particularly, the present invention relates to a system and method for significantly reducing the size of transmittable data structures, the associated downloading times, the number of file requests, and the resulting response times associated with the operation of the system.

DISCUSSION OF THE RELATED ART

Typically, data networks such as the Internet have a client-server architecture and therefore comprise two functionally different types of computing systems: clients requesting information content with associated applications or services, and servers organized to respond to the clients' demands by supplying the requested content and related services. The request/response mechanism operating among the two above-mentioned systems is known. Typically, servers store related units of information content suitably structured into transmittable form such as connected HTML pages and related files. Clients utilize specific software such as Web browsers to access the servers via communication devices, to select the desired units of information, and to request transmittal or the downloading of the selected content. The browser software facilitates the delivery of the requested content into the storage devices of the computer platform hosting the client.

With the extensive growth in the use and popularity of the Internet, an ever-increasing number of diverse computers are connected. Consequently, traffic volumes are increasing, causing escalating congestion of the communication lines, inundating network sites and making the process of accessing servers increasingly problematic. Typically, many popular servers are difficult to access and most major content providers are practically unreachable for up to 10 percent of the time. Some sites are overloaded at times to such an extent that the limit of their capacity to handle all the incoming requests is exceeded and therefore a substantial percentage of their incoming traffic is routinely dropped. The severity of the problem is well known. Information content, application content, and services are not always delivered swiftly, reliably and efficiently to the clients operated by potential customers of commercial sites. Servers supporting financial sites, entertainment and media sites, e-Commerce shop fronts and other high-traffic sites often experience catastrophic degradation in response times as they are overwhelmed by a number of visitors in excess of all expectations. For the commercial sites the increasingly inefficient delivery of requested content means that potential customers and their revenue might go elsewhere.

Clients are the ones most adversely affected by slow downloading times. Typically, accessing the Internet involves non-fixed payments to service providers such as ISPs (Internet Service Providers) or telephone companies. The amount of the calculated bill is directly affected by the length of time the client is connected to the network. Excessively long downloading times result in higher bills. Furthermore, slow response times delay the completion of important transactions initiated by a user, not infrequently at the most critical stages when the customer is unable or unwilling to interrupt the processing. Longer downloading times also bring about a higher probability of failures such as the disruption of communications during a transaction. This in turn necessitates the repetition of the transaction. Consequently additional load is brought upon the system.

Presently various solutions exist designed to limit the impact of the congestion. Some of the solutions are hardware-oriented such as increased bandwidth usage, utilization of mirror sites or distributed, replicated servers, proxy servers, and lately the establishment of dedicated networks of servers deployed throughout the network in order to push content to servers closer to the clients. Other solutions, which typically operate in conjunction with additional hardware, involve the use of specifically developed software applications such as load-balancers, caching mechanisms, traffic and usage pattern analysis, usage prediction engines and the like. All of the existing solutions have considerable disadvantages. Some of them, such as the addition of supplementary servers and the use of dedicated server networks, require considerable investment. Others, such as software applications attempting to intelligently predict patterns of usage or applying various caching techniques, have limited usefulness as they are frequently inefficient and all too often fail to provide the optimal solution in such an extremely dynamic environment as the Internet. None of these solutions provides for interactive customization. None of the proposed solutions offers modifications in the server and client application logic or facilitates compression, packing and re-packaging of the transmittable content.

The number of customers utilizing data networks is constantly increasing. The ever expanding public of users attempts at any point in time to interact with a plurality of content and application structures having highly complex formats and a growing number of features. The present trend of dynamic growth in user demands is very likely to continue. Therefore the congestion of the data communication systems on the network level is also very likely to persist irrespective of all the technological measures that may be effected in the near future on the hardware level. Therefore, it will be easily perceived that there is a need for alternative solutions concerning the serving up of information, media and application content. Such a solution should reflect the acceptance of the reality of the inherently inefficient traffic management in the Internet and should therefore focus on handling the performance problems on the basic level of server-side and client-side applications and on the transmittable data structures.

SUMMARY OF THE PRESENT INVENTION

One aspect of the present invention regards a method in a computing environment accommodating at least one client system connectable to one or more server systems. The method processes files in a first computer system into structured file archives to be transferred to and to be re-processed within a second computer system. The method includes determining a first reference to a first external file, wrapping the first external file, and replacing the first reference with a second reference thereby indicating a new location for the wrapped first external file.

The second aspect of the present invention regards a file wrapper system operating in a computing environment. The environment includes a server connected to a data network via a communication device, a storage device holding files, and the file wrapper system. The file wrapper system includes a file reference transformer for replacing a first reference identified within a file with a second reference thereby indicating a new location for an external file, a file processor for selecting and loading the external file, and a file wrapper for wrapping the external file.

The third aspect of the present invention regards a computing environment accommodating at least one client system connectable to one or more server systems including a method of packaging files in a first computer system into file containers to be transmitted to and to be re-processed within a second computer system. The method includes resolving a first reference to a first external file, inserting the first external file into the file container, and replacing the first reference with the second reference thereby indicating placement of the packaged first external file in the file container.

The fourth aspect of the present invention regards a computing environment accommodating at least one client system connectable to one or more server systems, and a method of processing files in a first computer system into structured file archives to be pre-downloaded to a second computer system. The method includes identifying a set of related files in the first computer system, analyzing the hierarchical usage pattern of at least two related files in the identified set of related files in order to establish a file usage hierarchy based on the at least two files in the first computer system, examining at least one file within the file usage hierarchy based on the at least two files analyzed in the first computer system to determine at least one reference to a file archive, structuring at least one file archive reference into at least one file archive loader command, and inserting the at least one file archive loader command including the at least one file archive reference into the at least one analyzed file in the first computer.

The fifth aspect of the present invention regards a computing environment which contains a server connected to a data network via a communication device, a storage device holding a set of files, the files including at least two distinct files, and a file pre-downloading system. The pre-downloading system includes a file archive loader line builder for formatting a file archive loader command line.

The sixth aspect of the present invention regards a computing environment accommodating at least one client system connectable to one or more server systems, and a method of organizing a set of file archive references in a first computer system into at least one structured file archive loader line, embedded into a high-level file to be transferred to a second computer concurrently with the high-level file and to be processed in the second computer in association with a low-level referencing file. The method includes analyzing a set of files for establishing a hierarchical relationship model among at least two files, establishing a set of files located on a lower level of the hierarchical relationship model referenced by a set of files located on a higher level of the hierarchical model, examining the file located on the lower level of the hierarchical model referenced by the file located on a higher level of the hierarchical model for the presence of a file archive reference, structuring a file archive loader line including the set of the file archive identifications, and inserting the structured file archive loader line into the file located on the higher level of the hierarchical relationship model referencing the file located on the lower level of the hierarchical relationship model thereby accomplishing the transmission of the file archives concurrently with the file located on the higher level of the relationship model to a second computer and the processing of the file archives subsequent to the transmission of the referencing file located on the lower level of the hierarchical relationship model.

The seventh aspect of the present invention regards a computing and communications environment accommodating at least one client system connectable to one or more server systems and a method for transferring a set of at least one file archive from a first computer system to a second computer system. The method includes establishing a file archive extractor/file archive downloader module on the second computer system, activating the file archive extractor/file archive downloader module on the second computer in association with the activation of a network browser, inserting into at least one file a list of at least one file archive reference referred to by the at least one file on the first computer, transferring a first file archive by the network browser module from the first computer system to a temporary storage area on the second computer system, extracting and preprocessing the first file archive from the temporary storage area on the second computer by the file archive extractor/file archive downloader module, parsing the file archive list line of the first file archive on the second computer by the file archive extractor/file archive downloader, downloading a set of at least one file archive referenced by the parsed file archive line on the second computer by the file archive extractor/file archive downloader, and extracting and preprocessing the at least one file archive referenced by the parsed file archive line on the second computer from the temporary storage area by the file archive extractor/file archive downloader.

The eighth aspect of the present invention regards a computing environment that contains at least one client connected to a data communication network. The system contains a file archive downloader module to transfer a set of file archives from a first computer system to a temporary storage area on a second computer system, and a file archive extractor module to extract a set of file archives from a first computer system from the temporary storage area on a second computer system and preprocess the extracted file archives to prepare the files archived within the file archives for processing by a network browser.

Each of the above aspects of the present invention provides for the reduction of the file transfer times.

Each of the above aspects of the present invention provides for reducing the number of file requests necessary for a client system to perform in order to retrieve files from a server system.

Each of the above aspects of the present invention provides for reducing the number of discrete software and hardware operations instituted between a server system and a client system.

Each of the above aspects of the present invention provides for improving the response time associated with the operation of the client system.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be understood and appreciated more fully from the following detailed description taken in conjunction with the drawings in which:

Fig. 1 is a schematic illustration of an exemplary computing and communication system effective in providing a suitable environment for the operation of the preferred embodiments of the present invention; and

Fig. 2 is a high-level flowchart of the main logic flow of the wrapper application, in accordance with a first preferred embodiment of the present invention; and

Fig. 3 is a high-level flowchart, which illustrates the operation of the HTML primer module of Fig. 1, in accordance with the preferred embodiments of the present invention; and

Fig. 4 is a simplified flow diagram of the interaction with the user, in accordance with the preferred embodiments of the present invention; and

Fig. 5 is a simplified flow diagram illustrating the operation of the HTML transformer of Fig. 1, in accordance with the first preferred embodiment of the present invention; and

Fig. 6 is a simplified flow diagram illustrating the operation of external file packager of Fig. 1, in accordance with the preferred embodiments of the present invention; and

Fig. 7 is a simplified flow diagram of the wrapping operation, in accordance with the first preferred embodiment of the present invention; and

Fig. 8 illustrates a conceptual network of logical links between an exemplary set of inter-related HTML pages after being processed by the HTML transformer module of Fig. 1, and an exemplary set of associated archives created by the operation of the external file packager module of Fig. 1, in accordance with the preferred embodiments of the present invention; and

Fig. 9 is a high-level flowchart illustrating the building of the file archive loader line by the file archive loader line builder of Fig. 1, in accordance with the first preferred embodiment of the present invention; and

Fig. 10 is a high-level flowchart illustrating the structuring of the file archive loader line and the insertion thereof to an HTML page by the file archive loader line builder of Fig. 1, in accordance with the first preferred embodiment of the present invention; and

Fig. 11 is a schematic block diagram, which shows the input elements and the various operational components constituting the wrapper module, in accordance with the preferred embodiments of the present invention; and

Fig. 12 illustrates an exemplary HTML page prior to and consequent to the processing by the HTML transformer module of Fig. 1, in accordance with the first preferred embodiment of the present invention; and

Fig. 13 illustrates an exemplary HTML page prior to and consequent to the processing by the HTML transformer module of Fig. 1, in accordance with a variation on the first embodiment of the present invention; and

Fig. 14 shows the structure of an exemplary HTML page prior to and consequent to processing by the file archive loader line builder module of Fig. 1, in accordance with the first preferred embodiment of the present invention; and

Fig. 15 shows the structure of an additional exemplary HTML page prior to and consequent to processing by the file archive loader line builder module of Fig. 1, in accordance with the first preferred embodiment of the present invention; and

Fig. 16 is a high-level flowchart of the method of operation of the wrapper application, in accordance with the second preferred embodiment of the present invention; and

Fig. 17 is a simplified flow diagram illustrating the operation of package creation, in accordance with the second preferred embodiment of the present invention; and

Fig. 18 is a high-level flowchart illustrating the structuring of the file archive loader line and the insertion thereof to an HTML page, in accordance with the second preferred embodiment of the present invention; and

Fig. 19 shows the structure of an exemplary HTML page prior to and consequent to processing by the method, in accordance with the second embodiment of the present invention; and

Fig. 20 is a schematic block diagram illustrating an exemplary environment functional in enabling the operation of the wrapper application, in accordance with the third preferred embodiment of the present invention; and

Fig. 21 is a high-level flowchart of the main logic flow of the wrapper application, in accordance with a third preferred embodiment of the present invention; and

Fig. 22 is a high-level flowchart illustrating the method of the installation of the extractor/downloader module on the client system, in accordance with a third preferred embodiment of the present invention; and

Fig. 23 is a high-level flowchart illustrating the performance of the wrapper system and method, in accordance with a third preferred embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

A method for the acceleration of content download in a data network is disclosed. In the preferred embodiments of the present invention the data network is the Internet and more specifically the World Wide Web (Web) and the downloadable files are HTML (HyperText Markup Language) pages also referred to as HTML documents or Web pages. It will be apparent to one with ordinary skill in the art that the following description is provided in order to facilitate a thorough understanding of the present invention. The following text should not be construed as a limitation to other possible embodiments associated with alternative configurations, deployments, and uses that could be contemplated without departing from the spirit of the invention or the scope of the appended claims. In other different embodiments the data network could be a Local Area Network, a wireless network, a cellular network or any other communication network involving the delivery of diverse content among operatively communicating network nodes linking geographically or topographically separate locations.

Downloading is the transmission of a file from one computer system to another. From the user's point-of-view, to download a file is to request the file from another computer and to receive it. Such files also include HTML pages. HTML pages typically consist of text and specific markup symbols or tags or codes inserted therein which tell a browser client application how to display an HTML page's embedded code for the user. An HTML page can include a reference or a link to rich media files. An HTML page can also include references to other HTML pages. During the downloading process, when the client's network browser recognizes such a link, the file pointed at by the link will be selectively downloaded and displayed in association with the original page.

The rich media files pointed at by the specific links in HTML pages typically contain animation, streaming audio and video, images and the like. Such files load and process more slowly and thus significantly increase the latency (or response time) of the entire transaction. In order to alleviate the difficulty some network sites attempt to improve their performance by "dumbing down" or avoiding the use of such files. Obviously this is not an acceptable solution in such a highly competitive environment as the Internet in which a fierce contest is raging for the attention and the attraction of each and every potential customer.

The present invention offers a system and method to improve the performance of a server system by a) reducing the size of the downloadable content using suitable structuring methods b) reducing the number of operations between the server and the clients by the reduction of the number of file requests the clients are compelled to make during the download process c) reducing the response times experienced by the users of the clients by the utilization of a look-ahead mechanism operative in the pre-downloading of selective files, and d) reducing the load on the client agent operative in the handling of the downloaded files by the utilization of a preprocessing utility on the client side. The system accomplishes its objectives in the following manner: The content files of the server system are transformed by the optional replacement of the entire set of the original HTML pages residing on the server system and by the optional restructuring of the entire set of external files belonging to the respective HTML pages. The replacement of the HTML pages and the restructuring of the external files associated with the respective HTML pages is performed offline.

Within the framework of a pre-planned offline run initiated by the operators of the server system the entire set of HTML pages present on a server system is examined one by one for the presence of tags or links referencing external files. The entire set of external files referenced by a single HTML page is collected, compressed and packed into a separate, page-related archive or "package". Then the HTML pages are processed in order to replace all the existing tags or links referencing external files with new references to the archive or the "package" which holds the external files belonging to the specific page. Additionally a look-ahead mechanism is activated in order to conduct a search across the lines of the HTML pages for tags or links, which internally reference other HTML pages. For each analyzed page a list of referenced HTML pages and their associated archives or "packages" is built. Then the list of the archives associated with the referenced pages is suitably structured into a specially formatted, tagged line, referred to as the "file archive loader line". The "file archive loader line" is then inserted into the body of the analyzed page.

A specifically developed software module designated as the "wrapper module" having external file-handling functionalities is inserted into each archive or "package". When a user utilizing a client system accesses the server typically through the services of a network browser and selects a page to download, the page will be transmitted thereto simultaneously with the "package" that contains the entire set of compressed external files referenced by the specific page and all the "packages" stored on the "file archive loader line." The transformed HTML page, the "packages" from the "file archive loader line", and the "package" of the external files will be stored into a memory device of the client. When the client processes a page, the files referred to therein will be typically located on the memory device of the client system. Consequently, the "package" will be accessed and the external files contained therein will be extracted with the associated "wrapper module" executable file in virtue of the known built-in features of the typical HTML browsers. The wrapper module is responsible for the requests of the packages by the definition of the Archive parameter in the HTML page, which is also a known feature of the browser applications. The wrapper module transmitted with the files provides all the functionality previously provided by the traditional references in the original HTML pages and therefore the external files will be suitably screened, displayed, played, and executed in the standard manner according to the type thereof. For each additional page downloaded, the respective archive or "package", and the "packages" associated with the pages referred to by the currently downloaded page in the "file archive loader line", will be transmitted to the client. As a result of the compression the files are transferred substantially faster over the data communications network lines and thus the downloading times of the requested pages and associated files will be substantially reduced. The external files referenced by a downloaded page are transmitted efficiently as a single package, simultaneously with the page and therefore are already present on the client system when referenced by the network browser. Therefore, the number of file requests made by the client to the server will be substantially reduced. Following the lightening of the traffic load on the server the performance of the server will considerably improve.

In addition to references to external files, such as rich media structures, HTML pages typically also include references to other HTML pages, which, in turn, include references to yet further HTML pages. The referred pages could be held locally on the server machine or could be implemented on remote server platforms. The references occur within the body of the referring page as lines formed to be utilized as hyperlinks. In order to achieve substantially improved response times, the present invention proposes the implementation of a method and system, which utilizes a specifically developed program product with look-ahead capabilities. A functional module within the program product operative in applying the proposed method analyzes a plurality of inter-related HTML pages stored on one or more server platforms. The function establishes for each referring page a set of "next level pages" referenced thereby. The "next level pages" of a referring page are defined as the set of pages referenced by the referring page. Typically most HTML pages have a set of "next level pages".
Thus, an exemplary HTML page denoted as page "A" could reference other HTML pages denoted as page "B", page "C", page "D", and page "E". The look-ahead function parses the lines of page "A" in order to recognize next level page references. A list of previously created archive files referenced by the list of pages referenced by page "A" is then inserted into page "A". This list includes file archive indications designating file archives containing external files referenced by page "B", page "C", page "D", and page "E". The list of the file archive indications is utilized as a parameter list on a specific line referred to as the file archive loader line. The loader line includes appropriate tags to effect the downloading of the files included in the list to a client platform. To activate the loader line, a software product referred to as the "trigger" module is developed and packaged into a separate archive, the identification of which is placed on the file archive identification list. Subsequent to the downloading of page "A" to the client the file archive loader line will effect the substantially simultaneous downloading of the file archives included in the list on the loader line. As a result, the external files referenced by the next level pages will be stored into the cache of the client platform prior to the downloading of the referencing pages themselves. If page "B" contains references to external files packaged within the file archives Jara.jar and Jarb.jar then Jara.jar and Jarb.jar will be downloaded concurrently with page "A", prior to their referencing page "B".
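The look-ahead step described above can be pictured with a minimal Python sketch. It is an illustration only, not the implementation disclosed here: the function names, the `Trigger.class` and `trigger.jar` identifiers, the use of an applet tag with an `archive` attribute as the loader line, and the placement of the line before the closing body tag are all assumptions made for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href targets of <a> tags that point at other HTML pages."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.lower().endswith((".html", ".htm")):
                self.links.append(href)


def next_level_pages(page_html):
    """Return the pages referenced by hyperlinks in page_html, in order."""
    collector = LinkCollector()
    collector.feed(page_html)
    return collector.links


def build_loader_line(page_html, archives_by_page, trigger_archive="trigger.jar"):
    """Build a hypothetical loader line listing the archives of next level pages.

    archives_by_page maps a page name to the archives previously created for it.
    The trigger archive is listed first; duplicates are listed only once.
    """
    archives = [trigger_archive]
    for page in next_level_pages(page_html):
        for archive in archives_by_page.get(page, []):
            if archive not in archives:
                archives.append(archive)
    # The tag format below is an assumption chosen for illustration.
    return '<applet code="Trigger.class" archive="{}" width="0" height="0"></applet>'.format(
        ",".join(archives)
    )


def insert_loader_line(page_html, loader_line):
    """Insert the loader line just before </body>, or append it if absent."""
    idx = page_html.lower().rfind("</body>")
    if idx == -1:
        return page_html + "\n" + loader_line
    return page_html[:idx] + loader_line + "\n" + page_html[idx:]
```

Under these assumptions, a page "A" linking to page "B" (archived in Jara.jar and Jarb.jar) would receive a loader line naming the trigger archive followed by Jara.jar and Jarb.jar, so that both archives are requested while page "A" is being viewed.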

In order to effect the extraction of the files archived within the downloaded file archives within the network browser's cache memory device prior to being referenced by a specific page and therefore prior to being extracted by the network browser, a specific preprocessing utility referred to as the "extractor" routine is utilized. The extractor routine is downloaded to the client system simultaneously with the first transmitted HTML page (preferably with the server system's home page). The extractor routine is designed to operate on the client system as an independent pre-processor of the downloaded file archives. The extractor module extracts the files from the file archives prior to being processed by the network browser. Thus, the files are prepared for ready processing by the network browser. Consequently when the browser retrieves the downloaded files from the cache memory device, the requested files therein will be already extracted from the archives as a result of the operation of the extractor module. It is important to stress that the entire set of files downloaded within the file archives and stored into the network browser's cache memory device are extracted from the respective archives by the extractor module and not by the network browser.

Typically, HTML files include text, images, and other media. The pages are requested by and downloaded to a client in order to enable users operating the client system to interact therewith. The interaction includes periodically repeating sequences of passive and active phases. The passive phase of the interaction typically involves visual scanning of the displayed page. This phase extends across a period of time having a greater than zero length. In the method and system proposed by the present invention, practically the entire set of file archives included in the file archive loader line of a specific page are downloaded to the client during the above-mentioned period of time. After the completion of the passive phase, the users enter the active phase, which typically involves the optional processing of the currently downloaded page. One such common processing option involves the selection of a predefined hypertext link included in the page. The action results in the introduction of a request submitted to the server for the downloading of an additional HTML page. A hypertext link by definition references a next level page. When a suitably formatted request for a referenced next level page is sent to the server, the referenced page is suitably downloaded and processed by the client's network browser. As a result of the proposed method, at this point in time, the external files referenced by the next level page are already present on the client platform stored in the browser's cache memory device. As the appropriate external files are first sought by the network browser in the local browser cache memory area, no further download requests regarding the referenced external files are submitted to the server. In addition, prior to being retrieved by the browser, the files in the cache memory area at this point in time are already extracted from the file archives as a result of the independent operation of the extractor module. Thus, the response times experienced by users will advantageously improve.

Reference is now made to Fig. 1 that illustrates an exemplary computing and communication environment in which the proposed method is operative. Server system 20 is a content provider or information distribution system implemented on a computer platform, which is operatively connected to a data network 18, such as the Internet, via a communication device 22. Server system 20 contains a storage device 24 that holds HTML pages 28, external files 30, wrapped HTML pages 32, packaged external files 34, Pages/References index file 29, and HTML Wrapper Wizard 26. External files 30 are data structures typically being pointed at by suitable HTML tags or links suitably embedded into HTML files 28. In the preferred embodiment of the present invention HTML Wrapper Wizard is a Windows GUI (Graphical User Interface) application that guides the user of the server system through the wrapping process. In other embodiments of the present invention HTML Wrapper Wizard could be developed and operate in other diverse operating systems such as Linux and could have a different user interface, such as an X Windows GUI application. HTML Wrapper Wizard 26 is a set of software modules encoded into binary executable files and comprises HTML Primer 23, HTML Transformer 26, Index builder module 36, Wrapper module 37, External File Packager 38, Compression and Packing utility 25, Extractor module 15, Trigger module 19, and File archive loader Line builder 24. HTML Primer 23 is designed to preprocess HTML pages 28 in order to assist in the creation of Wrapped HTML pages 32 and Packaged external files 34. Index builder 36 creates Pages/References Index File 29 by parsing HTML pages 28 and extracting the HTML tags and links operative in referencing on-site or off-site external files 30. 
Compression and Packing utility 25 is a standard software application for file compression and archiving such as CABARC (Microsoft Cabinet) developed by Microsoft Corporation of Redmond, Washington, USA, JAR (Java Archive) developed by Sun Microsystems, Inc. or WinZip developed by WinZip Computing, Inc. In a different embodiment of the present invention the proposed system will feature a specifically developed compression utility to compress and decompress files in real-time. The utility could be developed by utilizing an Internet programming application such as Java Applet, ActiveX, Java Script, VB Script, Jscript or the like. Compression and Packing utility 25 is activated by External File Packager 38 to create an archive of compressed external files 34 intended to be downloaded to requesting client systems 10, 12, 14, and 16 along with the referencing HTML pages. Wrapper module 37 is a specifically developed software module, which is designed to be downloaded simultaneously with the packages of the external files 34 to client systems 10, 12, 14, and 16. Wrapper module 37 is designed to suitably handle packaged external files 34 when referenced by the HTML page. File archive loader line builder 24 is designed to analyze a set of HTML pages in order to determine the logical relationships between the pages, to build page-specific file archive loader lines and to insert the lines into the respective HTML pages in order to effect downloading of the file archives containing external files with the HTML pages. Personalization module 27 provides the user of the server site with the option of modifying basic operational characteristics of the system. 
Utilizing module 27 the user could determine the type of the external files to be handled, could set the number of levels to be scanned in order to extract the file archive references therein, could establish the minimum number of references on a page necessary for the inclusion of the page-specific archive references in the file archive loader line, and the like. Extractor module 15 is a specifically developed software utility, which is preferably downloaded to the client system simultaneously with the first downloaded HTML page, such as the server system's home page. Subsequent to the downloading to the client system, extractor module 15 is initialized and activated by the network browser. The initialization and the activation of module 15 could be performed in diverse other manners, such as being called by the operating system or the like. Module 15 is responsible for continuously accessing the network browser's cache memory device, for locating the recently downloaded file archives within the cache memory device, and for suitably extracting the archived and compressed files to make the files ready for appropriate processing by the network browser. Trigger module 19 is operative in the activation of the file archive loader line within an HTML page. The operation and functions of module 19 will be described hereunder in association with the following drawings.

Client systems 10, 12, 14, and 16 are implemented on computing platforms over the data network 18 and are intermittently coupled to server 20 via suitable communication devices, via data network 18, and via communication device 22 in order to obtain, display, process and download HTML pages 28 and External files 30. After the transformation of HTML pages 28 and External files 30 by HTML Transformer 26, and after the creation and insertion of the File archive loader lines by File archive loader Line builder 24, HTML pages 28 will be transparently replaced by wrapped HTML pages 32. Suitable pointers to Wrapper module 37 will replace the HTML tags and links operative in referencing external files in HTML pages 28.

Reference is now made to Fig. 2, which shows the various stages involved in the method of the wrapping of the HTML pages of server system 20 of Fig. 1. At step 40 the Wrapper module 37 of Fig. 1 is created. Wrapper module 37 is a software program specifically developed offline. Diverse software products typically utilized as Internet applications such as Java, ActiveX, JScript, VBScript, Java Script among others could be used as tools in the development of the Wrapper module 37. Wrapper 37 is implemented as a small application that runs in browser client applications installed on client systems such as Microsoft Explorer or Netscape Navigator. Wrapper 37 is inserted in the HTML pages 32 of Fig. 1. Wrapper 37 is responsible for the loading of the external files from a set of file archives or packaged external files 34 of Fig. 1 containing external files 30 of Fig. 1. Wrapper 37 provides the same functionality as the original external file references in HTML pages 28 of Fig. 1.

At step 33 trigger module 19 of Fig. 1 is created. Trigger module 19 is a program product specifically developed offline. Diverse software products operative to develop Internet applications can be used as developer tools in the creation of module 19. Examples of such tools are Java, ActiveX, JScript, VBScript, Java Script, and the like. Module 19 is a small application that runs in network browsers installed on client systems such as Microsoft Explorer or Netscape Navigator. In the preferred embodiment of the present invention module 19 is embedded in the packaged external files 34 of Fig. 1. Module 19 is responsible for the pre-downloading of the external files referenced by a set of HTML pages and held in a set of file archives or packaged external files 34 of Fig. 1. Module 19 is also utilized for displaying information regarding the developers of the method and system and as a means for accessing the developers' Web site.

At step 34 the Extractor module 15 of Fig. 1 is created. Extractor module 15 is a utility specifically developed offline. Diverse software products typically utilized as Internet applications such as Java, ActiveX, Script, JScript, VBScript, Java Script among others could be used as tools in the development of the Extractor module 15. In the preferred embodiment of the present invention Extractor module 15 is an ActiveX utility. Module 15 is developed specifically as a small application that runs in network browsers installed on client systems such as Microsoft Explorer or Netscape Navigator. In the preferred embodiment of the present invention Module 15 is attached to the file archive loader line 24 of Fig. 1. Module 15 is responsible for the extraction of the files held in a set of archives or packages 34 of Fig. 1 after their being downloaded to the client system and stored within the network browser's cache memory device. Module 15 is preferably downloaded simultaneously with the first downloaded HTML page to be ready for immediate execution. Module 15 is utilized for continuously accessing the network browser's cache memory area, for identifying recently downloaded packages or file archives, and for extracting the archived files from the packages to make the files available for immediate processing by the network browser.

At step 35 the personalization of the system is performed. Personalization step 35 is accomplished via the services of the personalization module 27 of Fig. 1. Module 27 of Fig. 1 is activated and executed offline. Module 27 makes available to the user of server 20 of Fig. 1 the option of selectively modifying specific system parameters. Utilizing module 27 the user is enabled to activate or de-activate selected operative modules of the system such as the file archive loader line builder 24 of Fig. 1, and the like. Module 27 also enables the user to adjust certain performance-specific parameters, such as the number of the "next levels" from which the archive references are extracted in order to be inserted into the file archive loader line, and the like.

The actual process of the wrapping operation extends across steps 41 through 46. At step 41 the server system 20 of Fig. 1 is primed for the HTML pages transformation operation. The HTML pages 28 of Fig. 1 are searched in real-time or offline for references to the external files 30 of Fig. 1. A Pages/References Index file 29 of Fig. 1 is built containing a list of HTML pages 28 of Fig. 1 and related references of external files 30 of Fig. 1. At step 42 interaction with the user is initiated in order to facilitate the management of the files. The user is provided with the option of adding entries to the index file, deleting entries from the index file and replacing entries in the index file. The user is also given the option of verifying the index file and of initiating the wrapping of the server system, i.e., the transformation of the HTML pages 28 of Fig. 1 and the compression and the packaging of the external files 30 of Fig. 1. At step 44 the method for the transformation of server system 20 of Fig. 1 is executed. HTML files 28 of Fig. 1 will be processed by HTML transformer 26 of Fig. 1. The references within the HTML files 28 of Fig. 1 will be replaced by suitable references to the wrapper module 37 of Fig. 1. The new references will contain parameter values set to the equivalent values of the original external references. The transformed HTML pages 32 of Fig. 1 will be stored in temporary storage on storage device 24 of Fig. 1. At step 45 a distinct set of external files 30 of Fig. 1 is packaged into archives 34 of Fig. 1 for each transformed HTML page 32 of Fig. 1. The sets of external files 34 of Fig. 1 comprise compressed files created by a suitable compression and packing utility 25 of Fig. 1. Into each set of packaged external files 34 of Fig. 1 a copy of the wrapper module 37 of Fig. 1 is inserted. At step 43 the transformed HTML pages are processed by File archive loader Line Builder module 24 of Fig. 1. 
The pages are scanned and analyzed in order to recognize a set of "next level pages" or the pages that are referred to within the body of the scanned page. A list of referred-to-pages and a page-specific tagged line is created including an applet tag and parameters concerning the set of archive files associated with the set of referenced pages. The file archive loader lines are respectively inserted into the transformed pages. At step 46 the transformation of the server system is finalized by the replacement of original HTML pages 28 of Fig. 1 with the transformed or the wrapped pages 32 of Fig. 1. The packages consisting of the sets of packed external files 34 of Fig. 1 and the wrapper module 37 of Fig. 1 are stored into a suitable database on the server.

It would be appreciated by those with ordinary skill in the art that any one or a plurality of HTML pages could be processed and transformed accordingly. The decision whether to transform any HTML page or a plurality thereof is preferably made by the user. Alternatively, such selection will be achieved automatically via pre-selected criteria provided by a user such as the server system administrator or the Webmaster. In yet another embodiment of the present invention the user will be shown a schematic graphical display of the HTML pages residing on the server system and mapped therein. From the plurality of pages displayed, the user will select the pages for transforming by making use of a pointing device such as a mouse associated with the GUI. Such determination will be made by locating HTML tags such as "<IMG....>", "<HREF=...</A>" and the like. The user has the added capability to select the method for the organization of the packages from among a number of predefined methods of packaging.

In the second preferred embodiment of the present invention the above described execution path is modified by the bypassing of certain steps and the insertion of additional steps. In the second preferred embodiment of the present invention the HTML transformation step 44 is bypassed and the list of the created packages is inserted into the body of the HTML page as parameters on the file archive loader line. Thus, the HTML transformer module 27 of Fig. 1 becomes inoperative. Additional functions and function steps are activated in the file archive loader line creation step 43 of Fig. 2 and in the other steps along the main logic path. Consequently the HTML page will not be wrapped and the tagged lines within the page will remain unmodified. A detailed description of the second preferred embodiment will be set forth hereunder in association with the following drawings.

Fig. 3 illustrates the method of priming the server system for the transformation. At step 48 an HTML page is loaded and at step 50 the program control commences to scan and to examine the code lines within the loaded HTML page. A text line within the loaded page is obtained and at step 52 it is determined whether the examined code line contains a reference to external files. If the result is positive then at step 54 a suitable entry is added to the Pages/References index file. As long as the loaded HTML page contains more code lines (step 56) program control proceeds to step 50 to obtain the next code line within the loaded HTML page, and for each code line program control will loop across steps 50 through 56. When the examination of the entire set of text lines stored on a single page is completed, at step 58 it is determined whether there are more HTML pages to load and to prime. For each suitable HTML page steps 48 through 58 are executed, effecting the loading of the respective page and the search therein for references to external files. After the scanning of the entire set of the HTML pages and the respective text lines is completed, program control returns to the HTML wrapper application's main logic module at step 60.
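The priming loop of Fig. 3 can be sketched as follows. This is a minimal illustration only, written in Python rather than in any language named in the specification. It uses an HTML parser in place of the line-by-line scan of steps 50 through 56, and the set of tags treated as external file references (EXTERNAL_REF_ATTRS) as well as the names ExternalRefScanner and build_index are assumptions introduced for the example.

```python
from html.parser import HTMLParser

# Assumption for this sketch: which tag/attribute pairs count as
# references to external files. The specification does not fix this set.
EXTERNAL_REF_ATTRS = {"img": "src", "script": "src", "embed": "src"}


class ExternalRefScanner(HTMLParser):
    """Collects the external-file references found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased from HTMLParser.
        wanted = EXTERNAL_REF_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.references.append(value)


def build_index(pages):
    """Map each page name to its external references.

    `pages` is a dict of page name -> HTML text. Pages without any
    external reference get no entry.
    """
    index = {}
    for page_name, html_text in pages.items():  # steps 48 and 58: each page
        scanner = ExternalRefScanner()
        scanner.feed(html_text)                  # steps 50-56: examine the page
        scanner.close()
        if scanner.references:                   # step 54: add index entries
            index[page_name] = scanner.references
    return index
```

For a page containing an image, a hypertext link and an external script, only the image and the script would be indexed under these assumptions; the hypertext link is a page reference and is handled later by the loader line builder.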

The output of the method is the Pages/References Index File 29 of Fig. 1. File 29 consists of entries constituting identifications of HTML pages having external references and of sub-entries of the references proper. One possible structure of the Pages/References Index file 29 of Fig. 1 will be described next. The index file comprises page records and external file records. Page records comprise a) an HTML page name and b) the number of external files referenced by the HTML page. External file records are linked to the respective HTML page records and comprise a) external file name, b) network address parameter value, c) file reference parameter value, and d) additional needed parameter values.
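One way to picture the record layout described above is the following sketch. The class and field names are hypothetical, the types are assumptions (the specification names the fields but not their representation), and the file count is derived from the linked records rather than stored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ExternalFileRecord:
    file_name: str                  # a) external file name
    network_address: Optional[str]  # b) None when the file is on-site
    file_reference: str             # c) file reference parameter value
    extra_params: Dict[str, str] = field(default_factory=dict)  # d) other values


@dataclass
class PageRecord:
    page_name: str                  # a) HTML page name
    external_files: List[ExternalFileRecord] = field(default_factory=list)

    @property
    def external_file_count(self) -> int:
        # b) number of external files referenced by the page
        return len(self.external_files)
```

Deriving the count from the linked records avoids the two values drifting apart when the user later adds or deletes entries.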

It would be readily understood by one with ordinary skill in the art that the Pages/References index file could have other configurations. Diverse known file access methods and known file structures could be used and additional data fields could be held in the file. In a further embodiment of the present invention the objectives of the system could be accomplished without the use of an index file.

In the preferred embodiments of the present invention interaction is provided with the user of the server system. The user can closely control the operation of the server system transformation by interrogating and modifying the Pages/References Index file, thereby enabling or disabling the packaging of specific external files. The user could also add and replace references, effecting thereby independent modification of the existing HTML pages. If for various reasons a user disables the packaging of an external file then the entry in the Pages/References Index file regarding the same file will be deleted. Subsequently the external file will not be packaged and the original reference thereof will remain in the processed HTML file. The handling of such a file will be in the standard manner, i.e., when referenced the file will be downloaded independently of the page and the associated archive of packaged files.

Reference is now made to Fig. 4, which illustrates the interaction with the user. The interaction is accomplished through the utilization of suitable interfacing devices such as a keyboard, a pointing and selection device such as a mouse, a display device, and a user interface such as a GUI (Graphical User Interface). At step 62 the user input commands are read. The user is enabled to change the previously created Pages/References index file by adding, deleting or replacing entries therein. In the preferred embodiment of the present invention the user is obligated to verify the index before the activation of the transformation process. In further embodiments the interaction with the user could be in a different manner. At step 64 the user requests to add an entry to the Pages/References Index file. Subsequently at step 66 the input specifying a new entry is obtained, at step 70 the new entry is added to the index file and control passes to step 62 to read additional user commands. At step 72 the user specifies a delete operation. Therefore at step 74 the reference to be deleted is obtained from the user, at step 76 the entry is deleted from the index file, and then control passes to step 62 to read additional user commands. At step 78 the user initiates a replace operation. At step 80 the entry to be replaced is obtained, at step 82 the new entry is read, and at step 84 the entry in the index file is replaced. Control then passes to step 62 to read additional user commands. The user should verify the Pages/References Index file at step 86. At step 88 it is checked whether the index file was verified. If the result of the check is negative control passes to step 62. When it is determined that the index was verified, at step 90 the user is interrogated concerning authorization for transforming the server system into a wrapped state. The value of a suitable flag indicative of the user choice is set at steps 92 or 94 respectively and control returns to the HTML wrapper. 
At step 63 the user selects the packaging mode. Packaging modes represent different types of organizations for the packages such as a) compressing all external files 30 of one page 28 into one package b) compressing each external file 30 into a different package and c) compressing groups of external files 30 into different packages. The first method will provide minimum downloadable content size and minimum file requests, the second method will reduce the content size but not the number of file requests and the third method will allow for a balance between content size reduction and the number of file requests. After the mode selection is made method control returns to step 62 to read user commands.
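The three packaging modes can be illustrated with a small planning function. This is a sketch under stated assumptions: the mode names, the package naming scheme and the fixed group size used for mode c) are all hypothetical, since the specification does not say how groups are formed or how packages are named.

```python
def plan_packages(page_name, external_files, mode, group_size=2):
    """Return {package_name: [file, ...]} for one page under a packaging mode.

    mode "per_page": a) all external files of the page in one package
    mode "per_file": b) each external file in its own package
    mode "grouped":  c) groups of external files in different packages
    """
    files = list(external_files)
    if not files:
        return {}
    if mode == "per_page":
        return {f"{page_name}.jar": files}
    if mode == "per_file":
        return {f"{page_name}_{i}.jar": [name] for i, name in enumerate(files)}
    if mode == "grouped":
        groups = [files[i:i + group_size] for i in range(0, len(files), group_size)]
        return {f"{page_name}_g{i}.jar": group for i, group in enumerate(groups)}
    raise ValueError(f"unknown packaging mode: {mode}")
```

The number of keys in the returned dictionary equals the number of file requests the client would issue for that page, which makes the trade-off between the modes directly visible: one request for mode a), one per file for mode b), and something in between for mode c).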

It would be easily perceived by one with ordinary skill in the art that the foregoing description regarding the interaction of the user with the method and system proposed by the present invention was given merely as an example. Other advanced options could be provided to a user such as the capability of controlling the creation of the index file utilizing functional parameters, scripts and the like. In further embodiments of the present invention user interaction could be extensively enhanced or limited.

The first preferred embodiment of the invention uses a method to transform HTML pages from the traditional format to a wrapped format in which all the references indicated by appropriate tags or links to external files are replaced by references to the Wrapper module 37 of Fig. 1. The method is illustrated in the flow diagram of Fig. 5.

The method utilizes the entries of the Pages/References Index file 29 of Fig. 1 as pointers to the HTML pages to be transformed. The method at step 98 begins with the fetching of a page-related entry in the index file 29 of Fig. 1. At step 99 the page pointed at by the index file entry is loaded. At step 100 the entries relating to the external file references of the loaded page are obtained and at step 102 the appropriate text lines are obtained. At step 104 the reference to the external file is replaced by a reference to the wrapper module 37 of Fig. 1. At step 106 the parameter values of height and width are obtained from the original reference and are inserted into the new reference line. At step 108 the name of the external file is obtained from the original reference and inserted into the new reference. At step 110 the network address such as the URL (Universal Resource Locator) value is obtained from the original reference and inserted as a parameter value into the new reference. When the referenced external file is located on-site the value of the network address is null. At step 112 the event to take place in case of any mouse capture is specified by inserting the ALT value of the original reference into the equivalent parameter value of the new reference. The value of ALT is a text string, which is included for browsers unable to load or display images. The text is used as a description of the image and displayed when a mouse event occurs. The value of ALT can be null if no such event is specified. At step 114 a check is made regarding the existence of more references relating to the currently handled HTML page in the index file. When more references exist program control will pass to step 100 and will execute a loop across steps 100 through 114 for each additional reference. When the handling of the entire set of the references in the loaded page is completed an exit from the loop is effected and at step 116 the index file is checked for the existence of more HTML page entries. For each additional HTML page program control will loop across steps 99 through 116. When the entire set of HTML pages indicated by the suitable entries in the index file is transformed, program control returns to the HTML wrapper at step 118. The routine could include a script compression feature that will compress scripts residing inside the HTML pages. The routine will operate in the following manner. Tags specified as <script> are sought in the HTML page. When a <script> tag is found a check is made in regard to the source of the external script file. If the file referenced is already in an archive file then the archive name in the reference is replaced with the HTML page-specific global archive name and the external script is inserted into the global archive file. 
If the source script reference is not an archive then an external script file is created, and the script code is moved from the HTML page to the new external file. Consequently the new file is attached to the global HTML archive file, and the reference to the source script file is changed to point to the source in the global HTML archive file.
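Steps 104 through 112 of Fig. 5 can be sketched for the case of a single image reference. The specification does not give the syntax of the new reference, so the APPLET/PARAM layout below, the parameter names "file", "address" and "alt", and the default names Wrapper.class and PageA.jar are all assumptions made for illustration.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit


class _ImgAttributes(HTMLParser):
    """Captures the attributes of the first <img> tag in a fragment."""

    def __init__(self):
        super().__init__()
        self.attributes = None

    def handle_starttag(self, tag, attrs):
        if tag == "img" and self.attributes is None:
            self.attributes = dict(attrs)


def wrap_image_reference(img_tag, wrapper_code="Wrapper.class", archive="PageA.jar"):
    """Replace one image reference by a reference to a wrapper applet."""
    parser = _ImgAttributes()
    parser.feed(img_tag)
    parser.close()
    if parser.attributes is None:
        return img_tag  # no image reference found: leave the text unchanged
    attrs = parser.attributes
    source = attrs.get("src") or ""
    parts = urlsplit(source)
    if parts.netloc:
        # Off-site file: keep the host part as the network address.
        prefix = f"{parts.scheme}:" if parts.scheme else ""
        address = f"{prefix}//{parts.netloc}"
        file_name = parts.path.lstrip("/")
    else:
        # On-site file: the network address stays empty (null).
        address = ""
        file_name = source

    def q(value):
        return escape(value or "", quote=True)

    lines = [
        # step 104: reference the wrapper; step 106: copy width and height
        f'<APPLET CODE="{q(wrapper_code)}" ARCHIVE="{q(archive)}" '
        f'WIDTH="{q(attrs.get("width"))}" HEIGHT="{q(attrs.get("height"))}">',
        f'<PARAM NAME="file" VALUE="{q(file_name)}">',        # step 108
        f'<PARAM NAME="address" VALUE="{q(address)}">',       # step 110
        f'<PARAM NAME="alt" VALUE="{q(attrs.get("alt"))}">',  # step 112
        "</APPLET>",
    ]
    return "\n".join(lines)
```

Under these assumptions, an on-site reference such as `<IMG SRC="logo.gif" WIDTH=100 HEIGHT=50 ALT="Company logo">` yields an applet reference whose "address" parameter is empty, matching the null network address described for on-site files.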

The external files pointed at by the original references in the respective HTML pages are compressed and packed into separate HTML page-related packages. The method is illustrated in the flow diagram of Fig. 6. The method begins at step 120 by reading a page-related entry from the index file. At step 121 an HTML page 28 of Fig. 1 pointed at by a page entry from the index file is loaded. A reference entry read from the same index file (step 122) will point to an appropriate reference in the loaded HTML page. At step 124 the reference line will be obtained and at step 126 the name of the external file, and optionally the network address, are extracted from the wrapped reference. At step 128 the external file is loaded and at step 130 the external file packager 38 of Fig. 1 is activated to compress and pack the loaded file into the archive of the packaged external files 34 of Fig. 1. At step 132 the next reference entry in the Pages/References index file is sought and if a suitable entry is found then control will loop across steps 122 through 132 until the handling of the entire set of references is completed. When no more references are found in the index file regarding the loaded page the index file is accessed to obtain the next page entry at step 134. If an additional page entry is found method control will loop across steps 120 through 134 until the handling of the entire set of pages is completed. Conversely if the end of the Pages/References index file is reached then at step 136 the wrapper module 37 of Fig. 1 is inserted into the package of external files. Consequently at step 140 control returns to the HTML wrapper.
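The packaging of one page's external files can be sketched with Python's standard zipfile module, relying on the fact that a JAR file is a ZIP-format archive. The function name, the in-memory representation of the files, and the wrapper file name Wrapper.class are assumptions for illustration; the sketch adds the wrapper to each package as it is built.

```python
import io
import zipfile


def package_external_files(files, wrapper_name="Wrapper.class", wrapper_bytes=b""):
    """Pack one page's external files and a wrapper copy into one archive.

    `files` maps file name -> file content (bytes). The return value is
    the content of a ZIP-format archive, usable as a .jar file.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():               # steps 122-132
            archive.writestr(name, data)               # step 130: compress and pack
        archive.writestr(wrapper_name, wrapper_bytes)  # step 136: add the wrapper
    return buffer.getvalue()
```

Writing the returned bytes to, for example, a file named after the page would give the page-related package that the wrapped references point to.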

In association with a single predetermined HTML page the specific Extractor module 15 of Fig. 1 is inserted into the package of the external files. The predetermined page is preferably the first HTML page to be downloaded to the client system. For example, the home page of the server system could be used as the driver of the package, which includes the extractor module. When the first page loads to the client system the network browser extracts and activates the extractor module. The extractor module begins a continuous operation of scanning the files within the browser's cache memory area. When a recently downloaded archive is located by the extractor, the archive is preprocessed by the extraction of the files archived therein in order to be made ready for processing by the browser. If a specific HTML page has a built-in reference to an existing archive file the entire set of external files associated with the specific HTML page will be combined into one "global" archive file. Therefore the existing references to archive files will be replaced with references to the "global" archive file. The organization of the packages could be different from the one described above. For example, the entire set of the external files referenced by the entire set of HTML pages residing in the server system could be compressed and packed into a single package.

Following the transformation of the HTML files 28 of Fig. 1 to wrapped HTML files 32 of Fig. 1 and the packing of the external files 30 of Fig. 1 into distinct HTML page-related packages of external files 34 of Fig. 1 a method of the wrapping of the server system is initiated. The method is illustrated in the flow diagram of Fig. 7. The method of wrapping is driven by Pages/References index file 29 of Fig. 1. At step 142 a wrapped HTML page 32 of Fig. 1 pointed at by an entry in the index file 29 of Fig. 1 is loaded and at step 144 the original HTML page 28 of Fig. 1 is replaced by the loaded wrapped HTML page 32 of Fig. 1. At step 146 it is determined whether Pages/References index file 29 of Fig. 1 contains more pages to be handled. If there are more pages then control will loop across steps 142 through 146. During the loop the entire set of HTML pages pointed at by Pages/References index file 29 of Fig. 1 will be replaced by the wrapped version of the same pages. After the entire set of pages is replaced the page-related packages are stored into a specific database and control returns to the main program at step 150.

Reference is now made to Fig. 8, which illustrates an exemplary set of interrelated HTML pages and the associated archive files thereof. It should be noted that the pages shown are transformed pages, i.e., after being processed by the HTML transformer module 26 of Fig. 1. The archive files or the "Packages" were created by the External File Packager module 38 of Fig. 1. Fig. 8 shows only simplified references relevant to the creation of the file archive loader line and disregards any other lines, which may appear within a typical HTML page.

HTML page A (200) includes references to page B (202), page C (204), page D (206), and page E (208). HTML page B (202), referred to by page A (200), has no further page references. As a result of the suitable processing performed by the HTML transformer module 26 of Fig. 1 page B (202) includes references to archive file JarA.jar (214) and archive file JarB.jar (216). HTML page C (204), referred to by page A (200), includes references to page F (210) and to page G (212). As a result of the suitable processing by the HTML transformer module 26 of Fig. 1 page C (204) also has references to archive file JarC.jar (226) and archive file JarD.jar (228). HTML page D (206), referred to by page A (200), has no page references but as a result of the prior processing by HTML transformer module 26 of Fig. 1 page D (206) includes a reference to archive file JarI.jar (230). HTML page E (208), referenced by page A (200), has no page references but as a result of the prior processing by the HTML transformer module 26 of Fig. 1 page E (208) contains a reference to archive file JarJ.jar (232). HTML page F (210), referred to by page C (204), has no further page references. Page F (210) includes references to archive files JarE.jar (218) and JarF.jar (220). Page G (212), also referred to by page C (204), has no further page references but includes references to archive files JarG.jar (222) and JarH.jar (224).

The set of the exemplary interrelated HTML pages has a set of predefined hierarchical relationships. Conceptually, the set of pages are organized in layers where each layer consists of pages and each page within a layer could refer to pages within the next layer. Therefore, by analyzing the lines included in a specific page file, a list of "next level pages" can be built. The pages included in the obtained list are then examined for the presence of archive files within the lines of the same pages, and a collection of the archive file identifications into a structured archive files list is accomplished. Consequently, the archive files included in the list are appropriately inserted into the body of the analyzed page file within a specific line driven by a suitable tag, such as an APPLET tag. The insertion is performed in order to effect the downloading of the archive files to the client, concurrently with the originally analyzed page. Thus, an advantageous look-ahead effect is accomplished, designed to enable a pre-downloading of archive files, one step (or one page) ahead of their driver page.
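The look-ahead collection described above can be sketched as a level-by-level walk over the page references. The function and parameter names are hypothetical; the `levels` argument models the adjustable number of "next levels" mentioned in connection with the personalization module, and the dictionaries in the usage note restate the page and archive relationships of Fig. 8.

```python
def next_level_archives(page, page_links, page_archives, levels=1):
    """Collect the archives referenced by pages up to `levels` links below `page`.

    `page_links` maps a page to the pages it references; `page_archives`
    maps a page to the archive files it references. Order of first
    appearance is kept and duplicates are dropped.
    """
    collected = []
    seen_pages = {page}
    frontier = [page]
    for _ in range(levels):
        next_frontier = []
        for current in frontier:
            for child in page_links.get(current, []):
                if child in seen_pages:
                    continue  # guards against pages that reference each other
                seen_pages.add(child)
                next_frontier.append(child)
                for archive in page_archives.get(child, []):
                    if archive not in collected:
                        collected.append(archive)
        frontier = next_frontier
    return collected
```

With `page_links = {"A": ["B", "C", "D", "E"], "C": ["F", "G"]}` and the archives of Fig. 8 assigned to pages B through G, a one-level look-ahead from page A gathers the archives of pages B, C, D and E only, while a two-level look-ahead also gathers those of pages F and G.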

Reference is now made to Fig. 9, which illustrates the operation of the file archive loader line builder module 24 of Fig. 1. The file archive loader line is defined as the tagged line within the body of an HTML page that holds the list of the file archives referenced by the "next level pages". The file archive loader line includes an APPLET tag followed by a parameter having the value of the applet identification. The file archive loader line also includes the ARCHIVE keyword. The value of the parameter associated with the ARCHIVE keyword is the list of the downloadable archive files. Thus, for HTML page A (200) of Fig. 8 the following archive files will be placed on the list: Jara.jar, Jarb.jar, Jarc.jar, Jard.jar, Jari.jar, and Jarj.jar. In order to effect the downloading of the trigger module, each list of the downloadable archive files is terminated with the identification of the trigger module, such as "Trigger.jar". The file archive comprising the trigger module is placed at the end of the list of the downloadable archives to effect the loading of the preceding archives on the list prior to the loading of the trigger module.
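The structure of the file archive loader line can be illustrated by the following minimal Python sketch. The function name build_loader_line, the default WIDTH and HEIGHT values, and the exact attribute layout are assumptions made for illustration; only the APPLET tag, the ARCHIVE list, and the placement of the trigger archive at the end of the list are taken from the description.

```python
def build_loader_line(archives, trigger_archive="Trigger.jar",
                      applet_class="Trigger.class", width=200, height=20):
    """Return one APPLET line whose ARCHIVE list ends with the trigger archive."""
    # Drop any stray copy of the trigger archive so that it appears only once.
    names = [name for name in archives if name != trigger_archive]
    # The trigger archive goes last, so the preceding archives load first.
    names.append(trigger_archive)
    return ('<APPLET CODE="{code}" ARCHIVE="{archives}" '
            'WIDTH={width} HEIGHT={height}></APPLET>').format(
                code=applet_class,
                archives=",".join(names),  # comma-separated archive list
                width=width,               # assumed banner width
                height=height)             # assumed banner height
```

For page A (200) of Fig. 8, the function would be called with the six archive names listed above and would append the trigger archive as the seventh entry.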

The principal function of the trigger module is to effect the downloading of the archives included in the ARCHIVE parameter list. Additionally, the trigger module is also utilized to display information regarding the developers of the method and system and as a means to enable users to connect to the Web site of the developers for purposes of support, maintenance, software upgrades, and the like. The trigger applet can display the information in diverse formats, such as text within a banner frame. The WIDTH and HEIGHT parameters within the file archive loader line are utilized to define the location and size of the banner frame. The file archive loader line of a single predetermined HTML page will include the Extractor utility 15 of Fig. 1. Typically the single predetermined HTML page will be the home page of the server system in order to effect the downloading of module 15 prior to the entire set of following pages downloaded within a session. Alternatively, extractor module 15 could be installed permanently on the client system as a plug-in of the network browser. In the preferred embodiment of the present invention, subsequent to the downloading of the specific HTML page carrying the extractor module 15, module 15 is initialized and activated by the network browser. Module 15 regularly accesses the network browser's cache memory area and repeatedly examines the files within the memory area. When a recently downloaded file archive is identified, module 15 prepares the packaged files therein for ready processing by the network browser. The extraction of the packaged files from the file archives within the network browser's cache memory area is performed independently by the extraction module 15 prior to the obtaining of the files by the network browser.
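The repeated examination of the cache memory area by the extractor module can be sketched as follows. This is an illustration under stated assumptions only: the cache is modeled as a plain directory, the archives are treated as ZIP-compatible JAR files, and the function name extract_new_archives and the `seen` set are hypothetical. An actual browser cache is not necessarily exposed in this form.

```python
import os
import zipfile


def extract_new_archives(cache_dir, seen):
    """Unpack every .jar in `cache_dir` that was not handled on an earlier pass.

    `seen` is a set of archive file names already processed; it is updated
    in place so the function can be called repeatedly.
    """
    extracted = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if not name.lower().endswith(".jar") or name in seen:
            continue
        if not zipfile.is_zipfile(path):
            continue  # possibly still downloading; look again on the next pass
        with zipfile.ZipFile(path) as archive:
            archive.extractall(cache_dir)        # place files beside the archive
            extracted.extend(archive.namelist())
        seen.add(name)
    return extracted
```

A caller would invoke the function periodically, for example once per second, so that each newly arrived archive is unpacked before the browser requests the files it contains.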

Still referring to Fig. 9, the file archive loader line builder module begins operation at step 240. At step 242 a page entry from the pages/references index file 29 of Fig. 1 is obtained. At step 244 the transformed HTML page referred to by the index file entry is loaded, and at step 246 a data structure designed to hold the list of the pages included in the loaded HTML page is initialized. At step 248 a tagged line from the loaded page is obtained. At step 250 the obtained line is examined for the presence of a page reference. If the examined line includes a suitable page reference then at step 252 the page identification is added to the list of included pages. At step 254 an end-of-page condition of the loaded page file is tested. If more unexamined lines exist within the page file, program control returns to step 248 in order to get the next line from the loaded page. As long as the page holds unexamined lines, program control performs an inner loop across steps 248 through 254. Subsequent to the handling of all the lines within the loaded page, program control calls the file archive loader line insertion module at step 256. The line insertion module will be described in detail hereunder in association with Fig. 10. Subsequent to the return of the program control from the line insertion module, at step 258 the Pages/References index file 29 of Fig. 1 is checked for an end-of-file condition. If more unexamined pages exist in the index file 29 then program control returns to step 242 to obtain the next page entry from the index file 29. As long as page entries referring to unprocessed transformed pages exist in the index file 29, program control performs a loop across steps 242 through 258. When the end-of-file condition is raised through the reading of the index file, the module terminates at step 260.
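The control flow of Fig. 9 can be sketched in Python as shown below. The sketch assumes that a page reference is an anchor tag whose HREF ends in ".html" or ".htm", which the description does not state; the class and function names, and the `load_page` and `insert_loader_line` callables, are likewise hypothetical placeholders for the index file access and for the insertion module of Fig. 10.

```python
from html.parser import HTMLParser


class PageReferenceCollector(HTMLParser):
    """Gathers page references found in anchor tags (steps 248 through 254)."""

    def __init__(self):
        super().__init__()
        self.pages = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":                       # tag names arrive lower-cased
            return
        href = dict(attrs).get("href")
        if (href and href.lower().endswith((".html", ".htm"))
                and href not in self.pages):
            self.pages.append(href)          # step 252


def included_pages(html_text):
    """Return the list of pages referenced by one HTML page."""
    collector = PageReferenceCollector()     # step 246
    collector.feed(html_text)
    return collector.pages


def build_all(index_entries, load_page, insert_loader_line):
    """Outer loop of Fig. 9 over the entries of the index file."""
    for page_id in index_entries:            # steps 242 and 258
        html_text = load_page(page_id)       # step 244
        pages = included_pages(html_text)    # steps 246 through 254
        insert_loader_line(page_id, pages)   # step 256
```

Using a parser rather than a line-by-line scan is a design choice of this sketch; it tolerates tags that span several lines, whereas the description speaks of tagged lines.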

Reference is now made to Fig. 10, which illustrates the operation of the module for the structuring of the file archive loader line and the insertion thereof into the transformed HTML page. At step 262 the module is activated. At step 264 the transformed page identification is saved in order to be utilized for subsequent processing. At step 266 a page reference entry is read from the included pages list. At step 268 the page pointed at by the page reference entry is loaded and at step 270 a data structure designed to store the file archive loader line is initialized. At step 271 the value of a counter variable designed to hold the total number of references found on a discrete page is initialized to zero. The counter variable is operative in preventing the insertion of certain page-specific archives into the file archive loader line. The counter variable is compared to a predefined limiting value. The limiting value is set by the user of the server system, according to the performance characteristics of the system. In the preferred embodiment of the present invention the limiting value regards the number of archive references to file archives holding image files. Thus, in the preferred embodiment of the present invention, when the processed page contains a single (one) archive reference associated with a file holding an image, then the archive pointed at by the reference will not be inserted into the file archive loader line. In other embodiments of the present invention different counter values in association with different file types could be used.

At step 272 a tagged line is obtained from the loaded page. At step 274 the line is examined regarding the presence of an archive reference. If the line contains an archive reference then at step 276 the reference count is increased by the value of one and at step 278 the archive reference or the archive identification is added to the file archive loader line. Subsequently at step 280 it is determined whether more unexamined lines exist on the loaded page. If the result is positive then program control returns to step 272 to obtain the next unexamined line from the loaded page file. As long as more unexamined lines exist in the loaded page file, program control performs an inner loop across steps 272 through 280. When the end-of-file condition is raised within the loaded page file, control exits the inner loop to examine at step 282 whether more unprocessed pages exist on the included pages list. If the result of the check is positive then at step 275 the value of the reference counter is compared to the limiting value. If the value of the reference counter is greater than the limiting value then at step 277 the page-specific references are removed from the list and program control proceeds to step 266 to obtain the next included page from the included pages list. If the value of the reference counter is not greater than the limiting value, program control proceeds directly to step 266. As long as more unprocessed pages exist, program control performs an outer loop across steps 266 through 277. When all the pages within the included pages list have been processed, control exits the outer loop. At step 284 the saved page is loaded and at step 286 the file archive loader line is structured by the addition of the appropriate text segments. The text segments include the APPLET CODE tag, the applet value, and other parameters. The line also includes the ARCHIVE tag associated with the list of the archive identifications, and the name of the archive file comprising the trigger module. The trigger module's archive identification is placed at the end of the archive file identification list in order to effect the loading of the trigger module subsequent to the loading of the archives preceding the trigger archive on the line. The line is concluded by the suitable </APPLET> terminator. At step 288 the file archive loader line is inserted into the loaded page and at step 290 program control returns to the calling module.
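The collection loop of Fig. 10 can be illustrated by the following Python sketch. It follows one possible reading of the flow chart, namely that the reference counter is kept per included page and that all archive references of a page are dropped when the counter exceeds the limiting value. The regular expression used to recognize an archive reference, the function name collect_archives, and the `load_lines` callable are assumptions made for this illustration.

```python
import re

# Assumed pattern: any token ending in ".jar" counts as an archive reference.
ARCHIVE_REF = re.compile(r"[\w.-]+\.jar", re.IGNORECASE)


def collect_archives(included_pages, load_lines, limit):
    """Gather archive identifications from the included pages (steps 266-282)."""
    loader_archives = []
    for page_id in included_pages:                  # step 266
        page_refs = []                              # counter starts at zero (271)
        for line in load_lines(page_id):            # steps 272 through 280
            for name in ARCHIVE_REF.findall(line):  # step 274
                page_refs.append(name)              # steps 276 and 278
        if len(page_refs) > limit:                  # step 275
            continue                                # step 277: drop this page's refs
        for name in page_refs:
            if name not in loader_archives:
                loader_archives.append(name)
    return loader_archives
```

The sketch defers adding a page's references until the limit check has passed, which gives the same result as adding them at step 278 and removing them again at step 277.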

On a specifically predetermined HTML page (typically the server system's home page) the file archive loader line will also include the identification of the archive file comprising the extractor module. The extractor module's archive is preferably placed at the beginning of the archive file identification list in order to effect the loading of the extractor module prior to the loading of the following archives. When a wrapped HTML page is downloaded to the client, the client's browser processes the page. The processing involves the displaying of the page to the users of the client system and the suitable execution of specific tag lines within the page without the users' intervention. One of the tags to be automatically executed is the APPLET tag. When the browser identifies the APPLET tag, a request is submitted to the server regarding the transmission of the archive files indicated by the ARCHIVE parameter of the APPLET tagged line. Consequently, the archive files are sent to the client and stored into the client's browser cache area. When users of the client system interact operatively with the displayed pages, the result of the interaction is typically an additional request submitted to the server for the downloading of an additional HTML page. As the entire set of pages referenced by the active pages was pre-downloaded and stored in the cache area of the client's browser, no further request is sent to the server but the requested page is loaded from the client's browser cache area.

The implementation of the extractor module on the client system could be performed in a different manner. The extractor module could be downloaded as a distinct page or as an applet associated with a single page. The extractor module could also be installed permanently on the client system as a plug-in module to the network browser.

Referring now to Fig. 11, wrapper module 37 of Fig. 1 is a specifically developed computer application. In a preferred embodiment of the present invention Wrapper module 37 of Fig. 1 is a Java applet. In another preferred embodiment of the present invention Wrapper module 37 is an ActiveX object. The components constituting Wrapper module 37 of Fig. 1 are described in association with Fig. 11. Wrapper module 37 of Fig. 1 receives the following parameters 151: a) the name of the external file, b) a reference to any network address, and c) any mouse capture event functions. Wrapper module 37 of Fig. 1 is operative to load (156) the external file according to the name specified in the Name parameter 151(1), to load (156) the external files from any other sites over the data network according to the network address parameter value specified in the Reference parameter field, and to handle (158) any functions of mouse capture events according to the original reference. Wrapper module 37 of Fig. 1 is downloaded to client system 10, 12, 14, or 16 of Fig. 1 along with the page-specific sets of external files packed into distinct packages 34 of Fig. 1. On the client systems 10, 12, 14, and 16 Wrapper module 37, by virtue of being embedded in the downloaded wrapped HTML pages 32 of Fig. 1, has the same functionality as the original file references in the HTML pages 28 of Fig. 1. Therefore, external files referenced will be handled according to the type thereof. Image files having formats such as GIF, BMP, JPEG, PCX, PNG, TGA, TIF, XBM, XIF, and the like will be displayed, video and audio files such as MPEG, AVI, MOV, RAM, WAV, AIFF, AIFC, AIF, AU, SND and the like will be played, and script files such as VBS, JS and the like will be executed.
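The type-dependent handling of external files can be illustrated by the following short Python sketch. The three sets merely restate the formats enumerated above; the function name action_for and the returned labels are hypothetical, and a real wrapper module would invoke display, playback, or script execution rather than return a string.

```python
import os

IMAGE_TYPES = {"gif", "bmp", "jpeg", "pcx", "png", "tga", "tif", "xbm", "xif"}
MEDIA_TYPES = {"mpeg", "avi", "mov", "ram", "wav", "aiff", "aifc", "aif", "au", "snd"}
SCRIPT_TYPES = {"vbs", "js"}


def action_for(filename):
    """Map an external file name to the handling described for its type."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension in IMAGE_TYPES:
        return "display"
    if extension in MEDIA_TYPES:
        return "play"
    if extension in SCRIPT_TYPES:
        return "execute"
    return "unknown"  # formats covered only by "and the like" are not modeled
```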

Fig. 12 illustrates an example of the transformation of HTML pages into wrapped HTML pages in the first preferred embodiment of the present invention.

The wrapper module 37 is a Java Applet. HTML page 162 contains two references to external files. Reference 163 regards an on-site external image file named "CAT.JPEG". Wrapped HTML page 165 contains the reference 166 transformed from the original reference 163. Reference 164 regards an off-site external image file named "DOG.GIF" located at a remote site indicated by the network address "www.ani.com". In the present example the network address is a Web URL. Wrapped HTML page 165 contains the reference 167 to wrapper module 37 based on original reference 164.

Fig. 13 illustrates the transformation of HTML pages into wrapped HTML pages according to a variation of the first preferred embodiment of the present invention. In this variation Wrapper module 37 is an ActiveX object. It should be noted that diverse software products such as JScript or VBScript could be used as tools in the development of the software module functional in the transformation of the pages. HTML page 168 contains two references to external files. Reference 172 regards an on-site external image file named "CAT.JPEG". Wrapped HTML page 170 contains the reference 176 based on original reference 172. Reference 174 regards an off-site external image file named "DOG.GIF". Wrapped HTML page 170 contains the reference 178 to wrapper module 37 based on original reference 174.

Referring now to Fig. 14 HTML page A (300) is an exemplary page prior to being processed by the file archive loader line builder module 24 of Fig. 1. HTML page A (300) contains page references 302, 304, 306, and 308. Wrapped HTML page A (310) illustrates the structure of a page consequent to the processing by file archive loader line builder module 24 of Fig. 1. Wrapped page A (310) includes tagged line 312, which was built and inserted into the body of the page by the file archive loader line builder 24 of Fig. 1. Line 312 comprises the control tag APPLET CODE, which indicates that the line refers to a Java applet, and a parameter value TRIGGER.CLASS, representing the identification of the applet to be loaded. The principal function of the trigger module is to effect the loading of the archive files. The list of the archive file identifications is stored within the ARCHIVE parameter string value. The file archive loader line includes additional required parameters such as HEIGHT and WIDTH to effect the appropriate positioning of the banner containing information regarding the developers of the system. The ARCHIVE tag is followed by the list of transmittable archive files identifications extracted from the pages referred to by wrapped page A (310). The list of archives is terminated by the archive file identification of the triggering applet. By disposing the trigger file's archive at the end of the list the archives preceding the trigger archive on the list will load prior to the loading of the trigger module. The </APPLET> tag completes the file archive loader line.

Referring back to Fig. 8, HTML page A (200) includes references to page B (202), page C (204), page D (206), and page E (208). The pages 202, 204, 206, 208 include archive files JARA.JAR (214), JARB.JAR (216), JARC.JAR (226), JARD.JAR (228), JARI.JAR (230), JARJ.JAR (232). Therefore, the ARCHIVE parameter of the file archive loader line included in the wrapped HTML file A (310) of Fig. 14 has the corresponding values of JARA.JAR, JARB.JAR, JARC.JAR, JARD.JAR, JARI.JAR, and JARJ.JAR.

Referring now to Fig. 15, HTML page C (314) is an exemplary page shown prior to being processed by the file archive loader line builder module 24 of Fig. 1. HTML page C (314) contains page references 316 and 318. Wrapped HTML page C (320) illustrates the structure of a page consequent to the processing by file archive loader line builder module 24 of Fig. 1. Wrapped page C (320) includes tagged line 322, which was built and inserted into the body of the page by the file archive loader line builder module 24 of Fig. 1. Line 322 comprises the control tag APPLET CODE, indicating that the line refers to a Java applet, and a parameter value TRIGGER.CLASS to indicate the identification of the applet to be loaded. The applet exists as a trigger designed to activate the loading of the archive files, the list of which is stored within the ARCHIVE parameter value. The file archive loader line includes additional required parameters such as HEIGHT and WIDTH to enable suitable positioning of a banner displayed by the trigger module. The ARCHIVE tag is followed by the list of transmittable archive files extracted from the pages referred to by wrapped page C (320). The list of archives is terminated by the archive of the triggering applet in order to enable the loading of the preceding archives on the line prior to the loading of the trigger archive. The </APPLET> tag completes the file archive loader line.

Referring back to Fig. 8, HTML page C (204) includes references to page F (210) and page G (212). The pages 210, 212 include archive files JARE.JAR (218), JARF.JAR (220), JARG.JAR (222), JARH.JAR (224). Therefore, the ARCHIVE parameter of the file archive loader line included in the wrapped HTML file C (320) of Fig. 15 has the corresponding values of JARE.JAR, JARF.JAR, JARG.JAR, and JARH.JAR, with TRIGGER.JAR attached.

When wrapped HTML page 310 or 320 is downloaded to the client, the client's browser processes the page. The page is displayed to users of the client system and specific tagged lines within the body of the page are executed without the intervention of the users. One of the tags to be executed automatically is the APPLET tag. When the browser identifies the APPLET tag, a request is submitted to the server regarding the transmission of the archive files indicated by the ARCHIVE parameter. Consequently the archive files are sent to the client and stored into the browser's cache memory device. Typically, the user of the client system operatively interacts with the displayed pages. The typical results of the interaction are additional requests sent to the server for the downloading of additional HTML pages. The loading of an additional page results in the predefined execution of specific tagged lines within the body of the page and thereby in the referencing of specific external files. As the archives associated with the currently executing page were pre-downloaded substantially concurrently with the previously executed page, the requested external files are already stored in the browser's cache memory device. Therefore, the proposed method and system achieve substantial improvements in the response times experienced by the users.

In the first preferred embodiment of the present invention described in the foregoing, an HTML page wrapping mechanism has been disclosed. According to the teaching of the disclosure the wrapping technique effects the replacement of the references to external files within the body of an HTML page by references to file archives, which include the referenced files. In addition, the teaching of the first preferred embodiment describes a look-ahead mechanism operative in the insertion of a file archive loader instruction into a specific page. The file archive loader line contains a set of file archive references. The referenced file archives contain a set of external files referenced by "next level" HTML pages, which in turn are referenced by the specific page. In the second preferred embodiment of the present invention, the above-described techniques are integrated in order to provide an improved method and system. The method and system of the second preferred embodiment are described next in association with the following drawings.

In the second preferred embodiment of the present invention the HTML transformation step 44 is bypassed and the list of the created packages is inserted into the body of the HTML page as additional parameters on the file archive loader line. Thus, the HTML transformer module 26 of Fig. 1 becomes inoperative. Additional steps within specific modules, such as the file archive loader line builder module, are added, and further modifications are made along the execution path. Consequently, the original references to the external files within the HTML page remain unmodified. The archives containing the referenced external files are appropriately created and the identifications thereof are attached to the list of file archive identifications on the file archive loader line. When a specific HTML page containing unmodified external file references and the file archive loader line is downloaded, the entire set of archives indicated by the file archive loader list is downloaded substantially concurrently with the page. The archives and included files will be stored in the client's browser cache memory area on the client platform. As a result of the standard operational procedures of the common network browsers, the client's browser will obtain the files referenced by the unmodified references within the downloaded page directly from the cache memory area. Thus, the cache memory area will hold two types of files: a) files referenced directly by the currently executing page on the client platform, and b) files referenced by "next level pages" referenced by the currently executing page.

Referring now to Fig. 16, the operation of the method according to the second preferred embodiment of the present invention is illustrated. On Figs. 2 and 16 like elements are represented by like numbers. When not noted otherwise, the steps involved in the operation of the method as shown on Fig. 16 are functionally identical to the steps involved in the operation of the method as shown on Fig. 2. When not noted otherwise, the logical path of the modules shown on Fig. 16 is functionally identical to the logical path of the modules shown on Fig. 2.

Still referring to Fig. 16, at step 40 the Wrapper module 37 of Fig. 1 is created and at step 33 trigger module 19 of Fig. 1 is created. At step 34 the extractor module 15 of Fig. 1 is created. At step 35 the personalization of the system is performed. The actual process of the wrapping operation extends across steps 41 through 46. At step 41 the server system 20 of Fig. 1 is primed for the operation. The HTML pages 28 of Fig. 1 are searched in real-time or offline for references to the external files 30 of Fig. 1. A Pages/References Index file 29 of Fig. 1 is built containing a list of HTML pages 28 of Fig. 1 and related references of external files 30 of Fig. 1. At step 42 interaction with the user is initiated in order to facilitate the management of the files.

In contrast to the operation described in association with Fig. 2, the HTML transformation 44 of Fig. 2 is not performed on Fig. 16. Therefore, in the second preferred embodiment of the present invention, the external file references (and indeed, all original references) within the HTML files 28 of Fig. 1 remain unmodified.

At step 45 a distinct set of external files 30 of Fig. 1 is packaged into archives 34 of Fig. 1 for each processed HTML page 28 of Fig. 1. The sets of external files 34 of Fig. 1 contain compressed files created by a suitable compression and packing utility 25 of Fig. 1. Into each set of packaged external files 34 of Fig. 1 a copy of the wrapper module 37 of Fig. 1 is inserted. At step 43 the HTML pages 28 of Fig. 1 are processed by File archive loader Line Builder module 24 of Fig. 1. As a result of the processing the HTML pages are modified by the insertion of the suitably structured file archive loader lines. At step 46 the wrapping of the server system is finalized by the replacement of original HTML pages 28 of Fig. 1 with modified pages 32 of Fig. 1. The packages consisting of the sets of packed external files 34 of Fig. 1 and the wrapper module 37 of Fig. 1 are stored into a suitable database on the server.

The operations associated with steps 40, 33, 35, 41, and 42 are functionally identical in the first preferred embodiment and the second preferred embodiment. The Package Creation step 45 according to the second preferred embodiment of the present invention is described in association with Fig. 17, which illustrates the process of package creation 45 of Fig. 16. The external file packager module 38 of Fig. 1 performs the process. On Figs. 5 and 17 like steps are designated by like numbers. When not noted otherwise, the steps involved in the operation of the method illustrated on Fig. 17 are functionally identical to the steps involved in the operation of the method illustrated on Fig. 5.

Still referring to Fig. 17, the process begins at step 120 by reading a page-related entry from the index file. At step 121 an HTML page 28 of Fig. 1 pointed at by a page entry from the index file is loaded. A reference entry read from the same index file (step 122) will point to an appropriate reference in the loaded HTML page. At step 124 the reference line will be obtained and at step 126 the name of the external file, and optionally the network address, are extracted from the reference. At step 128 the external file is loaded and at step 130 the external file packager 38 of Fig. 1 is activated to compress and pack the loaded file into the archive of the packaged external files 34 of Fig. 1.

Step 131 illustrates an additional operation necessary for enabling the correct operation of the method proposed in the second preferred embodiment of the present invention. At step 131 the file archive, which was previously created at step 130, is added to a list of archives. The list is indexed by the identification of the referencing page.

At step 132 the next reference entry in the page/reference index file is looked for and if a suitable entry is found then program control will loop across steps 122 through 132 until the processing of the entire set of references is completed. When no more unprocessed references are found in the index file regarding the loaded page, the index file is accessed to obtain the next page entry at step 134. If an additional page entry is found, program control will loop across steps 120 through 134 until the processing of the entire set of pages is completed. Conversely, if the end of the Pages/References index file is reached then at step 136 the wrapper module 37 of Fig. 1 is inserted into the package of external files. Consequently at step 140 control returns to the HTML wrapper.

Subsequent to the completion of the process described in association with Fig. 17 an additional data structure, referred to as Pages/Archives file, is created. An entry in the Pages/Archives file consists of a) Page identification b) Archive identification and c) A suitable link among a) and b).
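The package creation process of Fig. 17 and the resulting Pages/Archives entries can be sketched in Python as follows. The sketch assumes that one archive is created per page, that the archive is named after the page with a ".jar" suffix, and that a copy of the wrapper module is placed in every archive; the function name create_packages, the `load_file` callable, and the wrapper file name are hypothetical.

```python
import io
import zipfile


def create_packages(index, load_file,
                    wrapper_name="Wrapper.class", wrapper_bytes=b""):
    """Build one compressed archive per page and the Pages/Archives list.

    `index` maps a page identification to the external file names it references.
    """
    packages = {}
    pages_archives = []
    for page_id, references in index.items():             # steps 120 and 134
        archive_id = page_id + ".jar"                      # assumed naming rule
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
            packed = set()
            for file_name in references:                   # steps 122 through 132
                if file_name in packed:
                    continue                               # pack each file once
                archive.writestr(file_name, load_file(file_name))  # 128 and 130
                packed.add(file_name)
            archive.writestr(wrapper_name, wrapper_bytes)  # step 136
        packages[archive_id] = buffer.getvalue()
        pages_archives.append((page_id, archive_id))       # step 131
    return packages, pages_archives
```

Each tuple in the returned list corresponds to one Pages/Archives entry, with the tuple itself serving as the link between the page identification and the archive identification.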

The operations of the file loader line builder as illustrated on Fig. 9 are functionally identical for the first and for the second preferred embodiments of the present invention. The File Archive Loader Line insertion process, in accordance with the second preferred embodiment of the present invention, is described in association with Fig. 18. The file archive loader line builder module 24 of Fig. 1 is responsible for the performance of the process. On Figs. 10 and 18 like steps are designated by like numbers. When not noted otherwise, the steps involved in the operation of the method according to the second preferred embodiment of the present invention, which is illustrated on Fig. 18, are functionally identical to the steps involved in the operation of the method according to the first preferred embodiment of the present invention, which is illustrated on Fig. 10.

Still referring to Fig. 18, the process begins at step 262 by the activation of the module. At step 264 the HTML page identification is saved in order to be utilized for subsequent processing. At step 266 a page reference entry is read from the included pages list. At step 268 the page pointed at by the page reference entry is loaded and at step 270 a data structure designed to store the file archive loader line is initialized. At step 271 the value of a counter variable designed to hold the total number of references found on a discrete page is initialized to zero. At step 272 a tagged line is obtained from the loaded page. At step 274 the line is examined regarding the presence of an archive reference. If the line contains an archive reference then at step 276 the reference count is increased by the value of one and at step 278 the archive reference or the archive identification is added to the file archive loader line. Subsequently at step 280 it is determined whether more unexamined lines exist on the loaded page. If the result is positive then program control returns to step 272 to obtain the next unexamined line from the loaded page file. As long as more unexamined lines exist in the loaded page file, program control performs an inner loop across steps 272 through 280. When the end-of-file condition is raised within the loaded page file, control exits the inner loop to examine at step 282 whether more unprocessed pages exist on the included pages list. If the result of the check is positive then at step 275 the value of the reference counter is compared to the limiting value. If the value of the reference counter is greater than the limiting value then at step 277 the page-specific references are removed from the list and program control proceeds to step 266 to obtain the next included page from the included pages list. If the value of the reference counter is not greater than the limiting value, program control proceeds directly to step 266. As long as more unprocessed pages exist, program control performs an outer loop across steps 266 through 277. When all the pages within the included pages list have been processed, control exits the outer loop. At step 284 the saved page is loaded.

In order to enable correct operation of the method in accordance with the second preferred embodiment of the present invention, at step 285 the Pages/Archives file, created at step 131 of Fig. 17, is traversed in order to extract the entries thereof. Each entry includes a page identification and an archive identification linked thereto. Each entry is suitably inserted into the archive identification list on the file archive loader line.

At step 286 the file archive loader line is appropriately structured. At step 288 the file archive loader line is inserted into the loaded page and at step 290 program control returns to the calling module.
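The assembly of the archive identification list in the second preferred embodiment can be illustrated by the sketch below. It assumes that only the Pages/Archives entries belonging to the saved page are inserted, and that they are placed after the next-level archives and before the trigger archive; the description does not fix this order beyond placing the trigger archive last. The function name archive_list_for is hypothetical.

```python
def archive_list_for(page_id, next_level_archives, pages_archives,
                     trigger="TRIGGER.JAR"):
    """Combine look-ahead archives, the page's own archives, and the trigger."""
    names = []
    for name in next_level_archives:              # collected at steps 266-282
        if name != trigger and name not in names:
            names.append(name)
    for entry_page, archive_id in pages_archives:  # step 285
        if entry_page == page_id and archive_id not in names:
            names.append(archive_id)              # the page's own archive
    names.append(trigger)                         # trigger archive goes last
    return names
```

The resulting list would then be joined with commas and placed in the ARCHIVE parameter when the line is structured at step 286.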

Fig. 19 illustrates an example of the conversion of HTML pages into modified HTML pages in the second preferred embodiment of the present invention when wrapper module 37 of Fig. 1 is a Java Applet. HTML page 162 contains two references to external files and one reference link to an HTML file. Reference 163 regards an on-site external image file named "CAT.JPEG". Modified HTML page 165 contains the reference 163', which is identical to the reference 163 on HTML page 162. Reference 164 regards an off-site external image file named "DOG.GIF" located at a remote site indicated by the network address "www.ani.com". In the present example the network address is a Web URL. Modified HTML page 165 contains the reference 164', which is identical to the reference 164 on HTML page 162. File loader line 169 on page 165 includes the file archive "WRAPPER.JAR", which contains the external files "CAT.JPEG" and "DOG.GIF".

Referring back to Fig. 8, the conceptual network of logical relationships illustrated among the set of HTML pages is identical for both the first and the second preferred embodiments of the present invention. Still referring to Fig. 8, HTML page A (200) includes references to page B (202), page C (204), page D (206), and page E (208). The pages 202, 204, 206, 208 include archive files JARA.JAR (214), JARB.JAR (216), JARC.JAR (226), JARD.JAR (228), JARI.JAR (230), JARJ.JAR (232). Therefore, the ARCHIVE parameter of the file archive loader line included in the modified HTML page 165 of Fig. 19 contains a list of file archive identifications corresponding to the values of JARA.JAR, JARB.JAR, JARC.JAR, JARD.JAR, JARI.JAR, and JARH.JAR. In addition the list contains the archive WRAPPER.JAR associated with the external file references 163 and 164. The list also contains the archive TRIGGER.JAR including the trigger module.
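The way the ARCHIVE list of a higher-level page is gathered from the archives of the pages it references can be sketched as follows. This is a minimal illustration in Python, assuming the page hierarchy and the per-page archive assignments are held in two plain dictionaries. The function name and the particular assignment of archives to pages in the usage note are hypothetical.

```python
def lookahead_archives(page, page_links, page_archives):
    """Collect the archives of all pages directly referenced by `page`.

    page_links:    dict mapping a page to the list of pages it references
    page_archives: dict mapping a page to the list of archives it includes
    Both structures are assumed stand-ins for the control data structures
    of the disclosure.
    """
    collected = []
    for child in page_links.get(page, []):
        for archive in page_archives.get(child, []):
            if archive not in collected:    # avoid listing an archive twice
                collected.append(archive)
    return collected
```

For instance, with page A referencing pages B and C, where B includes JARA.JAR and JARB.JAR and C includes JARB.JAR and JARC.JAR, the call for page A yields the three distinct archive names in order of first appearance.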

It will be easily perceived by one with ordinary skill in the art that on Figs. 12, 13, 14, 15 and 19 only the simplified lines relevant to the teachings of the disclosure were shown in order not to obscure the salient features of the present invention. It will be also clear that the foregoing description of the preferred embodiments is exemplary only. Different functional steps could be taken in order to achieve substantially the same results. Different routines could be utilized or the routines could be organized in different manner. The control data structures could be designed in diverse different formats. For example in some embodiments of the present invention the analyzing of the transformed page could be performed parallel to the operation of the system preparation module and in other embodiments the Pages/References index file could be integrated with the included pages list. The control data structures mentioned could be implemented in different combinations or could be eliminated altogether.

An alternative embodiment of the present invention is described next. The alternative embodiment, which will be referred to generally as the "third embodiment of the present invention" in the text of this document, is focused on the utilization of a functionally enhanced extractor module. In the third embodiment of the present invention, the extractor module is endowed with additional functions, features, and responsibilities. The basic stages involved in the execution of the proposed method are substantially similar to those present in the previously described embodiments. The third embodiment of the present invention effects a considerable improvement in regard to the response times experienced by users of client platforms during the delivery of specifically requested content information from a wrapped content provider network site. The improvement is accomplished in part by the deployment of the enhanced extractor module, referred to generally in the following text as the "extractor/downloader module", within the configuration of the system. The improvement to the response times is further accomplished by a number of modifications made to the main flow of the method, to the inner logic of some of the routines constituting the method, and by the resulting reconstructed data structures constituting the processed HTML pages.

Fig. 20 illustrates an exemplary system configuration operative in the execution of the proposed system and method, in accordance with the third preferred embodiment of the present invention. It should be noted that only software components functional to the operation of the proposed system and method are shown on the discussed drawing. Additional components operating on the network level and on the operating system level were described in association with Fig. 1. It further should be noted that although only a single client system and a single server system appear on the discussed figure, in a realistically configured computing and communicating environment a plurality of client systems could be communicatively connected to a plurality of server devices. Client system 320 is implemented on a computing platform within a data communications network such as the Internet or more specifically the World Wide Web (Web). Client 320 is communicatively linked to server system 328.

Server 328 is a content providing site implemented on a computing platform within a data communications network such as the Internet or more specifically the World Wide Web (Web). Client 320 communicates with server 328 in order to access and interact with content information embedded on server 328. Interaction with the information involves the submitting of requests for specific Web pages by the client 320, the downloading of specific Web pages stored on server 328 to the client 320, and the suitable processing of the Web pages by client 320. Client 320 includes, in addition to the required network-level and operating system-level components, network browser 322, browser cache 324, and extractor/downloader module 326. Browser 322 is a software application used to locate and display Web pages. Browser 322 can be any of the existing Web browsers such as the Netscape Navigator or the Microsoft Internet Explorer (MSIE). Browser 322 is activated and manipulated by the user of client 320 in order to access and interact with specific Web pages. Browser cache 324 is a special high-speed storage mechanism. Browser cache 324 can be either a reserved section of main memory or an independent high-speed storage device allocated specifically for caching purposes. The most recently downloaded content data, such as a Web page, is physically stored in a specific buffer associated with the memory area or with an allocated segment of a high-speed storage device such as a hard disk. At any particular moment of time a number of cached Web pages could be held within the buffer. Thus, when the browser 322 needs to access the downloaded data, the browser cache 324 is checked first to see if the data is stored therein.

Extractor/downloader module 326 is a specifically developed software module designed to effect the independent downloading of certain Web pages from server 328 to client 320. Module 326 is further operative in the extraction of the downloaded pages from the browser cache 324. In the third preferred embodiment of the present invention module 326 is downloaded from server 328 by browser 322 of client 320 following the first access of server 328 by client 320. Subsequently extractor/downloader module 326 is installed on a storage device associated with client 320. During routine interaction between client 320 and server 328 browser 322 activates module 326. Thereafter, module 326 operates continuously in the background, i.e., simultaneously and asynchronously with the operation of browser 322, as long as browser 322 is active. Module 326 effects suitable downloading of pages from the server 328, and extraction of the downloaded pages from the browser cache 324. A detailed explanation of the functions, features, and responsibilities of extractor/downloader module 326 will be provided hereunder in association with the following drawings.

Server 328 is a software system operative in providing content information such as Web pages to requesting clients. Server 328 contains HTML pages 330, local external files 332, site wrapper application 36, wrapped HTML pages 338, packaged pages and external files 340, and extractor/downloader module 342. Site wrapper application 334 is a set of software modules operative in the transformation of the content providing site such as server 328 to provide specifically packaged and optimized files to be downloaded to the requesting clients. The fundamental elements and associated operations constituting site wrapper application 334 were described hereinabove. In the following text descriptions will be provided only regarding the modified elements of the proposed system and method, such as main logic flow.

HTML pages 330 are data structures representing Web documents. The layout and structure of the content information embedded within pages 330 is defined and described in the Hypertext Markup Language by using control fields such as tags and attributes. A plurality of predefined tags is used to format and lay out the information embedded within an HTML page. For instance, <P> is used to make paragraphs and <I> ... </I> is used to italicize fonts. Tags are also used to specify hypertext links. Hypertext links allow HTML page developers to direct users to other HTML pages or to external files, such as images, sound, video, applets, and the like, with only a click of the pointing device on either a graphical structure or one or more words within the page.

Local external files 332 typically include content information having multi-media format such as graphics, still images, video, sound, applications, virtual reality, or the like. Files 332 can be referred to directly by the utilization of specific tags inserted into the body of HTML pages 330. The connection between specific HTML pages 330 and associated local external files 332 was explored in detail hereinabove in association with the first and second embodiments of the present invention. Note should be taken that the HTML pages 330 could reference HTML pages or external files stored on one or more separate server systems (not shown).

As was described in detail hereinabove in association with the first and second embodiment of the present invention, the site wrapper application 336 is operative in the building of the packaged pages and external files 340 and the wrapped HTML pages 338. The resulting packaged pages and external files 340 are file archives containing sets of specifically organized, packaged and compressed HTML pages and multi-media files to be transmitted to the requesting client in association with the "driver" HTML page or the HTML page referring to the specific package. Wrapped HTML pages 338 are the pre-processed HTML pages, which include a formatted list of the packages 340 associated with the specific page. Note should be taken that copies of the packages 340 could be "dispersed" or controllably deployed on other server systems (not shown), which are located "closer" to the client system (according to the particular network topology), in order to speed up the delivery of the packages to the clients.

Extractor/downloader module 342 is a specifically developed software module utilized as the source for module 326 held on client 320. Module 342 is routinely downloaded to client 320 when the client first accesses server 328. Module 342 could be located on one or more other servers (not shown), such as on servers constituting the Intranet (sub-network) of the application provider, or could be stored on any other server.

Fig. 21 shows the various stages involved in the method of wrapping the HTML pages 330 of server system 328 of Fig. 20. At step 346 the extractor/downloader module 342 of Fig. 20 is created. The extractor/downloader is a software program specifically developed offline. Diverse software products typically utilized as Internet applications such as Java, ActiveX, JScript, VBScript, and JavaScript, among others, could be used as tools in the development of extractor/downloader module 342. Module 342 is implemented as a small application that runs as a background process to a browser application on client systems. Module 342 is responsible for the downloading of packaged archive files from a server system to the clients, and for extracting the cached archives downloaded for use by the browser of the client system. In the third preferred embodiment of the present invention the extractor/downloader module is an ActiveX object or ActiveX control. Thus, module 342 can be developed in a variety of languages, including C, C++, Visual Basic, and Java. An ActiveX control is similar to a Java applet. Unlike Java applets, however, ActiveX controls have full access to the Windows operating system.

In the first and second embodiments of the present invention described in the foregoing, an HTML wrapping mechanism operating in association with a look-ahead mechanism was described. The look-ahead mechanism is operative in analyzing the set of HTML pages stored on the server that are logically connected to other HTML files and/or to specific external files. A data structure reflecting the hierarchical relationships between the files is created. The structure is methodically examined and a file archive loader line instruction is built for each operative page. The file archive loader line is inserted into the respective specific page. The file archive loader line contains a set of file archive references. The referenced file archives contain a set of external files referenced by the "next level" HTML pages, which in turn are referenced by a specific HTML page. The technique used was discussed in detail in the foregoing.

The first and second embodiments of the present invention employ a specific module referred to as the "extractor". The extractor module is implemented on the client system. The extractor is responsible for accessing the downloaded archive files within the browser cache memory area, extracting the HTML files and the associated external files from the archives, decompressing the files if necessary, and handing over the processed files to the network browser. The functionality and responsibilities of the extractor module were discussed in the foregoing. In the third embodiment of the present invention, the entire set of the above-described techniques is integrated into the proposed method and is employed with specific enhancements in the method logic and in the functionality of the extractor module.

At step 348 the personalization of the system is performed. Personalization step 348 is accomplished via the service of the personalization module 27 of Fig. 1. A detailed description of the personalization concept was provided in the foregoing in association with the first and the second preferred embodiments of the present invention. The actual process of the site wrapping operation extends across steps 350 through 358. At step 350 a map of the content provider site is created. The creation of the site map involves the analysis of the hierarchical links among the interrelated HTML pages and the external files referred to by the HTML pages. In accordance with the output of the analysis at least one data structure representing the logical links between the interrelated HTML pages is created. The data structure is used as a set of linked reference points in the building of the suitable downloadable file archives, in the creation of the file archive loader line, and in the modification of the relevant HTML pages. The discussed process was described in detail in the foregoing, in association with the previously discussed embodiments of the present invention. At step 352 the site map is optimized in order to build effective, efficient, and practical file archives. Typically the archive utility tool defines a maximum physical size for a file archive. Thus, it is important to know the sizes of the files to be packaged prior to the actual building of the packages in order to achieve practical, efficient, and effective organization of the files within the archives.
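The site map creation of step 350 can be sketched in Python with the standard html.parser module. The sketch assumes that the pages are available as a dictionary of page name to HTML text, that page-to-page links are A tags with an HREF attribute, and that external files are referenced through a SRC attribute on a small set of tags. The class name, the function name, and the chosen tag set are illustrative assumptions only.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect page links and external file references from one HTML page."""

    def __init__(self):
        super().__init__()
        self.pages = []
        self.external_files = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser delivers tag and attribute names in lower case.
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.pages.append(attributes["href"])
        elif tag in ("img", "embed", "audio", "video") and attributes.get("src"):
            self.external_files.append(attributes["src"])


def build_site_map(pages):
    """Map each page name to the pages and external files it references."""
    site_map = {}
    for name, html_text in pages.items():
        collector = LinkCollector()
        collector.feed(html_text)
        collector.close()
        site_map[name] = {"pages": collector.pages,
                          "files": collector.external_files}
    return site_map
```

The resulting dictionary plays the role of the linked reference points used by the later packaging and loader line steps.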

During optimization the physical sizes of the linked files are examined and the optimal number of files to be inserted into a given archive is calculated. If the total size of the files exceeds the physical limit of the archive then the packaging scheme is modified and additional archives are used for the same set of files. The optimization process also provides solutions for other problems inherent in the original organization of the HTML pages and other files on the content provider site. Thus, the optimization can be used to eliminate duplicate files, to transfer hierarchically higher level files into lower level archives, and the like.
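One possible way to carry out the size-based part of the optimization is a first-fit decreasing packing, sketched below in Python. The disclosure does not name a packing strategy, so the algorithm, the function name, and the use of uncompressed sizes are assumptions made for illustration. Duplicate file names are dropped before packing.

```python
def plan_archives(file_sizes, max_archive_size):
    """Distribute files over archives so that no archive exceeds the limit.

    file_sizes: list of (file_name, size) pairs; repeated names are ignored.
    Returns a list of archives, each a list of file names.
    First-fit decreasing is an assumed strategy, not the patented one.
    """
    seen = set()
    unique = []
    for name, size in file_sizes:
        if name not in seen:                # eliminate duplicate files
            seen.add(name)
            unique.append((name, size))

    archives = []                           # each entry: {"used": int, "files": [...]}
    for name, size in sorted(unique, key=lambda item: item[1], reverse=True):
        if size > max_archive_size:
            raise ValueError("file %r exceeds the archive size limit" % name)
        for archive in archives:
            if archive["used"] + size <= max_archive_size:
                archive["files"].append(name)
                archive["used"] += size
                break
        else:                               # no existing archive had room
            archives.append({"used": size, "files": [name]})
    return [archive["files"] for archive in archives]
```

When the total size of a set of files exceeds the limit, the function simply opens additional archives, which mirrors the modification of the packaging scheme described above.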

At step 354 a package building process is performed. The process can utilize off-the-shelf commercially distributed compression and packaging utilities. The details of the package building process were also described in the foregoing. At step 356 the file archive list is created. The list of the archives associated with the next level HTML page is assembled into an archive list line as was previously described. At step 358 the suitable HTML pages are updated by the structuring of a file archive loader line and by the insertion of the file archive loader line into the page. The structuring and insertion of the file archive loader line was also described in the foregoing in association with the first and second embodiments of the present invention.
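The package building of step 354, and the matching extraction later performed on the client, can be sketched with the standard Python zipfile module. JAR archives use the ZIP container format, so a ZIP archive serves here as a stand-in. The function names and the in-memory handling are assumptions made to keep the example self-contained.

```python
import io
import zipfile


def build_archive(files):
    """Compress and package files (name -> bytes) into ZIP-format archive bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


def extract_archive(archive_bytes):
    """Unpack archive bytes back into a name -> bytes dictionary."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```

A real deployment would write the archive bytes to a file such as JARA.JAR on the server, while the client-side module would read them from the browser cache.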

It is important to note that in the third preferred embodiment of the present invention the tagged lines used as external file references within the body of the HTML page are not replaced. For example, a tagged line referencing an external image file, such as '<IMG SRC="CAT.JPEG" WIDTH=9 HEIGHT=12 ALT="CAT.JPEG">', will stay in the body of the page unmodified. The external image file 'CAT.JPEG' will be packaged into a file archive, such as 'JARA.JAR'. The name of the archive will be appropriately attached to the file archive loader line and the loader line will be inserted into the body of the page. In the third preferred embodiment of the present invention the wrapper module and the trigger module, described in the foregoing in association with the first and second preferred embodiments, are inoperative. The core component implemented in the proposed system is the extractor/downloader module.

Extractor/downloader module 342 of Fig. 20 is developed and stored on the server 328 of Fig. 20. Extractor/downloader module 342 of Fig. 20 can be deployed on other computing platforms within the local network (Intranet) of the application service provider or can be deployed on any other server. Copies of the extractor/downloader module 342 of Fig. 20 can be distributed across a set of geographically separated servers in order to provide substantially rapid download to a plurality of client platforms. Copies of the extractor/downloader module 342 of Fig. 20 are transmitted to the clients, installed on the clients' platforms, activated by the clients' network browsers, and executed as continuous background processes on the clients' machines during the operational cycle of the network browsers.

Fig. 22 illustrates the method employed in downloading the extractor/downloader module 342 from the server 328 to the client 320. The method also provides optional, periodic, and transparent software updates transmitted from the server 328 of Fig. 20 to the client 320 of Fig. 20. The updates are applied to the operative extractor/downloader modules distributed across a plurality of client platforms within the data network. The application of the updates is essential in order to synchronize the existing program module implemented on the client platforms with the potentially upgraded program module installed on the server platform 328 of Fig. 20. Following the communicative linking of client system 320 of Fig. 20 to server 328 of Fig. 20, at step 360 it is checked whether the user that operates client 320 of Fig. 20 is a new (unregistered) user. If the result is affirmative then at step 362 it is determined whether the user desires to download the proposed extractor/downloader module 342 of Fig. 20. If the result is affirmative then at step 364 the extractor/downloader module 342 of Fig. 20 is transmitted to the client 320 of Fig. 20 and the method control proceeds to step 372 in order to terminate the program. If the result at step 362 is negative then the method control terminates the program at step 372 without downloading the extractor/downloader module to the client 320 of Fig. 20. If it is determined at step 360 that the user accessing the server 328 of Fig. 20 is a known (registered) user then the extractor/downloader module 342 of Fig. 20 is already installed on the client platform thereof. At step 366 the version of the module stored on the server is compared to the version of the module installed on the client platform. If at step 368 it is determined that the versions are identical then the method control terminates the program at step 372. Conversely, if the versions differ then at step 370 the updates embodying the differences between the versions are transmitted to the client system and applied to the extractor/downloader module installed therein. Subsequently at step 372 the program terminates.
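The decision flow of Fig. 22 can be condensed into a short Python sketch. The function name, the parameter names, and the returned action labels are hypothetical, and the comparison of versions is assumed to be a simple equality test on version strings.

```python
def module_delivery_action(is_registered, wants_module, client_version, server_version):
    """Return the action the server takes for a connecting client.

    The action labels "send_module", "send_update" and "none" are
    illustrative names, not terms from the disclosure.
    """
    if not is_registered:                    # step 360: new (unregistered) user
        if wants_module:                     # step 362: user accepts the download
            return "send_module"             # step 364: transmit the module
        return "none"                        # step 372: terminate without download
    if client_version == server_version:     # steps 366 and 368: versions identical
        return "none"                        # step 372: nothing to do
    return "send_update"                     # step 370: transmit and apply updates
```

Every path ends at the termination of step 372, so the function returns exactly one action per connection.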

It would be easily perceived that the proposed method and system could operate differently. Copies of the extractor/downloader module and the associated updates could be transmitted to and installed in the client platforms in diverse other ways using diverse other means. Although the proposed process is highly effective in systematically maintaining the integrity of the proposed system, the extractor/downloader module and the subsequent updates could be distributed to the users of the client systems via standard communication lines asynchronously to the operative method, such as in attachments to electronic mail. The module could also be sent via more traditional delivery channels, such as conventional mail services and the like.

Fig. 23 is a flowchart illustrating the downloading of the pages from a wrapped content provider site such as a server system to a client system. In the third preferred embodiment of the present invention, the downloading of the pages is performed after the site was suitably processed by the site wrapper application, the HTML pages and associated external files were packaged into appropriate file archives, the HTML pages were modified by the insertion of a file archive loader line, and the extractor/downloader module was transmitted to and installed in the client platform. The discussed drawing illustrates particularly the operations involved in the downloading and processing of the first HTML page.

At step 374 the first HTML page is downloaded and stored into the network browser's cache. The extractor/downloader module running as a background process extracts the page from the browser cache at step 376, and removes the sources of the external files from the page at step 378 in order to eliminate the external file requests by the network browser. As a result the page is displayed on the display device of the client platform as text only, without images or other multi-media formatted files (step 380). Meanwhile, in the background, the extractor/downloader obtains the list of packages appearing on the file archive loader line of the first page and downloads the archives from the server (step 382). The packages are stored in the browser cache memory area. The extractor/downloader extracts the packages from the cache at step 384. At step 386 the external file references are restored to the first page and at step 388 the page with the restored external file references is redisplayed on the display device of the client system.
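The removal of the external file sources at step 378 and their restoration at step 386 can be sketched in Python as follows. The sketch assumes that external files are referenced only through the SRC attribute of IMG tags and that an empty SRC value is enough to suppress the browser request. The regular expression and the function names are illustrative and do not describe the actual module.

```python
import re

# Matches the part of an IMG tag up to and including "SRC=", then the value.
_IMG_SRC = re.compile(
    r'(<IMG\b[^>]*?\bSRC\s*=\s*)("[^"]*"|\'[^\']*\'|[^\s>]+)',
    re.IGNORECASE)


def strip_sources(page_html):
    """Blank every IMG source and return (stripped_page, saved_sources)."""
    saved = []

    def blank(match):
        saved.append(match.group(2))         # remember the original value
        return match.group(1) + '""'

    return _IMG_SRC.sub(blank, page_html), saved


def restore_sources(stripped_html, saved):
    """Put the saved sources back in the order in which they were removed."""
    remaining = iter(saved)

    def refill(match):
        return match.group(1) + next(remaining)

    return _IMG_SRC.sub(refill, stripped_html)
```

The stripped page corresponds to the text-only display of step 380, and the restored page to the redisplay of step 388.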

The rest of the pages are displayed only once, without removing and restoring the external file references. For each page extracted from the network browser's cache the extractor/downloader module parses the file archive loader line, downloads the referenced files in one HTTP connection, and extracts the respective files from the browser cache. As a result all the files needed by the network browser will be present and ready on the client platform at practically all the appropriate points in time.
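The per-page handling described above can be sketched in Python under several stated assumptions: the loader line is taken to be an APPLET tag with a comma-separated ARCHIVE parameter (the format used in the earlier embodiments, since the third embodiment's exact format is not specified), the download is delegated to a caller-supplied function, and the cache is modelled as a dictionary. All names are hypothetical.

```python
import re


def parse_loader_line(page_html):
    """Return the archive names listed on the file archive loader line."""
    match = re.search(r'<APPLET\b[^>]*\bARCHIVE\s*=\s*"([^"]*)"',
                      page_html, re.IGNORECASE)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


def prefetch_archives(page_html, fetch, cache):
    """Download every archive named on the loader line that is not cached yet.

    fetch: callable taking an archive name and returning its bytes;
           it stands in for the actual HTTP transfer.
    cache: dict acting as a stand-in for the browser cache.
    """
    for name in parse_loader_line(page_html):
        if name not in cache:
            cache[name] = fetch(name)
    return cache
```

Passing a single fetch callable keeps the transport out of the sketch, so a caller could reuse one connection for all the archives of a page.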

It will be readily understood that the present invention is not limited to the specific embodiments described above. In other preferred embodiments of the invention additional elements, components, modules, tables and methods could be added based on the broader aspects of the proposed solution. For example, in another embodiment the entire site could be wrapped into one packaged file.

In a still further embodiment scripts could be added to the HTML wrapper.

Various other enhancements and developments are contemplated based on the underlying concept of the proposed invention. The present invention is not limited to what has been particularly shown and described hereinabove. Rather the scope of the present invention is defined only by the claims, which follow.

Claims

I CLAIM:

1. In a computing environment accommodating at least one client system connectable to one or more server systems a method of processing files in a first computer system into structured file archives to be transferred to and to be re-processed within a second computer system, the method comprises the steps of: determining first reference to first external file; and wrapping the first external file; and replacing first reference with second reference thereby indicating new location for the wrapped first external file.

2. The method of claim 1 further comprising the steps of: examining a file for the presence of references to external files; and transmitting the file having a replaced first reference and the first wrapped external file to a client; and extracting the wrapped first external file by calling the second reference; and selecting a first reference to be replaced by a second reference thereby indicating new location for the wrapped first external file.

3. In a computing environment having a server connected to a data network via a communication device, a storage device holding files, and a file wrapper system, the wrapper system comprising the elements of: a file reference transformer for replacing first reference identified within a file with second reference thereby indicating new location for an external file; and a file processor for selecting and loading the external file; and a file wrapper for wrapping the external file.

4. The system of claim 3 further comprises the elements of: a file primer for calling and loading a file associated with a prepared reference list; a compression and packaging utility for compressing and packaging the wrapped external file; file reference builder for scanning and creating references to external files within a file; and an external file processor for providing appropriate functionalities to the wrapped external files; and an extractor module for extracting the archived files from the file archives stored within the network browser's cache memory area on the client system.

5. The system of claim 4 wherein the file associated with a prepared reference list comprises a hypertext file.

6. The system of claim 4 wherein the file associated with a prepared reference list comprises an HTML file.

7. In a computing environment accommodating at least one client system connectable to one or more server systems, a method of packaging files in a first computer system into file containers to be transmitted to and to be re-processed within a second computer system, the method comprising the steps of: resolving a first reference to a first external file; and inserting the first external file into the file container; replacing the first reference with the second reference thereby indicating placement of the packaged first external file in the file container.

8. The method of claim 7 further comprising the steps of attaching an external file processor into the file container; and transmitting the file container to a client.

9. The method of claim 7 further comprising the step of attaching an external file extractor utility into the file container.

10. The method of claim 8 wherein the external file processor is an applet.

11. The method of claim 7 wherein the file comprises a hypertext file.

12. The method of claim 7 wherein the file comprises an HTML file.

13. The method of claim 7 wherein the second reference referencing the external file processor.

14. The method of claim 7 wherein the second reference referencing the packaged external file.

15. In a computing environment accommodating at least one client system connectable to one or more server systems, a method of processing files in a first computer system into structured file archives to be pre-downloaded to a second computer system, the method comprising: identifying a set of related files in the first computer system; and analyzing the hierarchical usage pattern of at least two related files in the identified set of related files in order to establish a file usage hierarchy based on the at least two files in the first computer system; and examining at least one file within the file usage hierarchy based on the at least two files analyzed in the first computer system to determine at least one reference to a file archive; and structuring at least one file archive reference into at least one file archive loader command; and inserting the at least one file archive loader command including the at least one file archive reference into the at least one analyzed file in the first computer.

16. The method of claim 15 wherein the step of identifying comprising the steps of: determining a first reference to at least one distinct file in the first computer system; and loading the at least one distinct file in order to identify a next reference to a distinct file; and determining the next reference to the distinct file.

17. The method of claim 15 wherein the step of analyzing comprising the steps of: determining a first reference to a file within the at least one file in the first computer; and formatting the referenced file identification in the first computer; determining the next reference to a file within the at least one file in the first computer.

18. The method of claim 15 wherein the step of examining comprising the steps of: loading the at least one referenced file to enable the determination of at least one file archive reference therein; and determining a first reference to a file archive within the at least one referenced file in the first computer; and storing the referenced file archive identification referenced by the at least one referenced file; and determining a next reference to a file archive within the at least one referenced file.

19. The method of claim 15 wherein the step of structuring comprising the steps of: obtaining a first file archive identification from the stored file archives identifications; and appending the file archive identification to a file archive loader command; and obtaining the next file archive identification from the stored file archives identification.

20. The method of claim 15 wherein the at least two files transmitted to the second computer system.

21. The method of claim 15 wherein the file archive loader command line appended to the at least one file transmitted to the second computer effectuates the concurrent transmission of at least one file archive referenced by at least one file referenced by the at least one transmitted file.

22. In a computing environment and having a server connected to a data network via a communication device, a storage device holding a set of files, the files including at least two distinct files, a file pre-downloading system comprises the element of: a file archive loader line builder for formatting a file archive loader command line.

23. The system of claim 21 further comprising the elements of: a file identifier to recognize the set of files; and a file analyzer for determining the loading order among the at least two distinct files; and a file scanner for locating file archive references within at least one distinct file; and a file archive references collector to store at least one file archive reference; and a file archive loader line builder to format a file archive loader line; and a trigger module for activating the file archive loader line; and an extractor module for extracting the archived files from the file archives stored within the network browser's cache memory area on the client system.

24. The system of claim 21 wherein the set of files comprises hypertext files.

25. The system of claim 21 wherein the set of files comprises HTML files.

26. The system of claim 21 wherein the file archive loader line comprises an HTML tagged line.

27. The system of claim 25 wherein the file archive loader line comprises the elements of: an APPLET CODE tag to effect the loading of an applet in the second computer; and an applet identification to indicate the applet to be loaded in the second computer; and an ARCHIVE tag to effect the loading of file archives in the second computer; and at least one archive file identification to indicate the file archives to be loaded in the second computer; at least one file archive identification to indicate the file archive containing the trigger module; and at least one file archive identification to indicate a file archive containing the extractor module.
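The loader line elements recited in claim 27 can be illustrated with a minimal sketch. It assumes the line is an HTML APPLET element whose ARCHIVE attribute holds a comma-separated list of archives; the function name `build_loader_line`, the file names, and the zero-size WIDTH and HEIGHT attributes are hypothetical and do not come from the claims.

```python
from html import escape


def build_loader_line(applet_class, archive_ids, trigger_archive, extractor_archive):
    """Format one HTML APPLET line listing every archive to be loaded.

    Hypothetical helper: the trigger and extractor archives are listed
    first, followed by the content archives, with duplicates dropped.
    """
    ordered = []
    for name in [trigger_archive, extractor_archive, *archive_ids]:
        if name not in ordered:
            ordered.append(name)
    archives = ",".join(ordered)
    # The zero-size WIDTH/HEIGHT values are an assumption for an invisible applet.
    return '<APPLET CODE="{}" ARCHIVE="{}" WIDTH="0" HEIGHT="0"></APPLET>'.format(
        escape(applet_class, quote=True), escape(archives, quote=True)
    )


if __name__ == "__main__":
    print(build_loader_line(
        "Trigger.class", ["page2.jar", "page3.jar"], "trigger.jar", "extractor.jar"
    ))
```

The appending of identifications one at a time, as in claim 19, corresponds here to the loop that accumulates `ordered` before the line is formatted.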

28. The system of claim 25 wherein the at least one file archive referenced by the ARCHIVE tag is in the Java Archive format.

29. The system of claim 27 wherein the at least one distinct file included in the file archive is one of a still image, streaming video, sound, music, text, animation, and virtual reality resource file.

30. The system of claim 21 wherein the trigger module operative in the activation of the file archive loader line is a Java applet.

31. The system of claim 21 wherein the trigger module operative in the activation of the file archive loader line is an ActiveX module.

32. The system of claim 21 wherein the trigger module operative in the activation of the file archive loader line is a JavaScript module.

33. The system of claim 21 wherein the extractor module is operative in the extraction of the archived files from the file archives within a cache memory device on the client system.

34. The system of claim 21 further comprising a personalization module for providing the option of modifying selected parameters of the system.

35. In a computing environment accommodating at least one client system connectable to one or more server systems, a method of organizing a set of file archive references in a first computer system into at least one structured file archive loader line, embedded into a high-level file to be transferred to a second computer concurrently with the high-level file and to be processed in the second computer in association with a low-level referencing file, the method comprising the steps of: analyzing a set of files for establishing a hierarchical relationship model among at least two files; establishing a set of files located on a lower level of the hierarchical relationship model referenced by a set of files located on a higher level of the hierarchical relationship model; and examining the file located on the lower level of the hierarchical relationship model referenced by the file located on a higher level of the hierarchical relationship model for the presence of a file archive reference; and structuring a file archive loader line including the set of the file archive identifications; and inserting the structured file archive loader line into the file located on the higher level of the hierarchical relationship model referencing the file located on the lower level of the hierarchical relationship model; thereby accomplishing the transmission of the file archives concurrently with the file located on the higher level of the hierarchical relationship model to a second computer and the processing of the file archives subsequent to the transmission of the referencing file located on the lower level of the hierarchical relationship model.

36. The method of claim 34 wherein the step of analyzing further comprises the steps of: determining a first reference to a file located on the lower level of the hierarchical relationship model within the at least one file located on the higher level of the hierarchical relationship model in the first computer; and storing the referenced file identification in the first computer; and determining the next reference to a file located on a lower level of the hierarchical relationship model within the at least one file located on the higher level of the hierarchical relationship model in the first computer; and accumulating a set of file archive identifications indicated by file archive references present in the file located on the lower level of the file hierarchy model.

37. The method of claim 34 wherein the step of examining comprises the steps of: loading the at least one referenced file located on the lower level of the hierarchical relationship model to enable the determination of at least one file archive reference therein; and determining a first reference to a file archive within the at least one file located on the lower level of the hierarchical relationship model in the first computer; and accumulating the referenced file archive identifications referenced by the at least one file located on the lower level of the hierarchical relationship model; and determining a next reference to a file archive within the at least one file located on the lower level of the hierarchical relationship model.
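The examining step of claim 37 can be sketched as a scan of a lower-level file that accumulates each archive identification in the order found. This is only an illustration: it assumes the archive references appear in the ARCHIVE attribute of APPLET elements in an HTML file, and the names `ArchiveReferenceCollector` and `collect_archive_references` are hypothetical.

```python
from html.parser import HTMLParser


class ArchiveReferenceCollector(HTMLParser):
    """Accumulates archive identifications found in APPLET ARCHIVE attributes."""

    def __init__(self):
        super().__init__()
        self.archive_ids = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case.
        if tag != "applet":
            return
        for name, value in attrs:
            if name == "archive" and value:
                for ref in value.split(","):
                    ref = ref.strip()
                    if ref and ref not in self.archive_ids:
                        self.archive_ids.append(ref)


def collect_archive_references(html_text):
    """Return the archive identifications referenced by one loaded file."""
    collector = ArchiveReferenceCollector()
    collector.feed(html_text)
    collector.close()
    return collector.archive_ids


if __name__ == "__main__":
    page = '<APPLET CODE="T.class" ARCHIVE="a.jar, b.jar" WIDTH=0 HEIGHT=0></APPLET>'
    print(collect_archive_references(page))
```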

38. The method of claim 34 wherein the step of structuring comprises the steps of: extracting a first file archive identification from the accumulated file archive identifications; and attaching the file archive identification to a file archive loader line; and extracting the next file archive identification from the accumulated file archive identifications.

39. The method of claim 34 wherein the file located on the higher level of the hierarchical relationship model contains the file archive loader line including file archive identifications referenced by the file located on the lower level of the hierarchical relationship model.

40. The method of claim 34 wherein the file archive loader command line appended to the file located at the higher level of the hierarchical relationship model and transmitted to the second computer, effects the concurrent transmission of the file archive referenced by the file located at the lower level of the hierarchical relationship model.

41. The method of claim 34 further comprising the step of optimizing the hierarchical relationship model.

42. The method of claim 41 wherein the step of optimization comprises the steps of: comparing the file references within the hierarchical relationship model; and removing duplicate file references within the hierarchical relationship model; and counting the number of file references within the hierarchical relationship model; and examining the size of the files referenced by the file references within the hierarchical relationship model; and building a file archiving schema by evaluating the number of files suitable for archiving into a file archive having a predefined maximum packaging capability.
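The optimization steps of claim 42 can be illustrated with a short sketch: duplicate references are removed, the remaining references are counted, and files are grouped into archives that stay under a predefined maximum size. The greedy in-order grouping, the function name `build_archiving_schema`, and the byte-based capacity are assumptions made for illustration; the claim does not specify a packing strategy.

```python
def build_archiving_schema(file_refs, sizes, max_archive_bytes):
    """Group file references into archives no larger than max_archive_bytes.

    file_refs: references in the order encountered (may contain duplicates).
    sizes: mapping from reference to file size in bytes.
    Returns (number of unique references, list of archives as lists of refs).
    """
    unique = list(dict.fromkeys(file_refs))  # drop duplicates, keep first-seen order
    archives, current, current_size = [], [], 0
    for ref in unique:
        size = sizes[ref]
        # Close the current archive when adding this file would exceed the limit.
        if current and current_size + size > max_archive_bytes:
            archives.append(current)
            current, current_size = [], 0
        current.append(ref)
        current_size += size
    if current:
        archives.append(current)
    return len(unique), archives


if __name__ == "__main__":
    refs = ["a.gif", "b.gif", "a.gif", "c.wav", "d.txt"]
    sizes = {"a.gif": 40, "b.gif": 50, "c.wav": 80, "d.txt": 10}
    print(build_archiving_schema(refs, sizes, 100))
```

In this sketch a file larger than the limit is placed alone in its own archive instead of being rejected, which is one possible design choice among several.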

43. In a computing and communications environment accommodating at least one client system connectable to one or more server systems, a method of transferring a set of at least one file archive from a first computer system to a second computer system, the method comprising the steps of: establishing a file archive extractor/file archive downloader module on the second computer system; and activating the file archive extractor/file archive downloader module on the second computer simultaneously with the activation of a network browser module; and inserting into at least one file a list of at least one file archive reference referred to by the at least one file on the first computer; and transferring a first file archive by the network browser module from the first computer system to a temporary storage area on the second computer system; and extracting and preprocessing the first file archive from the temporary storage area on the second computer by the file archive extractor/file archive downloader module; and parsing the file archive list line of the first file archive on the second computer by the file archive extractor/file archive downloader; and downloading a set of at least one file archive referenced by the parsed file archive line on the second computer by the file archive extractor/file archive downloader; and extracting and preprocessing the at least one file archive referenced by the parsed file archive line on the second computer from the temporary storage area by the file archive extractor/file archive downloader.
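The client-side sequence of claim 43 (extract the first archive, parse its archive list, download and extract the listed archives) can be sketched as follows. Several details are assumptions, not part of the claim: archives are treated as ZIP-compatible files (the Java Archive format is ZIP-based), the archive list is assumed to be a text member named `archives.lst` with one identification per line, and `fetch` is a hypothetical callable standing in for the network download.

```python
import io
import zipfile


def extract_archive(archive_bytes):
    """Return a mapping of member name to content for one ZIP-compatible archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}


def load_archives(first_archive_bytes, fetch, list_member="archives.lst"):
    """Extract the first archive, then fetch and extract every archive it lists.

    fetch: hypothetical callable mapping an archive identification to its bytes.
    list_member: assumed name of the member that holds the archive list.
    """
    cache = extract_archive(first_archive_bytes)
    listing = cache.get(list_member, b"").decode("utf-8")
    for line in listing.splitlines():
        archive_id = line.strip()
        if archive_id:
            cache.update(extract_archive(fetch(archive_id)))
    return cache
```

Returning a dictionary stands in for the temporary storage area; a real client would write the extracted files where the network browser can find them.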

44. The method of claim 43 further comprising the steps of: archiving the first file into the file archive containing file references referred to by the first file on the first computer system; and modifying the first file downloaded by the network browser in order to remove the entire set of the references to the files archived within the first file archive; and displaying the modified first file by the network browser; and extracting the unmodified first file from the first file archive within the temporary storage area by the file archive extractor/file archive downloader; and displaying the unmodified first file extracted from the first file archive within the temporary storage area.

45. The method of claim 43 wherein the step of establishing comprises the steps of: creating a file archive extractor/file archive downloader module on the first computer system; and transferring the file archive extractor/file archive downloader from the first computer system to a second computer system; and delivering a set of at least one file archive extractor/file archive downloader updates from the first computer system to the second computer system.

46. In a computing environment having a client connected to a data communication network via a communication device, a system comprising the elements of: a file archive downloader module to transfer a set of file archives from a first computer system to a temporary storage area on a second computer system; and a file archive extractor module to extract the set of file archives, transferred from the first computer system, from the temporary storage area on the second computer system and to preprocess the extracted file archives to prepare the files archived within the file archives for processing by a network browser.

47. The system of claim 46 wherein the file archive extractor/file archive downloader is an ActiveX object.

48. The system of claim 46 wherein the file archive extractor/file archive downloader is a Java applet.