WO1997031319A1 - Querying and navigating changes in web repositories - Google Patents

Querying and navigating changes in web repositories

Info

Publication number
WO1997031319A1
Authority
WO
WIPO (PCT)
Prior art keywords
document
versions
documents
linked
modified
Prior art date
Application number
PCT/US1997/002407
Other languages
French (fr)
Inventor
Thomas J. Ball
Yih-Farn Robin Chen
Frederick Douglis
Elefterios Koutsofios
Original Assignee
At & T Corp.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US08/797,756 external-priority patent/US5860071A/en
Application filed by At & T Corp. filed Critical At & T Corp.
Priority to CA002246650A priority Critical patent/CA2246650C/en
Priority to EP97907643A priority patent/EP0922258A1/en
Priority to JP53025997A priority patent/JP2001508561A/en
Publication of WO1997031319A1 publication Critical patent/WO1997031319A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/957Browsing optimisation, e.g. caching or content distillation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • User Interface Of Digital Computer (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention is a system for exploring changes to World Wide Web pages and Web structure, or to any other repository that supports recursive document comparison. The user may explore the differences between documents with respect to two dates. Differences between documents are computed automatically and summarized in a new HTML page, and differences in link structure are shown via graphical representations. The present invention is the combination of two tools that complement one another. The AT&T Internet Difference Engine (AIDE) is a tool for tracking and viewing modifications to World Wide Web pages, which has been extended to support recursive tracking of documents; Ciao is a graphical navigator that allows users to query and browse structural connections embedded in a document repository. The union of these tools lets users get information on the evolution of documents of interest (both textually and graphically), browse the differences interactively, and dynamically modify the set of documents with which they interact.

Description

QUERYING AND NAVIGATING CHANGES IN WEB REPOSITORIES
This application claims the benefit of the provisional application entitled Querying And Navigating Changes in World Wide Web Repositories, filed February 23, 1996 and having Serial No. 60/012,151.
Background of the Invention
The invention relates to a system for recursively tracking changes to pages or documents in a repository. The system can indicate on a text page whether links to other pages have been modified or whether the underlying linked pages have been modified. The system can also display how the linked structure of a document has been modified to more than one level of indirection. Each display format provides for dynamic extension of the document comparison to other documents in the repository linked to the base document.
Browsing and searching are popular ways to access and find information on the World Wide Web (WWW). The WWW is an example of a repository upon which the present invention acts; other repositories are discussed later. While GUI-based (Graphic User Interface) browsers and powerful search engines are now ubiquitous, tools and mechanisms that provide access to historical information and tracking of updates have only recently been developed and are not in widespread use. Search engines and browsers help users locate and inspect information of interest, while tracking tools help users keep up to date on this pertinent information. WWW services and applications can benefit from a mechanism that tracks changes, maintains page version histories, and automatically computes differences. The usefulness of the tracking mechanism is further increased by the tools of the present invention for dealing with the vast number of documents on the Web, such as graphical views of pages with querying and filtering based on user-specified criteria, and recursive tracking and viewing of changes to related Web documents.
We have combined and expanded upon two existing tools, Ciao and the AT&T Internet Difference Engine (AIDE), in order to provide two sorts of visual cues. The Web Graphical User Interface to a Difference Engine, or WebGUIDE, is an implementation of the invention. Ciao displays high-level structural differences by displaying graphs showing the relationships between pages. The color of the nodes representing the pages indicates which pages have stayed the same, been modified or been deleted. The links between the pages are also represented to indicate any modifications. AIDE displays low-level textual differences by marking up changes between versions and modifying anchors to cause documents reached from that page to be annotated.
Fred Douglis and Thomas Ball invented the original AIDE system and filed patent application Serial No. 08/549,359 on October 27, 1995, which is incorporated in its entirety herein by reference. Additionally, Mr. Douglis and Mr. Ball published an article on the AIDE system entitled "Internet Difference Engine And Its Applications", which is incorporated in its entirety herein by reference. The AIDE system highlighted the difference between two documents but was unable to support recursive document comparison. Thus, the prior system did not indicate whether a linked page had been modified or whether additional versions of the linked pages were stored so that a difference comparison could be run.
Yih-Farn Robin Chen, Eleftherios Koutsofios, Glenn Fowler and Ryan Wallach published an article on the original Ciao system entitled "Ciao: A Graphical Navigator for Software and Document Repositories", which is incorporated in its entirety herein by reference. This prior Ciao system did not support dynamic recursive document comparison. Dynamic recursion extends the database as new documents are encountered.
The AIDE and Ciao systems referred to in the following description are the versions which have been significantly modified and merged to form the present invention.
Summary and Objects of the Invention
This system provides the means for a user to track changes in a document repository in an efficient manner. The user selects two dates to perform a comparison of a base document. The two versions of the base page are compared and the comparison determines if linked pages at the approximate time are available for comparison and whether the available pages have been modified. The invention also enables the user to view multiple levels of changes to linked pages in a repository. The user can display the structure of these linked pages in a graph or list format. Thus, the user does not have to jump from page to page within the repository to determine if lower level (more than one level of indirection from the base document) documents have been modified. The user can display the difference of any linked page for which an earlier version exists in the repository.
One object of the invention is to provide recursive differentiation of textual materials. The user is thus informed whether the difference function can operate on a linked page and whether the link Universal Resource Locator (URL) or the linked page has been modified between the two dates selected by the user. Often these dates correspond to the current version of the document and the version the user most recently viewed.
Another object of the invention is to provide a graphic representation of the links between documents in a repository and whether the links within the document and/or the linked documents have been modified. The scope and depth of the graph is determined by the query entered by the user.
Another object of the invention is to provide the user with the ability to manipulate documents from the graph. Linked documents can be compared by clicking on the representative node and selecting the compare function. Also, the links to a document can be dynamically expanded by using an existing node as a base page and running a query.
Another object of the invention is to provide a textual list which tracks which links within a document have been changed and whether any linked documents have been modified. The information is displayed as a list and provides information and functions similar to the graphic representation.
Brief Description of the Drawings
The above, and other, objects, features and advantages of the present invention will be apparent from the following detailed description of the preferred embodiments when read in conjunction with the accompanying drawings in which corresponding parts are identified by like reference numerals.
Figure 1 is a graphic representation of the output from the Ciao-HTML system applied to the AT&T home page.
Figures 2A and 2B are examples of different pages of output by the AIDE system of the present invention.
Figure 3 is a system architecture overview of the present invention combining the modified AIDE and Ciao systems.
Figure 4 is a graphic representation of the difference graph produced by the Ciao system of the present invention.
Figures 5A-E are flowcharts illustrating the interaction of the functions according to the present invention.
Figure 6 is an illustration of an output from the list structural differences function of the present invention.
Detailed Description of the Preferred Embodiments
The present invention is a tracking tool which provides a recursive comparison feature to inform the user if a linked document is available for comparison and if the document has been modified in the period between the date of the earlier version and the date of the later version. A Universal Resource Locator (URL) is the address of a page in the WWW; the page so addressed is referred to as a linked page. However, the present invention is not limited to comparisons of WWW pages but is meant to include documents from any repository which provides recursive support. Two existing tools, Ciao and AIDE, have been modified and combined to form the system of the present invention to provide various ways for the user to view the available tracking information.
Ciao
Ciao is a customizable navigator that allows users to query and browse structural connections embedded in a document repository. Ciao involves three major components: an abstractor that converts source documents to a database according to a data model that describes the documents' internal structure, a repository that keeps versions of the documents and corresponding databases, and a graphical interface that allows users to query and visualize the information structure. Ciao has been instantiated for C, C++, ksh, Hyper Text Markup Language (HTML), and some business information repositories.
Ciao-HTML can be used to explore the structure of HTML documents. The data model for HTML includes entities such as HTML pages, anchors, headers, and images, and relationships among them. Unlike some other instantiations, the Ciao-HTML database can expand in real time as the user tries to explore links to pages that are not currently incorporated in the database. Fig. 1 shows the output of Ciao-HTML applied to a version of the AT&T home page.
To arrive at the output in Fig. 1, the user entered a query to retrieve all relationships between the AT&T home page and its anchors to a depth of one level. That query resulted in the graph shown in the upper-left window. The user can expand any of the anchors, as shown for the Home and Work anchors, to show further link connections. The expanded graph sections can be separately displayed in another window if the graph becomes too complicated (in a manner similar to the clone feature of the Netscape Navigator web browser). An example of such a separate expanded graph is shown in the lower right corner of Fig. 1 with the Home node as the base node.
The user also visited two of the home pages by sending requests to the browser. All these operations were done through pop-up menus attached to the graph nodes. These query and navigation features of Ciao-HTML allow the user to browse complex Web structures comfortably.
Ciao-HTML runs as an external application on the user's machine, and interfaces with the browser by sending it commands to visit particular nodes. It retrieves and processes pages independently from the browser by relying on a proxy-caching server to ensure that the same pages are not fetched multiple times from off-site. Once a page is retrieved from the repository, any subsequent changes to that page in the external repository will not show up on a comparison unless that page is retrieved from off-site again.
AT&T Internet Difference Engine
The AT&T Internet Difference Engine (AIDE) combines notification of changes to pages on the Web with a customized view of what has changed in those pages. Notification of changes has become relatively commonplace, but viewing changes has not. AIDE supports this with a shared version repository, into which users "deposit" pages of interest when they have seen them, and a tool called HtmlDiff, which creates a page that highlights the differences between two versions of an HTML document. In addition to seeing the changes to a page since the user last viewed it, it is possible to see a history of versions and compare any pair of them. All archival and differencing is performed on a server, using Common Gateway Interface (CGI) scripts.
Figs. 2A and 2B illustrate examples of the document output when AIDE performs a difference through the HtmlDiff operator. Bold italics indicate new text, struck-out text indicates deletions, and arrows point to each change, including changes to URLs or modified linked pages, which are not otherwise highlighted. AIDE was specifically modified to determine and illustrate if two versions of a linked page are stored in the system from the approximate dates selected for the two versions of the base document. The above is an example of how modifications are indicated, and it is understood by those skilled in the art that other means can be used to display any changes or modifications, such as icons or different colors.
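The markup convention described above can be illustrated with a minimal sketch. It is not the HtmlDiff implementation: it compares plain text word by word using Python's standard difflib module, and the function name mark_up_differences and the choice of HTML tags are assumptions made for illustration only.

```python
import difflib
import html


def mark_up_differences(old_text: str, new_text: str) -> str:
    """Return HTML with new words in bold italics and deleted words struck out.

    Hypothetical helper: a word-level comparison standing in for HtmlDiff.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        removed = html.escape(" ".join(old_words[i1:i2]))
        added = html.escape(" ".join(new_words[j1:j2]))
        if tag == "equal":
            pieces.append(added)
            continue
        if removed:  # present for "delete" and "replace" opcodes
            pieces.append(f"<strike>{removed}</strike>")
        if added:  # present for "insert" and "replace" opcodes
            pieces.append(f"<b><i>{added}</i></b>")
    return " ".join(pieces)
```

A real difference engine for HTML would also have to respect markup structure and anchors, which this sketch ignores.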
Prior to the combination of the functionality of AIDE and Ciao to form the present invention, the only interface to AIDE was through simple HTML forms and anchors. Once the volume of pages tracked by a single user exceeds some threshold, or links are followed recursively, more sophisticated interfaces are necessary to provide visual feedback and navigational tools. The present invention provides these more sophisticated interfaces.
System Architecture
The preferred embodiment of the present invention is comprised of four components: a version and meta-data repository, a robot that tracks modifications, a difference engine, and a graph generator. While pieces of these components have been described elsewhere, the evolution of the components and their combination to form the present invention are discussed below.
The system architecture is depicted in Fig. 3. The system accesses the WWW or other repository through a CGI interface. The information retrieved by the AIDE and Ciao systems can be stored in separate databases, as shown in Fig. 3, or the two systems can share a database. Documents are stored in the AIDE database in Revision Control System (RCS) format to minimize the storage space required to maintain multiple versions of one document. Modification dates, a record of which users have seen which versions, and other document information are also stored in the AIDE database. Data models generated to describe a document's internal structure are stored in the Ciao Entity-Relationship database. Ciao accesses the AIDE database to compare versions of a page.
Repository
The AIDE version repository is a centralized service that archives multiple versions of selected pages. The system defaults to a condition where it only stores pages that a user explicitly requests. A user could specify a page that ultimately leads to many other pages, such as Yahoo, and thereby store multiple pages upon one request. Alternatively, the system can be arranged to store every document that the user retrieves from the WWW, in the manner of the Inktomi and Lycos search engines. This option is not preferred because of the potential for shortages in storage capacity caused by the needless storage of documents that will not be needed again.
Pages are stored in RCS format, so storing multiple versions does not result in excessive storage overhead as long as changes are relatively small. RCS format maintains one version history for each document regardless of the number of users who have saved that document. As an alternative, each page could be stored separately by each user to address privacy concerns; however, this alternative generally requires substantial storage. Instead, AIDE tracks which versions of a page each user has viewed. Thus, it can be determined if the document has changed since a particular user last viewed the document rather than since any user last viewed the document. In addition, AIDE maintains a relational database containing meta-data about each page, each user, and the relationship between them. For each page, it stores the following, among other, information:
Last modification date
This date is used to find pages that have been modified since a user last saw them or to determine which pages contain new information.
Last check
The time when the last modification date was obtained is used to determine when the page should next be checked by the automatic polling program.
Checksum
The checksum is used to determine if a document has been modified between the two dates selected by the user. The checksum is often used when the last modification date is unavailable.
History
Information about archived versions, including the date and the RCS version number, is stored to provide easy access to a selected document version.
Frequency of checks
Different users may request different minimum frequencies to check a page; this number represents the minimum across all users.
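The per-page meta-data listed above can be pictured as a single record, together with the preliminary modification test that falls back to the checksum when no modification date is available. The following is a sketch only: the names PageRecord, content_checksum and looks_modified are hypothetical, and SHA-256 is an assumed stand-in, since the description does not name a checksum algorithm.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass
class PageRecord:
    """Hypothetical per-page meta-data record mirroring the fields above."""
    url: str
    last_modified: Optional[datetime]  # None when the server reports no date
    last_check: datetime               # when the modification data was obtained
    checksum: str
    # (archive date, RCS version number) pairs, e.g. (datetime(...), "1.24")
    history: List[Tuple[datetime, str]] = field(default_factory=list)


def content_checksum(body: bytes) -> str:
    """Checksum of the page body (SHA-256 is an assumption, not the source's)."""
    return hashlib.sha256(body).hexdigest()


def looks_modified(record: PageRecord,
                   new_last_modified: Optional[datetime],
                   new_body: bytes) -> bool:
    """Prefer modification dates; use the checksum when a date is missing."""
    if record.last_modified is not None and new_last_modified is not None:
        return new_last_modified > record.last_modified
    return content_checksum(new_body) != record.checksum
```

Keeping one record per page, rather than one per user, matches the shared version history described above; per-user state lives in a separate relation.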
For each user, the database contains global information, such as e-mail addresses, and information for each page. For each (user, page) combination, the database stores the following, among other, information:
Last time viewed
The last time a user viewed a page through AIDE is saved. Of course, if the user views the page directly, AIDE has no way of knowing this unless AIDE has access to the user's history file.
History
AIDE keeps a history of which versions the user has viewed, which is a subset of all versions recorded for a particular page.
Minimum frequency of checks
Set by the user to determine how often the page should be checked. The system often has a maximum polling frequency that one can select, such as once per hour.
Notification method
Most changes to pages will be reported upon request by a user by invoking a CGI script, but in some cases the user may request e-mail notification. In addition, for those pages that are reported together, a priority can cause them to be ordered to call attention to some more than others. This is similar to Tapestry, which orders e-mail and netnews postings based on user criteria.
Auto-archive
The user can specify that a page should be archived every time a change is detected, or versions can be archived only upon explicit request of the user.
Depth
The depth indicates how many levels of hyperlinks to follow when checking for modifications and archiving versions. Typically it will be zero.
Tracking Modifications
The robot periodically checks pages for updates. It queries the database for all pages that have not been checked within their minimum polling frequency. For pages that are to be checked recursively, the polling frequency for linked pages may be lower than that of the base page.
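The robot's selection of pages to check can be sketched as follows. This is an illustration under stated assumptions: polling frequency is expressed as a time interval between checks, the per-page interval is taken to be the shortest interval any interested user asked for (one possible reading of "the minimum across all users"), and the names page_interval and pages_due are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List


def page_interval(user_intervals: Iterable[timedelta]) -> timedelta:
    """Combine per-user requests into one interval per page.

    Assumption: the shortest requested interval wins.
    """
    return min(user_intervals)


def pages_due(last_checks: Dict[str, datetime],
              intervals: Dict[str, timedelta],
              now: datetime) -> List[str]:
    """Return the URLs whose last check is older than their polling interval."""
    due = []
    for url, last_check in last_checks.items():
        if now - last_check >= intervals[url]:
            due.append(url)
    return sorted(due)
```

In a deployed system this selection would be a query against the meta-data database rather than a scan over in-memory dictionaries.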
AIDE need not check pages that are "known" to be new. If every user who has expressed an interest in a page has already been told a page has been modified, and has not visited the page through AIDE or viewed its differences, the page need not be checked again with the same frequency.
The time of each check is recorded in the database, as well as the new modification time. Modified pages are reported to interested users immediately if requested. The new page is archived automatically if specified by any user.
HTML Differencing And Recursion
Originally, differencing was done only on a per-page basis, with no notion of recursion. That mode is useful when most pages are checked in isolation, but less so when pages are tracked recursively. Now, one can visit a page with links to modified pages and have those links highlighted. When the user follows such a link, HtmlDiff is invoked recursively on the new page, and its links are similarly highlighted. HtmlDiff is a tool which compares two versions of a document and outputs a third document containing information indicating a change between the two versions. Thus, one can see the differences between a set of related pages from any points of time that its contents have been archived.
The recursive comparison interface works as follows. The user selects two versions of an HTML document for comparison. The two timestamps associated with these documents define the time range for future document comparison as the user browses. When HtmlDiff compares two documents, it gathers up all the linked pages in the document and queries the version repository to determine if there are different versions of the documents specified by the address of the linked page (its URL) for the two dates. Once the earlier version of the page has been found, the invention performs a preliminary check, based on information such as the dates of modification and/or the checksums to determine if the page has been modified. Since dates of modification and checksums can provide false indications of change, the system can be designed to operate an HtmlDiff to compare the two versions to determine if they have been modified. However, this last technique is presently too burdensome and time consuming for common usage.
If an earlier version is stored in the repository, an icon is inserted before the hypertext link in the output document. The icon is itself a hypertext link that transfers control back to AIDE in order to compare the two versions of the document. If the output document indicates that two versions of a linked page exist, the user can click or otherwise select the corresponding icon to compare the contents and links contained in the linked pages.
Clearly, the effectiveness of recursive comparison depends on the quantity of historical information in the version repository. Many addresses will not have any page history and will not be filtered. Other page addresses may have historical information, but not for the exact dates specified for recursive comparison. In the latter case, we make a number of approximations in order to provide more comparative information. Suppose that the current date is 04-01-96, that the user asks for version comparison between the dates 09-20-95 and 03-06-96, and that for a given URL, linked page versions exist for 10-30-95, 01-01-96, and 03-10-96. In this case, we use the dates closest to those specified (up to some epsilon interval), so the comparison will use the 10-30-95 and 03-10-96 versions. For another linked page, there may only be a version stored for 10-15-95. In that case, we compare the stored version and the current version on the WWW. The epsilon interval used for date approximation may be user-specified or pre-set by the system manager.
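The date approximation in the example above can be sketched as a nearest-version lookup bounded by the epsilon interval. The function names and the 60-day epsilon are assumptions chosen so that the example dates work out; the description leaves the value of epsilon to the user or the system manager.

```python
from datetime import date, timedelta
from typing import Optional, Sequence, Tuple


def closest_version(versions: Sequence[date], target: date,
                    epsilon: timedelta) -> Optional[date]:
    """Return the archived date nearest to target, or None if none is within epsilon."""
    candidates = [v for v in versions if abs(v - target) <= epsilon]
    if not candidates:
        return None
    return min(candidates, key=lambda v: abs(v - target))


def pick_pair(versions: Sequence[date], start: date, end: date,
              epsilon: timedelta) -> Tuple[Optional[date], Optional[date]]:
    """Choose the two versions to compare for a linked page.

    A None in the second position means "use the current version on the WWW",
    which covers the case where only one archived version qualifies.
    """
    earlier = closest_version(versions, start, epsilon)
    later = closest_version(versions, end, epsilon)
    if later == earlier:
        later = None
    return earlier, later


EPSILON = timedelta(days=60)  # assumed value, for illustration only
three = [date(1995, 10, 30), date(1996, 1, 1), date(1996, 3, 10)]
one = [date(1995, 10, 15)]
first_pair = pick_pair(three, date(1995, 9, 20), date(1996, 3, 6), EPSILON)
second_pair = pick_pair(one, date(1995, 9, 20), date(1996, 3, 6), EPSILON)
```

With these inputs, first_pair selects the 10-30-95 and 03-10-96 versions, and second_pair selects the 10-15-95 version together with None, signalling a comparison against the live page.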
Recursive HTML comparison allows users to see that a hypertext link points to a page for which there are changes. However, this only works well for one level of indirection. If the currently viewed page and a changed page are separated by a long chain of unchanged pages, it is bothersome to force the user to step through the unchanged pages to get to the differences. The Ciao graphical interface addresses this problem by providing a graphical overview of the changed pages, allowing the user to quickly navigate to changed pages. A text list analogous to the graph can also be displayed to provide similar information.
Graph generator
The graphical view of relationships between pages of interest to a user, and their states, could be generated in a number of ways. The present invention generates graphs on the fly as embedded images, using a tool such as "webdot." The images can be clickable, so clicking on a node can invoke another operation. Unfortunately, image maps do not currently support operations other than selecting a page based on location within the image, unlike an external application, which can enable the user to click on a node and directly access the menu. Ciao and WebMap are examples of such external applications. WebMap is a graphical hypertext navigation tool described by P. Domel at the 1994 Second International WWW Conference. With an image map, the user instead selects a page, and the selected page provides the menu and enables the selection of an operation. This indirect method is used in the instant invention and supports several operations, such as:
• Visiting the page represented by the node.
• Showing the differences between the current version of the page and the previous version saved by the user.
• Remembering the page represented by the node by storing the page on disk in RCS format and updating the node's version history.
• Performing a Ciao query to dynamically modify the graph, for instance, to select nodes matching some criteria.
Another approach would be a helper application that would run on a user's machine, external to the browser. This option is complicated by the need to interact with a database and CGI services on another machine, rather than being self-contained, and it requires that the user install an external software package, such as a Netscape Navigator plug-in. A third approach would be to provide full interactive access to the graph using a language such as Java.
System Operation
Following is a description of a user's interaction with the system of the present invention to query and navigate changes in a repository, such as the WWW. This description demonstrates how the components of AIDE and Ciao are combined seamlessly to provide effective browsing, searching, archiving, and differencing capabilities, all under a simple visual interface.
The user visits the home page of the present invention to view the history of http://www.att.com. The history of that site is accessed through a standard form-based interface, and a history list showing all available versions is sent back. The page is retrieved and displayed through the interaction of the system, the CGI interface and the browser in steps 1-3 in Fig. 5A. The retrieved page and its linked pages are temporarily stored in the system cache. The user can select an option, such as list what's new, step 4; view textual differences, step 10; archive versions, step 20; view graphical differences, step 30; list structural differences, step 40; and manipulate graph, step 50. These options can also be selected prior to retrieving a document from an external repository.
List what's new in step 4 provides the user with a complete list, from the documents he or she is following, of those documents which have been newly modified. The list can be determined by comparing the dates of modification, the checksums or the two versions in a difference operation.
View textual differences, step 10, is provided through AIDE. The user picks two versions to compare, such as "version 1.24" and "version 1.23", which are retrieved from the appropriate repository in step 11. Each version is temporarily stored in the system cache while the difference is performed. Each file is parsed in step 12 to determine its structure. HTML documents have structure which regular text documents do not contain. Parsing the document's structure eases the burden of the comparison, which is performed in step 13 by HtmlDiff. Of course, the difference operation can be performed by any other program implementing similar functions to HtmlDiff, especially if the documents come from an external repository other than the WWW and are stored in a format other than HTML.
In step 13, the contents of the two documents are compared, including a comparison of the links to determine if any links have been added or deleted. In step 14, the system checks the various URLs to determine if two versions of the linked documents are stored in the AIDE database for the selected time frame. The two versions of the linked document, or the documents' header information, are also compared in step 14 to determine if the linked document has been modified. The comparison of the linked documents is discussed above. In step 15, the output document is formed with the system-designated annotations indicating changes to the text, the links and the linked documents, as well as an indication of whether two versions of each linked document are stored in the database to operate a difference.
The user could also select archive versions in step 20. The current documents can be archived by storing them in the AIDE database. Alternatively, the user can enter a query, specifying a base document and a recursion depth in step 41. The first linked document is retrieved in step 42. Upon user request or by designation, the document can be stored in RCS format in step 43.
In step 44, it is determined if the query will recurse another level. If yes, then the content of the base document is parsed in step 45 and the linked documents are retrieved in step 46. These newly retrieved documents are now the current recursion level and can be stored in RCS format as discussed above. If the query does not call for any more recursive levels, then the system returns the user to step 3.
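The level-by-level archiving loop described above can be sketched as follows. The function and parameter names are hypothetical, and the fetch, extract_links and archive callables are assumed to be supplied by the caller (for example, an HTTP fetch, an HTML anchor parser, and an RCS check-in); none of them is taken from the source.

```python
from typing import Callable, Iterable, List


def archive_recursively(base_url: str,
                        depth: int,
                        fetch: Callable[[str], object],
                        extract_links: Callable[[object], Iterable[str]],
                        archive: Callable[[str, object], None]) -> List[str]:
    """Archive the base document and its linked documents down to `depth` levels.

    Returns the URLs archived, in the order they were stored.
    """
    seen = {base_url}
    archived: List[str] = []
    level = [base_url]
    for remaining in range(depth, -1, -1):
        next_level: List[str] = []
        for url in level:
            body = fetch(url)          # retrieve the document
            archive(url, body)         # store it (RCS format in the description)
            archived.append(url)
            if remaining > 0:          # the query recurses another level
                for link in extract_links(body):
                    if link not in seen:   # do not archive a page twice
                        seen.add(link)
                        next_level.append(link)
        level = next_level
    return archived
```

Tracking the set of visited URLs keeps a cycle of links, such as a page that links back to the base document, from causing repeated retrieval.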
The user can also select to view graphical differences in step 30 through Ciao. In step 31, the system reconstructs the documents from the RCS repository or retrieves the current version from the external repository. These documents are temporarily stored as discussed above for the AIDE system. The difference operation is then handed over to Ciao at step 32. The Ciao-HTML abstractor is invoked to create a database for each home document in step 32. These databases are temporarily stored in the Ciao Entity-Relationship database, and are deleted after a period of non-use. These databases can contain information from more than one level of indirection.
In step 33, the difference engine invokes the Ciao difference (dbdiff) operator to compute the difference database, including whether any of the links have been added or deleted from the base document. In step 34, the system determines if two versions of each linked document are stored for the selected timeframe in the database. Then the linked documents are checked to determine if they have been modified. The linked documents are checked by calling the AIDE database to check the header information or to determine the content of the individual documents. The Ciao database contains the structural entity-relationship data. The document modifications are thus determined in the manner discussed above for AIDE. However, the presence of two versions of a linked document can be determined from information stored in the Ciao database. In step 35, the graph generator sends back the embedded image graph, which was computed from the difference database to show the connections between the AT&T home page and other anchors, highlighting the additions, deletions, and changes of nodes and edges. The graph gives us a high level view on structural changes which have occurred in the AT&T home page since the last visit, assuming version 1.24 is the current version. A comparison can also be conducted between two versions of the home page stored in memory.
Fig. 4 shows a graphical difference generated by the present invention for the AT&T home pages from 11-28-95 and 01-23-96. The base document is a rectangle node and the anchors are oval nodes. Yellow nodes indicate that the corresponding documents have been changed, red ones are new anchors, white ones are deleted anchors, and light-blue ones are those anchors that remain the same (colors are shown as shades of grey in Fig. 4). Similarly, dashed lines indicate new links, dotted lines indicate deleted links, and solid lines are those links that remain intact.
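The visual encoding described for Fig. 4 can be illustrated with a short sketch that emits a graph description in the Graphviz DOT language. The use of DOT, the function name `to_dot`, and the status labels are assumptions made for this example; the specification does not state which graph format the graph generator produces.

```python
# Node fill colors and edge styles as described for Fig. 4.
NODE_COLORS = {
    "changed": "yellow",
    "new": "red",
    "deleted": "white",
    "same": "lightblue",
}
EDGE_STYLES = {"new": "dashed", "deleted": "dotted", "same": "solid"}


def to_dot(base, anchors):
    """Render a difference graph as DOT text.

    anchors is a list of (name, node_status, edge_status) tuples
    (a hypothetical input layout chosen for this sketch).
    """
    # The base document is drawn as a rectangle.
    lines = ["digraph diff {", f'  "{base}" [shape=box];']
    for name, node_status, edge_status in anchors:
        # Each anchor is an oval whose fill color encodes its status.
        lines.append(
            f'  "{name}" [shape=ellipse, style=filled, '
            f"fillcolor={NODE_COLORS[node_status]}];"
        )
        # The line style encodes whether the link is new, deleted, or intact.
        lines.append(
            f'  "{base}" -> "{name}" [style={EDGE_STYLES[edge_status]}];'
        )
    lines.append("}")
    return "\n".join(lines)
```

The returned text could be passed to a layout tool to produce an image; keeping the color and style tables separate from the rendering loop makes it straightforward to substitute shades of grey or other distinguishing marks.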
From the graphic interface, the user may elect to invoke HtmlDiff on the AT&T home page to see detailed text changes, or the user may expand the query using a new node of particular interest as the base node. The former operation calls AIDE and functions as described above, while the latter operation calls Ciao to perform the steps described above from the new node, as discussed above with respect to Fig. 1. Steps 51-56 of Fig. 5F illustrate the steps necessary to manipulate nodes of the graph. In step 51, the user clicks on a node to call up its menu. The menu is displayed in step 52 with the list of options. The user can then select an option, such as visiting the node in step 55, expanding the graph in step 54, or comparing the two versions of the document, if available, in step 56.
In step 40, the user can choose to list structural differences. The steps for viewing the list of differences, steps 41-45 of Fig. 5E, are the same as stated for steps 31-35 of Fig. 5C, except the data is displayed in a different format. The list provides an indented list of documents, as shown in Fig. 6, to indicate the level of recursion.
The display format requires a special indication when more than one document refers to the same document and when a document refers back to a document from a previous level of recursion. Symbols and icons are used to indicate whether the links or the linked documents have been modified. As above, an indication is also provided to inform the user whether two versions from the selected time frames are stored in the system database. Alternatively, colors or other distinguishing means can be used to indicate traits of documents. The graph manipulations can also be conducted on the list, since the underlying steps are the same.
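The indented listing and its two special indications can be sketched as a recursive traversal. This is an illustrative example only: the function name `list_structure`, the `links` mapping, and the wording of the markers are assumptions, and the actual display uses symbols and icons rather than text labels.

```python
def list_structure(doc, links, depth=0, path=(), seen=None, out=None):
    """Build an indented listing of doc and the documents it links to.

    links maps each document to the list of documents it links to
    (a hypothetical in-memory form of the structural data).
    """
    if seen is None:
        seen = set()
    if out is None:
        out = []
    indent = "  " * depth
    if doc in path:
        # The document is an ancestor: a reference back to a previous level.
        out.append(f"{indent}{doc} (refers back to an earlier level)")
        return out
    if doc in seen:
        # Another document already referred to this one.
        out.append(f"{indent}{doc} (already listed)")
        return out
    seen.add(doc)
    out.append(f"{indent}{doc}")
    for child in links.get(doc, []):
        list_structure(child, links, depth + 1, path + (doc,), seen, out)
    return out
```

Tracking the current path separately from the set of documents already listed is what lets the sketch tell the two cases apart, and stopping at either marker keeps the recursion finite when the link structure contains cycles.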
Having described the preferred embodiments of the invention with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments and that various changes and modifications could be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

We claim:
1. A method of determining and displaying changes to documents retrieved from an external repository which supports recursive tracking, comprising the steps of: storing two versions of a document, one version being a base document, from different times in a database, each document including textual information and structural links; comparing the stored document versions to determine any differences in the text or structural links; checking to determine if said database contains two versions of documents to which the base document contains structural links; and displaying the textual and structural differences between the two document versions, including an indication of whether two versions of the linked documents are stored in said database.
2. A method according to claim 1, further comprising the steps of differencing the two linked document versions to determine if the more recent version has been modified since the other version; and displaying an indication of whether the linked documents have been modified.
3. A method according to claim 2, wherein said document versions contain status information regarding the date of last modification, said step of differencing compares said status information to determine if the linked documents have been modified.
4. A method according to claim 2, wherein said document versions contain status information regarding the document checksum, said step of differencing compares said status information to determine if the linked documents have been modified.
5. A method according to claim 2, wherein the system runs said step of comparing to determine if the linked documents have been modified.
6. A method according to claim 1, wherein the linked document versions are from approximately the same time period as the original document versions.
7. A method according to claim 1, wherein said indication includes an icon before each structured link for which two linked document versions are contained in said database.
8. A method according to claim 1, wherein activating said icon causes a differencing operation to be performed on the two linked documents and their differences displayed.
9. A method of determining and displaying changes to the structure of a document retrieved from an external repository which supports recursive tracking, comprising the steps of: storing at least two versions of the document, one version being a base document, each document including their structural links in a database; generating a data set representing the composition of the structured links of each document version; comparing the two data sets to determine if any links have been added or deleted; checking to determine if two versions of documents, to which the base document contains structured links, are contained in the database; displaying the structure of the document including indications of which links have been modified and whether two versions of the linked documents are stored in said database.
10. A method according to claim 9, wherein each document includes textual information in addition to the structured links, further comprising the steps of differencing the two linked document versions to determine if the more recent version has been modified since the other version; and displaying an indication of whether the linked documents have been modified.
11. A method according to claim 10, wherein said document versions contain status information regarding the date of last modification, said step of differencing compares said status information to determine if the linked documents have been modified.
12. A method according to claim 10, wherein said document versions contain status information regarding the document checksum, said step of differencing compares said status information to determine if the linked documents have been modified.
13. A method according to claim 10, wherein the system runs said step of comparing to determine if the linked documents have been modified.
14. A method according to claim 9, wherein the linked document versions are from approximately the same time period as the original document versions.
15. A method according to claim 9, wherein the illustration is in the form of a nodal graph, each node representing a document stored in said database.
16. A method according to claim 15, wherein the status of each node is illustrated with a different color.
17. A method according to claim 16, wherein the added or deleted links are indicated by different style lines, such as solid, dashed or dotted lines.
18. A method according to claim 9, wherein the comparison of the structure of the documents is displayed in the form of a textual list made up of list components.
19. A method according to claim 15 or 18, wherein activating the node or list component causes a differencing operation to be performed on the two versions of the linked document and the difference to be displayed.
20. A method according to claim 15 or 18, wherein the display can be dynamically extended to determine the changes in its structure by activating the node or list component other than the one representing the original base document.
PCT/US1997/002407 1996-02-23 1997-02-18 Querying and navigating changes in web repositories WO1997031319A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CA002246650A CA2246650C (en) 1996-02-23 1997-02-18 Querying and navigating changes in web repositories
EP97907643A EP0922258A1 (en) 1996-02-23 1997-02-18 Querying and navigating changes in web repositories
JP53025997A JP2001508561A (en) 1996-02-23 1997-02-18 Query and navigate web repository changes

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US1215196P 1996-02-23 1996-02-23
US08/797,756 1997-02-07
US08/797,756 US5860071A (en) 1997-02-07 1997-02-07 Querying and navigating changes in web repositories
US60/012,151 1997-02-07

Publications (1)

Publication Number Publication Date
WO1997031319A1 true WO1997031319A1 (en) 1997-08-28

Family

ID=26683215

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1997/002407 WO1997031319A1 (en) 1996-02-23 1997-02-18 Querying and navigating changes in web repositories

Country Status (4)

Country Link
EP (1) EP0922258A1 (en)
JP (1) JP2001508561A (en)
CA (1) CA2246650C (en)
WO (1) WO1997031319A1 (en)

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DOUGLIS F ET AL: "TRACKING AND VIEWING CHANGES ON THE WEB", USENIX TECHNICAL CONFERENCE, 22 January 1996 (1996-01-22), pages 165 - 176, XP000616939 *
YIH-FARN R CHEN ET AL: "CIAO: A GRAPHICAL NAVIGATOR FOR SOFTWARE AND DOCUMENT REPOSITORIES", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE (ICSM), OPIO, NICE, OCT. 17 - 20, 1995, 17 October 1995 (1995-10-17), INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS, pages 66 - 75, XP000556952 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999019783A3 (en) * 1997-10-15 1999-08-05 Telia Ab Procedure and arrangement for creation of short information in computer system
WO1999019783A2 (en) * 1997-10-15 1999-04-22 Telia Ab (Publ) Procedure and arrangement for creation of short information in computer system
US8719403B2 (en) * 1998-01-26 2014-05-06 New York University Method and apparatus for monitor and notification in a network
US20110082930A1 (en) * 1998-01-26 2011-04-07 New York University Method and apparatus for monitor and notification in a network
EP0950963A3 (en) * 1998-04-15 2002-03-20 Hewlett-Packard Company, A Delaware Corporation Apparatus and method for communication between multiple browsers
EP0950963A2 (en) * 1998-04-15 1999-10-20 Hewlett-Packard Company Apparatus and method for communication between multiple browsers
US6369819B1 (en) 1998-04-17 2002-04-09 Xerox Corporation Methods for visualizing transformations among related series of graphs
EP0950962A3 (en) * 1998-04-17 2000-03-22 Xerox Corporation Methods for visualizing transformations among related series of graphs
EP0950962A2 (en) * 1998-04-17 1999-10-20 Xerox Corporation Methods for visualizing transformations among related series of graphs
JP2000322434A (en) * 1999-05-13 2000-11-24 Nec Corp Dynamic updating processing system for information retrieval service
US8335994B2 (en) 2000-02-25 2012-12-18 Salmon Alagnak Llc Method and apparatus for providing content to a computing device
US10374984B2 (en) 2000-02-25 2019-08-06 Zarbaña Digital Fund Llc Method and apparatus for providing content to a computing device
US7730031B2 (en) 2000-03-01 2010-06-01 Computer Associates Think, Inc. Method and system for updating an archive of a computer file
US8019730B2 (en) 2000-03-01 2011-09-13 Computer Associates Think, Inc. Method and system for updating an archive of a computer file
US8019731B2 (en) 2000-03-01 2011-09-13 Computer Associates Think, Inc. Method and system for updating an archive of a computer file
EP1158385A2 (en) * 2000-05-24 2001-11-28 International Business Machines Corporation Trust-based link access control
EP1158385A3 (en) * 2000-05-24 2003-11-19 International Business Machines Corporation Trust-based link access control
CN112486796A (en) * 2020-12-30 2021-03-12 智道网联科技(北京)有限公司 Method and device for collecting information of vehicle-mounted intelligent terminal
CN112486796B (en) * 2020-12-30 2023-07-11 智道网联科技(北京)有限公司 Method and device for collecting information of vehicle-mounted intelligent terminal

Also Published As

Publication number Publication date
CA2246650C (en) 2001-07-24
JP2001508561A (en) 2001-06-26
EP0922258A1 (en) 1999-06-16
CA2246650A1 (en) 1997-08-28

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): CA JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref document number: 2246650

Country of ref document: CA

Ref country code: CA

Ref document number: 2246650

Kind code of ref document: A

Format of ref document f/p: F

ENP Entry into the national phase

Ref country code: JP

Ref document number: 1997 530259

Kind code of ref document: A

Format of ref document f/p: F

WWE Wipo information: entry into national phase

Ref document number: 1997907643

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1997907643

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1997907643

Country of ref document: EP