US20050065978A1 - Incremental non-chronological synchronization of namespaces - Google Patents

Incremental non-chronological synchronization of namespaces

Info

Publication number
US20050065978A1
Authority
US
United States
Prior art keywords
entity
namespace
name
information
computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US10/669,866
Other versions
US7584219B2 (en)
Inventor
John Zybura
Max Benson
Herman Man
Edward Wayt
Felix Wong
Jing Wu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US10/669,866 (patent US7584219B2)
Assigned to MICROSOFT CORPORATION: assignment of assignors interest (see document for details). Assignors: BENSON, MAX L., MAN, HERMAN, WAYT, EDWARD H., WONG, FELIX W., WU, JING, ZYBURA, JOHN H.
Publication of US20050065978A1
Application granted
Publication of US7584219B2
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC: assignment of assignors interest (see document for details). Assignors: MICROSOFT CORPORATION
Legal status: Active (current)
Adjusted expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99931: Database or file accessing
    • Y10S707/99941: Database schema or data structure
    • Y10S707/99944: Object-oriented database structure
    • Y10S707/99945: Object-oriented database structure processing
    • Y10S707/99951: File or database maintenance
    • Y10S707/99952: Coherency, e.g. same view to multiple users
    • Y10S707/99953: Recoverability

Definitions

  • This application relates generally to synchronization of information and more specifically to synchronization of information in a plurality of information structures or hierarchies.
  • a synchronizing process typically implements rules and/or specifications to adequately harmonize information in various data sources. Further, such a process may rely on an engine capable of executing software and a storage capable of storing the information, as appropriate.
  • the synchronizing process may replicate information from various data sources in a central storage, wherein the replicated information has some degree of integrity. To achieve this task, information from the various data sources is either pushed or pulled into the central storage. In addition, information may be pulled or pushed out of such a central storage to the various data sources.
  • the information may be provided to the central storage by one of the various data sources non-chronologically.
  • the modifications that occur at a data source have a temporal relationship.
  • notification of those modifications may reach the central storage out of order with respect to that temporal relationship. This situation has the potential to create problems during synchronization.
  • Various exemplary methods, devices and/or systems described below are directed at those problems.
  • Synchronizing two such namespaces involves providing a mechanism for indicating that an entity has been created because a reference to that entity has been made even though that entity does not yet exist. At such time as the entity is formally created, the indication is removed. Synchronizing two such namespaces also involves providing a mechanism for indicating that an entity's unique name in the namespace has been compromised through the synchronization process.
  • FIG. 2 is a functional block diagram illustrating in slightly greater detail the storage of the metadirectory as it interacts with various data sources.
  • FIG. 3 is a functional block diagram generally illustrating information that is included in an “entity” as that term is used in this document.
  • FIG. 5 is a graphical illustration of a synchronization between a master namespace and a slave namespace that suffers from name collision.
  • FIG. 7 is a graphical illustration of a synchronization between a master namespace and a slave namespace that suffers from a dangling reference.
  • FIG. 1 shows an exemplary system 100 that includes an exemplary metadirectory 102 capable of communicating information to and/or from a plurality of data sources (e.g., DSA 150, DSB 160, and DSC 170).
  • Each data source includes many objects, with each object containing information.
  • each object may be thought of as a body of information, such as information about an individual (e.g., name, address, salary), a mailing list (members), an e-mail account (e-mail address), a corporate asset (serial number), or the like.
  • For example, if DSA 150 were a human resources database, then objects within DSA 150 may correspond to employees, and each employee may have characteristics such as an employee number, a manager, an office location, and the like.
  • the metadirectory 102 is an infrastructure element that provides an aggregation and clearinghouse of the information stored within each of the several data sources associated with the metadirectory 102.
  • the metadirectory 102 includes storage 130 in which reside “entities” that represent the individual bodies of information stored in each associated data source. Disparate information from different data sources that pertains to the same body of information (e.g., an individual, asset, or the like) is typically aggregated into a single entity within the metadirectory 102. In this way, a user can take advantage of the metadirectory 102 to view at a single location information that may be stored piecemeal in several different data sources.
  • Such an exemplary metadirectory may consolidate information contained in multiple data sources in a centralized manner, manage relationships between the data sources, and allow for information to flow between them as appropriate.
  • the storage 130 is for storing the aggregated and consolidated information from each of the associated data sources.
  • the storage 130 may be a database or any other mechanism for persisting data in a substantially permanent manner.
  • the storage 130 may include core storage (sometimes referred to as a “metaverse”), in which the data is deemed to be valid, and transient storage (e.g., a buffer or connector space) used to temporarily store information awaiting inclusion in the core storage.
  • Entities within the metadirectory 102 may be referred to collectively as “central entities,” and entities outside the metadirectory 102 (e.g., within the data sources) may be referred to collectively as “external entities.”
  • a central entity within the metadirectory 102 may correspond to two or more external entities and include an aggregation of the information stored in each of the corresponding external entities. More specific detail about entities is provided below in conjunction with FIG. 3.
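The aggregation of several external entities into one central entity can be illustrated with a minimal Python sketch. The function name `aggregate`, the attribute names, and the choice to record which data source supplied each value are all assumptions made for illustration; the text does not specify how the metadirectory stores or reconciles aggregated attributes.

```python
def aggregate(external_entities: dict) -> dict:
    """Merge per-source attribute dicts into one central entity.

    `external_entities` maps a data source label (hypothetical, e.g. "DSA")
    to the attributes that source holds for the same real-world object.
    Each attribute in the result keeps the value reported by every source,
    so nothing is silently overwritten.
    """
    central = {}
    for source, attributes in external_entities.items():
        for key, value in attributes.items():
            central.setdefault(key, {})[source] = value
    return central


# Hypothetical example: the same employee as seen by two data sources.
central_entity = aggregate({
    "DSA": {"name": "Pat", "manager": "Lee"},
    "DSB": {"name": "Pat", "email": "pat@example.com"},
})
```

Keeping the per-source provenance is only one possible design; a real system would likely apply precedence rules to pick a single value per attribute.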
  • the storage 130 may include a core 211 and a buffer 221.
  • the core 211 represents data that is considered to accurately reflect (from the perspective of a user of the metadirectory 102) the information in the various data sources.
  • the buffer 221 includes data of a more transient nature. Recent changes to data at the data sources are reflected in the buffer 221 until that data can be committed to the core 211.
  • a data source presents to the metadirectory 102 change data that represents changes to an external entity (e.g., external entity 250) stored within the data source.
  • the change data may indicate that information about an entity within the data source has changed in some fashion, or perhaps the change data indicates that its corresponding entity has been deleted or added.
  • a buffer entity (e.g., buffer entity 260) is first created using the change data, so that it represents the external entity (e.g., external entity 250) at the data source (e.g., DSA 150).
  • the buffer entity 260 mirrors its corresponding external entity 250.
  • Other data sources may also be presenting their own change data to the buffer 221 as well.
  • the change data may sometimes be referred to as “delta information” or “deltas.”
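As a rough illustration of how deltas might be applied so that a buffer entity mirrors its external entity, consider the following Python sketch. The `Delta` record, its three kinds ("add", "modify", "delete"), and the dictionary-based buffer are assumptions for this example, not structures defined in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Delta:
    """Hypothetical change record presented by a data source."""
    entity_id: str
    kind: str                      # "add", "modify", or "delete"
    attributes: dict = field(default_factory=dict)


def apply_delta(buffer: dict, delta: Delta) -> None:
    """Apply one delta so the buffer entity mirrors the external entity."""
    if delta.kind == "add":
        buffer[delta.entity_id] = dict(delta.attributes)
    elif delta.kind == "modify":
        # Create the buffer entity on first sight, then merge the changes.
        buffer.setdefault(delta.entity_id, {}).update(delta.attributes)
    elif delta.kind == "delete":
        buffer.pop(delta.entity_id, None)
    else:
        raise ValueError(f"unknown delta kind: {delta.kind!r}")
```

A buffer that starts empty and receives an "add" followed by a "modify" for the same identifier ends up with the union of both attribute sets; a later "delete" removes the buffer entity again.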
  • namespace means any set of entities.
  • the entities in a namespace may be unordered. Accordingly the term namespace may be used to refer to any set of entities, such as the core 211 or the buffer 221 (i.e., a core namespace or a buffer namespace). Or the term namespace may be used to refer collectively to the metadirectory 102 as a namespace. Similarly, any of the data sources may sometimes be referred to as namespaces.
  • Synchronization may be more generally defined as any process which causes two non-identical namespaces (e.g., the buffer 221 and the DSA 150) to converge to identity over a finite period of time.
  • one namespace is sometimes referred to as the master, and it is allowed to be modified by other processes.
  • the DSA 150 is the master.
  • the other namespace is referred to as the slave, and it may only be modified by the synchronization process.
  • the buffer 221 is the slave.
  • the synchronization process is not instantaneous. That is, the time to convergence for the system is nonzero.
  • Such synchronization processes may be termed periodic, since the non-zero convergence time generally means the synchronization process itself is not continuously active, but rather is invoked on a schedule.
  • a periodic synchronization process could be described by the following steps:
  • a periodic synchronization process is said to be incremental if each change read from the synchronization feed is performed independently on the slave namespace.
  • a non-incremental synchronization process would be one in which the entire summation of all changes in the feed is performed in a single, atomic update.
  • virtually all synchronization processes are incremental. The amount of data which may need to be transferred during a given invocation of the synchronization process is effectively unbounded, which would make it prohibitively expensive to use batching or transactional logic in the underlying data source to achieve an atomic write from the summation of all synchronization modifications.
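The difference between incremental and non-incremental application of a feed can be sketched in Python as follows. The function names, the `(entity_id, attributes)` change format, and the use of `None` to simulate an invalid change are all assumptions for illustration. The atomic variant stages changes on a full copy of the slave namespace, which hints at why this approach becomes expensive as the amount of data grows.

```python
import copy


def apply_change(namespace: dict, change: tuple) -> None:
    """Apply one (entity_id, attributes) change; None attributes are invalid."""
    entity_id, attributes = change
    if attributes is None:
        raise ValueError(f"invalid change for {entity_id!r}")
    namespace[entity_id] = attributes


def sync_incremental(slave: dict, feed: list) -> None:
    """Each change is performed independently on the slave namespace."""
    for change in feed:
        apply_change(slave, change)   # earlier changes stay applied on failure


def sync_atomic(slave: dict, feed: list) -> None:
    """All changes are staged on a copy and committed in one step."""
    staged = copy.deepcopy(slave)     # cost grows with the namespace size
    for change in feed:
        apply_change(staged, change)  # a failure here leaves `slave` untouched
    slave.clear()
    slave.update(staged)
```

With a feed whose second change is invalid, the incremental variant leaves the first change in place, while the atomic variant leaves the slave namespace exactly as it was.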
  • having both a name and a unique identifier may at first appear redundant, but each has a special purpose.
  • a human-readable name is intuitive and may be used to reflect a real-world property or concept, thus making the name very useful to users.
  • this usability typically means that the name should also be changeable.
  • a globally unique identifier conveys little in terms of readability or intuitive message. It does, however, effectively distinguish the entity from every other entity in existence.
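The split between a changeable name and a stable identifier can be shown with a short Python sketch. The `Entity` class and its field names are hypothetical; the standard-library `uuid.uuid4()` call is used here simply as one convenient way to obtain a globally unique identifier.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical entity: a mutable name plus an identifier fixed at creation."""
    name: str
    guid: uuid.UUID = field(default_factory=uuid.uuid4)


entity = Entity("Alpha")
original_guid = entity.guid
entity.name = "Beta"   # renaming changes only the human-readable name
```

After the rename, `entity.guid` is still `original_guid`, so the entity can be tracked across name changes.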
  • a phantom entity is created with the correct name and identity and is used as the referent. If a change to an entity occurs that deletes a value from a reference attribute and a phantom entity was being referred to, the phantom entity is checked to see if it still has anything else referring to it. If not, then the phantom entity is deleted. Note that this process does not have to be immediate. Rather, at the end of the synchronization process, a final sweep can be made to find any phantom entities without references to them. If found, these orphaned phantom entities may then be deleted.
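The phantom lifecycle described here, including the deferred sweep for orphaned phantoms, might look like the following Python sketch. It keys the namespace by name for brevity and omits the identifier that the text says a phantom also carries; the class and function names are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    is_phantom: bool = False
    refs: set = field(default_factory=set)   # names of referenced entities


def add_reference(ns: dict, source: str, target: str) -> None:
    """Add a reference, creating a phantom placeholder if the target is missing."""
    if target not in ns:
        ns[target] = Entity(target, is_phantom=True)
    ns[source].refs.add(target)


def remove_reference(ns: dict, source: str, target: str) -> None:
    """Remove a reference; cleanup of orphaned phantoms is deferred to the sweep."""
    ns[source].refs.discard(target)


def sweep_orphaned_phantoms(ns: dict) -> list:
    """Final sweep: delete phantoms that nothing refers to, return their names."""
    referenced = set()
    for entity in ns.values():
        referenced |= entity.refs
    orphans = [name for name, e in ns.items()
               if e.is_phantom and name not in referenced]
    for name in orphans:
        del ns[name]
    return orphans
```

Deferring the check to a single sweep avoids scanning for remaining referrers every time a reference value is deleted, at the cost of phantoms lingering until the end of the run.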
  • Another Boolean-valued flag in the data store for the slave namespace may be used to indicate that an entity is transient.
  • the flag in combination with the string-valued name may also be used as a logical unique “name” for transient entities.
  • entity 451 has a flag 461 that is currently set False and a name “Foo.”
  • the entity 451′ is moved to the transient subspace 410 by setting the flag 461′ to True.
  • the name of the entity is changed to its identifier “GUID.”
  • FIG. 5 is a graphical illustration of a synchronization between a master namespace (DSA 150) and a slave namespace (buffer 221) that suffers from name collision. Shown are a data source (i.e., DSA 150) acting as a master namespace, and a buffer 221 acting as a slave. The sequence of events occurs from top to bottom, where the state of the DSA 150 changes over time.
  • the DSA 150 includes a first entity 510 currently named Alpha, and a second entity 511 currently named Beta.
  • the second entity 511 is renamed from Beta to Tango, resulting in state 502.
  • the first entity 510 is renamed to Beta.
  • the buffer 221 matches the initial state 501 of the DSA 150.
  • the changes to entities within the DSA 150 are transmitted to the buffer 221 out of order.
  • the first change that occurs in the buffer 221 is that the first entity 530 is renamed from Alpha to Beta, thus resulting in a name conflict.
  • the second entity 531 has not yet been renamed. Accordingly, the second entity 531 is identified as transient in an appropriate manner, resulting in state 522.
  • the name of the second entity 531 is changed in some fashion that prevents the two entities from sharing the same name.
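The name-collision scenario of FIG. 5, combined with the transient flag and rename-to-identifier mechanism, can be walked through with a small Python sketch. The identifiers `g530` and `g531`, the class and function names, and the rule that a later rename clears the transient flag are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    guid: str
    name: str
    transient: bool = False


def rename(ns: dict, guid: str, new_name: str) -> None:
    """Rename an entity in the slave namespace `ns` (keyed by identifier).

    If a different, non-transient entity already holds `new_name`, that
    holder is flagged transient and renamed to its own identifier, so the
    two entities never share a name.
    """
    for other in ns.values():
        if other.guid != guid and not other.transient and other.name == new_name:
            other.transient = True
            other.name = other.guid
    target = ns[guid]
    target.name = new_name
    target.transient = False   # assumed: a real rename ends transient status


def transients(ns: dict) -> list:
    """Identifiers of entities currently in the transient subspace."""
    return [e.guid for e in ns.values() if e.transient]


# Slave namespace matching the master's initial state (Alpha, Beta).
buffer = {"g530": Entity("g530", "Alpha"), "g531": Entity("g531", "Beta")}
rename(buffer, "g530", "Beta")    # arrives first, out of order
after_first = transients(buffer)  # the old "Beta" is now transient
rename(buffer, "g531", "Tango")   # the delayed rename arrives
```

Once both changes have been applied, no transient entities remain and the names match the master's final state.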
  • FIG. 7 is a graphical illustration of a synchronization between a master namespace (DSA 150) and a slave namespace (buffer 221) that suffers from a dangling reference. Shown again are a DSA 150 acting as a master namespace, and a buffer 221 acting as a slave. The sequence of events occurs from top to bottom, where the state of the DSA 150 changes over time. Initially, at state 701, the DSA 150 includes a first entity 710 currently named Alpha. At that point, a second entity 711 named Beta is created, resulting in state 702. Subsequently, a reference is added to the first entity 710 that points to the second entity 711.
  • the buffer 221 matches the initial state 701 of the DSA 150.
  • the changes to entities within the DSA 150 are transmitted to the buffer 221 out of order.
  • the first change that occurs in the buffer 221 is that a reference is added to the first entity 730 that points to the second entity 731, but the second entity 731 has not yet been created.
  • a phantom entity 731 is created having the name referred to by the first entity 730.
  • a flag or bit within a typical entity may be used to indicate that the phantom entity 731 is a placeholder.
  • a change is received that formally creates the second entity 731, and thus the phantom status is removed from it, resulting in state 723.
  • the final state 723 of the buffer 221 matches the final state 703 of the DSA 150, and there are no remaining artifacts (i.e., there are no phantom entities).
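The dangling-reference scenario of FIG. 7 can be replayed with a minimal Python sketch in which the reference arrives before the creation it depends on. The tuple-based change format and all names below are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    phantom: bool = False
    refs: list = field(default_factory=list)


def apply_change(ns: dict, change: tuple) -> None:
    """Apply one change to the slave namespace `ns` (keyed by name here)."""
    kind = change[0]
    if kind == "add_ref":
        _, source, target = change
        if target not in ns:
            # Referent does not exist yet: create a flagged placeholder.
            ns[target] = Entity(target, phantom=True)
        ns[source].refs.append(target)
    elif kind == "create":
        _, name = change
        if name in ns and ns[name].phantom:
            ns[name].phantom = False      # formal creation clears the flag
        elif name in ns:
            raise ValueError(f"entity {name!r} already exists")
        else:
            ns[name] = Entity(name)
    else:
        raise ValueError(f"unknown change kind: {kind!r}")


buffer = {"Alpha": Entity("Alpha")}
apply_change(buffer, ("add_ref", "Alpha", "Beta"))   # arrives first
was_phantom = buffer["Beta"].phantom
apply_change(buffer, ("create", "Beta"))             # arrives second
```

After both changes, the reference from Alpha to Beta is intact and no entity carries the phantom flag.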
  • a successfully completed synchronization process should leave the slave namespace (e.g., the buffer 221) in such a state that it has no transients or phantoms. This follows directly from the definition of a synchronization process, which stipulates that after a successfully completed synchronization the master and slave namespace are identical.
  • transients or phantoms remain after a synchronization process has completed successfully.
  • This situation can result from two scenarios. First, the presence of transients after a successful synchronization process means the synchronization feed contains invalid data, and an error should be raised. The presence of phantoms after a successful synchronization would also mean the synchronization feed is invalid, as long as the synchronization process is always synchronizing all entities in both the master and slave namespaces.
  • phantoms can remain after a successful synchronization if the synchronization process is modified such that only a subset of entities in the master namespace is synchronized with a corresponding subset in the slave namespace.
  • Such a process may be called filtered synchronization, and it differs from normal synchronization in that it is possible for entities within the filtered subset to contain references to entities outside the filtered subset. Otherwise, filtered synchronization is identical to normal synchronization, except that the constraints on the process with respect to convergence only apply to the synchronized subsets of the master and slave namespaces.
  • phantom entities will remain if there are such references. From this perspective, phantoms may in fact be considered a useful tool in the representation of references to filtered entities in the underlying data source.
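A post-synchronization check along the lines described above could be sketched in Python as follows. The function name, the dictionary-based entity records, and the `filtered` parameter are assumptions for illustration.

```python
def check_after_sync(entities: list, filtered: bool = False) -> list:
    """Validate a slave namespace after a run that completed successfully.

    Leftover transients always indicate an invalid feed. Leftover phantoms
    indicate an invalid feed only for a full (unfiltered) synchronization;
    under filtered synchronization they stand in for references to entities
    outside the synchronized subset and are returned to the caller.
    """
    transients = [e["name"] for e in entities if e.get("transient")]
    if transients:
        raise ValueError(f"invalid feed, transients remain: {transients}")
    phantoms = [e["name"] for e in entities if e.get("phantom")]
    if phantoms and not filtered:
        raise ValueError(f"invalid feed, phantoms remain: {phantoms}")
    return phantoms
```

For example, a namespace holding one phantom passes the check when `filtered=True` and raises `ValueError` when `filtered=False`.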
  • FIG. 8 shows an exemplary computer 800 suitable as an environment for practicing various aspects of subject matter disclosed herein.
  • Components of computer 800 may include, but are not limited to, a processing unit 820, a system memory 830, and a system bus 821 that couples various system components including the system memory 830 to the processing unit 820.
  • the system bus 821 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures.
  • such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as the Mezzanine bus.
  • Exemplary computer 800 typically includes a variety of computer-readable media.
  • Computer-readable media can be any available media that can be accessed by computer 800 and includes both volatile and nonvolatile media, removable and non-removable media.
  • Computer-readable media may comprise computer storage media and communication media.
  • Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 800.
  • Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
  • modulated data signal means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • communication media includes wired media such as a wired network or direct-wired connection and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
  • the system memory 830 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 831 and random access memory (RAM) 832.
  • RAM 832 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 820.
  • FIG. 8 illustrates operating system 834, the exemplary rules/specifications, services, storage 801 (e.g., storage may occur in RAM or other memory), application programs 835, other program modules 836, and program data 837.
  • Although the exemplary rules/specifications, services and/or storage 801 are depicted as software in random access memory 832, other implementations may include hardware or combinations of software and hardware.
  • the exemplary computer 800 may also include other removable/non-removable, volatile/nonvolatile computer storage media.
  • FIG. 8 illustrates a hard disk drive 841 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 851 that reads from or writes to a removable, nonvolatile magnetic disk 852, and an optical disk drive 855 that reads from or writes to a removable, nonvolatile optical disk 856 such as a CD ROM or other optical media.
  • removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like.
  • the hard disk drive 841 is typically connected to the system bus 821 through a non-removable memory interface such as interface 840, and magnetic disk drive 851 and optical disk drive 855 are typically connected to the system bus 821 by a removable memory interface such as interface 850.
  • the drives and their associated computer storage media discussed above and illustrated in FIG. 8 provide storage of computer-readable instructions, data structures, program modules, and other data for computer 800.
  • hard disk drive 841 is illustrated as storing operating system 844, application programs 845, other program modules 846, and program data 847.
  • operating system 844, application programs 845, other program modules 846, and program data 847 are given different numbers here to illustrate that, at a minimum, they are different copies.
  • a user may enter commands and information into the exemplary computer 800 through input devices such as a keyboard 862 and pointing device 861, commonly referred to as a mouse, trackball, or touch pad.
  • Other input devices may include a microphone, joystick, game pad, satellite dish, scanner, or the like.
  • These and other input devices are often connected to the processing unit 820 through a user input interface 860 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port, or a universal serial bus (USB).
  • a monitor 891 or other type of display device is also connected to the system bus 821 via an interface, such as a video interface 890.
  • computers may also include other peripheral output devices such as speakers 897 and printer 896, which may be connected through an output peripheral interface 895.
  • the exemplary computer 800 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 880.
  • the remote computer 880 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to computer 800, although only a memory storage device 881 has been illustrated in FIG. 8.
  • the logical connections depicted in FIG. 8 include a local area network (LAN) 871 and a wide area network (WAN) 873, but may also include other networks.
  • Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.
  • When used in a LAN networking environment, the exemplary computer 800 is connected to the LAN 871 through a network interface or adapter 870. When used in a WAN networking environment, the exemplary computer 800 typically includes a modem 872 or other means for establishing communications over the WAN 873, such as the Internet.
  • the modem 872, which may be internal or external, may be connected to the system bus 821 via the user input interface 860, or other appropriate mechanism.
  • program modules depicted relative to the exemplary computer 800 may be stored in the remote memory storage device.
  • FIG. 8 illustrates remote application programs 885 as residing on memory device 881. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • the subject matter disclosed herein may also be practiced in distributed communications environments where tasks are performed over wireless communication by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote communications device storage media including memory storage devices.

Abstract

Described are mechanisms and techniques for enabling incremental non-chronological synchronization of namespaces. In an environment, entities must have unique names within a namespace and entities may only refer to entities that actually exist within the namespace. Synchronizing two such namespaces involves providing a mechanism for indicating that an entity has been created because a reference to that entity has been made even though that entity does not yet exist. At such time as the entity is formally created, the indication is removed. Synchronizing two such namespaces also involves providing a mechanism for indicating that an entity's unique name in the namespace has been compromised through the synchronization process.

Description

  • TECHNICAL FIELD
  • This application relates generally to synchronization of information and more specifically to synchronization of information in a plurality of information structures or hierarchies.
  • BACKGROUND OF THE INVENTION
  • Often a company stores important information in various data sources. For example, a human resources department may store information about employees in a human resources data source. The human resources data source may be arranged or organized according to a human resources specific information structure or hierarchy. A finance department may also store information about employees, clients, suppliers, etc., in a finance department data source. The finance department data source may be arranged or organized according to a finance department information structure or hierarchy. It is likely that some common information exists in both data sources. Thus, synchronizing the information becomes desirable.
  • A synchronizing process typically implements rules and/or specifications to adequately harmonize information in various data sources. Further, such a process may rely on an engine capable of executing software and a storage capable of storing the information, as appropriate. In general, the synchronizing process may replicate information from various data sources in a central storage, wherein the replicated information has some degree of integrity. To achieve this task, information from the various data sources is either pushed or pulled into the central storage. In addition, information may be pulled or pushed out of such a central storage to the various data sources.
  • Often, the information may be provided to the central storage by one of the various data sources non-chronologically. In other words, the modifications that occur at a data source have a temporal relationship. However, due to the nature of synchronization, notification of those modifications may reach the central storage out of order with respect to that temporal relationship. This situation has the potential to create problems during synchronization. Various exemplary methods, devices and/or systems described below are directed at those problems.
  • SUMMARY OF THE INVENTION
  • Briefly stated, mechanisms and techniques are described for enabling incremental non-chronological synchronization of namespaces. In an environment, entities must have unique names within a namespace and entities may only refer to entities that actually exist within the namespace. Synchronizing two such namespaces involves providing a mechanism for indicating that an entity has been created because a reference to that entity has been made even though that entity does not yet exist. At such time as the entity is formally created, the indication is removed. Synchronizing two such namespaces also involves providing a mechanism for indicating that an entity's unique name in the namespace has been compromised through the synchronization process.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a functional block diagram generally illustrating an exemplary system that includes a metadirectory and a plurality of data sources.
  • FIG. 2 is a functional block diagram illustrating in slightly greater detail the storage of the metadirectory as it interacts with various data sources.
  • FIG. 3 is a functional block diagram generally illustrating information that is included in an “entity” as that term is used in this document.
  • FIG. 4 is a graphical representation of a mechanism for addressing name collision during an incremental non-chronological synchronization.
  • FIG. 5 is a graphical illustration of a synchronization between a master namespace and a slave namespace that suffers from name collision.
  • FIG. 6 is a graphical illustration of another synchronization between a master namespace and a slave namespace that also suffers from name collision.
  • FIG. 7 is a graphical illustration of a synchronization between a master namespace and a slave namespace that suffers from a dangling reference.
  • FIG. 8 shows an exemplary computer suitable as an environment for practicing various aspects of subject matter disclosed herein.
  • DETAILED DESCRIPTION
  • The following description sets forth a specific embodiment of a system for incremental non-chronological synchronization. This specific embodiment incorporates elements recited in the appended claims. The embodiment is described with specificity in order to meet statutory requirements. However, the description itself is not intended to limit the scope of this patent. Rather, the inventors have contemplated that the claimed invention might also be embodied in other ways, to include different elements or combinations of elements similar to the ones described in this document, in conjunction with other present or future technologies.
  • The following discussion refers to an information environment that includes a metadirectory. While a metadirectory is used here for explanatory purposes, the various mechanisms and techniques described here may also be applied generally to other environments where synchronization of information is desired. In general, information should be identifiable in an information environment, for example, through use of an identifier, and preferably an immutable or traceable identifier. In some instances, information is structured or organized in a hierarchy.
  • Exemplary Metadirectory System
  • FIG. 1 shows an exemplary system 100 that includes an exemplary metadirectory 102 capable of communicating information to and/or from a plurality of data sources (e.g., DSA 150, DSB 160, and DSC 170). Each data source includes many objects, with each object containing information. For this discussion, each object may be thought of as a body of information, such as information about an individual (e.g., name, address, salary), a mailing list (members), an e-mail account (e-mail address), a corporate asset (serial number), or the like. For example if DSA 150 were a human resources database, then objects within DSA 150 may correspond to employees, and each employee may have characteristics such as an employee number, a manager, an office location, and the like.
  • There may also be an object in another data source that pertains to the same body of information, but includes slightly different characteristics or information. For example, DSB 160 may be an information technology server that includes information about the logon accounts of employees. Accordingly, there may be a corresponding object within DSB 160 for each or many of the objects in DSA 150. However, the particular body of information for the objects within DSB 160 would be slightly different from that within DSA 150. Collectively, the information associated with a particular body of information is sometimes referred to as “identity data” or the like.
  • The metadirectory 102 is an infrastructure element that provides an aggregation and clearinghouse of the information stored within each of the several data sources associated with the metadirectory 102. The metadirectory 102 includes storage 130 in which reside “entities” that represent the individual bodies of information stored in each associated data source. Disparate information from different data sources that pertains to the same body of information (e.g., an individual, asset, or the like) is typically aggregated into a single entity within the metadirectory 102. In this way, a user can take advantage of the metadirectory 102 to view at a single location information that may be stored piecemeal in several different data sources. Such an exemplary metadirectory may consolidate information contained in multiple data sources in a centralized manner, manage relationships between the data sources, and allow for information to flow between them as appropriate.
  • The metadirectory 102 includes rules 110 and services 120 that are used to aggregate, consolidate, synchronize, and otherwise maintain the integrity of the information presented through the metadirectory 102. The rules 110 and services 120 form or define one or more protocols, APIs, schemata, services, hierarchies, etc. In this particular embodiment, the rules 110 include methods and techniques for achieving incremental non-chronological synchronization of information presented to the metadirectory 102, as will become more clear in the description that follows.
  • The storage 130 is for storing the aggregated and consolidated information from each of the associated data sources. The storage 130 may be a database or any other mechanism for persisting data in a substantially permanent manner. As will be described more fully in conjunction with FIG. 2, the storage 130 may include core storage (sometimes referred to as a “metaverse”), in which the data is deemed to be valid, and transient storage (e.g., a buffer or connector space) used to temporarily store information awaiting inclusion in the core storage. In other words, changes, additions, or deletions to information in one or more data sources may be presented to the metadirectory 102 and temporarily stored in a buffer until they can be committed to the core storage.
  • FIG. 2 is a functional block diagram illustrating in slightly greater detail the storage 130 of the metadirectory 102 as it interacts with the various data sources. The data stored within the system are termed “entities” for the purpose of this discussion (e.g., core entity 270, buffer entity 260, external entity 250). Generally stated, entities are objects that include any arbitrary collection of data (e.g., current values, change information, etc.) about the bodies of information that reside in the various data sources. Entities within the metadirectory 102 may be referred to collectively as “central entities,” and entities outside the metadirectory 102 (e.g., within the data sources) may be referred to collectively as “external entities.” For example, a central entity within the metadirectory 102 may correspond to two or more external entities and include an aggregation of the information stored in each of the corresponding external entities. More specific detail about entities is provided below in conjunction with FIG. 3.
  • As mentioned above, the storage 130 may include a core 211 and a buffer 221. The core 211 represents data that is considered to accurately reflect (from the perspective of a user of the metadirectory 102) the information in the various data sources. In contrast, the buffer 221 includes data of a more transient nature. Recent changes to data at the data sources are reflected in the buffer 221 until that data can be committed to the core 211.
  • As illustrated, a data source (e.g., DSA 150) presents to the metadirectory 102 change data that represents changes to an external entity (e.g., external entity 250) stored within the data source. The change data may indicate that information about an entity within the data source has changed in some fashion, or perhaps the change data indicates that its corresponding entity has been deleted or added. A buffer entity (e.g., buffer entity 260) is first created using the change data to create an entity that represents the external entity (e.g., external entity 250) at the data source (e.g., DSA 150). Essentially, the buffer entity 260 mirrors its corresponding external entity 250. Other data sources (not shown) may also be presenting their own change data to the buffer 221 as well. The change data may sometimes be referred to as “delta information” or “deltas.”
  • As used in this document, the term “namespace” means any set of entities. The entities in a namespace may be unordered. Accordingly the term namespace may be used to refer to any set of entities, such as the core 211 or the buffer 221 (i.e., a core namespace or a buffer namespace). Or the term namespace may be used to refer collectively to the metadirectory 102 as a namespace. Similarly, any of the data sources may sometimes be referred to as namespaces.
  • A process termed synchronization occurs to reconcile the external entities within the data sources with their corresponding central entities within the metadirectory 102. For instance, in this example the external entity 250 is associated with the buffer entity 260. Through the synchronization process, modifications represented in the external entity 250, as well as perhaps other external entities, become reflected in the buffer entity 260. By creating this synchronized relationship between two namespaces (e.g., the buffer 221 and the DSA 150), the pair of namespaces may be referred to as “correlated namespaces.”
  • Synchronization may be more generally defined as any process which causes two non-identical namespaces (e.g., the buffer 221 and the DSA 150) to converge to identity over a finite period of time. In various examples in this document, one namespace is sometimes referred to as the master, and it is allowed to be modified by other processes. Using the terminology of this discussion, the DSA 150 is the master. In this example, the other namespace is referred to as the slave, and it may only be modified by the synchronization process. Again, using the terminology of this discussion, the buffer 221 is the slave.
  • Typically, the synchronization process is not instantaneous. That is, the time to convergence for the system is nonzero. Such synchronization processes may be termed periodic, since the non-zero convergence time generally means the synchronization process itself is not continuously active, but rather is invoked on a schedule. Thus, a periodic synchronization process could be described by the following steps:
      • 1) Select from the master (e.g., the DSA 150) a set of changes to apply to the slave (e.g., the buffer 221) that is sufficient to bring the two namespaces into convergence. Thus the set should include all changes made to the master since the last time it synchronized with the slave. In simple implementations, this set could simply be the set of all entities in the master, where each entity is represented as a modify-entity event which would set all properties. The resultant set is called the synchronization feed.
      • 2) Iterate, in arbitrary order, through the set of changes (i.e., consume the feed) obtained in the previous step. If the change is add-entity, then an entity is created in the slave namespace. Otherwise, apply the change to the appropriate entity in the slave namespace.
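  • The two steps above can be sketched as follows. This is a minimal illustration rather than the described implementation: the `Change` record, the `build_feed` and `consume_feed` functions, and the dictionary layout (identity mapped to a property dictionary) are assumptions introduced here. The sketch uses the simplest feed, in which every master entity becomes a modify-entity event, and it assumes that a modify-entity event for an entity the slave does not yet hold creates that entity. Deletions are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Change:
    kind: str                      # "add-entity" or "modify-entity"
    identity: str                  # identifier of the entity the change targets
    properties: Dict[str, str] = field(default_factory=dict)


def build_feed(master: Dict[str, Dict[str, str]]) -> List[Change]:
    """Step 1, simplest form: one modify-entity event per master entity,
    each event carrying all of that entity's properties."""
    return [Change("modify-entity", ident, dict(props))
            for ident, props in master.items()]


def consume_feed(slave: Dict[str, Dict[str, str]], feed: List[Change]) -> None:
    """Step 2: apply each change independently, in whatever order it arrives."""
    for change in feed:
        # Assumption: a modify-entity event for an unknown entity creates it.
        if change.kind == "add-entity" or change.identity not in slave:
            slave[change.identity] = dict(change.properties)
        else:
            slave[change.identity].update(change.properties)


master = {"id-1": {"name": "Alpha"}, "id-2": {"name": "Beta"}}
slave = {"id-1": {"name": "Old"}}
consume_feed(slave, list(reversed(build_feed(master))))  # order is arbitrary
```

  • Because each event in this simplest feed sets all properties, the final slave state does not depend on the order in which the feed is consumed; only the intermediate states do.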
  • Finally, a periodic synchronization process is said to be incremental if each change read from the synchronization feed is performed independently on the slave namespace. By contrast, a non-incremental synchronization process would be one in which the entire summation of all changes in the feed is performed in a single, atomic update. In practice, virtually all synchronization processes are incremental. The amount of data which may need to be transferred during a given invocation of the synchronization process is effectively unbounded, which would make it prohibitively expensive to use batching or transactional logic in the underlying data source to achieve an atomic write from the summation of all synchronization modifications.
  • The general process of incremental synchronization as just described allows for the possibility that the changes in the synchronization feed could be applied in non-chronological order. That is to say, the order of changes applied to the slave as a result of processing the synchronization feed is not necessarily the same as the order in which the changes were originally applied to the master namespace. In fact, for the simplest implementation where simply all entities in the master namespace are selected every time, the resultant feed will almost always be non-chronological.
  • Consuming a non-chronological feed can result in temporary data artifacts in the slave namespace which could violate constraints placed on a namespace. These artifacts are described as temporary because they can only be seen if the state of the slave namespace is viewed after each individual change in the feed is consumed; the artifacts should not remain once the synchronization feed has been depleted. Thus, the artifacts are not problematic if the synchronization process is non-incremental, since the final state of the namespace written to the data store would be consistent. However, these artifacts do pose problems for incremental synchronization processes since each individual change is performed in the slave namespace independently, and any of these changes could violate the namespace constraints and leave the slave in an inconsistent state.
  • Specifically, two types of data artifacts can occur when incrementally consuming a non-chronological synchronization feed: name collision and dangling references. Name collision occurs when an external entity is processed which either adds a new buffer entity or changes the name of an existing buffer entity and results in duplicate names, which violates the namespace constraint that all names be unique at a single point in time. Examples of how name collision can occur are illustrated in FIGS. 5 and 6 and described below. Dangling references occur when a buffer entity refers to another entity that does not yet exist because the change that would create it has not yet been processed. A dangling reference violates the namespace constraint that all reference values contain the names of existing entities. An example of how a dangling reference can occur is illustrated in FIG. 7 and described below.
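  • The two namespace constraints can be expressed as simple checks over an intermediate slave state. The sketch below is illustrative only: the `Namespace` layout (identity mapped to a name and a set of referenced identities) and both function names are assumptions, and references are tracked by identity alone for brevity rather than as name/identity pairs.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical layout: identity -> (name, identities this entity refers to).
Namespace = Dict[str, Tuple[str, Set[str]]]


def duplicate_names(ns: Namespace) -> List[str]:
    """Names held by more than one entity (name collision)."""
    counts = Counter(name for name, _ in ns.values())
    return sorted(name for name, count in counts.items() if count > 1)


def dangling_references(ns: Namespace) -> List[Tuple[str, str]]:
    """(referrer, referent) pairs whose referent is absent from the namespace."""
    return sorted((ident, target)
                  for ident, (_, refs) in ns.items()
                  for target in refs
                  if target not in ns)


# A rename applied before the rename that frees the name: two entities are "Beta".
after_early_rename: Namespace = {"g1": ("Beta", set()), "g2": ("Beta", set())}
# A reference applied before the referent is created.
after_early_reference: Namespace = {"g1": ("Alpha", {"g2"})}
```

  • Running both checks after every individual change, rather than once at the end of the feed, is what exposes these temporary artifacts.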
  • Note that these constraint violations are only temporary artifacts of the synchronization process. A successfully completed synchronization process should not exhibit these problems. However, for an incremental synchronization process, these temporary artifacts can adversely affect the system during intermediate stages between separately transacted modifications made to the slave namespace. Most data sources would not allow the temporary relaxation of these types of constraints. And the fact that the incremental synchronization process is non-atomic means an aborted or unsuccessfully completed process could result in these artifacts persisting after the process has ended.
  • FIG. 3 is a functional block diagram generally illustrating information that is included in an “entity” 310 as that term is used in this document. The entity 310 includes a name (e.g., name 311), which preferably has a string value 321 unique across the namespace. The name can change at any time. Each entity also includes an identity (e.g., identity 312), which is preferably a string value 322 that is globally unique. The identity of an entity does not change, i.e. it is an immutable property of the entity 310.
  • The use of both a name and a unique identifier may at first appear redundant, but each has a special purpose. For example, a human-readable name is intuitive and may be used to reflect a real-world property or concept, thus making the name very useful to users. However, this usability typically means that the name should also be changeable. In contrast, a globally unique identifier conveys little in terms of readability or intuitive message. It does however, effectively distinguish the entity from every other entity in existence.
  • The entity 310 may also include an arbitrary number of reference attributes (e.g., reference attribute 313) that contain name/identity pairs 323 of other entities within the same namespace referred to by the referring entity. The reference attribute 313 may have a single reference pair, or it may include multiple reference pairs, such as a distribution list. The reference attributes allow the modeling of arbitrary, directed relationships between entities. The entity 310 may also include an arbitrary number of user data attributes (e.g., data_1 314 and data_2 315) that contain user data (e.g., user info 324 and 325, respectively).
  • The entity 310 also includes a “phantom” attribute 316, which has special meaning in the context of this discussion. As described above, the process of incremental non-chronological synchronization can result in dangling references (an entity that does not yet exist is referred to by a changed entity). The phantom attribute 316 is a Boolean-valued property 326 of the entity 310 used to indicate that the entity has not yet been officially “created” but yet must exist in the namespace because it has been referred to by another entity. Use of the phantom attribute allows the creation of a “placeholder state,” which is essentially somewhere between an officially-created entity and a non-existent entity. Other constraints may be put on the entity 310 if the phantom property 316 is true. For instance, no other data may be allowed to be stored in a phantom entity except the name 311 and identity 312. Use of the placeholder state is illustrated in FIG. 7.
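  • The entity layout of FIG. 3 might be modeled as a small record type. This is a sketch under stated assumptions: the field names, the choice of dictionaries for attributes, and the validation in `__post_init__` are introduced here for illustration. The validation enforces the optional constraint mentioned above, that a phantom holds nothing beyond its name and identity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Entity:
    identity: str   # globally unique; treated as immutable by convention
    name: str       # unique within the namespace; may change at any time
    # Reference attributes: attribute name -> list of (name, identity) pairs.
    references: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)
    # User data attributes: attribute name -> value.
    data: Dict[str, str] = field(default_factory=dict)
    phantom: bool = False   # True while the entity is only a placeholder

    def __post_init__(self) -> None:
        # Optional constraint: a phantom stores only its name and identity.
        if self.phantom and (self.references or self.data):
            raise ValueError("a phantom may hold only a name and an identity")


alpha = Entity("guid-1", "Alpha", references={"manager": [("Beta", "guid-2")]})
beta_placeholder = Entity("guid-2", "Beta", phantom=True)
```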
  • The following guidance is provided for handling phantoms or entities in the placeholder state. First, if an add-entity event occurs and there already exists a phantom entity with the same identity, then the phantom entity is promoted to a non-placeholder entity. If a delete-entity event occurs and the entity to be deleted has references to it in the slave namespace, then the entity is demoted to a phantom entity.
  • If a change to an entity occurs that adds a value to a reference attribute and the referent does not exist, then a phantom entity is created with the correct name and identity and is used as the referent. If a change to an entity occurs that deletes a value from a reference attribute and a phantom entity was being referred to, the phantom entity is checked to see if it still has anything else referring to it. If not, then the phantom entity is deleted. Note that this process does not have to be immediate. Rather, at the end of the synchronization process, a final sweep can be made to find any phantom entities without references to them. If found, these orphaned phantom entities may then be deleted.
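  • The phantom-handling guidance in the two preceding paragraphs can be sketched as follows. All names here (`Entity`, `add_entity`, `delete_entity`, `add_reference`, `remove_reference`, `sweep_orphaned_phantoms`) are hypothetical, the namespace is assumed to be a dictionary keyed by identity, and references are tracked by identity only. Clearing a demoted entity's own references is an assumption that follows the optional rule that a phantom stores only its name and identity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Entity:
    identity: str
    name: str
    refs: Set[str] = field(default_factory=set)  # identities referred to
    phantom: bool = False


Namespace = Dict[str, Entity]


def is_referenced(ns: Namespace, identity: str) -> bool:
    return any(identity in e.refs for e in ns.values() if e.identity != identity)


def add_entity(ns: Namespace, identity: str, name: str) -> None:
    existing = ns.get(identity)
    if existing is not None and existing.phantom:
        existing.phantom = False          # promote the placeholder
        existing.name = name
    else:
        ns[identity] = Entity(identity, name)


def delete_entity(ns: Namespace, identity: str) -> None:
    if is_referenced(ns, identity):
        ns[identity].phantom = True       # demote: others still refer to it
        ns[identity].refs.clear()
    else:
        del ns[identity]


def add_reference(ns: Namespace, source: str, target: str, target_name: str) -> None:
    if target not in ns:                  # referent missing: create a placeholder
        ns[target] = Entity(target, target_name, phantom=True)
    ns[source].refs.add(target)


def remove_reference(ns: Namespace, source: str, target: str) -> None:
    ns[source].refs.discard(target)       # orphan cleanup is deferred to the sweep


def sweep_orphaned_phantoms(ns: Namespace) -> List[str]:
    """Final sweep: delete phantoms that nothing refers to any longer."""
    orphans = [i for i, e in ns.items() if e.phantom and not is_referenced(ns, i)]
    for identity in orphans:
        del ns[identity]
    return orphans
```

  • Deferring orphan removal to a single sweep keeps `remove_reference` cheap, at the cost of phantoms lingering until the feed has been fully consumed.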
  • FIG. 4 is a graphical representation of a mechanism for addressing name collision during an incremental non-chronological synchronization. More specifically, illustrated is a slave namespace partitioned into two subspaces, a transient subspace 410 and a non-transient subspace 411. The non-transient subspace 411 is configured to be sufficient to contain every possible name in the master namespace. Accordingly, the non-transient subspace 411 is sufficient to include every entity (e.g., entity 450) in the slave namespace if every single entity is uniquely named (as is the constraint) and is non-transient.
  • Recall that name collision occurs when an entity (which could be a phantom entity) is introduced that violates the constraint that no two entities within a namespace have the same name. Name collision can occur if the name of an existing entity is changed to an already taken name or if a new entity (which could be a phantom entity) is given a name that conflicts with an existing name (possibly the name of an existing phantom entity) prior to another change that would resolve the conflict. To address this situation, the existing entity is moved to the transient subspace 410 by changing its name to some value that cannot conflict with other entities. The use of the entity's identity (which is globally unique) in combination with (or in lieu of) the entity's name would suffice to prevent name collision. Accordingly, the transient subspace 410 is configured to be sufficient to contain every possible identity value in the master namespace. Upon being identified as transient, an entity's name may be changed to its identity or a combination of its identity and its name.
  • Another Boolean-valued flag in the data store for the slave namespace may be used to indicate that an entity is transient. The flag in combination with the string-valued name may also be used as a logical unique “name” for transient entities. For example, entity 451 has a flag 461 that is currently set False and a name “Foo.” Upon being identified as transient, the entity 451′ is moved to the transient subspace 410 by setting the flag 461′ to True. In addition, the name of the entity is changed to its identifier “GUID.”
  • The following guidance is provided for handling transient entities, or entities that result from a name collision. If an add-entity event occurs and there already exists an entity with the same name but a different identity, the existing entity is made transient. If an entity's name is changed but there already exists an entity with the new name but a different identity, the existing entity is made transient. A modify-entity event should be treated as a name change if the entity to which the event refers is currently transient. That is, receiving a change to an entity currently identified as transient should bring it out of the transient state. Note that bringing an entity out of the transient state should be treated the same as any other change. Any existing entities with conflicting names should in turn be made transient.
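  • The transient-handling guidance above can be sketched as a single name-claiming routine. The names `Entity`, `find_holder`, `make_transient`, and `claim_name` are hypothetical, the namespace is assumed to be a dictionary keyed by identity, and a transient entity is renamed to its identity (one of the options described above). The sketch assumes that every event reaching `claim_name` carries the name the master currently reports for the entity.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entity:
    identity: str
    name: str
    transient: bool = False


Namespace = Dict[str, Entity]


def find_holder(ns: Namespace, name: str, exclude: str) -> Optional[Entity]:
    """Return the non-transient entity, other than `exclude`, that holds `name`."""
    for entity in ns.values():
        if entity.identity != exclude and not entity.transient and entity.name == name:
            return entity
    return None


def make_transient(entity: Entity) -> None:
    entity.transient = True
    entity.name = entity.identity   # the identity is globally unique


def claim_name(ns: Namespace, identity: str, name: str) -> None:
    """Handle add-entity, a rename, or any modify-entity aimed at a transient."""
    holder = find_holder(ns, name, exclude=identity)
    if holder is not None:
        make_transient(holder)      # the existing holder steps aside
    entity = ns.get(identity)
    if entity is None:
        ns[identity] = Entity(identity, name)
    else:
        entity.name = name
        entity.transient = False    # any change brings a transient back


ns: Namespace = {"g1": Entity("g1", "Alpha"), "g2": Entity("g2", "Beta")}
claim_name(ns, "g1", "Beta")    # arrives early: g2 becomes transient
claim_name(ns, "g2", "Tango")   # arrives late: g2 leaves the transient state
```

  • Treating the transient flag as part of the lookup (only non-transient entities can hold a name) is what keeps the two subspaces from colliding with each other.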
  • The concepts and principles of these mechanisms and techniques will now be described with reference to certain examples of operation. Of course, these examples are for illustrative purposes only and are indeed not exhaustive of the various applications of these mechanisms and techniques.
  • FIG. 5 is a graphical illustration of a synchronization between a master namespace (DSA 150) and a slave namespace (buffer 221) that suffers from name collision. Shown are a data source (i.e., DSA 150) acting as a master namespace, and a buffer 221 acting as a slave. The sequence of events occurs from top to bottom, where the state of the DSA 150 changes over time. Initially, at state 501, the DSA 150 includes a first entity 510 currently named Alpha, and a second entity 511 currently named Beta. At that point, the second entity 511 is renamed from Beta to Tango, resulting in state 502. Subsequently, the first entity 510 is renamed to Beta.
  • At that point an incremental synchronization occurs between the DSA 150 and the buffer 221. Initially, at state 521, the buffer 221 matches the initial state 501 of the DSA 150. However, the changes to entities within the DSA 150 are transmitted to the buffer 221 out of order. Thus, the first change that occurs in the buffer 221 is that the first entity 530 is renamed from Alpha to Beta, thus resulting in a name conflict. Note that the second entity 531 has not yet been renamed. Accordingly, the second entity 531 is identified as transient in an appropriate manner, resulting in state 522. As described above, the name of the second entity 531 is changed in some fashion that prevents the two entities from sharing the same name. Subsequently, a change is received that renames the second entity 531 from Beta to Tango. Thus, the transient state of the second entity 531 is changed to non-transient, and its name is changed to Tango, resulting in state 523.
  • Note that when the synchronization is complete, the final state 523 of the buffer 221 matches the final state 503 of the DSA 150, and there are no remaining artifacts (i.e., there are no remaining transients).
  • FIG. 6 is a graphical illustration of another synchronization between a master namespace (DSA 150) and a slave namespace (buffer 221) that also suffers from name collision. Shown again are a DSA 150 acting as a master namespace, and a buffer 221 acting as a slave. The sequence of events occurs from top to bottom, where the state of the DSA 150 changes over time. Initially, at state 601, the DSA 150 includes only a first entity 610 currently named Alpha. At that point, the first entity 610 is deleted, and the DSA 150 is empty at state 602. Subsequently, at state 603, the second entity 611 is created and named Alpha.
  • At that point an incremental synchronization occurs between the DSA 150 and the buffer 221. Initially, at state 621, the buffer 221 matches the initial state 601 of the DSA 150. However, the changes to entities within the DSA 150 are transmitted to the buffer 221 out of order. Thus, the first change that occurs in the buffer 221 is that the second entity 631 is created and named Alpha, thus resulting in a name collision with the first entity 630. Accordingly, the first entity 630 is identified as transient in an appropriate manner, resulting in state 622. As described above, the name of the first entity 630 is also changed in some fashion that prevents the two entities from sharing the same name. Subsequently, a change is received that deletes the first entity 630, resulting in state 623.
  • Note that when the synchronization is complete, the final state 623 of the buffer 221 matches the final state 603 of the DSA 150, and there are no remaining artifacts (i.e., there are no remaining transients).
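  • The walkthroughs of FIGS. 5 and 6 can be replayed in every possible feed order to confirm that the slave converges to the same final state each time. The sketch below is illustrative: the change tuples, the `apply` and `replay` functions, and the identifiers `e1` and `e2` are assumptions, and a transient entity is renamed to its identity.

```python
from itertools import permutations
from typing import Dict, List, Optional, Tuple

Change = Tuple[str, str, Optional[str]]   # (kind, identity, new name)


def apply(slave: Dict[str, list], change: Change) -> None:
    """slave maps identity -> [name, transient flag]."""
    kind, identity, name = change
    if kind == "delete":
        slave.pop(identity, None)
        return
    # "add" and "rename" both claim a name: any other non-transient holder
    # of that name is made transient and renamed to its own identity.
    for other, state in slave.items():
        if other != identity and not state[1] and state[0] == name:
            state[0], state[1] = other, True
    slave[identity] = [name, False]


def replay(initial: Dict[str, str], feed: List[Change]) -> List[Dict[str, tuple]]:
    """Consume the feed in every possible order; return each final slave state."""
    finals = []
    for order in permutations(feed):
        slave = {ident: [name, False] for ident, name in initial.items()}
        for change in order:
            apply(slave, change)
        finals.append({ident: tuple(state) for ident, state in slave.items()})
    return finals


# FIG. 5: Beta is renamed Tango, then Alpha is renamed Beta.
fig5 = replay({"e1": "Alpha", "e2": "Beta"},
              [("rename", "e2", "Tango"), ("rename", "e1", "Beta")])
# FIG. 6: Alpha is deleted, then a new entity named Alpha is created.
fig6 = replay({"e1": "Alpha"},
              [("delete", "e1", None), ("add", "e2", "Alpha")])
```

  • In both replays the out-of-order pass goes through an intermediate state containing a transient entity, yet no transient flag survives once the feed is depleted.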
  • FIG. 7 is a graphical illustration of a synchronization between a master namespace (DSA 150) and a slave namespace (buffer 221) that suffers from a dangling reference. Shown again are a DSA 150 acting as a master namespace, and a buffer 221 acting as a slave. The sequence of events occurs from top to bottom, where the state of the DSA 150 changes over time. Initially, at state 701, the DSA 150 includes a first entity 710 currently named Alpha. At that point, a second entity 711 named Beta is created, resulting in state 702. Subsequently, a reference is added to the first entity 710 that points to the second entity 711.
  • At that point an incremental synchronization occurs between the DSA 150 and the buffer 221. Initially, at state 721, the buffer 221 matches the initial state 701 of the DSA 150. However, the changes to entities within the DSA 150 are transmitted to the buffer 221 out of order. Thus, the first change that occurs in the buffer 221 is that a reference is added to the first entity 730 that points to the second entity 731, but the second entity 731 has not yet been created. Accordingly, a phantom entity 731 is created having the name referred to by the first entity 730. As described above, a flag or bit within a typical entity may be used to indicate that the phantom entity 731 is a placeholder. Subsequently, a change is received that formally creates the second entity 731, and thus the phantom status is removed from it, resulting in state 723.
  • Note that when the synchronization is complete, the final state 723 of the buffer 221 matches the final state 703 of the DSA 150, and there are no remaining artifacts (i.e., there are no phantom entities).
  • Under normal conditions, a successfully completed synchronization process should leave the slave namespace (e.g., the buffer 221) in such a state that it has no transients or phantoms. This follows directly from the definition of a synchronization process, which stipulates that after a successfully completed synchronization the master and slave namespaces are identical.
  • If the slave namespace is left in a state with transients or phantoms, an error condition occurs. If a synchronization process has terminated abnormally, then the application could simply warn the user about the presence of these entities, since subsequent resumption of the synchronization process should resolve their presence.
  • A different action may be appropriate if either transients or phantoms remain after a synchronization process has completed successfully. This situation can result from two scenarios. First, the presence of transients after a successful synchronization process means the synchronization feed contains invalid data, and an error should be raised. The presence of phantoms after a successful synchronization would also mean the synchronization feed is invalid, as long as the synchronization process is always synchronizing all entities in both the master and slave namespaces.
  • However, phantoms can remain after a successful synchronization if the synchronization process is modified such that only a subset of entities in the master namespace is synchronized with a corresponding subset in the slave namespace. Such a process may be called filtered synchronization, and it differs from normal synchronization in that it is possible for entities within the filtered subset to contain references to entities outside the filtered subset. Otherwise, filtered synchronization is identical to normal synchronization, except that the constraints on the process with respect to convergence only apply to the synchronized subsets of the master and slave namespaces. When the techniques described here operate on a filtered synchronization process, it is possible that phantom entities will remain if there are such references. From this perspective, phantoms may in fact be considered a useful tool in the representation of references to filtered entities in the underlying data source.
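  • The end-of-run handling described in the preceding paragraphs might be sketched as a single check. The names `Entity`, `InvalidFeedError`, and `check_slave`, and the `completed` and `filtered` parameters, are assumptions made for illustration; how an application actually surfaces warnings or errors is not specified here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Entity:
    name: str
    transient: bool = False
    phantom: bool = False


class InvalidFeedError(Exception):
    """Raised when a completed synchronization leaves artifacts behind."""


def check_slave(entities: Iterable[Entity], completed: bool,
                filtered: bool = False) -> List[str]:
    """Return warnings after an aborted run; raise after an invalid completed run."""
    entities = list(entities)
    transients = [e.name for e in entities if e.transient]
    phantoms = [e.name for e in entities if e.phantom]
    if not completed:
        # Abnormal termination: resuming synchronization should clear these.
        return ([f"transient entity {n!r} awaits resumption" for n in transients]
                + [f"phantom entity {n!r} awaits resumption" for n in phantoms])
    if transients:
        raise InvalidFeedError(f"transients remain: {transients}")
    if phantoms and not filtered:
        raise InvalidFeedError(f"phantoms remain: {phantoms}")
    return []   # under filtered synchronization, leftover phantoms are legitimate
```

  • Passing `filtered=True` reflects the case where entities inside the synchronized subset refer to entities outside it, so that phantoms stand in for the unsynchronized referents.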
  • FIG. 8 shows an exemplary computer 800 suitable as an environment for practicing various aspects of subject matter disclosed herein. Components of computer 800 may include, but are not limited to, a processing unit 820, a system memory 830, and a system bus 821 that couples various system components including the system memory 830 to the processing unit 820. The system bus 821 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as the Mezzanine bus.
  • Exemplary computer 800 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by computer 800 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 800. Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
  • The system memory 830 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 831 and random access memory (RAM) 832. A basic input/output system 833 (BIOS), containing the basic routines that help to transfer information between elements within computer 800, such as during start-up, is typically stored in ROM 831. RAM 832 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 820. By way of example, and not limitation, FIG. 8 illustrates operating system 834, the exemplary rules/specifications, services, storage 801 (e.g., storage may occur in RAM or other memory), application programs 835, other program modules 836, and program data 837. Although the exemplary rules/specifications, services and/or storage 801 are depicted as software in random access memory 832, other implementations may include hardware or combinations of software and hardware.
  • The exemplary computer 800 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, FIG. 8 illustrates a hard disk drive 841 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 851 that reads from or writes to a removable, nonvolatile magnetic disk 852, and an optical disk drive 855 that reads from or writes to a removable, nonvolatile optical disk 856 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 841 is typically connected to the system bus 821 through a non-removable memory interface such as interface 840, and magnetic disk drive 851 and optical disk drive 855 are typically connected to the system bus 821 by a removable memory interface such as interface 850.
  • The drives and their associated computer storage media discussed above and illustrated in FIG. 8 provide storage of computer-readable instructions, data structures, program modules, and other data for computer 800. In FIG. 8, for example, hard disk drive 841 is illustrated as storing operating system 844, application programs 845, other program modules 846, and program data 847. Note that these components can either be the same as or different from operating system 834, application programs 835, other program modules 836, and program data 837. Operating system 844, application programs 845, other program modules 846, and program data 847 are given different numbers here to illustrate that, at a minimum, they are different copies. A user may enter commands and information into the exemplary computer 800 through input devices such as a keyboard 862 and pointing device 861, commonly referred to as a mouse, trackball, or touch pad. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 820 through a user input interface 860 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port, or a universal serial bus (USB). A monitor 891 or other type of display device is also connected to the system bus 821 via an interface, such as a video interface 890. In addition to the monitor 891, computers may also include other peripheral output devices such as speakers 897 and printer 896, which may be connected through an output peripheral interface 895.
  • The exemplary computer 800 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 880. The remote computer 880 may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to computer 800, although only a memory storage device 881 has been illustrated in FIG. 8. The logical connections depicted in FIG. 8 include a local area network (LAN) 871 and a wide area network (WAN) 873, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets, and the Internet.
  • When used in a LAN networking environment, the exemplary computer 800 is connected to the LAN 871 through a network interface or adapter 870. When used in a WAN networking environment, the exemplary computer 800 typically includes a modem 872 or other means for establishing communications over the WAN 873, such as the Internet. The modem 872, which may be internal or external, may be connected to the system bus 821 via the user input interface 860, or other appropriate mechanism. In a networked environment, program modules depicted relative to the exemplary computer 800, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, FIG. 8 illustrates remote application programs 885 as residing on memory device 881. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
  • The subject matter described above can be implemented in hardware, in software, or in both hardware and software. In certain implementations, the exemplary flexible rules, identity information management processes, engines, and related methods may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The subject matter can also be practiced in distributed communications environments where tasks are performed over wireless communication by remote processing devices that are linked through a communications network. In a wireless network, program modules may be located in both local and remote communications device storage media including memory storage devices.

Claims (33)

1. A method for synchronizing information in namespaces, comprising:
receiving an indication of a change to information in a first namespace;
based on the indication, determining if an entity exists in a second namespace related to the information;
if so, determining if the entity has a characteristic that conflicts with the information; and
if a conflict exists, modifying the entity to resolve the conflict prior to applying the change to the second namespace.
2. The method of claim 1, wherein the indication of the change comprises a notice that another entity was added to the first namespace.
3. The method of claim 2, wherein the characteristic comprises a name of the other entity.
4. The method of claim 3, wherein the conflict comprises a name collision between the entity in the first namespace and the entity in the second namespace.
5. The method of claim 4, wherein modifying the entity in the second namespace comprises creating an indication that the characteristic of the entity in the second namespace has become invalid.
6. The method of claim 5, wherein creating the indication comprises associating with the entity in the second namespace an indication that the name of the entity in the second namespace is no longer valid.
7. The method of claim 1, wherein the information in the first namespace comprises an entity in the first namespace.
8. The method of claim 1, wherein modifying the entity comprises altering the characteristic of the entity to eliminate the conflict.
9. The method of claim 8, wherein the characteristic comprises a name of the entity, and wherein altering the characteristic comprises modifying the name of the entity.
10. The method of claim 9, wherein modifying the name comprises replacing the name with a unique identifier.
11. The method of claim 9, wherein modifying the name comprises setting a flag associated with the entity to indicate that the name of the entity is transient.
12. A computer-readable medium having computer-executable instructions for performing the method of claim 1.
13. A method for synchronizing information in namespaces, comprising:
receiving an indication of a change to information in a first namespace;
based on the indication, determining if an entity exists in a second namespace related to the information;
if not, creating a representation of the entity within the second namespace.
14. The method of claim 13, wherein the indication of the change comprises a notice of a reference to the entity in the second namespace.
15. The method of claim 14, wherein the reference indicates that the information in the first namespace refers to the entity in the second namespace.
16. The method of claim 15, wherein the representation of the entity comprises a phantom entity in the second namespace.
17. The method of claim 16, wherein the phantom entity includes a flag indicating the state of the phantom entity.
18. The method of claim 17, further comprising, receiving a second indication of a second change to information in the first namespace and in response to the second indication, modifying the state of the phantom entity.
19. The method of claim 18, wherein the second indication comprises an instruction to create the entity in the second namespace.
20. A computer-readable medium having computer-executable instructions for performing the method of claim 13.
21. A technique for synchronizing entities within two namespaces, comprising:
while synchronizing the two namespaces:
identifying a conflict between a change notification received from a first namespace and a state of information within a second namespace;
creating a temporary entity within the second namespace that allows the synchronization to proceed without interference by the conflict; and
if the conflict becomes resolved such that the temporary entity is no longer necessary, removing the temporary entity.
22. The technique of claim 21, wherein the conflict becomes resolved by receiving a notice to delete the temporary entity.
23. The technique of claim 21, wherein the conflict becomes resolved by receiving a notice to make the temporary entity permanent.
24. A computer-readable medium encoded with a data structure, comprising:
a plurality of entities, each entity having
a first field having a name, the name being unique across each entity in the data structure;
a second field having an identity, the identity being globally unique; and
a third field having a phantom property, the phantom property being operative to distinguish between a first state of the entity and a second state of the entity.
25. A computer-readable medium having computer-executable components, comprising:
a synchronization environment having an associated external namespace, an associated central namespace, and a synchronization mechanism, the synchronization mechanism being configured to receive change information from the external namespace that identifies a plurality of changes to at least one object in the external namespace, the synchronization mechanism being configured to receive the change information in a first order that differs from a second order, the second order being the temporal order in which the changes occurred to the at least one object in the external namespace, the synchronization mechanism further comprising a name resolving component and a placeholder component, the name resolving component being operative to avoid name collisions and the placeholder component being operative to avoid dangling references.
26. The computer-readable medium of claim 25, wherein the central namespace includes a plurality of objects that are correlated to a corresponding plurality of objects in the external namespace.
27. The computer-readable medium of claim 25, wherein the name collision comprises an error corresponding to two objects in the central namespace having similar names.
28. The computer-readable medium of claim 27, wherein the name resolving component comprises a pair of subspaces, one subspace for transient objects, and the other subspace for non-transient objects.
29. The computer-readable medium of claim 28, wherein the transient objects comprise objects that have been identified as having a name that is no longer valid.
30. The computer-readable medium of claim 28, wherein the non-transient objects comprise objects that have not been identified as having a name that is no longer valid.
31. The computer-readable medium of claim 25, wherein the dangling reference comprises an error corresponding to one object in the central namespace referring to another object in the central namespace that does not yet exist.
32. The computer-readable medium of claim 31, wherein the placeholder component comprises an identifier on a phantom object in the central namespace.
33. The computer-readable medium of claim 32, wherein the phantom object comprises an object that is referred to by another object in the central namespace but which has not yet been formally created.
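The two mechanisms recited in the claims above, resolving name collisions by marking the older name as transient and avoiding dangling references with phantom placeholders, can be illustrated with a minimal sketch. This is not the patented implementation. The class and method names (`Entity`, `CentralNamespace`, `apply_add`, `resolve_reference`) and the `transient-`/`phantom-` name prefixes are hypothetical, chosen only for illustration. The sketch assumes each change notification carries a globally unique identity and a name.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Entity:
    """Hypothetical entity record with the three fields of claim 24, plus a transient flag."""
    name: str                # unique within the namespace
    identity: str            # globally unique identifier
    phantom: bool = False    # True while the entity is only a placeholder
    transient: bool = False  # True once the name is known to be no longer valid


class CentralNamespace:
    """Illustrative second (central) namespace that accepts out-of-order changes."""

    def __init__(self) -> None:
        self._by_id: Dict[str, Entity] = {}
        self._by_name: Dict[str, str] = {}  # name -> identity

    def get(self, identity: str) -> Optional[Entity]:
        return self._by_id.get(identity)

    def _mark_transient(self, entity: Entity) -> None:
        # Replace the colliding name with one derived from the unique identity
        # and flag it as transient (compare claims 10 and 11).
        del self._by_name[entity.name]
        entity.name = f"transient-{entity.identity}"
        entity.transient = True
        self._by_name[entity.name] = entity.identity

    def apply_add(self, identity: str, name: str) -> Entity:
        """Apply an 'entity added' notification, resolving any name collision first."""
        holder_id = self._by_name.get(name)
        if holder_id is not None and holder_id != identity:
            # Another entity still holds this name: modify it before applying the change.
            self._mark_transient(self._by_id[holder_id])
        entity = self._by_id.get(identity)
        if entity is None:
            entity = Entity(name=name, identity=identity)
            self._by_id[identity] = entity
        else:
            # The entity already exists, possibly as a phantom: make it a real entity.
            self._by_name.pop(entity.name, None)
            entity.name = name
            entity.phantom = False
            entity.transient = False
        self._by_name[name] = identity
        return entity

    def resolve_reference(self, identity: str) -> Entity:
        """Return the referenced entity, creating a phantom placeholder if it is missing."""
        entity = self._by_id.get(identity)
        if entity is None:
            entity = Entity(name=f"phantom-{identity}", identity=identity, phantom=True)
            self._by_id[identity] = entity
            self._by_name[entity.name] = identity
        return entity
```

In this sketch, a second `apply_add` that reuses an existing name renames the earlier holder instead of failing, so synchronization can proceed even if the notification that would have renamed or deleted the earlier entity has not yet arrived. Likewise, `resolve_reference` lets a reference be recorded before the referenced entity's own creation notice is received; a later `apply_add` with the same identity clears the phantom flag on the same object.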
US10/669,866 2003-09-24 2003-09-24 Incremental non-chronological synchronization of namespaces Active 2026-04-21 US7584219B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/669,866 US7584219B2 (en) 2003-09-24 2003-09-24 Incremental non-chronological synchronization of namespaces

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/669,866 US7584219B2 (en) 2003-09-24 2003-09-24 Incremental non-chronological synchronization of namespaces

Publications (2)

Publication Number Publication Date
US20050065978A1 (en) 2005-03-24
US7584219B2 (en) 2009-09-01

Family

ID=34313778

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/669,866 Active 2026-04-21 US7584219B2 (en) 2003-09-24 2003-09-24 Incremental non-chronological synchronization of namespaces

Country Status (1)

Country Link
US (1) US7584219B2 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040260591A1 (en) * 2003-06-17 2004-12-23 Oracle International Corporation Business process change administration
US20070028215A1 (en) * 2005-07-26 2007-02-01 Invensys Systems, Inc. Method and system for hierarchical namespace synchronization
US20130073516A1 (en) * 2011-06-23 2013-03-21 Alibaba Group Holding Limited Extracting Incremental Data
US10623491B2 (en) * 2015-11-04 2020-04-14 Dropbox, Inc. Namespace translation
US11704199B1 (en) * 2022-06-11 2023-07-18 Snowflake Inc. Data replication with cross replication group references

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8086633B2 (en) 2009-08-27 2011-12-27 International Business Machines Corporation Unified user identification with automatic mapping and database absence handling
US10140461B2 (en) 2015-10-30 2018-11-27 Microsoft Technology Licensing, Llc Reducing resource consumption associated with storage and operation of containers

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5893116A (en) * 1996-09-30 1999-04-06 Novell, Inc. Accessing network resources using network resource replicator and captured login script for use when the computer is disconnected from the network
US5903753A (en) * 1995-08-18 1999-05-11 International Business Machines Corporation Name space registry with backward compatibility for older applications
US6061743A (en) * 1998-02-19 2000-05-09 Novell, Inc. Method and apparatus for aggregating disparate namespaces
US6154212A (en) * 1997-11-06 2000-11-28 Lucent Technologies Inc. Method and apparatus for constructing network interfaces
US6269405B1 (en) * 1998-10-19 2001-07-31 International Business Machines Corporation User account establishment and synchronization in heterogeneous networks
US6269406B1 (en) * 1998-10-19 2001-07-31 International Business Machines Corporation User group synchronization to manage capabilities in heterogeneous networks
US20020095479A1 (en) * 2001-01-18 2002-07-18 Schmidt Brian Keith Method and apparatus for virtual namespaces for active computing environments
US20020133487A1 (en) * 2001-03-15 2002-09-19 Microsoft Corporation System and method for unloading namespace devices
US6581074B1 (en) * 2000-10-06 2003-06-17 Microsoft Corporation Directory synchronization
US20030131104A1 (en) * 2001-09-25 2003-07-10 Christos Karamanolis Namespace management in a distributed file system
US20030145003A1 (en) * 2002-01-10 2003-07-31 International Business Machines Corporation System and method for metadirectory differential updates among constituent heterogeneous data sources
US6604148B1 (en) * 1999-10-01 2003-08-05 International Business Machines Corporation Method, system, and program for accessing a network namespace
US6611847B1 (en) * 1999-12-30 2003-08-26 Unisys Corporation Method for dynamically linking two objects in two different models
US20030195870A1 (en) * 2002-04-15 2003-10-16 International Business Machines Corporation System and method for performing lookups across namespace domains using universal resource locators
US20030225753A1 (en) * 2001-12-18 2003-12-04 Becomm Corporation Method and system for attribute management in a namespace
US6725262B1 (en) * 2000-04-27 2004-04-20 Microsoft Corporation Methods and systems for synchronizing multiple computing devices
US20040172421A1 (en) * 2002-12-09 2004-09-02 Yasushi Saito Namespace consistency for a wide-area file system
US20040225675A1 (en) * 2003-05-08 2004-11-11 Microsoft Corporation Associating and using information in a metadirectory
US20040267752A1 (en) * 2003-04-24 2004-12-30 Wong Thomas K. Transparent file replication using namespace replication
US20050027734A1 (en) * 2001-11-26 2005-02-03 Microsoft Corporation Extending a directory schema independent of schema modification
US6895586B1 (en) * 2000-08-30 2005-05-17 Bmc Software Enterprise management system and method which includes a common enterprise-wide namespace and prototype-based hierarchical inheritance

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5586318A (en) * 1993-12-23 1996-12-17 Microsoft Corporation Method and system for managing ownership of a released synchronization mechanism
US6418200B1 (en) * 1999-02-26 2002-07-09 Mitel, Inc. Automatic synchronization of address directories for unified messaging
US8055907B2 (en) * 2003-10-24 2011-11-08 Microsoft Corporation Programming interface for a computer platform

Patent Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5903753A (en) * 1995-08-18 1999-05-11 International Business Machines Corporation Name space registry with backward compatibility for older applications
US5893116A (en) * 1996-09-30 1999-04-06 Novell, Inc. Accessing network resources using network resource replicator and captured login script for use when the computer is disconnected from the network
US6154212A (en) * 1997-11-06 2000-11-28 Lucent Technologies Inc. Method and apparatus for constructing network interfaces
US6061743A (en) * 1998-02-19 2000-05-09 Novell, Inc. Method and apparatus for aggregating disparate namespaces
US6269405B1 (en) * 1998-10-19 2001-07-31 International Business Machines Corporation User account establishment and synchronization in heterogeneous networks
US6269406B1 (en) * 1998-10-19 2001-07-31 International Business Machines Corporation User group synchronization to manage capabilities in heterogeneous networks
US6604148B1 (en) * 1999-10-01 2003-08-05 International Business Machines Corporation Method, system, and program for accessing a network namespace
US6611847B1 (en) * 1999-12-30 2003-08-26 Unisys Corporation Method for dynamically linking two objects in two different models
US6725262B1 (en) * 2000-04-27 2004-04-20 Microsoft Corporation Methods and systems for synchronizing multiple computing devices
US6895586B1 (en) * 2000-08-30 2005-05-17 Bmc Software Enterprise management system and method which includes a common enterprise-wide namespace and prototype-based hierarchical inheritance
US6581074B1 (en) * 2000-10-06 2003-06-17 Microsoft Corporation Directory synchronization
US20020095479A1 (en) * 2001-01-18 2002-07-18 Schmidt Brian Keith Method and apparatus for virtual namespaces for active computing environments
US6877018B2 (en) * 2001-03-15 2005-04-05 Microsoft Corporation System and method for unloading namespace devices
US20020133487A1 (en) * 2001-03-15 2002-09-19 Microsoft Corporation System and method for unloading namespace devices
US20030131104A1 (en) * 2001-09-25 2003-07-10 Christos Karamanolis Namespace management in a distributed file system
US20050027734A1 (en) * 2001-11-26 2005-02-03 Microsoft Corporation Extending a directory schema independent of schema modification
US20050044103A1 (en) * 2001-11-26 2005-02-24 Microsoft Corporation Extending a directory schema independent of schema modification
US6952704B2 (en) * 2001-11-26 2005-10-04 Microsoft Corporation Extending a directory schema independent of schema modification
US20030225753A1 (en) * 2001-12-18 2003-12-04 Becomm Corporation Method and system for attribute management in a namespace
US20030145003A1 (en) * 2002-01-10 2003-07-31 International Business Machines Corporation System and method for metadirectory differential updates among constituent heterogeneous data sources
US20030195870A1 (en) * 2002-04-15 2003-10-16 International Business Machines Corporation System and method for performing lookups across namespace domains using universal resource locators
US20040172421A1 (en) * 2002-12-09 2004-09-02 Yasushi Saito Namespace consistency for a wide-area file system
US20040267752A1 (en) * 2003-04-24 2004-12-30 Wong Thomas K. Transparent file replication using namespace replication
US20040225675A1 (en) * 2003-05-08 2004-11-11 Microsoft Corporation Associating and using information in a metadirectory

Also Published As

Publication number Publication date
US7584219B2 (en) 2009-09-01


Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZYBURA, JOHN H.;BENSON, MAX L.;MAN, HERMAN;AND OTHERS;REEL/FRAME:014551/0811;SIGNING DATES FROM 20030919 TO 20030922

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034541/0477

Effective date: 20141014

FPAY Fee payment

Year of fee payment: 8

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 12