What is Replication?
In any organization, large or small, directory data must be updated regularly and remain available to all users. For example, when an employee’s telephone number changes, the new number must be communicated throughout the organization so that every domain controller stays up to date.
This is accomplished through a mechanism called replication. Simply put, replication is the process by which copies of directory data are created and maintained on several domain controllers.
Replication in Active Directory
Before Windows 2000 Server, Windows NT followed a master-slave approach to replication, employing a single writable Primary Domain Controller (PDC) and many associated read-only Backup Domain Controllers (BDCs). Active Directory (AD) departs from this traditional method and uses a multimaster approach for the replication of directory data. As the name suggests, in the multimaster approach each domain controller acts as a master and can replicate data to the other domain controllers.
Replication is carried out differently across the three directory partitions: the schema partition, the configuration partition, and the domain partition. The schema partition holds the definitions of objects and object attributes and is common to the entire forest, so any update to the schema is replicated forest-wide. The configuration partition contains the physical layout of sites. Like schema data, configuration data is replicated throughout the forest. In contrast, domain controllers in different domains hold different sets of data that are confined to their own domain. Thus, to make data available throughout an organization, the domain data on each domain controller is fully replicated to every other domain controller in the same domain and partially replicated to the global catalog servers.
How does it work?
Now that we know how replication is scoped across the three directory partitions, it is essential to understand that Active Directory replication is attribute based. To see this, let’s go back to our first example: the change in an employee’s telephone number. Here “telephone number” is one of the attributes that define the “employee” object. When this attribute is modified, only the changed attribute, that is, the new telephone number, is replicated to the other domain controllers, not the entire object. This is where Update Sequence Numbers (USNs) come in. Each domain controller maintains its own USN counter. Whenever an object is created or changed on a domain controller, that controller increments its counter and stamps the change with the new USN. Each replication partner remembers the highest USN it has already received from that domain controller, so it can tell when its copy is out of date and request only the changes with a higher USN. Thus changes are tracked and recorded in Active Directory with the help of USNs.
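The attribute-level, USN-driven behaviour described above can be illustrated with a small simulation. This is only a sketch, not a real Active Directory API: the `DomainController` class, its `high_watermark` field and the sample attribute values are hypothetical names chosen for illustration, and it assumes that each DC keeps its own USN counter and remembers the highest USN it has already pulled from a partner. Conflict handling and loop suppression are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class DomainController:
    """Toy model of a DC (hypothetical, not a real AD interface)."""
    name: str
    usn: int = 0  # this DC's own update counter
    # (object, attribute) -> (value, local USN at which it was written)
    store: dict = field(default_factory=dict)
    # source DC name -> highest source USN already pulled from it
    high_watermark: dict = field(default_factory=dict)

    def write(self, obj, attr, value):
        """Apply a change and stamp it with the next local USN."""
        self.usn += 1
        self.store[(obj, attr)] = (value, self.usn)

    def changes_since(self, usn):
        """Return only the attributes written after the given USN."""
        return sorted(
            (stamp, key, value)
            for key, (value, stamp) in self.store.items()
            if stamp > usn
        )

    def pull_from(self, source):
        """Request everything newer than what was already received."""
        seen = self.high_watermark.get(source.name, 0)
        changes = source.changes_since(seen)
        for _stamp, (obj, attr), value in changes:
            self.write(obj, attr, value)
        self.high_watermark[source.name] = source.usn
        return [(key, value) for _stamp, key, value in changes]


dc1 = DomainController("DC1")
dc2 = DomainController("DC2")

dc1.write("employee-42", "displayName", "Ada Example")
dc1.write("employee-42", "telephoneNumber", "555-0100")
first_pull = dc2.pull_from(dc1)   # both attributes are new to DC2

dc1.write("employee-42", "telephoneNumber", "555-0199")
second_pull = dc2.pull_from(dc1)  # only the changed attribute travels

print(second_pull)
```

In the second pull only the telephone number is transferred, because the display name carries a USN that DC2 has already seen.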
Sites and Replication
For administrative convenience, and to gain cost efficiency and speed, a network is divided into “sites” rather than managed as a whole. When more than one site exists, replication is carried out differently within and between sites: intrasite replication occurs within a site, while intersite replication takes place between sites.
Because more bandwidth is available within a site, updates are replicated quickly, as and when they are made. This type of replication is unscheduled and uses a bidirectional ring topology, in which each domain controller has at least two connections to increase fault tolerance.
While bandwidth within a site is assumed to be high, bandwidth between two sites is typically restricted. Hence, to save bandwidth and cost and to increase efficiency, updates replicated between sites are usually compressed and follow a configurable schedule. In addition to RPC over IP, which is also used in intrasite replication, intersite replication can use an asynchronous protocol, SMTP.
Configuring the topology for intersite and intrasite replication by hand would be painstaking, but thankfully Active Directory configures its own replication topologies using the Knowledge Consistency Checker (KCC). The KCC is an Active Directory component that lifts the burden of generating a topology from the administrator’s shoulders. With the help of the KCC, the domain controllers keep all the copies of each directory partition consistent and distribute the replicated information through a set of connections that span LANs and WANs. Together, these connections form the replication topology.
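To make the fault-tolerance argument concrete, the following sketch builds a bidirectional ring and checks which DCs can still reach each other when one of them is down. The function names and DC names are hypothetical, and the sketch assumes at least three DCs so that each one has two distinct neighbours. It is not how Active Directory itself builds the ring.

```python
def ring_partners(dcs):
    """Map each DC to its two neighbours in a bidirectional ring."""
    n = len(dcs)
    return {
        dc: (dcs[(i - 1) % n], dcs[(i + 1) % n])
        for i, dc in enumerate(dcs)
    }


def reachable(ring, start, failed=frozenset()):
    """Return the DCs that can be reached from start, skipping failed DCs."""
    seen = {start}
    stack = [start]
    while stack:
        for neighbour in ring[stack.pop()]:
            if neighbour not in seen and neighbour not in failed:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


ring = ring_partners(["DC1", "DC2", "DC3", "DC4"])
survivors = reachable(ring, "DC1", failed={"DC2"})
print(sorted(survivors))
```

With DC2 unavailable, an update from DC1 still reaches DC3 by travelling the other way around the ring through DC4.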
Site link objects serve as the building blocks of an intersite replication topology. Each site link object is assigned a “cost”, and the KCC generates a least-cost replication topology by eliminating redundant replication pathways.
When a change is made in the directory, the source domain controller does not need to replicate the data directly to every destination domain controller. Doing so would overload the source domain controller and generate excessive replication traffic. This is where replication partners help greatly. The KCC designates certain domain controllers as replication partners of the source, creates connection objects between them, and the replication information travels through those partners. Over these connection objects, data can reach the destination DCs either directly or transitively.
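The idea of choosing routes by site link cost can be sketched with a standard shortest-path search. This is an illustration only: the actual KCC algorithm is more involved, and the `cheapest_route` function, the site names and the cost values below are all assumptions made for the example.

```python
import heapq


def cheapest_route(site_links, source, target):
    """Dijkstra's algorithm over site links given as {(site_a, site_b): cost}."""
    graph = {}
    for (a, b), cost in site_links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, source, [source])]
    visited = set()
    while queue:
        cost, site, path = heapq.heappop(queue)
        if site == target:
            return cost, path
        if site in visited:
            continue
        visited.add(site)
        for neighbour, link_cost in graph.get(site, []):
            if neighbour not in visited:
                heapq.heappush(
                    queue, (cost + link_cost, neighbour, path + [neighbour])
                )
    return None  # no chain of site links connects the two sites


# Hypothetical sites and costs: a cheap two-hop path versus an expensive direct link.
links = {
    ("HQ", "Branch-A"): 100,
    ("HQ", "Branch-B"): 400,
    ("Branch-A", "Branch-B"): 100,
}
route = cheapest_route(links, "HQ", "Branch-B")
print(route)
```

Here the two-hop route through Branch-A has a total cost of 200 and is preferred over the direct link with cost 400.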
AD Replication Model
Several mechanisms aid the unhindered replication of directory updates from one DC to another. Together, these mechanisms make up the Active Directory replication model. The model comprises four components, which integrate all the replication services.
|Multimaster replication||A peer-to-peer model of replication in which each domain controller can send and receive updates when a change is made. In multimaster replication, all domain controllers have equal authority and can originate changes that are then replicated to the others.|
|Store and forward replication||Store and forward replication eliminates the need for the originating DC to replicate point to point to every other DC. Instead, the changes are first replicated to the originating DC’s replication partners, which in turn pass the changes on to their own replication partners, and so on until all the DCs are updated.|
|Pull replication||When a change is made on a domain controller, it does not simply push the update to other DCs. Rather, it waits for an update request from them. Once a destination requests specific information, the source sends it immediately, allowing the destination to update itself.|
|State-based replication||In state-based replication, the domain controllers replicate only the current state of the changed data, rather than a log of every individual change that led to that state.|
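How store and forward combines with pull replication can be sketched as a round-by-round simulation. The `propagate` function and the partner map are hypothetical, and the sketch assumes that every DC pulls from each of its partners once per round, which is a simplification of real replication timing.

```python
def propagate(partners, origin):
    """Return {dc: round} showing when each DC receives the update.

    partners maps each DC to the DCs it pulls changes from.
    """
    received = {origin: 0}
    round_no = 0
    while True:
        round_no += 1
        # A DC obtains the change if any of its sources already had it
        # at the end of the previous round.
        newly_updated = [
            dc
            for dc, sources in partners.items()
            if dc not in received and any(s in received for s in sources)
        ]
        if not newly_updated:
            return received
        for dc in newly_updated:
            received[dc] = round_no


# Four DCs arranged in a ring; each pulls from its two neighbours.
partners = {
    "DC1": ["DC2", "DC4"],
    "DC2": ["DC1", "DC3"],
    "DC3": ["DC2", "DC4"],
    "DC4": ["DC3", "DC1"],
}
arrival = propagate(partners, "DC1")
print(arrival)
```

DC1 never contacts DC3 directly. DC3 receives the change in the second round by pulling it from a partner that already stored it.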
Though the mechanism of replication looks foolproof, the same object attribute can be modified concurrently by two different people on different DCs. This gives rise to a replication collision. The question then is which change should be replicated throughout the directory. When two conflicting updates exist for the same attribute, their stamps are compared: the update with the higher attribute version number wins, and if the version numbers are equal, the time stamps of the two updates are compared and the most recent one is accepted.
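A collision of this kind can be resolved deterministically by comparing a stamp attached to each write. The sketch below assumes a stamp made of a version number, the originating time stamp and the GUID of the originating DC, compared in that order, so that the time stamp decides when the versions are equal and the GUID is only a final tie-breaker. The `AttributeWrite` class and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AttributeWrite:
    value: str
    version: int         # incremented on every originating write
    timestamp: datetime  # time of the originating write
    dc_guid: str         # GUID of the DC where the write originated

    def stamp(self):
        """Tuple compared left to right: version, then time, then GUID."""
        return (self.version, self.timestamp, self.dc_guid)


def resolve(a, b):
    """Return the write that every DC should keep."""
    return a if a.stamp() >= b.stamp() else b


# Two administrators change the same telephone number on different DCs.
write_on_dc1 = AttributeWrite(
    "555-0100", 2, datetime(2024, 1, 1, 9, 0, 0), "aaaa-0001"
)
write_on_dc2 = AttributeWrite(
    "555-0199", 2, datetime(2024, 1, 1, 9, 0, 1), "bbbb-0002"
)

winner = resolve(write_on_dc1, write_on_dc2)
print(winner.value)
```

Because every DC applies the same comparison to the same pair of stamps, all of them converge on the same value regardless of the order in which the updates arrive.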
Another issue is the name collision. To understand this, we need to look at the user object creation process. When a user object is created, its Relative Distinguished Name (RDN) must be unique within its parent OU, and its sAMAccountName attribute must be unique within the domain. If either rule is broken, an error message prompts you to change the name of the user object. But it is quite possible for two people on different DCs to create an object with the same RDN within an OU, and an identical sAMAccountName within a domain, at precisely the same time. In such cases AD lets each user object be created with no error message. But when replication commences, ambiguities arise. To distinguish one object from the other, the replication system appends the GUID of the first-created object to that object’s name, making the name unique. Thus name collisions are resolved.
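The renaming step can be sketched as follows. This is a simplified illustration and not the exact format Active Directory writes: the `DirectoryObject` class, the `created` ordering field and the `CNF:` separator shown here are assumptions made for the example, and the choice of renaming the first-created object follows the description above.

```python
import uuid
from dataclasses import dataclass


@dataclass
class DirectoryObject:
    rdn: str
    parent: str
    guid: uuid.UUID
    created: int  # hypothetical creation order; lower means earlier


def resolve_name_collision(a, b):
    """Rename the earlier object if both share an RDN under one parent."""
    if (a.rdn, a.parent) != (b.rdn, b.parent):
        return a, b  # no collision, nothing to do
    older = a if a.created <= b.created else b
    # Appending the object's own GUID guarantees a unique name.
    older.rdn = f"{older.rdn} CNF:{older.guid}"
    return a, b


first = DirectoryObject(
    "CN=John Smith", "OU=Sales",
    uuid.UUID("11111111-1111-1111-1111-111111111111"), created=1,
)
second = DirectoryObject(
    "CN=John Smith", "OU=Sales",
    uuid.UUID("22222222-2222-2222-2222-222222222222"), created=2,
)

resolve_name_collision(first, second)
print(first.rdn)
print(second.rdn)
```

After resolution both objects still exist, and an administrator can decide whether to merge, rename or delete the one that carries the GUID suffix.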