Home > Windows Tips > > When an Active Directory design goes bad -- and how to fix it
Win IT Tips:
EMAIL THIS
 TIPS & NEWSLETTERS TOPICS 


When an Active Directory design goes bad -- and how to fix it


Gary Olsen, Contributor
08.29.2006
Rating: -3.82- (out of 5)


Expert advice on Active Directory and Group Policy
Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


In a previous article by Gary Olsen, Best practices for Active Directory replication topology design, he discussed some basic principles in regard to designing Active Directory replication topology. This week, he recalls a case study in which a poor design eventually resulted in serious replication problems and caused the topology to be re-designed and implemented.

The company in this week's case complained that after a domain controller failed and became unavailable, the replication didn't flow as the designers had planned. We were able to work with them to figure out what went wrong and how to fix it -- all on the fly.

In the sidebar, you'll find the best practices identified in last week's article which we will use here to analyze this company's Active Directory replication topology to help resolve the problem.

Identifying where things went wrong

Figure 1 shows a graphical illustration of the replication topology at this corporation.

[IMAGE]

We examined the topology, and it was apparent that the designers intended to create a multi-tier replication topology, forcing the sites at the lower bandwidth locations (tier 3) to replicate to regional hubs (level 2), which, in turn, replicates to two "core" hubs (tier 1). Unfortunately, when they implemented this design, they failed to observe best practice number four, and if they would have diagrammed the flow (best practice number five), they most likely would have seen that the design just didn't make sense.

Figure 1 only shows a part of the whole topology. The company in this case collected the Northwest sites -- Calgary, Sacramento, Portland, Boise and Seattle -- and put them into one site link. The company repeated this for other regional groupings of sites -- defining each group into a site link, so there was a Southwest link, Southeast link and a Northeast link as well.

The core link contained the company's two hub sites in Boston (East) and Chicago (West). The pr


Digg This!    StumbleUpon Toolbar StumbleUpon    Bookmark with Delicious Del.icio.us    Add to Google


RELATED RESOURCES
2020software.com, trial software downloads for accounting software, ERP software, CRM software and business software systems
Search Bitpipe.com for the latest white papers and business webcasts
Whatis.com, the online computer dictionary


oblem with this topology is that it violates best practice number three: There was no topology defined to connect the three tiers together. That is, you could replicate within the sites in the Northwest link, and between the sites in the Southwest link, and even in the core site links, but they couldn't replicate from, say, Seattle to Los Angeles. This is a perfect example of what would cause the infamous Event 1311 in the Directory Services event log.

Fixing the design can be trial and error

To make this work, the designers decided to connect the links by putting regional hub sites that were in the third tier (such as the Northwest link) in the second tier (West) link as well. They also added the appropriate core site link to the second tier site link (see Figure 2). Note that Sacramento, the regional hub, is now part of two links -- the West link and the Northwest link. The first and second level tiers are connected by putting the Chicago site in the West link as well as in the core link. That satisfies best practice number three and connects the topology together.

[IMAGE]

Although it still violates best practice number four, it did work. It worked until the DC in the Sacramento site had a disk drive fail, which caused them to rebuild the Sacramento DC. For reasons we still don't fully understand, since the KCC had used Sacramento to connect to the West link, and since we had full bridging turned on, the KCC decided to elect another site to replace Sacramento in the topology. For whatever reason, it picked Portland, and replication worked.

Meanwhile, the Sacramento DC comes back online, but the KCC still insists on using Portland. Portland isn't a Level 2 site and thus doesn't have the bandwidth that Sacramento has, so performance suffered. They wanted to get it back the way it was. After trying unsuccessfully to repair it, they rebuilt the Sacramento DC, and the KCC recreated the routing to funnel through Sacramento. All was well until the DC in Los Angeles went out. Los Angeles was the regional hub for the Southwest link. Same thing happened -- with LA gone, the KCC elected the Las Vegas site as the Southwest link "hub." And when LA came back online, the routing didn't change and performance issues arose. Just like for the Northwest link, they rebuilt the Los Angeles DC and replication was routed successfully through Los Angeles again. In addition, they found some third-level sites that had connections created to them from all other sites in the AD. This was indeed strange behavior.

Do I have to rebuild the DC every time?

Now they are wondering if every time a DC in a hub site goes down, would they have to rebuild it? This was not acceptable.

After diagramming the topology (Figure 2), we thought it should work. However, I'd never seen site links created with many sites in a single site link like they had done. I reasoned that we had violated best practice number eight. That is, by putting several sites in a site link, when a failure occurred and the KCC had to rebuild the topology, it picked another site to hook to the next tier. I asked an expert on replication at Microsoft if perhaps it was picking the DC based on the GUID. He was as baffled about this as I was but thought that was a good guess anyway.

The solution, of course, was to obey best practice number four and not allow any site links to have more than two sites. That makes the topology look like Figure 3, with the red and blue lines representing site links and the numbers representing the site link cost. I advised the administrator to delete all site links except the core and to create new site links, each with only two sites, then connect the regional hubs. That is, connect tier 3 hubs to their corresponding tier 2 hubs, etc.

[IMAGE]

Lessons learned

The administrator asked me if that would affect his production AD environment. I told him this action would definitely affect the AD … it would fix the problem! He actually just reconstructed the links in the Northwest site link, creating three new site links, each with Sacramento in it. So that gave him Sacramento-Portland, Sacramento-Calgary and Sacramento-Seattle links. It indeed fixed the problem. They tested it by taking the Sacramento DC out and putting it back in, and it routed replication properly. He then rebuilt the rest of the topology.

The lessons to be learned here are:

We were able to fix a bad design literally on the fly without affecting our production AD. Of course we used common sense and rebuilt it over the weekend, but it was not a significant outage and the administrative work was minimal.

AD replication topologies are not set in stone. It is best to fix them and eliminate the problems than to live with the problems.

Gary Olsen is a systems software engineer for Hewlett-Packard in Global Solutions Engineering. He authored Windows 2000: Active Directory Design and Deployment and co-authored Windows Server 2003 on HP ProLiant Servers.

Rate this Tip
To rate tips, you must be a member of SearchWinIT.com.
Register now to start rating these tips. Log in if you are already a member.




DISCLAIMER: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.



Windows Technology Updates, Reviews and Solutions

Laptop Discounts with free coupon codes, huge savings at Notebook Review

HomeNewsTopicsITKnowledge ExchangeTipsAsk the ExpertsMultimediaWhite PapersIT Downloads
About Us  |  Contact Us  |  For Advertisers  |  For Business Partners  |  Site Index  |  RSS
SEARCH 
TechTarget provides technology professionals with the information they need to perform their jobs - from developing strategy, to making cost-effective purchase decisions and managing their organizations' technology projects - with its network of technology-specific websites, events and online magazines.

TechTarget Corporate Web Site  |  Media Kits  |  Site Map




All Rights Reserved, Copyright 1999 - 2009, TechTarget | Read our Privacy Policy
  TechTarget - The IT Media ROI Experts