09/23/2014
New Design for Redundant Routers at sites
Description

A new design for fully routed, massively redundant, self-healing wireless networks

 

We run a rather sizeable wireless ISP network in and around Denver, CO. Our customers are all commercial concerns where reliability is paramount; to that end, we started building all our sites years ago using a fully routed architecture with OSPF to handle failover if a link or site fails. We originally did this using Compaq 486 PCs running Linux back in 1999. However, as the network grew and the number of redundant backhaul links grew (several sites have more than 6 backhaul links to other sites), the complexity of the multiple PCs at each site became unmanageable.

 

So the Mark-II design was to use an industrial mini-ITX computer, still running Linux (CentOS 5.7 in this case), with 2 Sun Quad Ethernet PCI cards giving a total of 9 FE ports per router. Since these machines were all DC powered, we converted the sites to DC power at the same time, with battery banks running everything and the aging UPSs removed. This has been a very reliable system over the years, with the exception of hard disk failures of the original drives, which we fixed by replacing them with solid state drives.

ITXrouter1.jpg

However, as the network has grown, the need for speeds beyond what the Sun FE ports can provide, coupled with the loss of suppliers for the ITX chassis, made us look for another solution. When Ubiquiti came out with their EdgeRouter line, it looked promising. The subsequent ER-PoE units sealed the deal.

 

Existing Network Design – Links and Broadcast

 

The design of each site is identical except for the number of backhaul links used. Each site has a dedicated RFC 1918 10.x.y.0/24 network for all customer links, which are reached by standard M5 UBNT CPE radios running in Router mode. Each customer's static /30 or /29 routable IP network is entered into Quagga on the site router, and OSPF then distributes and manages the routes upstream back to the core POPs where we join the Internet. Point-to-point links between sites are each given their own /24 network, again in the 10.x.y.0 RFC 1918 space. This makes it easy to tell which radio/link you are looking at, and keeps things organized in a rapidly growing network. It uses a lot of 10 addresses, but since there's 16 million of them to use...
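
To make the Quagga side of this concrete, here is a minimal sketch of the idea only; the addresses and the customer prefix are made up for illustration, and the real configs carry more than this.

    ! zebra.conf - a customer's static /30, pointing at their CPE radio
    ip route 203.0.113.8/30 10.20.7.50
    !
    ! ospfd.conf - advertise the site networks and redistribute the customer statics
    router ospf
     ospf router-id 10.20.7.1
     network 10.20.7.0/24 area 0.0.0.0
     network 10.30.7.0/24 area 0.0.0.0
     redistribute static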

 

At the core of the network, we have multiple routers (running CentOS on HP DL360 Blades) running OSPF toward the wireless network, and BGP toward the Internet to our upstream carriers. We have redundant routers in place, and since routing is logical rather than physical we can handle any failures there. Switching is handled by HP Gig and FE switches with dual power rails, again for redundancy.
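
The split of duties on a core box looks roughly like this in Quagga terms; the ASN, addresses, and neighbor below are placeholders rather than our real ones, and originating a default route into OSPF is just one typical way to hand Internet reachability down to the wireless side.

    ! bgpd.conf - talk BGP to an upstream carrier (placeholder ASN and addresses)
    router bgp 64512
     bgp router-id 198.51.100.1
     neighbor 198.51.100.254 remote-as 64513
     neighbor 198.51.100.254 description upstream-carrier
     network 198.51.100.0/24
    !
    ! ospfd.conf - hand a default route down into the wireless network
    router ospf
     default-information originate always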

 

So customer networks are routed through the internal 10 networks on the backhaul links to the site the customer works off of, and terminated in the customer radio. At each site, the local router's Broadcast Network port connects to a switch which drives the Sector APs. A site can have 6 or more Sector APs depending on its coverage area and specific needs. In the larger sites, we used DLI PoE mid-span injectors to power both the backhaul link radios and the sector antennas; in smaller sites this is done with passive 24V panel blocks or with ToughSwitch5 PoEs running directly off 24V DC for the sector radios.

 

Redundant Site Design

 

One of the issues with the previous ITX-based routers was the single point of failure they created. If a backhaul link failed, OSPF would automatically route around the link. However, if the router itself failed, the entire site and all its links to other sites would fail as well. While this seldom happened, it was still a possibility.

 

Another issue is one inherent to Linux – the kernel and Quagga get confused if they see ARP replies coming in on more than one interface. This can happen if you have two parallel links between two sites, and both connect to single routers at both ends. In most of our cases, this wasn't an issue, as our redundant backhaul links are designed in rings across multiple sites, but in several locations having this capability would be very nice.

 

In a couple of cases, we tried using 2 routers and splitting the links between them. This worked, but if a router failed, half the links would still fail. But by putting the broadcast network on both routers, with a switch connecting both routers and the local broadcast APs, at least the customers would stay up if one router failed.

 

About a year ago, it became obvious that the current design was reaching the end of its useful life. We were unable to find a source of mini-ITX chassis that would accommodate two PCI-X cards, and the DC power supplies were starting to fail – bad caps – and though it wasn't hard to repair them, we could see that something new was needed.

 

In looking at the problem, I decided to flip the design on its head. What was the best way to have the site stay up even if a router failed? If a link failed? The ER-Lite looked promising as the router component, but how to add the redundancy? It occurred to me that if every single link had its own ER-Lite, and the ERs all talked to each other using OSPF to sort things out through a local switch domain, then any single one could fail and the others would keep things going. But what about the broadcast network? If the router handling that failed, all the customers would go down. So why not add a second one, connecting them and the Sector APs with another switch? Then if the broadcast router failed, the second one could take over.
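
In EdgeOS terms, one of these per-link routers only has to do something like the following; the addresses are placeholders and this is just a sketch of the concept, not a full config.

    # eth0 faces the backhaul radio; eth1 faces the local switch domain
    set interfaces ethernet eth0 address 10.30.7.1/24
    set interfaces ethernet eth1 address 10.21.7.4/24
    set protocols ospf parameters router-id 10.21.7.4
    set protocols ospf area 0.0.0.0 network 10.30.7.0/24
    set protocols ospf area 0.0.0.0 network 10.21.7.0/24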

 

This still left a single point of failure (actually two, with the broadcast switch), which was the interior switch linking the routers together. But what if we doubled up on those too? After all, the ER-Lite has 3 ports: one for the link radio, and the other two for two separate switch domains linking to all the other routers. The resulting concept looks like this:

 

RedundantSiteConcept.jpg

 

But this still left powering the link radios up to some outboard device, which was something else to manage and worry about. Then I remembered the ER-PoEs. If they were used as the routers, they could power the radios. But they also have a 3-port switch built in – could that be used for the interior switch domains?

 

It turns out there's a simple way to do this: chain each router's switch (ports 2, 3, and 4) together with the other routers' eth1 ports, and vice versa, leaving the radio on eth0 so it can be powered from there. And finally, the two Broadcast Routers can be connected to the two switch-domain chains on eth0 and eth1, with their two switches connected together and used to feed and power the broadcast APs as well.
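
Concretely, an ER-PoE link router ends up looking something like this in the EdgeOS CLI; the addresses are placeholders, and the real configs carry more detail than this sketch.

    # eth0 feeds and powers the backhaul radio
    set interfaces ethernet eth0 address 10.30.7.1/24
    set interfaces ethernet eth0 poe output 24v
    # eth1 joins the neighboring router's switch chain (interior domain B)
    set interfaces ethernet eth1 address 10.22.7.4/24
    # eth2-eth4 form switch0, this router's segment of interior domain A
    set interfaces switch switch0 address 10.21.7.4/24
    set interfaces switch switch0 switch-port interface eth2
    set interfaces switch switch0 switch-port interface eth3
    set interfaces switch switch0 switch-port interface eth4
    # OSPF ties the link and both interior domains together
    set protocols ospf parameters router-id 10.21.7.4
    set protocols ospf area 0.0.0.0 network 10.30.7.0/24
    set protocols ospf area 0.0.0.0 network 10.21.7.0/24
    set protocols ospf area 0.0.0.0 network 10.22.7.0/24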

 

The final design for a small site (4 backhaul links, 4 sectors) looks like this:

 

FinalRedundSiteConcept.jpg

 

And it can be expanded to be as big as you need by just stacking more ER-PoEs and adding more links:

 

ExpandedSiteOdd.jpg

 

If the number of links isn't even, you just add a switch in place of one of the ERs to close the switch domain. With this design, any single router can fail and everything else will keep going; in many cases, more than one unit can fail and the site will still operate.

 

This is a complex series of networks, and one key element is defining the IP structure used to keep it all straight. In our networks, I use the 3rd octet (the y in 10.x.y.0) to define each network or site broadcast domain. For the two intermediate switch domains, I keep that site broadcast octet the same and change the 2nd octet by 1 and 2 respectively, so the intermediate IPs of the ports on the routers at each site are easy to remember.

For the routers themselves, I use .1 and .2 for the 4th octet on each point-to-point router (a convention we've used for years) and .2 and .3 for the switch IPs on the broadcast routers; .1 (the CPE WAN gateway address) floats between the two routers as needed. The convention for the rest of the ports is similarly logical: the interior domain IPs start at .2 and .3 for the broadcast routers (so a router's 4th octet is the same across eth0, eth1 and switch0), and increment up from there for the link routers. Here is a diagram of an actual site we have operating this way today; a made-up worked example of the numbering follows the diagram.

 

HudsonSiteDiag.jpg
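
To put concrete (made-up) numbers on the convention: suppose a site's broadcast network is 10.20.7.0/24. The plan then works out to:

    Site broadcast domain:      10.20.7.0/24
      Floating CPE gateway:     10.20.7.1  (moves between the two broadcast routers)
      Broadcast routers A / B:  10.20.7.2 / 10.20.7.3
    Interior switch domain A:   10.21.7.0/24  (2nd octet +1)
      Broadcast routers A / B:  10.21.7.2 / 10.21.7.3
      Link routers:             10.21.7.4, 10.21.7.5, ...
    Interior switch domain B:   10.22.7.0/24  (2nd octet +2)
      Broadcast routers A / B:  10.22.7.2 / 10.22.7.3
      Link routers:             10.22.7.4, 10.22.7.5, ...
    Each backhaul link:         its own 10.x.y.0/24, with .1 and .2 on its two routers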

 

This design looks complicated, but once you understand it, it really simplifies everything at the site. The number of Ethernet interconnect cables goes way down; all the radios (except for AirFibers) are powered by the routers, as are the sector APs, unless you have more than 4, in which case you just add a ToughSwitch or the like. And since the routers themselves are DC powered, the whole thing runs off a DC battery plant with regulated 24V output, eliminating the need for UPSs.

 

Even the size of the entire thing is smaller – we're currently working on a stacking enclosure system that will hold 4 ER units (or TSs) in a 3 RU high aluminum shelf. It can be expanded to be as big as you need, fits in an 8” deep NEMA cabinet, has fans and space for a power control / management unit, and can be fitted with rack ears to go in a 19” rack. My goal is to be able to put an entire site (with capacity for 6-8 sectors and 8-10 links, full monitoring and management, a DC power system, plus 48 hours of battery backup) in two 20x20x8 inch NEMA cabinets. It makes for a very tidy, efficient installation. Plus it just works.

stackable.jpg

 I'll get some pictures of the units operating sometime soon.

 

Jim

" How can anyone trust Scientists? If new evidence comes along, they change their minds! " Politician's joke (sort of...)
"Humans are allergic to change..They love to say, ‘We’ve always done it this way.’ I try to fight that. "Admiral Grace Hopper, USN, Computer Scientist
":It's not Rocket Science! - Oh wait, Actually it is... "NASA bumper sticker
":The biggest problem in tech I see right now is that most users don't want to do things that are hard. That doesn't bode well for the industry or the society.": (me. actually ;-)
Comments
by
on ‎09-23-2014 01:56 PM

wow!

 

that looks pretty cool.  It is more complicated/redundant than I have ever had to deal with.  I deal mostly with end-users, not the actual switching/routing infrastructure.

 

I would love to check out your setup sometime.  My folks live in the Springs, so I am always dropping by...

by Previous Employee UBNT-stig
on ‎09-23-2014 02:15 PM

Wow, very cool, but kinda makes my head hurt.  Think I'd better stick to writing router software & building little lab test networks and leave the production networks to the pros.

by
on ‎09-23-2014 04:55 PM

Here's a question: With this many routers and switches at each site, how are you dealing with security/feature/bug upgrades? @eejimm

by
on ‎09-23-2014 06:12 PM

Josh,

So far we haven't had to do more than update to v1.5 to get the OSPF routing to work correctly. One advantage of the design is that upgrades can be done literally any time (except on the Broadcast routers), since the redundant units will take over everything, so in that sense it's actually easier than for a single router. On the Broadcast units, we'll do them one at a time in a maintenance window, probably late on a Saturday night.

Since the routers are doing so little that's complex, and don't need any added features to do their job, I don't even know if we'll worry about upgrades much. Our existing ITX units are still running on a (technically) obsolete OS version, and there's no problem with that - from a security standpoint, the routers never hit the Internet, and there's no way for anyone from the outside to reach them; from a feature standpoint, what do we need to add? That OS version didn't have a problem with Heartbleed, and the ERs don't either. If some other security issue arises, we'll deal with it.

AC3 may provide a better answer, but (as we do now with the radios) I see no reason to even keep up with current versions since the existing versions do everything we need.   If there are things we might need in the future, we'll be able to deal with it any time the traffic in the system dies down.

Jim

by
on ‎09-23-2014 08:22 PM

Very cool.  Awesome for a backhaul-dominant site like you describe.  There are still single points of failure for the APs (dual really, ER + AP), but the APs only have a single Ethernet port, so there isn't much to be done about it.  You could also dedicate ERs to the APs like you have for the backhaul links, so that only 2 APs would have a single ER to fail.

 

This is where GPS sync (or other effective AP pairing) would be nice.  Have a pair of radios in same channel, same direction, same SSID.  Then you could lose an AP or ER or sometimes an AP and an ER and only have a little capacity drop.

by
on ‎09-23-2014 09:02 PM

I've thought about the AP backup issue a lot over the years.  In the last 5 years, I think we've had a total of 2 Rockets fail on sectors: once with lightning, once with a power issue (it went overvoltage and fried the radio, killing the Ethernet).  We've had problems a few times with cabling issues (we dodged the initial ToughCable debacle, fortunately) and everything's in pretty good shape right now.  The radios are just so reliable, I believe there's more return on time invested on the routing and DC power side of things.

Jim

by
on ‎09-23-2014 09:08 PM

Thanks for sharing Jim, this is a fantastic configuration!!!!

by
on ‎09-23-2014 09:09 PM

You could always run two radios on different frequencies (say a 5 GHz link with a backup 3.65 GHz link) and set up OSPF with path costs so one radio carries TX and the other carries RX. If a link fails, OSPF converges onto one radio. @eejimm 

by
on ‎09-23-2014 09:21 PM

I have a hard enough time finding people who understand basic routing. If I implemented something like this, I could never take a vacation.

 

Color-coded cables and backup configs work well enough for us.

 

 

by
on ‎09-23-2014 09:39 PM

@suge I know the feeling. We do OK in that regard, but if we tune up MPLS and OSPF soon like I plan to, it'll be tough... at least for a while.