Best Failover solution

busster8
Posts: 388
Member Since:
2006-06-25

Have a customer who intends to install a 100% SIP system and wants a failover solution. I realize that building a good server with redundant hard drives/power supplies/UPS/grounding is important, but in a worst-case scenario, what is the best solution for having a "standby" Trixbox?
I have read some blogs regarding clustering, but these blogs usually stop short of saying that the implementation is complete and working.
Xorcom has an interesting product with the dual USB ports, but it appears that this does nothing for SIP trunking, only T1/PRI.
Moreover, it would be best if the backup Trixbox were located elsewhere in the country, since the primary site is located in hurricane alley. Should the site need to close down, making the secondary site operational would be necessary. Keeping the two TBs in sync would be even more difficult with the second site being remote.
Mondo Rescue appears to be a great backup solution for Linux in general, but I was hoping for a seamless failover solution so that if a database change is made on the primary TB, the secondary is updated as well.

Ideas?



SkykingOH
Posts: 9540
Member Since:
2007-12-17
Redundant hosted trixbox

I sent you a private message with my contact information. My firm has a tier 1 data center in Cleveland, Ohio that would be ideal for your redundant system.

We can set up the configuration sync between the two trixbox servers.

Depending on how your provider handles registrations, the inbound calling failover may require manual intervention.

--

Scott

aka "Skyking"



rjsm2co
Posts: 49
Member Since:
2007-03-26
Redundant Trixbox Pro

You can try the failover solution in Trixbox Pro.
But you may still need some manual intervention.

Ricardo Saavedra
FtOCC Admin & Tech.



busster8
Posts: 388
Member Since:
2006-06-25
Failover - Server duplication

After a lot of thought, we have decided to create a primary and a secondary server, basically mirror images of each other.
Implementing database replication should keep the two MySQL databases in sync.
We should be able to keep the two voicemail systems in sync using rsync for the voicemail files, and the same for the Announcement files.
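
A minimal sketch of the rsync half of this plan, not a tested deployment. It assumes the voicemail spool lives at /var/spool/asterisk/voicemail and custom recordings at /var/lib/asterisk/sounds/custom (check both on your own install), and it uses a hypothetical standby host called tb-standby reachable over SSH. The rsync options shown (-a, -z, --delete, -e) are standard ones. By default it only builds the commands so you can inspect them first.

```python
import subprocess

# Assumed locations; verify them on your own trixbox install.
SYNC_DIRS = [
    "/var/spool/asterisk/voicemail/",
    "/var/lib/asterisk/sounds/custom/",
]
STANDBY = "root@tb-standby"  # hypothetical standby host


def build_rsync_cmd(path, standby=STANDBY):
    """Return an rsync command that mirrors `path` to the same path on the standby."""
    # -a preserves ownership, permissions and timestamps; -z compresses,
    # which helps over a WAN link; --delete removes files on the standby
    # that no longer exist on the primary.
    return ["rsync", "-az", "--delete", "-e", "ssh", path, f"{standby}:{path}"]


def sync_all(run=False):
    """Build one rsync command per directory and run them only if asked to."""
    cmds = [build_rsync_cmd(p) for p in SYNC_DIRS]
    if run:
        for cmd in cmds:
            subprocess.run(cmd, check=True)
    return cmds


for cmd in sync_all(run=False):
    print(" ".join(cmd))
```

Note that this is a one-way mirror from primary to standby. Because of --delete, anything recorded on the standby while it is active would be wiped by the next sync from the primary, so the direction has to be reversed deliberately after a failover.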

Any thoughts? Gotchas?



b14ck
Posts: 773
Member Since:
2009-03-03
trixbox PRO has this

trixbox PRO has this solution already implemented. It's called the 'live backup'. If you are interested in PRO, I'd recommend checking it out. It keeps two boxes continuously synced. You can, of course, accomplish this yourself using rsync (see the man page)!

--

Randall Degges
Lead Developer, RCI Telecommunications
projectb14ck - http://projectb14ck.org/ - Weblog



joshelson
Posts: 244
Member Since:
2006-12-07
Two areas of

Two areas of comments/considerations, from having done this project numerous times:

Before you go too far down any sort of road here, I'd recommend taking a close look at what business problem your customer is trying to solve with a "failover" solution. You need to weigh the risks of downtime against installation and maintenance complexity of the "clustered" environment.

No matter which way you go, you're adding complexity to what at this point is designed to be a single box standalone environment. Your lowest impact (and slowest recovery time) method is going to be to use a Mondo Rescue type solution. You note the obvious deficiencies there.

Beyond that, you need to cleanly delineate what kind of risk you are trying to mitigate with a failover solution. Are you trying to survive a single box failure? Are you trying to survive a whole site failure? Would it be of any business use to have a geographically dispersed telephony cluster if your office was hit with a major natural disaster? Does a single box with decent live hardware monitoring mitigate enough risk to make the complexity of cluster management unappealing? Are you carrier and network redundant? If not, does geographic redundancy buy you much?

Now to the technical components:

I'd suggest you do some additional investigation on the architecture of Asterisk / TB to understand what you're getting when you use MySQL clustering or rsync / csync. You need to understand exactly what is contained in the MySQL database from a Trixbox perspective and what isn't. The solution you describe, in itself, will not provide you failover in anything more than a data preservation sense. In fact, it's not going to give you much more than a clean box with a current FreePBX Backup/Restore file would.

Understand that Trixbox stores static configurations in the MySQL database, but does not store SIP registration information there. You're going to lose all that data when the failover happens. Additionally, if you have the primary and secondary boxes, how are you going to point the phones to the secondary server on failover? Are you going to use a VIP? If so, how do you hold that VIP role? Are you going to force an Asterisk reload on failover? You're going to have to set a very low re-registration interval on your endpoints to have anything resembling sub-minute failover (especially for inbound calling / SIP trunking) in this scenario.
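
To put rough numbers on the re-registration point, here is a back-of-the-envelope sketch. It assumes the worst case where a phone registered just before the primary died, so the standby does not learn where that phone is until its registration expiry runs out. The function names and the sample figures are illustrative, not measurements.

```python
def worst_case_inbound_outage(detect_s, takeover_s, register_expiry_s):
    """Upper bound in seconds before every endpoint is reachable for inbound calls.

    detect_s          -- time for the standby to notice the primary is gone
    takeover_s        -- time to claim the VIP and get Asterisk answering
    register_expiry_s -- registration expiry configured on the endpoints
    """
    return detect_s + takeover_s + register_expiry_s


def max_expiry_for_target(target_s, detect_s, takeover_s):
    """Largest registration expiry that still fits inside a failover target."""
    return max(0, target_s - detect_s - takeover_s)


# Illustrative figures only: 10 s detection, 5 s takeover.
print(worst_case_inbound_outage(10, 5, 3600))  # one-hour expiry
print(max_expiry_for_target(60, 10, 5))        # budget for a 60 s target
```

With those sample figures a one-hour expiry leaves some phones unreachable for up to 3615 seconds, while a 60-second target only leaves room for a 45-second expiry. Short expiries also mean more REGISTER traffic for the PBX to handle.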

I would highly recommend you consider Scott's offer - he gets great marks and hosting can produce great results if you design properly. You should also check out what Fonality does with TB Pro. If those meet your requirements, I don't see why you wouldn't go that route. It'd probably be easier than what you're trying to do.

I can tell you from A LOT of experience on this, if you're looking for on-site cluster with near instantaneous failover, your solution isn't going to be adequate - you'll pay dearly in lost hair and increased alcohol consumption. You'll need to consider Linux HA (Heartbeat + DRBD + a LOT of customization and solid network design) or a form of peering (with Asterisk Realtime) to get you there. The Linux HA has the advantage of being relatively well understood and at least somewhat manageable. The DUNDi Peering / ART solution scales beautifully and can survive lots of kinds of failures, but isn't compatible with Trixbox in the least at this point. If you're looking for help getting over the hump on this, let me know. I've installed mission critical accounts using Linux HA with 2+ years of uptime - with the cluster PBX surviving network core outages and hardware failures. All this requires a lot of careful planning to make worthwhile.
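
For anyone unfamiliar with the heartbeat idea, the core decision can be sketched as a toy model. This is not Heartbeat's actual configuration or code, only an illustration of "promote the standby after N consecutive missed heartbeats". The class name and the threshold of 3 are made up for the example.

```python
class StandbyMonitor:
    """Toy model of a standby node deciding when to take over."""

    def __init__(self, dead_after=3):
        self.dead_after = dead_after  # consecutive misses before takeover
        self.missed = 0
        self.active = False           # True once this node has taken over

    def tick(self, heartbeat_received):
        """Process one heartbeat interval; return True if this tick triggers takeover."""
        if self.active:
            return False              # already promoted, no automatic fail-back
        if heartbeat_received:
            self.missed = 0           # any heartbeat resets the counter
            return False
        self.missed += 1
        if self.missed >= self.dead_after:
            self.active = True
            return True
        return False


monitor = StandbyMonitor(dead_after=3)
history = [True, False, False, True, False, False, False, False]
print([monitor.tick(h) for h in history])
```

The single late heartbeat in the middle resets the counter, so takeover only fires on the third consecutive miss. A real deployment also has to deal with split-brain, where both nodes believe they are active because only the link between them failed, which is a large part of the "solid network design" mentioned above.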

Josh

--

FluentStream Technologies - Integrate * Communicate



bubbapcguy
Posts: 3774
Member Since:
2006-06-02
Best Failover solution

What Scott and Josh said.
I look to the provider to provide failover in case a server goes down.
As for a "hot swap" I always say buy two of everything to have on hand.
The cost of a true failover system will many times outpace any savings you were looking for.

You could also look at a VM, as you are all SIP... I know.. I know..
I am starting the whole VM thing, but it does work and works very well.

With a VM you can move your image around just about at will.
When Katrina wiped the coast off the map, the companies there using our services never missed a call in or out.
The images were moved to a backup data center before Katrina, as well as backed up ...DURING.



johnjces
Posts: 302
Member Since:
2007-10-13
bubbapcguy, are you

virtualizing on Linux or on a Windows OS?

If Linux, which flavor?

Thanks!

John



joshelson
Posts: 244
Member Since:
2006-12-07
If you're seriously going to

If you're seriously going to do virtualization, your "host" needs to be a hardware-assisted hypervisor. The two most mature today are VMware ESX / ESXi and Citrix XenServer.

VMware ESX gets expensive but allows unmodified Linux guests to run on it. If you get the free ESXi, you'll have to pay for the live migration "VMotion" capabilities. XenServer is totally free, including the XenMotion live migration capability. You need to use a modified paravirtualization kernel, however. The install can't be done the "normal" way and is a bit tricky, though I've written elsewhere on how to do it. I can also supply a template for XenServer if that's desirable.

Be aware here - because most configurations of Trixbox run as a sort of B2BUA, you are carrying the full load of the RTP stream. Hypervisors time slice between VMs and the host OS to provide those virtualization services. I'm not clear on how far up you can scale, though I'd love to hear experiences from people that have taken it higher than the 15-20 user range. I also had heard there was a project to expose a timing source (ztdummy or the like) in the hypervisor and allow guests to access it. I'm not sure what became of it. At minimum, you need to use resource pooling and optimization to ensure that your vPBX has the absolute highest priority.
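
To give a feel for what "carrying the full load of the RTP stream" means for a guest VM, here is a rough estimate. It assumes G.711 at 20 ms packetization (160 bytes of payload per packet), both call legs anchored on the PBX, and IPv4 with no Ethernet framing counted. The function name is made up for the example.

```python
def rtp_load(calls, ptime_ms=20, payload_bytes=160, legs_per_call=2):
    """Estimate packets/s and kbit/s the PBX handles for `calls` concurrent calls.

    Each leg carries one inbound and one outbound RTP stream, so a
    two-leg call anchored on the PBX means four streams.
    """
    streams = calls * legs_per_call * 2
    packet_bytes = payload_bytes + 12 + 8 + 20  # RTP + UDP + IPv4 headers
    pps = streams * 1000 / ptime_ms             # packets per second, all streams
    kbps = pps * packet_bytes * 8 / 1000        # kilobits per second at IP level
    return pps, kbps


print(rtp_load(1))   # one call
print(rtp_load(20))  # twenty concurrent calls
```

Under those assumptions one call is 200 packets per second and 20 concurrent calls is 4000 packets per second at about 6.4 Mbit/s. The bandwidth is modest; the concern in a VM is the steady stream of small, timing-sensitive packets competing for CPU slices.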

In any event, bubbapcguy is right about VMs and HA. You potentially introduce a SAN into the mix, which adds cost and complexity (or consider OpenFiler). But you'll gain a whole bunch of additional flexibility for failover / HA. It's far easier to have a single disk file representing the whole of the system than it is to sync individual components within the filesystem.

Josh

--

FluentStream Technologies - Integrate * Communicate



bubbapcguy
Posts: 3774
Member Since:
2006-06-02
VM

Once again Josh has it covered.
We have used VMware ESX the longest and it works great.
OpenVZ is my next choice (just cuz I like the CMD line).

But even the freebie ESXi works fine (I have seen up to 30 extensions / 20,000 minutes a month on a beefy box).
Most host boxes are CentOS and FC (I just redid an older CentOS 4.7 box just cuz I had a spare to move the VMs around on).


