Network service outages happen. It’s not a matter of if but when. Cloud platforms and content delivery networks (CDNs) with 100% uptime SLAs aren’t immune. They experience outages just like everything else.
The question is: what do you do when one of your network services goes down? Will the lack of redundant services knock you offline? Or will you fail over to another provider, maintaining a seamless user experience? On the back end, how will that failover process work? Will it be automated or manual?
Most midsize and large organizations have redundant systems in place to help them survive an outage. What they may or may not have in place is the automated mechanism that redirects traffic to those redundant systems when a core service goes down.
IBM NS1 Connect Filter Chain™ technology uses the power of DNS to automatically reroute traffic between service providers when there’s a network service disruption. With a few basic rules in place, NS1 Connect monitors your network’s status and switches endpoints as needed. You set the rules and the priorities up front; everything after that happens automatically.
On the NS1 platform, filter chain configurations are applied to individual records within DNS zones. Filter chains determine how NS1 handles queries against each record, specifically which answers to return. Each filter chain uses a unique logic to process queries. You can create combinations of filters to achieve a specific outcome based on your operational or business needs.
Of course, not everyone wants to direct failover traffic in the same way. So, we’ve put together a quick guide on how to build active-active, active-passive and manual failover systems by using filter chains.
Active-active failover
In this use case, NS1 or third-party data sources monitor the status of individual endpoints in your application delivery infrastructure. When the data indicates an outage on one system, NS1 automatically routes traffic to the secondary systems you choose. It’s called “active-active” because those secondary systems are probably up and running as part of your load balancing system anyway. When there’s an outage in one system, NS1 simply rebalances the load toward the already active systems.
The first filter in the chain is “Up”. This filter tells the system whether the service provider’s endpoint is operational or not.
The second filter in the chain is either “Shuffle” or “Weighted Shuffle”. If the “Up” filter returns a “false” answer for any endpoint, this filter automatically distributes traffic to the other providers. Shuffle distributes traffic randomly, while Weighted Shuffle distributes it based on weights you provide.
Finally, specify how many answers you want DNS to provide to inbound queries. RFC 1912 requires that only one answer should be returned for every CNAME query. The “Select First N” filter allows you to specify the number of answers that are returned to the requesting client, but the default should be one.
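The three steps above (health check, shuffle, answer limit) can be sketched as a small pipeline. This is a minimal illustration in plain Python, not NS1’s implementation or API: the `Answer` class, the function names and the example hostnames are all assumptions made for the sketch.

```python
import random
from dataclasses import dataclass


@dataclass
class Answer:
    """One candidate DNS answer (hypothetical model, not an NS1 type)."""
    target: str          # hostname or IP handed back to the client
    up: bool = True      # health flag, normally set by a monitor or data feed
    weight: float = 1.0  # relative share of traffic for the weighted shuffle


def up_filter(answers):
    """Step 1: drop answers whose endpoint is marked down."""
    return [a for a in answers if a.up]


def weighted_shuffle(answers, rng):
    """Step 2: random order, with heavier answers more likely to come first."""
    pool = list(answers)
    ordered = []
    while pool:
        pick = rng.choices(pool, weights=[a.weight for a in pool], k=1)[0]
        pool.remove(pick)
        ordered.append(pick)
    return ordered


def select_first_n(answers, n=1):
    """Step 3: keep only the first n answers."""
    return answers[:n]


def resolve(answers, rng, n=1):
    """Run the three filters in order and return the surviving answers."""
    return select_first_n(weighted_shuffle(up_filter(answers), rng), n)


# Example: cdn-a is down, so only cdn-b or cdn-c can be returned.
rng = random.Random(42)
endpoints = [
    Answer("cdn-a.example.com", up=False, weight=3),
    Answer("cdn-b.example.com", weight=2),
    Answer("cdn-c.example.com", weight=1),
]
print([a.target for a in resolve(endpoints, rng)])
```

In this sketch an endpoint that is marked down simply never reaches the shuffle, so its share of traffic is spread across the remaining endpoints in proportion to their weights.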
Active-passive failover
As in the active-active use case, NS1 or third-party data sources monitor the status of your application delivery infrastructure and route traffic to secondary systems in the event of a primary system outage. The difference here is that the secondary systems are not already handling traffic; they’re only spun up when needed as a redundant option.
As in the previous example, the first filter in this chain is “Up”. Drawing from monitoring data, NS1 figures out which of the underlying services are online.
The second filter in this chain is “Priority”. This filter creates a logic that prioritizes active systems over passive or backup systems. If the higher priority answers are available, they sort to the first position on the list of possible answers. If not, NS1 continues down the priority list until it finds an available resource.
Finally, “Select First N” dictates the number of answers to send. In this case, you’d want it to send a single answer.
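A rough sketch of this health-check, priority and answer-limit sequence follows. It is an illustration only: the `Answer` class, the hostnames and the convention that a lower number means a higher priority are assumptions of the sketch, not a description of NS1’s internals.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    """One candidate DNS answer (hypothetical model, not an NS1 type)."""
    target: str
    priority: int    # assumed convention: lower number = preferred
    up: bool = True  # health flag, normally set by a monitor or data feed


def up_filter(answers):
    """Drop answers whose endpoint is marked down."""
    return [a for a in answers if a.up]


def priority_sort(answers):
    """Order the remaining answers so the most preferred comes first."""
    return sorted(answers, key=lambda a: a.priority)


def select_first_n(answers, n=1):
    """Keep only the first n answers."""
    return answers[:n]


def resolve(answers, n=1):
    """Health check, then priority order, then answer limit."""
    return select_first_n(priority_sort(up_filter(answers)), n)


endpoints = [
    Answer("backup.example.com", priority=2),
    Answer("primary.example.com", priority=1),
]
print([a.target for a in resolve(endpoints)])  # primary while it is healthy

endpoints[1].up = False                        # simulate a primary outage
print([a.target for a in resolve(endpoints)])  # falls through to the backup
```

Because the health check runs before the priority sort, the backup is only ever returned when every higher-priority answer has been removed.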
Manual failover
Sometimes you want to make failover decisions only after you know more about the situation. In these cases, the filter chain is the implementation mechanism that you use once you’ve decided where you want traffic to go. Instead of pointing a data feed to NS1, you’ll manually flip the filter on when it’s needed by using the active-passive logic.
The first filter in this chain is “Up”, with the difference here that you manually define which services are up and down (instead of a data feed doing that for you).
The second filter in this chain is “Priority”, which ranks active systems ahead of passive or backup systems. If the higher priority answers are available, they sort to the first position on the list of possible answers. If not, NS1 continues down the priority list until it finds an available resource.
Finally, “Select First N” dictates the number of answers to send. In this case, you’d want it to send a single answer.
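The only change from the automated case is who sets the health flag. The sketch below models that with a hypothetical `set_status` helper that an operator would call by hand; the dictionary layout, hostnames and function names are assumptions for illustration and do not correspond to any NS1 API.

```python
def set_status(answers, target, is_up):
    """Operator action: manually mark one endpoint up or down."""
    for answer in answers:
        if answer["target"] == target:
            answer["up"] = is_up
            return
    raise KeyError(f"unknown endpoint: {target}")


def resolve(answers, n=1):
    """Health check, priority order (lower number first), answer limit."""
    live = [a for a in answers if a["up"]]
    live.sort(key=lambda a: a["priority"])
    return [a["target"] for a in live[:n]]


endpoints = [
    {"target": "primary.example.com", "priority": 1, "up": True},
    {"target": "backup.example.com", "priority": 2, "up": True},
]

print(resolve(endpoints))  # primary is served by default

# After investigating an incident, the operator decides to fail over.
set_status(endpoints, "primary.example.com", False)
print(resolve(endpoints))  # backup is served

# Once the primary is confirmed healthy, the operator fails back.
set_status(endpoints, "primary.example.com", True)
print(resolve(endpoints))
```

Nothing changes in the sketch until `set_status` is called, which mirrors the point of the manual approach: a person, not a feed, makes the decision.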
Multi-cloud or multi-CDN availability
In the “active-active” scenario above, the filter chain uses a simple up/down metric to steer traffic. However, sometimes service availability is more nuanced. For example, services sometimes experience regional outages that result in poor service quality: while the service as a whole is technically “up”, it may not be performing at optimal capacity. This filter chain allows you to add some nuance to what’s considered “up”, using NS1 Connect’s advanced analytics tool as the data source.
The first filter in this chain is “Pulsar Availability Threshold”. This filter allows you to set a percentage value that determines whether a service is used, based on its availability metrics.
The second filter in the chain is “Weighted Shuffle”, which distributes traffic to the providers that meet the definition of “available” from the first filter. Traffic is distributed based on weights that you provide.
The third filter is “Pulsar Performance Sort”, which takes the weighted distribution from the previous filter and directs traffic to the fastest available service, eliminating low-performing services based on a threshold you define.
Finally, “Select First N” dictates the number of answers to send. In this case, you’d want it to send a single answer.
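The four-step chain above can be approximated as follows. This is one possible reading of the description, written as a standalone sketch: the availability and latency figures are made-up sample data, the function and field names are hypothetical, and the performance step is modeled as a simple cutoff relative to the fastest answer, which may differ from how Pulsar actually scores services.

```python
import random


def availability_threshold(answers, min_pct):
    """Keep answers whose measured availability meets the threshold."""
    return [a for a in answers if a["availability_pct"] >= min_pct]


def weighted_shuffle(answers, rng):
    """Random order, with heavier answers more likely to come first."""
    pool = list(answers)
    ordered = []
    while pool:
        pick = rng.choices(pool, weights=[a["weight"] for a in pool], k=1)[0]
        pool.remove(pick)
        ordered.append(pick)
    return ordered


def performance_cutoff(answers, max_extra_ms):
    """Drop answers much slower than the fastest one (assumed behavior)."""
    if not answers:
        return []
    best = min(a["latency_ms"] for a in answers)
    return [a for a in answers if a["latency_ms"] - best <= max_extra_ms]


def resolve(answers, rng, min_pct=95.0, max_extra_ms=30.0, n=1):
    """Availability check, weighted shuffle, performance cutoff, answer limit."""
    kept = availability_threshold(answers, min_pct)
    kept = weighted_shuffle(kept, rng)
    kept = performance_cutoff(kept, max_extra_ms)
    return [a["target"] for a in kept[:n]]


# Sample data for illustration only.
cdns = [
    {"target": "cdn-a.example.com", "availability_pct": 80.0, "latency_ms": 35.0, "weight": 1},
    {"target": "cdn-b.example.com", "availability_pct": 99.0, "latency_ms": 40.0, "weight": 2},
    {"target": "cdn-c.example.com", "availability_pct": 99.5, "latency_ms": 200.0, "weight": 1},
    {"target": "cdn-d.example.com", "availability_pct": 98.0, "latency_ms": 55.0, "weight": 1},
]
print(resolve(cdns, random.Random(7)))
```

With this sample data, cdn-a is excluded for low availability and cdn-c for being far slower than the best remaining option, so traffic is shared between cdn-b and cdn-d according to their weights.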
For more information on how to use filter chains to improve performance and resilience, decrease costs and more, see the resource below.
Guard against outages with resilient, redundant network services