Multihoming

From Wikipedia, the free encyclopedia

Multihoming is a technique to increase the reliability of the Internet connection for an IP network. The adjective "multihomed" is typically used to describe a customer network, rather than an Internet service provider (ISP) network.

1 Multihoming variants
2 Multihoming caveats
3 IPv4 multihoming
4 IPv6 multihoming
5 External links

Multihoming variants

There are several ways to multihome, independent of the protocols used to do so. The most important are:

Single Link, Multiple IP address (Spaces)

The host has multiple IP addresses (e.g. 2001:db8::1 and 2001:db8::2 in IPv6), but only one physical upstream link. When the single link fails, connectivity is down for all addresses.

Multiple Interfaces, Single IP address per interface

The host has multiple interfaces and each interface has one or more IP addresses. If one of the links fails, its IP address becomes unreachable, but the others still work. Hosts that publish multiple AAAA or A records therefore remain reachable, at the cost of the client program timing out on the broken address and retrying on another. Existing connections cannot be taken over by the other interface, as neither TCP nor UDP supports this. SCTP does support moving an association to another address, but it is not used very much in practice.
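The time-out-and-retry behaviour described above can be sketched in Python. This is a minimal illustration, not how any particular client is implemented: the function name connect_first_reachable and the injectable resolve and connect parameters are assumptions made here so that the logic can be exercised without a network.

```python
import socket


def connect_first_reachable(host, port, timeout=3.0,
                            resolve=socket.getaddrinfo,
                            connect=socket.create_connection):
    """Try every address published for `host` until one accepts a connection."""
    last_error = None
    tried = []
    for _family, _type, _proto, _canon, sockaddr in resolve(
            host, port, type=socket.SOCK_STREAM):
        address = sockaddr[0]
        if address in tried:
            continue  # the resolver may list the same address more than once
        tried.append(address)
        try:
            # An address behind a failed link typically surfaces here
            # as a timeout or another OSError.
            return connect((address, port), timeout=timeout)
        except OSError as exc:
            last_error = exc  # remember the failure, move on to the next record
    raise ConnectionError(f"no address for {host} was reachable") from last_error
```

Every failed address costs the client up to one full timeout, which is the penalty mentioned above. A connection that is already established is not helped by this approach, since the retry only happens when a new connection is opened.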

Multiple Links, Single IP address (Space)

This is what is generally meant by multihoming. Using a routing protocol, in most cases BGP, the end site announces its address space over each of its upstream links. When one of the links fails, the protocol detects this on both sides and traffic is no longer sent over the failed link. This method is usually used to multihome a site rather than a single host.

Multiple Links, Multiple IP address (Spaces), no routing protocol like BGP

This approach uses a specialized Link Load Balancer (or WAN Load Balancer) appliance between the firewall and the link routers. No special configuration is required in the ISP’s routers. The appliance allows all links to be used at the same time to increase the total available bandwidth, and it detects link saturation and failures in real time so that traffic can be redirected. Its balancing algorithms determine how traffic is distributed across the links. Inbound balancing is usually performed through real-time DNS resolution.
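The outbound selection step of such an appliance might look roughly like the following Python sketch. The Link class, the pick_link function and the 90% saturation threshold are illustrative assumptions, not the behaviour of any specific product.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    capacity_mbps: float
    load_mbps: float = 0.0
    up: bool = True

    @property
    def headroom(self):
        """Spare capacity left on this link."""
        return self.capacity_mbps - self.load_mbps


def pick_link(links, saturation=0.9):
    """Return the usable link with the most spare capacity, or None.

    A link is usable when it is up and its load is below the
    (assumed) saturation threshold.
    """
    usable = [link for link in links
              if link.up and link.load_mbps < saturation * link.capacity_mbps]
    if not usable:
        return None
    return max(usable, key=lambda link: link.headroom)
```

With links of 100 Mbit/s (95 in use), 50 Mbit/s (10 in use) and a failed 200 Mbit/s link, the sketch selects the 50 Mbit/s link: the first is saturated and the third is down.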

Multihoming caveats

While multihoming is generally used to eliminate network connectivity as a potential single point of failure (SPOF), certain implementation caveats apply which can affect the success of such a strategy.

In particular, each of the following items must be addressed in order to eliminate the network SPOF:

Upstream connectivity: A given network operations center must have multiple upstream links to independent providers. Furthermore, to lessen the possibility of simultaneous damage to all upstream links, the physical location of each of these upstream links should be far enough apart that a piece of machinery (such as a backhoe) won't accidentally sever all connections at the same time.
Routers: Routers and switches must be positioned such that no single piece of network hardware controls all network access to a given host. In particular, it is not uncommon to see multiple Internet uplinks all converge on a single edge router. In such a configuration, the loss of that single router disconnects every Internet uplink, despite the fact that multiple ISPs are otherwise in use.
Host connectivity: A "reliable" host must be connected to the network over multiple network interfaces, each connected to a separate router or switch. Alternatively, and preferably, the function of a given host could be duplicated across multiple computers, each of which is connected to a different router or switch.
Referencing Entities: Not only must a host be accessible, but in many cases it must also be "referenced" to be useful. For most servers, this means in particular that name resolution for that server must be functional. For example, if the failure of a single element prevents users from resolving the DNS name of that server, then the server is effectively non-functional, even though it is otherwise connected.

The purposes of multihoming are only truly achieved when each component that can potentially fail is duplicated.
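One way to check a design against the items above is to model it as a graph and test whether removing any single component cuts the host off. The Python sketch below assumes a hypothetical adjacency-list representation in which every link is listed in both directions; the function names are illustrative.

```python
from collections import deque


def reachable(graph, start, goal, removed=None):
    """Breadth-first search from start to goal that ignores the removed node."""
    if removed in (start, goal):
        return False
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph.get(node, ()):
            if neighbour != removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


def single_points_of_failure(graph, host, internet):
    """List components whose loss alone disconnects host from internet."""
    components = set(graph) - {host, internet}
    return sorted(c for c in components
                  if not reachable(graph, host, internet, removed=c))


# Two switches and two ISPs, but both uplinks converge on one edge router.
topology = {
    "host": ["switch1", "switch2"],
    "switch1": ["host", "edge_router"],
    "switch2": ["host", "edge_router"],
    "edge_router": ["switch1", "switch2", "isp_a", "isp_b"],
    "isp_a": ["edge_router", "internet"],
    "isp_b": ["edge_router", "internet"],
    "internet": ["isp_a", "isp_b"],
}
print(single_points_of_failure(topology, "host", "internet"))
```

For this topology the only component reported is the shared edge router, which matches the router caveat above. The sketch covers connectivity only; a shared physical conduit or a DNS dependency would have to be modelled as additional nodes.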

IPv4 multihoming

In order to be multihomed, a network must have its own public IP address range and an AS number. Connections to two or more separate ISPs are then established. Routing over these connections is normally controlled by a BGP-enabled router.

In the case where one outgoing link from the multihomed network fails, outgoing traffic will automatically be routed via one of the remaining links. More importantly, other networks will be notified, through BGP updates of the multihomed network routes, of the need to route incoming traffic via another ISP and link.
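The effect of such updates on a remote network can be illustrated with a deliberately simplified Python model. Real BGP best-path selection involves many more criteria; this sketch compares AS-path length only. The AS numbers and the prefix are taken from the ranges reserved for documentation, and both function names are assumptions made for the example.

```python
def best_paths(announcements):
    """Map each prefix to its shortest known AS path.

    A crude stand-in for BGP best-path selection: only path length is compared.
    """
    best = {}
    for prefix, as_path in announcements:
        if prefix not in best or len(as_path) < len(best[prefix]):
            best[prefix] = as_path
    return best


def withdraw_via(announcements, failed_as):
    """Drop every announcement whose AS path traverses the failed upstream."""
    return [(prefix, path) for prefix, path in announcements
            if failed_as not in path]


# Routes to the multihomed site (AS 64500) as seen by a remote network.
routes = [
    ("198.51.100.0/24", (64496, 64500)),         # via ISP A
    ("198.51.100.0/24", (64510, 64497, 64500)),  # via ISP B, one hop longer
]

before = best_paths(routes)
after = best_paths(withdraw_via(routes, 64496))  # the link to ISP A fails
```

Before the failure the remote network prefers the shorter path through ISP A; once that path is withdrawn, the remaining announcement through ISP B becomes the best path and incoming traffic follows it.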

A key pitfall in multihoming is that two apparently independent links from completely different ISPs may actually share a common transmission line, edge router, or both. The shared element forms a single point of failure and considerably reduces the reliability benefits of multihoming.

Another problem to watch for is that multihoming too small a network may not be effective: route filtering is very common among BGP operators, and small address blocks (that is, long prefixes) may be filtered out. Networks that apply such filters never learn the route, which defeats the purpose of multihoming.
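A quick sanity check for this problem can be written with Python's standard ipaddress module. The /24 cut-off used below is an assumption reflecting a commonly cited filtering convention for IPv4; individual operators may filter differently, and the function name is illustrative.

```python
import ipaddress


def likely_filtered(prefix, max_prefix_len=24):
    """Return True if the prefix is longer than the assumed filter limit."""
    network = ipaddress.ip_network(prefix)
    return network.prefixlen > max_prefix_len


for candidate in ("198.51.100.0/24", "198.51.100.128/26"):
    verdict = "likely filtered" if likely_filtered(candidate) else "likely accepted"
    print(candidate, verdict)
```

Under this assumption a /24 would be accepted while a /26 carved out of it would not, so a site holding only the /26 could not rely on its announcement being seen everywhere.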

IPv6 multihoming

Multihoming in the next-generation IPv6 protocol is not yet standardized, as discussions about the various possible approaches to multihoming are still unresolved.

Current solutions:

Provider-independent (PI) IPv6 address space has been made available. This will lead to routing table growth beyond the table sizes seen today with IPv4. Router memory and CPU are nevertheless expected to scale to this magnitude, thanks to Moore's law and because IPv6 blocks are much larger and therefore should not fragment as much as IPv4 space. See also the cons below.

Some people claim that routing table growth will overwhelm current routers. This may be so, but PI space is the only workable solution for IPv6 deployment at the moment, and it is better to deploy an imperfect solution now than a perfect solution once it no longer matters. See Worse is better.

Current possibilities:

Get a /48 from a local Internet registry (LIR) and announce it in BGP.

Pro: works like IPv4

Con: A few ISPs filter prefixes longer than /32 and therefore will not see the announcement, leaving the site unreachable from those networks; if problems arise with the LIR, the site still has to renumber.

Use multiple AAAA records per host, each from a different upstream /48

Pro: works in most cases

Con: All DNS records, firewall configurations and so on must be updated when renumbering; it is unclear which source address to use in which situation; and when one of the uplinks breaks, connections using its address break as well.
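The source-address question raised above can be made concrete with a small Python sketch. It assumes, purely for illustration, that each upstream expects traffic sourced from the /48 it delegated, so the host picks the address that falls inside the prefix of the uplink in use. The function name and the example prefixes (from the IPv6 documentation range) are hypothetical.

```python
import ipaddress


def source_for_uplink(host_addresses, uplink_prefix):
    """Return the host address that lies inside the given upstream prefix."""
    prefix = ipaddress.ip_network(uplink_prefix)
    for address in host_addresses:
        if ipaddress.ip_address(address) in prefix:
            return address
    return None  # the host has no address from this upstream


host_addresses = ["2001:db8:a::10", "2001:db8:b::10"]
via_a = source_for_uplink(host_addresses, "2001:db8:a::/48")
via_b = source_for_uplink(host_addresses, "2001:db8:b::/48")
```

The sketch only selects an address for new traffic. A connection already bound to the address of a failed uplink still breaks, as noted in the cons above.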

Upcoming possible solutions: