Prefix Label Forwarding (PLF) - Modified Header Forwarding for IPv6

Previously titled: Ivip6 - instead of map-encap, use the 20 bit Flow Label as a Forwarding Label in the /ivip6/ directory.

2010-01-07:  I intend to write this up as an ID, but for now, this page and the lfe/ page (Label Forwarding in Edge Networks) are the best accounts of this work in progress.

To the main Ivip page

Robin Whittle rw@firstpr.com.au 2008-08-07  (Updated 2008-08-18) 

A Short Explanation is here: psg.com/lists/rrg/2008/msg02076.html (2008-08-04) - including why this proposal is unrelated to MPLS. Also, this is just for IPv6 packets, not any other protocol.  Below is a chart of the short explanation, followed by a long explanation.

I have to invent new terminology, since this is a novel approach.  Initially I called this proposal "Flow6", "Flow Label Forwarding",  then "Label Forwarding in the Core".

As per the notes at the main Ivip page I am now calling this approach Prefix Label Forwarding (PLF).  This is only for IPv6.  A somewhat similar approach for IPv4 is ETR Address Forwarding (EAF).  Both involve reusing bits in the existing IP header and both require relatively minor upgrades to most DFZ routers before they could be deployed at all.

For IPv6, 20 bits are available and I use 2^19 of these 2^20 combinations to forward the packet to a border router which advertises the prefix in which the ETR is located.

For IPv4 (EAF), 30 bits are available, and I use this to encode the 30 most significant bits of the ETR's address.  Then (via suitably modified core and internal routers) the packet is forwarded all the way to the ETR.  I have not yet written a proper description of EAF.  Below is a good description of PLF, but please read the Short Explanation above first.

Note, the bit numbers I give here are ordinary binary bit numbers, where 0 is the least significant.  I need to update it to also show the IETF bit numbers, where 0 is the most significant.

Contents

>> Chart of the example
>> Historical background
>> Two things which would need to be done for this proposal to be practical
>> Advantages over map-encap (LISP, APT, Ivip4 and TRRP)
>> Advantages over Translation (Six/One Router)
>> The "Flow Label" to become the "Forwarding Label"
>> Terminology and main concepts
    > Conventional networks
    > SEN - Scalable End-user Network
    > SPI - Scalable PI address space
    > Traffic Engineering - load sharing
    > Micronet
    > UAB - User Address Block
    > MAB - Mapped Address Block
    > ITR - Ingress Tunnel Router
    > ITRD - ITR with full mapping Database
    > ITRC - ITR with Cache
    > ITRH - ITR function in sending Host
    > OITRD - Open Ingress Tunnel Router in the DFZ
    > ETR - Egress Tunnel Router
    > CEP - Core Egress Prefix 
    > FLPER - Forwarding Label Path Exit Router
>> Mapping system
>> QSD - Query Server with full Database
>> QSC - Query Server with Cache
>> Tutorial by way of example - Detailed Explanation
    > Enhanced RIB functionality
    > Enhanced FIB functionality
>> Transition: non-upgraded networks
>> Transition: non-upgraded core routers
>> PMTUD (Path MTU Discovery)
>> TTR Mobility

On other pages:

>> Using the Label Forwarding approach in Edge networks too, including:
      > List of new functions for core routers

Introduction

This page discusses one of the two approaches I am suggesting for a new scalable routing and addressing architecture for the Internet. See the Ivip page for the full context.  The proposal here is not a map-encap (LISP, APT, Ivip4, TRRP  etc.) or translation (Six/One Router) scheme - but uses a different technique to get packets across the interdomain routing "core" of the IPv6 Internet.

This is done by using most, or all, of the 20 bit Flow Label field of the IPv6 header for a completely different purpose from the one it was intended for - as a Forwarding Label instead.  This proposal would require some modest enhancements to the FIB and RIB functions of core BGP routers.  I guess these enhancements could be achieved with software updates.  There is no change to the BGP behaviour of the routers.  I guess this is practical, considering it will be 2015 or later before IPv6 usage is large enough that a scalable routing solution actually needs to be implemented.  IPv6 has a thousand or so advertised prefixes in 2008.  It is the IPv4 Internet which has the scaling problem, with 260k+ prefixes (bgp.potaroo.net).

For now I am calling this method of transporting an otherwise unmodified packet across the interdomain core of the IPv6 Internet:

Prefix Label Forwarding (PLF)
Label Forwarding in the Core (LFC)


The 20 bits are used by each core router to look up the Forwarding Equivalence Class (FEC) for the packet directly - by indexing into an array in the FIB - so it is easy for the router to forward the packet towards the desired network.

These 20 bits will be set by an ITR (Ingress Tunnel Router) before the packet reaches the core; the rest of the packet, including its source and destination addresses, is not affected.  The destination address is ignored by these modified core routers, so the destination provider network to which the packet will be forwarded is determined by the ITR, which sets these bits according to the ETR mapping of the micronet which matches the packet's destination address.  This only applies to a special set of prefixes advertised by providers who perform the ETR function - the provider networks used by the new kind of end-user network.  To understand the following fully, you will need to understand map-encap proposals in general, and ideally the Ivip proposal in detail.  The 8 Page Conceptual Summary and Analysis of Ivip (../) is the best place to start.
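To make the forwarding decision concrete, here is a minimal sketch, in Python, of the modified core-router FIB logic.  The FLFEC[] array name, the zero-label convention and the FLPER behaviour follow the example later on this page; the helper names and interface values are hypothetical illustration, not a definitive implementation.

  # Sketch of the modified FIB decision (hypothetical names).

  FLFEC = [None] * (2 ** 20)      # FEC per possible 20 bit label; the
                                  # RIB fills this from the BGP routes
                                  # for the 2^20 /32 CEP prefixes.
  FLFEC[0x00003] = "interface 7"  # e.g. the chart example below

  OWN_CEP_LABELS = set()          # labels of CEP prefixes this router
                                  # itself advertises (the FLPER role)

  def conventional_lookup(dst_addr):
      # Stand-in for the ordinary longest-prefix-match FIB path.
      return "interface from Tree-Bitmap lookup"

  def forward(packet):
      label = packet["forwarding_label"]     # the old Flow Label bits
      if label == 0:
          return conventional_lookup(packet["dst"])
      if label in OWN_CEP_LABELS:
          # End of the trip across the core: zero the label and
          # forward by normal internal means (FLPER behaviour).
          packet["forwarding_label"] = 0
          return conventional_lookup(packet["dst"])
      # Core transit: one array read replaces the prefix search.
      return FLFEC[label]

  pkt = {"dst": "4000:0050:7000:1234::33", "forwarding_label": 0x00003}
  print(forward(pkt))             # interface 7

A single indexed read per packet is the basis of the reduction in per-packet computation claimed in the advantages section below.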

This page discusses my specific proposal for using this apparently novel technique for Ivip6.  Further to this, discussed in a separate page linked to below, I also propose using the same techniques in private networks.  That will be called Label Forwarding in Edge Networks (LFE).  This second purpose is not directly related to the routing scaling problem, but I think it is worth exploring at the same time.

Ivip6 is the new name for the proposal initially known as FLOWv6.  I made the FLOWv6 proposal on the RRG list, on 2008-07-31:

FLOWv6: IPv6 Flow Label to control DFZ forwarding
psg.com/lists/rrg/2008/msg02036.html

Previously, I discussed the Ivip map-encap system primarily for IPv4, but also for IPv6, although I was less enthusiastic about it for IPv6, due in part to the heavier encapsulation overhead caused by the IPv6 IP header being 40 bytes, rather than IPv4's 20 bytes. See the start of this message for some examples of map-encap overhead for IPv6:  psg.com/lists/rrg/2008/msg02034.html (2008-07-31).

I will now use the term Ivip to refer to the general architecture and common elements of the two scalable routing and addressing proposals: Ivip4 and Ivip6.
I believe this approach has many benefits over the alternatives: map-encap and translation (Six/One Router).  These are listed in sections below.

Below is a revised version of this proposal.  This design is in an early stage of development and some parts of the description discuss finer points which are not central to understanding the operation of the system.  I put these parts in grey.

It will take me some time to refine the design and find a good way of presenting it, with the more detailed discussion well separated from the basic description.  Then I will write it up as an Internet Draft.  

Some of these finer details relate to the complexities of the various business and network models the system will need to work in.  This may appear as complexity of the Ivip6 system, and to a degree this is true.  However, if the other proposals were fully documented - exploring all the relevant network and business scenarios and solving all the problems which arise - they too would be seen to involve at least this level of complexity.

Please let me know of any criticisms you have about the design, suggestions for improvement etc.  I will answer any queries and will be happy to discuss it further by phone. 

Later, I intend to add separate pages on various aspects of this proposal.
This is not the place to fully explain map-encap in general or Ivip in particular.  Please refer to the Ivip page for that.  Nonetheless, the following explanation does mention some pertinent parts of Ivip.  


Chart of the example

Here is a chart of how the packet is handled - for the short explanation listed above, and for the long explanation which follows.

This sequence shows the Destination Address of the packet and the 20 bits of the proposed Forwarding Label (the old Flow Label) as the packet is sent from one host in a provider network to another host in an end-user network, via an internal router, an ITR border router, two core transit routers and the border router of the recipient provider network used by the end-user network.  The ITR could be in various locations and the sending host could be in an end-user network, or in a network with no ITRs - in which case the ITR will be an OITRD.  In all cases, the same events happen:
  1. ITR looks up mapping for the destination address' micronet - and writes the 20 bit result to the Forwarding Label.
  2. The ordinary FIB function of this and subsequent core routers uses this 20 bit value to index into an array of FEC values, which quickly tells the FIB which interface to forward the packet on.
  3. The packet is forwarded across the core like this and arrives at a border router of the provider network which the destination host's end-user network is using for connection to the Net.

The Destination Address is never changed, nor is the Source Address or any other part of the packet - only the 20 bit Forwarding Label.

Packet's        Action     
location               


Sending host    Packet is emitted, with the destination address
                being that of a host which is located in an
                end-user network which uses the new kind of
                Ivip-managed "micronet" address space.  All such
                space is within 4::/3.

   Forwarding Label = 0 0000       
   Destination Addr = 4000:0050:7000:1234::33



Router 1        Packet is forwarded towards border router, because
Internal        it matches 4::/3 and because there is an internal
router          route for this prefix, leading to the nearest border
                router.
            
   Forwarding Label = 0 0000       
   Destination Addr = 4000:0050:7000:1234::33


Router 2
Border router
which is also
an ITR.

Step 1 - the
FIB of the
ITR function    The ITR function of the FIB recognises the
                Destination Address is within the 4::/3 prefix
                which covers all the Ivip-managed new form of
                address space, for scalable support of
                potentially millions of end-user networks
                with portable, multihomable, address space.

                The FIB does a mapping lookup (to a local query
                server at first, but subsequent packets with the
                same Destination Address use the cached result).  

                The lookup is for the Destination Address and
                the mapping reply includes a caching time, the
                starting address and length of the micronet which
                the Destination is within, and 20 bits (bits 96 to
                115) of the ETR address to which this micronet is
                mapped.  In our example, these 20 bits in hex are
                "0 0003". 

                This is because the ETR address for this micronet
                is:

                   E000:0003:0000:0055::7
                      - ----
               
   Forwarding Label = 0 0003       
   Destination Addr = 4000:0050:7000:1234::33
              
 
Step 2 - the
ordinary router
FIB function:  Since this is a core router with the modified
               functionality, the FIB recognises that the
               Forwarding Label is non-zero - and therefore
               ignores the Destination Address, instead using the
               Forwarding Label to decide which of its neighbours
               to forward the packet to.  To do this it needs to
               determine the Forwarding Equivalence Class (FEC)
               for this packet.

               As described fully below, the FIB already has an
               array FLFEC[2^20] into which the RIB has
               written the FEC values for all the 2^20
               currently advertised /32 prefixes in E000::/12.

               The Forwarding Label's value causes the FIB to
               read the FEC value from FLFEC[00003].  For instance
               this value directs the router to forward the
               packet out of its interface 7.

               The router forwards the packet to that neighbor:


   Forwarding Label = 0 0003       
   Destination Addr = 4000:0050:7000:1234::33


Router 3     
1st core
transit
router:        This core router also has the modified
               functionality and performs the same algorithm:

               It ignores the Destination Address and uses the
               Forwarding Label value to index into its FLFEC[]
               array.  In this example the resulting FEC value
               causes the packet to be sent from interface 2 to
               the neighbor:

   Forwarding Label = 0 0003       
   Destination Addr = 4000:0050:7000:1234::33


Router 4     
2nd core
transit
router:        The same quick process as for Router 3. 

               The packet is forwarded to a (perhaps the) border
               router of the provider network (or one of the
               provider networks if this micronet is multihomed)
               which the destination host's end-user network is
               using to connect to the Net.  This router
               advertised the prefix E000:0003::/32 (the 0 0003rd
               of this set of 2^20 such prefixes) which matches
               the ETR address to which this micronet is mapped.

   Forwarding Label = 0 0003       
   Destination Addr = 4000:0050:7000:1234::33


Router 5
Border router
of recipient
provider
network:      Since this router advertised the E000:0003::/32
              prefix, it recognises that a packet arriving
              from the core with its Forwarding Label set to
              0 0003 has reached the end of its journey
              across the core.  Its FIB zeroes the
              Forwarding Label:

   Forwarding Label = 0 0000       
   Destination Addr = 4000:0050:7000:1234::33

              In this example, there is no "ETR" as such -
              because this recipient provider network's
              internal routing system has a route for
              the address range of the end-user network.

              The packet is therefore forwarded by normal
              means from this router, potentially through
              other internal routers, to some router which
              links to the end-user network.  Within the
              end-user network, the packet is forwarded to
              the destination host.

If the recipient network's internal routing system does not carry the end-user network's prefix, then this border router needs to somehow get the packet to the right ETR - the router which does have a link to the end-user network.  If there is only one ETR in the network, that is easy.  However if there is more than one, the border router needs to get the full 128 bit address of whatever this micronet is mapped to.  That is the address of the desired ETR.  See the full explanation below for details - but usually this will involve no delay, since this border router has already looked up and cached the mapping information for this micronet.
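As a worked version of Step 1 in the chart, here is a small Python sketch of how an ITR could derive the 20 bit Forwarding Label from the mapping's ETR address, assuming the E000::/12 layout used in this example.  The function name is hypothetical.

  import ipaddress

  def forwarding_label(etr_address):
      # Bits 115..96 (ordinary LSB-0 numbering) are the 20 bits
      # which follow the E000::/12 prefix of the ETR address.
      addr = int(ipaddress.IPv6Address(etr_address))
      return (addr >> 96) & 0xFFFFF    # keep 20 of the top 32 bits

  print(hex(forwarding_label("E000:0003:0000:0055::7")))   # 0x3 - "0 0003"

The same value, read the other way, identifies the CEP prefix E000:0003::/32 which the border router of the recipient provider network advertises.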

Historical background

Brian Carpenter, writing in the RRG list on 2008-08-03: psg.com/lists/rrg/2008/msg02067.html provided some history which is relevant to the technique I describe below.  It is from Louis Pouzin, who made great contributions to the early design of packet switched computer networks.

OK, explained like that, it seems coherent with Pouzin's
proposal in 1974 that the catenet address format should be
 
   <Format> <PSN name> <Local name>

where you propose to put the <PSN name> in the current flow label
field.

Pouzin's full explanation read:


    "There is no need to interpret the destination address
     any more than
 required to find an appropriate gateway
     in the correct direction.  
Putting gateway names in
     addresses is unacceptable, as it would 
tie up addressing
     and network topology. Thus, only PSN [packet
switched
     network] names should be used as catenet [internet]

     addresses. Delivering a message to a final destination
     is carried 
 out only by the final PSN."

[Pouzin74] Pouzin, L., Interconnection of packet switching networks,
7th Hawaii International Conference on System Sciences, Supplement,
pp. 108-109, 1974.

On 2008-05-29, Brian Carpenter also wrote, in part (psg.com/lists/rrg/2008/msg01384.html):

Now, I fear he was right, but that's not what got implemented.

We got a model based on fixed length addresses without a format
prefix. I didn't see in the IPng discussion and don't see now how
we can jettison that.


I understand that Pouzin's original proposal was that the packet's destination address would have a variable length local address.

What I am proposing below is related to Pouzin's proposal in that the router looks only at a particular part of the address section of the packet in order to determine how to forward it towards its destination.

However, my proposal differs in these respects:
  1. This is not a generalised approach to how all Internet packets are handled, just for solving a particular problem of getting packets from an ITR (Ingress Tunnel Router) to a destination end-user network which connects to the Net via one or more provider networks.
  2. The packet's destination address points to the address in the final end-user network where the packet will be forwarded to.  This is not directly related to the actual prefix of the recipient provider network which the core routers will be forwarding the packet to. 
  3. The ITR sets these 20 or so bits, based on a mapping lookup, which tells it which provider network to forward (or tunnel) the packet to so it can be delivered to the correct end-user network.  Ordinarily, in Pouzin's proposal, the bits used by the router to determine forwarding are part of the whole address.  In this Ivip6 proposal, these 20 or so bits are totally separate from the destination address of the packet, and cause the packet to be forwarded to a recipient provider network which advertises the prefix of the ETR, which is independent from the prefix (Mapped Address Block) in which the destination address of the packet is located.



Two things which would need to be done for this proposal to be practical

The two primary things which would need to happen for Ivip6 to be feasible are:

Rename the 20 bit "Flow Label" in the IPv6 header to "Forwarding Label" and develop new semantics for it - to support Ivip6 in the core and potentially other uses in edge networks.  This would involve withdrawing RFC 3697 and replacing it with a new RFC.

Recent messages (2008-08-01) from Brian Carpenter and Tony Li indicate the bits are probably not currently used for any substantial purpose:

psg.com/lists/rrg/2008/msg02041.html
psg.com/lists/rrg/2008/msg02049.html


Ensure a sufficiently high proportion of IPv6 BGP routers have modest upgrades to their FIB and RIB functionality, by the time Ivip6 is deployed.  There is no change to the BGP protocol or implementation.

These upgrades are mentioned in the Short Explanation above and are described in detail below.  I guess these could be implemented in many modern routers via a firmware upgrade.


Advantages over map-encap (LISP, APT, Ivip4 and TRRP)

The benefits of the Ivip6 approach over the Encapsulate (map-encap) techniques - including in most cases Ivip4 - seem to include:
  1. No header overhead.  Packets remain the same length.
  2. No PMTUD problems whatsoever - a Packet Too Big message will be sent to the sending host, with the original packet details, from any router in the full path, including the LFC portion of the path.  (I assume that the sending host won't care if the "flow label" bits in the returned packet fragment are different - maybe this will require a modification to host stacks.)
  3. Significant reduction in computational effort for each such packet passing through a core router, compared to it fighting its way through up to 48 bits of destination address to determine the packet's Forwarding Equivalence Class (FEC).
  4. Traceroute will work fine through the full path, including the LFC portion of the path.
  5. Ease of continuing to filter out packets with spoofed source addresses at border routers.  (Ivip4 already does this, since the outer header's source address is the same as the sending host's address.)  If the provider network in which the ETR is located normally sets its border routers to reject packets arriving from the core with source addresses matching any of the provider network's own prefixes, then this works fine and normally with Ivip6 - but it does not work for LISP, APT or TRRP, where the outer source address is that of the ITR.  To implement this with LISP, APT or TRRP, either the border routers would need to look into each encapsulated packet and filter on the inner header's source address, or all ETRs would need to do similar filtering on the source address of the decapsulated packets.  This filtering - looking for packets which match any one of potentially tens of thousands of prefixes - is extremely expensive and best done with TCAM, so it can't easily be done for a large number of prefixes wherever the ETR function is performed.
These are all major benefits which I think justify upgrading the core routers and using the Flow Label bits for this purpose.  The first three are major factors in the complexity, communications overhead and computational demands of handling packets which are sent through the core to hosts in the new scalable form of end-user networks.

Point 3 overcomes one of the major objections I had to IPv6: the heavy computational effort routers have to do in order to classify each incoming packet to determine its FEC, and so to determine which of the router's interfaces to forward it on.  This depends on the length of the prefix which the destination address is found to be in, but now that most (all?) RIRs are handing out IPv6 PI space as /48 prefixes, it could involve routers processing the most significant 48 bits of the destination address of incoming packets.  See my notes on Tree Bitmap and how state-of-the-art routers actually classify packets here: ../../sram-ip-forwarding/router-fib/ - very expensive hardware with dozens of CPUs and lots of DRAM, running hot.  Cisco has 188 250MHz 32 bit CPUs on a single chip!

Point 1 is partly about not causing packet-too-big problems, since the packet is not made any longer.  

Point 1 also concerns efficiency.  Encapsulation for IPv6 is even more undesirable an operation than with IPv4, since it adds at least 40 bytes to each packet for basic IP-in-IP encapsulation, whereas the IPv4 header is 20 bytes.  (Ivip4 uses simple IP-in-IP encapsulation, and I had planned to use this for IPv6 as well.)  For IPv6 map-encap with LISP - IP, UDP and LISP headers - the overhead is 56 bytes.  See details at the start of: psg.com/lists/rrg/2008/msg02034.html

For instance, a VoIP packet stream with 20 bytes of payload per packet and 50 packets a second (8:1 compression of the originally 64kbps audio) runs at 32,000 bits/sec with ordinary IPv6 - each packet being 80 bytes: IPv6, UDP and RTP headers plus payload - or 39,200 bits/sec including Ethernet headers.  With IP-in-IP encapsulation for IPv6 these rates become 48,000 bits/sec at the IP packet level and 55,200 bits/sec including Ethernet headers.  With LISP (IP, UDP and LISP headers) the rates become 54,400 and 61,600 bits/sec.
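The arithmetic behind these figures, as a sketch.  The 80 byte base packet (IPv6 + UDP + RTP headers plus 20 bytes of payload) and the 18 bytes of Ethernet overhead per packet are my reading of the example, not spelled out in the original message.

  def rate_bits_per_sec(bytes_per_packet, packets_per_sec=50):
      return bytes_per_packet * 8 * packets_per_sec

  base = 40 + 8 + 12 + 20           # IPv6 + UDP + RTP headers + payload
  print(rate_bits_per_sec(base))            # 32000 - plain IPv6
  print(rate_bits_per_sec(base + 18))       # 39200 - plus Ethernet
  print(rate_bits_per_sec(base + 40))       # 48000 - IP-in-IP
  print(rate_bits_per_sec(base + 40 + 18))  # 55200 - IP-in-IP + Ethernet
  print(rate_bits_per_sec(base + 56))       # 54400 - LISP headers
  print(rate_bits_per_sec(base + 56 + 18))  # 61600 - LISP + Ethernet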

With traffic volumes multiplying rapidly, and potentially hundreds of millions of people sending 50 VoIP packets a second - supposedly for free - each with typically 20 bytes of payload, it is highly desirable to avoid encapsulation in the scalable routing solution for IPv6.



Advantages over Translation (Six/One Router)


For the July 2008 revision of Six/One Router and my tentative critical review, please see: psg.com/lists/rrg/2008/msg02034.html .

The benefits over the only currently discussed Translation scheme (Six/One Router) seem to include:
  1. No address rewriting, so:
    1. Less complexity and computational effort in the ITR and ETR functions ("translation routers" in Six/One Router).
    2. No problems with the header changing in ways which upset IPsec or other cryptographic protocols.
    3. No contortions of bits 64 to 79 to produce the same checksum due to changing bits 80 to 127.
  2. No need for using a prefix of provider space to match each prefix of end-user network space.
  3. Ivip6 includes OITRDs (like LISP PTRs) to collect packets from non-upgraded networks, so there is full support, including for multihoming, for hosts in networks without ITRs.  This is vital for making the system attractive even when few other networks have adopted it.



The "Flow Label" to become the "Forwarding Label"

This proposal involves a completely different use for the 20 bit "Flow Label".  Below, I will refer to it as the "Forwarding Label".

The RFC which defines it is tools.ietf.org/html/rfc2460 (1998-12).  

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| Traffic Class |* * * * * * * Flow Label  * * * * * * *|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Payload Length        |  Next Header  |   Hop Limit   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                                                               |
+                         Source Address                        +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                                                               |
+                      Destination Address                      +
|                                                               |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

These 20 "Flow Label" bits are sick of always being set 0000 0000 0000 0000 0000.  
They all sang together that they want to do Useful Work:  20-bits-singing.html

The non-normative appendix: tools.ietf.org/html/rfc2460#appendix-A has been superseded by:

tools.ietf.org/html/rfc3697 (2004-03)

For this Ivip6 proposal to proceed, RFC3697 would need to be withdrawn and replaced with a new definition of the semantics of these 20 bits.

These 20 bits are not included in any checksums or in IPsec cryptographic integrity checks.

RFC 3697 states that the Flow Label is to be set by the sending host and not altered for the packet's entire trip to the destination host.  It is supposed to be set in a coordinated fashion by applications, so that if there are two separate logical flows of packets between this host and one destination host (for instance one concerning an HTTP session and the other being VoIP packets) then the packets of the two flows will have different Flow Label values.

The aim is to enable routers in the middle of the network to make good decisions about sending one stream (one "flow") along one path and the other along another path.  It is important not to send packets from one flow along two different paths, because of the likelihood that the packets would arrive out of order.  Some discussion of how routers in the core might do this, to improve the performance of the whole network, is in this presentation:

Multipath Transport, Resource Pooling, and implications for Routing
Mark Handley, IRTF Routing Research Group meeting, Dublin, 2008-08-01:
www.ietf.org/proceedings/08jul/slides/RRG-2.pdf

The Flow Label is only useful for distinguishing two or more flows from one host to one other host.

Although the intention was that the Flow Label be set by the sending host, there is nothing to stop any intermediate node, such as a router, changing its value.  This would not be detected by IPsec, and there is no checksum or other packet integrity check performed by routers which would be upset by a router changing the value.  That the Flow Label bits physically can be changed by routers or any other node the packet passes through is evident from these sections of RFC 3697, which discuss the limited security problems which would result from such changes:

P5:  . . . forging a non-zero Flow Label on packets that
     originated with a zero label, or modifying or clearing
     a label, could only occur if an intermediate system
     such as a router was compromised, or through some other
     form of man-in-the-middle attack.

P6:  Hence modification of the Flow Label by a network node
     has no effect on IPsec end-to-end security, because it
     cannot cause any IPsec integrity check to fail.  As a
     consequence, IPsec does not provide any defense against
     an adversary's modification of the Flow Label (i.e., a
     man-in-the-middle attack).

So it is clear that it is practical for the ITR (Ingress Tunnel Router) to change the value of the Flow Label, which below we will refer to as the Forwarding Label.



Terminology and main concepts


Here are some terms and concepts which are necessary to an understanding of the whole Ivip6 system.

It is best to develop a good understanding of the Ivip4 system (currently described simply as "Ivip") from the material in the Ivip page: ../ .

Some of the concepts described here are required in a map-encap system and may or may not be required in Ivip6.  For instance, the ETR has definite work to do in any map-encap system, but in one mode of operation of Ivip6 (where the provider network's internal routing system automatically handles all end-user network prefixes), there is no need for such a device.  In other modes, there does need to be something like an ETR.  I will use the term "Core to End User Network function" to refer to whatever has to be done to the packet after it arrives at the recipient provider network for it to be forwarded to the destination end-user network.




Conventional networks

Provider and end-user networks whose address prefixes are managed exactly as they are today - by advertising each one in the global BGP system.

Conventional networks which have no ITRs (Ingress Tunnel Routers) are known as "non-upgraded" networks.

There are three classes of global unicast address space in use:

  1. Conventional provider and end-user networks.  These will
     continue to use whatever prefixes they use today, but not
     within the prefixes noted below.

  2. SPI (Scalable PI) address space, for the new kind of end-user
     network.  All within 4::/3.

     Ivip's mapping system divides this into micronets.  All
     micronets are within one MAB (Mapped Address Block) and all
     MABs (in this example) are within 4::/3.  MABs are advertised
     in BGP, but not the smaller divisions within them, so that
     packets sent to these addresses from networks without ITRs
     will be forwarded to Open ITRs in the DFZ (OITRDs).

  3. CEP (Core Egress Prefix) space, for ETR addresses.  2^20 /32
     CEP prefixes, all within E000::/12.

     In this example, there are 2^20 CEP prefixes, each a /32.  All
     ETR addresses (the address to which a micronet is mapped) are
     within one of these CEP prefixes.



SEN - Scalable End-user Network

A network for an end-user using the new SPI (defined below) form of address space. In order to solve the routing scaling problem, we need to have most or all new end-user networks, and many, most or all existing (conventional) end-user networks adopt SPI space.

Therefore we need to make this new form of address space and type of end-user network highly attractive to the great majority of end-users, of all sizes - including for instance corporations, universities, schools and large hosting companies.  It is not good enough to say that big hosting companies shouldn't want this kind of address space, just small ones - because most small hosting companies want to be big, and would try to avoid the new kind of address space if this were the case.

All SENs either have their own ITRs (defined below, including perhaps ITR functions in the sending host) or are connected to the Net via one or more conventional networks which provide ITRs to handle their outgoing packets. So the term "non-upgraded network" only applies to a conventional network without ITRs - never to a SEN.

SPI - Scalable PI address space

A new form of address space, intended solely for end-user networks (all networks other than those of Internet Service Providers) which is Provider Independent, but in a manner which supports scalable routing.

Conventional PI prefixes are each globally advertised in the BGP system. The large number of these prefixes, and their rate of change, is the cause of
the routing scaling problem.

SPI address space remains stable for each end-user network, no matter which one or more ISPs they use to connect to the Net. SPI space is therefore
entirely portable and can be used for multihoming.  SPI space is typically rented from some organisation which may have nothing to do with the current one or more ISPs each end-user network uses to connect to the Net.

For IPv4, translation schemes are not suitable and there is no possible Forwarding Label in the IPv4 header - so Ivip provides SPI IPv4 space via map-encap tunnels.  In the context of what follows, "SPI space" refers to the new kind of IPv6 address space provided by Ivip6, with packets being sent across the core with Label Forwarding in the Core (LFC) "tunnels".

Traffic Engineering - load sharing

Ideally SPI space can also be used for inbound Traffic Engineering (TE) too. In Ivip - both Ivip4 and Ivip6 - inbound TE is achieved indirectly within certain limits, rather than with the explicit load balancing arrangements of the other map-encap schemes. Ivip provides very fine-grained control of mapping with real-time user control - and this may enable inbound TE which is superior to that possible with the other, non-real-time control, map-encap schemes.

In the non-Ivip map-encap schemes (LISP, APT and TRRP), and in Six/One Router, ITRs (or their equivalent in Six/One Router) perform explicit load balancing TE for all EIDs (these proposals' name for what in Ivip is a "micronet") by being told by the mapping information to spread the outgoing load over two or more ETRs.  This is assuming the destination network is multihomed.  The mapping information provides the ITR with "weights" for each of the two or more ETRs.  The ITR needs to choose how to send packets statistically according to these instructions, while ensuring that all packets of any one presumed flow are all sent the same way.  (The ITR can typically only assume different flows if the destination and/or source address of the packets is different - we assume the ITR is not equipped to look at source or destination port numbers.)

In the non-Ivip schemes, end-user control of the mapping information used by the world's ITRs is "slow".  It is infeasible to change the mapping frequently and have ITRs respond accordingly.  So the TE load sharing weights cannot be adjusted in real-time.

With Ivip, there is no explicit TE load sharing.  The mapping information tells the ITR to tunnel packets which are addressed to a particular micronet to a particular ETR address.  There is no concept of multiple ETR addresses, either for load sharing or for the ITR to detect a failure of one ETR so it can send packets to another instead.  Ivip separates out the multihoming fault detection and recovery functions from the scalable routing system and requires end-user networks to do this themselves - or hire someone else to do it for them.  In practice, multihoming monitoring and failure recovery decisions are likely to be handled by a specialized company which is contracted by the end-user network.  That company's system will automatically change the mapping to another ETR in the event that the current ETR is not operating correctly. 

TE load sharing can still be achieved with Ivip, but it is not possible to load split traffic being sent to a single IP address (IPv4) or to a single /64 prefix (IPv6).  To achieve inbound load balancing TE over two or more provider links, the end-user must split the recipient hosts over multiple IP addresses so that they can be covered by separate micronets.  Then, each micronet can be mapped to any one of the multiple ETRs at the multiple ISPs - thereby giving the end-user real-time control over the incoming traffic levels of each of the links from the two or more ISPs.  (Each mapping update costs a small fee, such as a few cents, to pay for the burden on the fast-push system.  Mapping updates are therefore not a burden on most parts of the Ivip system.  Some end-users with busy networks would still find it attractive to change their mapping dynamically to optimise their usage of the two or more providers.)
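A sketch of what this looks like in terms of mapping state, assuming a hypothetical update operation; the micronet and ETR values are invented for illustration.

  # Two micronets covering the end-user's hosts, each currently
  # mapped to an ETR at a different ISP.  Shifting inbound traffic
  # between the ISPs' links is a matter of remapping a micronet
  # (each update costing a small fee to the fast-push system).

  mapping = {
      # (start /64, length in /64s) : ETR address
      ("4000:0050:7000:1234::", 1): "E000:0003:0000:0055::7",  # ISP A
      ("4000:0050:7000:1235::", 1): "E000:0741:0000:0002::9",  # ISP B
  }

  def shift_load_to_isp_a():
      # Move the second micronet's inbound traffic onto ISP A's link.
      mapping[("4000:0050:7000:1235::", 1)] = "E000:0003:0000:0055::7"

  shift_load_to_isp_a()
  print(mapping[("4000:0050:7000:1235::", 1)])   # now ISP A's ETR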

Micronet

A contiguous sequence of IP addresses - of the new SPI type of address space - which are mapped to a single "locator" address. In most map-encap schemes, the micronet concept is implemented as an EID (Endpoint IDentifier) prefix.

In Ivip4 and Ivip6, a micronet is not necessarily a binary-boundary prefix.  In Ivip4, a micronet can start on any IP address and have a length of any integer number (up to 2^24) of IPv4 addresses.  That is to say: the granularity of Ivip4's mapping system is 1 IP address.

Ivip6's mapping granularity is a /64 prefix.  The starting point of an Ivip6 micronet is on any /64 boundary, inside whatever overall prefix is defined for the new kind of end-user SPI space.  In the example which follows, all SPI space is in the 4::/3 prefix, but this is just an example, and some larger or different prefix would be defined in the final system.  (This 1/8 of the IPv6 address space still provides 2^61 /64 prefixes, or 2^45 /48s.  That is about 35 trillion /48s - or for a world population of 10 billion people, about 3,500 /48s, each of 64k /64s (each with up to 2^64 hosts . . . ), for every person.)

In the mapping system, a micronet is specified by a starting address (64 bits) and a length, in /64 steps. In principle, the length may be up to 64 bits, however in practice it may be limited to 32 bits.
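A minimal sketch of a micronet specification and the matching rule, using the /64 granularity just described; the dictionary layout is illustrative, not a wire format.

  import ipaddress

  # A micronet: start (a /64 boundary) and length (in /64 steps),
  # mapped to an ETR address.  Values are from the chart example.
  MICRONET = {
      "start":  int(ipaddress.IPv6Address("4000:0050:7000:1234::")) >> 64,
      "length": 1,
      "etr":    "E000:0003:0000:0055::7",
  }

  def covers(micronet, dst_addr):
      # Compare the destination's /64 index against the micronet range.
      idx = int(ipaddress.IPv6Address(dst_addr)) >> 64
      return micronet["start"] <= idx < micronet["start"] + micronet["length"]

  print(covers(MICRONET, "4000:0050:7000:1234::33"))   # True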

UAB - User Address Block

This is a contiguous range of addresses which are controlled by one end-user. An end-user may be as large as a corporate or university network, or simply an individual who has a mobile device, such as a cellphone.  Typically they would rent this space from a company who runs the MAB (Mapped Address Block, described below) within which the UAB is located - either directly or through some intermediary company.

UABs are integer numbers of /64s, just like micronets. They are specified by a 64 bit starting address and a 64 bit length. ITRs, ETRs (described below) and the mapping distribution system do not use UABs. A UAB is an administrative construct.

End users can divide their UAB into as many micronets as they like, and each micronet can be mapped to any 128 bit IP address - the address of an ETR (or an ETR-like function) which forwards the packet to the destination end-user network.  In the example used in this explanation, all MABs, UABs and micronets must be within the 4::/3 prefix.

A single UAB could be used to create multiple micronets, and each micronet could be mapped to a different ETR, in any ISP in any country.


MAB - Mapped Address Block

A Mapped Address Block is a BGP advertised prefix in which the enclosed address space is managed by the scalable routing system.  This space contains many micronets (typically thousands, perhaps many millions).

The individual micronets are not advertised as prefixes in BGP.  However, the entire MAB of which they are a part is advertised as a single BGP prefix.

While technically a single MAB could provide space for just one SEN, this would help little - or not at all - with the routing scaling problem.

Generally, each MAB should be relatively large compared to the size of micronets. (That said, some SENs may need only a single micronet of /64, and others may require many more, much larger, micronets - so there isn't a typical size of micronet.)

Generally, each MAB should include a large number of micronets, such as hundreds to millions of them.  This will enable the micronets serving the needs of very large numbers of SEN end-user networks to be handled from a single MAB subset of the address space, which requires a single BGP advertisement.

Each MAB is advertised in BGP to facilitate support of non-upgraded networks.  As described below, OITRDs are ITRs in the DFZ which advertise the MABs to collect packets sent to SPI address space by hosts in non-upgraded networks.

It is desirable to limit the total number of MABs, since each one contributes a prefix to the "global DFZ routing table" - which is what we are trying to limit the size of.  For instance, the current IPv4 DFZ routing table size of 260k or so is regarded as undesirable (bgp.potaroo.net) - but the real concern is that without a new scalable routing and addressing architecture, this number will grow to half a million, a million etc.  We probably don't want more than two or three hundred thousand MABs.  There are reasons for limiting the number of micronets in each MAB as well.  There should be no problem in the final system creating MABs which are big enough not to be excessively numerous.

ITR - Ingress Tunnel Router

The term "Tunnel" often refers to a two-way arrangement between two hosts which have already exchanged a number of packets and which have set up the tunnel after a series of two-way communications - which usually takes a second or two.

In a map-encap scheme, the term refers to the ITR delivering packets to an ETR, where the final destination address of those packets is not the ETR address, but the address of a host in some SPI address space of some end-user network which the ETR connects to.  This is a much more tenuous tunnel arrangement than is typical with VPNs etc.  

Firstly, it is a one-way tunnel.  (Although it is possible that the ETR is also an ITR tunneling packets in the opposite direction to the ETR function of the original ITR, these would be two unrelated tunnels.)

Secondly, the ITR needs to deliver packets reliably to the ETR without any preliminary communications, and without having any prior knowledge of the ETR at all.  For instance, when the ITR (in this example, a caching ITR, rather than one with a full database of mapping information) receives a packet addressed to some micronet X, for which it has no cached mapping information, it needs to issue a map request message to a nearby Query Server.  The response (typically within a few tens of milliseconds) tells the ITR the ETR address to tunnel this packet to, and any other packets whose destination address matches the micronet which this packet's destination address is part of. (The mapping reply returns the start and length of this micronet, as well as the ETR address.)
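In outline, the caching ITR's behaviour might look like the following sketch; the query-server call and the cache layout are invented for illustration, and timings and message formats are not modelled.

  mapping_cache = {}   # (micronet start, length) -> ETR address

  def itr_lookup(dst, query_server):
      # dst, start and length here are /64 indices, as in the
      # micronet sketch earlier on this page.
      for (start, length), etr in mapping_cache.items():
          if start <= dst < start + length:
              return etr                   # a cached micronet covers dst
      # Cache miss: ask a nearby Query Server.  The reply gives the
      # micronet's start and length as well as the ETR address, so
      # later packets to the same micronet need no further query.
      start, length, etr = query_server(dst)
      mapping_cache[(start, length)] = etr
      return etr

  def demo_query_server(dst):
      # Stand-in for a QSC/QSD reply: (micronet start, length, ETR).
      return (dst, 1, "E000:0003:0000:0055::7")

  print(itr_lookup(42, demo_query_server))   # miss, then cached
  print(itr_lookup(42, demo_query_server))   # served from the cache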

The "tunnel" function for map-encap involves encapsulating the original packet - by placing a header in front of it.  The outer header has the ETR's address as its destination address.  For Ivip4, this is a simple IP-in-IP IPv4 header.  The resulting packet is then forwarded by conventional BGP routers to the network in which the ETR is located, and then that network's internal routing system forwards the encapsulated packet to the ETR.  

In Ivip4 - Ivip as described elsewhere - I state that the internal routing systems of provider networks should not handle the address space of the end-user networks.  My reasons for stating this include difficulties keeping that routing system responding as quickly and reliably as ITRs respond to changes in the mapping.  I will contemplate this more for Ivip4, and for Ivip6 I am considering both situations - where the internal routing system of provider networks does and does not have a route for the end-user network.

In a map-encap scheme the ETR strips off the outer IP header and by one means or another forwards the raw packet (unaltered from its state when it left the sending host, other than its TTL having been decremented according to the total number of routers it passed through) to the destination SEN end-user network.

For Ivip6, this transportation of the packet, typically across the core of the Net - to the provider network by which the destination network connects to the Net - does not use any extra headers, but simply involves the ITR setting the Forwarding Label (the old Flow Label bits) to a value which will cause all the (Ivip6 upgraded) BGP routers in the core to efficiently forward it to the correct recipient provider network.  This is a novel technique, and it is not entirely clear that "tunneling" is the best term for it.  However, because "tunneling" is the best term for the encapsulation approach of the map-encap schemes, "tunneling" will be retained as the description of a process based on the Forwarding Label which achieves the same purpose.

The most obvious location for ITR function to be implemented is at the border routers of networks, to collect and encapsulate all packets which need to be tunneled to an ETR. For instance, BGP routers at the borders of all conventional networks which SENs use to connect to the Net.  

It will also be possible to locate routers which perform the ITR function inside the source end-user network, or inside the provider network the end-user network uses to connect to the Net.  We will not discuss every aspect of where ITRs could be located.  For simplicity in this explanation, we will assume that ITRs are border routers at the edge of a provider network, where the packets leave that network and are forwarded to routers of other Autonomous Systems (such as transit providers) so that they will ultimately be forwarded to the provider network with the ETR which connects to the SEN end-user network which has the micronet which covers the packet's destination address.

The ITR function processes all packets whose destination address falls within a micronet which is part of any MAB it advertises, setting the Forwarding Label bits to a value which uniquely identifies the BGP advertised prefix towards which this packet should be forwarded by all BGP routers in the inter-domain core.  (Full explanation below.)

ITRs can be dedicated routers, or servers running sufficient routing software that packets in need of tunneling are sent to them.

ITRs can also be located inside SEN networks and are likely to be found at the border of an SEN network and the one or more conventional provider
networks which the SEN network uses to connect to the Net.

The third place an ITR function can be found is, in effect, in the DFZ - where it is known as an OITRD (Open ITR in the DFZ).

Like ITRCs, ITRDs can be on conventional addresses or SPI addresses - but not behind NAT.  This is because they need to be reachable when a Query Server has a Mapping Cache Update message to send to them.


ITRD - ITR with full mapping Database

An ITRD is really a caching ITRC with an integrated QSD, or an ITRC using a QSD in the same rack, connected directly by Ethernet.

ITRC - ITR with Cache

ITRs are typically caching ITRs: ITRCs. They cache the mapping information they currently require, and do not attempt to store a copy of the entire
mapping database.

ITRH - ITR function in sending Host

A caching ITR function can also be built into a sending host.  This could be a zero or low cost method of reducing or eliminating the need for separate ITRs.

Like dedicated ITRs, the ITRH needs an address which can be reached from anywhere, so it can receive Mapping Cache Updates from query servers.  This means it can be on conventional or SPI addresses.  It cannot be behind NAT.

OITRD - Open Ingress Tunnel Router in the DFZ

Ivip's OITRDs do much the same job as LISP's PTRs (Proxy Tunnel Routers).

OITRDs are distributed around the Net, conceptually "in the DFZ" to attract and process packets sent to micronet addresses by hosts in "non-upgraded"
networks: those conventional networks which have no ITRs of their own.

In fact, OITRDs are within or at the border of some conventional AS network, like many other ITRs.  

A typical ITR at the border router of an AS does not accept packets addressed to micronets from routers outside the AS.  It only accepts these packets from internal routers.  (It does this by advertising one, many or all MAB prefixes to the internal routing system, but not to its BGP neighbours in other ASes.)

Likely locations of OITRDs are Internet exchanges, peering points etc.

They are ideally close to non-upgraded networks, so the total path traveled by the packet from its source, through the OITRD and to the ETR is not much longer than, or is the same distance as, the most direct path from the sending host to the ETR which serves the SEN end-user network in which the destination host is located.

Ideally, in the future, all conventional IPv6 networks will have their own ITRs and OITRDs will not be needed.

An OITRD advertises one or more MABs to its BGP neighbours in other ASes and so attracts packets sent from nearby non-upgraded networks.

It then does what all ITRs do: use mapping information to set the Forwarding Label of the packet so that (upgraded) core routers will forward the packet
towards the ETR to which this micronet is currently mapped.

The business case for Ivip6 OITRDs is identical to that for Ivip4 OITRDs, as discussed in this message:

Business incentives for LISP PTRs and Ivip OITRDs
psg.com/lists/rrg/2008/msg02021.html  2008-07-29.

The above message was written just before FLOWv6 was developed, so it assumes that Ivip for both IPv4 and IPv6 will use map-encap.  The fact that Ivip6 ITRs, including OITRDs, use Label Forwarding in the Core rather than map-encap does not alter the business case for their deployment.

ETR - Egress Tunnel Router

The ETR is a required physical device in Ivip4 and the other map-encap schemes.  It may be required as a physical device in Ivip6, or there may be no need for an actual device.  Although it may be somewhat confusing, this description of Ivip6 generally assumes that there is a physical ETR.  Below we discuss a scenario in which there is no need for an actual ETR - when the end-user network's micronet is handled by the internal routing system of the provider by which the end-user network connects to the Net.

Ivip6 uses the Forwarding Label so the core BGP routers - and perhaps internal routers in the network(s) of the sending host, between the ITR and that network's BGP border router - will forward the packet towards this ETR address.  This use of the Forwarding Label for tunneling across the core is somewhat more elaborate and flexible than in the map-encap system and is described more fully below; however, in many ways it is a simpler and more elegant approach.

There is no encapsulating header to remove - as there is for map-encap ETRs.  There is no need to rewrite addresses, as there is in the receiving Translation Router in Six/One Router (the equivalent of the ETR in a map-encap scheme).

In scenarios where the ETR exists as an actual device, it will probably zero the Forwarding Label bits when it receives a packet forwarded from the ITR.

The most important function the ETR performs is that it recognises from the destination address which SEN network the packet should be forwarded to, and forwards the packet to that network. The ETR could connect to one or to many separate end-user networks.

As the packet was forwarded across the core of the Net, the destination address has been ignored by these BGP core routers, due to them
using the Forwarding Label to decide which port to forward the packet on.

CEP - Core Egress Prefix

As described more thoroughly below, the IPv6 address space is administered to create a regular series of prefixes, each of which can be advertised in BGP.

In this explanation, there are 2^20 such prefixes: 1,048,576. Each has the same length, say /32. /48 would probably be fine too.

In practice, fewer than this would be required, so the final system may have half this number or less.

A conventional provider network which has one or more ETRs (or which connects one or more SEN end-user networks to the Net) and which has a single "site" - such as a network in a city, or a data centre - needs one CEP.  If it has multiple such sites and does not want to ferry traffic addressed to ETRs between them, then it needs a separate CEP for each such site.

The mapping of a micronet to a particular ETR address is constrained so that the address is always within a particular block of address space: one of the CEPs.

ETRs are located on addresses within one of these CEPs.  Below we discuss administrative arrangements for this limited resource of about a million CEPs.  Each advertised CEP places a burden on the control plane of the global BGP system - adding another route to the "global BGP routing table" (sometimes stated simply as "adding another route to the DFZ").  To reduce the routing scaling problem, it is desirable to have as few advertised prefixes as possible, so there needs to be good reason for a provider to obtain a CEP prefix.

CEP prefixes can also be used for other purposes than the one discussed here: as the prefix to which the Forwarding Label causes the tunneled packet to be forwarded.  In principle it is possible that a provider could also use the CEP prefix for the space it uses internally and for its PA customers.  In that case, each provider site might advertise just the single CEP prefix, at one or more border routers.  This would involve abandoning current address assignments, and would not be necessary to achieve a scalable routing system.  However, in principle, it is an attractive outcome for those who seek a clear separation between core and edge.  The inter-domain core could consist purely of BGP routers handling traffic between providers, all of which use a single CEP at each of their sites.  All end-user networks would use the new kind of address space.  All traffic traveling across the core to any end-user network would be forwarded according to its non-zero Forwarding Label, which is easier for the core routers than conventional Tree-Bitmap analysis of the destination address until a matching prefix is found.

It may never be necessary to use the full possible number of CEPs - 2^20 or 2^19.  Perhaps a few tens of thousands will be all that is required for the foreseeable future, assuming IPv6 is widely adopted.  For instance, if there are 10 billion people, split into cities of 1 million each, there are 10k cities.  If an average ISP serves a million customers and has branches in 5 cities, there are 10k ISPs, each with 5 sites - so 50k CEPs are required.

In the TTR (Translating Tunnel Router) extensions to map-encap for mobility, there are likely to be multiple competing TTR companies, each with TTRs in a wide variety of locations all around the world.  The TTR behaves exactly like an ETR to the map-encap system.  The fact that it has a two-way tunnel to the Mobile Node (MN) does not affect the operation of the map-encap system.  Nor does the fact that the TTR may also be performing ITR functions on packets being sent out from the MN.

Side-note on assignment of CEP prefixes to support TTR mobility

The TTR approach to mobility is equally applicable to Ivip6.  The TTR's private two-way arrangements with its MN are similarly not a concern for the main Ivip6 system, including its mapping system.

There is, however, a challenge for both Ivip4 and Ivip6 in providing the prefixes by which these TTRs connect to the core.  If there were 100 TTR companies, each with a TTR site (one or more TTRs) at 1000 geographically and topologically dispersed sites, then this could require 100k additional advertised prefixes (CEP prefixes in the case of Ivip6) just for these TTRs.  While there may never be this number of TTR companies, or a real need for this number of separate TTR sites, we need to consider the impact of this system on the number of advertised prefixes.

One approach is to consider mobility, and competition between TTR companies, as a worthwhile reason for adding 100k prefixes to the global routing table.  For Ivip6, this also means taking up 100k or so of the 2^20 or 2^19 possible CEPs.  This too may be judged as reasonable and desirable compared to the alternatives.  However this brute-force approach is not the only method by which this number of TTRs could be located as desired.

One approach is to have a single CEP prefix for each physical site, and for the various TTRs of the various companies to share it.  (In practice, it seems likely there wouldn't be 10 different TTRs at a site, but one or a few, with the company which owns those TTRs contracting out its services to the various customer-facing TTR companies.)

An objection to this is that all the TTRs at that site would depend on a single router which advertises the CEP prefix.  The workaround is to have two such routers at the same site.

A further approach is based on the notion that some or many TTR sites will need to be close to mobile access networks, so the most logical location for them is inside the provider network of the mobile access network itself.  There, they could all use a single CEP - perhaps the CEP the mobile provider already has for that site.



FLPER - Forwarding Label Path Exit Router

The FLPER is the BGP router at which the traffic packet, having had its Forwarding Label set by an ITR, completes its journey across the core.

This is fully described below in the Tutorial by way of Example section. The FLPER is a BGP router at the border of the recipient provider network - the router which advertises the CEP prefix enclosing the 128 bit address to which the micronet of the packet's destination address has been mapped.

There may be one or more border routers which advertise this CEP prefix.  

The FLPER may perform the ETR function.  If there is no need for an ETR function - as is the case where the provider network's internal routing system directly handles the micronet address space - the FLPER has almost nothing to do.  In this scenario, the packet's path results naturally from Ivip6's requirement that core routers use a non-zero Forwarding Label to forward the packet.  A border router which advertises the CEP prefix corresponding to the value of the Forwarding Label will recognise this, and therefore recognise that the packet has completed its trip across the core.  The Forwarding Label now contains information of no further importance, so the FLPER zeroes it and looks at the packet's destination address - which is in a micronet of a particular end-user network.  Since, in this example, the provider's internal routing system advertises a route which covers this micronet exactly, or covers a wider address range enclosing it, the FLPER forwards the packet according to the internal routing system's rules.  The only action taken by the FLPER is to zero the Forwarding Label bits - and even this may not strictly be necessary if internal routers ignore those bits.
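A minimal sketch of this FLPER behaviour in Python.  The MY_CEP_LABEL value, the packet layout and the two helper stubs are all illustrative assumptions - a real router would do this in FIB hardware:

MY_CEP_LABEL = 0x00003   # this border router advertises CEP-0003 (illustrative)

def forward_internally(packet):       # stub: provider's internal routing
    print("internal forward to", packet["dest"])

def forward_by_label(packet):         # stub: normal core label forwarding
    print("core forward, label", hex(packet["label"]))

def flper_handle(packet):
    if packet["label"] == MY_CEP_LABEL:
        packet["label"] = 0           # the label has done its job - zero it
        forward_internally(packet)    # ordinary internal FIB takes over
    else:
        forward_by_label(packet)      # still in transit across the core

flper_handle({"label": 0x00003, "dest": "4000:0050:7000:1234::33"})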

To Do: explain more about how packets from sending hosts find their way to an ITR en-route to the border router, or to the border router which also does the ITR function, or when the ITR function is in the sending host.

To Do: Note and explain more fully that internal routing systems always operate on binary boundary prefixes, but that a micronet can begin and end on any /64 boundary.

To Do: Link to discussion of why I think it is best for internal routing systems of provider networks not to have routes for the prefixes of end-user networks, but to rely on every packet (including those sent from within the provider network) going via an ITR and an ETR, rather than relying on the internal routing system to respond rapidly to mapping changes.  



Mapping system

This is the system by which end-users issue commands to change their micronets' starting points and lengths, and to change the 128 bit address within a CEP to which each micronet is mapped.

In Ivip4, each micronet is mapped to a specific 32 bit address: the address of the ETR, which will remove the outer header and forward the packet to the end-user network.  The ITR needs this full 32 bit address, because it writes that address to the destination address of the encapsulating header.

In Ivip4, the mapping distribution system fast-pushes all changes to the mapping database to tens of thousands of full database Query Servers all over the world.  Each mapping record consists of the 32 bit starting address of the micronet, its length (24 bits) and the full 32 bit address of the ETR.

[#20-bits-only]

In Ivip6, for each micronet, the mapping system holds a full 128 bit address of a physical or notional ETR, to which all packets addressed to that micronet will be tunneled.  There are some differences from the Ivip4 situation, which generally simplify the requirements on the Ivip6 mapping system:
  1. The ITR does not need the full 128 bit address of the real or notional ETR.  It only needs the 20 bits (or probably 19, perhaps fewer, in a practical system) which it will write into the Forwarding Label.  This is a significant reduction in the amount of data the mapping system must send to full database Query Servers all over the world.
  2. If any of the three following conditions is true then there is no need for the FLPER to find out the full 128 bit ETR address to which a micronet is mapped.  (This matches map-encap, in that the ETR does not need to look up any mapping for the packet - the outer destination address is that mapping value, and the ETR has this address - and that address is no longer needed.)
    1. There is no physical ETR - the recipient provider network's internal routing system handles the micronet address space
    2. The one FLPER - or any one of the FLPERs (they all advertise the CEP prefix) - is capable of forwarding the packet to the end-user network, without relying on any ETR or on the provider's internal routing system.
    3. There is only one ETR, and the FLPER(s) will forward all packets received from ITRs to that ETR.
  3. Alternatively to 2, if the FLPER needs to decide which ETR to send the packet to, then it needs to do a mapping lookup of the full 128 bit "ETR" address.  This is never required of a map-encap ETR.
I have further work to do on this last situation.  If the need of such FLPERs for the full 128 bit ETR address could be satisfied without fast-pushing the full 128 bits around the world to tens of thousands of Query Servers (when otherwise only 20 bits are required), this would significantly lighten the load on the fast-push mapping distribution system.  It would also lighten the storage load on full database query servers.

This is a specialized topic, not directly relevant to the explanation:

One approach would be to only send the 20 bit value via the fast push system, but have each FLPER router covered by one of two scenarios:
  1. The FLPER gets a full feed of mapping updates, just like a QSD.  Then it can see when any mapping change mentions the CEP prefix it advertises.
  2. The FLPER has an arrangement with one or more nearby QSDs to send it all those mapping changes which mention the CEP prefix it advertises.  This would be an additional function to add to the QSD, which would be regrettable.  Unfortunately, it would also be complex, because the QSD would need to catch not just the mapping changes which involved a micronet being mapped to an ETR address in this CEP, but also mapping changes which involved a micronet formerly mapped to an address in this CEP being mapped to some other address - either within this CEP or in some other CEP prefix.
If this could be assured, a good solution would be for the FLPER to query some query server to retrieve the full 128 bit ETR address for the micronets which were mapped to any ETR address in this FLPER's CEP.  The volume of these queries and responses would be quite low, so a handful of specialized servers in the world could handle the load.  I will think more about this, and about any delay or unreliability problems which might prevent the FLPER from doing its job properly the instant the mapping was changed.

Below I will assume the matters raised in the side-discussion above can be resolved, so that the Ivip6 fast-push mapping system only needs to push a 20 (or 19) bit "ETR address" (actually the relevant bits of the CEP prefix which encloses the ETR address) in each mapping update.

In Ivip6, the mapping system pushes to all full database Query Servers: the 64 bits which define the starting address of the micronet (on a /64 boundary); its length, which we can probably safely limit to 32 bits (maximum micronet size is one /32); and the 20 bit value (19 bits in practice?) which defines the CEP prefix of the provider network the packet should be forwarded to.

This would be another advantage of the Ivip Label Forwarding in the Core approach compared to map-encap.  If map-encap were used for IPv6, the full 128 bit address would need to be sent to all the full database Query Servers, and to every ITR which needed this micronet's mapping.

So, for each micronet, the amount of mapping information pushed globally would be:

                   Micronet    Micronet   "ETR"      Total   Total  
                   start       length     address    bits    bytes

IPv4: Ivip4         32          24          32        88     11

IPv6: Ivip6         64          32          20       116     14.5
       
IPv6: map-encap-A   64          32         128       224     28  

IPv6: map-encap-B  128           7         128 * N   263+     33+   
               

Ivip6 should require only 14.5 bytes to be sent by the fast push mapping distribution system for each micronet.  Of course there would be considerable overhead, but also perhaps some compression, in the full practical system.  In the example below, some compression could be obtained - including because all micronet space is within the 4000::/3 prefix, which reduces the number of bits required to specify a micronet's start point to 61 bits.
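To make the 116 bit (14.5 byte) figure concrete, here is a sketch of packing one Ivip6 mapping record.  The field widths are from the table above, but the packing layout itself is purely illustrative - no wire format has been defined:

# Pack one Ivip6 mapping record: 64 + 32 + 20 = 116 bits, which fits
# in 15 whole bytes (14.5 bytes in the table's accounting).
# This layout is illustrative only - no wire format has been defined.

def pack_mapping(start_64: int, length_64s: int, cep_label: int) -> bytes:
    assert start_64 < 2**64 and length_64s < 2**32 and cep_label < 2**20
    v = (start_64 << 52) | (length_64s << 20) | cep_label   # 116 bits
    return v.to_bytes(15, "big")

# Net-X's micronet 4000:0050:7000::/48 (65,536 /64s) mapped to CEP-0003:
record = pack_mapping(0x4000005070000000, 1 << 16, 0x00003)
print(len(record))   # 15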

By comparison, figures are shown for "map-encap-A" which is the Ivip4 approach to map-encap as it would be applied to IPv6, with the full 128 bit address of the ETR.

Figures are also shown for a notional "map-encap-B" which represents, in part, the mapping information to be sent by LISP, APT or TRRP.  Note that only LISP-NERD and APT involve pushing all the mapping data to thousands of ITRs or Default Mappers (APT full database Query Servers).  

For a single-homed EID prefix (all these systems use binary boundary prefixes, rather than Ivip's more flexible micronets) - without any TE load sharing arrangements or multihoming failover selection for multiple ETRs - the mapping data presumably includes a full 128 bit base address for the EID prefix.  This is in accordance with the RRG rough consensus (To Do: link to the message in June) that the mapping granularity should be a single IP address.

LISP, APT and TRRP use a 7 bit binary number to specify the length in bits of the EID prefix, which is more compact than Ivip6's use of a 32 bit integer to specify the number of /64s spanned by the micronet.  For a single-homed EID prefix, a single 128 bit ETR address is required.  For multihomed EID prefixes, multiple 128 bit ETR addresses need to be specified ("* N" in the table above), with further bits for TE weighting and for priorities when choosing which alternative ETRs to tunnel packets to when the ITR decides that an ETR is unreachable.

QSD - Query Server with full Database

QSDs get the full continual feed of mapping updates from the fast push mapping system.

They handle queries from nearby ITRs - ITRCs and ITRHs.  An ITRD (full database ITR) is really a caching ITRC with an integrated QSD, or an ITRC using a QSD in the same rack connected directly by Ethernet.


QSC - Query Server with Cache

These can optionally be deployed, so there may be one or more layers of QSCs between ITRCs/ITRHs and the nearest one or several QSDs. The purpose is to reduce the number of QSDs needed.

When a QSC has no cached information which answers a query, it passes the query upwards to (or towards, via one or more QSCs) the nearest local QSD.  When the QSC receives the reply, it caches it and sends the response downwards to (or towards, via one or more QSCs) the ITRC/ITRH which made the request.

Likewise, when a QSC gets a Mapping Cache Update message from a QSD above it (perhaps via one or more QSCs), it passes it downwards to whichever ITRCs, ITRHs or QSCs below it queried the mapping for this micronet in the last 10 minutes (for instance).

Mapping Replies and any subsequent Mapping Cache Updates are secured by a nonce which the querier places in the query which gave rise to them.
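A rough sketch of this QSC behaviour.  The data structures are illustrative, message handling is reduced to print() calls, and nonce checking and multi-layer QSC paths are omitted:

import time

CACHE_TTL = 600       # seconds a cached reply remains valid (illustrative)
NOTIFY_WINDOW = 600   # relay updates to queriers seen in the last 10 minutes

cache = {}            # micronet -> (reply, expiry time)
recent_queriers = {}  # micronet -> {querier name: last query time}

def handle_query(micronet, querier):
    recent_queriers.setdefault(micronet, {})[querier] = time.time()
    entry = cache.get(micronet)
    if entry and entry[1] > time.time():
        print("answer", querier, "from cache:", entry[0])
    else:
        print("miss - pass query for", micronet, "up towards a QSD")

def handle_reply(micronet, reply):
    cache[micronet] = (reply, time.time() + CACHE_TTL)
    for querier in recent_queriers.get(micronet, {}):
        print("send reply down to", querier)

def handle_cache_update(micronet, update):
    cutoff = time.time() - NOTIFY_WINDOW
    for querier, t in recent_queriers.get(micronet, {}).items():
        if t >= cutoff:
            print("relay Mapping Cache Update to", querier)

handle_query("4000:0050:7000::/48", "ITRC-1")   # miss: passed upwards
handle_reply("4000:0050:7000::/48", "label 0x00003, cache for 600 s")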


Tutorial by way of example - Detailed Explanation

This is a detailed explanation of the system, using a particular example.  Please see the chart above #chart - which illustrates this explanation.

For simplicity, we assume that all core IPv6 routers have been upgraded for Ivip6. In a section below we discuss transition arrangements while not all routers have Ivip6 upgrades.

We will also ignore OITRDs in this explanation - the ITRs which collect packets that hosts in non-upgraded networks send to micronet addresses, and tunnel them to ETRs.

In this example, CEPs (Core Egress Prefixes) are /32s and the prefix E000::/12 has been reserved for them. Consequently, the first few CEPs are:

CEP-0 E000:0000::/32
CEP-1 E000:0001::/32
CEP-2 E000:0002::/32

and the highest is:

CEP-1048575 E00F:FFFF::/32
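The relationship between a CEP number and its /32 prefix is simple arithmetic on the 20 bits below the reserved /12.  A sketch using Python's standard ipaddress module:

import ipaddress

# CEP-N is the Nth /32 inside the reserved E000::/12 (20 index bits).
BASE = int(ipaddress.IPv6Address("E000::"))

def cep_prefix(n: int) -> ipaddress.IPv6Network:
    assert 0 <= n < 2**20
    return ipaddress.IPv6Network((BASE | (n << 96), 32))

print(cep_prefix(1))          # e000:1::/32    (CEP-1)
print(cep_prefix(2**20 - 1))  # e00f:ffff::/32 (CEP-1048575)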

In our example, so far, 8191 CEPs have been allocated - only to operators of provider networks. CEPs are only needed by provider networks which SEN end-user networks use for their connection to the Net.  Assuming Ivip6 is widely adopted, this would include all, or almost all, provider networks.

CEP-0 is reserved. (The final design may reserve more low numbered or high-numbered CEPs for other purposes.) In our example, the allocated CEPs include:


CEP-0001 E000:0001::/32   ISP-A (has only one "site")

CEP-0002 E000:0002::/32 } ISP-B (has 30 "sites")
CEP-0003 E000:0003::/32 }
... }
CEP-0031 E000:001F::/32 }

CEP-0032 E000:0020::/32 } ISP-C (has two "sites")
CEP-0033 E000:0021::/32 }


It is not desirable to have a million CEPs, since each is advertised in BGP and so places a burden on the entire core routing system.  CEPs are only allocated to organisations which need them, and pay for them. (To Do: develop plans for administering these CEPs and for the commercial and regulatory aspects of Ivip6.)

The ISPs generally have other "conventional" prefixes, outside this special CEP set - as they do today. The ISPs use these "conventional" prefixes for their own internal purposes, and for some of their customers. Those customers use the space in today's "PA" manner. Whether they get a single IP address or a prefix, and whether they get it for a short dial-up or mobile session, or for some period of years, the space they get is only available as long as they use this ISP. It is "PA" - Provider Assigned - space and therefore not portable to other ISPs.

These conventional prefixes and their PA usage have nothing to do with the SPI space provided by Ivip6.

We will consider two end-user networks with SPI space: Net-X and Net-Y. For simplicity of explanation, their micronets are from the same MAB (Mapped Address Block).

In our example, the prefix 4000::/3 has been reserved for MAB prefixes. It is not absolutely necessary for all MABs to be in a reserved prefix such as this, but it would simplify the functionality of ITRs and ETRs.

In IPv4, for a map-encap system, there is no chance of making all the MABs appear in some clearly defined subset of the whole address space - since, over the next five to ten years, there needs to be progressive conversion of a great deal of the whole address space into MABs.

In IPv6, by administrative fiat, it would be easy for the IANA to carve out two special prefixes which would make the Ivip6 system simpler to implement. In addition to the above-mentioned E000::/12 reservation for 2^20 CEP prefixes, in our example the IANA reserves 1/8 of the entire IPv6 address space for MABs: 4000::/3 .  (See www.iana.org/assignments/ipv6-address-space for current assignments.)

Some company D - probably, but not necessarily an ISP or an RIR - has been assigned the MAB:

4000:0050::/24

There could be 2 million MABs of this size in the 4000::/3 reservation.  MABs don't necessarily need to be of the same size, or to have no gaps between them.

We don't want tens of millions of MABs. Ideally, we probably want a few dozen or at most a few hundred. Each MAB will have its own stream of mapping updates. Each OITRD will advertise one or more - or potentially all - active MABs.

D rents out some of this MAB's space to Net-X and Net-Y. This rental is effectively permanent. Unless D goes broke (in which case the MAB would be taken over by another such company, and probably administered to preserve the previous assignments), X and Y can have their space for as long as they like.

Sidebar on fees for mapping changes and for OITRD traffic

Both Net-X and Net-Y pay D for their space, such as a certain fee per year for each /64. They also pay D for the mapping changes they make. This would probably be a charge per update, or some flat fee for a certain number of updates per month.

In this fast-push mapping distribution system, it is important that end-users pay for the updates they send on the system. The fee may be as low as a few cents per update. These fees help pay for most of the fast-push system, especially the Launch servers and Replicators. This occurs through company D and others like it, who directly or indirectly pay for the operation of the fast-push system.

The fee per update also discourages "excessive" use - such as changing the mapping every few seconds for months on end - to implement fancy TE, or just to create annoyance. Each mapping change involves a small amount of computation, storage and communications bandwidth in the entire fast-push system and in all recipient QSDs.

The cost per update will be very low - low enough that end-users with busy networks will find it attractive to use frequent mapping changes to fine-tune the inbound TE of their multiple links.  The network's space would be split into separate micronets, each with some recipient hosts. By dynamically changing the ETR each micronet is mapped to, the incoming traffic volume can be managed in real time and directed as desired to each of the two or more ETRs, and so via each of the two or more links from the two or more ISPs.

Net-X and Net-Y also pay D for D's operation of a global network of OITRDs which handle packets addressed to the above-mentioned MAB, sent by hosts in non-upgraded networks. This means that Net-X and Net-Y will probably pay according to traffic flowing through the OITRDs which was addressed to each end-user's micronets.

This is because one SPI end-user network might have only a small amount of space, perhaps just a single micronet of /64, but could run a very popular web site on it, and so generate far more OITRD traffic than another end-user network, which has much more space.

D would have a sampling system to estimate OITRD traffic, since it would not make sense to count every byte.

In the following examples, ordinary IPv6 prefix notation will be used to show the base address and length of each micronet, but in practice the micronets can start and end at any /64 boundary.

Net-X has the micronet:

4000:0050:7000::/48

This is 65,536 contiguous /64s:

   4000:0050:7000::
to 4000:0050:7000:FFFF:FFFF:FFFF:FFFF:FFFF.

This sounds like quite a large micronet, but it is technically valid and perhaps there will be call for such micronets.

Net-Y's micronet has just two /64s:

4000:0050:9999:6666::/63

Micronets and UABs can range from a single /64 up, in principle, to as many /64s as fit in the MAB. In this case, the /24 MAB covers 2^40 /64s - about 1.1 trillion.
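The /64 counts quoted in this section are simple powers of two - a sketch:

# Number of /64s spanned by a micronet or MAB written in prefix notation.
def count_64s(prefix_len: int) -> int:
    return 1 << (64 - prefix_len)

print(count_64s(48))   # 65,536 - Net-X's micronet
print(count_64s(63))   # 2 - Net-Y's micronet
print(count_64s(24))   # 1,099,511,627,776 - the /24 MAB, ~1.1 trillion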

Before depicting the passage of a packet through the Ivip6 system, we will describe the function of the CEP prefixes.

While an ISP could use space within a CEP prefix for any purpose, here we assume that all ISPs use these prefixes solely for ETRs.

Our example involves two CEP prefixes:

CEP-0001 E000:0001::/32 ISP-A

CEP-0003 E000:0003::/32 ISP-B's "Site-2".


ISP-A advertises its CEP-0001 from a single border router.

ISP-B advertises its CEP-0003 from two border routers at its second site.

The BGP system treats these CEP prefixes exactly the same as any other BGP prefixes. (Note, ISPs must advertise these CEP prefixes intact - no more specific prefixes within them.)  All BGP routers therefore develop and maintain best paths for both these prefixes, and likewise for all the other CEP prefixes.

[#enhanced-rib]

Enhanced core router RIB functionality

Please also see separate page: > List of new functions for core routers

The new functionality in the RIB specifically recognises this set of 8191 (or however many) CEP prefixes, due to the fact that they are within the IANA defined prefix E000::/12.

The new RIB function is programmed to detect each such /32 CEP prefix, and to copy its FEC value (the internal value by which the router's FIB knows which interface to forward the packet from) to a special array in the FIB. This is the FLFEC[] array.

FLFEC[] is indexed 0 to 1048575.  

In a practical system, it is possible that the lower half of this range would be reserved for Forwarding Labels used by routers in the core (LFC), and the top half for Label Forwarding in Edge networks (LFE).

Each element in FLFEC[] stores a FEC value, copied straight from the FEC of the corresponding CEP in the RIB.

So in a given core router, if the BGP RIB has decided that the best path towards ISP-A's CEP-0001 is "Interface 7", then the FEC value which specifies "Interface 7" is copied to location 1 in FLFEC[].
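A sketch of that RIB-side bookkeeping.  The data structures and FEC values are illustrative assumptions - in a real router, FLFEC[] would live in the FIB's fast path:

import ipaddress

CEP_SPACE = ipaddress.IPv6Network("E000::/12")
FLFEC = [None] * 2**20            # index 0..1048575 -> FEC value

def on_best_path_update(prefix: ipaddress.IPv6Network, fec: int):
    # Whenever BGP selects a best path for a /32 within E000::/12,
    # copy its FEC into FLFEC[] at the CEP's 20 distinguishing bits.
    if prefix.prefixlen == 32 and prefix.subnet_of(CEP_SPACE):
        index = (int(prefix.network_address) >> 96) & 0xFFFFF
        FLFEC[index] = fec        # e.g. a FEC meaning "Interface 7"

# ISP-A's CEP-0001 best path is via a FEC we will call 7 ("Interface 7"):
on_best_path_update(ipaddress.IPv6Network("E000:0001::/32"), 7)
print(FLFEC[1])   # 7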

With Ivip6, it is required that all packets being handled in the BGP core have their Forwarding Label set according to the following rules:

Set to 0 if the packet has not had its Forwarding Label set to a particular value by any ITR.

Any non-zero value is assumed by all core routers (we assume in this example they are all upgraded to Ivip6 functionality) to mean that this packet's destination address is in a micronet which is currently mapped to some ETR address within a particular CEP - where this CEP's distinguishing bits 96 to 115 are directly specified by the value of the Forwarding Label.

Having set the stage, we now provide an example packet flow: a packet sent by a host, Host-A, to another host, Host-B.

Host-A is on a conventional address in some ISP's BGP advertised prefix, or in a conventional PI space end-user network.

Host-B is in Net-X's /48 micronet mentioned above:

4000:0050:7000::/48

Host-B's address is:

4000:0050:7000:1234::33.

Net-X is currently using ISP-B's second site for Internet access, and the address of the ETR to which incoming packets should be forwarded (via Ivip6's Forwarding Label in the Core system) is:

E000:0003:0000:0055::7

The packet is sent by Host-A and forwarded by its network's internal routing system towards a border router which also has ITR functions.  The ITR function recognises the packet as being addressed to somewhere in the SPI (Scalable PI) address space, since all such space is defined to be within a micronet - and since all micronets are within MABs, and all MABs are within the prefix 4000::/3, as is this packet's destination address.

In our example the ITR has no cached mapping information for this address. A subsequent packet from Host-A to Host-B will have a less complex process, due to the presence of cached mapping data in the ITR's FIB.

When the packet is analyzed by the FIB, the result is of the form:

This packet is addressed to a section of the address space which is known to be covered by the Ivip6 scheme, but the FIB currently has no mapping information for this particular address.

Therefore, hold the packet and query the routing processor (RIB) to ask for the mapping information for this address. Soon (10 or 20 msec max), when the reply arrives, the packet will have its Forwarding Label set and then will be forwarded to a BGP router in the core.

Subsequent packets matching the micronet which was specified in the mapping reply will be handled by a faster, FIB-only, process (next box) which sets the Forwarding Label to the same value, and again forwards the packet to the core.

This is one of four initial responses the FIB could produce. The other three are listed at psg.com/lists/rrg/2008/msg02029.html - briefly, they are:


Use cached mapping information for this packet's destination prefix to set the Forwarding Label, as above, before forwarding the packet to the core.


Send the packet conventionally, based on the normal FIB analysis of its destination address to determine the shortest BGP advertised prefix it is within.


Drop the packet, or process it via some slower and more arduous mechanism - which is not needed for Ivip6.
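These responses might be sketched as follows.  MAB_SPACE, the cache layout and the returned action strings are illustrative only:

import ipaddress

MAB_SPACE = ipaddress.IPv6Network("4000::/3")   # all MABs live here
mapping_cache = {}   # IPv6Network (micronet) -> 20 bit Forwarding Label

def itr_fib(dest: str) -> str:
    addr = ipaddress.IPv6Address(dest)
    if addr not in MAB_SPACE:
        return "forward conventionally (ordinary longest-match lookup)"
    for micronet, label in mapping_cache.items():
        if addr in micronet:
            return f"set Forwarding Label to {label:#07x} and forward"
    # The fourth response (drop / slow path) is not needed for Ivip6.
    return "hold packet, query the RIB for mapping (reply in 10-20 msec)"

print(itr_fib("4000:0050:7000:1234::33"))   # no cache yet: query the RIB
mapping_cache[ipaddress.IPv6Network("4000:0050:7000::/48")] = 0x00003
print(itr_fib("4000:0050:7000:1234::33"))   # now: set label 0x00003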

Once in the core the packet is handled by one or more upgraded BGP routers.

In our example, the ITR requests mapping information for the packet's destination address:

4000:0050:7000:1234::33

Actually, since the mapping system's granularity is /64, the map request is for the 64 bit value, in hex:

4000 0050 7000 1234

Within a few tens of milliseconds, the response from the local QSD (full database query server) comes back to the effect:

The queried address is within the micronet:

4000:0050:7000::/48

which is currently mapped to the ETR at:

E000:0003:0000:0055::7

Cache this response for 600 seconds.


The Ivip6 section of the ITR's RIB caches this information, and processes it into a form to be sent to the FIB:


Any incoming packet matching:

4000:0050:7000::/48

should have its Forwarding Label set to (hex):

0 0003

and should then be handled by the usual forwarding mechanism.


The RIB sends this to the FIB, and by one means or another the FIB matches the stored packet to this new rule. (600 seconds later, the RIB will tell the FIB to delete the above rule.)

[#enhanced-fib]

Enhanced core FIB functionality

Please also see separate page: > List of new functions for core routers

(This explanation is for a FIB function in the ITR, but the same function needs to be added to the FIBs of all core routers.)

Now the packet has its Forwarding Label set to (hex) 0 0003 and the FIB's forwarding mechanism (enhanced to do this Ivip6 additional function) looks at its Forwarding Label, discovers it is 3, and uses this to index into the array FLFEC[].

This produces the correct FEC value for this packet - the number which will cause it to be sent out the interface which leads to the BGP router which is the best path towards the prefix in which the ETR is located.

Once it reaches that core router, the same process happens:

Forwarding Label == 0?

Yes:  Use ordinary FIB process to analyze destination address
      until the longest matching prefix is found. Then use
      that information to look up the FEC for this prefix.

No:   Use the Forwarding Label to index into FLFEC[] to retrieve
      the FEC.  Forward according to this FEC value.
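A runnable sketch of this two-way decision.  FLFEC's contents, the stub longest-match lookup and the interface names are all illustrative stand-ins:

FLFEC = {0x00003: "Interface 7"}     # maintained by the RIB, as described above

def lookup_longest_match(dest):      # stub for the ordinary FIB process
    return "Interface 1"

def core_forward(dest, forwarding_label):
    if forwarding_label == 0:
        return lookup_longest_match(dest)   # conventional forwarding
    return FLFEC[forwarding_label]          # one array index instead

print(core_forward("4000:0050:7000:1234::33", 0x00003))   # Interface 7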

This process is repeated at as many core routers as the packet is forwarded through, until it reaches a BGP router at the border of the provider network in which the ETR is located.  This is the FLPER (Forwarding Label Path Exit Router).

This will be faster and simpler than the usual process of each core router analysing up to 48 bits in the destination address with the Tree-Bitmap algorithm.

In this way, as long as the packet is handled by an upgraded core router, it will be forwarded towards one of the border routers of ISP-B's Site-2.

Note that the packet does *not* contain any address which refers to the CEP prefix advertised by ISP-B's Site-2:

CEP-0003 E000:0003::/32

The Forwarding Label was set just once, by the ITR in the source site. Once set, the packet is easily handled by (upgraded) core routers and is forwarded towards whichever router or routers advertise this prefix.

When the packet reaches the border router for ISP-B's Site-2, this FLPER router performs a somewhat different operation.  It recognises that the value in the Forwarding Label (hex 0 0003) matches the above CEP prefix, which this router advertises.  So it zeroes the Forwarding Label and presents the packet to its normal FIB process.  In this example, ISP-B's internal routing system has a route which covers the destination address of the packet.

The standard FIB function of this border router, and of any other internal routers, now forwards the packet to the end-user network.

So in this example, there was no actual "ETR" function, other perhaps than the FLPER setting the Forwarding Label to 0.

If ISP-B's internal routing system did not handle the end-user network's prefix, then something else needs to happen.  This is discussed in the FLPER section above.



Transition: non-upgraded networks


The task of this transition arrangement is to ensure that packets sent by hosts in networks without ITRs are all forwarded to an OITRD, where they can have their mapping looked up and their Forwarding Label set appropriately. The same principles which apply to Ivip OITRDs apply also to Ivip6 OITRDs:

OITRDs should be distributed widely around the Net.

They should be able to handle peak packet rates without unreasonable losses.

Their locations should be chosen to minimize the chance of packets taking a longer overall path than they would without Ivip6.

They will be paid for by the organisations who rent micronet space to end-users.  See: #business-oitrd .


Transition: non-upgraded core routers


Ivip6 is only going to be useful once a substantial number, probably a majority, of BGP core routers have the Ivip6 upgrades. This is a significant hurdle for deployment, although perhaps tunneling could be used between upgraded routers initially when only a few DFZ routers are upgraded.  (This would raise packet length and PMTUD problems.)

It will probably be many years before IPv6 usage is so high that a scalable routing and addressing solution needs to be deployed - plenty of time for new routers to have the extra functions and for many older ones to be upgraded with firmware.

Ideally there needs to be a way the system can work reliably even when some percentage of routers are not upgraded - such as 10% or less.

Non-upgraded routers in the core and in edge networks are fine provided there are no ITRs or ETRs located behind them.

The most important thing to ensure is that each upgraded BGP router, including the border routers, never forwards to any non-upgraded router a packet which has its Forwarding Label set. The non-upgraded router would ignore the Forwarding Label, and do a standard BGP FIB operation on the destination address.

This undesirable situation would result in the packet being forwarded towards the nearest OITRD which is advertising the MAB which encloses the destination address. There, according to the above algorithms, the packet will have its Forwarding Label set again to the same value it already has, and that Forwarding Label will be used to forward it to a router which should take it towards the network which has the ETR.

The packet could easily get into a loop, and so be dropped when its Hop Limit (TTL) reaches zero.

There are probably better ways of ensuring packets with a non-zero Forwarding Label are only sent to upgraded core routers, but some techniques to protect against this might include:

Manually configure every upgraded BGP router not to accept routes matching E000::/12 from neighbours which are not upgraded.

and/or:

Manually configure every non-upgraded BGP router not to accept (and therefore not to offer to any neighbours) any routes matching E000::/12.
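Either filter comes down to the same test - does an advertised route fall within the reserved CEP space?  A sketch of that predicate:

import ipaddress

CEP_SPACE = ipaddress.IPv6Network("E000::/12")

def reject_route(route: str, neighbour_upgraded: bool) -> bool:
    # Reject any route within the reserved CEP space when the
    # neighbour is not upgraded (illustrative policy check only).
    return (not neighbour_upgraded) and \
           ipaddress.IPv6Network(route).subnet_of(CEP_SPACE)

print(reject_route("E000:0003::/32", neighbour_upgraded=False))  # True
print(reject_route("2001:db8::/32", neighbour_upgraded=False))   # False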

There may still be problems with not enough upgraded routers in a particular part of the core to handle the Forwarding Label forwarding of packets.


PMTUD


This proposal is at a very early stage of development, but it seems that there are no PMTUD problems with this approach.

The fact that packets do not get any longer is a major benefit compared with map-encap systems. Solving those problems, including making the best use of jumboframe paths in the DFZ, is quite challenging:

www.firstpr.com.au/ip/ivip/pmtud-frag/

Assuming the Hop Limit ("TTL") value is still decremented every time the packet is handled by a router, Traceroute should still work fine through the entire path, including the section where forwarding is controlled by the Forwarding Label.

At any router in this part of the path, if the packet is too long for the next-hop MTU, the router should be able to send a Packet Too Big (PTB) message to the sending host.

There is a potential gotcha here:

Would the sending host recognise the PTB message, if it comes back with a copy of the start of the too-long packet containing a value in what the host regards as the "Flow Label" which differs from the all-zeroes value the packet was sent with?

There is nothing in RFC1981 specifying how fussy the sending host should be before accepting a PTB message as valid.  The PTB message itself (RFC1885) sends back a large slab of the packet which gave rise to it: "As much of invoking packet as will fit without the ICMPv6 packet exceeding 576 octets."  Since the IPv6 header is 40 bytes and the ICMPv6 PTB header is 8 bytes, up to 528 bytes of the offending packet will be returned.

To Do: try to figure out how fussy current IPv6 implementations are about accepting PTBs.  In the longish timescale we have for deploying an IPv6 scalable routing solution, there is plenty of scope for altering host OS code to cope with a packet fragment containing a different "Flow Label" to that in the original packet.

A similar problem applies to the other ICMPv6 error messages: Destination Unreachable, Time Exceeded and Parameter Problem.  The packet fragment which the host needs to check carefully against its record of recently sent packets may have a different "Flow Label" value to what was sent.  Hosts need to be fussy about accepting these messages, to protect against spoofed packets (with values guessed by an attacker, who is assumed not to be in the path of the outgoing packets) causing a DoS problem.

A messy workaround, rather than changing fussy host software, would be for the complaining router to send the PTB with a copy of the original packet, but with the Forwarding Label bits (the "Flow Label" bits as far as old host software is concerned) set to 0.  

Then what if the sending host included an ITRH function and set the Forwarding Label bits to some non-zero value?  Hopefully the rest of that host's software would be upgraded too, and so be wise to the new use of these bits - ignoring them when checking the ICMPv6 Packet Too Big message's copy of the initial part of the packet.
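An upgraded host might validate the echoed header while ignoring the 20 label bits, along these lines.  The offsets follow RFC 2460's fixed header layout; everything else is an illustrative assumption:

def mask_flow_label(header: bytes) -> bytes:
    h = bytearray(header[:40])    # the fixed IPv6 header is 40 bytes
    h[1] &= 0xF0                  # low nibble of byte 1: label bits 19-16
    h[2] = 0                      # label bits 15-8
    h[3] = 0                      # label bits 7-0
    return bytes(h)

def ptb_matches(sent: bytes, echoed: bytes) -> bool:
    # Compare sent and echoed headers with the label bits blanked out.
    return mask_flow_label(sent) == mask_flow_label(echoed)

sent   = bytes([0x60, 0x00, 0x00, 0x00]) + bytes(36)   # label 0, as sent
echoed = bytes([0x60, 0x00, 0x00, 0x03]) + bytes(36)   # label now 0x00003
print(ptb_matches(sent, echoed))                        # True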

This is a major advantage over map-encap schemes, where the source address of the too-big packet may be that of the ITR (not with Ivip, which uses the sending host's address) and where the too-big packet is longer than and different from the packet sent by the sending host - resulting in any PTB message not being recognised by the sending host.

Any translation scheme (Six/One Router is the only one so far) would have serious difficulties with PMTUD in the translated part of the path, since the packet has different addresses to those it had when it left the sending host. So even if the PTB was somehow sent back to that host, a properly implemented PMTUD system on that host would fail to recognise the PTB as relating to any packet this host sent.


TTR Mobility


Any map-encap scheme, and Ivip in particular, can support a global mobility scheme with highly attractive characteristics. A paper on this will appear soon. For now, the  descriptive material at:

www.firstpr.com.au/ip/ivip/#mobile

describes the Translating Tunnel Router approach to extending a map-encap scheme for mobility.

It is not necessary to change the mapping every time the mobile node gets a new care-of address. Typically a mapping change, to select a new TTR, is only required when the care-of-address moves more than about 1000km or so from wherever the current TTR is.

The TTR principles should apply in general to a system such as Ivip6. Instead of tunneling packets across the DFZ to the ETR-like TTR, they would be forwarded according to the Forwarding Label.

However, the Forwarding Label approach won't work for taking packets between the mobile node and the TTR.  So tunneling should be used for this, as described in the above-mentioned material.

This raises some PMTUD problems. Fortunately, the TTR <--> MN tunnel technology is not related at all to the map-encap scheme or to the Ivip6 system, and can be negotiated at set-up time between the TTR and MN. This means that there does not need to be a single fixed technology for this tunneling, enabling a variety of techniques, innovation, and more localized potential solutions to PMTUD.

Typically, the tunnel between the TTR and the MN (actually, the MN can make tunnels to multiple TTRs) will be two-way and use the same techniques as encrypted VPNs. These two-way tunnels are a lot easier to handle PMTUD over than the so-called tunnels from an ITR to an ETR in a map-encap system, where an ITR has to get packets to an ETR with which it has had no prior contact and with which it cannot reasonably engage in extensive communications.

The type of tunnel used will naturally cause PTB messages to be sent to the TTR or the MN.  The MN can modify its packet size accordingly.  The TTR has two choices: it can either fragment the long packets it is getting from one or more hosts via one or more ITRs, or it can send back a PTB message for any packet which, once encapsulated, exceeds the tunnel MTU.  Any PTB message the TTR generates in response to packets arriving via ITRs using Label Forwarding in the Core will naturally go straight back to the sending host.
