draft-ietf-idr-bgp-nh-cost-00.txt   draft-ietf-idr-bgp-nh-cost-01.txt 
Internet Engineering Task Force I. Varlashkin Internet Engineering Task Force I. Varlashkin
Internet-Draft Easynet Global Services Internet-Draft Easynet Global Services
Intended status: Standards Track R. Raszuk Intended status: Standards Track R. Raszuk
Expires: August 2, 2012 NTT MCL Inc. Expires: September 28, 2012 NTT MCL Inc.
January 30, 2012 March 27, 2012
Carrying next-hop cost information in BGP Carrying next-hop cost information in BGP
draft-ietf-idr-bgp-nh-cost-00 draft-ietf-idr-bgp-nh-cost-01
Abstract Abstract
This document describes new BGP SAFI to exchange cost information to This document describes new BGP SAFI to exchange cost information to
next-hops for the purpose of calculating best path from a peer next-hops for the purpose of calculating best path from a peer
perspective rather than local BGP speaker own perspective. perspective rather than local BGP speaker own perspective.
Status of this Memo Status of this Memo
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
skipping to change at page 1, line 33 skipping to change at page 1, line 33
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 2, 2012. This Internet-Draft will expire on September 28, 2012.
Copyright Notice Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 32 skipping to change at page 2, line 32
4. USING BGP TO POPULATE NHIB . . . . . . . . . . . . . . . . . . 4 4. USING BGP TO POPULATE NHIB . . . . . . . . . . . . . . . . . . 4
4.1. NEXT-HOP SAFI . . . . . . . . . . . . . . . . . . . . . . . 4 4.1. NEXT-HOP SAFI . . . . . . . . . . . . . . . . . . . . . . . 4
4.2. CAPABILITY ADVERTISEMENT . . . . . . . . . . . . . . . . . 4 4.2. CAPABILITY ADVERTISEMENT . . . . . . . . . . . . . . . . . 4
4.3. INFORMATION ENCODING . . . . . . . . . . . . . . . . . . . 4 4.3. INFORMATION ENCODING . . . . . . . . . . . . . . . . . . . 4
4.4. SESSION ESTABLISHMENT . . . . . . . . . . . . . . . . . . . 5 4.4. SESSION ESTABLISHMENT . . . . . . . . . . . . . . . . . . . 5
4.5. INFORMATION EXCHANGE . . . . . . . . . . . . . . . . . . . 5 4.5. INFORMATION EXCHANGE . . . . . . . . . . . . . . . . . . . 5
4.6. TERMINATION OF NH SAFI SESSION . . . . . . . . . . . . . . 6 4.6. TERMINATION OF NH SAFI SESSION . . . . . . . . . . . . . . 6
4.7. GRACEFUL RESTART AND ROUTE REFRESH . . . . . . . . . . . . 6 4.7. GRACEFUL RESTART AND ROUTE REFRESH . . . . . . . . . . . . 6
5. Security considerations . . . . . . . . . . . . . . . . . . . . 6 5. Security considerations . . . . . . . . . . . . . . . . . . . . 6
6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 6 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 6
In -00:
7. References . . . . . . . . . . . . . . . . . . . . . . . . . . 6
7.1. Normative References . . . . . . . . . . . . . . . . . . . 6
7.2. Informative References . . . . . . . . . . . . . . . . . . 6
In -01:
7. Acknowledgment . . . . . . . . . . . . . . . . . . . . . . . . 7
8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 7
8.1. Normative References . . . . . . . . . . . . . . . . . . . 7
8.2. Informative References . . . . . . . . . . . . . . . . . . 7
Appendix A. USAGE SCENARIOS . . . . . . . . . . . . . . . . . . . 7 Appendix A. USAGE SCENARIOS . . . . . . . . . . . . . . . . . . . 7
A.1. Trivial case . . . . . . . . . . . . . . . . . . . . . . . 7 A.1. Trivial case . . . . . . . . . . . . . . . . . . . . . . . 7
A.2. Non-IGP based cost . . . . . . . . . . . . . . . . . . . . 7 A.2. Non-IGP based cost . . . . . . . . . . . . . . . . . . . . 8
A.3. Multiple route-reflectors . . . . . . . . . . . . . . . . . 8 A.3. Multiple route-reflectors . . . . . . . . . . . . . . . . . 8
A.4. Inter-AS MPLS VPN . . . . . . . . . . . . . . . . . . . . . 8 A.4. Inter-AS MPLS VPN . . . . . . . . . . . . . . . . . . . . . 9
A.5. Corner case . . . . . . . . . . . . . . . . . . . . . . . . 8 A.5. Corner case . . . . . . . . . . . . . . . . . . . . . . . . 9
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 9 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 9
1. Motivation 1. Motivation
In certain situation route-reflector clients may not get optimum path In certain situation route-reflector clients may not get optimum path
to certain destinations. ADDPATH solves this problem by letting to certain destinations. ADDPATH solves this problem by letting
route-reflector to advertise multiple paths for given prefix. If route-reflector to advertise multiple paths for given prefix. If
number of advertised paths sufficiently big, route-reflector clients number of advertised paths sufficiently big, route-reflector clients
can choose same route as they would in case of full-mesh. This can choose same route as they would in case of full-mesh. This
approach however places additional burden on the control plane. approach however places additional burden on the control plane.
skipping to change at page 4, line 35 skipping to change at page 4, line 35
A BGP speaker willing to exchange next-hop information MUST advertise A BGP speaker willing to exchange next-hop information MUST advertise
this in the OPEN message using BGP Capability Code 1 (Multiprotocol this in the OPEN message using BGP Capability Code 1 (Multiprotocol
Extensions, see [RFC4760]) setting AFI appropriately to indicate IPv4 Extensions, see [RFC4760]) setting AFI appropriately to indicate IPv4
or IPv6 and SAFI to the value assigned by IANA for NH SAFI. Note or IPv6 and SAFI to the value assigned by IANA for NH SAFI. Note
that if BGP speaker wishes to exchange cost information for both that if BGP speaker wishes to exchange cost information for both
IPv4 and IPv6, then it MUST advertise two capabilities: one NH SAFI IPv4 and IPv6, then it MUST advertise two capabilities: one NH SAFI
for IPv4 and one NH SAFI for IPv6. for IPv4 and one NH SAFI for IPv6.
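As an illustration only (not part of either draft revision), the capability advertisement described above can be sketched in Python. The capability layout (code 1, length 4, then AFI, a reserved octet, and SAFI) follows RFC 4760. The NH SAFI value has not been assigned by IANA, so NH_SAFI_PLACEHOLDER is a hypothetical value picked from the private-use range, and the function names are assumptions made for this sketch.

```python
import struct

# Hypothetical placeholder: the draft asks IANA to assign the real NH SAFI value.
NH_SAFI_PLACEHOLDER = 241

AFI_IPV4 = 1
AFI_IPV6 = 2


def mp_capability(afi, safi):
    """Encode one Multiprotocol Extensions capability (RFC 4760)."""
    # Capability code 1, length 4, then AFI (2 octets),
    # reserved (1 octet, zero) and SAFI (1 octet).
    return struct.pack("!BBHBB", 1, 4, afi, 0, safi)


def nh_safi_capabilities(ipv4=True, ipv6=True):
    """Return one capability per address family, as the text requires."""
    caps = []
    if ipv4:
        caps.append(mp_capability(AFI_IPV4, NH_SAFI_PLACEHOLDER))
    if ipv6:
        caps.append(mp_capability(AFI_IPV6, NH_SAFI_PLACEHOLDER))
    return caps
```

A speaker that wants cost information for both IPv4 and IPv6 would place both returned capabilities in its OPEN message.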
4.3. INFORMATION ENCODING 4.3. INFORMATION ENCODING
Added in -01:
Routers use standard BGP UPDATE messages to exchange NH SAFI
information. Cost to reachable next-hops is communicated using
MP_REACH_NLRI (attribute 14) with NLRI part as described below.
Requests are also sent using MP_REACH_NLRI. Informing a neighbour
about unreachable next-hop is done using MP_UNREACH_NLRI. All NH
SAFI messages MUST contain BGP COMMUNITY attribute with value
NO_ADVERTISE (0xFFFFFF02) and their propagation MUST follow normal
BGP rules (i.e. they're not to be propagated).
To request cost to a next-hop from peer or to inform peer about cost To request cost to a next-hop from peer or to inform peer about cost
to a next-hop BGP attribute 14 is used as follows: to a next-hop BGP attribute 14 is used as follows:
1. AFI is set to indicate IPv4 or IPv6 (whichever is appropriate) 1. AFI is set to indicate IPv4 or IPv6 (whichever is appropriate)
2. SAFI is set to NH SAFI 2. SAFI is set to NH SAFI
3. Network Address of Next-Hop field is zeroed out 3. Network Address of Next-Hop field is zeroed out
4. NLRI field is encoded as shown in the next figure 4. NLRI field is encoded as shown in the next figure
In -00:
+-------------+------------+
|  NEXT_HOP   |    cost    |
+-------------+------------+
Where cost is 32-bit unsigned integer (value described below), and
NEXT_HOP is AFI-specific address of the next-hop cost to which is
being communicated or requested. Size of NEXT_HOP field is inferred
from total length of attribute 14.
To request cost to arbitrary next-hop from a peer, BGP speaker sets
cost field to zero.
To inform peer about cost to a next-hop BGP speaker sets cost to
actual cost value.
To inform peer that a next-hop is not reachable the cost is set to
all-ones (0xFFFFFFFF).
In -01:
Format of NH SAFI NLRI is as follows:
+-----+------+-------+----------+------+
| AFI | SAFI | Flags | NEXT_HOP | cost |
+-----+------+-------+----------+------+
Flags - 1 octet field. Least significant bit MUST be set to 1 for
Request and to zero for Response.
AFI/SAFI fields can be set either to one of the registered values to
indicate that next-hop cost info applies only to specified AFI/SAFI.
Alternatively when both fields are set to zero, the cost
information applies to any compatible AFI/SAFI negotiated with given
peer.
Next-hop - IPv4 or IPv6 address for which cost is being communicated
or requested. Type is determined from context, and length is
inferred from total length of attribute.
Cost is 32-bit unsigned integer (value described below), and NEXT_HOP
is AFI-specific address of the next-hop cost to which is being
communicated or requested. Size of NEXT_HOP field is inferred from
total length of attribute 14.
To inform peer that particular next-hop is unreachable
MP_UNREACH_NLRI attribute is used with same NLRI format as described
above. In this case cost field SHOULD be set to 0xFFFFFFFF.
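A minimal sketch of the -01 NLRI layout (AFI, SAFI, Flags, NEXT_HOP, cost) in Python may help make the encoding concrete. It is an illustration under stated assumptions, not text from the draft: the draft gives widths only for Flags (1 octet) and cost (32 bits), so the 2-octet AFI and 1-octet SAFI used here are assumed by analogy with MP_REACH_NLRI, and the function names are hypothetical. The NEXT_HOP length is inferred from the total length, which this sketch takes to mean a single NLRI per buffer.

```python
import ipaddress
import struct

FLAG_REQUEST = 0x01            # least significant bit: 1 = Request, 0 = Response
UNREACHABLE_COST = 0xFFFFFFFF  # all-ones cost marks an unreachable next-hop


def encode_nh_nlri(next_hop, cost, request=False, afi=0, safi=0):
    """Encode one NH SAFI NLRI; afi=safi=0 means "any compatible AFI/SAFI"."""
    addr = ipaddress.ip_address(next_hop)
    flags = FLAG_REQUEST if request else 0
    # Assumed widths: AFI 2 octets, SAFI 1 octet, Flags 1 octet, cost 4 octets.
    header = struct.pack("!HBB", afi, safi, flags)
    return header + addr.packed + struct.pack("!I", cost)


def decode_nh_nlri(data):
    """Decode one NLRI, inferring the NEXT_HOP size from the total length."""
    nh_len = len(data) - 8  # 4 octets of header plus 4 octets of cost
    if nh_len not in (4, 16):
        raise ValueError("NEXT_HOP must be 4 (IPv4) or 16 (IPv6) octets")
    afi, safi, flags = struct.unpack("!HBB", data[:4])
    next_hop = ipaddress.ip_address(data[4:4 + nh_len])
    (cost,) = struct.unpack("!I", data[4 + nh_len:])
    return {
        "afi": afi,
        "safi": safi,
        "request": bool(flags & FLAG_REQUEST),
        "next_hop": next_hop,
        "cost": cost,
    }
```

For example, encode_nh_nlri("192.0.2.1", 0, request=True) builds a Request for the cost to 192.0.2.1, and a Response for an unreachable next-hop would carry UNREACHABLE_COST in the cost field.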
4.4. SESSION ESTABLISHMENT 4.4. SESSION ESTABLISHMENT
BGP speakers willing to exchange next-hop information SHOULD NOT BGP speakers willing to exchange next-hop information SHOULD NOT
establish more than one session for given AFI and NH SAFI, even using establish more than one session for given AFI and NH SAFI, even using
different transport addresses. This can be ensured for example by different transport addresses. This can be ensured for example by
checking peer's Router Id. checking peer's Router Id.
4.5. INFORMATION EXCHANGE 4.5. INFORMATION EXCHANGE
skipping to change at page 5, line 41 skipping to change at page 6, line 11
without waiting for response, and its peers MAY send cost information without waiting for response, and its peers MAY send cost information
before or after receiving such request. On the other hand, Route before or after receiving such request. On the other hand, Route
Reflectors SHOULD request cost information from their internal peers Reflectors SHOULD request cost information from their internal peers
as soon as possible (due to reasons stated in section "BGP best path as soon as possible (due to reasons stated in section "BGP best path
selection modification"). BGP speaker does not need to track selection modification"). BGP speaker does not need to track
outstanding requests to the peer. outstanding requests to the peer.
When a BGP speaker receives request for cost information it MUST When a BGP speaker receives request for cost information it MUST
reply with actual cost (not necessarily IGP cost, but whatever has reply with actual cost (not necessarily IGP cost, but whatever has
been chosen to be carried in NH SAFI) to given next-hop or with cost been chosen to be carried in NH SAFI) to given next-hop or with cost
set to all-ones indicating that next-hop is unreachable. set to all-ones indicating that next-hop is unreachable.
In -00:
Note that BGP speaker MUST use longest match rather than exact match
for the next-hop.
In -01:
If next-hop information is obtained from sender's routing table,
then sender MUST perform lookup exactly the same way as it would for
resolving next-hop in BGP UPDATE message. For example, for
non-labelled destinations (e.g. AFI/SAFI 1/1 or 2/1) lookup would be
done using longest match, whereas for labelled IPv4 (AFI/SAFI 1/4,
1/128 or 2/4) exact-match would be used.
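The difference between longest-match and exact-match resolution can be sketched as follows. This is an illustration only: the routing table is modelled as a plain dictionary from prefix string to cost, the function name is hypothetical, and "exact match" is interpreted here as requiring a host-length prefix (/32 or /128) for the next-hop, which is an assumption rather than wording from the draft.

```python
import ipaddress

UNREACHABLE_COST = 0xFFFFFFFF  # returned when no matching route exists


def resolve_cost(table, next_hop, exact=False):
    """Return the cost to next_hop from a {prefix: cost} table.

    exact=False: longest-prefix match (non-labelled case).
    exact=True:  only a host route for next_hop qualifies (labelled case).
    """
    addr = ipaddress.ip_address(next_hop)
    best = None  # (prefix length, cost) of the best candidate so far
    for prefix, cost in table.items():
        net = ipaddress.ip_network(prefix)
        if net.version != addr.version or addr not in net:
            continue
        if exact and net.prefixlen != addr.max_prefixlen:
            continue
        if best is None or net.prefixlen > best[0]:
            best = (net.prefixlen, cost)
    return best[1] if best is not None else UNREACHABLE_COST
```

With a table holding 10.0.0.0/8, 10.1.0.0/16 and 10.1.1.1/32, a longest-match lookup for 10.1.2.3 resolves through the /16, while an exact-match lookup for the same address finds nothing and reports the next-hop as unreachable.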
When a BGP speaker detects change in cost to previously advertised When a BGP speaker detects change in cost to previously advertised
next-hop with delta equal or exceeding configured advertisement next-hop with delta equal or exceeding configured advertisement
In -00:
threshold, it SHOULD inform peer by advertising new cost or
0xFFFFFFFF.
In -01:
threshold, it SHOULD inform peer by sending MP_UNREACH_NLRI as
described earlier.
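The threshold rule above amounts to comparing the change in cost against a configured value. A minimal sketch, assuming the delta is the absolute difference between the last advertised cost and the current cost (the draft does not define it more precisely) and using a hypothetical function name:

```python
def should_advertise(last_advertised_cost, current_cost, threshold):
    """True when the cost change equals or exceeds the configured threshold."""
    delta = abs(current_cost - last_advertised_cost)
    return delta >= threshold
```

With a threshold of 10, a change from 100 to 105 would be suppressed, while a change from 100 to 110 would trigger a new advertisement.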
When a BGP speaker discovers new next-hop among candidate routes it When a BGP speaker discovers new next-hop among candidate routes it
SHOULD request cost information from the peer. SHOULD request cost information from the peer.
4.6. TERMINATION OF NH SAFI SESSION 4.6. TERMINATION OF NH SAFI SESSION
When BGP speaker terminates (for whatever reason) NH SAFI session When BGP speaker terminates (for whatever reason) NH SAFI session
with a peer, it SHOULD remove all cost information received from that with a peer, it SHOULD remove all cost information received from that
peer unless instructed by configuration to do otherwise. peer unless instructed by configuration to do otherwise.
4.7. GRACEFUL RESTART AND ROUTE REFRESH 4.7. GRACEFUL RESTART AND ROUTE REFRESH
NH SAFI sessions could use graceful restart and route refresh NH SAFI sessions could use graceful restart and route refresh
In -00:
mechanisms in the same way as it's used for IPv4 and IPv6 unicast.
In -01:
mechanisms in the same way as it's used for IPv4 and IPv6 unicast -
preservation and purge of next-hop cost information follows normal GR
rules.
5. Security considerations 5. Security considerations
No new security issues are introduced to the BGP protocol by this No new security issues are introduced to the BGP protocol by this
specification. specification.
6. IANA Considerations 6. IANA Considerations
IANA is requested to allocate value for Next-Hop Subsequent Address IANA is requested to allocate value for Next-Hop Subsequent Address
Family Identifier. Family Identifier.
In -00:
7. References
7.1. Normative References
In -01:
7. Acknowledgment
Authors would like to thank Keyur Patel, Anton Elita, Nagendra Kumar
for critical reviews and feedback.
8. References
8.1. Normative References
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
Protocol 4 (BGP-4)", RFC 4271, January 2006. Protocol 4 (BGP-4)", RFC 4271, January 2006.
[RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter, [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
"Multiprotocol Extensions for BGP-4", RFC 4760, "Multiprotocol Extensions for BGP-4", RFC 4760,
January 2007. January 2007.
7.2. Informative References 8.2. Informative References
[I-D.raszuk-bgp-optimal-route-reflection] [I-D.raszuk-bgp-optimal-route-reflection]
Raszuk, R., Cassar, C., Aman, E., and B. Decraene, "BGP Raszuk, R., Cassar, C., Aman, E., and B. Decraene, "BGP
Optimal Route Reflection (BGP-ORR)", Optimal Route Reflection (BGP-ORR)",
draft-raszuk-bgp-optimal-route-reflection-01 (work in draft-raszuk-bgp-optimal-route-reflection-01 (work in
progress), March 2011. progress), March 2011.
[RFC2918] Chen, E., "Route Refresh Capability for BGP-4", RFC 2918, [RFC2918] Chen, E., "Route Refresh Capability for BGP-4", RFC 2918,
September 2000. September 2000.
 End of changes. 19 change blocks. 
34 lines changed or deleted 62 lines changed or added
