logo资料库

以内容为中心——DONA架构.pdf

第1页 / 共12页
第2页 / 共12页
第3页 / 共12页
第4页 / 共12页
第5页 / 共12页
第6页 / 共12页
第7页 / 共12页
第8页 / 共12页
资料共12页,剩余部分请下载后查看
A Data-Oriented (and Beyond) Network Architecture ∗† Teemu Koponen tkoponen@iki.fi ‡ Andrey Ermolinskiy andreye@cs.berkeley.edu mchawla@cs.berkeley.edu ‡ Mohit Chawla ‡ Kye Hyun Kim kyekim@cs.berkeley.edu ‡ Ion Stoica istoica@cs.berkeley.edu ‡ ∗‡ Byung-Gon Chun bgchun@cs.berkeley.edu Scott Shenker shenker@icsi.berkeley.edu ABSTRACT The Internet has evolved greatly from its original incarnation. For instance, the vast majority of current Internet usage is data retrieval and service access, whereas the architecture was designed around host-to-host applications such as telnet and ftp. Moreover, the original Internet was a purely transparent carrier of packets, but now the various network stakeholders use middleboxes to improve security and accelerate applications. To adapt to these changes, we propose the Data-Oriented Network Architecture (DONA), which involves a clean-slate redesign of Internet naming and name resolution. Categories and Subject Descriptors C.2.5 [Computer-Communication Networks]: Local and Wide- Area Networks—Internet; C.2.1 [Computer-Communication Net- works]: Network Architecture and Design General Terms Design Keywords Naming, Internet architecture, name resolution, data, middleboxes 1 The DNS name resolution system is a fundamental part of today’s Internet, underlying almost all Internet usage. However, the DNS was developed rather late in the Internet’s evolution, after many basic pieces of the architecture were in place. For instance, TCP sessions were already bound to IP addresses and the Berkeley Socket API referred to addresses, not names; frozen design decisions, such as these, limited the extent to which DNS names (or any other naming ∗ International Computer Science Institute (ICSI) † Helsinki Institute for Information Technology (HIIT) ‡ UC Berkeley, Computer Science Division Introduction Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGCOMM’07, August 27–31, 2007, Kyoto, Japan. Copyright 2007 ACM 978-1-59593-713-1/07/0008 ...$5.00. system) could permeate the architecture. As a result, the current role of naming in the architecture is more an accident of history than the result of principled architectural design. In this proposal, we take a “clean-slate” look at naming and name resolution. The test of any architecture is whether it gracefully accommodates a wide spectrum of potential uses (and can withstand potential abuses), both those we encounter in the present and those we anticipate for the future. However, to motivate our design, we first focus more narrowly on one particular issue; the shift in usage from host-centric to data-centric applications. The first Internet applications, such as file transfer and remote login, focused strictly on host-to-host communication: The user explicitly directed the source to communicate with another host, and the network’s only role was to carry packets to the destination address in the packet header. The Internet architecture was built around this host-to-host model and, as a result, the architecture is well-suited for communication between pairs of stationary hosts. Today, however, the vast majority of Internet usage is data retrieval and service access, where the user cares about content and is oblivious to location. That is, the user knows that she wants headlines from CNN, or videos from YouTube, or access to her bank account, but does not know or care on which machine the desired data or service resides. The current architecture can support data/service access, as is obvious from its prevalence on the Internet, but it does not fit comfortably within the host-to-host model. For instance, consider the following three user-relevant issues: • Persistence: once given a name for some data or service, the user would like that name to remain valid as long as the underlying data or service is available. There should be no equivalent of today’s “broken links” when data is moved to another site. Today, HTTP redirect and dynamic DNS are used to minimize this problem, but they are not sufficient answers. For instance, neither works if the data changes administrative domains, unless the operator of the previous domain provides perpetual support. • Availability: data and services should have high availability, in terms of both reliability and low-latency. Availability is usually provided by replication at endpoints, and the network’s role is to allow user requests to find nearby copies. The first large-scale solution to this was deployed by Akamai, using intelligent DNS servers and URL rewriting. More recently, P2P mechanisms like BitTorrent have become prevalent. While the success of these mechanisms is undeniable, it is not at all clear that such a fundamental
requirement, availability, should rely on a set of ad hoc and application-specific mechanisms. • Authenticity: users would like to know that the data came from the appropriate source, rather than from some spoofing adversary. Today this requires a PKI to provide users with the public key of the provider. Moreover, authenticity today is typically achieved by securing the channel to the source, rather than explicitly authenticating the data. Thus, several of the most natural features one would want for service access and data retrieval — persistence, availability, and authentication — are made unnecessarily hard by the current host- to-host model of the Internet, often requiring awkward or expensive work-arounds. Given this discordance between historical design (host-oriented) and current usage (data-oriented), we ask: what would the architecture look like if we built it around service and data access? Somewhat surprisingly, our research suggests that most of the necessary changes reside in how Internet names are structured and resolved. We propose replacing DNS names with flat, self-certifying names, and replacing DNS name resolution with a name-based anycast primitive that lives above the IP layer. We call the resulting design the Data-Oriented Network Architecture (DONA). DONA improves data retrieval and service access by providing stronger and more architecturally coherent support for persistence, availability, and authentication. It can also be extended to provide support for caching and RSS-like updates. However, DONA’s impact is not limited to data and service access; we use these applications as motivating examples because they force us to think differently about some fundamental issues, but most of these issues are not particular to data/service access. As a result, as we describe below, DONA’s overall design has architectural implications that range far beyond data/service access. DONA’s name-based anycast primitive is useful for many kinds it can provide the basic of resource discovery; for instance, primitives underlying SIP, support host mobility and multihoming, and establish forwarding state for interdomain multicast. Placing anycast at the naming layer, rather than at the IP layer, allows us to design for functionality rather than be limited by concerns about scalability, since the mechanisms need not operate at link speed. There is another issue where historical design is at odds with current usage. The original Internet architecture, following the end-to-end principle, intended the network to be a purely trans- parent carrier of packets. Today, however, the various network stakeholders (such as enterprises) use middleboxes to improve security (e.g., firewalls, proxies) and accelerate applications (e.g., caches) [3]. Because DONA’s anycast name resolution process follows essentially the same administrative path as the ensuing data packets (we will address the subtleties in this statement later), DONA can treat the stakeholders along the path as relevant Internet actors. This allows DONA to provide clean support for network- imposed middleboxes. This isn’t a repudiation of the end-to-end principle, in that functionality is still provided at the ends; it is merely a recognition that operators should have, at their disposal, architecturally coherent mechanisms to control how and what traffic traverses their network. More recently, there has been much hand-wringing about the scalability of routing in the current addressing paradigm [27]. DONA’s anycast primitive provides a discovery mechanism that lives above the IP layer; as we will later describe, this enables the use of path-labels rather than global addresses, an approach that results in tiny interdomain routing tables. At a more speculative level, DONA represents a partial shift away from sender-based primitives to a more receiver-based approach. One of our future research tasks is to explore how far we can go in this direction, and what this might mean for a future Internet. These architectural implications encouraged us that DONA is not merely restricted to data and service access (which, by itself, is significant as it is by far the dominant usage on the current Internet), but rather facilitates improvements along many dimensions. However, there are many other issues, not discussed here, that demand attention: the Internet still needs better security (particularly against DoS and malicious/misconfigured routers), better manageability, better usability, and many other properties. We aren’t proposing DONA as a solution to these problems; in fact, we think DONA is largely orthogonal to them. We hope to eventually incorporate work on these problems within a larger framework that also includes DONA. The next section presents DONA’s basic design and Section 3 describes how this design supports such tasks as server selection, mobility, multihoming, session initiation, and interdomain multicast state establishment. Section 4 discusses how DONA’s infrastructure could be extended to support more advanced functionality, such as content delivery, delay-tolerant networking, and a variety of administrative access policies (including middlebox insertion). Section 5 discusses our prototype implementation and Section 6 addresses the crucial question of DONA’s feasibility. The name- based anycast primitive will require routing on a very large namespace, but it need only be done at name resolution speeds, not line speeds; we present various estimates indicating that DONA is within reach of today’s technology. We delay our discussion of related work until Section 7, in order to have enough context to make the necessary connections. For now, we merely note that almost every aspect of our design is (proudly) stolen from elsewhere, most notably from TRIAD [19], HIP [28], and SFS [25]. It is the synthesis of these various ideas into a coherent architecture that we claim as our contribution. The paper ends, in Section 8, with some speculations on DONA’s broader architectural implications. In particular, we discuss the possibility of basing the interface offered to applications on DONA’s name-based anycast primitive. 2 Basic Design DONA involves a major redesign of Internet naming. In this section we first motivate these changes and then present DONA’s naming structure and name resolution mechanism. This is followed by a brief discussion of security and addressing issues. For lack of space, many details are omitted. 2.1 Motivation We start with the problem of service and data access and ask how we might easily achieve persistence, availability, and authentication, basic tasks which today are (sometimes badly) handled by external ad hoc mechanisms. In DONA, we propose a strict separation of concerns between naming and name resolution in handling these tasks: names handle persistence and authenticity, while name resolution handles availability. To provide persistence and authenticity, we use flat, self-certifying names [25,28]. This form of naming is, by now, a standard technique. As we review shortly in Section 2.2, such names will remain invariant and enable easy authentication. The use of flat names makes informal identification harder (since you can’t remember your friend’s 128-bit identifier), but it makes formal authentication easier. High availability means that when a user requests data by name,
she receives the data quickly and reliably. To provide availability, the name resolution process should (a) guide requests to nearby copies of the data, and (b) avoid failed or overloaded servers. The question, then, is how to build such a name resolution process. There are two main resolution paradigms in the literature. The first is what we use today: lookup-by-name in a distributed database, which returns the location (IP address) of a nearby copy. This database must maintain locations of all the copies, identify the location of the requester, and then find a reasonably good match between the two. Akamai has pioneered the development of techniques to accomplish this, but it clearly requires significant mechanism to achieve. The other possibility, most notably used in TRIAD [19], is to route-by-name to the closest copy. Routing protocols are designed to find shortest paths and route around failures, exactly the two tasks (a) and (b) we’ve assigned to name resolution. This led us to conclude that route-by-name, rather than look-up, was the most natural approach. We discuss our design for this in Section 2.3. We note that there is one other issue — in addition to persistence, authenticity, and availability — that the user cares about, trustwor- thiness: users would like to know whether they are getting their information from a reliable source. We believe that this, like several other issues we discuss below, is best handled by mechanisms that are external to the architecture. Trust is so idiosyncratic and subjective that we don’t believe any network architecture should mandate the mechanisms by which trust is established. Moreover, by remaining outside of the core architectural structures, external trust mechanisms can evolve along with changes in society and institutions in a way that a fixed architecture can’t. Today, a variety of external mechanisms, ranging from Google to personal recommendations, are used to establish trust. We expect that new trust mechanisms, such as reputation systems and enhanced “webs- of-trust”, will be developed in the future, but they will (and should) lie outside the confines of the architecture. 2.2 Naming DONA names are organized around principals. Each principal is associated with a public-private key pair, and each datum or service or any other named entity (host, domain, etc.) is associated with a principal. Names are of the form P:L where P is the cryptographic hash of the principal’s public key and L is a label chosen by the principal, who ensures that these names are unique. The granularity of naming is left up to principals; a principal might choose to just name her web site, or name her web site and each page within it, or name at a finer granularity (such as naming each individual photo or publication). Principals are considered to own their data, in the sense that only hosts authorized by the principal P can offer to serve (i.e., provide access to) entities with names of the form P:L. Each datum comes with metadata including the principal’s public key and the principal’s signature of the data; thus, when we speak of a client retrieving data we mean it has received the triplet (along with perhaps other metadata).1 In such a scheme, requesting clients rely on the principal’s signature to ensure the data’s integrity. These names are application-independent and globally unique (and can refer to anything, not just data or services). They are also self-certifying in the following sense [25,28]: When a client asks for a piece of data with name P:L and receives the triplet , it can immediately verify that the data did indeed come from the principal by checking that the public key hashes to P, 1The signature might be signed with a secondary key, but accompanied by a certificate from the principal’s key authorizing the secondary key. and that this key also generated the signature. This satisfies the need for authentication; persistence follows from the fact that the names don’t refer to location, and thus the data can be hosted anywhere. With a slight alteration, these basic ideas can be naturally applied to immutable data: here, the label L is the cryptographic hash of the contents of the data and the principal P is the purveyor of the data, not the owner; for instance, the purveyor could be the hosting CDN. Since the client need not rely on a principal to ensure the integrity of the data (the hash over the contents ensures this), the only role of the principal is to ensure data delivery. Since different CDNs may have different degrees of reliability or coverage, clients seeking immutable data might specifically request it from a particular purveyor. Note that there is a difference between the administrative structure of the hosting machines, and the nature of the principal. A person’s web page would be associated with their own public key. The web page might initially be hosted at their university or company or paid hosting service, but this is not reflected in the name; instead, the owner (as we discuss below) would authorize (with some reasonable TTL) this entity to host their web page (and this authorization could be applied to only a specific datum, or to a portion of the principal’s data, or to all the principal’s data). If the person decided to move the page (for instance, if they changed employers), then they would authorize a new entity to host their page (and let the old authorization expire). The name of the data would not change, even though the entity hosting the data (or service) did change. The biggest challenge to making such flat names work is making sure they resolve to the appropriate locations, which is what we discuss below. But there are usage questions that must also be addressed (though we are not the first to propose names of this form, and several of these issues have been discussed elsewhere, such as in [37]). For instance, one usage question is: how will users learn these flat, long, and user-unfriendly names? We expect that users will learn these flat names through a variety of external mechanisms that the user trusts (to varying degrees), such as search engines, private communication, recommender services, and the like. Users won’t, of course, remember the flat names directly, but will have their own private namespace of human-readable names, which map onto these global and flat names (as in [14]). While such flat names are harder to use than today’s DNS names, they offer the advantage that the mappings between private human-readable names and flat names will be free to reflect evolving social structures rather than being tied, as DNS names are, to a fixed administrative structure. Also, DONA does not provide a reverse-lookup mechanism for names, like DNS, where one can map an IP address to a name. The basic use of reverse-lookup is to tie a user-unreadable label (in this case an IP address) to an user-meaningful identity (in this case, a DNS name). We think the analogous service in DONA would allow a user to map a principal’s key (which is human-unreadable) to a trust-chain between the principal’s key and his own that would help the user understand the identity of the principal. 2.3 Name Resolution We now turn from discussing the nature of DONA’s names to presenting how DONA translates these names into locations. Our goal in this design is to achieve high availability, by finding close-by copies and avoiding failures. As discussed earlier, DONA uses the route-by-name paradigm for name resolution. Rather than use DNS servers, DONA will rely on a new class of network entities called resolution handlers (RHs). Name resolution is accomplished through the use of two basic primitives: FIND(P:L) and REGISTER(P:L). A client issues a
Tier-1 RH RH RH RH RH Copy RH Copy RH Client Figure 1: Registration state (solid arrows) in RHs after copies have registered themselves. RHs route client-issued FIND (dashed arrow) to a nearby copy. register : the received REGISTER message, if any. regs pref reg: preferred REGISTER to disseminate to peers/parents, : all REGISTER messages for the name (P:L). : the name (P:L) event concerns. if any. name regs ←load(name); if register received then if duplicate or invalid signature then return; end set timer for expiration(register); else // A REGISTER expired... end foreach out in provider and peer links do pref reg ←decision process(out,regs); if pref reg changed for out then msg ←new message(pref reg); add intra cost(out,msg); sign message(private key,msg); queue(out,msg); end end store(name, all changes); FIND(P:L) packet to locate the object named P:L, and RHs route this request towards a nearby copy. REGISTER messages set up the state necessary for the RHs to route FINDs effectively. Each domain or administrative entity will have one logical RH (but perhaps many physical incarnations); we will denote the RH associated with an administrative entity X by RHX. RHX is the provider/customer/peer (or, alternatively, parent/child/peer) of RHY if X is the provider/customer/peer of Y in terms of AS-level relationships. This RH structure can extend to finer granularity than ASes to reflect other organizational and social structures; for instance, there could be departmental RHs at universities and firms and, going even further, users could have their own local RHs which peer with those of their neighbors and friends. RHs use local policy (consistent with their domain’s peering agreements) when processing REGISTERs and FINDs. Each client knows the location of its local RH through some local configuration (much like they know about their local DNS server). Any machine authorized to serve a datum or service with name P:L sends a REGISTER(P:L) command to its local RH. Registrations can also take the form REGISTER(P:*) if the host is serving all data associated with the principal (or will forward incoming FIND packets to a local copy). Each RH maintains a registration table that maps a name to both a next-hop RH and the distance to the copy (in terms of the number of RH hops, or some other metric). There is a separate entry for P:*, in addition to individual entries for the various P:L. RHs use longest-prefix matching; if a FIND for P:L arrives and there is an entry for P:* but not P:L, the RH uses the entry for P:*; when entries for both P:* and P:L exist, the RH uses the one for P:L. Only when the RH has neither P:* or P:L entries do we say that P:L does not have an entry in the registration table. When a FIND(P:L) arrives, the forwarding rule is straightforward: if there is an entry in the registration table, the FIND is sent to the next-hop RH (and if there is more than one, the choice is based on the local policy and which entry is closest); otherwise, the RH forwards the FIND towards its parent (i.e., its provider) using its local policy to choose among them if the RH is multi-homed. Thus, registration table misses are forwarded up the AS hierarchy in the hope of finding an entry (see Figure 1). In the case of immutable data, a FIND command can take the normal form FIND(P:L), or the special form FIND(*:L) which indicates that the client is willing to receive the (self-certified) data from any purveyor. If RHX receives a REGISTER from a child (i.e., customer), it does not forward it onward unless no such record exists or the new REGISTER comes from a copy closer than the previous copy. If so, RHX forwards the REGISTER to its parents and peers (after updating its registration table). If the REGISTER comes from a peer, the entry can be forwarded or not based on local policy (depending, for example, on whether the AS is willing to serve as a transit AS for content). By letting the forwarding of FINDs and REGISTERs be driven by the local policies, DONA can faithfully respect the basic interdomain policies as reflected in BGP. In addition, the forwarding of a REGISTER can be terminated at any point if dictated by some administrative policy (such as a corporate firewall). REGISTER commands must be authenticated. The local RH issues a challenge with a nonce, which the client must sign with P’s private key, or sign with some other key and provide a certificate from P empowering this other key to register this piece of data. When forwarding REGISTERs, the RH signs it so that the receiving RHs know that the data came from a trusted RH. These signatures are hop-by-hop and accumulated in a REGISTER along the path. In a similar manner, the RHs accumulate the distances; they append their distance/cost to the previous-hop RH before sending 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 Figure 2: Pseudo code for processing received and expired REGISTERs. the REGISTER to next RH. REGISTER commands have a TTL and must be refreshed periodically. DONA also provides an UNREGISTER command so that clients can indicate that they are no longer serving some datum. Figure 2 shows the pseudo code for processing received and expired REGISTERs. The FIND packet does not just resolve the name, it also initiates the transport exchange. The FIND packet takes the form as shown in Figure 3, where the DONA-related content is essentially inserted as a shim layer between the IP and transport headers. The name- based routing provided by DONA ensures that the packet reaches an appropriate destination. If the FIND request reaches a Tier-1 AS and doesn’t find a record associated with that principal, then the Tier-1 RH returns an error message to the source of the FIND. If the FIND does locate a record, the responding server returns a standard transport-level response (the same as if the transport header had been received on a normal data packet, not on a FIND packet). To make this work, transport protocols should bind to names, not addresses, but otherwise do not need to change. Similarly, application protocols need only be modified to use names, not addresses, when calling
Type IP header Name (P:L, 40 bytes) Next header type Transport protocol header Figure 3: Protocol headers of a FIND packet. Type is to separate FINDs from their responses. transport. In fact, many applications could be simplified when implemented on top of DONA. Using HTTP as an example, we note that the only essential information in an HTTP initiation is the URL and header information (such as language, etc.); the URL is not needed, since the data is already named in a lower layer, and if each variation of the data (such as language) is given a separate name then the header information is also superfluous. The packet exchanges that occur after a FIND has been received are not handled by RHs (except, as we note in Section 4, when they serve as caches or other middleboxes), but instead are routed to the appropriate destination using standard IP routing and forwarding. To this extent, DONA does not require modifications of the IP infrastructure. 2.4 Security Issues There are a variety of security issues that must be addressed, some by DONA itself and some by underlying or external mechanisms. For bandwidth denial-of-service attacks, we assume that there are IP-level mechanisms that can restrain unwanted packet streams that are overwhelming an RH, server, or client. For resource exhaustion attacks against RHs, DONA relies on contractual limits providers place on customers for the number of FINDs and REGISTERs they can submit per time period. RHs may additionally impose other rate-limiting techniques such as cryptographic puzzles. We assume that as part of establishing customer/provider/ peering relationships, peering RHs have securely exchanged their public keys, so RHs can always ensure that they are receiving packets from the appropriate RH. However, a malicious RH can still cause damage. For instance, a malicious RH can refuse to forward REGISTERs and FINDs; this is a failure of the AS, in much the same way an AS could fail to forward packets, and presumably commercial pressures would reduce this form of misbehavior. More subtly, a malicious RH could forward REGISTERs overheard from other RHs. To minimize this risk, when RHs forward a REGISTER they include the next-hop’s public key (or its cryptographic hash). The worst a malicious RH can do is deny a client service (since cryptographic measures allow the client to authenticate the data). In Section 3 we discuss ways in which clients can request access to other copies (i.e., not just the closest one); this will allow a client to avoid misbehaving RHs, unless the misbehaving RH lies on the path to all copies of that particular item. In all cases, though, RHs are commercially related to the clients they are serving; they are paid (perhaps recursively) by either the client or the server. Thus, DONA is not relying on the cooperation of arbitrary entities, but is relying on the nature of a commercially provided service. Thus, while the design should be reasonably secure against misconfigured or subverted RHs, we anticipate that any such problems would be detected and corrected by the provider. In such a key-centric design, the greatest fear is key compromise. There is no remedy except for providing effective means for key revocation. DONA does not, itself, provide a key revocation mechanism. However, there are a variety of mechanisms one could use for this, such as third-parties (e.g., Google) providing databases of revocation lists. Each revocation is cryptographically proven, so they cannot be faked; thus the database need merely provide access to the data, not vouch for its correctness. DONA could itself provide a useful substrate for revocation lists and online key status query protocols. That is, key revocations related to a principal P (both of P’s private key, and of any secondary keys P has used) could be stored under P:L for some special reserved name L, and if an RH finds any entry corresponding to that name it immediately returns notification of a key compromise.2 Thus, if a client wanted to check a key related to P it could issue a FIND(P:L) for this special value of L. As we describe in Section 4, DONA also supports update functionality, so a client could subscribe to be notified of any such revocation. Internet Addressing Finally, principals and replicas may ensure that their key is not already in use by doing a FIND(P:*) on a freshly generated key P (using DONA’s name resolution). 2.5 The DONA design, as just described, could function over the current IP layer, with its present form of addressing and routing. However, many think the current Internet addressing scheme is facing a loom- ing crisis, as the increasing demand for multihoming threatens to explode routing tables [27]. Even aside from this speculative threat, the current addressing paradigm requires a delicate balance between scalability (e.g., aggregation) and flexibility (e.g., multihoming, policy routing) that isn’t always easy to achieve. DONA’s name-based anycast primitive can remove much of the pressure on the lower-level addressing structure by providing a separate mechanism for path discovery. In particular, DONA could enable IP to use path-labels (as in [22]) rather than globally routable addresses. In what follows, we refer to the client as the source of the FIND and the server as the node that responds to the FIND (presumably a node that generated the REGISTER, or a caching RH as discussed later in Section 4). Moreover, each host has a domain-specific address; that is, for each domain within which it is homed, that domain associates an address to that host, and that address has no meaning outside of that domain. In this approach, when a client sends a FIND, its source address is originally just its domain-specific address. As the FIND is forwarded from client to server, next-hop domain path instructions are appended to this source address. Each such instruction has purely local meaning; for instance, as the FIND passes from domain A to domain B, an annotation is added to the path instruction that tells A that the next-hop domain is B and, vice-versa, tells B that, in the reverse direction, the next hop is A. This instruction need only be understood by the two connected domains A and B. When the FIND arrives at the server, the server appends its domain-specific address to the path description. It can then reverse these path instructions and use them for its response to the client (since reversing the order just gives the path in the opposite direction). Similarly, when the server’s packets arrive at the client, the client can reverse the path in order to send packets to the server. Because these per-hop path instructions only need to distinguish between the various next-hop domains, they can be quite short (say, on the order of a few bytes), so the total path instruction would be quite short. More importantly, the interdomain routing tables would be extremely small (and quite static); merely enough to translate these per-hop instructions into a next-hop AS. Note that these path-instructions would not have global meaning, since if a source in a different domain used this path, the domain-specific next-hop instructions would not necessarily lead to the desired 2Note that once any key revocation entry has been registered for that principal, there is nothing the key compromiser can do to cause its removal from the registration tables.
destination. Thus, in this design, there would be no globally meaningful addresses and the DONA FIND/REGISTER primitives would be required to establish end-to-end connectivity. This approach would require the endpoints to detect AS-level path failures and to resend a FIND in that case. This is a substantial extra burden on connections, but it is the tradeoff for doing path-discovery above the IP layer. Also, while this design might superficially resemble other connection-oriented designs, with the FIND playing the role of a connection-establishment, an important distinction is that in DONA there is no per-flow state in the network. This approach would produce symmetric AS-level paths, because the path of the FINDs lay down path instructions which guide the reverse path. There are other possibilities for how policy routing could be handled in this case, ranging from allowing asymmetric routes but requiring a FIND sent in the opposite direction (like many capability mechanisms [41,43]), or giving policy control going from the client to the core (and from the core to the client) to the providers near the client, and giving policy control going from the core to the server (and from the server to the core) to the providers near the server; this would preserve AS-level path symmetry but distribute the control differently among the ASes. Lastly, one could adopt a NIRA-like [42] approach to providing endpoint-control. We do not explore these options here. 3 Using DONA’s Basic Functionality DONA’s name resolution process is essentially a name-based anycast service: a FIND(P:L) request is routed (by RHs) to a nearby RH at which the name P:L is registered. We now discuss a few ways in which this basic primitive might be used (and later, in Section 4, we discuss how RHs could be extended to provide greater functionality). We present only a very high-level description of the examples below, omitting (due to space) many protocol details. Server Selection: The most basic use of DONA’s anycast primitive is to select among several possible servers (e.g., for content distri- bution, or replicated service access). Each server (or datacenter) authorized by a principal P to host a service or datum named P:L simply registers P:L at their local RH. DONA routes any FIND(P:L) to the closest such server (closest according to the DONA routing metric). Note that these servers could be from one or many different CDN networks; the name P:L remains the same no matter which server is providing the data. P2P applications could also employ this primitive, and would probably use immutable names where the principal would identify the particular P2P infrastructure (these may have different degrees of reliability and coverage). Systems like BitTorrent that break files into chunks are also supported; the name of the file (immutable data) is mapped to an index of the chunk names (also immutable), and the client can then separately request each chunk. If it so desires, the client can then offer to serve the content, registering both the chunks and the index. We will see later in Section 4 how increased functionality in the RHs can further improve the CDN-like properties of DONA. Mobility and Multihoming: A roaming host can first unregister from one location and then re-register at its new location. Sub- sequent FINDs will be routed to the new location as soon as the new registrations have installed the necessary state.3 Well-known techniques below the transport layer [28], at the transport layer [17], or at the session-layer [32] can be used to mask mobility from higher layers. Multihoming is similarly straightforward: a multihomed host 3A host that suddenly loses its connectivity may have to wait until its previous registration’s TTL expires before it can be sure that subsequent FINDs will find it. registers with each local RH and a multihomed domain forwards its REGISTERs to each provider. This allows FINDs, and thus the resulting data connections, to make use of multiple paths. While the above discussion implicitly assumed current IP addressing, and no need for symmetric AS-level paths, similar multihoming could be achieved with the addressing described in Section 2.5. Session Initiation: Rendezvous mechanisms are the core of appli- cation-layer session initiation and presence protocols. Consider, for instance, the Session Initiation Protocol (SIP) [31]. A SIP user agent begins by sending a SIP INVITE message. The SIP proxy infrastructure then routes the INVITE message to the current location of the remote agent, which responds to begin the session negotiation. To keep their location up-to-date in the proxy infrastructure, SIP user agents register their current location with registrars (often co-located with SIP proxies). This process maps directly onto DONA’s basic primitives. SIP INVITE messages translate to FINDs and the SIP and DONA REGISTER messages play the same role. Thus, SIP’s rendezvous functionality can be provided by DONA and SIP’s negotiation functionality could be implemented directly on top of DONA’s REGISTER and FIND. In Section 4, we show how DONA enables the network to impose middleboxes; networks could impose (stateful and stateless) SIP proxies to enrich SIP services, control service access, and to protect themselves and their customers just as they do today (e.g., SIP providers often deploy session border control boxes to rewrite SIP messages to hide their internal network topologies). Of course, after DONA enables the discovery, the data transfer between the two endpoints occurs over IP. Multicast State Establishment: It has been a long struggle to define a simple and scalable interdomain multicast protocol. We now show how DONA could be used to establish this interdomain multicast state in a straightforward manner (similar to how this is done in [29]). In this design, DONA’s anycast primitive provides the tree discovery function, allowing a domain’s border router that has local members in a multicast group G to discover and establish connectivity with other domains that have members in the group. We assume that each domain runs some intradomain multicast protocol. Each multicast group has a name of the form P:G, where the principal is the originator of the group. Such a structure makes it easy to keep group names unique. When a new node in a particular domain joins P:G, the domain’s border router Rnew issues a FIND(P:G) packet which DONA routes to the nearest router R that also has local members in P:G, if one exists. Upon receiving the FIND(P:G) packet, R attaches Rnew to the overlay topology for P:G as its neighbor and the two routers add appropriate entries to their neighbor tables for P:G.4 To complete the join operation, Rnew sends a REGISTER(P:G) command, announcing its membership and willingness to serve as an attachment point for other incoming group members. This construction ensures that the resulting overlay topology remains acyclic at all times, thereby simplifying packet forwarding. To transmit a multicast packet destined for a particular group P:G, the sender’s border router R similarly issues a FIND packet to locate a nearby domain that belongs to the group and forwards the packet to that domain’s border router, which in turn initiates the packet’s dissemination. Note that if the sender’s domain has one or more members in P:G then R is itself a member of the overlay topology for P:G and has the forwarding state necessary to initiate the dissemination. 4By overlay we mean that the two routers tunnel packets to each other, so the intervening routers need not have routing state for this multicast group.
find: the FIND packet received. if unknown transport or application protocol then forward to next hop(find); else else if content cached and still valid then terminate protocols(find); send content(); // Follow the caching policy if should be cached then terminate protocols(find); new find ←clone(find); change src address(new find); initiate protocols(new find); send to next hop(new find); store and send content(); forward to next hop(find); else end end end 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Figure 4: Pseudo code for caching logic. Improving Content Delivery 4 Extending DONA As we’ve just discussed, DONA’s name-based anycast primitive provides support for a range of discovery-like tasks. Even restricted to this capability, we believe DONA would be a better naming foundation for the Internet than the current DNS system. In this section we discuss extensions to DONA that broaden its impact. These extensions involve adding functionality to the RHs, so they take a more active role in the architecture than just forwarding FINDs and REGISTERs . 4.1 The basic DONA design provides support for server selection, which is useful when accessing widely replicated data or services. However, DONA can provide better support for content delivery in three ways. Caching: RHs can be extended to provide a universal (i.e., general- purpose and always on-path) caching infrastructure. An RH that implements caching must first populate its cache. It can do so by changing the source IP address of an incoming FIND packet to be its own (and also source port number, if necessary) before forwarding the FIND to the next-hop RH. This ensures that the response to the FIND will traverse this RH, so the RH will receive the returning data and can install it in its cache. As per current practice, cacheable data items will be labeled with a TTL or other invalidation metadata that determines when data becomes stale. When a FIND arrives and there is a cache hit, the RH responds to the FIND’s source IP address, returning appropriate transport responses which then will proceed into a standard application-level exchange. If the RH does not understand the transport or application-level protocol (as defined by the port in the transport header) for a particular FIND, it does not provide caching for that request. Figure 4 shows the pseudo code for the above logic and Figure 5 depicts the packet exchanges. Subscriptions: Often clients would like to subscribe for updates, as in RSS. This can be easily accomplished by adding TTLs to FINDs. When the server responds to such a TTL’ed FIND, it notes whether and how long it will provide updates to the FIND. When a server Cache hit in RH 1: Client RH 1 Cache miss in RH 1, RH 1 skips the data: Client RH 1 Cache miss in RH 1, RH 1 caches the data: Client RH 1 RH 2 Server RH 2 Server RH 2 Server Figure 5: Three cases of processing a FIND packet when caching is supported: serve from the cache, skip (merely forward the FIND), and cache. RH 2 is a middlebox for all traffic. updates a piece of content that has a pending TTL’ed FIND, it sends the update to the source of the FIND. RHs can assist in this process by caching TTL’ed FINDs, using the previously described technique of inserting their own address before forwarding the FIND. If a collection of RHs have cached the TTL’ed FINDs for a particular item, they form a distribution tree that provides scalable updates to that item. There is no explicit tree construction algorithm; it is formed via the normal routing of FINDs. Avoiding Misbehaving and Overloaded Servers: In general, RHs will route FINDs to nearby copies of the data. However, some of these servers may be misbehaving (due to failures or malicious intent) in a way that is not visible to the RHs but which deprives the client of a valid copy of the data. To make sure that clients can still access the data even when their closest replica server is misbehaving, we allow the client to ask that its FIND be routed to a different server. In particular, we amend the REGISTER commands to keep track of the number of servers below a particular RH (this number is incremented as when a REGISTER is sent upwards in the RH graph), and we amend the FIND command to allow the client to request access to the k’th closest server, rather than the closest one. This may be particularly useful when accessing data from relatively unreliable P2P servers. We can also augment DONA to allow overloaded servers to indicate they are overloaded, so the RHs can then direct excess load to other, less-loaded, servers. When an RH sends REGISTERs to a peer, it includes the distance to and load on the closest server and also, if the closest server is overloaded, the distance to the closest underloaded server; thus, at any time, an RH knows the next-hop towards the closest server with spare capacity. We have simulated the algorithm and find that it comes close to matching the optimal load distribution in several settings. With these three changes, DONA could provide a combined caching, updating, and load-balancing infrastructure for content delivery. High-end commercial content providers may still require more advanced features and more careful monitoring than DONA can provide, but we expect even these high-end CDN services to augment DONA’s content-delivery infrastructure rather than replace it entirely. 4.2 Delay-Tolerant Networking There has been recent interest in developing architectures for networks where end-to-end connectivity may rarely exist, but sporadic hop-by-hop connectivity is often present (see e.g., [6,
12]). These connectivity-challenged networks occur in many settings, ranging from large (interplanetary communication), to small (sensornets), to deep (oceanic communications). The key concept in designing a delay-tolerant network (DTN) architecture is message-level custody transfer; rather than the communication being end-to-end, there are intermediate elements that take custody of the message and then are responsible for making sure it is forwarded onward. In DONA, the RHs can act as these custody agents (or can deputize other hosts to act as custody agents). If an RH knows that connectivity to its neighboring RH is intermittent, it can choose to accept custody of the subsequent transfer in much the same way as it places itself on the return path for caching. Previous work on DTNs [12] has also embraced name-based routing. Though these two uses of name-based routing are realized somewhat differently, the similarity in approaches suggests that DONA might provide a reasonable substrate for DTN architectures. 4.3 Access Rules and Middleboxes The original Internet architecture was built around the end-to-end principle, with intermediate routers playing no role other than forwarding packets to their destination. However, commercial and security pressures have created an Internet where layer-violating access policies and on-path middleboxes are common. There is no support for access rules and middleboxes in the current architecture; some maintain that this is a feature, not a bug, but we suggest that the presence of so many access rules and middleboxes is an indication of important unmet needs in the current architecture, and that any viable future design must address those needs. One can view FINDs and REGISTERs as a general signaling mechanism; these are addressed at the IP layer to each RH along the path, and as such the RHs are not violating layering by inspecting them. Corporate networks may choose to not forward REGISTERs originating at internal RHs to external RHs if they don’t want internal content available to the public. Similarly, such networks might also choose to not forward FINDs originating at external RHs to internal servers, or perhaps make the decision based on the application-port found in the FIND’s transport header. More generally, there are several natural policy decisions an RH can make upon receiving a FIND. The RH could deny the FIND, either failing silently or returning an error code. The RH could also ask the source of the FIND for authentication; for instance, it could ask for credentials proving that the user was indeed an employee. DONA itself would not standardize the format and nature of these credentials; DONA would merely provide a channel for authentication protocols to exchange these credentials. An RH could also impose a middlebox, such as an application- specific proxy or a firewall, which is otherwise off-path. There has been previous work on incorporating endpoint-imposed middle- boxes (see [2, 38]), where either the sender or receiver desires the packets to traverse a middlebox such as a firewall or transcoder. The middleboxes we consider here are network-imposed; the decision is being made by one or more of the networks carrying the packets, not the endpoints. We expect future networks to have a mixture of network-imposed and endpoint-imposed middleboxes, so we see the two approaches (DONA and [2, 38]) as complementary. An example of an authentication request and a middlebox insertion is shown in Figure 6. Here, the FIND goes from client to RH1 to RH2, which forwards it to the firewall FW. The firewall asks the client for authentication, and the client responds; the firewall then forwards the client’s FIND to the server, inserting its own IP address in the FIND. The server responds to the firewall, which then forwards the response to the client. Finally, the data flows between the client and server via the explicitly addressed firewall. FW requires authentication: RH 1 Client RH 2 FW Server Figure 6: Server-side RH imposes a firewall middlebox which requires authentication (dashed arrow) before becoming a proxy for client-server communication. When RHs play the role of caches, both in responding to FINDs and inserting themselves on the return path to receive the response, they are acting as middleboxes (as shown in Figure 5). RHs can similarly insert themselves as a middlebox for reasons other than caching, and, as just shown in the example in Figure 6, can also forward the FIND to some other box (e.g., a firewall) that inserts itself on the path. This is done by inserting the IP address of the middlebox in the source address field in the FIND packet (and changing the transport source port, as well as updating checksums and TTL fields) before it is forwarded onwards, so all returning packets pass through the middlebox. This method of inserting middleboxes does not violate layering, as all packets intercepted by these middleboxes would have their IP address in the destination field. Lastly, an RH could also provide the capabilities required to traverse the domain. We won’t delve into the details here, but mechanisms similar to SIFF [41] and TVA [43] could be used (but on a per-domain basis, not per-router basis), with the capabilities inserted in the FIND as it traversed domains. There are two subtleties here. First, if more than one intervening domain wants to require capabilities, there must be a way of either concatenating the capabilities, or somehow sharing a fixed length capabilities field. The second subtlety is that using FINDs to gather the capabilities works if the FIND packet and the data packets being sent in both directions take the same AS-level (or administrative-level) path. This symmetry is required only for the networks that use capabilities. Today, restrictive access policies are typically only exerted at the edges, where the paths are likely to be symmetric. However, if one expects that capabilities might be used more generally, the path- labels discussed in Section 2.5 could be used, because they produce symmetric AS-level paths. In contrast to capabilities, middlebox insertion does not require symmetric AS-level paths, because the middlebox is explicitly addressed. 5 Implementation To gain some experience with our design, we implemented a Linux-based prototype of an RH and deployed it on Planetlab. Implementing DONA helped us uncover many details that we had neglected, but it did not raise any fundamental issues. We think the far more interesting issue is large-scale feasibility, which we could not address in our small-scale prototype deployment; we address these scaling issues in Section 6. The RH prototype is a stand-alone user-level Java daemon that uses a TUN/TAP device to interpose itself as an overlay between the transport and IP layer, and uses the Socket API to invoke the kernel transport protocol implementations. Most of the core features described earlier have been implemented, including dissemination of REGISTERs, name-based routing of FINDs, content caching, and protocol support for a few applications. The implementation is comprised of the following modules:
分享到:
收藏