| 1 | +FRAMING PEER TO PEER FILE SHARING |
| 2 | +by Robb Topolski |
| 3 | + |
| 4 | + |
| 5 | +A Fact: A BitTorrent uploader transfers to only 3-4 peers at a time. |
| 6 | + |
| 7 | +Peer-to-peer (P2P) file-sharing applications are widely believed to strain |
| 8 | +access providers' upload links by uploading over hundreds of TCP |
| 9 | +connections. This belief is built into problem statements that support |
| 10 | +extreme management methods. It is also a belief that escaped testing. The |
| 11 | +fact is that BitTorrent simultaneously uploads data over only 3 to 4 TCP |
| 12 | +connections. |
| 13 | + |
| 14 | + |
| 15 | +1. DEFLECTING ACCOUNTABILITY BY DEFLECTING RESPONSIBILITY TO OTHERS |
| 16 | + |
| 17 | +While the growing demand for upstream bandwidth among Internet users is |
| 18 | +nothing new, today's operators are claiming difficulty in dealing with the |
| 19 | +phenomenon. I choose the word 'claiming' because technical conclusions |
| 20 | +require data and analysis, and insufficient data has been presented to be |
| 21 | +openly analyzed. I also ask the readers to read the word 'claiming' as |
| 22 | +neutral in tone, as no strong evidence is available to rebut providers' |
| 23 | +claims. |
| 24 | + |
| 25 | +By not providing data about this congestion, operators avoid any |
| 26 | +examination of issues of their own that merit scrutiny. As a result, |
| 27 | +the latest debates skip analyzing congestion and advance to how and when to |
| 28 | +"throttle" the "bandwidth hogs," how to correct a congestion-control |
| 29 | +fairness imbalance between users, under what conditions network operators |
| 30 | +may inspect the payload of IP packets prior to forwarding them, and how to |
| 31 | +further assure routine availability for real-time applications. |
| 32 | + |
| 33 | +This puts technologists in the position of playing "Bring Me a Rock," a game |
| 34 | +that never ends because the requester is unhappy with each rock that is |
| 35 | +returned and asks for a different rock instead, while never adding any |
| 36 | +useful information to help the player choose the correct rock. |
| 37 | + |
| 38 | +Without data, we cannot tell if the problem of congestion is due to the |
| 39 | +failure of existing congestion-control algorithms, the addition of "smart" |
| 40 | +devices installed mid-network that are changing the normal behavior of |
| 41 | +packets, a sudden upsurge of aggressive or ill-mannered network behavior |
| 42 | +among applications, a problem with a QoS technique, or mismanagement of |
| 43 | +the division of a shared resource among users. Data is also necessary to |
| 44 | +understand the size of a problem and the conditions under which it occurs, |
| 45 | +in order to accurately predict whether a proposed solution is likely to |
| 46 | +solve the problem. |
| 47 | + |
| 48 | +Today's problem has been framed as a P2P file-sharing problem. However, |
| 49 | +without congestion, applications do not contend for bandwidth and any of |
| 50 | +the technical solutions to efficiently share congested bandwidth become |
| 51 | +moot. Therefore, congestion itself must be a causal factor to include in |
| 52 | +theories about the problem. Data about causes must be gathered. In the |
| 53 | +past, the Internet community coped with congestion by dealing with |
| 54 | +congestion itself and its contributing factors, such as protocol efficiency. |
| 55 | +We should not allow today's congestion to escape a studied examination. |
| 56 | + |
| 57 | +Specifically, the Working Group should first gather data before reaching |
| 58 | +its problem statement. It should then test the problem statement to ensure |
| 59 | +it is accurate, so that any proposed solutions fit the problem. If the |
| 60 | +data cannot or will not be made available, doing nothing at all would be |
| 61 | +better than continuing to hear "No, not that one. Bring me a different |
| 62 | +rock." |
| 63 | + |
| 64 | + |
| 65 | +2. ASSIGNING ACCOUNTABILITY AND RESPONSIBILITY WITHOUT CORRECT EVIDENCE |
| 66 | + |
| 67 | +A recent description of this problem said: |
| 68 | + "[the] traditional management of fairness at the transport |
| 69 | + level has largely been circumvented by [P2P] applications |
| 70 | + designed to achieve the best end-user transfer rates" |
| 71 | + |
| 72 | +A renowned researcher explained the problem, saying: |
| 73 | + "Even though file-sharing generally uses TCP, it uses |
| 74 | + the well-known trick of opening multiple connections-- |
| 75 | + currently around 100 actively transferring over different |
| 76 | + paths is not uncommon." |
| 77 | + |
| 78 | +It's tribal knowledge that P2P file-sharing applications open hundreds of |
| 79 | +connections, all active. This is a notion that has been repeated in |
| 80 | +technical blogs and shown as evidence to Members of Congress and the U.S. |
| 81 | +Federal Communications Commission as to the cause of the congestion bringing |
| 82 | +about the Network Neutrality debate. Inasmuch as P2P file-sharing |
| 83 | +behavior is represented by its most-used protocols, namely BitTorrent, |
| 84 | +Gnutella, and ED2K, it is also false (or greatly misjudged). |
| 85 | + |
| 86 | +The unsubstantiated blaming of P2P for today's problems is only the first |
| 87 | +problem with the first quote. The phrase "traditional management of |
| 88 | +fairness" prejudges the question of whether any "fairness" that |
| 89 | +congestion-control algorithms accomplish is intended or is merely a side |
| 90 | +effect of robust network-availability watchdog functions. The word |
| 91 | +"management" implies intent, and it is unlikely that extended terms of |
| 92 | +fairness were intended for an algorithm designed to keep congestion from |
| 93 | +bringing the Internet to a halt. |
| 94 | + |
| 95 | +Finally, the phrase "designed to achieve the best end-user transfer |
| 96 | +rates" ignores the fact that only a small minority of Internet applications |
| 97 | +impose bandwidth limitations upon themselves. The normal behavior of |
| 98 | +network transports is to use any bandwidth that is available. Applications |
| 99 | +using these transports are generally blind to network conditions and |
| 100 | +cannot "speed up" or "slow down" based on signals that are not there. |
| 101 | +Instead, that adjustment is a function of symptoms and signals analyzed at |
| 102 | +the lower levels of the network stack. |
| 103 | + |
| 104 | +All of the top P2P file-sharing protocols that I have named access |
| 105 | +socket services through the same limited API set used by every other TCP |
| 106 | +application. In other words, P2P doesn't speak "in TCP"; it only knows how |
| 107 | +to speak the OS-specific API commands - a limited command set that |
| 108 | +provides network services to all applications on the platform. Moreover, |
| 109 | +such nefariousness on the part of a background application would defeat |
| 110 | +the use of other applications on a person's computer. |
| 111 | + |
| 112 | +The statements refer to BitTorrent (the #1 P2P file-sharing protocol on |
| 113 | +the Internet). They do not resemble the behavior of ED2K (#2) or Gnutella |
| 114 | +(#3), both of which use a queuing system instead of a swarming method. A |
| 115 | +queuing system holds incoming requests for files while filling those |
| 116 | +requests only a very few at a time. In some parts of the world, a |
| 117 | +source-obfuscating generation of encrypted and overlaid P2P file-sharing |
| 118 | +applications is emerging, including Freenet, Share, and others. I have not |
| 119 | +profiled the network behavior of these applications, but I believe that |
| 120 | +the authors of the above quotes were not focused on these applications, |
| 121 | +either. The chief suspect of these particular allegations is BitTorrent. |
| 122 | + |
| 123 | + |
| 124 | +3. FOCUSING SOLELY ON THE BEHAVIOR OF BITTORRENT |
| 125 | + |
| 126 | + Is the Problem Uploading? Downloading? Both? |
| 127 | + |
| 128 | +To perform a BitTorrent transfer of a file or set of files, the BitTorrent |
| 129 | +client connects to peers focused on the same file (called a 'swarm'). The |
| 130 | +maximum number of connections to open is recommended to be somewhere in |
| 131 | +the neighborhood of 50-55 and nearly all BitTorrent clients install with a |
| 132 | +default maximum number of TCP connections that falls near this range. |
| 133 | +Within a few minutes of joining a typical swarm, many of these connections |
| 134 | +may be downloading. |
| 135 | + |
| 136 | +BitTorrent's first incarnations were realized in applications that |
| 137 | +transferred a single file (or a grouping of files behaving as one). For |
| 138 | +comparison purposes, this is analogous to transferring a single large file |
| 139 | +via FTP or HTTP. This is important, because many problem statements |
| 140 | +involving BitTorrent compare BitTorrent performing a file transfer with |
| 141 | +a web browser fetching a web page, a VoIP app handling a call, and other |
| 142 | +dissimilar activities. |
| 143 | + |
| 144 | +One question to refine for the problem statement is whether multiple |
| 145 | +connections that are downloading are substantially contributing to any |
| 146 | +congestion problem today - or is any problem confined to how BitTorrent |
| 147 | +uploads? Indeed, some of the operator solutions proposed and imposed |
| 148 | +attempt to focus on the upload side alone. While some solutions focus on |
| 149 | +the downloading side as well, it is an unanswered question whether the |
| 150 | +concern behind doing so has more to do with congestion than with |
| 151 | +controlling consumption and transit costs. |
| 152 | + |
| 153 | +When considering whether BitTorrent interferes with normal congestion |
| 154 | +control, it seems relevant that the uploading host is responsible for |
| 155 | +responding to the signs of network congestion. The downloading host lacks |
| 156 | +insight and control. Because the signal of congestion is a dropped packet |
| 157 | +(one that is not delivered) or a duplicate ACK, the downloading host |
| 158 | +receives no early indication of a problem. |
| 159 | + |
| 160 | +If downloading is found to contribute to the problem, it should be analyzed |
| 161 | +in the context that often none of these BitTorrent connections are |
| 162 | +downloading. After a node has collected all of the pieces in a swarm, it |
| 163 | +only uploads. |
| 164 | + |
| 165 | +Lacking data, I do not conclude that BitTorrent downloading activity has |
| 166 | +no bearing on this or any problem. However, it is quite possible that |
| 167 | +downloading via BitTorrent has a positive effect owing to its ability to |
| 168 | +route around congested paths while single-stream transfers have no such |
| 169 | +ability and must "power through" a congested route until a transfer is |
| 170 | +completed. |
| 171 | + |
| 172 | + The Surprising Reason BitTorrent Uploads on Only 3-4 Connections |
| 173 | + |
| 174 | +Most indicators are that the congestion pressure on last-mile operators |
| 175 | +exists primarily in the uploading direction. These networks were designed |
| 176 | +several years ago, at a time when graphical interfaces and World Wide Web |
| 177 | +browsers began to replace shell sessions and text-based networking tools. |
| 178 | +Indeed, the ratio of upload to download bandwidth usage on dial-up and |
| 179 | +fledgling high-speed connections back then became the basis for planning |
| 180 | +the networks we are using today. |
| 181 | + |
| 182 | +Focusing on the upload side of the equation, the fact that BitTorrent |
| 183 | +clients are only uploading to 3-4 other peers per swarm - not hundreds - is |
| 184 | +probably the most relevant contradiction to the present analysis. This fact |
| 185 | +casts into doubt any conclusions made on the assumption that BitTorrent |
| 186 | +uploads across hundreds of connections. |
| 187 | + |
| 188 | +The primary reason that BitTorrent uses only 3-4 upload slots is to be a |
| 189 | +good network neighbor. As recorded in one instance of the specification, |
| 190 | +the Slot-and-Choking algorithm was a design decision to facilitate |
| 191 | +Congestion Control, among other things. This is the algorithm used by |
| 192 | +nearly all of the BitTorrent applications. Whether it succeeds is beyond |
| 193 | +the scope of this paper, but any analysis claiming that BitTorrent does |
| 194 | +not facilitate congestion control deserves an explanation of how and why |
| 195 | +it fails - not an accusation that the protocol was somehow designed to |
| 196 | +trick the system out of bandwidth belonging to other network applications. |
| 197 | + |
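The slot-and-choke idea above can be sketched roughly as follows. This is an illustration only, not code from any BitTorrent client: the function name, the split into three rate-ranked slots plus one randomly chosen "optimistic" slot, and the peer names are assumptions made for the example. The point it shows is that the number of peers receiving uploads is capped by the slot count no matter how many connections are open.

```python
import random


def choose_unchoked(peer_rates, interested, regular_slots=3,
                    optimistic_slots=1, rng=None):
    """Pick which peers may receive uploads this round (hypothetical sketch).

    peer_rates: dict of peer id -> recent transfer rate (any unit).
    interested: set of peer ids that currently want data from us.
    """
    rng = rng or random.Random()
    # Only peers that actually want data compete for an upload slot.
    candidates = [p for p in peer_rates if p in interested]
    # Regular slots go to the interested peers with the best recent rates.
    by_rate = sorted(candidates, key=lambda p: peer_rates[p], reverse=True)
    unchoked = set(by_rate[:regular_slots])
    # Optimistic slot(s): a randomly chosen remaining peer gets a turn.
    remaining = [p for p in candidates if p not in unchoked]
    rng.shuffle(remaining)
    unchoked.update(remaining[:optimistic_slots])
    # Every other connected peer stays choked and receives no upload data.
    return unchoked


if __name__ == "__main__":
    # 50 open connections, all interested, yet only 4 are uploaded to.
    rates = {f"peer{i}": float(i) for i in range(50)}
    print(len(choose_unchoked(rates, set(rates))))
```

Under these assumptions, a client holding 50 open connections still sends data to at most 4 peers per swarm at any moment.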
| 198 | +As the popularity of the BitTorrent protocol increased, clients appeared |
| 199 | +that could handle multiple BitTorrent streams simultaneously. The popular |
| 200 | +examples of these clients contain similar per-task configurations |
| 201 | +(including uploading slot limitations) as their single-task predecessors: |
| 202 | +3-4 slots per swarm. A user running up to 3-4 simultaneous tasks is not |
| 203 | +unusual or inefficient, given today's last-mile upload speed allocations. |
| 204 | +At 4 slots per swarm, this would mean that at most 12 to 16 upload slots |
| 205 | +may be running simultaneously - and the HTTP or FTP analog is 3-4 |
| 206 | +individual uploads in separate clients native to those protocols. |
| 207 | + |
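The slot arithmetic above can be checked with a short calculation. This is a minimal sketch; the function name and the default ranges (3-4 simultaneous tasks, 3-4 slots per swarm) are illustrative choices taken from the figures in the text.

```python
from itertools import product


def upload_slot_totals(tasks_range=(3, 4), slots_range=(3, 4)):
    """Map (tasks, slots_per_swarm) to total simultaneous upload slots."""
    return {(t, s): t * s for t, s in product(tasks_range, slots_range)}


totals = upload_slot_totals()
for (tasks, slots), total in sorted(totals.items()):
    print(f"{tasks} swarms x {slots} slots = {total} upload connections")
```

With 4 slots per swarm, 3 and 4 tasks give 12 and 16 upload connections, and no combination in the grid comes close to the hundreds of active connections assumed in the quotes.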
| 208 | + |
| 209 | +4. FOCUSING ON THE PROBLEM |
| 210 | + |
| 211 | +In summary, I offer that an examination of how to deal with P2P |
| 212 | +file-sharing on a congested network should begin with examining the |
| 213 | +congested network and understanding the parameters and causes of that |
| 214 | +problem. In systems, problems are best fixed as close to the source as |
| 215 | +possible. These are the least expensive solutions. We strive to fix root |
| 216 | +causes because fixing a root cause also resolves its effects. |
| 217 | + |
| 218 | +There are plenty of anecdotal reasons to blame BitTorrent for poor |
| 219 | +performance on a congested network, and there are plenty of anecdotal |
| 220 | +reasons to blame other causes -- none of which bear repeating here. |
| 221 | +They, too, first require an analysis as to the root cause of the congestion. |
| 222 | + |
| 223 | + |
| 224 | +References: |
| 225 | + |
| 226 | +http://wiki.theory.org/BitTorrentSpecification |
| 227 | +http://www.bittorrent.org/ |