comparison doc/www.anonet2.org/public_pod/anonymity.pod @ 113:5100b1fb4f5c draft

added "anonymity" section to a2.o
author Nick <nick@somerandomnick.ano>
date Sun, 15 Aug 2010 18:00:42 +0000
children a29e72c5408d
112:9fba60ff2ed3 113:5100b1fb4f5c
1 =head1 AnoNet2 - Anonymity & Pseudonymity
2
3 Back to homepage - L<http://www.anonet2.org/>
4
5 =head2 Introduction
6
7 This page is intended to explain a bit of the theory behind anonymity
8 and pseudonymity. If your goal in joining AnoNet is to protect your
9 anonymity, this page may help you avoid some "leaks."
10
11 =head2 Definition
12
13 Anonymity translates literally into "having no name," and means having
14 no useful identification "marks" ("useful" being defined as "usable
15 for future find operations"). While it's technically possible to be
16 truly anonymous on AnoNet, true anonymity is not really necessary (nor
17 desirable) in order to achieve the goals that most guys here expect.
18 Pseudonymity ("having no real name") is what most of us are here to
19 achieve. (Most of us don't care if you can find us again on AnoNet
20 (and in fact, we normally I<want> you to). We only care if you can find
21 us I<outside> AnoNet.) However, the theory behind both is quite similar,
22 since the potential attacks against both are quite similar. Therefore,
23 this page primarily concerns itself with true anonymity on the assumption
24 that a certain amount of correlation between your actions is already
25 feasible for an attacker.
26
27 =head2 Introduction to Triangulation
28
29 The fundamental method that people use for identification is
30 triangulation, where we look at something from a bunch of different angles
31 and then narrow down our guesses to items that match that combination
32 of observations. For example, a duck is something that looks like
33 a duck, quacks like a duck, etc. It should go without saying, then,
34 that our goal here is to avoid others being able to apply triangulation
35 "against" us. That is, our goal is to prevent triangulation "attacks."
36
37 =head2 Simple Triangulation
38
39 If you see someone in a chatroom around 1800 GMT, and he tells you that
40 his mother just bought him some colourful pants when he got back from
41 school, it'd be a pretty safe bet to say that he probably:
42
43 =over
44
45 =item 1
46
47 is a kid (his mother buys him simple clothing items, after school)
48
49 =item 2
50
51 in England (colourful == British spelling; pants == underpants)
52
53 =item 3
54
55 who is actually a she (boys with colorful pants?)
56
57 =back
58
59 Now, obviously, if you found more details concerning the makeup of his
60 class, you may be able to narrow down the possibilities for his schools.
61 Combine that with his IP address, and you can focus on your candidates
62 within range of his geographical location. Perhaps he (she) talks about
63 his older brother walking him (her) to school in the morning, before
64 going to his own school. Well, in that case, you can be reasonably sure
65 that his older brother graduated from the same school "back in the day."
66 Given the fact that England's birth rate is relatively low, you can
67 therefore speculate that this bit of information is likely to narrow
68 down the possibilities (especially if he tells you how much older his
69 brother is). Another reasonably safe guess is that he's probably located
70 in a rather urban area. Now, you can add a bit of active triangulation
71 to the mix, by telling his ISP that his IP address has been sharing
72 your intellectual property. If the owners of that IP address really
73 do have a girl in primary school and your intellectual property sounds
74 like something oriented towards kids, the parents' first defense is
75 likely to be that they don't fileshare, so it was probably their kid (or
76 maybe some guy who drove by with wifi, who happens to like kid stuff).
77 (Obviously, if you're a civilian, your country is likely to have laws
78 against you committing fraud like that, but intelligence agencies
79 routinely do this type of thing, so it's worthwhile understanding some
80 of the options physically available to an attacker, even if they're not
81 "legally" available to him. You certainly don't want your anonymity
82 dependent on an adversary "playing by the rules," do you?)
83
84 =head2 A Bit More Formality
85
86 A very powerful science for dealing with these types of problems is
87 Mathematics, so we gain an advantage if we can translate our problems into
88 Mathematics (and our solutions out of it, of course). Our Mathematical
89 model for triangulation is similar to that of geolocating a cellular phone
90 that dials for emergency assistance. Initially, we can only say that
91 the cellular phone is likely to be someplace on (or near) planet Earth.
92 Since we know that the cellular signal deteriorates over distance and we
93 know (based on the phone's specifications) the original signal strength at
94 source, each tower can gauge its distance from the phone by translating
95 backwards from its observed signal strength to meters. Most towers
96 are well out-of-range, and won't observe any measurable signal at all
97 (meaning an effectively infinite distance), while the nearby towers will
98 observe measurable signals. Now, each tower has a circle around it made
99 up of all the points at a particular distance from it. (Actually, it's a
100 three-dimensional sphere, but in our case, we're assuming the phone isn't
101 in flight or underground, for a bit of simplification. Real systems will
102 add an additional tower in order to triangulate in all three dimensions.)
103 Two intersecting circles will normally intersect (touch or cross over each
104 other) at two points. Three intersecting circles will rarely intersect
105 at more than a single point. Therefore, as long as the towers can safely
106 assume that the phone is broadcasting a uniform signal in all directions,
107 they can safely claim to have triangulated its position.
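
The circle-intersection step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a flat two-dimensional plane, exact distance measurements, and three towers that don't sit on one straight line. The function name C<trilaterate> and the tower layout are made up for the example.

```python
import math

def trilaterate(towers, distances):
    """Estimate a 2-D position from three tower positions and measured ranges.

    Subtracting the first circle's equation from the other two cancels the
    squared unknowns, leaving two linear equations in x and y.
    """
    (x1, y1), (x2, y2), (x3, y3) = towers
    r1, r2, r3 = distances
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("towers are collinear, so the position is ambiguous")
    # Cramer's rule for the 2x2 linear system.
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# Hypothetical layout: three towers and a phone whose true position is (3, 4).
towers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
phone = (3.0, 4.0)
distances = [math.dist(phone, tower) for tower in towers]
estimate = trilaterate(towers, distances)  # close to (3.0, 4.0), up to rounding
```

With noisy measurements the three circles won't meet at exactly one point, so a real system would presumably fit the position that disagrees least with all the towers instead of solving the equations exactly.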
108
109 Now, let's see if we can apply triangulation to our own problem space.
110 We know that there are approximately 6 billion people on our planet,
111 so we're starting out with a population of 6 billion candidates.
112 (Obviously, we're assuming that aliens don't have anything interesting to
113 do on our ICANN-dominated Internet, and so for all intents and purposes
114 don't count.) Now, there are many "dimensions" in which these people
115 are organized. (A dimension is simply a metric where each individual
116 has a potentially measurable coordinate.) For example, everybody has
117 a gender. Everybody lives in some country. Everybody has some level
118 of computer expertise, some level of Mathematical education, some set
119 of familiar authors, some set of favourite bands, some skin color and
120 some hair length, etc. Now, as you're able to intersect coordinates in
121 different dimensions, you can start eliminating unlikely candidates and
122 focusing on the likely ones. For example, the number of males is quite
123 high (on the order of 3 billion or so), the number of people in Portugal
124 is quite high, the number of 15-year-olds is quite high, the number of
125 stay-at-home parents is quite high, the number of people who are still
126 married to their first wife is quite high, and the number of parents with
127 two kids is quite high, but the number of Portuguese males around age 15
128 who stay at home to care for their two kids while their first wife is out
129 working is very low (probably well under 1000 - low enough for you to be
130 able to go door-to-door looking for him, if you'd recognize him by face).
131 Clearly, by triangulating coordinates between a variety of dimensions,
132 we're able to take the intersection of a variety of sets, which is quite
133 small when the sets have little in common (which is normally true when
134 there's no causal relationship between the sets in question).
135
136 Therefore, if you're that guy and you don't want others to find you,
137 you probably shouldn't give away too many facts about yourself.
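
To get a feel for how quickly the candidate pool shrinks, here is a rough Python sketch. It assumes the traits are statistically independent (the "no causal relationship" case above), and every fraction in it is a made-up placeholder, not a real statistic. The helper name C<expected_candidates> is likewise just for illustration.

```python
from math import prod

def expected_candidates(population, fractions):
    """Estimate how many people match every trait, assuming independence."""
    return population * prod(fractions.values())

# Placeholder fractions, chosen only to show the shape of the calculation.
traits = {
    "gender": 0.5,
    "country": 0.002,
    "age bracket": 0.015,
    "stays at home with the kids": 0.05,
}

# Start from the 6 billion candidates used above and add one trait at a time.
population = 6_000_000_000
known = {}
remaining_after = {}
for name, fraction in traits.items():
    known[name] = fraction
    remaining_after[name] = expected_candidates(population, known)
```

Each entry in C<remaining_after> is smaller than the one before it, and with these placeholder numbers four facts already take us from billions of candidates to a few thousand. Correlated traits shrink the pool more slowly, which is why the independence assumption matters.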
138
139 =head2 Countermeasures
140
141 Remember when we talked about the cellular phone geolocation problem,
142 where we noted that the towers need to assume the phone is broadcasting
143 the same value (in this case, the same starting signal strength) in
144 all directions? Obviously, a phone without an omnidirectional antenna
145 could point a different directional antenna at each nearby (or even far
146 away) tower, and transmit a highly focused signal at an arbitrary power
147 level to each tower, and thereby confuse the towers. Alternatively, it
148 could even work backwards through the triangulation algorithm in order
149 to figure out a set of inputs that would cause the towers to geolocate
150 the phone "accurately" as being kilometers away from its true location.
151 It should come as no surprise, then, that similar techniques work in
152 our own problem space. For example, how do you know that the guy is
153 really male? Given the other dimensions, wouldn't you say he's more
154 likely to be a female?
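
Working backwards through the algorithm can be sketched as follows. The model is an assumption made for the example: signal power falls off with the square of the distance, every tower assumes the same nominal transmit power, and the phone can aim an independent beam at each tower. All names here are hypothetical.

```python
import math

NOMINAL_POWER = 1.0  # transmit power the towers assume (arbitrary units)

def estimated_distance(received_power):
    """What a tower concludes, assuming inverse-square loss and nominal power."""
    return math.sqrt(NOMINAL_POWER / received_power)

def spoofed_powers(true_pos, fake_pos, towers):
    """Per-tower transmit power that makes each tower measure the range to fake_pos."""
    powers = []
    for tower in towers:
        d_true = math.dist(true_pos, tower)
        d_fake = math.dist(fake_pos, tower)  # must be non-zero
        powers.append(NOMINAL_POWER * (d_true / d_fake) ** 2)
    return powers

towers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_pos, fake_pos = (3.0, 4.0), (8.0, 7.0)
powers = spoofed_powers(true_pos, fake_pos, towers)
# Each tower receives power / d_true**2 and converts it back into a range.
ranges = [estimated_distance(p / math.dist(true_pos, t) ** 2)
          for p, t in zip(powers, towers)]
```

Every value in C<ranges> matches the distance from that tower to C<fake_pos>, so the towers' circles intersect at the fake location and the result looks perfectly consistent to them.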
155
156 =head2 Verification
157
158 Going back to our cellular phone geolocation problem, we left off
159 with our phone fooling the towers into thinking it's someplace else.
160 However, we didn't take into account that the towers themselves may
161 have directional antennas scanning around on a regular basis in order
162 to detect precisely this type of fraud. If the phone is supposed to be
163 southwest of one of our towers, why is its signal coming in from the east?
164 Not surprisingly, certain verification techniques may be applicable in
165 our own problem space. For example, suppose you somehow got a list of
166 all candidates, and then combed all of Portugal door-to-door looking
167 for the guy, and didn't find him? What if he told you that he was a
168 licensed pilot, but you couldn't find any pilot matching his description?
169 The goal of a verification algorithm is to assess the probability of
170 our data sources being correct. In other words, its job
171 is to tell us how likely it is that we've been fooled, not to find the
172 right answer. (Obviously, a verification algorithm may itself reveal
173 additional information that we can then triangulate with. For example,
174 the towers employing directional antennas can geolocate our phone with
175 the directional antennas (using the law of intersecting lines), without
176 even relying on the omnidirectional antennas. Therefore, the verification
177 algorithm in this particular case not only verifies the likelihood of the
178 triangulation, but actually provides its own alternative triangulation
179 dataset.)
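
A minimal sketch of the directional-antenna check, in Python. It assumes a flat plane, bearings measured in degrees counter-clockwise from east, and an arbitrary tolerance of 10 degrees. The function names and numbers are illustrative only.

```python
import math

def bearing(origin, target):
    """Direction from origin to target, in degrees counter-clockwise from east."""
    dx, dy = target[0] - origin[0], target[1] - origin[1]
    return math.degrees(math.atan2(dy, dx)) % 360

def angular_gap(a, b):
    """Smallest angle between two bearings, in degrees."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)

def suspicious_towers(claimed_pos, towers, observed_bearings, tolerance=10.0):
    """Indices of towers whose observed bearing contradicts the claimed position."""
    return [
        index
        for index, (tower, seen) in enumerate(zip(towers, observed_bearings))
        if angular_gap(bearing(tower, claimed_pos), seen) > tolerance
    ]

# The phone "should" be southwest of the tower, but the signal arrives from the east.
towers = [(0.0, 0.0)]
claimed = (-5.0, -5.0)
flagged = suspicious_towers(claimed, towers, observed_bearings=[0.0])
```

The check only says that something is inconsistent. It doesn't say where the phone really is, although the observed bearings from two or more towers could be intersected to get an independent position estimate.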
180
181 =head2 AnoNet
182
183 On AnoNet, the single most important factor in securing your anonymity is
184 precluding verification. If an adversary can't verify his data about you,
185 then he's trivially vulnerable to countermeasures, making it difficult for
186 him to trust the results of his triangulation (and making it difficult,
187 therefore, for him to even justify the cost of triangulating in the
188 first place).
189
190 For example, you probably don't want to recycle a nickname you
191 use elsewhere, since a simple Google search may give adversaries
192 a verification tool to use against anything they learn about you on
193 AnoNet. You also want to make sure that the public IP address you use
194 for peering doesn't geolocate your exact location (try MaxMind's online
195 tool, for example). A good way of getting around this one is to get a
196 VPS (Virtual Private Server) before peering with too many other guys.
197 There are plenty of cheap ones (well under 10 EUR or 10 USD each month),
198 and you can easily get a VPS in a different country. An even better
199 way of getting around this is to peer over I2P, if you don't mind
200 installing Java on your routers. If you're lucky, your ISP may
201 SNAT outgoing traffic from its users, giving you a certain amount of
202 "built-in" protection. If you're not comfortable giving a peer your IP
203 address and none of the above is an option, you may consider peering
204 using TCP over Tor or something. In addition, it's also possible to
205 exchange data using DNS, so if each of you has access to a DNS server
206 and some method to automatically load TXT records into it, you can
207 tunnel a VPN over it without either of you giving away his IP address.
208 (This particular method can also get around restrictive firewalls, which
209 may be independently useful.) Other things you probably don't want
210 to advertise are your name (especially not your full name), location,
211 age, marital status, occupation, school, and hobbies. Under normal
212 circumstances, it's safest to assume that anything you tell anybody
213 on AnoNet may be used by anybody else on AnoNet for triangulation or
214 verification attacks, and so the only reliable method of preventing
215 these types of attacks is to avoid leaking any verifiable information
216 to anyone on AnoNet. When that's not feasible, try to avoid giving
217 multiple pieces of information to individuals. For example, if you're
218 coming in through UFO's CP (client port), it's probably unwise to use his IRC server.
219 (It's also smart not to come onto IRC as soon as you connect, since
220 then UFO can guess that the guy who just joined IRC is probably the
221 same guy who just connected to his CP. To protect your anonymity from
222 the organizers of a darknet, it's imperative that you peer with someone
223 (preferably not an organizer) ASAP after joining. The more often you
224 come in through the CP, the higher the probability that an organizer
225 will find you. If you've come in over the CP more than a few times
226 before getting peered, you'll probably want to at least change your IRC
227 nickname before rejoining IRC after peering, so the darknet organizers
228 at least can't trivially connect your IcannNet IP address with your
229 AnoNet nickname. If a darknet's organizers try to put you through a
230 "hazing" period before they'll allow anybody to peer with you, that's
231 a strong indication that they don't care much for I<your> anonymity.
232 They may tell you that "nobody here trusts you enough yet to give you his
233 IP address," but that's (at best) just a thinly veiled way of saying that
234 "nobody here cares enough about your anonymity to have bothered to get
235 himself a VPS for peering." By making it difficult for new users to join,
236 they're effectively dooming their darknet into forever being a small and
237 incestuous club, a fraternity if you will, where everybody gradually gets
238 to know everybody else quite well (since static analysis works quite well
239 against rigid structures). An anonymity-preserving darknet makes it easy
240 for users to enter and exit at will, with the organizers keeping minimal
241 (or no) tabs, in order to resist static analysis.)
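
To see why joining IRC right after connecting to a client port is risky, consider how little work the correlation takes for someone who can see both logs. The Python sketch below is an assumption about what such an observer might do, not a description of anybody's actual tooling. The addresses come from the ranges reserved for documentation, and the nicknames and the two-minute window are invented.

```python
from datetime import datetime, timedelta

def link_by_timing(cp_connects, irc_joins, window=timedelta(minutes=2)):
    """Pair each IRC join with the one client-port connection seen shortly before it."""
    links = {}
    for nick, joined in irc_joins.items():
        candidates = [
            ip for ip, connected in cp_connects.items()
            if timedelta(0) <= joined - connected <= window
        ]
        if len(candidates) == 1:  # exactly one plausible match is a strong hint
            links[nick] = candidates[0]
    return links

# Hypothetical logs: when each outside IP connected, and when each nick joined IRC.
cp_connects = {
    "192.0.2.10": datetime(2010, 8, 15, 18, 0, 0),
    "198.51.100.7": datetime(2010, 8, 15, 16, 30, 0),
}
irc_joins = {
    "eager": datetime(2010, 8, 15, 18, 0, 40),
    "patient": datetime(2010, 8, 15, 19, 45, 0),
}
links = link_by_timing(cp_connects, irc_joins)
```

Here only C<eager> ends up linked to an address. Waiting a while before joining, or joining only after peering with someone else, takes this particular signal away, though it does nothing about the other leaks discussed above.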
242
243 =head2 AnoNet2 vs. The Competition
244
245 AnoNet2 aims to provide the best anonymity feasible with TCP/IP, through
246 a variety of techniques:
247
248 =over
249
250 =item minimizing required direct information disclosure
251
252 Most TCP/IP-based darknets require new users to submit a fair amount of
253 information up-front. Non-anonymizing darknets like dn42, for example,
254 expect users to sign up for a wiki account to register resources, to join
255 a mailing list for operational discussions, etc. (dn42, incidentally,
256 deserves special mention, as the resource database has recently been
257 migrated over to a decentralized resdb-like registry. In addition,
258 there's now an NNTP gateway to the mailing list reachable from inside
259 dn42, making it feasible to avoid giving away much information.)
260 So-called "anonymizing" darknets, by comparison, tend to turn these types
261 of expectations into policy requirements. A case in point is AnoNet1,
262 where new users are expected to go through a "hazing" process for 2-4
263 weeks before anybody is supposed to peer with them. During the "hazing"
264 process, the new user is expected to answer questions like "what brings
265 you here?" from an informal panel of existing members, and is expected
266 to "participate in the discussion" for a couple of weeks to prove that
267 he's serious about joining AnoNet1. (The official excuses range from
268 avoiding "drive-by peerings" to preventing infiltration by law enforcement
269 officials. The former commands a high price relative to the nuisance
270 factor of a temporary peering, while the latter is just plain laughable.)
271 AnoNet1 also requires members to maintain their resource registrations
272 on a centralized wiki, making certain information available to crzydmnd.
273 There is only one official client port (run by Kaos), and users are
274 discouraged from setting up additional ones. AnoNet2 gets this part
275 right by making it very easy for new users to join, and to peer as early
276 as technically possible.
277
278 =item avoiding centralization of critical infrastructure
279
280 Most TCP/IP-based darknets have a fair amount of centralized
281 infrastructure. Centralized infrastructure is problematic, since it
282 creates a single point of control (or eavesdropping), making it easy for
283 the operator to learn information that's not intended for him, and/or
284 alter transmissions that aren't intended for him. Typical examples are
285 things like resource databases, chatrooms, DNS, routing infrastructure,
286 documentation stores, forums, mailing lists, and public Web pages.
287 AnoNet1 is a model of centralized infrastructure, with centralized
288 mechanisms in-place for pretty much all of the above minus routing
289 (and even routing is quite centralized on AnoNet1, due to their peering
290 policies). Even dn42 (whose primary claim to fame is decentralization)
291 retains centralized mechanisms for IRC, wiki, mailing list, and public
292 Web pages. AnoNet2 has only a single point of centralization, in the
293 public Web pages here, and even they are easy for anybody on AnoNet2 to
294 modify (although there's still a centralized point of control over what
295 ends up getting published here and what doesn't, a point which has never
296 been used so far (a fact that's very easy to prove in a decentralized
297 way), and which will hopefully never be used). In addition, users are
298 encouraged to set up their own public Web pages and to put links to them
299 here, in order to further reduce centralization of AnoNet2's Web presence.
300 In addition to protecting your anonymity, this level of decentralization
301 makes it far more likely for AnoNet2 to survive a splitbrain condition
302 (where some bad guys take a number of central users out of the picture,
303 leaving a few disconnected fragments with critical services missing),
304 something that an anonymity-preserving darknet always has to plan for.
305 If AnoNet1 were to become split, the "non-central" side would most
306 likely wither away and die (a statistical fact that AnoNet1 used to
307 try and destroy AnoNet2 before it ever got off the ground), whereas if
308 AnoNet2 splits, the individual fragments should have no problem carrying
309 on indefinitely as independent darknets, and little difficulty merging
310 back together again if their paths cross at some point in the future.
311 What git and monotone do for software development, AnoNet2 does for
312 darknet development.
313
314 =item not requiring resource registration
315
316 AnoNet1 had a very powerful idea, of allowing people to mark a resource
317 "reserved" without specifying who has reserved it, but like most good
318 ideas in AnoNet1, this one also turned out incompatible with what
319 AnoNet1 has become. AnoNet2 takes this idea one step further: not only
320 can you easily leave out the "owner" field in a resource registration,
321 but you can even leave out the registration completely, and let someone
322 who happens to notice the resource in use (presumably, someone who's
323 scanning to make sure a resource is available before using it himself)
324 add it himself as "apparently in use."
325
326 =item not requiring resource exclusivity
327
328 In fact, AnoNet2 takes it a step further, by having no conflict resolution
329 policy for resources. This means two users can use the same IP address,
330 for example, and leave it up to routing to decide who "wins." (Under
331 normal circumstances that's not likely to happen, since at least one of
332 the users will almost certainly prefer to renumber rather than fighting
333 it out with the other guy. If they both want to fight it out, though,
334 there's no AnoNet2 rule that either of them is violating by refusing
335 to "talk it out," even if it's trivial to prove which guy's claim came
336 first.) This is intended to be useful during darknet merges, but it can
337 also aid in anonymity protection for cooperating users who agree among
338 themselves on some algorithm to determine who gets the resource when,
339 or perhaps they use the split routing to their advantage, SNATting (or
340 proxying) through each other for locations they can't reach directly
341 (or even for locations they I<can> reach directly, if they really
342 want to confuse an attacker - and themselves, if they're not careful).
343 The same thing goes for ASNs, domains, nicknames, etc. Static analysis
344 against any of these resource types is not guaranteed to yield useful
345 information (i.e., excessive triangulation may yield strange results),
346 and with only a little bit of coordination, a group of users can achieve
347 true anonymity, if that's really what they want.
348
349 =item avoiding bandwidth requirements for peering
350
351 Not everybody can afford a VPS, but everybody should be able to enjoy his
352 anonymity, not just as a leaf, but also as a transit. Conversely, many
353 users will want more path diversity, even if it means using slower links.
354 Therefore, AnoNet2 defines no rules about minimum bandwidth for peering.
355 Individual users can obviously do whatever they want, but there's no
356 official policy for them to use as an excuse. There's nothing wrong
357 with a transit node being on dial-up. If you prefer speed over path
358 diversity, just tell your router to avoid any path going through that ASN.
359 By the same token, if you have both VPSes and dial-up links and you want
360 to make it easy for people to implement different policies for routes
361 passing through each of them, it's probably wise to use different ASNs.
362
363 =item avoiding I<all> censorship
364
365 AnoNet1 officially sanctions some censorship, and unofficially practices
366 much more. The problem is that once you start complicating the
367 definition of censorship, where do you draw the line? AnoNet2 has a very
368 simple definition of censorship: interfering with communications of which
369 you are not the (I<the>, not I<an>) intended recipient. AnoNet2 doesn't
370 impose anybody's morals (nor anybody's legal system) on you, so feel
371 free to communicate anything you want. If we don't like what you say,
372 we can always just ignore you.
373
374 =item avoiding arbitrary restrictions on freedom
375
376 Working around restrictions wastes resources, so those who are determined
377 to achieve their goals will still achieve them, while the rest of us
378 suffer the consequences of a legal framework. To avoid wasting your
379 resources working around AnoNet2 rules, AnoNet2 simply avoids defining
380 any rules. Anything goes. If you manage to annoy enough people (and
381 you'll probably have to put in a serious effort, if you really want to
382 annoy enough of us), you'll most likely wind up forking AnoNet2, which
383 is probably what you'd want in that case, anyway.
384
385 =back
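
The item on bandwidth requirements above suggests telling your router to avoid paths through a slow ASN. How that's spelled depends entirely on your routing daemon, so the Python sketch below only shows the selection logic, not any real router's configuration syntax. The route records, peer names, and private-use ASNs are invented for the example.

```python
def usable_routes(routes, avoided_asns):
    """Keep only routes whose AS path avoids every ASN in avoided_asns."""
    avoided = set(avoided_asns)
    return [route for route in routes if avoided.isdisjoint(route["as_path"])]

def best_route(routes, avoided_asns):
    """Pick the shortest acceptable AS path, falling back to any route at all."""
    candidates = usable_routes(routes, avoided_asns) or routes
    return min(candidates, key=lambda route: len(route["as_path"]))

# Hypothetical routes to one prefix; 64513 stands in for the dial-up ASN.
routes = [
    {"via": "peer-a", "as_path": [64513, 64520]},
    {"via": "peer-b", "as_path": [64514, 64515, 64520]},
]
chosen = best_route(routes, avoided_asns=[64513])
```

The fallback is a design choice: this sketch would rather use a slow path than have no path at all. Someone who prefers to drop the destination entirely would return nothing when C<usable_routes> comes back empty.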