WEBVTT

0
00:01.550 --> 00:06.680
In this lecture we’ll dive deeper into how Onion Routing and Tor work.

1
00:07.500 --> 00:13.940
In the last lecture, we learned that in order to remain anonymous, the Tor client will bounce the data

2
00:13.940 --> 00:19.620
through three different servers or Tor relays and the last relay will connect to the Internet

3
00:19.620 --> 00:22.740
server, let's say a website, on your behalf.

4
00:23.250 --> 00:30.660
It's like a proxy. When building a Tor circuit, the first relay is called the “Entry Guard” and

5
00:30.660 --> 00:32.270
the last relay in the circuit


6
00:32.490 --> 00:33.960
is the “exit relay” 

7
00:34.230 --> 00:37.680
that sends the traffic out onto the public Internet.

8
00:38.790 --> 00:42.180
Let's open the Tor browser and connect to Google.

9
00:42.720 --> 00:48.900
Don't worry for the moment about how to get the Tor browser, because I'll show you how to install

10
00:48.920 --> 00:51.090
and use it in the next lectures.

11
00:51.990 --> 00:55.110
So I'm connecting to Google.com.

12
00:58.400 --> 01:05.710
This is Google and I'm connecting to it through the Tor Network. To see the circuit that Tor Browser

13
01:05.780 --> 01:11.510
is using for the current tab, go to the site information menu in the URL bar.

14
01:12.020 --> 01:13.520
This is the Tor circuit!

15
01:14.640 --> 01:16.640
Pay special attention to the Tor 

16
01:16.680 --> 01:17.370
relays.

17
01:17.760 --> 01:19.650
Note that there are three of them.

18
01:21.650 --> 01:26.360
My browser has connected to the Guard relay that is located in Ukraine.

19
01:26.750 --> 01:29.570
The middle relay is in the United Kingdom.

20
01:30.020 --> 01:37.790
And the exit relay is in France; the exit relay has connected to Google.com on my behalf.

21
01:38.820 --> 01:41.860
Let's take a deeper dive into how Tor works.

22
01:42.410 --> 01:49.870
The first step in building a Tor connection is for the Tor browser, or client, to download a list of Tor

23
01:49.880 --> 01:54.320
nodes, servers or relays from a known directory server.

24
01:55.160 --> 02:03.980
Each Tor client selects a few relays, at random, to use as entry points and uses only those relays for

25
02:03.980 --> 02:06.650
the first hop, which is normal Tor behavior.

26
02:07.040 --> 02:15.260
So the first relay in your circuit is called an "entry guard" and it’s a fast and stable relay that remains

27
02:15.290 --> 02:22.370
the first one in your circuit for 2-3 months in order to protect against a known anonymity-breaking 

28
02:22.370 --> 02:22.880
attack.

29
02:23.420 --> 02:27.890
The rest of your circuit changes with every new Web site you visit

30
02:28.250 --> 02:34.250
and together, all of these relays will provide the full privacy protection of Tor.

31
02:35.240 --> 02:42.020
Now, instead of taking a direct route from source to destination, as is the case with normal Internet

32
02:42.020 --> 02:49.250
routing, data packets on the Tor network take a random pathway through several relays and no observer,

33
02:49.310 --> 02:54.650
at any single point, can tell where the data came from, or where it is going to.

34
02:55.860 --> 03:01.890
To create a private network pathway within the Tor network, the Tor client builds a circuit

35
03:02.130 --> 03:05.610
of encrypted connections through three different relays.

36
03:06.090 --> 03:11.040
The client will negotiate a symmetric encryption key with each

37
03:11.040 --> 03:11.420
relay

38
03:11.700 --> 03:13.470
so there are three encryption keys.

39
03:14.040 --> 03:21.240
The client will encrypt the message three times, in layers, with these three encryption keys.

40
03:22.310 --> 03:29.000
It will first encrypt the message with K3; the resulting encrypted message will be encrypted again with

41
03:29.000 --> 03:29.590
K2

42
03:29.960 --> 03:37.370
and finally again with K1. It's like the layers of an onion and the message resides at the heart

43
03:37.580 --> 03:39.360
of the onion. Let's 

44
03:39.650 --> 03:41.090
see what that means!

45
03:42.050 --> 03:49.160
When the message leaves the Tor client, it has been encrypted three times; the outer layer being

46
03:49.160 --> 03:57.140
encrypted with K1, and that means that the first relay, the guard relay, can unlock the first layer of


47
03:57.140 --> 03:59.510
encryption because it has the key.

48
04:00.440 --> 04:03.250
The guard relay has the key named K1.

49
04:05.470 --> 04:13.510
Note that the guard relay cannot read the message because it’s still encrypted twice (with K2 and

50
04:13.510 --> 04:16.570
K3). To the guard relay, it’s complete gibberish.

51
04:17.500 --> 04:24.460
What it can do is forward the message to the middle relay, which has K2, and can decrypt the second

52
04:24.460 --> 04:24.840
layer.

53
04:26.110 --> 04:34.150
The middle relay cannot look at the message because it’s still encrypted with K3 and it doesn’t have

54
04:34.150 --> 04:34.660
the key.

55
04:35.620 --> 04:37.750
The middle relay will forward 

56
04:37.810 --> 04:45.850
the message to the last relay, called the exit relay, which in turn will decrypt the last layer with

57
04:45.850 --> 04:48.430
K3 and get the original message.

58
04:48.580 --> 04:56.320
So the final relay knows the destination, let's say twitter.com, and forwards the packet to that server.

59
04:57.290 --> 05:01.110
Note that the data in the packet could still be encrypted

60
05:01.220 --> 05:07.550
if the protocol is HTTPS, or unencrypted if the protocol is HTTP.

61
05:08.810 --> 05:12.620
The server receives the message and responds back.

62
05:13.130 --> 05:17.270
Now, on the way back, the exact reverse process happens.

63
05:18.620 --> 05:25.700
The first relay on the way back, which was the exit relay, gets the plain response, encrypts it with K3 and then

64
05:25.700 --> 05:33.860
forwards it to the middle relay, which encrypts it with K2 and forwards it to the guard relay, which

65
05:33.920 --> 05:39.290
in turn encrypts the message with K1 and sends it to the client.

66
05:40.850 --> 05:47.900
The client has all three keys and can unlock all three layers of encryption and get the original

67
05:47.900 --> 05:48.440
message,

68
05:48.770 --> 05:50.480
the response from the server.
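
NOTE
To make the layering concrete, here is a minimal Python sketch of the three-layer idea described above. It is only a toy: the XOR keystream built from SHA-256 is a stand-in for the real ciphers that Tor negotiates, and the key values, the function names keystream and xor_layer, and the sample messages are all hypothetical.
```python
import hashlib
# Toy keystream: SHA-256 in counter mode (an assumption, NOT Tor's real cryptography).
def keystream(key: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]
# XOR the data with the keystream; applying the same key twice removes the layer.
def xor_layer(key: bytes, data: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))
# Hypothetical symmetric keys shared with the guard, middle and exit relays.
K1, K2, K3 = b"guard-key", b"middle-key", b"exit-key"
message = b"GET / HTTP/1.1"
# Client: encrypt with K3 first, then K2, then K1 (the outermost layer).
onion = xor_layer(K1, xor_layer(K2, xor_layer(K3, message)))
# Forward path: each relay removes exactly one layer.
after_guard = xor_layer(K1, onion)          # still wrapped in K2 and K3
after_middle = xor_layer(K2, after_guard)   # still wrapped in K3
after_exit = xor_layer(K3, after_middle)    # the original message
# Return path: exit adds K3, middle adds K2, guard adds K1.
reply = b"HTTP/1.1 200 OK"
wrapped = xor_layer(K1, xor_layer(K2, xor_layer(K3, reply)))
# The client holds all three keys and removes all three layers.
unwrapped = xor_layer(K3, xor_layer(K2, xor_layer(K1, wrapped)))
```
In this sketch, the data is still unreadable after the guard and the middle relay remove their layers; the original message reappears only once the exit relay removes the K3 layer, and only the client can unwrap the reply.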

69
05:51.720 --> 05:54.160
And now comes something important!

70
05:54.550 --> 06:00.820
The Tor circuit is extended one hop at a time, and each relay along the way

71
06:00.850 --> 06:04.660
knows only which relay gave it data and which relay it is giving data to.

72
06:04.870 --> 06:10.200
So a relay knows only its direct neighbors. 

73
06:10.840 --> 06:19.450
No individual relay ever knows the complete path that a data packet has taken. Because each relay sees

74
06:19.690 --> 06:21.910
no more than one hop in the circuit,

75
06:22.360 --> 06:30.430
neither an eavesdropper nor a compromised relay can use traffic analysis to link the connection's source


76
06:30.490 --> 06:31.450
and destination.

77
06:32.350 --> 06:39.310
In this example, if the middle relay gets compromised, it knows neither the identity

78
06:39.400 --> 06:42.010
of the source nor the destination server.

79
06:42.820 --> 06:50.390
It only knows that it has received the message from the entry guard and that it has to forward 

80
06:50.400 --> 06:52.570
the message to the exit relay.

81
06:53.570 --> 07:02.180
If we continue with the example, the Guard Node only knows that the client is using Tor and the Exit

82
07:02.180 --> 07:07.100
Node only knows that someone, who is anonymous, is visiting

83
07:07.190 --> 07:12.440
twitter.com. The exit node doesn't know the identity of the source.


84
07:13.130 --> 07:16.760
It only knows the next hop on the way back to the Tor client.

85
07:18.150 --> 07:21.950
This is the Tor browser; if you connect to different websites

86
07:22.140 --> 07:25.710
then a new Tor circuit will be created for each website.

87
07:26.400 --> 07:33.300
But connections to a single website address will be made over the same Tor circuit, meaning you can

88
07:33.300 --> 07:41.550
browse different pages of a single website, in separate tabs or windows, without any loss of functionality.

89
07:42.400 --> 07:44.100
Let's take a look at that!

90
07:45.120 --> 07:49.040
Using the Tor browser, I'm connecting to Twitter.com.

91
07:56.040 --> 07:57.090
This is the circuit:

92
07:58.230 --> 08:01.500
Ukraine, Netherlands, Germany, and twitter.com.

93
08:03.520 --> 08:06.590
And I'll click on a link on this Web page.

94
08:06.770 --> 08:10.310
So, in fact, I am opening a new Web page from the same domain.

95
08:16.990 --> 08:18.520
It's the same circuit.

96
08:22.030 --> 08:26.500
This browser: Ukraine, Netherlands, Germany, and twitter.com.

97
08:27.610 --> 08:29.200
And here the same.

98
08:30.190 --> 08:34.870
But if I'm connecting to another server, let's say Facebook.com,

99
08:35.270 --> 08:37.120
a new circuit will be created.

100
08:40.020 --> 08:42.000
You see, it's another circuit!

101
08:44.000 --> 08:50.540
And for efficiency, the Tor client uses the same circuit for connections that happen within the same 

102
08:50.540 --> 08:52.200
10-minute period.

103
08:53.360 --> 09:00.900
Later requests are given a new circuit to keep people from linking someone's earlier actions to the

104
09:00.900 --> 09:01.510
new ones.

105
09:02.660 --> 09:11.960
Also, note that Tor only works for TCP streams and can be used by any application with SOCKS support.

106
09:12.650 --> 09:20.810
One drawback of using Tor is that because of these relays, it's much slower than a normal Internet

107
09:20.810 --> 09:22.730
connection or even a VPN.

108
09:23.240 --> 09:25.190
Just consider that the Tor

109
09:25.190 --> 09:30.500
nodes are run by volunteers and might be serving lots of clients at once.

110
09:30.770 --> 09:35.420
And also consider that they may not be geographically located nearby.

111
09:35.970 --> 09:42.560
Maybe a Tor client from Germany is connected to a server located in Romania, going through a relay from South

112
09:42.560 --> 09:43.100
Korea,

113
09:43.550 --> 09:45.680
and another one located in Canada.

114
09:46.160 --> 09:49.490
So the packets will make large global hops.

115
09:51.090 --> 09:52.250
OK, that's all!

116
09:52.560 --> 09:59.880
Now you have a deep understanding of how onion routing and the particular implementation called Tor

117
10:00.090 --> 10:00.600
work.