WEBVTT

0:00:02.960000 --> 0:00:07.920000
 Welcome to this video titled
 an introduction to QoS.

0:00:07.920000 --> 0:00:12.320000
 In this video I'm going to answer the
 questions, what problems are solved

0:00:12.320000 --> 0:00:15.420000
 by QoS, why would we use this?

0:00:15.420000 --> 0:00:18.760000
 How does QoS control traffic?

0:00:18.760000 --> 0:00:22.160000
 And we're going to look at a day in the
 life of a packet from the perspective

0:00:22.160000 --> 0:00:25.420000
 of a router and two different
 types of switches.

0:00:25.420000 --> 0:00:30.680000
 We're also going to look at the differences
 between buffers and queues.

0:00:30.680000 --> 0:00:33.320000
 So let's start with
 an overview of QoS.

0:00:33.320000 --> 0:00:36.560000
 With any acronym, the very first thing
 we need to know is what does the

0:00:36.560000 --> 0:00:38.400000
 acronym actually stand for?

0:00:38.400000 --> 0:00:42.160000
 QoS stands for quality of service.


0:00:42.160000 --> 0:00:44.280000
 Now what problem does it solve?

0:00:44.280000 --> 0:00:48.620000
 It provides predictable management
 of network resources during times

0:00:48.620000 --> 0:00:51.500000
 of congestion. Let's
 break that down.

0:00:51.500000 --> 0:00:55.120000
 So when we think of an interface, any
 interface, whether it be a Fast

0:00:55.120000 --> 0:00:58.760000
 Ethernet, a Gigabit Ethernet, a serial,
 whatever it is, and whether

0:00:58.760000 --> 0:01:03.140000
 it be on a router or switch, one of two
 things can be true of that interface.

0:01:03.140000 --> 0:01:06.320000
 Either that interface is
 congested or it's not.

0:01:06.320000 --> 0:01:10.540000
 If it's not congested, that means that
 as packets are going through that

0:01:10.540000 --> 0:01:15.040000
 router or switch, being directed to
 that interface to be transmitted,

0:01:15.040000 --> 0:01:16.020000
 there's no delay.

0:01:16.020000 --> 0:01:18.840000
 There's plenty of bandwidth, there's
 plenty of room, so as soon as those

0:01:18.840000 --> 0:01:21.920000
 packets or frames are sent to that
 interface, they're immediately put

0:01:21.920000 --> 0:01:24.460000
 on the wire and out they go.

0:01:24.460000 --> 0:01:25.800000
 So that's our perfect scenario.

0:01:25.800000 --> 0:01:29.440000
 And in a perfect world, when you think
 about the packets as they leave

0:01:29.440000 --> 0:01:34.360000
 your laptop or PC, all the interfaces
 they have to go through to get

0:01:34.360000 --> 0:01:35.620000
 to the destination.

0:01:35.620000 --> 0:01:39.620000
 In a perfect world, there wouldn't be
 congestion anywhere along that path.

0:01:39.620000 --> 0:01:44.000000
 So those packets would get to the
 destination as fast as possible.

0:01:44.000000 --> 0:01:47.700000
 There would be no holdup, no delay,
 and certainly none of your packets

0:01:47.700000 --> 0:01:49.260000
 would be dropped.

0:01:49.260000 --> 0:01:54.300000
 However, that's not always the case, because
 a lot of times, as an interface

0:01:54.300000 --> 0:01:58.760000
 is getting your packet to be transmitted,
 it has lots of other packets

0:01:58.760000 --> 0:02:03.000000
 from other sources that also need to
 be transmitted out that exact same

0:02:03.000000 --> 0:02:08.480000
 interface. So imagine a situation where
 a router or switch has 10 or 15

0:02:08.480000 --> 0:02:13.160000
 different interfaces that are all receiving
 packets at the same time, and all

0:02:13.160000 --> 0:02:19.220000
 those packets are being directed out one
 common egress or transmit interface.

0:02:19.220000 --> 0:02:23.160000
 Well, so much stuff is coming in at
 a rate that the transmit interface

0:02:23.160000 --> 0:02:27.720000
 can't transmit those packets onto the
 wire fast enough, now we could have

0:02:27.720000 --> 0:02:31.740000
 congestion. Where the transmit interface
 says, okay, since I can't get

0:02:31.740000 --> 0:02:35.040000
 them on the wire fast enough, I'm going
 to start buffering or holding

0:02:35.040000 --> 0:02:39.580000
 back these packets in memory in queues,
 and then put them out on the wire

0:02:39.580000 --> 0:02:41.840000
 as I can if I can catch up.

0:02:41.840000 --> 0:02:46.680000
 But if you can't catch up, and those
 queues end up getting full, now we

0:02:46.680000 --> 0:02:50.160000
 have to end up dropping stuff, because
 there's no place left to put them.
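
NOTE
A minimal sketch of the arithmetic behind the congestion just described: if packets arrive for an egress interface faster than it can transmit, the queue fills at the surplus rate. The function name, the 12 ports, the link speeds, and the 1 MB queue size are assumptions chosen for illustration, not figures from the talk.
```python
def seconds_until_drops(ingress_bps: float, egress_bps: float, queue_bytes: int) -> float:
    """Time until an empty egress queue is full, or infinity if the interface keeps up."""
    surplus_bps = ingress_bps - egress_bps
    if surplus_bps <= 0:
        return float("inf")  # not congested: nothing accumulates in the queue
    return queue_bytes * 8 / surplus_bps
# Hypothetical case: 12 ports each sending 100 Mbps toward one 1 Gbps egress interface.
fill_time = seconds_until_drops(12 * 100e6, 1e9, queue_bytes=1_000_000)
print(f"queue full after {fill_time * 1000:.0f} ms")
```
With these assumed numbers the queue lasts only tens of milliseconds, which is why a sustained overload ends in drops no matter how the memory is organized.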

0:02:50.160000 --> 0:02:56.340000
 Now without QoS, every packet is seen
 as the same by a router or switch.

0:02:56.340000 --> 0:03:00.560000
 For you as a human being, certain packets
 might have more or less priority,

0:03:00.560000 --> 0:03:04.320000
 maybe your IP phone packets have a higher
 priority than your web browsing

0:03:04.320000 --> 0:03:09.480000
 packets. But to the router or switch, without
 QoS turned on, and QoS is not

0:03:09.480000 --> 0:03:13.040000
 on by default, every packet
 is seen as the same.

0:03:13.040000 --> 0:03:18.560000
 So if a memory queue starts getting full,
 packets are going to be delayed,

0:03:18.560000 --> 0:03:21.780000
 and it could be your voice packet that's
 delayed, it could be your data

0:03:21.780000 --> 0:03:23.500000
 packet that's delayed.

0:03:23.500000 --> 0:03:27.980000
 If memory queues get completely full and
 packets have to start getting dropped,

0:03:27.980000 --> 0:03:32.340000
 once again, completely random, completely
 uncontrolled as far as what

0:03:32.340000 --> 0:03:34.480000
 will be dropped.

0:03:34.480000 --> 0:03:39.340000
 So quality of service provides some
 predictable management during those

0:03:39.340000 --> 0:03:41.720000
 times of congestion.

0:03:41.720000 --> 0:03:47.520000
 So for you as a human being, it assists
 in maximizing your end user experience

0:03:47.520000 --> 0:03:49.860000
 of your critical sessions.

0:03:49.860000 --> 0:03:52.580000
 I mean after all, if there's going to
 be some delay in the network somewhere

0:03:52.580000 --> 0:03:56.300000
 because some queue is getting filled
 up, because an interface just can't

0:03:56.300000 --> 0:04:00.700000
 put the bits on the wire fast enough,
 wouldn't you prefer that that delay

0:04:00.700000 --> 0:04:04.660000
 happen when you're trying to download
 some web page like Google.com or

0:04:04.660000 --> 0:04:08.940000
 something versus a delay in your IP phone
 conversation where you're talking

0:04:08.940000 --> 0:04:13.380000
 to your customer? QoS will
 help you manage that.

0:04:13.380000 --> 0:04:18.140000
 And it provides differentiated services
 to packets based on predefined

0:04:18.140000 --> 0:04:23.200000
 user criteria. So part of QoS is that
 you as a network administrator or

0:04:23.200000 --> 0:04:28.020000
 a network engineer, you have to get onto
 that router or switch, and you have

0:04:28.020000 --> 0:04:48.040000
 to take what's in your head like voice
 is more important than data.

0:04:48.040000 --> 0:04:50.100000
 So how does QoS do this?

0:04:50.100000 --> 0:04:52.040000
 How does it control
 network traffic?

0:04:52.040000 --> 0:04:55.960000
 Well, there's a lot of different QoS
 features, more than any one person

0:04:55.960000 --> 0:04:57.880000
 could ever possibly know.

0:04:57.880000 --> 0:05:02.240000
 And now some of those features are designed
 to accomplish a very specific

0:05:02.240000 --> 0:05:05.980000
 task. Other features when you turn
 them on can accomplish two or three

0:05:05.980000 --> 0:05:08.540000
 things at one time.

0:05:08.540000 --> 0:05:13.580000
 So when we talk about the various tasks
 at a really high level of what

0:05:13.580000 --> 0:05:17.800000
 QoS can do, we can break it down to
 about five or six different things

0:05:17.800000 --> 0:05:21.940000
 or categories. Number one, there's
 classification of data.

0:05:21.940000 --> 0:05:25.360000
 This is what I was just describing
 when I said, hey, you need to turn

0:05:25.360000 --> 0:05:30.340000
 on some feature that will recognize
 voice, that will recognize video,

0:05:30.340000 --> 0:05:33.860000
 that will recognize data and be able
 to distinguish between them.

0:05:33.860000 --> 0:05:39.100000
 Whatever that feature is, we would
 call that a classification feature.

0:05:39.100000 --> 0:05:40.940000
 There's also queue management.

0:05:40.940000 --> 0:05:45.100000
 So as my interface is saying, hey, I can't
 put bits on the wire fast enough.

0:05:45.100000 --> 0:05:47.600000
 They keep hitting me on the back of
 the head and I can't get them on the

0:05:47.600000 --> 0:05:49.080000
 cable quickly enough.

0:05:49.080000 --> 0:05:52.800000
 I need to put those bits
 into a memory called a queue.

0:05:52.800000 --> 0:05:56.140000
 Well, there's QoS features that talk
 about how do you do that and how

0:05:56.140000 --> 0:06:01.040000
 do you manage that queue such as, you know,
 how do we control the size of

0:06:01.040000 --> 0:06:02.980000
 the queue or the placement of packets?


0:06:02.980000 --> 0:06:05.740000
 The scheduling order,
 the transmission rate.

0:06:05.740000 --> 0:06:08.080000
 There's QoS features that
 deal with all of that.

0:06:08.080000 --> 0:06:11.320000
 There's something called
 preemptive queue drops.

0:06:11.320000 --> 0:06:13.980000
 This is another type of QoS feature
 that you can turn on.

0:06:13.980000 --> 0:06:17.100000
 You can say, hey, look, this memory
 queue that you have that's filling

0:06:17.100000 --> 0:06:21.380000
 up with packets because you just can't
 get them on the wire fast enough.

0:06:21.380000 --> 0:06:25.120000
 Well, maybe what we want to do as packets
 are trying to get into that

0:06:25.120000 --> 0:06:29.700000
 memory queue, if they're low priority,
 maybe we should just preemptively

0:06:29.700000 --> 0:06:34.500000
 drop those. So we always reserve a little
 bit of room in that queue for

0:06:34.500000 --> 0:06:38.420000
 our higher priority traffic,
 like our voice.
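
NOTE
A minimal sketch of the preemptive drop idea: low-priority packets are refused once the queue passes a threshold, so the remaining space stays available for high-priority traffic. The class name, the two-level priority flag, and the 8-of-10 threshold are assumptions for illustration; real platforms implement this with their own features and tunables.
```python
from collections import deque
class EgressQueue:
    """One egress queue that stops admitting low-priority packets before it is full."""
    def __init__(self, capacity: int, low_priority_limit: int) -> None:
        self.capacity = capacity                      # total packets the queue can hold
        self.low_priority_limit = low_priority_limit  # depth at which low priority is refused
        self.packets: deque = deque()
    def enqueue(self, name: str, high_priority: bool) -> bool:
        limit = self.capacity if high_priority else self.low_priority_limit
        if len(self.packets) >= limit:
            return False                              # dropped before entering the queue
        self.packets.append(name)
        return True
queue = EgressQueue(capacity=10, low_priority_limit=8)
results = [queue.enqueue(f"web-{i}", high_priority=False) for i in range(10)]
voice_ok = queue.enqueue("voice-0", high_priority=True)
print(results.count(False), voice_ok)
```
In this run the last two web packets are refused while the voice packet that arrives afterward still finds room.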

0:06:38.420000 --> 0:06:40.780000
 And there's also the
 marking of packets.

0:06:40.780000 --> 0:06:44.660000
 There's a difference between as packets
 come in, being able to identify

0:06:44.660000 --> 0:06:49.820000
 them based on protocol or application
 or the nature of the packet, how

0:06:49.820000 --> 0:06:56.020000
 it behaves versus putting a label on
 that packet like a simple number,

0:06:56.020000 --> 0:07:00.480000
 like one, two, or three, and then having
 subsequent network devices like

0:07:00.480000 --> 0:07:04.380000
 subsequent routers and switches, being
 able to just look at that marked

0:07:04.380000 --> 0:07:07.740000
 number to make their
 QoS decisions.

0:07:07.740000 --> 0:07:11.300000
 So placing that number or that label
 on a packet is what's called the

0:07:11.300000 --> 0:07:12.780000
 marking of packets.
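
NOTE
A minimal sketch of the difference between classification and marking. Classification inspects the packet; marking writes a simple number onto it so later devices only read the number. The Packet fields, the port ranges, and the labels 1 to 3 are assumptions for illustration and are not taken from any particular platform.
```python
from dataclasses import dataclass
@dataclass
class Packet:
    protocol: str      # "udp" or "tcp"
    dst_port: int
    mark: int = 0      # 0 means "not marked yet"
def classify(packet: Packet) -> str:
    """Identify the traffic type from header fields (hypothetical rules)."""
    if packet.protocol == "udp" and 16384 <= packet.dst_port <= 32767:
        return "voice"
    if packet.protocol == "tcp" and packet.dst_port in (80, 443):
        return "web"
    return "other"
MARK_FOR_CLASS = {"voice": 3, "web": 2, "other": 1}   # simple number labels
def mark(packet: Packet) -> Packet:
    """Done once, at the first device that classifies the packet."""
    packet.mark = MARK_FOR_CLASS[classify(packet)]
    return packet
def queue_for(packet: Packet) -> str:
    """A downstream device reads only the mark and never re-inspects the packet."""
    return {3: "high", 2: "medium"}.get(packet.mark, "low")
print(queue_for(mark(Packet("udp", 20000))))
```
The design point is that the expensive inspection happens once, and every later router or switch makes its decision from the cheap label.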

0:07:12.780000 --> 0:07:19.040000
 Now, the vast majority of what these
 QoS features do has something to

0:07:19.040000 --> 0:07:23.200000
 do with memory, the manipulation
 of router or switch memory.

0:07:23.200000 --> 0:07:27.120000
 And so we have to be aware of the differences
 between something called

0:07:27.120000 --> 0:07:30.400000
 buffers and queues.

0:07:30.400000 --> 0:07:35.140000
 So a buffer is the physical memory used
 to store packets both before and

0:07:35.140000 --> 0:07:37.720000
 after a forwarding
 decision is made.

0:07:37.720000 --> 0:07:41.420000
 As it says here on routers, the same
 memory can be allocated to interfaces

0:07:41.420000 --> 0:07:46.240000
 as interface ingress
 or egress queues.

0:07:46.240000 --> 0:07:52.880000
 Now, shared memory, which is part of
 the allocated buffers, is also used by

0:07:52.880000 --> 0:07:55.040000
 a lot of other CPU processes.

0:07:55.040000 --> 0:07:57.160000
 So let's talk about this
 for just one moment.

0:07:57.160000 --> 0:08:08.460000
 Imagine for a moment here that I have
 my router and here are his interfaces.

0:08:08.460000 --> 0:08:13.980000
 Just a few interfaces down here, maybe
 a Fast Ethernet interface, another

0:08:13.980000 --> 0:08:17.860000
 Fast Ethernet, maybe
 a serial interface.

0:08:17.860000 --> 0:08:23.620000
 Now that router also has a CPU, which
 is his brain, and he has some physical

0:08:23.620000 --> 0:08:29.300000
 memory space. So this is a physical
 memory chip right here.

0:08:29.300000 --> 0:08:34.600000
 And each one of these squares inside
 here represents a memory cell, a

0:08:34.600000 --> 0:08:39.680000
 location where he can store a zero
 or a one as an electrical charge.

0:08:39.680000 --> 0:08:44.020000
 So all of this would be considered
 my memory buffers.

0:08:44.020000 --> 0:08:47.060000
 So it's my physical memory.

0:08:47.060000 --> 0:08:50.960000
 And the CPU, when he boots up, when
 this router or switch boots up, he's

0:08:50.960000 --> 0:08:55.500000
 going to decide how to carve out
 or allocate these buffers.

0:08:55.500000 --> 0:08:58.320000
 He might say, okay, this part
 of the buffers right here

0:08:58.320000 --> 0:09:01.400000
 I'm going to use for storing
 my routing table information.

0:09:01.400000 --> 0:09:03.580000
 This part of the buffers
 right here

0:09:03.580000 --> 0:09:09.360000
 I'm going to use to store maybe my
 OSPF link-state database, if I'm

0:09:09.360000 --> 0:09:13.880000
 running OSPF. And maybe this part of
 the buffers right here, I'm going

0:09:13.880000 --> 0:09:16.340000
 to use as my interface buffers.

0:09:16.340000 --> 0:09:21.500000
 Those will be my interface
 buffers.

0:09:21.500000 --> 0:09:28.880000
 Now, without QoS turned on, without
 QoS enabled, the router is basically

0:09:28.880000 --> 0:09:35.120000
 going to say, okay, of my interface buffers,
 I'm going to take maybe this

0:09:35.120000 --> 0:09:41.020000
 section right here, and I'm going to allocate
 that for one of my interfaces.

0:09:41.020000 --> 0:09:51.020000
 That'll be my transmit buffer for
 FastEthernet 0/0.

0:09:51.020000 --> 0:09:56.020000
 So as packets come in, and if they have
 to be transmitted out FastEthernet

0:09:56.020000 --> 0:09:59.840000
 0/0, I will store their ones
 and zeros in this section of memory

0:09:59.840000 --> 0:10:06.260000
 right here. Now, without QoS, we've
 got this physical memory here for

0:10:06.260000 --> 0:10:11.300000
 FastEthernet 0/0, but anything that's
 put in there is treated equally.

0:10:11.300000 --> 0:10:14.480000
 In other words, whatever goes in there
 might be voice, it might be video,

0:10:14.480000 --> 0:10:17.360000
 it might be data, but the router
 or switch doesn't care.

0:10:17.360000 --> 0:10:19.720000
 It's first in, first out.

0:10:19.720000 --> 0:10:23.880000
 And if that thing starts getting full
 and ends up having to drop stuff,

0:10:23.880000 --> 0:10:26.660000
 there's no telling what's going
 to be dropped in there.

0:10:26.660000 --> 0:10:29.560000
 It's just completely random,
 first in, first out.

0:10:29.560000 --> 0:10:32.020000
 That is what's called a buffer.
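
NOTE
A minimal sketch of the single transmit buffer described here, with no QoS: packets leave in arrival order, and once the buffer is full whatever arrives next is lost, regardless of type. The class name, the capacity of 3, and the packet labels are assumptions for illustration.
```python
from collections import deque
class FifoBuffer:
    """Single transmit buffer without QoS: first in, first out, drop when full."""
    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.slots: deque = deque()
        self.dropped: list = []
    def enqueue(self, packet: str) -> None:
        if len(self.slots) >= self.capacity:
            self.dropped.append(packet)   # full: the packet type is never examined
        else:
            self.slots.append(packet)
    def transmit(self) -> str:
        return self.slots.popleft()       # the oldest packet leaves first
buf = FifoBuffer(capacity=3)
for pkt in ["data-1", "video-1", "data-2", "voice-1"]:
    buf.enqueue(pkt)
print(buf.dropped, buf.transmit())
```
Here the voice packet is the one lost simply because it arrived last, which is the uncontrolled behavior the talk is pointing at.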

0:10:32.020000 --> 0:10:38.680000
 Now, what's the difference
 between a buffer and a queue?

0:10:38.680000 --> 0:10:46.540000
 Well, when we turn on quality of service,
 now we can manipulate that buffer.

0:10:46.540000 --> 0:10:50.820000
 We can say, hey, that physical space, that
 physical buffer that was allocated

0:10:50.820000 --> 0:10:55.900000
 for FastEthernet 0/0, now we
 turn on QoS, why don't we carve it

0:10:55.900000 --> 0:10:59.700000
 out into like some logical spaces?


0:10:59.700000 --> 0:11:06.040000
 In other words, maybe right here was
 that buffer space that was allocated

0:11:06.040000 --> 0:11:10.380000
 for FastEthernet 0/0, so this
 is part of the larger physical memory

0:11:10.380000 --> 0:11:14.800000
 chip. But with QoS turned on, we can
 say, hey, why don't we logically

0:11:14.800000 --> 0:11:17.500000
 carve that up into
 some subsections?

0:11:17.500000 --> 0:11:21.360000
 Like, why don't we say this part of
 the buffer space here, we're going

0:11:21.360000 --> 0:11:27.580000
 to call it the low priority queue.

0:11:27.580000 --> 0:11:33.160000
 And then why don't we take another
 buffer space right here, and we'll

0:11:33.160000 --> 0:11:38.000000
 call this the medium priority queue, and
 then why don't we take another part

0:11:38.000000 --> 0:11:42.960000
 of the buffer space, like maybe this
 section right here, and call this

0:11:42.960000 --> 0:11:48.220000
 the high priority queue.

0:11:48.220000 --> 0:11:54.320000
 And with QoS, we can dictate or determine
 which packets are placed into

0:11:54.320000 --> 0:11:56.100000
 each one of these sections.

0:11:56.100000 --> 0:12:00.640000
 Furthermore, once we dictate
 with QoS which packets, for example,

0:12:00.640000 --> 0:12:07.220000
 go in the low priority queue, we can determine
 how those packets will be serviced.

0:12:07.220000 --> 0:12:11.700000
 Will that queue be completely emptied out
 before the medium priority queue is

0:12:11.700000 --> 0:12:16.800000
 even looked at? Or will we maybe service
 five packets from the low priority

0:12:16.800000 --> 0:12:21.800000
 queue, pause that queue, move over to the medium
 priority queue, and service ten packets

0:12:21.800000 --> 0:12:25.420000
 from there, pause that, and then
 move over to the high priority queue.

0:12:25.420000 --> 0:12:31.340000
 You see, without QoS, this was just
 generic memory buffers, first in,

0:12:31.340000 --> 0:12:36.040000
 first out. But with QoS turned on,
 we can now view that memory buffer

0:12:36.040000 --> 0:12:43.860000
 from a logical perspective of dividing
 it up into these queues, so that's

0:12:43.860000 --> 0:12:47.960000
 the difference between
 queues and buffers.
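
NOTE
A minimal sketch of the second servicing option mentioned above: visit each queue in turn and take a fixed number of packets per visit (a weighted round-robin). The weights of 5 and 10 follow the spoken example; the weight of 20 for the high priority queue, the queue contents, and the function name are assumptions for illustration.
```python
from collections import deque
def weighted_round_robin(queues: dict, weights: dict) -> list:
    """Drain the queues in rounds, taking up to weights[name] packets per visit."""
    order = []
    while any(queues.values()):
        for name, weight in weights.items():
            for _ in range(weight):
                if not queues[name]:
                    break                      # queue empty: move on to the next one
                order.append(queues[name].popleft())
    return order
queues = {
    "low": deque(f"L{i}" for i in range(7)),
    "medium": deque(f"M{i}" for i in range(12)),
    "high": deque(f"H{i}" for i in range(3)),
}
weights = {"low": 5, "medium": 10, "high": 20}   # visit order and per-visit quota
sent = weighted_round_robin(queues, weights)
print(" ".join(sent))
```
The first round sends five low, ten medium, and all three high priority packets; the leftovers go out in the second round. The other option from the talk, emptying one queue completely before looking at the next, would replace the per-visit quota with "keep going until empty".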

0:12:47.960000 --> 0:12:52.800000
 Alright, so let's take a look at
 a day in the life of a packet

0:12:52.800000 --> 0:12:56.860000
 as it moves through a router and two
 different types of switches, so

0:12:56.860000 --> 0:13:01.700000
 we can get a better feel for how these
 buffers and Qs are utilized.

0:13:01.700000 --> 0:13:08.660000
 Alright, so step number one, a packet
 arrives on an ingress interface.

0:13:08.660000 --> 0:13:13.060000
 Now the moment that ingress interface
 starts detecting the bits coming

0:13:13.060000 --> 0:13:16.260000
 in, it detects the electrical energy,
 or if we're talking about a fiber

0:13:16.260000 --> 0:13:19.420000
 optic interface, it starts detecting
 the laser light and the different

0:13:19.420000 --> 0:13:23.180000
 frequencies of light, it's able to
 say, ah, ones and zeros are coming

0:13:23.180000 --> 0:13:28.160000
 in. The moment that happens, those ones
 and zeros have to be stored somewhere,

0:13:28.160000 --> 0:13:31.600000
 and so they're put into physical memory
 buffers, and we call this the

0:13:31.600000 --> 0:13:36.580000
 receive ring. Okay, so the receive ring
 is simply a memory buffer that's

0:13:36.580000 --> 0:13:39.480000
 accessed via direct memory access.


0:13:39.480000 --> 0:13:44.320000
 In other words, as soon as it comes
 in, whatever the physical electronic

0:13:44.320000 --> 0:13:48.900000
 components are that make up that interface,
 those electronic components

0:13:48.900000 --> 0:13:53.600000
 have direct memory access to the buffers
 where it can store these ones

0:13:53.600000 --> 0:13:55.480000
 and zeros as they're coming in.

0:13:55.480000 --> 0:14:00.880000
 So this is what we call
 the receive ring.

0:14:00.880000 --> 0:14:04.920000
 Alright, so now the packet is queued
 in a memory buffer, it's being stored

0:14:04.920000 --> 0:14:07.860000
 in there. We had to wait until the
 whole thing came in from beginning

0:14:07.860000 --> 0:14:12.000000
 to end, and now a forwarding
 decision is made.

0:14:12.000000 --> 0:14:15.540000
 So typically speaking, once something's
 placed into the receive ring,

0:14:15.540000 --> 0:14:19.660000
 now there's some process that goes to
 the CPU and says, hey, knock, knock,

0:14:19.660000 --> 0:14:23.560000
 knock CPU. I've got something over
 here in memory cell locations, you

0:14:23.560000 --> 0:14:26.860000
 know, 145A through
 145C.

0:14:26.860000 --> 0:14:29.060000
 I need you to take a look at this.


0:14:29.060000 --> 0:14:33.720000
 So now the CPU will pause whatever
 it's doing, and it'll start the IP

0:14:33.720000 --> 0:14:37.220000
 input process. It'll say, okay,
 let me look at that memory.

0:14:37.220000 --> 0:14:40.760000
 Oh, okay, right here in this block of
 memory, here's what the destination

0:14:40.760000 --> 0:14:43.880000
 IP address is, or if we're talking
 about a switch, the destination MAC

0:14:43.880000 --> 0:14:46.220000
 address, let me go look it up.

0:14:46.220000 --> 0:14:50.000000
 So now a forwarding decision is made,
 which involves, you know, should

0:14:50.000000 --> 0:14:52.320000
 I forward this packet
 or should I drop it?

0:14:52.320000 --> 0:14:54.300000
 Should I translate it via NAT?

0:14:54.300000 --> 0:14:55.300000
 You know, what should
 I do with it?

0:14:55.300000 --> 0:14:59.640000
 And then the last component of that
 is, where's this packet going?

0:14:59.640000 --> 0:15:03.040000
 Is it going to be forwarded out
 FastEthernet 0/0 or Gigabit 1/1?

0:15:03.040000 --> 0:15:09.460000
 Now, once we determine the outbound
 or egress interface, now that

0:15:09.460000 --> 0:15:12.820000
 packet will be placed onto
 the hardware transmit ring.

0:15:12.820000 --> 0:15:17.800000
 Now this might not actually mean
 physical movement of the bits.

0:15:17.800000 --> 0:15:21.960000
 You see, when those bits came in, and
 we're talking about maybe thousands

0:15:21.960000 --> 0:15:27.140000
 of bits just for one single packet or
 frame, when they came in for a few

0:15:27.140000 --> 0:15:32.560000
 microseconds, all those memory cells
 that were accessed via direct memory

0:15:32.560000 --> 0:15:37.760000
 access, for those memory cells we'd say,
 okay, we're going to classify these

0:15:37.760000 --> 0:15:42.500000
 as part of the receive ring for,
 let's say, Serial 1/1.

0:15:42.500000 --> 0:15:44.660000
 That's where they came
 in, the receive ring.

0:15:44.660000 --> 0:15:48.220000
 Now, once the forwarding engine has
 determined, all right, that string

0:15:48.220000 --> 0:15:52.920000
 of bits making up that packet, those
 need to go out FastEthernet 0/0.

0:15:52.920000 --> 0:15:57.640000
 Part of the CPU's job is to say,
 okay, those same bits, you can stay

0:15:57.640000 --> 0:16:01.320000
 where you are. We don't have to move
 you, but those memory cells we're

0:16:01.320000 --> 0:16:05.800000
 now going to recategorize as part
 of the transmit ring of FastEthernet

0:16:05.800000 --> 0:16:07.980000
 0/0. So you see
 what we did there?

0:16:07.980000 --> 0:16:09.240000
 We didn't actually move them.

0:16:09.240000 --> 0:16:10.560000
 They stayed right where they were.


0:16:10.560000 --> 0:16:14.040000
 We just sort of looked at those
 memory cells differently.

0:16:14.040000 --> 0:16:17.400000
 They're now part of the transmit
 ring of the egress interface.
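
NOTE
A minimal sketch of the "we didn't actually move them" idea: the packet bytes stay in place and only the bookkeeping changes, so the buffer is relabeled from the receive ring to the transmit ring. The class, the ring lists, and the use of Python's id() as a stand-in for a memory address are assumptions for illustration.
```python
class PacketBuffer:
    """A block of memory cells plus a label saying which ring currently owns it."""
    def __init__(self, data: bytes, owner: str) -> None:
        self.data = bytearray(data)
        self.owner = owner
def hand_off(buffer: PacketBuffer, rx_ring: list, tx_ring: list, egress: str) -> None:
    """Move only the reference and the ownership label; the bytes are not copied."""
    rx_ring.remove(buffer)
    buffer.owner = f"tx-ring {egress}"
    tx_ring.append(buffer)
rx_ring_serial = [PacketBuffer(b"\x45\x00 example packet", owner="rx-ring Serial1/1")]
tx_ring_fa00: list = []
packet = rx_ring_serial[0]
address_before = id(packet.data)
hand_off(packet, rx_ring_serial, tx_ring_fa00, egress="FastEthernet0/0")
print(tx_ring_fa00[0].owner, id(tx_ring_fa00[0].data) == address_before)
```
The alternative described next in the talk, physically moving the bits, would correspond to copying the bytearray into a different buffer instead of relabeling this one.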

0:16:17.400000 --> 0:16:20.640000
 Or sometimes, they're actually
 physically moved.

0:16:20.640000 --> 0:16:24.020000
 They're taken out of those memory cells
 and put into another section of

0:16:24.020000 --> 0:16:28.400000
 memory cells, which is the transmit
 ring for the egress interface.

0:16:28.400000 --> 0:16:32.880000
 And then they're transmitted
 onto the egress media.

0:16:32.880000 --> 0:16:36.440000
 They're put out onto the wire.

0:16:36.440000 --> 0:16:40.720000
 Now, what about a switch
 that does shared memory?

0:16:40.720000 --> 0:16:45.300000
 Now, with switches, let me just draw
 something here real quickly.

0:16:45.300000 --> 0:16:49.060000
 Switches are broken down into sort of
 at a high level, two different types

0:16:49.060000 --> 0:16:50.660000
 of architectures.

0:16:50.660000 --> 0:16:57.420000
 We have shared memory, and
 we have distributed memory.

0:16:57.420000 --> 0:17:00.020000
 This slide is talking
 about shared memory.

0:17:00.020000 --> 0:17:01.920000
 The next slide is going to talk
 about distributed memory.

0:17:01.920000 --> 0:17:04.420000
 Here's the main difference.

0:17:04.420000 --> 0:17:13.220000
 So here are my interfaces of my
 two different types of switches.

0:17:13.220000 --> 0:17:21.860000
 And we have some sort of ASIC, an
 application-specific integrated circuit,

0:17:21.860000 --> 0:17:27.020000
 that's controlling these interfaces
 that's in charge of receiving bits

0:17:27.020000 --> 0:17:30.420000
 off the wire and transmitting
 bits onto the wire.

0:17:30.420000 --> 0:17:33.180000
 And it might be like
 I've drawn here.

0:17:33.180000 --> 0:17:37.300000
 It might be that several interfaces
 are shared among a single ASIC.

0:17:37.300000 --> 0:17:41.400000
 Sometimes on higher end platforms, each
 interface will have its own dedicated

0:17:41.400000 --> 0:17:46.000000
 ASIC. But from the perspective of
 memory, that's kind of irrelevant.

0:17:46.000000 --> 0:17:47.840000
 So here's the difference.

0:17:47.840000 --> 0:17:51.880000
 In a shared memory architecture, and
 let's say that we also have another

0:17:51.880000 --> 0:17:54.420000
 ASIC, which is our
 forwarding engine.

0:17:54.420000 --> 0:17:56.300000
 I'll just mark it as FE here.

0:17:56.300000 --> 0:18:00.320000
 So here's the workhorse that's going
 to figure out what to do with a frame

0:18:00.320000 --> 0:18:02.680000
 or a packet and where to send it.

0:18:02.680000 --> 0:18:04.440000
 That's the forwarding engine.

0:18:04.440000 --> 0:18:06.620000
 All right, now here's
 the difference.

0:18:06.620000 --> 0:18:13.820000
 In a shared memory architecture, when
 a frame comes in, it's forwarded

0:18:13.820000 --> 0:18:20.080000
 to a big block of shared
 memory up here.

0:18:20.080000 --> 0:18:28.280000
 So like maybe this portion of memory
 right here might be allocated as

0:18:28.280000 --> 0:18:35.580000
 ingress memory or receive memory
 for a particular port.

0:18:35.580000 --> 0:18:38.020000
 Maybe FastEthernet
 0/1.

0:18:38.020000 --> 0:18:40.820000
 But it's just part of
 a big block of memory.

0:18:40.820000 --> 0:18:45.240000
 So this memory is shared among
 multiple interfaces.

0:18:45.240000 --> 0:18:54.220000
 In a distributed memory architecture,
 each interface has its own memory.

0:18:54.220000 --> 0:19:01.600000
 So as a frame comes in, that interface
 has its own dedicated buffers that

0:19:01.600000 --> 0:19:04.420000
 it doesn't share with
 anybody else.

0:19:04.420000 --> 0:19:06.820000
 They're allocated just for that.

0:19:06.820000 --> 0:19:11.480000
 So that's sort of the high level differences
 between shared memory and

0:19:11.480000 --> 0:19:13.760000
 distributed memory platforms.
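
NOTE
A minimal sketch contrasting the two architectures: one common pool that every port draws from, versus a fixed set of buffers per port. The class names, port names, and buffer counts are assumptions for illustration, and the sketch ignores the per-port limits that real shared-memory platforms usually enforce.
```python
class SharedPool:
    """All ports draw packet buffers from one common pool."""
    def __init__(self, total_buffers: int) -> None:
        self.free = total_buffers
    def allocate(self, port: str) -> bool:
        if self.free == 0:
            return False
        self.free -= 1
        return True
class DistributedPools:
    """Each port owns a fixed number of buffers that no other port can use."""
    def __init__(self, ports: list, buffers_per_port: int) -> None:
        self.free = {port: buffers_per_port for port in ports}
    def allocate(self, port: str) -> bool:
        if self.free[port] == 0:
            return False
        self.free[port] -= 1
        return True
ports = ["Fa0/1", "Fa0/2", "Fa0/3", "Fa0/4"]
shared = SharedPool(total_buffers=40)
distributed = DistributedPools(ports, buffers_per_port=10)
# A burst of 25 frames arrives on Fa0/1 while the other ports are idle.
shared_ok = sum(shared.allocate("Fa0/1") for _ in range(25))
distributed_ok = sum(distributed.allocate("Fa0/1") for _ in range(25))
print(shared_ok, distributed_ok)
```
Both models hold 40 buffers in total; the difference is only in who is allowed to use them, which is the distinction the talk is drawing.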

0:19:13.760000 --> 0:19:17.560000
 So right now, let's look at how a packet
 moves through a switch in this

0:19:17.560000 --> 0:19:23.500000
 one right here, which is a shared memory
 allocation or shared memory architecture.

0:19:23.500000 --> 0:19:27.040000
 So the packet arrives on
 the ingress interface.

0:19:27.040000 --> 0:19:33.160000
 Whatever interface or module ASIC that's
 controlling that physical port,

0:19:33.160000 --> 0:19:36.620000
 he will immediately forward the packet,
 like it says, into a common shared

0:19:36.620000 --> 0:19:43.980000
 memory pool. At that point, the bits
 will stay in that memory pool, but

0:19:43.980000 --> 0:19:47.940000
 some of those bits will be extracted
 or copied and sent to the forwarding

0:19:47.940000 --> 0:19:50.980000
 engine. After all, the forwarding engine
 doesn't need to see everything

0:19:50.980000 --> 0:19:52.640000
 necessarily, right?

0:19:52.640000 --> 0:19:57.800000
 An Ethernet frame could be
 almost 10,000 bits long.

0:19:57.800000 --> 0:20:01.000000
 But of all those bits in that Ethernet
 frame, the forwarding engine doesn't

0:20:01.000000 --> 0:20:02.340000
 care about most of it.

0:20:02.340000 --> 0:20:05.520000
 The data? The forwarding engine
 doesn't need to see that.

0:20:05.520000 --> 0:20:08.820000
 He just needs to see the bits corresponding
 to like the destination MAC

0:20:08.820000 --> 0:20:11.780000
 address and the source MAC address
 and maybe the EtherType code.

0:20:11.780000 --> 0:20:14.980000
 And if we're talking about a multilayer
 switch, he needs to see some

0:20:14.980000 --> 0:20:19.360000
 bits behind that corresponding to like the
 source and destination IP addresses.

0:20:19.360000 --> 0:20:23.560000
 But all that stuff there is like
 a minuscule portion of the frame.

0:20:23.560000 --> 0:20:25.860000
 Most of the frame is the data.

0:20:25.860000 --> 0:20:28.160000
 And the forwarding engine
 doesn't care about that.

0:20:28.160000 --> 0:20:31.660000
 So we'll take all the bits of the frame
 and we'll copy just the relevant

0:20:31.660000 --> 0:20:35.440000
 portions and send that to the forwarding
 engine so he can decide what

0:20:35.440000 --> 0:20:36.700000
 to do with this thing.

0:20:36.700000 --> 0:20:39.440000
 That's what's happening here
 in step number three.
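
NOTE
A minimal sketch of step three: only the header fields are copied out for the lookup, and the payload is never read. It assumes an untagged Ethernet II frame carrying IPv4 with no 802.1Q tag; the function name and the sample addresses are made up for illustration.
```python
import ipaddress
import struct
def lookup_key(frame: bytes) -> dict:
    """Copy only the fields a forwarding engine would need for its decision."""
    dst_mac, src_mac, ether_type = struct.unpack("!6s6sH", frame[:14])
    key = {"dst_mac": dst_mac.hex(":"), "src_mac": src_mac.hex(":"), "ether_type": hex(ether_type)}
    if ether_type == 0x0800:   # IPv4: addresses sit at bytes 12-19 of the IP header
        src_ip, dst_ip = struct.unpack("!4s4s", frame[26:34])
        key["src_ip"] = str(ipaddress.IPv4Address(src_ip))
        key["dst_ip"] = str(ipaddress.IPv4Address(dst_ip))
    return key
# Hypothetical frame: 14-byte Ethernet header, 20-byte IPv4 header, 1000 bytes of payload.
ip_header = bytes([0x45, 0, 0, 20, 0, 0, 0, 0, 64, 17, 0, 0]) + bytes([10, 0, 0, 1]) + bytes([10, 0, 0, 2])
frame = bytes.fromhex("aabbccddeeff" "001122334455" "0800") + ip_header + b"\x00" * 1000
key = lookup_key(frame)
print(key)
```
Only the first 34 bytes of this 1034-byte frame are touched, which matches the point that the lookup fields are a small fraction of the frame.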

0:20:39.440000 --> 0:20:43.640000
 So the forwarding engine makes
 this decision about what to do.

0:20:43.640000 --> 0:20:48.140000
 And then at that point, the memory ownership
 of the packet buffer is transferred

0:20:48.140000 --> 0:20:49.960000
 to the egress interface.

0:20:49.960000 --> 0:20:52.760000
 This is like what I talked about in
 the last slide where we say, okay,

0:20:52.760000 --> 0:20:58.180000
 all you memory cells, before we did
 our lookup, you buffers right there,

0:20:58.180000 --> 0:21:02.120000
 you were the receive buffers
 for FastEthernet 0/1.

0:21:02.120000 --> 0:21:05.980000
 Now the forwarding engine comes back
 and says, hey, that frame, that needs

0:21:05.980000 --> 0:21:08.740000
 to go out Gigabit 0/0.

0:21:08.740000 --> 0:21:11.540000
 So now there's going to
 be some memory manager.

0:21:11.540000 --> 0:21:14.660000
 There's some other background process
 that's in charge of that shared

0:21:14.660000 --> 0:21:19.920000
 memory and in charge of determining
 each cell, what its meaning is right

0:21:19.920000 --> 0:21:25.600000
 now. That memory manager will now say,
 okay, you memory buffers, you're

0:21:25.600000 --> 0:21:29.020000
 no longer part of the receive queue
 for interface FastEthernet 0/1.

0:21:29.020000 --> 0:21:33.920000
 Now we're going to redefine you for
 a few microseconds as the transmit

0:21:33.920000 --> 0:21:37.940000
 buffer or the transmit queue
 for Gigabit 0/0.

0:21:37.940000 --> 0:21:43.720000
 And then the packet will be transmitted
 onto the egress media and away

0:21:43.720000 --> 0:21:48.120000
 it goes and now those buffers are freed
 to be reallocated or reassigned

0:21:48.120000 --> 0:21:50.640000
 to something else.

0:21:50.640000 --> 0:21:52.900000
 So that's shared memory.

0:21:52.900000 --> 0:21:56.160000
 In a distributed memory platform,
 it's very, very similar.

0:21:56.160000 --> 0:21:59.940000
 The packet arrives on the ingress interface,
 but in this case, the ASIC

0:21:59.940000 --> 0:22:03.660000
 that's controlling that interface
 has his own dedicated memory.

0:22:03.660000 --> 0:22:05.940000
 He doesn't have to share
 it with anybody else.

0:22:05.940000 --> 0:22:11.600000
 Now once again, some information, some
 bits are extracted or copied from

0:22:11.600000 --> 0:22:15.220000
 that, like the MAC addresses, the IP
 addresses, and sent to the forwarding

0:22:15.220000 --> 0:22:18.000000
 engine, where a decision is made.

0:22:18.000000 --> 0:22:22.220000
 The forwarding engine will send his
 result back saying, okay, that frame

0:22:22.220000 --> 0:22:25.160000
 needs to go out here,
 Gigabit 1/1.

0:22:25.160000 --> 0:22:30.920000
 Now the packet will be put onto something
 like a shared bus or maybe a

0:22:30.920000 --> 0:22:34.220000
 shared ring that all the
 port ASICs connect to.

0:22:34.220000 --> 0:22:39.380000
 There it goes, and in the front of that
 packet will be the lookup result.

0:22:39.380000 --> 0:22:42.520000
 Hey, this packet here, I'm putting on
 the ring or I'm putting on the bus.

0:22:42.520000 --> 0:22:44.440000
 This needs to go to you,
 gigabit one one.

0:22:44.440000 --> 0:22:49.840000
 This is yours. And now that egress
 interface will get it, put it into

0:22:49.840000 --> 0:22:54.040000
 the appropriate egress queue, schedule
 the packet and away it goes for

0:22:54.040000 --> 0:23:02.720000
 transmission. So in all of this, we
 haven't talked about congestion.
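
The distributed-memory path just walked through might be modeled like this. It is only a sketch under stated assumptions: `PortAsic`, `Switch`, the dictionary-based forwarding table, and the port names are invented for illustration, and the shared ring or bus is reduced to a single method call.

```python
from collections import deque


class PortAsic:
    """Hypothetical port ASIC with its own dedicated memory."""

    def __init__(self, name):
        self.name = name
        self.ingress_memory = deque()  # not shared with any other port
        self.egress_queue = deque()


class Switch:
    def __init__(self, port_names, forwarding_table):
        self.ports = {name: PortAsic(name) for name in port_names}
        # Assumed lookup structure: destination MAC -> egress port name.
        self.forwarding_table = forwarding_table

    def forward(self, ingress_name, frame):
        asic = self.ports[ingress_name]
        asic.ingress_memory.append(frame)
        # Only header fields go to the forwarding engine, not the whole frame.
        egress_name = self.forwarding_table[frame["dst_mac"]]
        # The lookup result travels in front of the frame on the ring or bus.
        self.deliver_from_ring((egress_name, asic.ingress_memory.popleft()))

    def deliver_from_ring(self, tagged_frame):
        egress_name, frame = tagged_frame
        # The egress ASIC picks up the frame and places it in its egress queue.
        self.ports[egress_name].egress_queue.append(frame)


switch = Switch(["Fa0/1", "Gi1/1"], {"aa:bb:cc:dd:ee:ff": "Gi1/1"})
switch.forward("Fa0/1", {"dst_mac": "aa:bb:cc:dd:ee:ff", "payload": "hello"})
```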

0:23:02.720000 --> 0:23:06.500000
 In all of this, as the packets have been
 put into the queues or the buffers,

0:23:06.500000 --> 0:23:10.760000
 we've looked it up right away,
 sent it out on its way.

0:23:10.760000 --> 0:23:16.840000
 But if egress traffic cannot immediately
 be transmitted out the transmit

0:23:16.840000 --> 0:23:21.500000
 port, now what we have to do is
 place it into an egress queue.

0:23:21.500000 --> 0:23:26.100000
 So that little memory buffer where
 it's been stored, we can't empty it

0:23:26.100000 --> 0:23:30.040000
 out right away because there's other
 memory buffers in front of it that

0:23:30.040000 --> 0:23:31.860000
 are still waiting
 to be transmitted.

0:23:31.860000 --> 0:23:36.300000
 So now the memory manager says, okay,
 I need to now rethink about this

0:23:36.300000 --> 0:23:41.940000
 memory buffer here and call it something
 else as an egress queue.

0:23:41.940000 --> 0:23:46.640000
 So without QoS, the queue is just one large
 block of memory, just one buffer.

0:23:46.640000 --> 0:23:51.420000
 First in, first out, we don't know
 what's going to happen to the bits.

0:23:51.420000 --> 0:23:57.060000
 With QoS, we can control the characteristics
 of what happens to those bits

0:23:57.060000 --> 0:24:00.520000
 as they're waiting
 to be dealt with.
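
To make the first-in, first-out case concrete, here is a rough calculation of how long a packet waits in a single FIFO egress queue. The 100 Mbps link speed matches the Fast Ethernet example used in this video; the packet sizes (1500-byte data packets, one 200-byte voice packet) and the function name are assumptions for illustration.

```python
LINK_BPS = 100_000_000  # Fast Ethernet: 100 million bits per second


def fifo_wait_seconds(queued_packet_sizes_bytes, position):
    """Time a packet at `position` waits for everything ahead of it to be sent."""
    bytes_ahead = sum(queued_packet_sizes_bytes[:position])
    return bytes_ahead * 8 / LINK_BPS


# Ten 1500-byte data packets arrived first; a small voice packet is last in line.
queue = [1500] * 10 + [200]
voice_wait = fifo_wait_seconds(queue, position=10)
print(f"voice packet waits {voice_wait * 1000:.2f} ms")
```

With a single FIFO queue the voice packet has no way to skip ahead; its wait depends entirely on what happened to arrive before it.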

0:24:00.520000 --> 0:24:03.760000
 So what is affected by QoS?

0:24:03.760000 --> 0:24:07.520000
 So in this last slide here in the presentation
 in this particular video,

0:24:07.520000 --> 0:24:10.100000
 what types of things can
 we control or

0:24:10.100000 --> 0:24:13.540000
 manipulate when we turn
 on these various QoS features?

0:24:13.540000 --> 0:24:16.180000
 Well, we can manipulate bandwidth.


0:24:16.180000 --> 0:24:21.000000
 We can say, okay, this queue, you're
 allowed to transmit at five megabits

0:24:21.000000 --> 0:24:22.380000
 per second only.

0:24:22.380000 --> 0:24:26.000000
 This other queue, you're allowed to
 transmit at ten megabits per second

0:24:26.000000 --> 0:24:33.080000
 because remember, when we turn on QoS,
 my physical interface here, here's

0:24:33.080000 --> 0:24:38.860000
 my physical port, Fast Ethernet zero
 slash zero, here's the memory buffers

0:24:38.860000 --> 0:24:45.060000
 that are allocated to him
 on the transmit side.

0:24:45.060000 --> 0:24:52.120000
 Now, without QoS, as stuff is put into
 the memory buffer, hopefully it'll

0:24:52.120000 --> 0:24:53.720000
 just be sent out.

0:24:53.720000 --> 0:24:56.820000
 And that's a Fast Ethernet interface,
 so he's going to go at 100 million

0:24:56.820000 --> 0:24:57.900000
 bits per second.

0:24:57.900000 --> 0:24:59.060000
 That's what his speed is.

0:24:59.060000 --> 0:25:00.660000
 We can't slow him down.

0:25:00.660000 --> 0:25:02.120000
 We can't speed him up.

0:25:02.120000 --> 0:25:03.940000
 That's how fast he goes.

0:25:03.940000 --> 0:25:11.260000
 But with QoS, we can now subdivide this
 into queues, like a low priority queue,

0:25:11.260000 --> 0:25:14.620000
 like a high priority queue.

0:25:14.620000 --> 0:25:15.940000
 And now we can affect bandwidth.

0:25:15.940000 --> 0:25:18.980000
 We can say, okay, we know the Fast
 Ethernet interface is 100 million bits

0:25:18.980000 --> 0:25:22.680000
 per second. Why don't we divide
 that and say low priority queue?

0:25:22.680000 --> 0:25:27.440000
 You can get 40 million bits per second,
 high priority queue, you can get

0:25:27.440000 --> 0:25:30.560000
 the rest. 60 million
 bits per second.

0:25:30.560000 --> 0:25:34.560000
 So that's something we can do with
 QoS, is we can affect the bandwidth

0:25:34.560000 --> 0:25:37.540000
 of how bits are going
 to be transmitted.
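
One way to picture the 40/60 split is a weighted scheduler that lets each queue send a share of bytes per round. The sketch below uses deficit round robin, which is a technique chosen here for illustration; the video does not say which scheduler a given platform uses. The function name, the 1000-byte packet size, and the `bytes_per_weight_unit` parameter are all assumptions.

```python
from collections import deque


def weighted_drain(queues, weights, bytes_per_weight_unit, rounds):
    """Deficit-round-robin sketch: each round a queue earns credit by weight."""
    sent = {name: 0 for name in queues}
    deficit = {name: 0 for name in queues}
    for _ in range(rounds):
        for name, queue in queues.items():
            if not queue:
                deficit[name] = 0  # an empty queue does not bank credit
                continue
            deficit[name] += weights[name] * bytes_per_weight_unit
            # Send packets while the head-of-line packet fits in the credit.
            while queue and queue[0] <= deficit[name]:
                size = queue.popleft()
                deficit[name] -= size
                sent[name] += size
    return sent


queues = {"low": deque([1000] * 100), "high": deque([1000] * 100)}
sent = weighted_drain(queues, {"low": 40, "high": 60}, 100, rounds=10)
print(sent)  # {'low': 40000, 'high': 60000}
```

When both queues stay busy, the bytes sent settle into the same 40 to 60 ratio as the bandwidth split described above.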

0:25:37.540000 --> 0:25:41.680000
 We can affect delay.

0:25:41.680000 --> 0:25:44.880000
 We can say now that we've divided into
 queues, if there's going to be any

0:25:44.880000 --> 0:25:47.680000
 delay anywhere, we would prefer the
 delay happen with the stuff that's

0:25:47.680000 --> 0:25:51.300000
 in the low priority queue, stuff that's
 put in the high priority queue,

0:25:51.300000 --> 0:25:55.520000
 you get to the front of the
 line, you get out first.
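
The "front of the line" idea can be sketched as strict priority scheduling: the low priority queue is only served when the high priority queue is empty. The class and packet names below are hypothetical, and real platforms usually add safeguards so the low priority queue is not starved.

```python
from collections import deque


class StrictPriorityPort:
    """Hypothetical egress port with one high and one low priority queue."""

    def __init__(self):
        self.high = deque()
        self.low = deque()

    def enqueue(self, packet, high_priority=False):
        (self.high if high_priority else self.low).append(packet)

    def dequeue(self):
        # Anything in the high priority queue always goes out first.
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None


port = StrictPriorityPort()
port.enqueue("data-1")
port.enqueue("data-2")
port.enqueue("voice-1", high_priority=True)
order = [port.dequeue() for _ in range(3)]
print(order)  # ['voice-1', 'data-1', 'data-2']
```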

0:25:55.520000 --> 0:25:57.820000
 We can affect jitter.

0:25:57.820000 --> 0:26:01.180000
 Now, delay and jitter are both
 very, very closely related.

0:26:01.180000 --> 0:26:04.340000
 They both have to do with delaying of
 packets, but here's the difference.

0:26:04.340000 --> 0:26:07.720000
 So, let's take a voice
 call as an example.

0:26:07.720000 --> 0:26:12.800000
 You've probably experienced delay with
 a voice call where you say hello, and

0:26:12.800000 --> 0:26:16.480000
 then you have to wait like a second
 before your hello gets to the other

0:26:16.480000 --> 0:26:19.500000
 end of the phone line, especially if
 you're talking to somebody on another

0:26:19.500000 --> 0:26:25.500000
 continent. So with delay, with an IP
 phone, when you start speaking, it

0:26:25.500000 --> 0:26:28.940000
 starts translating your
 voice into IP packets.

0:26:28.940000 --> 0:26:33.020000
 And each one of those IP packets has
 a little bit of delay in between them,

0:26:33.020000 --> 0:26:35.840000
 but the delay is evenly
 interspersed.

0:26:35.840000 --> 0:26:39.160000
 In other words, if you were to look at
 those IP packets with a timestamp,

0:26:39.160000 --> 0:26:42.200000
 you would see the delay when packet
 number one was transmitted, then

0:26:42.200000 --> 0:26:45.380000
 packet number two, packet number
 three, it was all equal.

0:26:45.380000 --> 0:26:48.880000
 Hopefully it's very, very
 small, but it's all equal.

0:26:48.880000 --> 0:26:55.800000
 So if that string of packets hits some
 interface somewhere and it's delayed,

0:26:55.800000 --> 0:27:01.320000
 we don't like that, but it's tolerable
 if the delay is consistent.

0:27:01.320000 --> 0:27:04.300000
 In other words, if I know that when
 I'm talking, I have to wait like one

0:27:04.300000 --> 0:27:08.380000
 second for my voice to get there, it's
 irritating, but I can still maintain

0:27:08.380000 --> 0:27:12.220000
 a conversation. So delay, we don't
 want it, but if we're going to have

0:27:12.220000 --> 0:27:16.280000
 it as long as it's not too
 big, we can deal with it.

0:27:16.280000 --> 0:27:19.280000
 Jitter on the other
 hand is really bad.

0:27:19.280000 --> 0:27:24.880000
 Jitter means, okay, jitter means something
 like this, where I've got a

0:27:24.880000 --> 0:27:29.160000
 string of packets and maybe the first
 few packets have this much delay

0:27:29.160000 --> 0:27:32.980000
 in between them, but now I've got a
 couple packets here with a lot more

0:27:32.980000 --> 0:27:34.280000
 delay in between them.

0:27:34.280000 --> 0:27:37.700000
 Now I've got a couple packets
 here with very little delay.

0:27:37.700000 --> 0:27:42.660000
 Now I've got some big, so the delay
 in between the packets is random.

0:27:42.660000 --> 0:27:45.740000
 It's variable. It's
 not consistent.
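
The difference between steady delay and jitter shows up clearly when the gaps between packet arrivals are computed. This sketch uses a deliberately simple measure, the spread between the largest and smallest gap; it is not the estimator that RTP defines, and the arrival timestamps are made-up numbers for illustration.

```python
def inter_packet_gaps(arrival_times_ms):
    """Gap between each packet and the one before it."""
    return [later - earlier
            for earlier, later in zip(arrival_times_ms, arrival_times_ms[1:])]


def gap_spread_ms(arrival_times_ms):
    """Simple jitter indicator: largest gap minus smallest gap."""
    gaps = inter_packet_gaps(arrival_times_ms)
    return max(gaps) - min(gaps)


steady = [0, 20, 40, 60, 80]     # evenly spaced: delay only, no jitter
jittery = [0, 20, 75, 80, 140]   # uneven spacing: jitter

print(inter_packet_gaps(steady), gap_spread_ms(steady))    # [20, 20, 20, 20] 0
print(inter_packet_gaps(jittery), gap_spread_ms(jittery))  # [20, 55, 5, 60] 55
```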

0:27:45.740000 --> 0:27:50.280000
 And where that shows up in a voice
 call is where someone's talking to

0:27:50.280000 --> 0:27:56.400000
 you, but because the delay is variable,
 as you're trying to listen to

0:27:56.400000 --> 0:28:01.040000
 them, it sounds like, and hello, and
 I walked to the car and then I saw

0:28:01.040000 --> 0:28:06.040000
 my dog. You can't even tell what you're
 listening to, because as the packets

0:28:06.040000 --> 0:28:12.680000
 are coming in, the IP phone can't reconstruct
 them into an intelligible conversation

0:28:12.680000 --> 0:28:16.740000
 because they're coming in with variable
 spaces in between them.

0:28:16.740000 --> 0:28:19.460000
 That's jitter, and
 jitter is very bad.

0:28:19.460000 --> 0:28:20.740000
 QoS can manage that.

0:28:20.740000 --> 0:28:31.200000
 QoS can hopefully prevent
 jitter from happening.

0:28:31.200000 --> 0:28:36.040000
 But if stuff has to be dropped, now
 with QoS we can manage what will be

0:28:36.040000 --> 0:28:38.700000
 dropped and what won't.
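
Managing what gets dropped could look something like the sketch below: when the queue is full, the least important packet is the one discarded. The priority numbers, packet names, and the function `enqueue_with_drop_policy` are assumptions for illustration; actual platforms use their own drop mechanisms and thresholds.

```python
def enqueue_with_drop_policy(queue, packet, capacity):
    """Add (priority, name) to queue; return the dropped packet, if any.

    A higher priority number means a more important packet.
    """
    if len(queue) < capacity:
        queue.append(packet)
        return None
    # Queue is full: compare the arrival against the least important queued packet.
    victim = min(queue, key=lambda queued: queued[0])
    if packet[0] <= victim[0]:
        return packet  # the arriving packet is no more important, so drop it
    queue.remove(victim)
    queue.append(packet)
    return victim


queue = [(0, "bulk-1"), (1, "voice-1")]
dropped_first = enqueue_with_drop_policy(queue, (1, "voice-2"), capacity=2)
dropped_second = enqueue_with_drop_policy(queue, (0, "bulk-2"), capacity=2)
print(dropped_first, dropped_second, queue)
```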

0:28:38.700000 --> 0:28:43.100000
 So that concludes this video
 on an introduction to QoS.