WEBVTT

00:00:01.040 --> 00:00:06.020
Previously, we looked at cloud data security for the CCSP from the

00:00:06.020 --> 00:00:09.050
perspective of cloud data security concepts.

00:00:09.540 --> 00:00:13.160
Now, we'll take a look at some of the technologies we use

00:00:13.410 --> 00:00:16.260
to ensure security of data in the cloud.

00:00:17.540 --> 00:00:21.820
The idea, of course, is that data is stored and

00:00:21.820 --> 00:00:24.260
processed in the cloud in different ways.

00:00:24.840 --> 00:00:29.220
We have ephemeral storage such as data that's stored just

00:00:29.220 --> 00:00:33.570
temporarily, say, in a virtual machine. And once that machine has

00:00:33.570 --> 00:00:37.050
been powered down, that data should be lost.

00:00:37.840 --> 00:00:42.550
We have raw storage and, of course, vast amounts of rather

00:00:42.560 --> 00:00:48.040
inefficient storage, in some cases, where we can keep data that, in

00:00:48.040 --> 00:00:53.130
many cases, needs a lot of cleaning up and so on.

00:00:53.140 --> 00:00:57.850
But often, of course, this is used for very large files, such

00:00:57.870 --> 00:01:00.560
as photos, for example.

00:01:01.240 --> 00:01:07.620
We have long‑term storage where data is being stored both for archiving

00:01:07.620 --> 00:01:11.840
and, of course, for production use over an extended

00:01:11.840 --> 00:01:16.330
period of time. There are a couple of different data storage

00:01:16.330 --> 00:01:22.350
types that have commonly been used, volume storage and object storage.

00:01:23.540 --> 00:01:27.160
When we look at block storage, for example,

00:01:27.640 --> 00:01:32.290
we can look at things like files and file folders.

00:01:32.290 --> 00:01:38.190
These are traditionally then stored in blocks on specified hardware.

00:01:39.040 --> 00:01:44.690
Quite often, we would map this out in the mainframe days on tracks and sectors

00:01:44.690 --> 00:01:47.760
that would be available for a program that was going to run.

00:01:48.540 --> 00:01:53.810
The problem with this was that if we had a sector that wasn't being fully used,

00:01:53.810 --> 00:01:58.490
that was a waste of resources. It was rather expensive,

00:01:58.500 --> 00:02:03.220
with poor scalability, and jobs would even fail if not enough tracks,

00:02:03.220 --> 00:02:03.870
for example,

00:02:03.870 --> 00:02:08.979
had been allocated for a job. Very often, block storage was

00:02:08.979 --> 00:02:12.560
accessed through applications and operating systems.

00:02:14.040 --> 00:02:19.010
When we looked at file storage, we often stored data in files in some

00:02:19.010 --> 00:02:22.350
type of hierarchical structure such as a directory.

00:02:22.990 --> 00:02:27.130
These, again, could be accessed through the application and operating system,

00:02:27.540 --> 00:02:32.880
but we ended up with larger and larger directories that became a little bit

00:02:32.890 --> 00:02:37.530
cumbersome and sometimes hard to organize. For some of you,

00:02:37.530 --> 00:02:43.140
your email inbox probably looks like that. But these, of

00:02:43.140 --> 00:02:47.540
course, are also good for transactional activity, for example,

00:02:47.540 --> 00:02:52.500
when I need to load something into a database. The idea of

00:02:52.510 --> 00:02:57.840
object storage is to store data as objects, including the metadata that

00:02:57.840 --> 00:03:01.460
describes what that object actually is.

00:03:01.940 --> 00:03:03.560
This works well in the cloud.

00:03:04.140 --> 00:03:10.790
We could store these in containers and access them through HTTP‑based

00:03:10.800 --> 00:03:15.720
RESTful APIs, for example. It's a type of flat storage.

00:03:15.720 --> 00:03:20.610
It's not stored in a tree or directory, and the metadata is what makes it

00:03:20.620 --> 00:03:25.580
easily searchable, we could say. One of the advantages of this is that we can

00:03:25.590 --> 00:03:29.550
easily detect any type of integrity problem.

00:03:29.940 --> 00:03:34.980
It's easy to replicate or copy the various objects or to check to

00:03:34.980 --> 00:03:38.860
make sure that they have not been improperly altered using things

00:03:38.860 --> 00:03:41.550
like check values and hash values.
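
NOTE
The replication and hash check described above can be sketched roughly as follows.
This is a minimal illustration only, assuming a plain in-memory dictionary as a
stand-in for a storage container. The names StoredObject, put_object, replicate,
and is_intact are hypothetical and do not belong to any real cloud provider's API.
```python
import hashlib
from dataclasses import dataclass, field
# Hypothetical in-memory stand-in for an object store; not a real cloud API.
@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)
def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(data).hexdigest()
def put_object(container: dict, key: str, data: bytes, **metadata) -> StoredObject:
    """Store the payload under a flat key and record its hash in the metadata."""
    obj = StoredObject(data=data, metadata={**metadata, "sha256": sha256_hex(data)})
    container[key] = obj
    return obj
def replicate(source: dict, target: dict, key: str) -> None:
    """Copy one object, payload and metadata, into another container."""
    original = source[key]
    target[key] = StoredObject(data=original.data, metadata=dict(original.metadata))
def is_intact(container: dict, key: str) -> bool:
    """Recompute the hash and compare it with the value recorded at upload."""
    obj = container[key]
    return sha256_hex(obj.data) == obj.metadata["sha256"]
primary: dict = {}
replica: dict = {}
put_object(primary, "reports/q1.csv", b"region,total\nEU,42\n", content_type="text/csv")
replicate(primary, replica, "reports/q1.csv")
print(is_intact(replica, "reports/q1.csv"))  # True: replica matches the recorded hash
# Simulate an improper alteration of the replica's payload.
replica["reports/q1.csv"].data = b"region,total\nEU,99\n"
print(is_intact(replica, "reports/q1.csv"))  # False: the hash no longer matches
```
The key looks like a path, but in this sketch it is just a flat string, which
mirrors the point that object storage is not organized as a tree or directory.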

00:03:42.640 --> 00:03:48.100
The benefit of using object storage is that now our data objects can be

00:03:48.100 --> 00:03:53.440
accessed from anywhere in the world and, in many cases, by a lot of

00:03:53.450 --> 00:03:58.650
different types of devices. We're not limited to one type of end user

00:03:58.650 --> 00:04:01.650
device, for example, or one type of application.

00:04:02.240 --> 00:04:06.880
They can easily be accessed from different types of operating systems and so

00:04:06.880 --> 00:04:12.890
on. Because these can be easily replicated, it's very good for things like load

00:04:12.890 --> 00:04:16.459
balancing and even hardware abstraction as well.

00:04:16.839 --> 00:04:25.340
That allows us to ensure that a person can get

00:04:25.340 --> 00:04:31.720
access from, hopefully, a remote edge device without having to

00:04:31.730 --> 00:04:34.260
retrieve data from all the way around the world.

00:04:35.540 --> 00:04:39.770
The other thing is that this allows for faster search and retrieval using

00:04:39.770 --> 00:04:44.700
things like the metadata. We could say, in many cases, the storage systems

00:04:44.710 --> 00:04:50.120
operate just like a valet in a car parking lot: the valet can put the

00:04:50.120 --> 00:04:53.460
cars where it's best, just as the system places the data where it's optimal.

00:04:53.940 --> 00:04:55.430
It's very scalable,

00:04:55.440 --> 00:04:59.190
less expensive, and therefore this works really well

00:04:59.190 --> 00:05:02.720
with our content delivery networks, our videos,

00:05:02.720 --> 00:05:05.060
images, blogs, and so on.

00:05:06.440 --> 00:05:10.480
The idea of metadata, of course, is data that describes

00:05:10.490 --> 00:05:13.860
data. Kind of a funny term in one way.

00:05:14.340 --> 00:05:17.870
But the advantage of this is that it is the descriptor, say,

00:05:17.870 --> 00:05:21.200
of that object or that data that can be searched.

00:05:21.410 --> 00:05:27.500
So we could find all data that fits into, say, a certain category. We

00:05:27.500 --> 00:05:32.910
see here a picture of an evening in Kuwait, and we can see the metadata

00:05:32.910 --> 00:05:36.810
there that names what this JPEG is called,

00:05:36.820 --> 00:05:39.740
its resolution, the date the picture was taken.

00:05:39.920 --> 00:05:45.920
But also, we have something that indicates the actual hash value.

00:05:46.280 --> 00:05:50.660
So if that data or that picture was in any way altered,

00:05:51.040 --> 00:05:54.820
that hash value would also change, and we'd know that that

00:05:54.820 --> 00:05:58.450
was not the original or unaltered document.
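
NOTE
As a rough sketch of the metadata ideas above, the snippet below searches a small
catalog by category and compares a recorded hash against a fresh one. Everything
here is an assumption for illustration: the catalog layout, the field names, the
object IDs, and the sample values are hypothetical and are not taken from the slide.
```python
import hashlib
# Hypothetical catalog of object metadata; keys, field names, and values are illustrative only.
def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(data).hexdigest()
picture_bytes = b"stand-in bytes for a JPEG image"
catalog = {
    "img-0001": {"name": "evening.jpg", "category": "travel", "sha256": sha256_hex(picture_bytes)},
    "doc-0001": {"name": "notes.txt", "category": "work", "sha256": sha256_hex(b"meeting notes")},
}
def find_by_category(catalog: dict, category: str) -> list:
    """Return the IDs of all objects whose metadata matches the category."""
    return sorted(key for key, meta in catalog.items() if meta.get("category") == category)
def is_unaltered(catalog: dict, key: str, data: bytes) -> bool:
    """Compare a freshly computed hash with the one recorded in the metadata."""
    return sha256_hex(data) == catalog[key]["sha256"]
print(find_by_category(catalog, "travel"))  # ['img-0001']
print(is_unaltered(catalog, "img-0001", picture_bytes))  # True
print(is_unaltered(catalog, "img-0001", picture_bytes + b"!"))  # False
```
Changing even one byte of the picture produces a different digest, so the last
check fails, which is the behavior the hash value in the metadata relies on.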
