WEBVTT

00:00:01.010 --> 00:00:03.960
Let's take a look at the cloud data lifecycle.

00:00:04.640 --> 00:00:08.650
This was defined by the Cloud Security Alliance as a way to ensure

00:00:08.650 --> 00:00:13.410
we understand that data must be protected all the way through the

00:00:13.410 --> 00:00:16.400
lifecycle from creation until deletion.

00:00:17.240 --> 00:00:20.920
The lifecycle is defined as create, store,

00:00:21.050 --> 00:00:27.680
use, share, archive, and then delete. During the create

00:00:27.680 --> 00:00:30.530
phase, we first receive the data.

00:00:30.830 --> 00:00:34.290
Maybe, for example, it came in from a customer or a

00:00:34.290 --> 00:00:36.450
business partner or another system.

00:00:37.000 --> 00:00:42.360
This is the point at which we already begin to set up the ownership and

00:00:42.360 --> 00:00:48.750
classification of the data. As the data is received, then it is correctly

00:00:48.750 --> 00:00:53.870
labeled, and, of course, it could be that we even want to make sure the receipt

00:00:53.870 --> 00:01:00.850
of the data is done in a secure manner, for example, by using encryption. The

00:01:00.850 --> 00:01:06.920
store step happens usually together with create. We receive the data, and we

00:01:06.920 --> 00:01:09.250
enter it into our system, we store it.

00:01:09.740 --> 00:01:14.280
It is important then that we follow the data handling procedures that

00:01:14.280 --> 00:01:17.810
were associated with the classification of the data.

00:01:18.020 --> 00:01:23.880
So right away, we begin to label it and indicate how that data

00:01:23.880 --> 00:01:26.860
must be protected within the organization.

00:01:28.790 --> 00:01:34.130
The next step of the data lifecycle is when we actually use the data. We

00:01:34.130 --> 00:01:38.650
must protect data while it's in use, and this, for example, means we

00:01:38.650 --> 00:01:42.720
train our users on this: when you access data,

00:01:42.730 --> 00:01:44.200
what can you do with that?

00:01:44.210 --> 00:01:49.260
Who can you share it with, for example? And sometimes, we will, of

00:01:49.260 --> 00:01:53.250
course, practice things like obfuscation or data hiding.

00:01:53.540 --> 00:01:58.870
We won't show users parts of the data they shouldn't see, whether or not

00:01:58.870 --> 00:02:03.140
we do that through obfuscation or maybe through something like a database

00:02:03.140 --> 00:02:08.780
view that hides some of the columns of data from the users that don't

00:02:08.780 --> 00:02:10.949
need to see what's in that column.

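NOTE
Editor's sketch of the database view and masking ideas described above.
It is a minimal illustration, not the speaker's implementation. The table,
view, and column names and the mask() helper are hypothetical, and SQLite
is used only because it ships with Python.
```python
import sqlite3
# Hypothetical table and column names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, card_number TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Alice', '4111111111111111')")
# The view exposes only the columns a support user needs; card_number is left out.
conn.execute("CREATE VIEW customers_support AS SELECT id, name FROM customers")
def mask(value, visible=4):
    """Replace all but the last `visible` characters with asterisks."""
    return "*" * max(len(value) - visible, 0) + value[-visible:]
# Column names visible through the view, and a masked copy of the hidden value.
view_columns = [d[0] for d in conn.execute("SELECT * FROM customers_support").description]
masked_card = mask(conn.execute("SELECT card_number FROM customers").fetchone()[0])
print(view_columns, masked_card)
```
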
00:02:11.940 --> 00:02:16.240
One of the ways we do data hiding, of course, is through encryption.

00:02:16.510 --> 00:02:20.560
We can encrypt data so that it's not visible to those

00:02:20.560 --> 00:02:22.460
who are not authorized to see it.

00:02:22.940 --> 00:02:26.970
We could mask it so that even if a person was shoulder surfing,

00:02:26.970 --> 00:02:28.960
looking over the shoulder of a user,

00:02:29.340 --> 00:02:33.270
they would not be able to see what the password was that was being entered.

00:02:33.740 --> 00:02:37.340
Or we can use things, of course, like screen filters that can make it

00:02:37.340 --> 00:02:41.560
difficult for someone to read the data on somebody else's screen.

00:02:42.040 --> 00:02:47.430
We use obfuscation to try to hide the data by putting in other

00:02:47.430 --> 00:02:51.350
ways to then mask the data so it's not visible.

00:02:51.430 --> 00:02:56.050
We often do this, of course, with things like passwords and so on as well.

00:02:57.040 --> 00:03:00.590
One of the things we'll often do is remove any personally

00:03:00.640 --> 00:03:04.970
identifiable information that's associated with the data, so we

00:03:04.970 --> 00:03:07.610
can make sure the data remains anonymous.

00:03:07.750 --> 00:03:11.570
But when we do this, we still have to be careful of things like data

00:03:11.570 --> 00:03:16.950
aggregation where a person could combine many data sources together and

00:03:16.950 --> 00:03:21.000
learn something that we thought was hidden, or, of course, we have to make

00:03:21.000 --> 00:03:25.950
sure that our base is large enough that it's not going to be possible for a

00:03:25.950 --> 00:03:31.940
person to deanonymize the data. That, of course, is always a risk. You fill

00:03:31.940 --> 00:03:37.960
out that confidential employee survey, for example. And the problem is that

00:03:38.440 --> 00:03:43.570
when it asks some of the demographic questions, gender, age, department,

00:03:43.580 --> 00:03:47.520
well then you actually know exactly whose answers those are because

00:03:47.520 --> 00:03:50.330
they're the only person of that age in that department, for

00:03:50.330 --> 00:03:55.130
example. One of the ways we, of course, protect data from

00:03:55.140 --> 00:04:00.240
unauthorized disclosure is through things like data loss or data

00:04:00.240 --> 00:04:03.460
leakage prevention, DLP, systems.

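NOTE
Editor's sketch of the employee survey risk mentioned earlier: a combination
of demographic answers shared by very few respondents can identify a person.
The sample rows, the risky_groups() helper, and the threshold k are all
assumptions made for illustration, not part of the lecture.
```python
from collections import Counter
# Hypothetical survey rows: (gender, age_band, department). No names are stored.
responses = [
    ("F", "30-39", "Sales"),
    ("F", "30-39", "Sales"),
    ("M", "30-39", "Sales"),
    ("M", "50-59", "Finance"),
]
def risky_groups(rows, k):
    """Return demographic combinations shared by fewer than k respondents."""
    counts = Counter(rows)
    return sorted(combo for combo, n in counts.items() if n < k)
# Any combination listed here points at a single, identifiable respondent.
small = risky_groups(responses, k=2)
print(small)
```
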
00:04:04.140 --> 00:04:07.690
We can also protect data that goes outside of the

00:04:07.690 --> 00:04:11.860
organization using things like digital rights management or

00:04:11.870 --> 00:04:14.250
information rights management as well.

00:04:15.540 --> 00:04:19.790
The share phase is where we're sharing the data

00:04:19.790 --> 00:04:22.650
with maybe a third party or another system.

00:04:23.040 --> 00:04:27.480
And, of course, we have the problem here that the cloud is very much

00:04:27.480 --> 00:04:33.260
a globally accessible repository of information.

00:04:33.640 --> 00:04:37.770
So therefore, we have to make sure only authorized users are able

00:04:37.770 --> 00:04:41.460
to perform authorized functions on that data.

00:04:42.040 --> 00:04:46.540
We can do this in part through things like multi‑factor authentication,

00:04:46.550 --> 00:04:52.310
MFA, to make sure it truly is that legitimate user who is trying to access

00:04:52.310 --> 00:04:55.350
the data over the cloud provider's network.

00:04:55.840 --> 00:05:01.880
We enforce those age‑old access control concepts of things like least

00:05:01.880 --> 00:05:06.210
privilege and need to know. Need to know usually comes first.

00:05:06.220 --> 00:05:09.140
Do you even need to know what that information is?

00:05:09.300 --> 00:05:11.660
If not, don't show it to them.

00:05:12.230 --> 00:05:15.980
The next thing is least privilege, which says: if you do need

00:05:15.980 --> 00:05:19.450
to know, what can you do? Can you update it?

00:05:19.460 --> 00:05:20.790
Can you only read it?

00:05:20.790 --> 00:05:22.760
Can you change it or modify it?

00:05:23.240 --> 00:05:28.340
So we bring in these ideas of least privilege to try to make sure

00:05:28.340 --> 00:05:33.600
that authorized people, need to know, don't perform unauthorized

00:05:33.600 --> 00:05:35.550
functions through least privilege.

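NOTE
Editor's sketch of the two checks described above, in the order the speaker
gives them: need to know first, then least privilege. The users, resources,
actions, and the is_allowed() helper are hypothetical and stand in for
whatever access control system is really in use.
```python
# Hypothetical users, resources, and permissions, for illustration only.
NEED_TO_KNOW = {"alice": {"payroll"}, "bob": {"payroll", "contracts"}}
PRIVILEGES = {("alice", "payroll"): {"read"}, ("bob", "payroll"): {"read", "update"}}
def is_allowed(user, resource, action):
    """Need to know is checked first, then least privilege."""
    if resource not in NEED_TO_KNOW.get(user, set()):
        return False  # No need to know: do not even show the data.
    # Need to know is established; allow only the actions explicitly granted.
    return action in PRIVILEGES.get((user, resource), set())
print(is_allowed("alice", "payroll", "read"), is_allowed("alice", "payroll", "update"))
```
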
00:05:37.040 --> 00:05:40.260
One of the things we always have to watch for is that we're dealing with

00:05:40.260 --> 00:05:45.840
a global environment, and cloud service providers often operate in many

00:05:45.840 --> 00:05:51.270
different countries, and we have to be careful of the laws that pertain

00:05:51.380 --> 00:05:57.140
to things like intellectual property and to such things as where can my

00:05:57.140 --> 00:05:58.830
data then be hosted?

00:05:59.220 --> 00:06:03.100
The jurisdiction for things like data storage and access.

00:06:03.840 --> 00:06:08.850
This also becomes an issue with countries that have laws regarding cryptography.

00:06:09.340 --> 00:06:14.020
It could be that I'm not allowed to use certain forms of cryptographic

00:06:14.020 --> 00:06:19.840
algorithms in certain countries, and I have to be careful if my systems are

00:06:19.840 --> 00:06:22.860
employing those types of cryptographic algorithms.

00:06:23.940 --> 00:06:26.960
The next step in the data lifecycle is archive.

00:06:27.340 --> 00:06:31.770
This is where we quite often will put our data into long‑term storage.

00:06:31.770 --> 00:06:38.130
Maybe we put it on a storage medium such as optical disks or tape.

00:06:38.160 --> 00:06:42.760
There are a lot of ways we can store data long term, because

00:06:43.540 --> 00:06:49.180
flash memory and solid state drives are not really that good

00:06:49.180 --> 00:06:51.670
and reliable for long‑term storage.

00:06:51.890 --> 00:06:55.860
It's quite often better to use something such as an optical disk.

00:06:56.240 --> 00:07:00.210
But even something like that has to be properly stored and maintained.

00:07:00.500 --> 00:07:04.630
We have to have the right hardware that will be able to read that tape maybe

00:07:04.630 --> 00:07:08.520
five years from now when we have to recover data off of it.

00:07:09.340 --> 00:07:13.060
We also have to look at what type of format the data is in.

00:07:13.440 --> 00:07:17.330
Do I have the correct file structures that are going to be able to be

00:07:17.330 --> 00:07:21.700
readable five years from now? And, of course, do we have the encryption

00:07:21.700 --> 00:07:25.850
keys and the algorithms still available that we used when we took that

00:07:25.850 --> 00:07:31.870
backup or that archive five years ago? It's also important to ensure the

00:07:31.870 --> 00:07:36.840
protection of our archive data so it's not going to be lost due to fire,

00:07:36.850 --> 00:07:38.360
earthquake, storms.

00:07:38.630 --> 00:07:43.000
We have proper physical security in place to protect us from anything from

00:07:43.000 --> 00:07:49.380
a natural disaster, of course, to something like high heat or humidity in

00:07:49.380 --> 00:07:51.560
the place where we're storing our data.

00:07:52.040 --> 00:07:55.440
One of the concepts, of course, with that is geographic

00:07:55.440 --> 00:08:01.440
separation so that my data that is being stored off site is far

00:08:01.440 --> 00:08:05.820
enough away not to be affected by maybe the same disaster that

00:08:05.820 --> 00:08:08.360
could affect my primary data center.

00:08:10.040 --> 00:08:15.290
The final step in the data lifecycle is really the delete or destroy.

00:08:15.290 --> 00:08:19.740
We see both terms used here. And one of the things that is often

00:08:19.740 --> 00:08:24.120
done here is the use of secure destruction of data.

00:08:24.420 --> 00:08:28.950
Now, one of the things they mean when they say secure is defensible.

00:08:29.440 --> 00:08:34.340
Could you stand up and say with certainty that that data was destroyed?

00:08:34.600 --> 00:08:40.270
Is it defensible, or is it, well, I think it was? We had a contract in

00:08:40.270 --> 00:08:42.850
place, but I'm not sure if they followed up on it.

00:08:43.640 --> 00:08:49.110
So part of this is, shall we say, that quite often the

00:08:49.110 --> 00:08:52.460
term auditors use is professional skepticism.

00:08:53.240 --> 00:08:58.580
Yes, you said it was destroyed, but I'm not the trusting type of person.

00:08:58.590 --> 00:09:03.130
So could you show me how you actually destroyed that hardware that that data

00:09:03.130 --> 00:09:08.260
was on, for example? One of the important things with this is we have to

00:09:08.260 --> 00:09:13.620
keep data as long as it's required to be kept by law or as long as, of

00:09:13.620 --> 00:09:17.970
course, the business would need it. And therefore, we should have policies

00:09:17.970 --> 00:09:23.710
that clearly state what the retention period of our data is, as well as then

00:09:23.860 --> 00:09:28.810
the process that will be used to destroy that data at the end of the

00:09:28.810 --> 00:09:30.050
retention period.

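NOTE
Editor's sketch of a retention policy check like the one described above.
The classifications, retention periods, and the due_for_destruction() helper
are assumptions for illustration. Real periods come from the applicable law
and the business policy, not from this example.
```python
from datetime import date, timedelta
# Hypothetical retention periods per data classification, in days.
RETENTION_DAYS = {"financial": 7 * 365, "marketing": 2 * 365}
def due_for_destruction(classification, created, today):
    """True once the record is older than its retention period."""
    return today - created > timedelta(days=RETENTION_DAYS[classification])
print(due_for_destruction("marketing", date(2020, 1, 1), date(2023, 1, 1)))
```
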
00:09:32.190 --> 00:09:36.560
In the end, data must be protected consistently and

00:09:36.940 --> 00:09:40.620
appropriately throughout the entire data lifecycle.

00:09:40.970 --> 00:09:45.430
Now that doesn't say the classification can't change. Data that was

00:09:45.430 --> 00:09:50.230
originally classified may actually have a lower level classification

00:09:50.230 --> 00:09:52.450
by the time it reaches end of life.

00:09:53.790 --> 00:09:59.050
The determination of the correct handling procedures is the

00:09:59.050 --> 00:10:04.930
responsibility of the data owner, and it is important here to ensure

00:10:05.210 --> 00:10:11.380
that we are compliant with regulations, and that data will be available

00:10:11.390 --> 00:10:14.660
to support business operations as required.
