Protecting unstructured data

04 Aug, 2021

Before we zoom in on the problems with unstructured data in the finance industry, let’s take a step back and look at the massive proliferation of data in general.

Did you know that around 59 zettabytes (ZB) of data (that’s 59 followed by 21 zeros, counted in bytes) were expected to be created, captured, copied and consumed in the world in 2020, according to the Global DataSphere forecast from International Data Corporation (IDC)? That’s a lot of data!

If we look at the ratio of unique data (created and captured) to replicated data (copied and consumed), it is roughly 1:9, and the trend is toward less unique and more replicated data. This is driven by the COVID-19 pandemic, which hindered the creation of new data but increased the consumption of unstructured data, in particular downloaded and streamed video.

Looking ahead, the sharp data growth trajectory will continue. IDC predicts that the amount of data created over the next three years will be more than the amount created over the past 30 years, and that the world will create more than three times as much data over the next five years as it did in the previous five. Yikes!

That is a lot of data, but before we talk about how the finance industry can better protect it, it’s important to understand the two distinct forms of data: structured and unstructured.

So, what is structured data?

Structured data is typically what first comes to mind when you think of digital data and big data analytics. It’s the type of information that the finance industry stores in traditional databases composed of columns and rows, such as a customer database comprising names, addresses, telephone numbers and orders. Structured data is highly organised, and it’s easy to process, access, and work with.

And, what is unstructured data?

Unstructured data is everything else. Every financial organisation has a whole host of unstructured data. In fact, some 80% of data is unstructured and much of it is personal information.

Common examples of unstructured data are:

Spreadsheets
Email conversations
Chat logs
Word processing documents
Slideshow presentations
Image libraries
Videos
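To make the difference concrete, here is a minimal Python sketch. The customer names, phone numbers, field names and the phone pattern are all invented for illustration. It shows the same kind of information held first as structured rows, then buried in a free-text email.

```python
import re

# Structured: every value sits in a named field, so finding it is a lookup.
# (Invented sample rows; the field names are assumptions for this sketch.)
customers = [
    {"name": "A. Example", "phone": "020 7946 0000", "orders": 3},
    {"name": "B. Sample", "phone": "020 7946 0001", "orders": 1},
]

def phones_from_rows(rows):
    """Read the phone field directly from each row."""
    return [row["phone"] for row in rows]

# Unstructured: the same details appear somewhere inside free text.
email_body = (
    "Hi team, B. Sample rang from 020 7946 0001 about her order. "
    "Please call A. Example back on 020 7946 0000 tomorrow."
)

# A hand-written pattern is needed just to locate the numbers, and it only
# finds numbers written in exactly this layout.
PHONE_PATTERN = re.compile(r"\b0\d{2} \d{4} \d{4}\b")

def phones_from_text(text):
    """Search free text for anything shaped like the phone pattern."""
    return PHONE_PATTERN.findall(text)

print(phones_from_rows(customers))
print(phones_from_text(email_body))
```

With the rows, the schema tells you where the personal information lives. With the email, you only find what your pattern anticipates, which is part of why unstructured data is harder to locate and protect.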

Unstructured data in the financial industry is everywhere

The financial industry is particularly affected by unstructured data. In addition to large databases of client information and transactions, financial institutions hold a wide range of other data and documents such as trading reports, HR records, meeting notes, business plans, financial statements and spreadsheets, many of which are highly confidential. The industry’s love of spreadsheets is a particular data security problem: they often contain highly sensitive data but are weakly protected.

Unstructured financial data needs better protection

For years, the finance industry has sought to protect data by using multiple layers of security to prevent unauthorised access. Unfortunately, it isn’t working, as the relentless flow of headlines about successful cyber-attacks and breaches shows. The reality is that most financial organisations have data files stored on laptops and accessible to staff who have no reason to see that information.

So the question becomes: if we cannot keep cyber criminals out with traditional ‘castle and moat’ perimeter defences, and we can’t trust the people around us, what can we do? The answer is to adopt a data-centric approach, where security is built into the data itself. That way, even if the data does get into the wrong hands, it is useless because it is encrypted.

Why is full disk encryption not good enough?

Many financial organisations do not realise that full disk encryption only protects data, structured or unstructured, while it is at rest on a powered-off hard disk or USB stick. That means full disk encryption only helps if your employees lose their laptops. It is of no use in protecting data against unauthorised access or theft from a running system. And let’s face it, data is only useful on a powered-on machine, which is where it sits most of the time, while staff run reports, analyse data, make presentations and work on proposals.

And though the situation may gradually change, most financial organisations currently deploy endpoints with local storage, where extracted data is often stored. Data, whether structured or unstructured, therefore needs to be protected not only at rest but also in transit and in use, whether on-site or in the cloud. But this is no easy task.

All data is sensitive - especially unstructured data

Another problem in the finance industry is that time-consuming and costly data classification technology is often used to identify ‘important’ or ‘sensitive’ data so that it can be encrypted. Data classification is error-prone, and we don’t even know where all the ‘sensitive’ data is. A 2020 Ponemon report found that 67% of respondents named discovering where ‘sensitive’ data resides as the number one challenge in planning and executing a data encryption strategy.

And when it comes to unstructured data, deciding what is the ‘most important’ data to protect is even more difficult. It typically involves assessing and classifying data such as intellectual property, merger and acquisition plans, letters, emails and human resources records, then taking into account risk and business impact analysis and regulatory requirements.

Manual classification is impractical for most organisations. Even with the advent of data classification automation, search patterns and rules still need to be developed, so it’s highly likely that a proportion of data will be mis-classified. And even if it’s not, often the user is allowed to override the assigned classification.
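Here is a minimal Python sketch of how rule-based classification can miss data. The two rules, the labels and the sample documents are invented for illustration and are not taken from any real classification product.

```python
import re

# Hypothetical rules: a document is 'sensitive' if any pattern matches.
RULES = {
    "card_number": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "confidential_marker": re.compile(r"\bconfidential\b", re.IGNORECASE),
}

def classify(text):
    """Return 'sensitive' if any rule matches the text, otherwise 'public'."""
    for pattern in RULES.values():
        if pattern.search(text):
            return "sensitive"
    return "public"

# Invented sample documents.
documents = {
    "payment_note": "Card 4111 1111 1111 1111 was charged twice.",
    "marked_memo": "CONFIDENTIAL: draft budget attached.",
    "merger_chat": "Board agreed to buy Acme Ltd for 40m next quarter. Keep it quiet.",
    "lunch_menu": "Soup of the day is tomato.",
}

labels = {name: classify(text) for name, text in documents.items()}

for name, label in labels.items():
    print(f"{name}: {label}")
```

The merger chat contains no card number and nobody typed the word ‘confidential’, so these rules give it the same label as the lunch menu. Adding more rules narrows the gap, but someone still has to anticipate every form that sensitive information can take.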

Once this is all done, the initial effort to catalogue and assign classifications to all existing data must then become an ongoing process. Users need to assign classification tags to data as new information is created, modified, and shared. Then there is the biggest question of all: where do you set the bar? Even seemingly trivial information can be useful to a cybercriminal.

A new approach to protecting structured and unstructured data

Data encryption has been used in the financial sector for decades - it’s a tried and trusted technology. But data encryption should be used to protect ALL data, not just the data classified as most important, and definitely not structured data in isolation. By protecting structured and unstructured data at the file level, financial organisations can be assured that if data does get stolen, it remains protected and useless to the thief.
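As a rough illustration of file-level protection, the Python sketch below encrypts a file’s contents before they touch the disk, so a copied file is unreadable without the key. It is not how any particular product works. The cipher is a toy built from the standard library so that the example runs anywhere, the file name and contents are invented, and a real deployment should use a vetted algorithm such as AES-GCM from a maintained library.

```python
import hashlib
import hmac
import pathlib
import secrets
import tempfile

# TOY cipher for illustration only (an assumption of this sketch).
# Do not use in production; use a vetted, maintained crypto library instead.

def _keystream(key, nonce, length):
    """Stretch key + nonce into `length` bytes with HMAC-SHA256 in counter mode."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])

def _subkeys(key):
    """Derive separate encryption and authentication keys from one master key."""
    return hashlib.sha256(b"enc" + key).digest(), hashlib.sha256(b"mac" + key).digest()

def encrypt_bytes(key, plaintext):
    """Return nonce + ciphertext + 32-byte authentication tag."""
    enc_key, mac_key = _subkeys(key)
    nonce = secrets.token_bytes(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def decrypt_bytes(key, blob):
    """Check the tag, then recover the plaintext; raise ValueError on a mismatch."""
    enc_key, mac_key = _subkeys(key)
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or modified file")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

key = secrets.token_bytes(32)
original = b"client,balance\nA. Example,1200\n"  # invented file contents

with tempfile.TemporaryDirectory() as folder:
    path = pathlib.Path(folder) / "forecast.csv"  # hypothetical file name
    path.write_bytes(encrypt_bytes(key, original))  # only ciphertext is stored
    stolen_copy = path.read_bytes()  # what someone copying the file would get

recovered = decrypt_bytes(key, stolen_copy)
print(recovered == original)
```

The point of the sketch is where the protection sits. The bytes in the file are ciphertext whether the machine is on or off, so the protection travels with the file and does not depend on where the file is stored.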

The best part about the SecureAge Security Suite is that it gives financial companies the ability to slide encryption technology in ‘behind’ other software that’s already in use. It automatically secures data, structured or unstructured, without having to change any applications or spend time deciding what is important. As a result, it doesn’t matter whether the data is stored, in transit or in use: financial companies are finally able to secure the only thing that has value, the data itself. Visit our SecureAge Security Suite page to find out more, and get in touch with our representative to see the SecureAge Security Suite in action.
