You may suspect that there are riches buried in the myriad of text and documents your organisation produces every day. And you’re right. The documents already inside your organisation can provide early warning of risk and compliance issues. They’ll bring you new insights into what your customers really need. They offer a detailed understanding of your business that will drive better decisions.

“Art is a kind of mining,” he said. “The artist a variety of prospector searching for the sparkling silver of meaning in the earth.” ― Jane Urquhart, The Underpainter

Even relatively small organisations generate large amounts of text every day. This data is worth getting to grips with. But that’s not an easy thing to do, given the sheer diversity and volume of documents created. The trick to uncovering gems in your treasure trove of text is actually less of an art and more of a science, specifically the data science technique called text mining.

Even a cursory understanding of what text mining can offer will help you decide whether it’s worth pursuing and, if it is, commission better solutions, whether they are developed by an in-house team or by external consultants.

This article looks first at what text mining is, then at three potential ways that text mining could be used to create value for your business.


What is text mining?

“Text mining, also referred to as text data mining, is the process of deriving high-quality information from text. High-quality information is typically derived through the devising of patterns and trends through means such as statistical pattern learning.” (Wikipedia)

This definition seems pretty clear. But putting it into practice isn’t as straightforward. The challenge with text, as opposed to more structured data sources like spreadsheets, is two-fold.

  1. The lack of consistent structure. Consider the variety of different formats text appears in. From emails to CRM entries, from reports in Word to PDF documents, each has a different structure.
  2. It’s big data. The volume of documents and text in an organisation is the very definition of “big data” – data that by its sheer size and complexity requires significant processing power to yield any insights of value.

Solving these problems of structure and scale is where data science comes in. The basic approach is to turn text into numbers, so that machines can analyse large volumes of documents and discover insights through mathematical algorithms. A number of techniques can be used to do this. For example, Mastodon C used an approach called topic detection to help Defra build a prototype early warning system.
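To make “turning text into numbers” concrete, here is a minimal sketch in Python. It uses TF-IDF weighting, a common and simple way to score how distinctive each word is for a document. This is an illustrative stand-in and not the topic detection approach mentioned above. The sample documents and the function names `tokenize` and `tf_idf` are invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {word: weight} dict per document.

    weight = (count of word in document / words in document)
             * log(number of documents / documents containing the word)
    """
    tokenized = [tokenize(doc) for doc in documents]

    # Count how many documents each word appears in.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    n_docs = len(documents)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty text
        vectors.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return vectors


# Hypothetical example documents, invented for illustration.
docs = [
    "Customer reports a late delivery and asks for a refund",
    "Customer praises the quick delivery",
    "Supplier invoice flagged for a compliance review",
]
vectors = tf_idf(docs)

# Show the three most distinctive words in each document.
for vec in vectors:
    print(sorted(vec, key=vec.get, reverse=True)[:3])
```

Words that appear in every document receive a weight of zero, while words concentrated in a few documents score higher. Once each document is a set of numbers like this, it can be compared, clustered, or fed into other algorithms.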

Other terms commonly associated with text mining are:

  • Machine Learning – where a computer is given the ability to learn and improve what it does without being explicitly programmed with rules.
  • Natural Language Processing – which enables the recognition of similar concepts even if they’re spelt differently or expressed in different ways.
  • Sentiment Analysis – an approach to finding subjective insights such as opinions, mood, and emotions.
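
As a rough illustration of the last item, sentiment analysis in its simplest form can be sketched as counting positive and negative words. The word lists and the `sentiment_score` function below are invented for this example. Production systems typically rely on much larger lexicons or on trained machine learning models.

```python
import re

# Hypothetical mini-lexicon, invented for illustration only.
POSITIVE = {"great", "helpful", "quick", "love", "excellent"}
NEGATIVE = {"late", "broken", "slow", "terrible", "rude"}


def sentiment_score(text):
    """Return (positive words - negative words) / total words, between -1 and 1."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)


print(sentiment_score("Quick delivery and helpful staff"))  # above zero: positive
print(sentiment_score("Parcel arrived late and broken"))    # below zero: negative
```

A word-counting approach like this misses negation and sarcasm ("not great" still scores as positive), which is one reason the more capable techniques listed above are used in practice.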

How can text mining create business insights?

There are a number of reasons business leaders should be interested in text mining. IDC estimate that less than 1% of data is ever analysed. Text mining will help bridge the gap to the missing 99%. Here are some ways that text mining could make an impact on your business.

1 – Risk, Compliance and Threat Detection

Across a variety of sectors, insufficient risk analysis creates massive problems. This is especially true in the financial services industry where text mining is used to detect potential compliance issues or provide early warning of fraud and criminal activities like money laundering. Public sector organisations will also benefit from using text mining to provide early warning of issues, for example by helping hospitals spot potentially dangerous issues before they cause real harm.

2 – Customer Engagement

Your interactions with customers generate mountains of text, whether through customer emails, social media, or notes in the CRM system. Text mining, through the application of techniques like natural language processing, can generate early insights into what your customers are thinking. This can save money by reducing dependence on cost areas like call centres. You will also be able to protect your reputation by getting early warning about issues, or build new revenue streams by identifying new products your clients need or new market segments to target.

3 – Better Business Decisions

Analysts need data if they’re to provide a business with accurate insights. Text mining can help by drawing those insights from a broader range of documents and sources. This approach is especially powerful when combined with external data sources. Bringing together a variety of internal and external data sources helps improve both the speed and quality of decision making.

Getting Started

The key to getting a good outcome from text mining is, as for many business initiatives, clearly defining the problem you’re trying to solve, and building a business case. We’ll be looking at ways to do that next week.

In the meantime, email us if you’d like a chat, or join the conversation on Twitter @mastodonc.
