“It’s not me, it’s you…” 

If you’ve found yourself saying this to your CRM once again, we’re here to tell you that you’ve got it wrong. It’s neither of you: it’s your data.

Almost any CRM can work for you, if your data is clean enough.

If you're in the process of moving to a new CRM or trying to set up email marketing campaigns, then you may have had a shock when you saw the state of your CRM database. Typos, missing data, incorrect emails, and jumbled formatting litter your data, and you don't even know where to start. With all this bad data, you:

  • Can't personalize emails due to missing data
  • Don't have your contacts segmented or organized
  • Are receiving complaints from the sales team about duplicate data or inaccurate information
  • Have heard about the benefits of email marketing but aren't seeing those results yourself

Getting this data under control is essential if you want to optimize your marketing efforts. 

You don't need to be an expert in data science to implement effective data management. Clean up your data with these data cleansing techniques to create quality lists that you can leverage for real results.

Want to know really, once and for all, how healthy your data is? Request your free data quality check.

Get our ultimate bundle of checklists, workflows and swipe files to manage your customer database like a data pro

  • Email deliverability checklist
  • Lead management workflow
  • Email cleaning checklist to make the most of your list
  • A welcome email swipe file to engage your list from the start
  • And more!

Why you need a data cleanup: consequences of bad data

Before you even consider using your data the way it is, let me stop you. Sending out email campaigns to a dirty data list will have a string of negative consequences. If you're putting garbage data into your CRM, you can only expect to get garbage results. No algorithms can change that.

1. Loss of revenue

If you're using bad data, you're going to cost your organization money. The last thing you want to show your CEO is that your marketing strategies are actually losing money. According to CIO.com, 77% of companies lose revenue as a direct result of bad data.

If you aren't working with clean data, your sales and marketing teams will be unable to target contacts effectively. It will lead to:

  • Your sales team calling the wrong phone numbers

  • Your marketing team sending email blasts to invalid email addresses

  • Sending out double the emails due to duplicate data

  • Paying twice or thrice for a single user due to duplicate data

  • Ending up in the spam folder

  • Spending more on CRM software because of the number of contacts in your database

  • Missing opportunities to connect with potential customers

  • Etc.

All of these factors contribute to an overall loss of revenue.

2. Loss of productivity

If your teams are working with bad data, it will result in a loss of productivity. Business processes depend on accurate reporting to function efficiently. If you make decisions based on inaccurate information, you'll spend more time trying to fix data errors or validate the data. Also, instead of spending time targeting quality contacts, your team will waste their time with flawed ones.


3. Loss of reputation

Nobody likes being called by the wrong name, or worse, by FNAME. When your contacts receive untargeted emails that are entirely irrelevant to them, they will drop them right into the spam folder. If you send bulk emails from data that isn't clean, your contacts won't trust you, and that can be detrimental to your reputation as a brand.

The importance of email list hygiene when cleaning data 

Personalization is the marketer's secret weapon: if you can personalize your email marketing, your emails are 26% more likely to be opened. Email list hygiene is essential to creating targeted campaigns. If you want to segment your contact lists, get high open rates, and boost conversions, start with clean data.

When people subscribe to your email list, over 20% of the entries will contain errors. That could include typos, information in the incorrect field, syntax errors, and more. That means that even if you standardize your data collection, some errors will still slip through.

Some platforms, like MailChimp, won’t even allow you to send out your email campaigns if your list contains too much bad data. You will get a MailChimp Omnivore warning, which tells you that the data quality isn’t good enough to send out the email. For that reason, data cleansing needs to be a regular part of your data management strategy.

How to assess the quality of your data to determine if it’s clean

Before you begin the data cleaning process, assess the quality of your data. Doing so will help you determine which data sources deliver quality data and where most of the bad data is coming in. You can use your assessment to put data collection processes in place that limit the number of bad data points that end up in your CRM.


There are four criteria to use when analyzing the quality of your data.

1. Is it valid, and does it fit your formatting rules?

Validity is vital when determining your data quality. Consider whether the data fits your formatting rules: do the fields have proper capitalization, are phone numbers formatted correctly, and do the emails appear in the right fields?

Structural errors in validity can occur when input methods are neglected. The wrong information could end up in the wrong cell due to import errors or human error.

Here is what that might look like:

Name       | Birthday           | Address          | Email
-----------+--------------------+------------------+-------------------
john smith | 28/10/1982         | 123 upper St.    | JSmith82@gmail.com
John Smith | October 28, 1982   | 123 upper STreet | jsmith82@gmail.com
J.Smith    | jsmith82@gmail.com | 123 Upper Street | 28.10.1982
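
A check like this can be automated. The following is a minimal sketch applied to rows like the ones above. The field names, the DD/MM/YYYY date rule, and the simplified email pattern are all assumptions made for illustration, not rules taken from any particular CRM.

```python
import re
from datetime import datetime

# Simplified email pattern (an assumption; real-world rules vary)
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def is_valid_date(value, fmt="%d/%m/%Y"):
    """Return True if value parses with the assumed house date format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

def validate(row):
    """Return a list of the formatting rules a contact row breaks."""
    problems = []
    if row["name"] != row["name"].title():
        problems.append("name: not capitalized")
    if not is_valid_date(row["birthday"]):
        problems.append("birthday: not DD/MM/YYYY")
    if not EMAIL_RE.match(row["email"]):
        problems.append("email: not an email address")
    return problems

rows = [
    {"name": "john smith", "birthday": "28/10/1982", "email": "JSmith82@gmail.com"},
    {"name": "John Smith", "birthday": "October 28, 1982", "email": "jsmith82@gmail.com"},
    {"name": "J.Smith", "birthday": "jsmith82@gmail.com", "email": "28.10.1982"},
]
for row in rows:
    print(row["name"], "->", validate(row))
```

The third row shows why validity checks are worth running per field: the email and the birthday have swapped places, so both fields fail their rule.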

2. Is it accurate and true?

Next, determine if the data is accurate and true. It might be valid and in the correct field, but that doesn't mean the information itself is correct. Determining accuracy can be slightly more complicated, since an algorithm alone can't tell you whether a correctly formatted value is actually true.

A few inaccurate data entries don't mean that the entire process is flawed. But if you notice data quality errors in a particular field occurring frequently, you'll want to rethink your data collection process.

For example, someone's phone number could be in the correct field and the right format, but it might be the wrong phone number.

3. Is it consistent?

Data values should always be consistent across your data set. If you have two values that contradict each other in the same set, you have inconsistent data. 

For example, if someone's contact details state that they are 11 years old and have their marital status as "married," then the data isn't consistent since an 11-year-old wouldn't be married. Another example could be that the same customer is in multiple data lists, but their phone number is different in each one. If the data seems inconsistent, you'll need to cross-check it to determine which of the values is accurate.
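
Both examples can be expressed as simple rules. This is a sketch only: the field names, the `min_married_age` threshold, and the choice of email as the matching key are assumptions for illustration.

```python
def find_inconsistencies(contact, min_married_age=16):
    """Flag field combinations that contradict each other."""
    issues = []
    age = contact.get("age")
    married = contact.get("marital_status") == "married"
    if age is not None and married and age < min_married_age:
        issues.append("age conflicts with marital status")
    return issues

def conflicting_phones(*contact_lists):
    """Return emails whose phone number differs between lists."""
    seen = {}
    for contacts in contact_lists:
        for contact in contacts:
            seen.setdefault(contact["email"], set()).add(contact["phone"])
    return sorted(email for email, phones in seen.items() if len(phones) > 1)

print(find_inconsistencies({"age": 11, "marital_status": "married"}))

newsletter = [{"email": "a@example.com", "phone": "555-0100"}]
webinar = [
    {"email": "a@example.com", "phone": "555-0199"},
    {"email": "b@example.com", "phone": "555-0123"},
]
print(conflicting_phones(newsletter, webinar))
```

The functions only flag the contradiction. Deciding which value is the accurate one still requires a cross-check.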

4. Is it complete?

Missing values are extremely common in datasets. People may fill out online forms and just choose to skip a field. If that is the case, it can be hard to fill in the blanks since you can't assume the missing data. If you have any contact details for the person, you can ask them for the missing values. If you don't have an email address or phone number, then the listing will remain incomplete.

Assessing data quality yourself is a complex process. Luckily, you don't need to do it alone. Take advantage of data quality tools, or get a free data quality check from the data experts at tye.

What is data cleaning? How to ensure you're always collecting high-quality, clean data 

Once you've assessed the quality of your data, put data collection processes in place to ensure that you are collecting high-quality data. If you don't change the way you collect data, you will just continue filling your contact lists with bad data.

There are four steps to ensuring the data you collect is as clean as possible:

1. Inspect

Inspecting your data is necessary to identify where bad data is leaking in. You’ve already done the inspecting above when you went through the steps to assess your data’s quality. If you do not inspect your data, you can keep cleansing it but never get to the root of the problem, which is your data management processes.

2. Clean

Once you’ve inspected your data and updated your data management practices accordingly, you can cleanse your data. We will break down the exact steps to clean your data in the next section.

3. Verify

After your data is clean, review your datasets to verify that they are now correct. If you’re cleaning data yourself, you may have made formatting errors or written formulas that are slightly off. If you use data cleaning tools, you are more likely to succeed with your first clean.

4. Report

Lastly, reporting is an important part of the data management process. You should always report any changes that you’ve made and the quality of the data currently stored in your lists. Reports should state which formatting rules were broken and how many times, so you can identify the causes of those errors.
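
Counting broken rules is straightforward once each row has a list of violations. The sketch below assumes such lists already exist; the rule labels and sample data are hypothetical.

```python
from collections import Counter

def build_report(violations_per_row):
    """Count how often each rule was broken across all rows."""
    counts = Counter()
    for violations in violations_per_row:
        counts.update(violations)
    return counts

# Hypothetical output of a validation pass, one list per contact row
violations_per_row = [
    ["name: not capitalized"],
    ["birthday: wrong format"],
    ["birthday: wrong format", "email: invalid"],
    [],
]
report = build_report(violations_per_row)
for rule, count in report.most_common():
    print(f"{rule}: {count}")
```

Sorting by frequency puts the most commonly broken rule first, which is usually where the data collection process needs attention.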

You can also use reporting to identify where you’re pulling quality leads from, and which leads never convert. Is there a pattern in the type of contact that doesn’t ever convert or open your emails? 

Check for patterns. If you see many disqualified leads that are all on personal email addresses then you might want to consider limiting email opt-ins to business emails. There may also be a type of content that seems to produce leads which never end up converting. This will depend on your target audience, your business and your content, but it’s something to consider and look out for.

What is data cleansing?

Data cleansing involves identifying, amending, or removing any data within a contact list that is:

  • Incomplete

  • Incorrect

  • A duplication

  • An outlier

  • Irrelevant or unnecessary

The process of data cleaning enables companies to keep their data current, accurate, and GDPR compliant, and it reduces the risk of that data becoming outdated. With B2B data constantly decaying, data cleansing needs to be a regular part of your data management efforts.

[Image: when you're legally allowed to process personal data in accordance with GDPR]


How to Clean Data

Data cleaning can seem like a tedious and time-consuming task. And that's because it is. Let's not make light of the fact that even complex formulas in Excel may not be able to normalize your data in its current state. While you can’t snap your fingers and have a clean database, you can enlist the help of expert data cleansers and data cleansing tools like tye.

Here are the steps to clean your data:

1. Remove irrelevant data

The first step is to get rid of any data that is irrelevant. If there are fields that don't apply to your industry or marketing tactics, they need to go. Not only are they of no use to you, but keeping personal data you don't need also runs against GDPR's data minimization principle.

If you're a manufacturer who produces screws and bolts for construction companies, then you don't need your contacts’ eye color in your dataset.

2. Get rid of duplicates

Once you've removed anything irrelevant to your industry, you'll want to weed out the duplicates. Duplicates only increase the amount of data in your database and end up wasting more time. 

Both simple and complex duplicates can find their way into your datasets. Simple duplicates are two contacts in your database that you know to be the same person. For example, if you have a Matt Smith and a Matthew Smith who both have the same email address and phone number, that is a simple duplicate.

If you are unable to tell whether two contacts represent the same person, that is a complex duplicate. For example, you have a Matthew Smith who lives in New York and a Matthew Smith in Portland. They list different emails but have the same phone number, so it becomes more complicated to determine whether they are actually the same person.

Many CRMs, like HubSpot, will charge you based on how many contacts you have. So, if you have duplicates, you will end up paying extra for them unnecessarily.

Duplicate data can also cause contacts to receive multiple copies of your marketing communications, which comes across as spammy and can increase unsubscribes.

Duplications can end up in your database in multiple different ways. Your marketing team may be migrating contacts between CRMs or uploading from Excel sheets where they were listed twice. Users may also submit online forms multiple times by mistake.
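
Simple duplicates can be caught by comparing normalized contact details. The sketch below assumes that a matching email and phone number identify the same person; the field names and sample contacts are made up for illustration.

```python
def dedupe_key(contact):
    """Normalize email and phone so trivially different entries match."""
    email = contact["email"].strip().lower()
    phone = "".join(ch for ch in contact["phone"] if ch.isdigit())
    return (email, phone)

def split_duplicates(contacts):
    """Keep the first contact per key; collect the rest for review."""
    unique, duplicates = {}, []
    for contact in contacts:
        key = dedupe_key(contact)
        if key in unique:
            duplicates.append(contact)
        else:
            unique[key] = contact
    return list(unique.values()), duplicates

contacts = [
    {"name": "Matt Smith", "email": "msmith@example.com", "phone": "555-0100"},
    {"name": "Matthew Smith", "email": "MSmith@example.com ", "phone": "(555) 0100"},
    {"name": "Matthew Smith", "email": "matthew@example.org", "phone": "555-0100"},
]
kept, dupes = split_duplicates(contacts)
print(len(kept), "kept,", len(dupes), "flagged as duplicate")
```

The third contact shares only the phone number, so it is kept. That is the complex case, and it is safer to review it by hand than to merge it automatically.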

3. Type conversions

Type conversion is the process of converting one data type to another. Each field in your CRM is formatted to a specific type of data. Fields that contain numbers should be formatted as numbers, which means no alphabetic characters can end up in them. 

A good rule of thumb is that, in a spreadsheet, numbers align to the right of a cell by default while text aligns to the left. If you see numbers that align to the left, then the field is likely formatted as text, and you may need to change it. Similarly, fields that contain dates should be formatted as dates. By doing type conversions, you ensure normalization across all your fields.

You can convert cells to whichever types you need and set parameters around what data is compatible. For example, if one field contains the contact's age, you can set it to accept a maximum of two digits.
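
Outside a spreadsheet, the same idea can be sketched in a few lines. The two-digit limit mirrors the age example above; the function names and the DD/MM/YYYY format are assumptions for illustration.

```python
from datetime import date, datetime

def to_age(value, max_digits=2):
    """Convert a text field to an int age, or None if it does not fit."""
    text = str(value).strip()
    if text.isascii() and text.isdigit() and len(text) <= max_digits:
        return int(text)
    return None

def to_date(value, fmt="%d/%m/%Y"):
    """Convert a text field to a date, or None if it does not parse."""
    try:
        return datetime.strptime(value.strip(), fmt).date()
    except ValueError:
        return None

print(to_age(" 42 "), to_age("4o"), to_age("289"))
print(to_date("28/10/1982"), to_date("October 28, 1982"))
```

Returning `None` instead of raising an error lets you collect the values that failed conversion and review them later.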



Data cleaning steps - continued

4. Standardizing

Standardizing involves ensuring the same format across datasets. You can choose how you'd like to standardize fields and then use those to normalize the data. Standards include:

  • whether text should be capitalized or lowercase

  • how you want to format dates

  • which measurement units you want to use
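
As a rough sketch, standards like these can be applied in one pass. The chosen standards here (title-case names, lowercase emails, ISO dates) and the list of accepted input formats are assumptions, not requirements.

```python
from datetime import datetime

# Input formats we expect to encounter (an assumption for this sketch).
# "%B" matches English month names under Python's default locale.
KNOWN_DATE_FORMATS = ("%d/%m/%Y", "%B %d, %Y", "%d.%m.%Y")

def standardize_date(value, out_fmt="%Y-%m-%d"):
    """Rewrite a date in the house format, or return None if unknown."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).strftime(out_fmt)
        except ValueError:
            continue
    return None

def standardize_contact(contact):
    """Apply the chosen standards to one contact record."""
    return {
        "name": " ".join(contact["name"].split()).title(),
        "email": contact["email"].strip().lower(),
        "birthday": standardize_date(contact["birthday"]),
    }

raw = {"name": "john  smith", "email": " JSmith82@gmail.com", "birthday": "28.10.1982"}
print(standardize_contact(raw))
```

Title-casing is a blunt rule: it will mangle names such as "McDonald", so treat it as a starting point and review the exceptions.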

5. Missing values

If your data is missing values, you have two choices: try to fill in the blanks or scrap the record entirely. Which action you choose will depend on which value is missing. If it's the email address that’s missing, then you'll likely need to scrap the record since you won't be able to contact the person at all. If you've got their email but are missing other vital details like their name, you can either use their email to find their name or contact them to ask.
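
That decision can be written down as a small rule. This sketch follows the email-first logic described above; the field names and the three labels are hypothetical.

```python
def triage(contact):
    """Decide what to do with a record based on which values are missing."""
    email = (contact.get("email") or "").strip()
    name = (contact.get("name") or "").strip()
    if not email:
        return "scrap"      # no way to reach the contact
    if not name:
        return "follow_up"  # reachable: look the name up or ask
    return "keep"

print(triage({"name": "John Smith", "email": ""}))
print(triage({"name": None, "email": "jsmith82@gmail.com"}))
print(triage({"name": "John Smith", "email": "jsmith82@gmail.com"}))
```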

6. Outliers

Outliers include any piece of data that visibly doesn't fit with the rest. If the values in a field are entirely off from other values in the dataset, then it's a good indication that it's been entered incorrectly. While you always want to investigate before deleting data, if you see someone who entered their age as 289, then you can make an educated assumption that it's wrong.
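
For a field like age, a plain range check is often enough to surface such values. The bounds below (0 to 120) are an assumption, and the function only flags values for investigation rather than deleting them.

```python
def flag_outliers(values, low=0, high=120):
    """Return (index, value) pairs that fall outside a plausible range."""
    return [(i, v) for i, v in enumerate(values) if not low <= v <= high]

ages = [34, 41, 289, 27, 58]
print(flag_outliers(ages))  # [(2, 289)]
```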

Data cleansing techniques – continued

7. Cross-dataset errors

The next step is to get rid of or amend any cross-dataset errors. We touched on this above when talking about consistency in your data. If multiple fields in the same contact data contradict each other, then identify where the problem is. If someone is 18 years old, but they've ticked the box for a senior's discount on your website, then there is misinformation somewhere. Did they mean 81?

Another typical example is when your dataset features the sum of other columns, but the figures don't add up. Somewhere the formula is off, so determine which cells it leaves out or misreads. Here is what cross-dataset errors could look like.

Name  | Food | Rent | Car Insurance | Total Sum
------+------+------+---------------+----------
Sarah | $100 | $850 | $129          | $229
Tom   | $328 | $920 | $214          | $542
Joe   | $290 | $680 | $207          | $497

By analyzing the data, you can see that the cells which contain the rent haven't been considered in the formula for the sum.
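
The same check can be scripted by recomputing each total from its parts. This sketch uses the figures from the table above; the column keys are naming assumptions.

```python
rows = [
    {"name": "Sarah", "food": 100, "rent": 850, "car_insurance": 129, "total": 229},
    {"name": "Tom", "food": 328, "rent": 920, "car_insurance": 214, "total": 542},
    {"name": "Joe", "food": 290, "rent": 680, "car_insurance": 207, "total": 497},
]
COMPONENTS = ("food", "rent", "car_insurance")

def check_totals(rows):
    """Return (name, stored_total, expected_total) for rows that do not add up."""
    mismatches = []
    for row in rows:
        expected = sum(row[c] for c in COMPONENTS)
        if row["total"] != expected:
            mismatches.append((row["name"], row["total"], expected))
    return mismatches

for name, stored, expected in check_totals(rows):
    print(f"{name}: stored {stored}, expected {expected}, off by {expected - stored}")
```

In every row the gap equals the rent figure, which points to the missing column rather than to a typo in a single cell.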

8. Get rid of spaces

This one is pretty self-explanatory, but spaces can throw off the algorithms in your contact lists. One inaccurately placed space could prevent emails from reaching your contact or mess with other fields' formatting in the dataset.
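
A small helper can handle both cases: trimming and collapsing spaces in names, and removing them entirely from fields like email. The function name and the `keep_inner` flag are made up for this sketch.

```python
def clean_spaces(value, keep_inner=True):
    """Trim the ends; collapse or remove whitespace inside the value."""
    parts = value.split()
    return (" " if keep_inner else "").join(parts)

print(repr(clean_spaces("  John   Smith ")))                         # 'John Smith'
print(repr(clean_spaces(" jsmith82 @gmail.com", keep_inner=False)))  # 'jsmith82@gmail.com'
```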

How to establish best practices for regular data cleaning

Regular maintenance of data is required to ensure your data stays clean. You can never entirely prevent bad data from getting in, but you can work to reduce it. To keep your raw data as clean as possible when it comes in, you need:

  • A thorough data management plan
  • Consistent, standard formatting and a data normalization process
  • A data quality assessment
  • Regular data cleansing for email marketing via a reputable data cleansing provider

Key takeaways

Data cleaning doesn't have to be a scary process. By following these data cleaning steps, you can completely transform your messy, disorganized CRM database and optimize your marketing efforts.



Markus Beck - November 17, 2020

CEO with a passion for data relationships. Markus is half Finnish, half Austrian & fully committed to helping businesses keep bad data from ruining great relationships. Process Engineer by training, with digital marketing & project management skills from previous jobs.