Excel CLEAN Function – How To Use

The CLEAN function comes under the category of text/string functions in Excel. It is used to remove first 32 non-printable characters from the text.

Based on the 7-bit ASCII character set, codes starting from 0 to 31 are considered as Non-Printing Control Characters such as horizontal tab, vertical tab, carriage return etc.

The CLEAN function is usually used when dealing with a large amount of exported data that contains non-printable characters like line breaks due to encoding differences between different systems and devices.

Excel CLEAN Function

Syntax

The syntax of the CLEAN function is as follows:

=CLEAN(text)

Arguments:

'text' – The text argument can be supplied in double quotes or as a cell reference.

Important Characteristics of the CLEAN function

Some of the notable features of the CLEAN function are as follows.

  • The CLEAN function only removes the first 32 characters from the 7 bit ASCII character set.
  • Non-printable characters from the Unicode character set (values 127, 129, 141, 143, 144, 157 and 160) cannot be removed using this function.
  • If the text argument is in numeric format, the return value will be in text format.
  • The CLEAN Function only works in Excel 2003 and versions after that.

Examples

The CLEAN function when used in a straightforward manner simply removes all the ASCII non-printable characters from the given data. In this example, cell B2 contains text copied from a webpage and therefore includes some non-printable data.

Using the function as follows:

=CLEAN(B2)

Examples of CLEAN Function in Excel

We now have the clean text in cell C2.

Example 1 – Cleaning Numeric Data in Excel

Suppose we enquired about a few mobile phones and received the prices of all of them in a spreadsheet. Upon downloading, we realized that there are some non-printable characters in the data which creates an issue while calculating the total sum.

Cleaning Numeric Data in Excel

One option is to manually clean the data in column C (prices), which is an ineffective approach. Instead, a quicker approach is using the CLEAN function, where we can easily remove all the non-printable characters using the below formula.

=CLEAN(C3)

Cleaning Numeric Data in Excel

As the CLEAN function is a text function, the problem is that all the cleaned values are now in text format. Hence, the sum of all the prices in cell D10 is 0. We can easily rectify it using the VALUE function which converts the numbers in text format to numerical format. The formula used will be as follows.

=VALUE(D3)

Cleaning Numeric Data in Excel

Now, we have all the mobile prices in a numeric format without any non-printable characters. To minimize the number of steps and make it cleaner, we can use all both the functions together as follows.

=VALUE(CLEAN(C3))

Cleaning Numeric Data in Excel

Example 2 – Removing Line Breaks in Excel

In this example, we have a dataset with some names and addresses. The addresses are written with line breaks for better understanding. However, we wish to have the addresses without line breaks before storing them to our database.

Removing Line Breaks in Excel

Using the CLEAN function directly on column C will give us the desired result. The formula used will be as follows.

=CLEAN(C2)

Removing Line Breaks in Excel

Note:

There is another approach to accomplish the above. We know that a line break corresponds to ASCII value 10. So we can use the SUBSTITUTE function to replace CHAR(10) with an empty string, thereby removing the line breaks from the text.

The formula would be:

=SUBSTITUTE(C2,CHAR(10),"")

Example 3 – Finding Non-Printable Characters in Data

Until now we have determined how to clean text, string, and numeric data using the CLEAN function. However, CLEAN function also allows us check if there are any non-printable characters in the text and if there are any such characters, we can ever get their count.

Let's see how.

Suppose you are in the marketing department and exported some data from an online tool for processing. We are here to check whether the exported data contains any non-printable characters or if we will have to clean it.

In this dataset, we have a few Facebook posts and their engagement numbers (number of likes).

Finding Non-Printable Characters in Data

To determine if the downloaded data contains any non-printable characters, we will use the IF function combined with LEN and CLEAN functions. The logic we will use is as follows.

We will compare the length of the original string and the length of the cleaned string. If there is any difference between the lengths that indicates there are non-printing characters in the string.

We can even go one step further and print the difference between the lengths of the original string and the cleaned string, this difference will be the number of non-printing characters in the original string. If the string lengths are same, we will print a message "Clean Text".

The difference in the length of the clean text and the original text is the number of non-printable characters. It can be easily calculated as follows.

=LEN(B3)-LEN(CLEAN(B3))

Combining the above IF logic in the Excel function, the final formula will look like this.

=IF((LEN(B3)-LEN(CLEAN(B3))) > 0, (LEN(B3)-LEN(CLEAN(B3)))&" Non-printable Characters Found.", "Clean Text")

Finding Non-Printable Characters in Data

Now, we can analyze how many non-printable characters are there in the data and use them accordingly.

But what if the data contain extra spaces along with non-printable characters?

Example 4 – How to Clean Extra Spaces

In the dataset, we downloaded a few quotes from a webpage. The data contains extra spaces and non-printable characters and, therefore, cannot be used as such.

How to Clean Extra Spaces

As we can see, there is a line break, unnecessary trailing spaces, and non-printable characters. When tried using the CLEAN function, it only cleared up the line break and non-printable characters. The formula used is as follows.

=CLEAN(C2)

How to Clean Extra Spaces

The CLEAN function clears up all the characters ranging from 0-31 ASCII code; and space corresponds to ASCII code 32. Here we can make use of the TRIM function. TRIM Function is used to clean up extra spaces in the text. So, let's try using the TRIM function to eliminate spaces.

=TRIM(C2)

How to Clean Extra Spaces

As we can see, the TRIM function cleared the trailing spaces and extra spaces, leaving no spaces at the beginning or end of the text and only one space between words. The line break and non-printable characters still exist.

We can clearly use both functions together to clean up our data. Using both TRIM and CLEAN functions together as follows.

=TRIM(CLEAN(C2))

How to Clean Extra Spaces

While the CLEAN function is great for getting rid of non-printable ASCII characters. But you might want to get rid of a few non-printable characters outside the ASCII range, such as non-breaking space characters (160), which are commonly used in HTML.

Let's see how we can get rid of Unicode Non-printing characters.

Example 5 – Removing Unicode Non-printing Characters

In this example, we have two similar-looking sentences in cells B2 and B3. Both contain extra space at the beginning and a non-printable character.

Removing Unicode Non-printing Characters

Let's use the solution mentioned above of combining the TRIM and CLEAN functions to clear up the data. The formula will go as follows.

=TRIM(CLEAN(B2))

Removing Unicode Non-printing Characters

As we can see, example 2 is now clean, as seen in cell C3, and the length of the clean string is 34 after removing the extra space and a non-printable character. However, in example 1, the non-printable character is removed, but there is still an extra space in the beginning.

This hidden space character is called a non-breaking space (nbsp) usually appears in Excel when we download data from another program or copy-paste from any webpage on the internet. The non-breaking space character is represented by CHAR(160).

To remove it, we need to combine the SUBSTITUTE function with the TRIM and CLEAN functions.

The SUBSTITUTE function will replace CHAR(160) with a space. The CLEAN and TRIM function can then remove the extra space and non-printable characters. The final function will be as follows.

=TRIM(CLEAN(SUBSTITUTE(B2,CHAR(160)," ")))

Removing Unicode Non-printing Characters

We can now clean up all the data even if it contains a hidden space. As Unicode is a vast character set, there are several other non-printable characters, such as 129, 141, 143, 144, and 157, which might create interference in the data. We can easily remove them using the same method.

CLEAN vs. TRIM Function

As we have seen in the above examples, the TRIM function cleans up extra spaces while the CLEAN function removes non-printable characters belonging to ASCII code. The space corresponds to CHAR(32), while the CLEAN function strips the first 32 characters (0-31).

That wraps up our discussion on the CLEAN function. While you solidify your expertise with some practice, we'll go put together another tutorial with some nifty formulas to help you become an undefeated Excel champion.

Compiled by - Content Studio

Thanks for reading. If you have found this article helpful show your love by sharing it with your friends & colleagues. All the tutorials on the Excel Trick are produced, reviewed, and fact-checked by a team of experts. You can check out our team here.