Joshua is a graduate student at the USF. He has interests in business technology, analytics, finance, and lean six sigma.
Why Remove Duplicates in Excel?
Removing duplicate records one by one from an Excel worksheet is a repetitive task for anyone. This is especially true for a user working with thousands of records. We are in luck that Excel has an automated tool to find duplicates for attributes of a record and allows you to remove those records to clean up the data.
Below I cover how to use the remove duplicates tool and cover how to remove duplicates by filtering data.
How Remove Duplicates Works
With the remove duplicates tool, Excel essentially allows you to choose the columns where you would like to remove duplicates. During this process, all records will be removed that contained duplicate data.
Tech Companies Remove Duplicates Example
Here is a simple example to see how the remove duplicates function works. The illustration below is a list of 248 tech companies and can be downloaded here. I would like to remove all the duplicates from the state attribute. This will allow me to show which states the tech companies listed are being drawn from.
To remove duplicates in the state column, we would first have to click on the data tab, then click on the remove duplicates icon in the data tools section shown below.
Data → Remove Duplicates
The remove duplicates window appears. At this point, you would click on the unselect all button. This will clear all of the checkboxes to allow you to customize your selection. Since I only want to remove duplicates in the state column I would select state and click OK.
Unselect All → Column(s)
The tool will want you to acknowledge the number of records that have been removed as a result of the process.
Verifying Duplicate Values Were Removed
Below is the result of removing duplicates from the state column where unique characters remain.
If you would like to keep your original data, I would suggest copying the new data set over to another workbook followed by exiting the workbook where duplicates were removed with saving the changes.
An Alternative to Removing Duplicates
An alternative to the removing duplicates tool can be used when the data is not overwhelming. Follow these steps to filter out duplicates:
- Add a column and name it Unique.
- Select the top row & click Sort & Filter → Filter.
- Go through the records and mark a 1 in the Unique column for the records with duplicate values. So, create add a 1 for each of the sets of different data.
- Select the Unique header, deselect the blanks check box, and click on OK.
This option allows you to keep each record and sort out duplicates by adding this binomial attribute to help you sort through the data.
- How to Remove Errors in Excel With the IFERROR Function
Shows a Microsoft Excel user how to apply the IFERROR Function to remove errors in cells.
- How to Use the ABS Function in Excel
Covers how to use the ABS function in Microsoft Excel with examples on the correct syntax to use.
- How to Use the AVERAGE Function in Excel
Shows how to use the AVERAGE function in Excel with application examples.
- How to Use the COUNT Function in Excel
Shows a Microsoft Excel user how to use the count function in a spreadsheet.
City of Ferndale, Michigan. (2016, October 24). Tech248 Companies. https://data.ferndalemi.gov/datasets/6860741d40ee4ec49d95ee709ebc984e_0.csv?outSR=%7B%22latestWkid%22%3A3857%2C%22wkid%22%3A102100%7D.
Microsoft. (n.d.). Filter for unique values or remove duplicate values. https://support.microsoft.com/en-us/office/filter-for-unique-values-or-remove-duplicate-values-ccf664b0-81d6-449b-bbe1-8daaec1e83c2.
Learning More About Excel
To learn more about completing tasks like the remove duplicates process in Excel I recommend using The Excel Bible. I have been using the Excel Bible for years to improve my understanding of all aspects of Excel.
This content is accurate and true to the best of the author’s knowledge and is not meant to substitute for formal and individualized advice from a qualified professional.
© 2021 Joshua Crowder