Table.Distinct is a Power Query M function that removes duplicate rows from the table, with an optional parameter to specify which columns are tested for duplication. The function returns a new table with duplicate rows removed.
Compatible with: Power BI Service Power BI Desktop Excel Microsoft 365
Syntax
Table.Distinct(
table as table,
optional equationCriteria as any,
) as table
Argument | Attribute | Description |
---|---|---|
Table | The table to remove duplicates from. | |
EquationCriteria | Optional | You can use the EquationCriteria to specify which columns to test for duplicates. The argument also allows you to specify a comparer. Options include Comparer.Ordinal for exact case-sensitive matching, Comparer.OrdinalIgnoreCase for case-insensitive matching, and Comparer.FromCulture for culture-specific comparisons. When omitted, the function removes duplicates on all columns. Alternatively, you can supply a custom function for removing duplicates. |
Description
Removes duplicate rows from the table.
An optional parameter, equationCriteria
, specifies which columns of the table are tested for duplication. If equationCriteria
is not specified, all columns are tested.
Because Power Query sometimes offloads certain operations to backend data sources (known as “folding”), and also sometimes optimizes queries by
skipping operations that aren’t strictly necessary, in general there’s no guarantee which specific duplicate will be preserved.
For example, you can’t assume that the first row with a unique set of column values will remain, and rows further down in the table will be removed.
If you want the duplicate removal to behave predictably, first buffer the table using Table.Buffer
.
Examples
Let’s see how the Table.Distinct functions works in practice. Suppose you are working with the following dataset in Power Query:
Removing Duplicates from the Entire Table
When you don’t specify a column in the second argument of Table.Distinct, the function checks for duplicates across the entire table. Here’s how you apply it:
Table.Distinct( Source)
In this example, row 9 and 10 are identical. As a result, Table.Distinct will remove the second occurrence of the duplicate row. The output would be:
In this case, the duplicate “Sofa” row is removed.
Removing Duplicates from a Single Column
If you want to remove duplicates based on a single column, you can specify the column in the second argument. For example, to remove duplicates from the Color
column:
Table.Distinct(
Source,
"Color"
)
This expression will retain the first unique values for the Color
column and remove any subsequent duplicates. The result would look like this:
In this case, only the first Red
and Blue
values are kept, and all subsequent rows with the same Color
are removed.
Important: This operation does not guarantee the same results every time, as the order in which rows are kept may vary based on the data processing. To make the results more predictable, you can wrap the table in the Table.Buffer function to ensure the data is read into memory before performing the distinct operation.
Removing Duplicates from Multiple Columns
To remove duplicates based on multiple columns, you can provide a list of column names as the second argument:
Table.Distinct(
Source,
{ "Category", "Color" }
)
In this case, duplicates will be removed based on combinations of values in both the Category
and Color
columns. The resulting table will be:
Here, only unique combinations of Category
and Color
are retained. The other rows are removed if they have the same combination of values in both columns.
Removing Duplicates with Case Insensitivity
You can also perform case-insensitive duplicate removal by providing a comparer function. For instance, to remove duplicates based on the Product
column, ignoring the case of the text:
Table.Distinct(
Source,
{ "Product", Comparer.OrdinalIgnoreCase }
)
This removes duplicates where the Product
names match, regardless of their letter casing. For more control, you can combine case-sensitive and case-insensitive comparisons across different columns. The following example removes duplicates from the Product
column case-insensitively, while keeping the Category
column case-sensitive:
Table.Distinct(
Source,
{
{ "Product", Comparer.OrdinalIgnoreCase },
{ "Category", Comparer.Ordinal }
}
)
This gives you fine-grained control over how duplicates are removed when working with mixed case or case-sensitive data. Note that the last two examples are shown for theory. The case-sensitivity does not make any difference in the example dataset shown on this page.
Full M Code
To try these examples yourself, you can use the following code:
let
Source =
#table(
type table [ Index = Int64.Type, Product = text, Category = text, Color = text ],
{
{ 1, "Laptop", "Electronics", "Red" },
{ 2, "Smartphone", "Electronics", "Blue" },
{ 3, "Tablet", "Electronics", "Red" },
{ 4, "Desk Chair", "Furniture", "Blue" },
{ 5, "Monitor", "Electronics", "Red" },
{ 6, "Dining Table", "Furniture", "Blue" },
{ 7, "Coffee Maker", "Home Goods", "Red" },
{ 8, "Blender", "Home Goods", "Blue" },
{ 9, "Sofa", "Furniture", "Red" },
{ 9, "Sofa", "Furniture", "Red" }
}
),
DistinctTable = Table.Distinct( Source ),
DistinctColumn = Table.Distinct( Source, "Color" ),
DistinctColumns = Table.Distinct( Source, { "Category", "Color" } )
in
DistinctColumns
Related articles
Learn more about Table.Distinct in the following articles:
- Removing Duplicates in Power Query (Complete Guide)
Removing duplicates in Power Query – This guide explores basic and advanced techniques and why you may be getting inconsistent results. » Read more
Related functions
Other functions related to Table.Distinct are:
- Table.AlternateRows
- Table.InsertRows
- Table.Range
- Table.RemoveFirstN
- Table.RemoveLastN
- Table.RemoveMatchingRows
- Table.RemoveRows
- Table.Repeat
- Table.Skip
2023-2024 © BI Gorilla. All rights are reserved. Information from Microsoft docs is property of Microsoft Corp. | Privacy Policy