Table.Distinct

Updated on

Table.Distinct is a Power Query M function that removes duplicate rows from the table, with an optional parameter to specify which columns are tested for duplication. The function returns a new table with duplicate rows removed.

Compatible with: Power BI Service Power BI Desktop Excel Microsoft 365

Syntax

Table.Distinct(
   table as table,
   optional equationCriteria as any,
) as table
ArgumentAttributeDescription
TableThe table to remove duplicates from.
EquationCriteriaOptional You can use the EquationCriteria to specify which columns to test for duplicates. The argument also allows you to specify a comparer. Options include Comparer.Ordinal for exact case-sensitive matching, Comparer.OrdinalIgnoreCase for case-insensitive matching, and Comparer.FromCulture for culture-specific comparisons. When omitted, the function removes duplicates on all columns. Alternatively, you can supply a custom function for removing duplicates.

Description

Removes duplicate rows from the table. An optional parameter, equationCriteria, specifies which columns of the table are tested for duplication. If equationCriteria is not specified, all columns are tested.

Because Power Query sometimes offloads certain operations to backend data sources (known as “folding”), and also sometimes optimizes queries by skipping operations that aren’t strictly necessary, in general there’s no guarantee which specific duplicate will be preserved. For example, you can’t assume that the first row with a unique set of column values will remain, and rows further down in the table will be removed. If you want the duplicate removal to behave predictably, first buffer the table using Table.Buffer.

Examples

When you don’t specify a column in the second argument, the Table.Distinct function removes duplicates from the entire table.

// Removes duplicates based on all column combinations
Table.Distinct( Source)

// Removes the 3rd row which is identical to the first one.
Table.Distinct( 
    Table.FromRecords( {
        [a = "A", b = "a"],
        [a = "B", b = "b"],
        [a = "A", b = "a"]
    } )
 )

You can remove duplicate rows from a single column by providing the column name in the second argument.

// Removes duplicates from the [Fruit] Column in the Source table.
= Table.Distinct( Source, { "Fruit" } )

// Removes the 3rd row, which has a similar fruit as the 1st one.
Table.Distinct( 
    Table.FromRecords( {
        [Id = "2201", Fruit = "Banana"],
        [Id = "2202", Fruit = "Apple"],
        [Id = "2203", Fruit = "Banana"]
    } ),
    "Fruit"
 )

To remove duplicates from multiple columns in your table you can use the second argument. Provide the desired columns as text within a list.

// Removes duplicates combinations from the Fruit and Description Column.
= Table.Distinct(
     Source, 
     { "Fruit", "Description"} )

You can remove duplicates case-insensitive by providing a comparer function as equation criteria.

// Removes duplicates case-insensitive based on the "Fruit" Column
= Table.Distinct(
   Source, 
   { "Fruit", Comparer.OrdinalIgnoreCase } )

// Removes duplicates:
// -- ignoring case on the "Fruit" Column
// -- respecting case on the "Id" Column

= Table.Distinct(
     Source, 
     { { "Fruit", Comparer.OrdinalIgnoreCase },
       { "Id",    Comparer.Ordinal } } )

Learn more about Table.Distinct in the following articles:

Other functions related to Table.Distinct are:

Contribute » | Contributors: Rick de Groot
Microsoft documentation: https://learn.microsoft.com/en-us/powerquery-m/table-distinct