Splitter.SplitTextByCharacterTransition

Updated on

Splitter.SplitTextByCharacterTransition is a Power Query M function that splits text into a list based on transitions between different character types. The function returns a list of text segments after applying the transition-based splitting.

Compatible with: Power BI Service Power BI Desktop Excel Microsoft 365

Syntax

Splitter.SplitTextByCharacterTransition(
   before as any,
   after as any,
) as function

Description

The Splitter.SplitTextByCharacterTransition function is designed to split text into a list based on transitions between various character types. In simple terms, this function allows you to split a text wherever there’s a shift from one type of character to another, like moving from a letter to a number.

The before and after parameters can either be a list of characters, or a function that takes a character and returns true/false.

Examples

Let’s look at a few examples to get a better understanding of the Splitter.SplitTextByCharacterTransition function.

Transition from Letter to Digit

Suppose you have a text “Abc123” and you want to split it wherever a letter transitions to a number.

// Output: {"Abc", "123"}
Splitter.SplitTextByCharacterTransition( {"A".."Z", "a".."z"}, {"0".."9"} )( "Abc123" )

Here, the function splits the input right where the letter “c” meets the number “1”.

From Lowercase to Uppercase

Imagine a scenario where you have the word “ServiceQuantity”. To make sense of it, you’d like to split it wherever a lowercase letter meets an uppercase one.

// Output: { "Service", "Quantity" }
Splitter.SplitTextByCharacterTransition({"a".."z"}, {"A".."Z"} )("ServiceQuantity")

As you’d expect, the word splits right between “Service” and “Quantity”. The function returns a list with the split values as output.

Real-World Application

Consider a dataset named ‘Source’. The ‘Attribute’ column in this dataset combines a category and a descriptive attribute. Now, if you’re aiming for clarity, you’d want these two elements in separate columns.

Splitter.SplitTextByCharacterTransition dataset

Here’s where the magic happens:

You can identify a recurring pattern in the string: a transition from a lowercase letter to an uppercase one. And, as you might have guessed, the Splitter.SplitTextByCharacterTransition function is a good tool for this.

Want to go a step further? To get a table with separate columns, all you need to do is feed the splitter function to the Table.SplitColumn function:

Table.SplitColumn(
  Source, 
  "Attribute", 
  Splitter.SplitTextByCharacterTransition({"a".."z"}, {"A".."Z"} ), 
  {"Category", "Attribute"}
)
Splitter.SplitTextByCharacterTransition function to split text by each transition to a capital

Once this is executed, you’ll end up with two distinct columns named “Category” and “Attribute”, making your data much easier to interpret.

In summary, the Splitter.SplitTextByCharacterTransition function makes splitting data very easy. So, the next time you’re stuck with mixed texts, you know which function can come to the rescue!

Used by

While the Splitter.SplitTextByCharacterTransition function can be used on its own, it is also used by the following functions:

Other functions related to Splitter.SplitTextByCharacterTransition are:

BI Gorilla Blog

Contribute » | Contributors: Rick de Groot
Microsoft documentation: https://learn.microsoft.com/en-us/powerquery-m/splitter-splittextbycharactertransition