PDA

View Full Version : Find a text substring that matches a given "pattern"



Rick Rothstein
03-09-2012, 01:56 AM
If you are not familiar with VBA's Like operator, then you should become familiar with it as it is a very handy tool. Normally, one uses the Like operator to test if a particular piece of text matches a relatively simple "pattern". For example, if you wanted to know if what the user typed into a TextBox is an upper case letter followed by 6 digits, you could perform this test...


If TextBox1.Value Like "[A-Z]######" Then
' It is so do something here
End If

Okay, that is a simple example to give you an idea of how the Like operator is used (the examples can get way more complex than this). However, what if you were presented with a text string and you wanted to pull out the text that matches a certain pattern. If the pattern is exceeding complex, you might have to turn to an Regular Expression Parser add-in; however, for simple to mildly complex patterns, you might find this function (also usable as a UDF - user defined function), which uses the Like operator at its core, is all that you need...


Function GetPattern(Source As String, ByVal Pattern As String) As String
Dim X As Long, FindPattern As Long
Do Until Left(Pattern, 1) <> "*"
Pattern = Mid(Pattern, 2)
Loop
For X = 1 To Len(Source)
If Mid(Source, X) Like Pattern & "*" Then
FindPattern = X
Exit For
End If
Next
If FindPattern = 0 Then Exit Function
For X = 1 To Len(Source) - FindPattern + 1
If Mid(Source, FindPattern, X) Like Pattern Then
GetPattern = Mid(Source, FindPattern, X)
Exit For
End If
Next
End Function

This function uses the same pattern syntax for its argument that the Like operator uses for its own patterns. If you are not already familiar with the Like operator, you should look it up in the help files so you can see all the pattern possibilities available to you. To whet your appetite for it just a little bit, give this example a try. Put this text in cell A1...

uvw123xyzAB456CDefg

Then put this formula in cell B1...

=GetPattern(A1,"[A-Z][A-Z]###[A-Z]")

What the UDF GetPattern function call has done is retrieve AB456C from the text in cell A1 which, as it turns out, is the text substring matching the function's pattern argument of "2 upper case letters followed by 3 digits followed by another upper case letter" from with the full text in cell A1.

HOW TO INSTALL UDFs
------------------------------------
If you are new to UDFs, they are easy to install and use. To install it, simply press ALT+F11 to go into the VB editor and, once there, click Insert/Module on its menu bar, then copy/paste the above code into the code window that just opened up. That's it.... you are done. You can now use GetPatternGetPattern just like it was a built-in Excel function (see above for examples of usage).

EDIT NOTE (June 19, 2012): I just added the following code line to the originally posted function to stop the error condition that is returned to a UDF if the pattern does not exist within the source text (usage when called by other VB code is not affected by this change)...


If FindPattern = 0 Then Exit Function

With this change, the function, when used as a UDF, will return the empty string ("") rather than a #VALUE! error.

Rick Rothstein
06-19-2012, 10:53 PM
I am replying to this article to alert anyone subscribed to it that I edited the code in the original article... see the Edit Note in the original article for details.

aimhigh85
02-10-2013, 06:19 AM
Rick,

This UDF has already helped me tremendously for the data I am trying to comb through. Being new to all this, can you help guide me through having this look for every instance of the pattern in a cell? This works flawlessly for the first instance of my pattern but some of my data has 3-5 instances of the pattern in a cell and I would like to return all matches separated by a comma if at all possible.