Results 1 to 2 of 2

Thread: FindWord - Find A (Possibly Listed) Word, As A Word, Not Embedded Within Another Word

  1. #1
    Forum Guru Rick Rothstein's Avatar
    Join Date
    Feb 2012
    Rep Power

    FindWord - Find A (Possibly Listed) Word, As A Word, Not Embedded Within Another Word

    NOTE: Minor addition (highlighted in red) to the code made on October 16, 2012

    The inspiration for this article came from this blog article by Debra Dalgleish ("Find Text With INDEX and MATCH") that I found on her Contextures Blog. In that article, Debra shows a formula that identifies the location of any one word in a "list" of words from within a larger piece of text. This seemed to be a natural extension to my "InStrExact - Find Location Of A Word, As A Word, Not Embedded Within Another Word" article posted elsewhere here in Rick Rothstein's Corner. The resulting function below (which can be used as a UDF) actually incorporates the functionality of my InStrExact function completely except that I changed the way delimiters for words are handled. In the function below, the word break characters are specified directly with the user having the ability to extend the list as needed (unlike the InStrExact function which employed a more complex method to determine which characters could serve as a word break). First, here is the function, which I call FindWord...

    Function FindWord(Start As Long, SourceText As String, ByVal WordsToFind As Variant, _
                      Optional CaseSensitive As Boolean = False, _
                      Optional AdditionalWordBreaks As String) As Long
      Dim x As Long, Str1 As String, Pattern As String, WordBreaks As String, Word As Variant
      WordBreaks = "[&"" '(),./:;<>?[\_`{|}~©®«»­´¶·¿!" & Chr$(160) & "-]"
      For x = 1 To Len(AdditionalWordBreaks)
        If InStr(WordBreaks, Mid(AdditionalWordBreaks, x, 1)) Then Mid(AdditionalWordBreaks, x, 1) = " "
      WordBreaks = Replace(WordBreaks, "-]", Replace(AdditionalWordBreaks, " ", "") & "-]")
      If CaseSensitive Then
        Str1 = SourceText
        Str1 = UCase(SourceText)
      End If
      If TypeName(WordsToFind) = "String" Then WordsToFind = Split(WordsToFind, Chr$(1))
      For Each Word In WordsToFind
        If Not CaseSensitive Then Word = UCase(Word)
        For x = Start To Len(Str1) - Len(Word) + 1
          If Mid(" " & Str1 & " ", x, Len(Word) + 2) Like WordBreaks & Word & WordBreaks Or _
             Mid(" " & Str1 & " ", x, Len(Word) + 2) Like WordBreaks & Word & "[]]" Then
            FindWord = x
            Exit Function
          End If
    End Function
    This function takes three required arguments and provides for two additional optional arguments. The first argument, named Start, is required and lets you set the starting character number for the search to begin at (use 1 to start the search at the first character). The second argument, named SourceText, is required and is the text to look for the word in. The third argument, named WordsToFind, is required and is word or list of words whose position you want to locate (more about this argument in a moment). The fourth argument, named CaseSensitive, is optional and allows you to make the search case sensitive or not (the default is FALSE representing a case insensitive search). The fifth argument, named AdditionalWordBreaks, allows you to include additional characters (one or more characters) as word break characters over and above the characters I hard-coded in the function. And those characters are...

    [ Left square bracket
    ] Right square bracket
    & Ampersand
    " Quote mark
    ' Apostrophe
    ( Left Parenthesis
    ) Right Parenthesis
    , Comma
    . Dot
    / Slash
    : Colon
    ; Semi-Colon
    < Left Angle Bracket
    > Right Angle Bracket
    ? Question Mark
    \ Back Slash
    _ Underline
    ` Grave Accent
    { Left Curly Brace
    } Right Curly Brace
    | Vertical Line
    ~ Tilde
    © Copyright Symbol
    ® Registered Symbol
    « Left Double Angle Bracket
    » Right Double Angle Bracket
    ¶ Pilcrow Sign
    · Middle Dot
    ¿ Spanish Question Mark
    ! Exclamation mark
    - Dash
    Non-Breaking Space

    If you have one or more characters that is not listed above that you would like to be used as a word break character, simply pass them in as a single text string for this fifth argument. For example, let's say you wanted to find the location for the number 12.34 within this text string...

    "The value was 12.34% or thereabouts."

    Because the percent sign is not defined as a word break character, you cannot simply look for the number 12.34 like this...

    Position = FindWord(1, "The value was 12.34% or thereabouts.", "12.34")
    rather, you have to tell the function the percent sign is to be considered a word break character like this...

    Position = FindWord(1, "The value was 12.34% or thereabouts.", "12.34",,"%")
    Okay, now about that third argument (WordsToFind)... this argument can be a single word, an array of words (either using a declared array or an array produced from the Array or Split function when called from other VB code or using an array of quoted words encased in curly braces {} from a worksheet formula) or a specified range when called from a worksheet formula. So, all of the following are valid ways to specify this...

    From VB Code
    VariableContainingWord (either simple String variable or Array variable)
    Array("WordOne", "WordTwo", "WordThree", "Etc.")
    Split("WordOne,WordTwo,WordThree,Etc.", ",")

    From Excel Formula

    So, using the array constant argument in the above "From Excel Formula" section, you could specify this formula in a cell...

    =FindWord(1,"This contains WordTwo in it",{"WordOne","WordTwo","WordThree","Etc."})

    and it would return 15 as a result because one of the words in the list (WordTwo) can be found at character position 15 in the larger text string passed into the second argument.

    One final note... while the emphasis has been on single words, word or words specified for the WordsToFind argument do not have to be single words... they can be multi-word phrases possibly containing characters that would otherwise be condered word break characters. So, you could have a text string like this...

    "Her mother-in-law came to visit."

    and you could use the FindWord function to locate "mother-in-law" even though the dash is considered to be a word break character.

    If you are new to UDFs, they are easy to install and use. To install it, simply press ALT+F11 to go into the VB editor and, once there, click Insert/Module on its menu bar, then copy/paste the above code into the code window that just opened up. That's it.... you are done. You can now use NameOfTheUDF just like it was a built-in Excel function. For example, assuming A1 contains the text string and B1 contains a word to find...

    Last edited by Rick Rothstein; 10-16-2012 at 02:21 PM.

  2. #2
    Forum Guru Rick Rothstein's Avatar
    Join Date
    Feb 2012
    Rep Power
    Posting this message so that it will alert anyone subscribed to this article... see the note at the beginning of Message #1.

Similar Threads

  1. Replies: 7
    Last Post: 08-24-2015, 11:28 PM
  2. Loop through a folder and find word
    By k0st4din in forum Excel Help
    Replies: 7
    Last Post: 12-08-2012, 02:52 PM
  3. Search word in different workbooks and sheets
    By k0st4din in forum Excel Help
    Replies: 5
    Last Post: 11-29-2012, 10:40 PM
  4. How to extract all text from a word file?
    By vsrawat in forum Word Help
    Replies: 3
    Last Post: 09-25-2012, 10:54 PM
  5. Replies: 2
    Last Post: 09-25-2012, 02:00 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts