PDA

View Full Version : Find the First or Last So Many Words in a Text String



Rick Rothstein
06-19-2012, 01:54 PM
Using from the formula I posted in my "Get Field from Delimited Text String (http://www.excelfox.com/forum/f22/get-field-delimited-text-string-333/)" article and the formula Haseeb Avarakkan posted in response to my "Get "Reversed" Field from Delimited Text String (http://www.excelfox.com/forum/f22/get-reversed-field-delimited-text-string-334/)" article as a base, here are formulas for finding the first N words and the last N words in a text string.

First N Words: =TRIM(LEFT(SUBSTITUTE(TRIM(A1)," ",REPT(" ",99)),N*99))

Last N Words: =TRIM(RIGHT(SUBSTITUTE(TRIM(A1)," ",REPT(" ",99)),N*99))

where N is the number of words you want to retrieve. The only caution is that the text in A1 must be less than 100 characters (that is what the 99 establishes). If your text is longer, then you would need to change the two 99's in each formula to a number equal to, or larger than, the maximum number of characters the text in A1 can have.

snb
06-19-2012, 03:22 PM
Did you consider; to return the first 4 words:


=LEFT(A1;FIND("@";SUBSTITUTE(A1;" ";"@";4)))

The N in your formula is 4 in mine (as an example)


=LEFT(A1;FIND("@";SUBSTITUTE(A1;" ";"@";N)))

The last N words:


=MID(TRIM(A1);FIND("@";SUBSTITUTE(TRIM(A1);" ";"@";1+LEN(A1)-LEN(SUBSTITUTE(TRIM(A1);" ";""))-3));LEN(A1))

Rick Rothstein
06-20-2012, 02:31 AM
Did you consider; to return the first 4 words:


=LEFT(A1;FIND("@";SUBSTITUTE(A1;" ";"@";4)))

The N in your formula is 4 in mine (as an example)


=LEFT(A1;FIND("@";SUBSTITUTE(A1;" ";"@";N)))

Nope, at 5:00AM in the morning, that method simply didn't occur to me. Thanks for reminding me of it, although you should see my note in the next section.


The N in your formula is 4 in mine (as an example)


=LEFT(A1;FIND("@";SUBSTITUTE(A1;" ";"@";N)))

This is a better formula than mine, for sure, but it is including a trailing space character. You need to subtract 1 from the FIND function's result...


=LEFT(A1,FIND("@",SUBSTITUTE(A1," ","@",3))-1)


The last N words:


=MID(TRIM(A1);FIND("@";SUBSTITUTE(TRIM(A1);" ";"@";1+LEN(A1)-LEN(SUBSTITUTE(TRIM(A1);" ";""))-3));LEN(A1))
In a similar fashion to your other formula, this one is including a leading space in its return value. However, with 6 more function calls than the formula I posted for this situation, I think I'll stay with my formula for this case.

snb
06-21-2012, 07:10 PM
You are absolutely right about the 'trailing' space. I'd prefer


=trim(LEFT(A1,FIND("@",SUBSTITUTE(A1," ","@",3))))
Correct me if I'm wrong but shouldn't you use


=TRIM(RIGHT(SUBSTITUTE(TRIM(A1)," ",REPT(" ",99)),N*99))

just to exclude a textstring that contains two adjacent spaces ?

Rick Rothstein
06-21-2012, 07:48 PM
You are absolutely right about the 'trailing' space. I'd prefer


=trim(LEFT(A1,FIND("@",SUBSTITUTE(A1," ","@",3))))
Correct me if I'm wrong but shouldn't you use


=TRIM(RIGHT(SUBSTITUTE(TRIM(A1)," ",REPT(" ",99)),N*99))

just to exclude a textstring that contains two adjacent spaces ?
Yes, that is a good catch, but unfortunately for you, it appears to mean that your suggested replacement for my first N words formula is no longer better than mine. Your formula suffers from the same problem with respect to multiple spaces that you pointed out mine does and it seems the only way to correct for them would be to include two TRIM function calls...


=LEFT(TRIM(A1),FIND("@",SUBSTITUTE(TRIM(A1)," ","@",3))-1)

I considered your originally suggested formula better than mine because it involved one less function call, but now, with the two TRIM function calls added in to control the double-space problem, your suggested formula has the same number of function calls than my original proposal with your sugggested TRIM function added in. I like the symmetry of shape for the two functions I posted originally, so I think I will stay with them. Thanks for having caught this double-space problem in the first place though, I appreciate your diligence in sticking with this thread. I will modify the formulas in my original message to include the TRIM function as soon as I send this message.

snb
06-21-2012, 09:29 PM
Well, I never suggested anything 'better'; I prefer talking in 'alternatives'. But still I think my formula hasn't the 99 words limitation your suggestion has. How to assess that (lack) of limitation is a question of personal preference in my opinion.

Rick Rothstein
06-21-2012, 09:42 PM
But still I think my formula hasn't the 99 words limitation your suggestion has.

That is a strong point in favor of your formula, no doubt (and many will probably prefer it for that reason), but, as I mentioned, I still kind of like the symmetry of contruction between the two formulas I posted (just swap out RIGHT for LEFT or vice-versa depending on which end you want to count from).