PDA

View Full Version : Split Function That Ignores Delimiters Located Inside Quote Marks



Rick Rothstein
09-15-2012, 06:25 AM
NOTE
-------
I have updated the code in this message on Sept. 21, 2012 to reflect an excellent observation by snb in the messages he posted to this thread. Thanks snb!

Earlier today, Rasm started a thread (VBA - Split command (http://www.excelfox.com/forum/f2/vba-split-command-558/)) where he asked for the ability to use the Split function, but have it not split text at delimiters inside quote marks. An example would be this comma-delimited text string...

One,Two,"Three,Four",Five

where the returned array (call it Arr for this example) should be this...

Arr(0) ==> One
Arr(1) ==> Two
Arr(2) ==> "Three,Four"
Arr(3) ==> Five

Of course, VBA's built-in Split function does not do this... it will split apart the text in Arr(2) blindly ignoring the quote marks. Now, while this question has only just been asked in this forum, it is any old problem having roots going way back. Many, many years ago back when I was volunteering answering questions on the old Compiled VB Newsgroups, I posted a function that did what Rasm has asked for. Below is that function, but to which I have added some additional functionality.


Function QuotedSplit(ByVal TextToSplit As String, _
Optional Delimiter As String = ",", _
Optional RemoveDuplicates As Boolean = False, _
Optional Compare As VbCompareMethod = vbBinaryCompare) As String()
Dim X As Long, QuoteDelimited() As String, WorkingArray() As String
QuoteDelimited = Split(TextToSplit, """")
For X = 0 To UBound(QuoteDelimited) Step 2
QuoteDelimited(X) = Replace$(QuoteDelimited(X), Delimiter, Chr$(1), , , Compare)
Next
TextToSplit = Join(QuoteDelimited, """")
If RemoveDuplicates Then
Do While InStr(1, TextToSplit, Chr(1) & Chr(1), Compare)
TextToSplit = Replace$(TextToSplit, Chr(1) & Chr(1), Chr(1), , , Compare)
Loop
End If
WorkingArray = Split(TextToSplit, Chr$(1))
QuotedSplit = WorkingArray
End Function

The above function has four arguments... the first is the text that you want to split apart and is, of course, required; the second, the text you want to use as the delimiter is optional and is defaulted to a comma, but which can be any text string composed of one or more characters; the third is a Boolean which allows you to specify how consecutive delimiters are to be treated, where the default is False meaning the each delimited field will be returned even if that delimited field is the empty string ("") and where True will make the function treat consecutive delimiters as if they were a single delimiter; and the fourth is an optional built-in data type (VbCompareMethod - actually, a restricted Long underneath it all) which, if the delimiter has alphabetic characters in it, allows you to make the split case sensitive or not... the default is vbBinaryCompare (0) meaning the text must match the case of the delimiter you pass in exactly, set it to vbTextCompare (1) if you want the split to be performed with regard for the letter casing of the delimiter.

Oh, and I guess I should mention... this function, like the Split function which it is modeled after, returns a zero-based one-dimensional String array. Okay, that is it... try playing around with the various options so you can get a "feel" for what they do and how they interact with each other.

snb
09-17-2012, 12:42 AM
Just to illustrate an approach:


Sub snb()
c00 = "One,Two,""Three,Four"",Five"

sp = Split(c00, Chr(34))
For j = 0 To UBound(sp) Step 2
sp(j) = Replace(sp(j), ",", "_")
Next
sn = Split(Join(sp, Chr(34)), "_")

MsgBox Join(sn, vbLf)
End Sub


https://www.youtube.com/@alanelston2330/featured (https://www.youtube.com/@alanelston2330/featured)


https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA (https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA)
https://www.reddit.com/r/windowsxp/comments/pexq9q/comment/k81ybvj/?utm_source=reddit&utm_medium=web2x&context=3 (https://www.reddit.com/r/windowsxp/comments/pexq9q/comment/k81ybvj/?utm_source=reddit&utm_medium=web2x&context=3)
https://www.youtube.com/watch?v=bs-urI_o8jo&lc=UgxYgiEZuS9I3xkjJv54AaABAg (https://www.youtube.com/watch?v=bs-urI_o8jo&lc=UgxYgiEZuS9I3xkjJv54AaABAg)
https://www.youtube.com/watch?v=bs-urI_o8jo&lc=UgyBACXgNY4j_cHgH5J4AaABAg.9oTkVdzfqfm9wlhQrYJP 3M (https://www.youtube.com/watch?v=bs-urI_o8jo&lc=UgyBACXgNY4j_cHgH5J4AaABAg.9oTkVdzfqfm9wlhQrYJP 3M)
ttps://www.youtube.com/watch?v=LP9fz2DCMBE (ttps://www.youtube.com/watch?v=LP9fz2DCMBE)
https://www.youtube.com/watch?v=LP9fz2DCMBE&lc=UgzbPgJUMCztIOQDym14AaABAg (https://www.youtube.com/watch?v=LP9fz2DCMBE&lc=UgzbPgJUMCztIOQDym14AaABAg)
https://www.youtube.com/watch?v=LP9fz2DCMBE&lc=UgzbPgJUMCztIOQDym14AaABAg.9wdo_rWgxSH9wdpcYqrv p8 (https://www.youtube.com/watch?v=LP9fz2DCMBE&lc=UgzbPgJUMCztIOQDym14AaABAg.9wdo_rWgxSH9wdpcYqrv p8)
ttps://www.youtube.com/watch?v=bFxnXH4-L1A (ttps://www.youtube.com/watch?v=bFxnXH4-L1A)
https://www.youtube.com/watch?v=bFxnXH4-L1A&lc=UgxuODisjo6cvom7O-B4AaABAg.9w_AeS3JiK09wdi2XviwLG (https://www.youtube.com/watch?v=bFxnXH4-L1A&lc=UgxuODisjo6cvom7O-B4AaABAg.9w_AeS3JiK09wdi2XviwLG)
https://www.youtube.com/watch?v=bFxnXH4-L1A&lc=UgxBU39bTptFznDC1PJ4AaABAg (https://www.youtube.com/watch?v=bFxnXH4-L1A&lc=UgxBU39bTptFznDC1PJ4AaABAg)
ttps://www.youtube.com/watch?v=GqzeFYWjTxI (ttps://www.youtube.com/watch?v=GqzeFYWjTxI)
https://www.youtube.com/watch?v=GqzeFYWjTxI&lc=UgwJnJDJ5JT8hFvibt14AaABAg (https://www.youtube.com/watch?v=GqzeFYWjTxI&lc=UgwJnJDJ5JT8hFvibt14AaABAg)
https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA (https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA)

Rick Rothstein
09-17-2012, 12:59 AM
Just to illustrate an approach:


Sub snb()
c00 = "One,Two,""Three,Four"",Five"

sp = Split(c00, Chr(34))
For j = 0 To UBound(sp) Step 2
sp(j) = Replace(sp(j), ",", "_")
Next
sn = Split(Join(sp, Chr(34)), "_")

MsgBox Join(sn, vbLf)
End Sub

I am not entirely sure why you posted this as its layout is pretty much exactly that of the code I posted, except it omits the additional functionality that I provided for in my code. The word "an" which I highlighted above seems to suggest you think your code is different from mine in some way... have I taken your meaning and intentions wrongly here?


https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA (https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA)
https://www.youtube.com/watch?v=SIDLFRkUEIo&lc=UgzTF5vvB67Zbfs9qvx4AaABAg (https://www.youtube.com/watch?v=SIDLFRkUEIo&lc=UgzTF5vvB67Zbfs9qvx4AaABAg)
https://www.youtube.com/watch?v=v_1iqtOnUMg&lc=UgxLtKj969oiIu7zNb94AaABAg (https://www.youtube.com/watch?v=v_1iqtOnUMg&lc=UgxLtKj969oiIu7zNb94AaABAg)
https://www.youtube.com/watch?v=f7xZivqLZxc&lc=Ugxq4JHRza_zx3sz0fx4AaABAg (https://www.youtube.com/watch?v=f7xZivqLZxc&lc=Ugxq4JHRza_zx3sz0fx4AaABAg)
https://www.youtube.com/watch?v=f7xZivqLZxc&lc=Ugxq4JHRza_zx3sz0fx4AaABAg (https://www.youtube.com/watch?v=f7xZivqLZxc&lc=Ugxq4JHRza_zx3sz0fx4AaABAg)
https://www.youtube.com/watch?v=f7xZivqLZxc&lc=UgzMCQUIQgrbec400jl4AaABAg (https://www.youtube.com/watch?v=f7xZivqLZxc&lc=UgzMCQUIQgrbec400jl4AaABAg)
https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=Ugz0Uy2bCSCTb1W-0_14AaABAg (https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=Ugz0Uy2bCSCTb1W-0_14AaABAg)
https://www.youtube.com/watch?v=ITI1HaFeq_g&lc=Ugx12mI-a39T41NaZ8F4AaABAg.9iDQqIP56NV9iFD0AkeeJG (https://www.youtube.com/watch?v=ITI1HaFeq_g&lc=Ugx12mI-a39T41NaZ8F4AaABAg.9iDQqIP56NV9iFD0AkeeJG)
https://www.youtube.com/watch?v=vXyMScSbhk4&lc=Ugxa2VYHMWJWXA6QI294AaABAg.9irLgSdeU3r9itU7zdnW Hw (https://www.youtube.com/watch?v=vXyMScSbhk4&lc=Ugxa2VYHMWJWXA6QI294AaABAg.9irLgSdeU3r9itU7zdnW Hw)
https://www.youtube.com/watch?v=tPRv-ATUBe4&lc=UgzFkoI0n_BxwnwVMcZ4AaABAg (https://www.youtube.com/watch?v=tPRv-ATUBe4&lc=UgzFkoI0n_BxwnwVMcZ4AaABAg)
https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=Ugz0Uy2bCSCTb1W-0_14AaABAg.9htChVuaX9W9htG01cKBzX (https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=Ugz0Uy2bCSCTb1W-0_14AaABAg.9htChVuaX9W9htG01cKBzX)
https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=Ugw6UrV69zpeKvLOeOV4AaABAg.9ht16tzryC49htJ6TpIO XR (https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=Ugw6UrV69zpeKvLOeOV4AaABAg.9ht16tzryC49htJ6TpIO XR)
https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=UgwMKwGZpDjv7vi7pCx4AaABAg (https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=UgwMKwGZpDjv7vi7pCx4AaABAg)
https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=Ugw6UrV69zpeKvLOeOV4AaABAg.9ht16tzryC49htOKs4jh 3M (https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=Ugw6UrV69zpeKvLOeOV4AaABAg.9ht16tzryC49htOKs4jh 3M)
https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=UgxVW-am20rQ5GFuJ9F4AaABAg (https://www.youtube.com/watch?v=LuAipOW8BNQ&lc=UgxVW-am20rQ5GFuJ9F4AaABAg)
https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA (https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA)

snb
09-17-2012, 01:26 AM
I had no intention to present an alternative approach to all the functionality you provided.
As far as I can see the main difference is in which part to replace the delimiter: in your code you replace 'inside' the quotation marks; I do that 'outside' the quotation marks.
You use 2 different replacers: chr(0) & chr(1); I only use 1
That saves some additional replacements.

If you think my post confuses more that it enlightens, please feel completely free to remove it. :) :)




https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA (https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA)


https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA (https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA)
https://www.youtube.com/watch?v=zHJPliWS9FQ&lc=Ugz39PGfytiMUCmTPTl4AaABAg.91d_Pbzklsp9zfGbIr8h gW (https://www.youtube.com/watch?v=zHJPliWS9FQ&lc=Ugz39PGfytiMUCmTPTl4AaABAg.91d_Pbzklsp9zfGbIr8h gW)
https://www.youtube.com/watch?v=zHJPliWS9FQ&lc=UgwbcybM8fXnaIK-Y3B4AaABAg.97WIeYeaIeh9zfsJvc21iq (https://www.youtube.com/watch?v=zHJPliWS9FQ&lc=UgwbcybM8fXnaIK-Y3B4AaABAg.97WIeYeaIeh9zfsJvc21iq)
https://www.youtube.com/watch?v=vSjTzhoJFdk&lc=UgzTC8V4jCzDHbmfCHF4AaABAg.9zaUSUoUUYs9zciSZa95 9d (https://www.youtube.com/watch?v=vSjTzhoJFdk&lc=UgzTC8V4jCzDHbmfCHF4AaABAg.9zaUSUoUUYs9zciSZa95 9d)
https://www.youtube.com/watch?v=vSjTzhoJFdk&lc=UgzTC8V4jCzDHbmfCHF4AaABAg.9zaUSUoUUYs9zckCo1tv PO (https://www.youtube.com/watch?v=vSjTzhoJFdk&lc=UgzTC8V4jCzDHbmfCHF4AaABAg.9zaUSUoUUYs9zckCo1tv PO)
https://www.youtube.com/watch?v=vSjTzhoJFdk&lc=UgwMsgdKKlhr2YPpxXl4AaABAg (https://www.youtube.com/watch?v=vSjTzhoJFdk&lc=UgwMsgdKKlhr2YPpxXl4AaABAg)
https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgwTUdEgR4bdt6crKXF4AaABAg.9xmkXGSciKJ9xonTti2s Ix (https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgwTUdEgR4bdt6crKXF4AaABAg.9xmkXGSciKJ9xonTti2s Ix)
https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgwWw16qBFX39JCRRm54AaABAg.9xnskBhPnmb9xoq3mGxu _b (https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgwWw16qBFX39JCRRm54AaABAg.9xnskBhPnmb9xoq3mGxu _b)
https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgzgWvzV-kvC4TJ8O414AaABAg.9xnFzCj8HRM9xon1p2ImxO (https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgzgWvzV-kvC4TJ8O414AaABAg.9xnFzCj8HRM9xon1p2ImxO)
https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgybZfNJd3l4FokX3cV4AaABAg.9xm_ufqOILb9xooIlv5P LY (https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgybZfNJd3l4FokX3cV4AaABAg.9xm_ufqOILb9xooIlv5P LY)
https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgzgWvzV-kvC4TJ8O414AaABAg.9xnFzCj8HRM9y38bzbSqaG (https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgzgWvzV-kvC4TJ8O414AaABAg.9xnFzCj8HRM9y38bzbSqaG)
https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgyWm8nL7syjhiHtpBF4AaABAg.9xmt8i0IsEr9y3FT9Y9F eM (https://www.youtube.com/watch?v=XQAIYCT4f8Q&lc=UgyWm8nL7syjhiHtpBF4AaABAg.9xmt8i0IsEr9y3FT9Y9F eM)
https://www.youtube.com/watch?v=jTmVtPHtiTg&lc=Ugy_RiNN_kAqUvZ8W994AaABAg.9xhyRrsUUOM9xpn-GDkL3o (https://www.youtube.com/watch?v=jTmVtPHtiTg&lc=Ugy_RiNN_kAqUvZ8W994AaABAg.9xhyRrsUUOM9xpn-GDkL3o)
https://www.youtube.com/watch?v=jTmVtPHtiTg&lc=Ugy_RiNN_kAqUvZ8W994AaABAg (https://www.youtube.com/watch?v=jTmVtPHtiTg&lc=Ugy_RiNN_kAqUvZ8W994AaABAg)
https://www.youtube.com/watch?v=jTmVtPHtiTg&lc=UgzlC5LBazG6SMDP4nl4AaABAg (https://www.youtube.com/watch?v=jTmVtPHtiTg&lc=UgzlC5LBazG6SMDP4nl4AaABAg)
https://www.youtube.com/watch?v=jTmVtPHtiTg&lc=UgzlC5LBazG6SMDP4nl4AaABAg.9zYoeePv8sZ9zYqog9KZ 5B (https://www.youtube.com/watch?v=jTmVtPHtiTg&lc=UgzlC5LBazG6SMDP4nl4AaABAg.9zYoeePv8sZ9zYqog9KZ 5B)
https://www.youtube.com/watch?v=jTmVtPHtiTg&lc=Ugy_RiNN_kAqUvZ8W994AaABAg.9xhyRrsUUOM9zYlZPKdO pm (https://www.youtube.com/watch?v=jTmVtPHtiTg&lc=Ugy_RiNN_kAqUvZ8W994AaABAg.9xhyRrsUUOM9zYlZPKdO pm)
https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA (https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA)

Rick Rothstein
09-21-2012, 10:27 PM
I had no intention to present an alternative approach to all the functionality you provided.
As far as I can see the main difference is in which part to replace the delimiter: in your code you replace 'inside' the quotation marks; I do that 'outside' the quotation marks.
You use 2 different replacers: chr(0) & chr(1); I only use 1
That saves some additional replacements.

Sorry about the delay in getting back to you on this. The part I highlighted in red is an excellent observation... when I first developed the code (many years ago now), my thought was I had to actively protect the text inside the quotes, but as your postings indicate... I do not have to do it that way... rather, simply changing the non-quoted delimiters and then directly manipulating those replacements makes the code tighter and more efficient. Thank you for your incites on this... much appreciated. By the way, I have changed the code in my original posting to reflect the changes you suggested.