Log in

View Full Version : Lookup First URL From Google/Bing Search Result



navpar
03-22-2019, 11:57 PM
I have a set of words I would like to web search (google or bing) from my excel and store the first result in the next column. The closest I found to something that works for me was posted by @superspike711 here: http://www.excelfox.com/forum/showthread.php/973-Lookup-First-URL-From-Google-Search-Result-Using-VBA. However, the code now throws up some errors presumably because the way Bing displays search results has changed. And @superspike711 seems to have "left the building" (it looks like he only registered to post this one file. And hasn't responded to couple of recent posts).

I am hoping someone else can help.

I have attached the file along with his/her code. Column A has the terms that need to be searched. Column B has the results of the VBA code. And Column C has the correct results if I plug in the search term in Bing.

The only change I made to the code was to change the line

If Len(StrOutput) > 50 Then StrOutput = "NOT FOUND"

to

If Len(StrOutput) > 5000 Then StrOutput = "NOT FOUND".

This was because (a) some of my search result URLs are long, and (b) it caught some of the errors better. For example, the search result for "TradeGecko, site:onelogin.com" (A4) was coming up as Not Found. With the increased cut-off, the actual value can be seen in the result cell (B4). Hopefully it helps in debugging.

Rows in green show that the output of the VBA code and reality matched (although "no record found" shows up in a couple of different ways. Please see row 6 and 7).

"No fill" rows show the errors.

also posting the code here



Option Explicit
Dim var As Variant

Sub SearchBing()

Dim rng As Range
Dim lngCol As Long
Dim oXML As New MSXML2.XMLHTTP
Dim uTxt As Variant
Dim xURL As String

Dim srchString As Variant
Dim StrStart As Variant
Dim StrEnd As Variant
Dim StrOutput As Variant
Dim x As Variant
Dim OutputRowNumber As Variant

' Clear Destination Column for URL Lookup
Worksheets("Sheet1").Range("B2:B" & Rows.Count).ClearContents


' Set base search using bing instead of google because google's pages are a mess
Const strURL = "http://www.bing.com/search?q=|+|"
' Move through source cells
For Each rng In Worksheets("Sheet1").Range("A2:A5000")
' Make sure range is not empty before we start
If Not IsEmpty(rng) Then
' encode url for direct submission
xURL = Replace(strURL, "|+|", URLEncode(rng.Value))
'Debug.Print xURL
' set object parameters
oXML.Open "GET", xURL, False
oXML.setRequestHeader "Content-Type", "text/xml"

' Delay request by wait to allow catchup and to not freakout
' search firewalls
x = x + 1
If x = 6 Then
x = 0
'Slow it down a bit
DoEvents
End If
'Debug.Print "x=" & x

' if you don't have this program will crap out if it encounters weird
' formatted page
On Error Resume Next

'Send request returns page of searchable xml text to oXML object
oXML.send
'FIRST SEARCH PRIORITY
' srchstring finds unique string in the ballpark of what you want
' add more searchs if you want other information or if you want to
' get check more accurate info from page if it is available
' this section finds "website" button that sometimes appears on the right
' of bing search results
srchString = "role=""button"" href=""http"
' extract small chunk of text so you can search again without dupes
StrStart = InStr(1, oXML.responseText, srchString) + 20
StrEnd = InStr(StrStart, oXML.responseText, 60)
StrOutput = Mid(oXML.responseText, StrStart, StrEnd - StrStart - 3) & Chr(10)

' in the smaller string you extracted above drill down on the actual string
' you want - this one finds the first instance of http in the smaller string
StrStart = InStr(1, StrOutput, "http")
' this one finds the end of the url by starting 9 after http and looks for "
StrEnd = InStr(StrStart + 9, StrOutput, """")
' use start and end to isolate the web address
StrOutput = Mid(StrOutput, StrStart, StrEnd - StrStart)



'SECOND SEARCH PRIORITY
' using if-then If the first search does not find a website button it
' returns first item from search results. Note: First result is not always
' what you want
If InStr(1, StrOutput, "www.w3.org") Then
srchString = "Search Results"
StrStart = InStr(1, oXML.responseText, srchString) + 47
StrEnd = InStr(StrStart, oXML.responseText, 60)
StrOutput = Mid(oXML.responseText, StrStart, StrEnd - StrStart - 3) & Chr(10)


StrStart = InStr(1, StrOutput, "http")
StrEnd = InStr(StrStart + 9, StrOutput, """")
StrOutput = Mid(StrOutput, StrStart, StrEnd - StrStart)

End If

'Remove results that is too long (likely some sort of error)
If Len(StrOutput) > 5000 Then StrOutput = "NOT FOUND"

'Cleanup Some of the Bad Addresses
If InStr(1, StrOutput, "bloomberg.com") Then StrOutput = "NOT FOUND"
If InStr(1, StrOutput, "dandb.com") Then StrOutput = "NOT FOUND"
If InStr(1, StrOutput, "bing.com") Then StrOutput = "NOT FOUND"
If InStr(1, StrOutput, "wikipedia.org") Then StrOutput = "NOT FOUND"
If InStr(1, StrOutput, "doubleclick.net") Then StrOutput = "NOT FOUND"
If InStr(1, StrOutput, "brightscope.com") Then StrOutput = "NOT FOUND"
If InStr(1, StrOutput, "usnews.com") Then StrOutput = "NOT FOUND"
If InStr(1, StrOutput, "linkedin.com") Then StrOutput = "NOT FOUND"
If InStr(1, StrOutput, "usnews.com") Then StrOutput = "NOT FOUND"



'add row numbers for debugging
'OutputRowNumber = OutputRowNumber + 1
'Debug.Print OutputRowNumber & ": " & StrOutput
'Debug.Print oXML.responseText

rng.Offset(, 1).Value = StrOutput

Else
Exit For
End If
Next rng
'set debugging row back to zero
'OutputRowNumber = 0
Set rng = Nothing


End Sub


Function URLEncode(EncodeStr As String) As String
Dim i As Integer
Dim erg As String

erg = EncodeStr

' *** First replace '%' chr
erg = Replace(erg, "%", Chr(1))

' *** then '+' chr
erg = Replace(erg, "+", Chr(2))

For i = 0 To 255
Select Case i
' *** Allowed 'regular' characters
Case 37, 43, 48 To 57, 65 To 90, 97 To 122

Case 1 ' *** Replace original %
erg = Replace(erg, Chr(i), "%25")

Case 2 ' *** Replace original +
erg = Replace(erg, Chr(i), "%2B")

Case 32
erg = Replace(erg, Chr(i), "+")

Case 3 To 15
erg = Replace(erg, Chr(i), "%0" & Hex(i))

Case Else
erg = Replace(erg, Chr(i), "%" & Hex(i))

End Select
Next

URLEncode = erg

End Function


Sincerely hope someone can help. I have waaayyy too many search terms :)

Best



https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA (https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA)
https://www.youtube.com/watch?v=kfQC-sQxMcw&lc=UgyCxQWypNIhG2nUn794AaABAg.9q1p6q7ah839tUQl_92m vg (https://www.youtube.com/watch?v=kfQC-sQxMcw&lc=UgyCxQWypNIhG2nUn794AaABAg.9q1p6q7ah839tUQl_92m vg)
https://www.youtube.com/watch?v=vXyMScSbhk4&lc=UgyOh-eR43LvlIJLG5p4AaABAg.9isnKJoRfbL9itPC-4uckb (https://www.youtube.com/watch?v=vXyMScSbhk4&lc=UgyOh-eR43LvlIJLG5p4AaABAg.9isnKJoRfbL9itPC-4uckb)
https://www.youtube.com/watch?v=vXyMScSbhk4&lc=Ugy1B1aQnHq2WbbucmR4AaABAg.9isY3Ezhx4j9itQLuif2 6T (https://www.youtube.com/watch?v=vXyMScSbhk4&lc=Ugy1B1aQnHq2WbbucmR4AaABAg.9isY3Ezhx4j9itQLuif2 6T)
https://www.youtube.com/watch?v=vXyMScSbhk4&lc=UgxxajSt03TX1wxh3IJ4AaABAg.9irSL7x4Moh9itTRqL7d Qh (https://www.youtube.com/watch?v=vXyMScSbhk4&lc=UgxxajSt03TX1wxh3IJ4AaABAg.9irSL7x4Moh9itTRqL7d Qh)
https://www.youtube.com/watch?v=vXyMScSbhk4&lc=Ugxa2VYHMWJWXA6QI294AaABAg.9irLgSdeU3r9itU7zdnW Hw (https://www.youtube.com/watch?v=vXyMScSbhk4&lc=Ugxa2VYHMWJWXA6QI294AaABAg.9irLgSdeU3r9itU7zdnW Hw)
https://www.youtube.com/watch?v=vXyMScSbhk4&lc=UgwJAAPbp8dhkW2X1Uh4AaABAg.9iraombnLDb9itV80HDp Xc (https://www.youtube.com/watch?v=vXyMScSbhk4&lc=UgwJAAPbp8dhkW2X1Uh4AaABAg.9iraombnLDb9itV80HDp Xc)
https://www.youtube.com/watch?v=vXyMScSbhk4&lc=UgzIzQ6MQ5kTpuLbIuB4AaABAg.9is0FSoF2Wi9itWKEvGS Sq (https://www.youtube.com/watch?v=vXyMScSbhk4&lc=UgzIzQ6MQ5kTpuLbIuB4AaABAg.9is0FSoF2Wi9itWKEvGS Sq)
https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA (https://www.youtube.com/channel/UCnxwq2aGJRbjOo_MO54oaHA)