Thread: Notes tests, Scrapping, YouTube

  1. #1
    DocAElstein
    This post, #102 19741
    https://excelfox.com/forum/showthread.php/2840-Notes-tests-Scrapping-YouTube/page11#post19741
    https://excelfox.com/forum/showthread.php/2840-Notes-tests-Scrapping-YouTube?p=19741&viewfull=1#post19741


    Second attempt

    Previously, in my first attempt ( https://eileenslounge.com/viewtopic....303644#p303644 https://excelfox.com/forum/showthrea...YouTube/page10 ), I looked at all the videos under Videos from the channel So geht YouTube ( https://www.youtube.com/@SogehtYouTube ).
    In this second attempt I will look at another playlist from the same channel, Popular videos.
    I note that the number of videos is exactly the same as in Videos, so they may be the same videos … we will see.
    ( Some of this will be repeated text from the first attempt, with bits added. It is all a bit mixed up, but these are just my own rough notes for later reference. )
    The previous story ( https://excelfox.com/forum/showthrea...ll=1#post19701 ):
    _ In the long playlist I looked at, it seems you only get a text file with all the stuff I want for a bit more than 75 videos at a time. This makes sense and ties up with what you see when viewing manually in real time: the scroll box only goes up to, on average, a bit over the first 75 videos.


    Scraping that, or rather, playing around with the text file made from the page source text of this
    Code:
     https://www.youtube.com/watch?v=rM-CtC6cklI&list=UULFwInqvNXb-GN0JHdtoul_9A    '  --  main play list link
    gives links of this form
    https://www.youtube.com/watch?v=rM-Ct...oul_9A&index=1
    https://www.youtube.com/watch?v=YsnmN...oul_9A&index=2
    https://www.youtube.com/watch?v=KIx_8...oul_9A&index=3

    … up to about &index=79
    If you want the next chunk of videos, and a new text file of it all, you have to click on a video towards the bottom ( https://i.postimg.cc/65L3ydNF/Click-...t-next-lot.jpg ). I thought I would keep things in some organized order, so I tried getting all the text into a text file from each of these 9 links, the ones ending with &index=1, &index=76, &index=151, &index=226, … 301, 376, 451, 526, 601.
    That sort of worked … eventually.
    _ I end up with 9 big text files to play with. So that is sort of Part 1: I now have all the info I need, somewhere, I expect, in those files ( https://i.postimg.cc/R06JWCxf/9-Big-...text-files.jpg ).
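
    The 9 link endings above follow a simple pattern: start at 1 and step by 75. A minimal sketch of that arithmetic is below. The Sub name and the two constants are my own illustration and are not part of the original macro. It only lists the &index= endings; the video ID ( the v= part ) is different in every link, so each full link still has to be copied by hand from the playlist.

```vba
Sub ListChunkStartIndices()
    ' Hypothetical helper, for illustration only
    Const ChunkSize As Long = 75   ' Assumed step between the hand picked links
    Const ChunkCount As Long = 9   ' Number of text files collected
    Dim i As Long
    For i = 0 To ChunkCount - 1
        ' Writes &index=1, &index=76, &index=151 ... &index=601 to the Immediate window
        Debug.Print "&index=" & (1 + i * ChunkSize)
    Next i
End Sub
```

    Since each page source seems to hold a bit more than 75 videos, a step of 75 leaves a small overlap between neighbouring files, so duplicates may need removing later.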


    _ A small snag: previously, using the main link, https://www.youtube.com/watch?v=rM-Ct...-GN0JHdtoul_9A , I got the first 79 links together with the index number, which is not essential but useful to have. But if I use a link with the extra &index=123 on the end, I can't find or get the index number from those 9 text files. It could be hidden in there somewhere. I can't see it initially. Maybe later.
    No matter, it is not so important.
    _ ( I am initially using a hybrid of the codings suggested by Yasser and SpeakEasy to get those. So the
    Object "MSXML2.ServerXMLHTTP"
    and the
    .setRequestHeader "User-Agent", "Chrome"
    Maybe that's a sort of "belt and braces" approach? I don't know. I have not yet had the time to look in great detail at the differences between the three files. The hybrid comes out the smallest of the three.
    ( https://i.postimg.cc/MK5Q4rYc/Hybrid...-text-file.jpg ) )
    …………………………

    The new second story:
    This time I will loop through all 9 links to get the text files, instead of hard coding 9 times as I did last time ( but I will initially run the macro from the VB Editor in F8 step mode, as usual ).

    Here are the files:

    Coding to get those 9 text files
    Code:
    Sub WieGehtsYouTubeURLServerChromeHybridStep75_2()   '     https://eileenslounge.com/viewtopic.php?p=303644#p303644   https://excelfox.com/forum/showthread.php/2840-Notes-tests-Scrapping-YouTube/page11#post19741  https://excelfox.com/forum/showthread.php/2656-Automated-Search-Results-Returning-Nothing            https://excelfox.com/forum/showthread.php/973-Lookup-First-URL-From-Google-Search-Result-Using-VBA
     On Error GoTo Bed
        '_1 First section get the long text string of the HTML coding of the internet Page
        '_1(i) get the long single text string
    Dim strURLs As String: Let strURLs = "https://www.youtube.com/watch?v=l8TYMHlqlLM&list=UULPwInqvNXb-GN0JHdtoul_9A&index=1" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=h15o6YLzfqc&list=UULPwInqvNXb-GN0JHdtoul_9A&index=76" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=cYJxctyMO2s&list=UULPwInqvNXb-GN0JHdtoul_9A&index=151" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=dcQNQP9i_WE&list=UULPwInqvNXb-GN0JHdtoul_9A&index=226" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=FDg34qCE8-Y&list=UULPwInqvNXb-GN0JHdtoul_9A&index=301" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=t_Xuqu6Rw2Q&list=UULPwInqvNXb-GN0JHdtoul_9A&index=376" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=5DkjHTTqIPc&list=UULPwInqvNXb-GN0JHdtoul_9A&index=451" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=1tT7m5qAR4o&list=UULPwInqvNXb-GN0JHdtoul_9A&index=526" & vbCr & vbLf & _
    "https://www.youtube.com/watch?v=g9kOyaXsJlk&list=UULPwInqvNXb-GN0JHdtoul_9A&index=601"
    Dim URLs() As String: Let URLs() = Split(strURLs, vbCr & vbLf, 9, vbBinaryCompare)
    Dim Cnt As Long
        For Cnt = LBound(URLs()) To UBound(URLs())
        Dim strURL As String, Indx As String
         Let strURL = URLs(Cnt)
         Let Indx = Right(strURL, Len(strURL) - InStrRev(strURL, "&", -1, vbBinaryCompare))
         Let Indx = Replace(Indx, "=", "_", 1, 1, vbBinaryCompare)
                With CreateObject("MSXML2.ServerXMLHTTP")
                 .Open "GET", strURL, False ' '
                 '.Open "GET", "", False ' '
                 'No extra info here for type GET
                 '.setRequestHeader bstrheader:="Ploppy", bstrvalue:="PooH" ' YOU MAY NEED TO TAKE OUT THIS LINE
                 '.setRequestHeader bstrheader:="If-Modified-Since", bstrvalue:="Sat, 1 Jan 2000 00:00:00 GMT" '  https://www.autohotkey.com/boards/viewtopic.php?t=9554  ---  The contents of the URL page may be cached, which means that if you request the same URL more than once, you always get the same responseText even if the website changes its text every time. This line is a workaround: it sets a cache related header.
                 .setRequestHeader "User-Agent", "Chrome"  '  https://eileenslounge.com/viewtopic.php?p=303639#p303639
                 .send ' varBody:= ' No extra info for type GET. .send actually makes the request
                    While .readyState <> 4: DoEvents: Wend ' Allow other processes to run while the web page loads. Only really needed for the asynchronous ( True ) option of .Open; with False, .send does not return until the response is complete
                Dim PageSrc As String: Let PageSrc = .responseText ' Save the HTML code in the variable. ': Range("P1").Value = PageSrc 'For me for a print out copy to text file etc.    The responseText property returns the response to the request set up by the Open method, as a text string
                End With
            '_1(ii)  Optional section to put the text string into a text file, for ease of code development
            Dim FileNum2 As Long: Let FileNum2 = FreeFile(0)                                  ' https://msdn.microsoft.com/en-us/vba/language-reference-vba/articles/freefile-function
            Dim PathAndFileName2 As String
             Let PathAndFileName2 = ThisWorkbook.Path & "\" & "SecondAttemptPopular\" & "WieGehtsYouTubePopularServerChrome" & Indx & ".txt" ' Gives file names like  WieGehtsYouTubePopularServerChromeindex_1.txt   The folder  SecondAttemptPopular  must already exist next to the workbook, otherwise the  Open  below errors with "Path not found"
            Open PathAndFileName2 For Output As #FileNum2 ' ' The text file will be made if not there, and if it is there and already contains data, then the data will be overwritten
             Print #FileNum2, PageSrc '
             Close #FileNum2
        Next Cnt
    Exit Sub  '  Normal code end in the case of no errors
    Bed:
     MsgBox prompt:=Err.Number & ":  " & Err.Description: Debug.Print Err.Number & ":  " & Err.Description
    End Sub   ' Code end in the case of any error
    '    Dim sTitle As String
    '     Let sTitle = Split(Split(PageSrc, """title"":{""runs"":[{""text"":""")(1), """}]}")(0)
    '
    '    Dim sViews As String
    '     Let sViews = Split(Split(PageSrc, """shortViewCount"":{""simpleText"":""")(1), """}}}")(0)
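
    The commented out lines above pull out a title and a view count with a nested Split. That form errors ( subscript out of range ) if the start marker is not in the text. A minimal sketch of a more forgiving version is below. The function name TextBetween and its arguments are my own and hypothetical; the marker strings in the usage lines are just the ones from the commented out lines, and I have not checked that they are present in every one of the 9 text files.

```vba
Function TextBetween(ByVal Source As String, ByVal StartMark As String, ByVal EndMark As String) As String
    ' Hypothetical helper: returns the text between the first StartMark and the
    ' next EndMark after it, or "" if either marker is not found
    Dim StartPos As Long, EndPos As Long
    StartPos = InStr(1, Source, StartMark, vbBinaryCompare)
    If StartPos = 0 Then Exit Function          ' Start marker missing, so return ""
    StartPos = StartPos + Len(StartMark)        ' Move to just after the start marker
    EndPos = InStr(StartPos, Source, EndMark, vbBinaryCompare)
    If EndPos = 0 Then Exit Function            ' End marker missing, so return ""
    TextBetween = Mid$(Source, StartPos, EndPos - StartPos)
End Function

Sub TestTextBetween()
    ' Small made up test string, not real page source
    Dim Txt As String
    Txt = "xx""shortViewCount"":{""simpleText"":""1.2K""}}}yy"
    Debug.Print TextBetween(Txt, """shortViewCount"":{""simpleText"":""", """}}}")
    Debug.Print Len(TextBetween(Txt, "NotThere", """}}}"))
End Sub
```

    With PageSrc filled as in the macro, the title would then be something like  sTitle = TextBetween(PageSrc, """title"":{""runs"":[{""text"":""", """}]}")  and an empty result just means the markers were not found.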
    
    Last edited by DocAElstein; 02-19-2023 at 02:26 AM.
