Results 1 to 10 of 193

Thread: Appendix Thread 2. ( Codes for other Threads, HTML Tables, etc.)

Hybrid View

Previous Post Previous Post   Next Post Next Post
  1. #1
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    10,457
    Rep Power
    10

    Coding for Update Lists

    Main Routine in support of these Threads Part 1
    http://www.excelfox.com/forum/showth...0893#post10893
    http://www.eileenslounge.com/viewtopic.php?f=21&t=31572

    The coding is split into 2 parts to fil it into a Forum Post. But this and the coding in the next post form a single routine. That forms the main routine. In addition, a routine Called by the Main routine is required, Public Sub GetElemsText( ) , which is posted in the over next post.

    Code:
    Option Explicit
    Sub EP() '     http://www.excelforum.com/showthread.php?t=1148621&page=7&p=4452110&highlight=#post4452110
    Rem 1)File Info
    'Dim wsLkUp As Worksheet: Set wsLkUp = ThisWorkbook.Worksheets("Tabelle1"): wsLkUp.Activate
    Dim strURL As String ' File with Page ' file:///G:/Excel0202015Jan2016/OffenFragensForums/eileenslounge/XP/Updates/report.html
    Let strURL = ThisWorkbook.Path & "\Updates\" & "report.html" ' '"http://www.ernaehrung.de/lebensmittel/de/W233000/Fleischkaese.php" ' "http://www.ernaehrung.de/lebensmittel/de/W233000/PloppyPooFukYou"
    ' Application.Wait Now + TimeValue("00:00:02") '
    Rem 2) '
    '2a  xmlHTTP stuff  MSXML2.XMLHTTP.6.0     IXMLHTTPRequest  Alan: "simple xml request here, so you could give URL a simple File of the HTML code"       'Dim Request As Object: Set Request = CreateObject("MSXML2.XMLHTTP") 'Late Inding                     https://msdn.microsoft.com/en-us/library/ms759148(v=vs.85).aspx
    Dim request As MSXML2.XMLHTTP:   Set request = New MSXML2.XMLHTTP 'Early Binding Requires --- TOOLS --- REFERENCES -- tick Microsoft XML, v6.0    http://www.mrexcel.com/forum/excel-questions/759592-help-createobject-msxml2-xmlhttp-macro.html
    'Application.Cursor = xlWait'cursor disable..just to be on the safe side???
       With request '(or With CreateObject("msxml2.xmlhttp"))'By virtue of GET this is a simplified "xml" request
        .Open bstrmethod:="GET", bstrURL:=strURL, varasync:=True '  ("GET", strURL, True) 'just preparing the request type, how and what type. The second argument determines type. This may then require further info in next lines     Only diferrence to pike's and Kyle's opening and sending stuff is argument:- Leith: "The True/False argument of the HTTP Request is the Asynchronous mode flag. If set False then control is immediately returns to VBA after Send is executed. If set True then control is returned to VBA after the server has sent back a response. I prefer to use asynchronous mode and test if my timeout period has expired to prevent the code from hanging due to an unresponsive server. In the example I provided I used synchronous mode to reduce the amount the code and keep it easier to understand."
        'No extra info here for type GET    '                      '   '.setRequestHeader "DNT", "1"
        '.setRequestHeader "Content-Type", "application/x-www-form-urlencoded" ' Content-Type is the property name, x-www-form-urlencoded is the value (content type in the html is "text/html" not "x-www-form-urlencoded" - that is something diifferent) You can have different request header properties and pass different values. This isn't unusual, just not required in this case   When you POST data to a server, you need to tell it what format you are sending it in. So the Type of Content sent in the body of the request (the send bit) is application/x-www-form-urlencoded
        .setRequestHeader bstrheader:="Ploppy", bstrvalue:="Poo"
        .send ' varBody:= ' No extra info for type GET. .send actually makes the request
            While .readyState <> 4: DoEvents: Wend ' Allow other processes to run while the web page loads. Think this is part of the True option
       Dim PageSrc As String: Let PageSrc = .responseText ' Save the HTML code in the (Global) variable. ': Range("P1").Value = PageSrc 'For me for a print out copy to text file etc.    The responseText property returns the information requested by the Open method as a text string
       End With
    Set request = Nothing ' This section is finished. We no longer need the Library. Optionally can therefore Set request = Nothing, a step most appropriate if required for some reason. Previous arguments of good practice to prevent memory leaks and data corruption appear outdated in favour of only using when a good reason is apparent to avoid masking when it is a good idea.
    '_..EP2ab Explicit Pedantry.  We intend using PagrSrc through a method to produce a model  Object Orientated stylio for later use through  use of its Methods and Properties. This model is frequently referred to as a Document Orientated Model, DOM. Some steps in this creation of the "DOM" can frequently be confused with the processes in '2a which are in fact now finished. Part of the .Send , "finishes all processes. We move on to '2b. Only PagrSrc is required to be "taken over" as it were
    '2b DOM stuff' Make OOP type model of HTML code, using Microsoft HTML Office Library
    'Dim HTMLdoc As HTMLDocument: Set HTMLdoc = New HTMLDocument 'Early binding - will not work with .Write:- Leith "This is a case where late binding has to be used. The htmlfile is an ActiveX object that is a wrapper function for the IHTMLDocument2 interface in MSXML2. This gets into a lot of low level system operation......."   https://www.mrexcel.com/forum/excel-questions/367030-copy-table-website-into-excel-vba-2.html#post4031122    https://www.excelforum.com/excel-programming-vba-macros/1214789-late-binding-2.html#post4820307    'Early binding TOOLS >>> REFERENCES >>Microsoft HTML Object Library
    Dim HTMLdoc As Object: Set HTMLdoc = CreateObject("htmlfile") 'Late Binding, ' Create an empty HTML Document.
     HTMLdoc.Open 'EP2b(i)  This clears the values in the HTMLdoc. Complete Explicit Pedantry. in usage outside VBA, Methods for an instance will often be required which require a clearing of an instance before "using". Approximately in VBA this can be considered putting the DOM back to as if it were at the point just before it is given "loaded" with the PageSrc String. Effectively in VBA doing a pair of Set = Nothing , with either a Dim  and Create Dom or Set = New type code line  It serves no purpose usually in VBA. Effectively we reset a situation back to as it is. It can however be used through .Open
     HTMLdoc.Write PageSrc 'EP2b(ii).  Convert the HTML code into an HTML Document Object Model, DOM 'give it somehow the info it needs to work further? ---- Fills the DOM HTML  .. Wiki Dom  http://www.excelforum.com/showthread.php?t=1148621&page=3#post4441761
     'HTMLdoc.body.innerHTML = PageSrc ' Most people do that, but The Write method of an HTML file is designed to convert the page source text into an HTML DOM document. Both methods achieve the same results. The more common way Body of the Page Source code when converting it to an HTML DOM document oustside of VBA. Withiin VBA it just works harder to achieve the same. This excludes the Meta data, Java scripts, and Class information from being converted. Generally speaking, this information is not used when retrieving only text data from a web page.
     HTMLdoc.Close 'EP2b(iii)  _ 2 b or not in 2b , that was the ??  http://www.excelforum.com/showthread.php?t=1148621&page=6.. Briefly When used outside VBA, some processes started by .Open() can  or should be finished after the corresponding outside VBA .write(). This is done using .Close(). Once again this can be used in VBA through .Close.  It has no conceivable merit or known as yet reason to use it in VBA. Pike thinks it It closes the document you have just written. As such he describes it as optional. He would also not have the HTMLdoc.Open. Kyle thinks nothing is open. Leith uses it but has made no comment to Date. This may be just his style, like my EP's just not including the HTMLdoc.Open 'EP2b(i)
    Rem 3
    Rem 3a) Directly
    Last edited by DocAElstein; 01-09-2019 at 09:32 PM.
    ….If you are my competitor, I will try all I can to beat you. But if I do, I will not belittle you. I will Salute you, because without you, I am nothing.
    If you are my enemy, we will try to kick the fucking shit out of you…..
    Winston Churchill, 1939
    Save your Forum..._
    KILL A MODERATOR!!

  2. #2
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    10,457
    Rep Power
    10
    This is required for the single Main routine which is posted in two parts in the last two posts


    [Code]'2 Alan http://www.excelforum.com/showthread.php?t=1148621&page=3#post4441761
    '5 Leith Ross http://www.mrexcel.com/forum/excel-questions/367030-copy-table-website-into-excel-visual-basic-applications-2.html#post4031122
    '10 '....' "This is a recursive procedure to extract text from between an element's start tag and end tag and everything in between. Usually the Calling program will have passed a HTML code ( either from, for example, a .HTML File, a .Tex File, a .txt file, or from a returned such file after a request to a web page) into a Document Object Model. ( DOM ). This somehow organises things in a tree type structure , approximately as like you might see if you carefully indented the HTML code yourself, such that tag pairs were clear to see within tag pairs, each level down as it were. ( a "next level down" is often referred to as a "Child" ). The exact structure is less obvious, but in any case the DOM will have some ordered structure and every constitute part of the code is referred to as an Element. In a simple case most Elements have a start and stop pointed bracket. They are all nodes. Text is usually squeezed in between somewhere within a paired tag set, but is also referred to as a node.
    '12 'I think a node is a point, usually a junction point in the tree type structure. Usually before the procedure is run a first time, an Element will have been obtained from the DOM and this is to be passed in the signature line of the procedure, as an Object. VBA then makes a Copy of the procedure and runs that with the given Element.
    '15 'The macro will examine this Element Object brought in for a Text Node: If the element .NodeType is not 3 (a text node) then there are possibly child nodes ( Nodes "next down" in a Tree type listing ) that need to examined. The procedure then "Calls itself". In other words the first Copy stops at the Call Point. At the Call point another Copy of the procedure is made and runs in a loop for each child node.
    '20 'The next Copy of the macro will again examine the element for a Text Node. If found (If element node type is 3), the text is concatenated with the ElemText String. If this is the ElemText string is empty then ElemText is set to this value. If not then this value is concatenated with any previous text and separated by a tilde character. This character can be used later to parse the text string into the individual strings from each element. The macro will exit the Sub at this point. When this happens, this copy of the macro is "removed from the call stack", in other words it Ends, and the last Copy continues from the Call point at which it was stopped.
    Public Sub GetElemsText(ByRef Elem As Object, ByRef ElemText As String) 'It takes an Object, (variable Elem), a HTML Element, or a ( child ) node thereof. (Wiki says "An HTML element is an individual component of an HTML document or web page, once this has been parsed into the Document Object Model. (DOM). HTML is composed of a tree of HTML Elements and other nodes, such as text nodes." May be close to but not excactly what you se by carefully indenting down "Child" levels
    '25 Dim strobjElem As String: Let strobjElem = TypeName(Elem)' http://www.excelforum.com/excel-programming-vba-macros/1149427-vba-determine-object-type-from-html-dom-object-put-type-in-string-variable-as-shown-in.html
    65 Rem 1) Do we have an Element
    70 If Elem Is Nothing Then GoTo LEndSub [color=darkgreen]'If the Object Elem is empty, or rather we are not given one, Then we End
    ….If you are my competitor, I will try all I can to beat you. But if I do, I will not belittle you. I will Salute you, because without you, I am nothing.
    If you are my enemy, we will try to kick the fucking shit out of you…..
    Winston Churchill, 1939
    Save your Forum..._
    KILL A MODERATOR!!

Similar Threads

  1. VBA to Reply All To Latest Email Thread
    By pkearney10 in forum Outlook Help
    Replies: 11
    Last Post: 12-22-2020, 11:15 PM
  2. Appendix Thread. Diet Protokol Coding Adaptions
    By DocAElstein in forum Test Area
    Replies: 6
    Last Post: 09-05-2019, 10:45 AM
  3. Replies: 19
    Last Post: 04-20-2019, 02:38 PM
  4. Search List of my codes
    By PcMax in forum Excel Help
    Replies: 6
    Last Post: 08-03-2014, 08:38 AM

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •