MKHTMLBlockScanner

Description

An internal class used during parsing to scan HTML blocks.

Methods

| Name | Parameters | Returns |
|---|---|---|
| Constructor | | |
| FindClosingTag | line As XUITextLine, pos As Integer, tagName As String | Integer |
| FindOpenTag | line As XUITextLine, pos As Integer, tagName As String, type7Only As Boolean | Integer |
| GetHtmlTagName | chars() As String, pos As Integer | String |
| IsHtmlBlockType1End | line As XUITextLine, pos As Integer | Boolean |
| IsHtmlBlockType2End | line As XUITextLine, pos As Integer | Boolean |
| IsHtmlBlockType3End | line As XUITextLine, pos As Integer | Boolean |
| IsHtmlBlockType4End | line As XUITextLine, pos As Integer | Boolean |
| IsHtmlBlockType5End | line As XUITextLine, pos As Integer | Boolean |
| MatchAnythingExcept | line As XUITextLine, pos As Integer, currentChar As String, invalidChar As String | Boolean |
| MatchAnythingExceptInvalidAndWhitespace | line As XUITextLine, pos As Integer, currentChar As String, ParamArray invalidChars() As String | Boolean |
| MatchASCIILetterOrDigit | line As XUITextLine, pos As Integer, currentChar As String, ParamArray validChars() As String | Boolean |
| MatchASCIILetterOrValidCharacter | line As XUITextLine, pos As Integer, currentChar As String, ParamArray validChars() As String | Boolean |
| SkipWhitespace | chars() As String, pos As Integer, currentChar As String | Boolean |

Method Descriptions

Constructor()

Private to prevent instantiation.


FindClosingTag(line As XUITextLine, pos As Integer, tagName As String) As Integer This method is shared.

Finds the 0-based index in line of a valid HTML closingTag beginning at pos. Returns 0 if no valid closingTag is found.

Assumes that pos points to the character immediately following `</`.

closingTag: </, tagName, optional whitespace, >
tagName: ASCII letter, >= 0 ASCII letter|digit|-

Also sets the ByRef tagName parameter to the detected tagName (if present) or "" if no valid tagName is found.
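
The closingTag grammar above can be sketched as follows. This is a Python illustration of the grammar only, not the class's Xojo implementation: the name `find_closing_tag` is hypothetical, the returned index is assumed to be that of the closing `>`, "optional whitespace" is assumed to mean spaces and tabs, and a tuple stands in for the ByRef tagName parameter.

```python
import re

# tagName: ASCII letter, then >= 0 ASCII letters, digits or "-".
TAG_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")
# Optional whitespace (assumed: spaces and tabs) followed by ">".
TAIL = re.compile(r"[ \t]*>")


def find_closing_tag(text, pos):
    """Return (index, tag_name) for a closing tag whose name starts at pos.

    pos must point just past "</". The index is 0 when no valid closing
    tag is found; tag_name is still reported if a name was detected.
    """
    name = TAG_NAME.match(text, pos)
    if name is None:
        return 0, ""  # no valid tag name at all
    tail = TAIL.match(text, name.end())
    if tail is None:
        return 0, name.group(0)  # name found, but the tag is malformed
    return tail.end() - 1, name.group(0)  # assumed: index of the ">"
```

For example, `find_closing_tag("</div >", 2)` yields index 6 and the name `div`, while `find_closing_tag("</div x>", 2)` yields 0 together with `div`.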


FindOpenTag(line As XUITextLine, pos As Integer, tagName As String, type7Only As Boolean) As Integer This method is shared.

Returns the 0-based index in line of the end of a valid HTML opening tag beginning at pos, or 0 if none is found.

Assumes that pos points to the character immediately following `<`. Sets the ByRef parameter tagName to the detected tag name (if present) or "" if none is found.

openTag: `<`, a tagname, >= 0 attributes, optional whitespace, optional `/`, and a `>`.
tagName: ASCII letter, >= 0 ASCII letter|digit|-
attribute: whitespace, attributeName, optional attributeValueSpec
attributeName: ASCII letter|-|:, >=0 ASCII letter|digit|_|.|:|-
attributeValueSpec: optional whitespace, =, optional whitespace, attributeValue
attributeValue: unQuotedAttValue | singleQuotedAttValue | doubleQuotedAttValue
unQuotedAttValue: > 0 characters NOT including whitespace, ", ', =, <, >, or `.
singleQuotedAttValue: ', >= 0 characters NOT including ', then a final '
doubleQuotedAttValue: ", >= 0 characters NOT including ", then a final "
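
As a rough illustration, the grammar above can be assembled rule by rule into a regular expression. This is a Python sketch, not the class's Xojo implementation: `find_open_tag` is a hypothetical name, the returned index is assumed to be that of the final `>`, a tuple stands in for the ByRef tagName parameter, and the type7Only flag is not modelled. The attributeName rule is copied from the grammar as written here; CommonMark's own definition allows `_` rather than `-` as a first character.

```python
import re

TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
ATTR_NAME = r"[A-Za-z\-:][A-Za-z0-9_.:\-]*"  # as written in the grammar above
UNQUOTED = r"[^\s\"'=<>`]+"                  # > 0 characters, none forbidden
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = rf"(?:{UNQUOTED}|{SINGLE_QUOTED}|{DOUBLE_QUOTED})"
ATTR_VALUE_SPEC = rf"\s*=\s*{ATTR_VALUE}"
ATTRIBUTE = rf"\s+{ATTR_NAME}(?:{ATTR_VALUE_SPEC})?"

TAG_NAME_RE = re.compile(TAG_NAME)
# Everything after "<": name, attributes, optional whitespace, optional "/", ">".
OPEN_TAG_TAIL = re.compile(rf"({TAG_NAME})(?:{ATTRIBUTE})*\s*/?>")


def find_open_tag(text, pos):
    """Return (index, tag_name) for an opening tag whose name starts at pos.

    pos must point just past "<". The index is 0 when no valid opening
    tag is found; tag_name is still reported if a name was detected.
    """
    m = OPEN_TAG_TAIL.match(text, pos)
    if m is not None:
        return m.end() - 1, m.group(1)  # assumed: index of the final ">"
    name = TAG_NAME_RE.match(text, pos)
    return 0, (name.group(0) if name else "")
```

With this sketch, `find_open_tag('<a href="x" id=y>', 1)` reports the index of the final `>` and the name `a`, whereas `<div class=>` is rejected because `=` must be followed by an attributeValue.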

GetHtmlTagName(chars() As String, pos As Integer) As String This method is shared.

Starting at pos, reads an HTML tag name from chars and returns it, or "" if no valid tagName is found. pos is passed ByRef and is adjusted to point to the character immediately after the tag name.

tagName: ASCII letter, >= 0 ASCII letter|digit|-
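
A minimal Python sketch of the tagName rule, for illustration only. `get_html_tag_name` is a hypothetical stand-in for the Xojo method: because Python has no ByRef, the advanced position is returned alongside the name, and leaving pos unchanged on failure is an assumption.

```python
import string

ASCII_LETTERS = set(string.ascii_letters)
TAG_NAME_CHARS = ASCII_LETTERS | set(string.digits) | {"-"}


def get_html_tag_name(chars, pos):
    """Read a tag name from the character list chars, starting at pos.

    Returns (tag_name, new_pos), where new_pos points to the character
    immediately after the tag name.
    """
    if pos >= len(chars) or chars[pos] not in ASCII_LETTERS:
        return "", pos  # assumed: pos is left unchanged when nothing matches
    start = pos
    pos += 1
    # Consume the remaining letters, digits and hyphens.
    while pos < len(chars) and chars[pos] in TAG_NAME_CHARS:
        pos += 1
    return "".join(chars[start:pos]), pos
```

For instance, `get_html_tag_name(list("h1>"), 0)` returns `("h1", 2)`.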


IsHtmlBlockType1End(line As XUITextLine, pos As Integer) As Boolean This method is shared.

Returns True if, starting at pos, line contains a valid HTML type 1 block end.

End condition: line contains an end tag `</script>`, `</pre>`, or `</style>` (case-insensitive; it need not match the start tag).


IsHtmlBlockType2End(line As XUITextLine, pos As Integer) As Boolean This method is shared.

Returns True if, starting at pos, line contains a valid HTML type 2 block end.

End condition: line contains the string "-->"


IsHtmlBlockType3End(line As XUITextLine, pos As Integer) As Boolean This method is shared.

Returns True if, starting at pos, line contains a valid HTML type 3 block end.

End condition: line contains the string "?>"


IsHtmlBlockType4End(line As XUITextLine, pos As Integer) As Boolean This method is shared.

Returns True if, starting at pos, line contains a valid HTML type 4 block end.

End condition: line contains the character ">".


IsHtmlBlockType5End(line As XUITextLine, pos As Integer) As Boolean This method is shared.

Returns True if, starting at pos, line contains a valid HTML type 5 block end.

End condition: line contains the string "]]>".
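
The five end conditions can be summarised in one small Python sketch. It is illustrative only: `is_html_block_end` is a hypothetical helper that takes the block type as an argument instead of mirroring the five separate Xojo methods, and the type 1 end tags (`</script>`, `</pre>`, `</style>`) are an assumption taken from the CommonMark definition of a type 1 HTML block.

```python
# Assumed from CommonMark's type 1 end condition; matched case-insensitively.
TYPE1_END_TAGS = ("</script>", "</pre>", "</style>")

# Literal end markers for block types 2 to 5, as listed above.
END_MARKERS = {2: "-->", 3: "?>", 4: ">", 5: "]]>"}


def is_html_block_end(block_type, text, pos=0):
    """Return True if text, searched from pos, ends a block of block_type."""
    rest = text[pos:]
    if block_type == 1:
        lowered = rest.lower()
        return any(tag in lowered for tag in TYPE1_END_TAGS)
    return END_MARKERS[block_type] in rest
```

Only the part of the line from pos onwards is searched, so `is_html_block_end(4, "abc>", 4)` is False even though the line contains `>`.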


MatchAnythingExcept(line As XUITextLine, pos As Integer, currentChar As String, invalidChar As String) As Boolean This method is shared.

Advances past the characters in line, starting at pos, until invalidChar is encountered. Returns True if pos advanced. pos and currentChar are mutated.


MatchAnythingExceptInvalidAndWhitespace(line As XUITextLine, pos As Integer, currentChar As String, ParamArray invalidChars() As String) As Boolean This method is shared.

Advances past the characters in line, starting at pos, until whitespace or one of invalidChars is found. Returns True if pos advanced. pos and currentChar are mutated.


MatchASCIILetterOrDigit(line As XUITextLine, pos As Integer, currentChar As String, ParamArray validChars() As String) As Boolean This method is shared.

Advances along line, starting at pos, as long as the character is an ASCII letter, a digit, or one of validChars. Mutates pos and currentChar. Returns True if pos changed.


MatchASCIILetterOrValidCharacter(line As XUITextLine, pos As Integer, currentChar As String, ParamArray validChars() As String) As Boolean This method is shared.

Advances along line, starting at pos, as long as the character is an ASCII letter or one of validChars. Mutates pos and currentChar. Returns True if pos changed.


SkipWhitespace(chars() As String, pos As Integer, currentChar As String) As Boolean This method is shared.

Skips over whitespace in chars beginning at pos, updating pos and currentChar. Returns True if any whitespace was skipped.
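
The Match and Skip helpers above share one pattern: advance a cursor while a condition holds, then report whether it moved. A hedged Python sketch of that pattern follows. All names are hypothetical, the whitespace set is an assumption, and a returned tuple `(advanced, pos, current_char)` stands in for the mutated pos and currentChar parameters.

```python
# Assumed whitespace set; the class may use a different definition.
WHITESPACE = {" ", "\t", "\n", "\r", "\f", "\v"}


def advance_while(chars, pos, keep_going):
    """Advance pos while keep_going(char) is true.

    Returns (advanced, new_pos, current_char); current_char is "" once
    the end of chars has been reached.
    """
    start = pos
    while pos < len(chars) and keep_going(chars[pos]):
        pos += 1
    current = chars[pos] if pos < len(chars) else ""
    return pos > start, pos, current


def skip_whitespace(chars, pos):
    """Skip a run of whitespace starting at pos."""
    return advance_while(chars, pos, lambda c: c in WHITESPACE)


def match_anything_except(chars, pos, invalid_char):
    """Advance until invalid_char is encountered."""
    return advance_while(chars, pos, lambda c: c != invalid_char)


def match_anything_except_invalid_and_whitespace(chars, pos, *invalid_chars):
    """Advance until whitespace or one of invalid_chars is found."""
    stop = WHITESPACE | set(invalid_chars)
    return advance_while(chars, pos, lambda c: c not in stop)
```

Here `*invalid_chars` plays the role of the ParamArray parameter, so `match_anything_except_invalid_and_whitespace(list("abc=d"), 0, "=", "<")` stops at the `=`.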