MKHTMLBlockScanner
Description
An internal class used during parsing to scan HTML blocks.
Methods
Name | Parameters | Returns |
---|---|---|
Constructor | | |
FindClosingTag | line As XUITextLine, pos As Integer, tagName As String | Integer |
FindOpenTag | line As XUITextLine, pos As Integer, tagName As String, type7Only As Boolean | Integer |
GetHtmlTagName | chars() As String, pos As Integer | String |
IsHtmlBlockType1End | line As XUITextLine, pos As Integer | Boolean |
IsHtmlBlockType2End | line As XUITextLine, pos As Integer | Boolean |
IsHtmlBlockType3End | line As XUITextLine, pos As Integer | Boolean |
IsHtmlBlockType4End | line As XUITextLine, pos As Integer | Boolean |
IsHtmlBlockType5End | line As XUITextLine, pos As Integer | Boolean |
MatchAnythingExcept | line As XUITextLine, pos As Integer, currentChar As String, invalidChar As String | Boolean |
MatchAnythingExceptInvalidAndWhitespace | line As XUITextLine, pos As Integer, currentChar As String, ParamArray invalidChars() As String | Boolean |
MatchASCIILetterOrDigit | line As XUITextLine, pos As Integer, currentChar As String, ParamArray validChars() As String | Boolean |
MatchASCIILetterOrValidCharacter | line As XUITextLine, pos As Integer, currentChar As String, ParamArray validChars() As String | Boolean |
SkipWhitespace | chars() As String, pos As Integer, currentChar As String | Boolean |
Method Descriptions
Constructor()
Private to prevent instantiation.
FindClosingTag(line As XUITextLine, pos As Integer, tagName As String) As Integer
This method is shared.
Finds the 0-based index in `line` of a valid HTML closingTag beginning at `pos`. Returns 0 if no valid closingTag is found.
Assumes that `pos` points to the character immediately following `</`.
Also sets the ByRef `tagName` parameter to the detected tagName (if present) or "" if no valid tagName is found.
closingTag: `</`, tagName, optional whitespace, `>`
tagName: ASCII letter, >= 0 ASCII letter|digit|-
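The closingTag grammar above can be sketched as a small Python function. This is an illustration only, not the class's Xojo implementation: the name `find_closing_tag` is hypothetical, the returned index is assumed to be that of the closing `>`, whitespace is assumed to mean spaces and tabs, and the ByRef `tagName` is modelled as a second return value.

```python
import re
from typing import Tuple

# tagName: ASCII letter, then zero or more ASCII letters, digits or hyphens,
# followed by optional whitespace (assumed: spaces/tabs) and a closing ">".
CLOSING_TAG_TAIL = re.compile(r"([A-Za-z][A-Za-z0-9-]*)[ \t]*>")


def find_closing_tag(text: str, pos: int) -> Tuple[int, str]:
    """Return (index of the closing '>', tag name), or (0, "") if no match.

    `pos` must point at the character immediately after "</".
    """
    match = CLOSING_TAG_TAIL.match(text, pos)
    if match is None:
        return 0, ""
    return match.end() - 1, match.group(1)
```

Because `pos` always sits after `</`, a successful match can never end at index 0, so 0 is unambiguous as the "not found" value in this sketch.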
FindOpenTag(line As XUITextLine, pos As Integer, tagName As String, type7Only As Boolean) As Integer
This method is shared.
Returns the 0-based index in `line` of the end of a valid HTML opening tag beginning at `pos`, or 0 if none is found.
Assumes that `pos` points to the character immediately following `<`.
Sets the ByRef parameter `tagName` to the detected tag name (if present) or "" if none is found.
openTag: `<`, a tagName, >= 0 attributes, optional whitespace, optional `/`, and a `>`.
tagName: ASCII letter, >= 0 ASCII letter|digit|-
attribute: whitespace, attributeName, optional attributeValueSpec
attributeName: ASCII letter|-|:, >=0 ASCII letter|digit|_|.|:|-
attributeValueSpec: optional whitespace, =, optional whitespace, attributeValue
attributeValue: unQuotedAttValue | singleQuotedAttValue | doubleQuotedAttValue
unQuotedAttValue: > 0 characters NOT including whitespace, ", ', =, <, >, or `.
singleQuotedAttValue: ', >= 0 characters NOT including ', then a final '
doubleQuotedAttValue: ", >= 0 characters NOT including ", then a final "
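As a rough illustration, the openTag grammar above translates almost line for line into a regular expression. The Python below is a sketch under assumptions, not the Xojo implementation: `find_open_tag` is a hypothetical name, the returned index is assumed to be that of the closing `>`, the `type7Only` flag is not modelled, and the ByRef `tagName` becomes a second return value.

```python
import re
from typing import Tuple

# Each constant mirrors one production of the grammar listed above.
TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
ATTR_NAME = r"[A-Za-z:-][A-Za-z0-9_.:-]*"
UNQUOTED = r"[^\s\"'=<>`]+"   # > 0 characters, none of: whitespace " ' = < > `
SINGLE = r"'[^']*'"
DOUBLE = r'"[^"]*"'
ATTR_VALUE_SPEC = rf"\s*=\s*(?:{UNQUOTED}|{SINGLE}|{DOUBLE})"
ATTRIBUTE = rf"\s+{ATTR_NAME}(?:{ATTR_VALUE_SPEC})?"
OPEN_TAG_TAIL = re.compile(rf"({TAG_NAME})(?:{ATTRIBUTE})*\s*/?>")


def find_open_tag(text: str, pos: int) -> Tuple[int, str]:
    """Return (index of the closing '>', tag name), or (0, "") if no match.

    `pos` must point at the character immediately after "<".
    """
    match = OPEN_TAG_TAIL.match(text, pos)
    if match is None:
        return 0, ""
    return match.end() - 1, match.group(1)
```

An unterminated quoted value such as `<a href="x>` fails to match, because none of the three attributeValue alternatives can consume it.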
GetHtmlTagName(chars() As String, pos As Integer) As String
This method is shared.
Starting at `pos`, reads an HTML tag name from `chars` and returns it. Adjusts `pos` to point to the character immediately after the tag name. Returns "" if no valid tagName is found.
Note: `pos` is passed ByRef.
tagName: ASCII letter, >= 0 ASCII letter|digit|-
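A minimal Python sketch of this read-and-advance behaviour follows. Python has no ByRef, so the updated position is returned alongside the name; `get_html_tag_name` is a hypothetical stand-in for the Xojo method, and leaving `pos` unchanged when no name is found is an assumption.

```python
import string
from typing import List, Tuple

FIRST_CHARS = set(string.ascii_letters)
NAME_CHARS = set(string.ascii_letters + string.digits + "-")


def get_html_tag_name(chars: List[str], pos: int) -> Tuple[str, int]:
    """Return (tag name, position just after it), or ("", pos) if none."""
    # A tag name must start with an ASCII letter.
    if pos >= len(chars) or chars[pos] not in FIRST_CHARS:
        return "", pos
    start = pos
    pos += 1
    # Then zero or more ASCII letters, digits or hyphens.
    while pos < len(chars) and chars[pos] in NAME_CHARS:
        pos += 1
    return "".join(chars[start:pos]), pos
```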
IsHtmlBlockType1End(line As XUITextLine, pos As Integer) As Boolean
This method is shared.
Returns True if, starting at `pos`, we find a valid HTML type 1 block end on `line`.
End condition: line contains an end tag `</script>`, `</pre>`, or `</style>` (case-insensitive; it need not match the start tag).
IsHtmlBlockType2End(line As XUITextLine, pos As Integer) As Boolean
This method is shared.
Returns True if, starting at `pos`, `line` contains a valid HTML type 2 block end.
End condition: line contains the string "-->".
IsHtmlBlockType3End(line As XUITextLine, pos As Integer) As Boolean
This method is shared.
Returns True if, starting at `pos`, `line` contains a valid HTML type 3 block end.
End condition: line contains the string "?>".
IsHtmlBlockType4End(line As XUITextLine, pos As Integer) As Boolean
This method is shared.
Returns True if, starting at `pos`, `line` contains a valid HTML type 4 block end.
End condition: line contains the character ">".
IsHtmlBlockType5End(line As XUITextLine, pos As Integer) As Boolean
This method is shared.
Returns True if, starting at `pos`, `line` contains a valid HTML type 5 block end.
End condition: line contains the string "]]>".
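The five end conditions can be summarised in one small Python function. This is a sketch, not the library's code: `html_block_end` is a hypothetical name, the line is treated as a plain string, and the type 1 end tags (`</script>`, `</pre>`, `</style>`) are assumed from the CommonMark specification's type 1 end condition.

```python
# Assumed type 1 end tags, taken from the CommonMark HTML block rules.
TYPE1_END_TAGS = ("</script>", "</pre>", "</style>")

# Literal end markers for block types 2 to 5.
END_MARKERS = {2: "-->", 3: "?>", 4: ">", 5: "]]>"}


def html_block_end(block_type: int, text: str, pos: int = 0) -> bool:
    """Return True if `text`, from `pos` onwards, ends a block of this type."""
    tail = text[pos:]
    if block_type == 1:
        # Case-insensitive; the end tag need not match the start tag.
        lowered = tail.lower()
        return any(tag in lowered for tag in TYPE1_END_TAGS)
    return END_MARKERS[block_type] in tail
```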
MatchAnythingExcept(line As XUITextLine, pos As Integer, currentChar As String, invalidChar As String) As Boolean
This method is shared.
Advances past the characters in `line`, starting at `pos`, until `invalidChar` is reached.
Returns True if we advanced. `pos` and `currentChar` are mutated.
MatchAnythingExceptInvalidAndWhitespace(line As XUITextLine, pos As Integer, currentChar As String, ParamArray invalidChars() As String) As Boolean
This method is shared.
Advances past the characters in `line`, starting at `pos`, until whitespace or an invalid character is found.
Returns True if we advanced. `pos` and `currentChar` are mutated.
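The "advance until a stop character, then report whether we moved" pattern might look like this in Python. It is a sketch under assumptions: the function name is a hypothetical transliteration, the mutated `pos` and `currentChar` are returned rather than passed ByRef, whitespace is assumed to be the CommonMark set (space, tab, newline, line tabulation, form feed, carriage return), and `currentChar` is assumed to be "" at the end of the line.

```python
from typing import Tuple

# Assumed whitespace set: space, tab, newline, line tabulation, form feed, CR.
WHITESPACE = " \t\n\x0b\x0c\r"


def match_anything_except_invalid_and_whitespace(
    text: str, pos: int, *invalid_chars: str
) -> Tuple[bool, int, str]:
    """Return (advanced?, new pos, character at new pos or "")."""
    start = pos
    while (
        pos < len(text)
        and text[pos] not in WHITESPACE
        and text[pos] not in invalid_chars
    ):
        pos += 1
    current_char = text[pos] if pos < len(text) else ""
    return pos > start, pos, current_char
```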
MatchASCIILetterOrDigit(line As XUITextLine, pos As Integer, currentChar As String, ParamArray validChars() As String) As Boolean
This method is shared.
Advances along `line`, starting at `pos`, as long as the character is an ASCII letter, a digit or one of `validChars`. Mutates `pos` and `currentChar`. Returns True if `pos` changed.
MatchASCIILetterOrValidCharacter(line As XUITextLine, pos As Integer, currentChar As String, ParamArray validChars() As String) As Boolean
This method is shared.
Advances along `line`, starting at `pos`, as long as the character is an ASCII letter or one of `validChars`. Mutates `pos` and `currentChar`. Returns True if `pos` changed.
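The two "advance while the character is allowed" matchers differ only in whether digits count. The Python sketch below is illustrative and makes several assumptions: the function name and the `allow_digits` flag are hypothetical, the updated `pos` and `currentChar` are returned instead of mutated ByRef, and `currentChar` is "" once the end of the line is reached.

```python
import string
from typing import Tuple


def match_ascii_letters(
    text: str, pos: int, *valid_chars: str, allow_digits: bool = False
) -> Tuple[bool, int, str]:
    """Return (pos changed?, new pos, character at new pos or "")."""
    allowed = set(string.ascii_letters) | set(valid_chars)
    if allow_digits:
        allowed |= set(string.digits)
    start = pos
    while pos < len(text) and text[pos] in allowed:
        pos += 1
    current_char = text[pos] if pos < len(text) else ""
    return pos > start, pos, current_char
```

With `allow_digits=True` this approximates MatchASCIILetterOrDigit; with the default it approximates MatchASCIILetterOrValidCharacter.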
SkipWhitespace(chars() As String, pos As Integer, currentChar As String) As Boolean
This method is shared.
Skips over whitespace in `chars` beginning at `pos`, updating `pos` and `currentChar`. Returns True if any whitespace was skipped.