XUIStrings
Description
A module containing many extension methods and regular methods for manipulating strings.
Includes comprehensive support for Unicode characters such as emoji.
Properties
Name | Type | Read-Only |
---|---|---|
CodePointCategoryDictionary | Dictionary | ✔ |
Methods
Name | Parameters | Returns |
---|---|---|
CategoryForLatin1 | codePoint As Integer | XUIStrings.UnicodeCategories |
CharacterArray | s As String | String() |
CharacterCount | s As String | Integer |
CheckLetter | uc As XUIStrings.UnicodeCategories | Boolean |
CheckLetterOrDigit | uc As XUIStrings.UnicodeCategories | Boolean |
Chop | s As String, numChars As Integer | String |
Clone | s() As String | String() |
CompareCase | s As String, other As String | Boolean |
Contains | s As String, what As String, caseSensitive As Boolean | Boolean |
FromArray | chars() As String, start As Integer, length As Integer | String |
GetLatin1UnicodeCharacter | character As String | XUIStrings.UnicodeCategories |
GetUnicodeCategory | s As String | XUIStrings.UnicodeCategories |
InitialiseCodepointCategoryDictionary | | Dictionary |
IsASCII | character As String | Boolean |
IsASCIILetter | letter As String | Boolean |
IsASCIILetterOrDigit | letter As String | Boolean |
IsASCIILetterOrDigitOrHyphen | letter As String | Boolean |
IsASCIILetterOrDigitOrUnderscore | letter As String | Boolean |
IsASCIILetterOrUnderscore | letter As String | Boolean |
IsBinaryDigit | s As String | Boolean |
IsDigit | s As String | Boolean |
IsDigitOrUnderscore | s As String | Boolean |
IsExactly | char As String, ParamArray characters() As String | Boolean |
IsHexDigit | s As String | Boolean |
IsLatin1 | character As String | Boolean |
IsLetter | s As String | Boolean |
IsLetterDigitOrUnderscore | s As String | Boolean |
IsLetterOrDigit | s As String | Boolean |
IsLowercaseCharacter | character As String | Boolean |
IsOctalDigit | s As String | Boolean |
IsRGBA | s As String | Boolean |
IsSpaceOrTab | character As String | Boolean |
IsSpaceOrTabOrNewline | character As String | Boolean |
IsUppercaseASCIICharacter | s As String | Boolean |
IsUppercaseASCIILetter | s As String | Boolean |
IsWhiteSpace | s As String | Boolean |
JustifyLeft | s As String, width As Integer, char As String | String |
JustifyRight | s As String, width As Integer, char As String | String |
LeftCharacters | s As String, count As Integer | String |
Longest | s() As String | String |
MiddleCharacters | s As String, start As Integer | String |
MiddleCharacters | s As String, start As Integer, count As Integer | String |
ReplaceInvisibleCharacters | s As String | String |
RightCharacters | s As String, count As Integer | String |
Constants
Name | Type |
---|---|
TAB | String |
UNICODE_CODEPOINT_CATEGORY_PAIRS | String |
TAB As String
The horizontal tab character.
UNICODE_CODEPOINT_CATEGORY_PAIRS As String
Contains parsed data from the Unicode standard where each line represents a codepoint/category pairing.
Each line is in the format codepoint:category, where codepoint is a hexadecimal codepoint and category is a two-character category.
Enumerations
UnicodeCategories
The different Unicode categories.
Name |
---|
ClosePunctuation |
ConnectorPunctuation |
Control |
CurrencySymbol |
DashPunctuation |
DecimalDigitNumber |
EnclosingMark |
FinalQuotePunctuation |
Format |
InitialQuotePunctuation |
LetterNumber |
LineSeparator |
LowercaseLetter |
MathSymbol |
ModifierLetter |
ModifierSymbol |
NonSpacingMark |
OpenPunctuation |
OtherLetter |
OtherNotAssigned |
OtherNumber |
OtherPunctuation |
OtherSymbol |
ParagraphSeparator |
PrivateUse |
SpaceSeparator |
SpaceCombiningMark |
Surrogate |
TitlecaseLetter |
UppercaseLetter |
None |
Property Descriptions
CodePointCategoryDictionary As Dictionary
Maps a Unicode codepoint to its category. Key = Unicode codepoint, Value = Unicode category.
Method Descriptions
CategoryForLatin1(codePoint As Integer) As XUIStrings.UnicodeCategories
Returns the Unicode category for a Latin-1 character.
Assumes that codePoint is in the range &u0000 to &u00FF.
CharacterArray(s As String) As String()
Returns the individual characters in s as an array.
It's at least 4x faster to use Text to split into characters and then iterate over that array than to use the native String.Characters() method that returns an Iterable.
CharacterCount(s As String) As Integer
Returns the number of characters in the passed string (including multibyte characters).
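The difference between counting characters and counting bytes can be shown with a small model. The sketch below is written in Python, not Xojo, and is not the module's implementation; the function name and sample string are assumptions made for illustration. It counts Unicode code points, and how the module itself treats combined sequences is not stated here.

```python
def character_count(s: str) -> int:
    """Count characters (Unicode code points) rather than bytes."""
    return len(s)


# "héllo" followed by a grinning-face emoji (hypothetical sample data).
sample = "h\u00e9llo\U0001F600"

code_points = character_count(sample)      # one per character
utf8_bytes = len(sample.encode("utf-8"))   # multibyte characters take several bytes

print(code_points, utf8_bytes)
```

The two numbers differ because é and the emoji each occupy more than one byte in UTF-8.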
CheckLetter(uc As XUIStrings.UnicodeCategories) As Boolean
Checks if uc belongs to the letter category.
CheckLetterOrDigit(uc As XUIStrings.UnicodeCategories) As Boolean
Checks if uc belongs to the letter or digit categories.
Chop(s As String, numChars As Integer) As String
Removes numChars characters from s.
If numChars is greater than the length of s, "" is returned.
Clone(s() As String) As String()
Returns a copy of the passed string array.
CompareCase(s As String, other As String) As Boolean
Performs a case-sensitive string comparison. Returns True if s = other.
Contains(s As String, what As String, caseSensitive As Boolean) As Boolean
True if s contains what. The comparison is case-sensitive when caseSensitive is True.
FromArray(chars() As String, start As Integer, length As Integer) As String
Returns a string from chars beginning at index start for length characters.
Assumes chars is an array of individual characters.
If start + length extends beyond the end of chars, all characters from start to the end of chars are returned.
If length = -1, all characters from start to the end of chars are returned.
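The three cases above can be modelled in a few lines. This is a Python sketch of the documented behaviour only, not the Xojo implementation; the name from_array and the sample data are assumptions, and inputs such as a negative start are not covered by the description and are not handled here.

```python
def from_array(chars, start, length):
    """Join `length` single-character strings from `chars`, starting at index `start`."""
    if length == -1 or start + length > len(chars):
        # Either "take everything" was requested, or the request runs past
        # the end of the array: return all characters from `start` onwards.
        return "".join(chars[start:])
    return "".join(chars[start:start + length])


chars = list("Hello, world")  # hypothetical sample: an array of single characters

print(from_array(chars, 0, 5))    # a bounded slice
print(from_array(chars, 7, 100))  # over-long request, clamped to the end
print(from_array(chars, 7, -1))   # -1 means "to the end"
```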
GetLatin1UnicodeCharacter(character As String) As XUIStrings.UnicodeCategories
Returns the Unicode category for Unicode characters <= &h00ff.
Assumes that character is one character long.
GetUnicodeCategory(s As String) As XUIStrings.UnicodeCategories
Returns the Unicode category that s belongs to.
If s is empty or is more than one character in length, the special None category is returned.
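The single-character rule can be illustrated with Python's standard unicodedata module. This is a model, not the module's code: the function name is an assumption, Python reports categories as two-letter codes (for example Lu for UppercaseLetter) rather than enumeration members, and Python's None stands in for the special None category.

```python
import unicodedata


def get_unicode_category(s: str):
    """Return the two-letter general category of a single character, else None."""
    if len(s) != 1:
        # Models the special None category for empty or multi-character input.
        return None
    return unicodedata.category(s)


print(get_unicode_category("A"))   # an uppercase letter
print(get_unicode_category("_"))   # connector punctuation
print(get_unicode_category("ab"))  # more than one character
```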
InitialiseCodepointCategoryDictionary() As Dictionary
Returns a dictionary mapping Unicode codepoints to Unicode categories.
UNICODE_CODEPOINT_CATEGORY_PAIRS contains parsed data from the Unicode standard where each line represents a codepoint/category pairing.
Each line is in the format codepoint:category, where codepoint is a hexadecimal codepoint and category is a two-character category.
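Parsing the codepoint:category format is straightforward. The following Python sketch shows one way it could be done; it is not the module's implementation. The function name and the sample data are assumptions, and the values are kept as two-character strings because this document does not say how the module converts them to UnicodeCategories members.

```python
def build_category_dictionary(pairs: str) -> dict:
    """Map integer code points to two-character category codes."""
    result = {}
    for line in pairs.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        codepoint_hex, category = line.split(":", 1)
        result[int(codepoint_hex, 16)] = category  # hex text -> integer key
    return result


# Hypothetical sample in the documented codepoint:category format.
SAMPLE_PAIRS = "0041:Lu\n0061:Ll\n0030:Nd\n1F600:So"

categories = build_category_dictionary(SAMPLE_PAIRS)
print(categories[0x41])
```

Building the dictionary once and reusing it keeps each lookup to a single key access.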
IsASCII(character As String) As Boolean
True if character is in the ASCII range.
Assumes that character is one character in length.
IsASCIILetter(letter As String) As Boolean
Returns True if letter is A-Z or a-z.
IsASCIILetterOrDigit(letter As String) As Boolean
Returns True if letter is A-Z, a-z or 0-9.
IsASCIILetterOrDigitOrHyphen(letter As String) As Boolean
Returns True if letter is A-Z, a-z, 0-9 or "-".
IsASCIILetterOrDigitOrUnderscore(letter As String) As Boolean
Returns True if letter is A-Z, a-z, 0-9 or the underscore.
IsASCIILetterOrUnderscore(letter As String) As Boolean
Returns True if letter is A-Z, a-z or the underscore.
IsBinaryDigit(s As String) As Boolean
True if s is 0 or 1.
IsDigit(s As String) As Boolean
True if s is a single digit in the range 0-9.
We could use GetUnicodeCategory but a Select...Case is faster.
IsDigitOrUnderscore(s As String) As Boolean
True if s is a single digit in the range 0-9 or the underscore character (_).
We could use GetUnicodeCategory but a Select...Case is faster.
IsExactly(char As String, ParamArray characters() As String) As Boolean
True if char exactly matches (case-sensitive) any of the passed characters.
Assumes that char is a single character in length.
IsHexDigit(s As String) As Boolean
True if s is a valid hexadecimal digit (0-9, a-f, A-F).
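A digit test of this kind amounts to a membership check against a fixed set. The Python sketch below models the documented rule and is not the module's code; the names are assumptions, and returning False for empty or multi-character input is a choice made for this sketch, since that case is not described here.

```python
HEX_DIGITS = frozenset("0123456789abcdefABCDEF")


def is_hex_digit(s: str) -> bool:
    """True if `s` is exactly one character and that character is 0-9, a-f or A-F."""
    return len(s) == 1 and s in HEX_DIGITS


print(is_hex_digit("f"), is_hex_digit("G"))
```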
IsLatin1(character As String) As Boolean
Returns True if character is in the ASCII or Latin-1 supplement range.
Assumes that character is one character in length.
IsLetterDigitOrUnderscore(s As String) As Boolean
Determines whether s is a letter, a digit or an underscore.
Based on code from .NET Core.
IsLetterOrDigit(s As String) As Boolean
Determines whether s is a letter or a digit.
Based on code from .NET Core.
IsLowercaseCharacter(character As String) As Boolean
True if character is lowercase.
Assumes that character is one character long.
IsOctalDigit(s As String) As Boolean
True if s is a valid octal digit (0-7).
IsRGBA(s As String) As Boolean
Returns True if s is a valid RGBA hex string.
Valid formats are:
IsSpaceOrTab(character As String) As Boolean
True if character is a space or horizontal tab.
IsSpaceOrTabOrNewline(character As String) As Boolean
True if character is a space, horizontal tab or UNIX newline (&u0A).
IsUppercaseASCIICharacter(s As String) As Boolean
True if s is an uppercase ASCII character.
IsUppercaseASCIILetter(s As String) As Boolean
True if s is an uppercase ASCII letter.
IsWhiteSpace(s As String) As Boolean
True if s is Unicode whitespace.
Assumes s is a single character. String.Asc returns the codepoint for the first character in s, so if this method is passed a string comprising more than one character, it'll break.
&u0009 = <control> HORIZONTAL TAB
&u000a = <control> LINE FEED
&u000b = <control> VERTICAL TAB
&u000c = <control> FORM FEED
&u000d = <control> CARRIAGE RETURN
&u0085 = <control> NEXT LINE
&u00a0 = NO-BREAK SPACE
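The codepoints listed above can be combined with a category lookup to form a whitespace test. The Python sketch below is a model under stated assumptions, not the module's implementation: the names are hypothetical, treating the separator categories (Zs, Zl, Zp) as whitespace is an assumption, and raising an error for multi-character input is a choice made for this sketch.

```python
import unicodedata

# Latin-1 codepoints taken from the list above.
LATIN1_WHITESPACE = {0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x85, 0xA0}


def is_whitespace(s: str) -> bool:
    """True if the single character `s` is whitespace under this model."""
    if len(s) != 1:
        raise ValueError("expected a single character")
    if ord(s) in LATIN1_WHITESPACE:
        return True
    # Assumption: space, line and paragraph separators also count as whitespace.
    return unicodedata.category(s) in {"Zs", "Zl", "Zp"}


print(is_whitespace("\t"), is_whitespace("a"))
```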
JustifyLeft(s As String, width As Integer, char As String) As String
Left justifies s to width characters using char to pad the right edge if required.
"Hello".JustifyLeft(10) // Becomes "Hello     "
JustifyRight(s As String, width As Integer, char As String) As String
Right justifies s to width characters using char to pad the left edge if required.
"Hello".JustifyRight(10) // Becomes "     Hello"
LeftCharacters(s As String, count As Integer) As String
Returns count left-most characters from s.
Longest(s() As String) As String
Returns the longest string in s.
MiddleCharacters(s As String, start As Integer) As String
Returns all of the characters from start to the end of s. The start position is zero-based.
MiddleCharacters(s As String, start As Integer, count As Integer) As String
Returns count characters from s, beginning at start. Handles multibyte characters like emoji.
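Both overloads can be modelled as character-based slicing. This Python sketch is illustrative only and is not the Xojo implementation; the function name and sample string are assumptions, and it assumes the three-argument form uses the same zero-based start as the two-argument form.

```python
def middle_characters(s: str, start: int, count=None) -> str:
    """Slice by characters (code points), so an emoji counts as one character."""
    if count is None:
        return s[start:]            # two-argument form: from start to the end
    return s[start:start + count]   # three-argument form: count characters


sample = "a\U0001F600bc"  # "a", a grinning-face emoji, then "bc" (hypothetical sample)

print(middle_characters(sample, 2))
print(middle_characters(sample, 1, 2))
```

Slicing by character rather than by byte is what keeps the emoji intact in the second call.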
ReplaceInvisibleCharacters(s As String) As String
Replaces certain invisible characters with a visible representation.
RightCharacters(s As String, count As Integer) As String
Returns count right-most characters from s.