Documentation ¶
Overview ¶
Package str contains various tools for manipulating strings beyond those available in Go's `strings` package. It also contains a wide variety of string constants.
Index ¶
- Constants
- Variables
- func ContainsAll(s string, subStrs ...string) bool
- func ContainsAny(s string, subStrs ...string) bool
- func ExtremeNormalization(s string) string
- func Map(f func(string) string, a []string) []string
- func NFC(s string) string
- func NFD(s string) string
- func NFKD(s string) string
- func NKFC(s string) string
- func NormEqual(s, q string) bool
- func NormFoldEqual(s, q string) bool
- func RemoveASCIIPunctuation(s string) string
- func RemoveASCIIWhiteSpace(s string) string
- func RemoveDiacriticsNFC(s string) string
- func RemoveNonASCII(s string) string
- func RemoveRunes(s string, toRemove ...rune) string
- func Trim(s string, n int, filler string) string
- type Diffs
- type RuneDiff
Constants ¶
const (
	// ASCIIPunct contains all ASCII punctuation, identical to string.punctuation in Python 3.6.
	ASCIIPunct = `$+<=>^|~!"#$%&\'()*+,-./:;<=>?@[\\]^_{|}~` + "`"
	// ASCIIWhitespace is a list of all ASCII whitespace, identical to string.whitespace in Python 3.6.
	ASCIIWhitespace = " \t\n\r\x0b\x0c"
	// ASCIIPrintable is a list of all ASCII printable characters, identical to string.printable in Python 3.6.
	ASCIIPrintable = `0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_{|}~ \t\n\r\x0b\x0c` + "`"
	// ASCIILowercase is all lowercase letters in the Latin alphabet (code points in [97, 122]).
	ASCIILowercase = `abcdefghijklmnopqrstuvwxyz`
	// ASCIIUpperCase is all uppercase letters in the Latin alphabet (code points in [65, 90]).
	ASCIIUpperCase = `ABCDEFGHIJKLMNOPQRSTUVWXYZ`
	ASCIILetters = ASCIILowercase + ASCIIUpperCase
	// ASCIINumerics are the numerals 0-9 (code points in [48, 57], that is, 0x30-0x39).
	ASCIINumerics = "0123456789"
	ASCIIAlphaNumeric = ASCIILowercase + ASCIIUpperCase + ASCIINumerics
	// ASCII is all ASCII characters, comprising the Unicode code points 0-127.
	ASCII = "`" + `\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\n\x0b\x0c\r\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_abcdefghijklmnopqrstuvwxyz{|}~\x7f`
)
Variables ¶
var (
	// ASCIIPunctSet contains all ASCII punctuation. Equivalent to set(string.punctuation) in Python 3.6.
	ASCIIPunctSet = runeset.FromString(ASCIIPunct)
	// ASCIIWhitespaceSet contains all ASCII whitespace, identical to set(string.whitespace) in Python 3.6.
	ASCIIWhitespaceSet = runeset.FromString(ASCIIWhitespace)
	ASCIISet = runeset.FromString(ASCII)
)
var (
	UnicodeNonSpacingMarksSet = runes.In(unicode.Mn)
	UnicodePuncuationSet = runes.In(unicode.Punct)
	UnicodeControlSet = runes.In(unicode.C)
	UnicodePrintable = runes.In(printable)
	UnicodeNonPrintable = runes.NotIn(printable)
)
var RemoveWhiteSpace = RemoveASCIIWhiteSpace
RemoveWhiteSpace is an alias for RemoveASCIIWhiteSpace.
Functions ¶
func ContainsAll ¶
ContainsAll returns true if strings.Contains(s, sub) is true for every sub in subStrs. With no substrings, ContainsAll(s) is true.
func ContainsAny ¶
ContainsAny returns true if strings.Contains(s, sub) is true for at least one sub in subStrs. With no substrings, ContainsAny(s) is false.
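The behavior described for ContainsAll and ContainsAny, including the empty-argument cases, can be sketched with the standard library alone. The lowercase names `containsAll` and `containsAny` are hypothetical stand-ins written for illustration; they are not the package's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// containsAll is a hypothetical stand-in for ContainsAll: it reports
// whether every sub in subStrs occurs in s. With no subStrs the loop
// never runs, so the result is true.
func containsAll(s string, subStrs ...string) bool {
	for _, sub := range subStrs {
		if !strings.Contains(s, sub) {
			return false
		}
	}
	return true
}

// containsAny is a hypothetical stand-in for ContainsAny: it reports
// whether at least one sub in subStrs occurs in s. With no subStrs the
// loop never runs, so the result is false.
func containsAny(s string, subStrs ...string) bool {
	for _, sub := range subStrs {
		if strings.Contains(s, sub) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsAll("gopher", "go", "her"))  // true
	fmt.Println(containsAny("gopher", "cat", "dog")) // false
	fmt.Println(containsAll("gopher"))               // true
	fmt.Println(containsAny("gopher"))               // false
}
```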
func ExtremeNormalization ¶
ExtremeNormalization heavily normalizes a string for purposes of comparison and safety. It lowercases the string, removes ALL nonspacing marks, nonprinting characters, whitespace, control characters, and punctuation, and transforms the string to NFKC normal form. This can and will lose a lot of information!
func NormFoldEqual ¶
NormFoldEqual returns true if the case-folded, NFKC-normalized forms of both strings are equal.
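To see why a normalizing comparison can be useful, the sketch below uses only the standard library to compare two spellings of "é": a single precomposed code point and an "e" followed by a combining accent. The helper `naiveEqual` is hypothetical and deliberately does no normalization; it is not part of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// naiveEqual is a hypothetical helper that compares two strings with
// case folding only. It performs no Unicode normalization.
func naiveEqual(s, q string) bool {
	return strings.EqualFold(s, q)
}

func main() {
	precomposed := "\u00e9" // é as a single code point
	decomposed := "e\u0301" // e followed by U+0301 COMBINING ACUTE ACCENT

	// Byte-wise comparison sees two different strings.
	fmt.Println(precomposed == decomposed) // false

	// Case folding alone does not help either.
	fmt.Println(naiveEqual(precomposed, decomposed)) // false

	// The two spellings differ in rune count.
	fmt.Println(len([]rune(precomposed)), len([]rune(decomposed))) // 1 2
}
```

Both strings render identically, which is the situation a normalize-then-compare function is meant to handle.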
func RemoveASCIIPunctuation ¶
RemoveASCIIPunctuation removes punctuation (as defined by unicode) from a string. Note that this converts to runes and back to UTF-8, so RemoveASCIIPunctuation(s) == s does not necessarily hold even for a string that contains no punctuation characters, since the round trip may not preserve the original bytes (for example, Go decodes invalid UTF-8 as U+FFFD).
func RemoveASCIIWhiteSpace ¶
RemoveASCIIWhiteSpace returns a copy of the string with the ASCII whitespace (" \t\n\r\x0b\x0c") removed.
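A minimal sketch of this behavior, assuming a `strings.Map`-based approach; the package's real implementation may differ. The names `asciiWhitespace` and `removeASCIIWhitespace` are hypothetical and mirror the documented constant and function.

```go
package main

import (
	"fmt"
	"strings"
)

// asciiWhitespace mirrors the documented ASCIIWhitespace constant.
const asciiWhitespace = " \t\n\r\x0b\x0c"

// removeASCIIWhitespace is a hypothetical stand-in for
// RemoveASCIIWhiteSpace. Returning a negative rune from the mapping
// function tells strings.Map to drop that character.
func removeASCIIWhitespace(s string) string {
	return strings.Map(func(r rune) rune {
		if strings.ContainsRune(asciiWhitespace, r) {
			return -1
		}
		return r
	}, s)
}

func main() {
	fmt.Printf("%q\n", removeASCIIWhitespace(" a\tb\nc \r")) // "abc"

	// Non-ASCII whitespace such as U+00A0 (no-break space) is kept.
	fmt.Println(removeASCIIWhitespace("a\u00a0b") == "a\u00a0b") // true
}
```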
func RemoveDiacriticsNFC ¶
RemoveDiacriticsNFC creates a copy of s with the diacritics removed and transforms it to NFC. It is NOT thread-safe.
func RemoveNonASCII ¶
RemoveNonASCII returns a copy of the string with all non-ASCII runes removed.
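One way this could be written with the standard library is shown below. It is a sketch under the assumption that "non-ASCII" means any rune above `unicode.MaxASCII` (U+007F); `removeNonASCII` is a hypothetical stand-in, not the package's code.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// removeNonASCII is a hypothetical stand-in for RemoveNonASCII. It drops
// every rune greater than unicode.MaxASCII (U+007F) and keeps the rest.
func removeNonASCII(s string) string {
	return strings.Map(func(r rune) rune {
		if r > unicode.MaxASCII {
			return -1 // negative value: strings.Map drops the rune
		}
		return r
	}, s)
}

func main() {
	fmt.Println(removeNonASCII("héllo, wörld")) // hllo, wrld
}
```

Note that accented letters are removed outright rather than replaced by their base letters.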
func RemoveRunes ¶
RemoveRunes removes any runes listed from the string. Note that this converts to runes and back to UTF-8, so RemoveRunes(s) == s does not necessarily hold, since the round trip may not preserve the original bytes (for example, Go decodes invalid UTF-8 as U+FFFD).
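The rune round trip mentioned above can be illustrated with a small sketch. `removeRunes` is a hypothetical stand-in that assumes a decode, filter, and re-encode approach; the package's implementation may differ in detail.

```go
package main

import "fmt"

// removeRunes is a hypothetical stand-in for RemoveRunes. It decodes s
// into runes, skips any rune listed in toRemove, and re-encodes the
// remainder as UTF-8.
func removeRunes(s string, toRemove ...rune) string {
	drop := make(map[rune]struct{}, len(toRemove))
	for _, r := range toRemove {
		drop[r] = struct{}{}
	}
	out := make([]rune, 0, len(s))
	for _, r := range s { // invalid bytes decode as U+FFFD
		if _, found := drop[r]; !found {
			out = append(out, r)
		}
	}
	return string(out)
}

func main() {
	fmt.Println(removeRunes("a-b_c", '-', '_')) // abc

	// "\xff" is not valid UTF-8. Decoding turns it into U+FFFD, which
	// re-encodes as three bytes, so the output differs from the input
	// even though nothing was asked to be removed.
	s := "a\xffb"
	fmt.Println(removeRunes(s) == s) // false
}
```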