Documentation ¶
Overview ¶
Package gostrutils contains string utilities that are missing from the main strings package, as well as other encoding packages arrives by go.
The implementation of the package is set by files based on a subject that belong to.
Basics ¶
The basic logic of Go, is to use bytes as a non numeric holder. The basics of the following package is to gain the ability to hold more support for string based on functions that are missing, while remembering that Go's strings are UTF-8.
The project itself built with files that holds the subject of what they are doing.
The helpers.go file, holds functions that are not string related functions, but help creating that string support.
Test coverage ¶
The aim of the following library is to have close to 100% of unit test coverage, and also examples for all existed functions.
GSM 03.38 ¶
GSM 03.38 is a text encoding that is used with sending latin based languages with SMS.
There are several types of encoding that SMS can support, where one of them is GSM 03.38.
UTF16 ¶
UTF16 is 2 byte encoding mechanism. When a character is a single byte, it is padded with NULL.
UTF16 contains a representation of characters either as Big or Little endian. It means that bytes are either appears as `'A', 0x00"`, or `0x00, 'A'`.
In order to understand what type of encoding is in use, and it's endian, there is a BOM -> Byte Order Mark header that provides such information.
Partial content of UTF16 does not a BOM, but a starting of such, should have one. It must be the first (two) bytes of the 'string'.
At the past there was a UCS2 encoding that provided the same encoding as UTF16 Big Endian, and because of that, it was deprecated in favor of of UTF16 Big Endian.
The current support for UTF16, provides three functions:
- Encoding a go string that is UTF8 into UTF16.
- Decoding a UTF16 to go's string (UTF8).
- UTF16 BOM detection.
Index ¶
- Constants
- func Abbreviate(str string, startAt, maxLen int, abbrvChar string) string
- func ByteToStr(b []byte) string
- func CamelCaseToJavascriptCase(str string) string
- func CamelCaseToUnderscore(str string) string
- func CleanPhoneNumber(number string) string
- func ClearByteChars(buf []byte, chars []byte) []byte
- func ClearHebrewPunctuationPoints(s string) string
- func CopyRange(src string, from, to int) (string, error)
- func CountBidiChars(s string) int
- func CountStartBidiLanguage(s string) int
- func DecToHex(dec byte) byte
- func DecodeUTF16(b []byte) (string, error)
- func DetectMimeTypeFromContent(content []byte) (contentType string)
- func EncodeUTF16(s string, bigEndian, addBom bool) []byte
- func GSM0338ToUTF8(text string) string
- func GetBytesRuneIndexInSlice(list []byte, needle rune) int
- func GetStringIndexInSlice(list []string, needle string) int
- func HaveNULL(str string) bool
- func HexToDec(ch byte) byte
- func HexToUTF16Runes(s string, bigEndian bool) []rune
- func In64Split(data, sep string) []int64
- func Int64Join(list []int64, sep string) string
- func IsANSI(str string) bool
- func IsASCII(str string) bool
- func IsE164(number string) bool
- func IsEmpty(s string) bool
- func IsEmptyChars(s string, chars []rune) bool
- func IsFloat(txt string) bool
- func IsHex(ch byte) bool
- func IsInRange(min, max int64, src string) bool
- func IsInteger(txt string) bool
- func IsIsraeliPhoneNumber(number string) bool
- func IsNumber(txt string) bool
- func IsRuneInByteSlice(list []byte, needle rune) bool
- func IsStringInSlice(list []string, needle string) bool
- func IsUFloat(txt string) bool
- func IsUInteger(txt string) bool
- func IsUNumber(txt string) bool
- func IsValidUUID(uuid string) bool
- func KeepByteChars(buf []byte, toKeep []byte) []byte
- func PointerToStr(s *string) string
- func SliceIndex(limit int, predicate func(i int) bool) int
- func SmsCalculateSmsFragments(message string) uint16
- func StrToInt64(str string, def int64) int64
- func StrToPointer(str string) *string
- func StrToUInt64(str string, def uint64) uint64
- func TagIsBidi(tag language.Tag) bool
- func ToFloat32(field string) float32
- func ToFloat32Default(field string, defaultValue float32) float32
- func ToFloat64(field string) float64
- func ToFloat6Default(field string, defaultValue float64) float64
- func ToSQLString(str string) sql.NullString
- func ToSQLStringNotNullable(str string) sql.NullString
- func Truncate(s string, length int) string
- func UInt64Join(list []uint64, sep string) string
- func UTF16Bom(b []byte) int8
- func UTF8ToGsm0338(text string) string
- func Uin64Split(data, sep string) []uint64
Examples ¶
Constants ¶
const ( BOMNone = iota BOMBE BOMLE )
BOM Types
const DefaultEllipse = "…"
DefaultEllipse is the default char for marking abbreviation
Variables ¶
This section is empty.
Functions ¶
func Abbreviate ¶
Abbreviate takes 'str' and based on length returns an abbreviate string with marks that represents middle of words.
str - The original string startAt - Where to start to abbreviate inside a string maxLen - The maximum length for a string. It must be at least 4 chatrs
func ByteToStr ¶
ByteToStr converts slice of bytes to string
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { b := []byte{'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd'} s := gostrutils.ByteToStr(b) fmt.Println(s) }
Output:
func CamelCaseToJavascriptCase ¶
CamelCaseToJavascriptCase convert CamelCase to camelCase
func CamelCaseToUnderscore ¶
CamelCaseToUnderscore converts CamelCase to camel_case
func CleanPhoneNumber ¶
CleanPhoneNumber Remove non numeric values from a phone number
func ClearByteChars ¶ added in v0.0.17
ClearByteChars removes chars from buffer
func ClearHebrewPunctuationPoints ¶
ClearHebrewPunctuationPoints removes Hebrew based punctuation points (Nikud) from a string. If something went wrong, the original string is returned.
func CopyRange ¶
CopyRange returns a copy of a string based on start and end of a rune for multi-byte chars instead of byte based chars (ASCII). 'to' is the amount of chars interested plus one. for example, for a string 'foo' extracting 'oo' by from: 1 and to 3.
func CountBidiChars ¶
CountBidiChars counts how many chars belong to a bi-directional language: Arabic (and sub language), Hebrew, Urdu and Yiddish
func CountStartBidiLanguage ¶
CountStartBidiLanguage returns number of chars from the beginning. If chars are not A bidi range, and not up to 0x40 char, then 0 will be returned. If only chars up to (including) char 0x40 (@) -1 will be returned
Important: result is both bidi chars and up to 0x40 count from the beginning
func DecToHex ¶ added in v0.0.19
DecToHex convert decimal value to hexadecimal as long as the range is between 0 to 15. When value is bigger than 15, 0 will be returned.
func DecodeUTF16 ¶
DecodeUTF16 get a slice of bytes and decode it to UTF-8
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { // Big endian text of TEST text := []byte{ 0xfe, // BOM 0xff, // BOM 0x00, 'T', 0x00, 'E', 0x00, 'S', 0x00, 'T', } test, err := gostrutils.DecodeUTF16(text) if err != nil { panic(err) } fmt.Println("we have 'TEST'? ", test) }
Output:
func DetectMimeTypeFromContent ¶
DetectMimeTypeFromContent takes a slice of bytes, and try to detect the content type that is in use.
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { hello := []byte{0x68, 0x65, 0x6c, 0x6c, 0x6f} fmt.Printf("Mime type of %s is: %s", hello, gostrutils.DetectMimeTypeFromContent(hello)) }
Output:
func EncodeUTF16 ¶
EncodeUTF16 get a utf8 string and translate it into a slice of bytes
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { str := gostrutils.EncodeUTF16("Windows Unicode String", true, true) fmt.Println(gostrutils.ByteToStr(str)) }
Output:
func GSM0338ToUTF8 ¶
GSM0338ToUTF8 convert a GSM0338 string to a UTF8 equivalent chars
The encoding is driven from "Extended ASCII" charsets, and as such compatible with UTF8, and does not require a slice of bytes
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { msg := gostrutils.GSM0338ToUTF8("Please email me to user\x00example.com") fmt.Println("Got sms message: ", msg) }
Output:
func GetBytesRuneIndexInSlice ¶ added in v0.0.17
GetBytesRuneIndexInSlice returns the index of the slice if rune was found or -1 if not. Important, the index returned is of the rune, not of the byte!
func GetStringIndexInSlice ¶
GetStringIndexInSlice returns the index of the slice if string was found or -1 if not
func HaveNULL ¶ added in v0.0.18
HaveNULL look for null char, and return true if found at first place. Empty str returns false.
func HexToDec ¶ added in v0.0.19
HexToDec convert hex decimal ASCII value. If ch is not in the range, the function will return 0
func HexToUTF16Runes ¶
HexToUTF16Runes takes a hex based string and converts it into UTF16 runes
Such string looks like:
"\x00H\x00e\x00l\x00l\x00o\x00 \x00W\x00o\x00r\x00l\x00d" "\x05\xe9\x05\xdc\x05\xd5\x05\xdd\x00 \x05\xe2\x05\xd5\x05\xdc\x05\xdd"
Result of the first string is (big endian):
[U+0048 'H' U+0065 'e' U+006C 'l' U+006C 'l' U+006F 'o' U+0020 ' ' U+0057 'W' U+006F 'o' U+0072 'r' U+006C 'l' U+0064 'd'] Hello World
Result of the second string is (big endian):
[U+05E9 'ש' U+05DC 'ל' U+05D5 'ו' U+05DD 'ם' U+0020 ' ' U+05E2 'ע' U+05D5 'ו' U+05DC 'ל' U+05DD 'ם'] שלום עולם
func Int64Join ¶
Int64Join takes a list of int64 and join them with a separator based on https://play.golang.org/p/KpdkVS1B4s
func IsANSI ¶ added in v0.0.18
IsANSI returns true if the entire string contains only ANSI characters. On empty string, the function returns false.
func IsASCII ¶ added in v0.0.18
IsASCII returns true if the entire string contains only ASCII characters. On empty string, the function returns false.
func IsEmpty ¶
IsEmpty returns true if a string with whitespace only was provided or an empty string
func IsEmptyChars ¶
IsEmptyChars return true if after cleaning given chars the string is empty
func IsInRange ¶ added in v0.0.15
IsInRange takes an integer range and look at the string that contains only numbers tom make sure it is inside the range
func IsIsraeliPhoneNumber ¶
IsIsraeliPhoneNumber check to see if a pettern that looks like Israeli phone number is actually an Israeli phone number
func IsRuneInByteSlice ¶ added in v0.0.17
IsRuneInByteSlice return true if needle exists in list
func IsStringInSlice ¶
IsStringInSlice looks for a string inside a slice and return true if it exists
func IsUInteger ¶
IsUInteger returns true if a string is unsigned integer
func IsValidUUID ¶
IsValidUUID validate if a given string is a uuid. On error it will return false.
The code is not mine, and was taken from: https://stackoverflow.com/questions/25051675/how-to-validate-uuid-v4-in-go#25051754
func KeepByteChars ¶ added in v0.0.17
KeepByteChars removes all chars that are not part of toKeep and return a new slice of bytes
func PointerToStr ¶
PointerToStr converts a pointer string to a normal string
If there is a need for a string to be nil, then usually there is a pointer for a string involved. The current function converts nil pointer to empty string, or return the string itself
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { empty := gostrutils.PointerToStr(nil) hello := "hello" pHello := &hello newHello := gostrutils.PointerToStr(pHello) fmt.Printf("empty: %s, pHello: %s\n", empty, newHello) }
Output:
func SliceIndex ¶
SliceIndex returns a slice index based on a callback value
func SmsCalculateSmsFragments ¶
SmsCalculateSmsFragments calculate based on a string how many SMS messages will it take to deliver a massage
func StrToInt64 ¶
StrToInt64 convert a string to int64
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { meaning := gostrutils.StrToInt64("42", 0) fmt.Println("Meaning of life universe and everything: ", meaning) }
Output:
func StrToPointer ¶
StrToPointer returns the address of string.
The function can help when there is a need to convert literal string yo be have a pointer for that content.
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { hello := "hello" addr := gostrutils.StrToPointer(hello) fmt.Printf("String: %s, Address: %p\n", hello, addr) }
Output:
func StrToUInt64 ¶
StrToUInt64 convert string to uint64
func ToFloat32Default ¶
ToFloat32Default convert string to float32 without errors. If error returns, defaultValue is set instead.
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { pi := gostrutils.ToFloat32Default("3.141592653", 3.14) fmt.Printf("Pi: %f\n", pi) }
Output:
func ToFloat6Default ¶
ToFloat6Default convert string to float64 without errors. If error returns, default is set instead.
func ToSQLString ¶
func ToSQLString(str string) sql.NullString
ToSQLString convert string to sql.NullString
func ToSQLStringNotNullable ¶
func ToSQLStringNotNullable(str string) sql.NullString
ToSQLStringNotNullable convert string to sql.NullString not nullable
func Truncate ¶
Truncate a string to smaller parts, UTF8 multi-byte wise.
When using s[:3] to truncate to 3 chars, it will not take runes but rather bytes. A multi-byte char (rune) can contain between one to 4 bytes, but not all of them will be returned.
The following function will return the number of runes, and not the number of bytes inside a string.
func UInt64Join ¶
UInt64Join takes a list of uint64 and join them with a separator based on https://play.golang.org/p/KpdkVS1B4s
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { list := []uint64{0, 1, 1, 2, 3, 5, 8, 13, 21, 34} fibonacci := gostrutils.UInt64Join(list, ", ") fmt.Printf("First 10 fibonacci sequence: %s\n", fibonacci) }
Output:
func UTF16Bom ¶
UTF16Bom returns 0 for no BOM, 1 for Big Endian and 2 for little endian it will return -1 if b is too small for having BOM
func UTF8ToGsm0338 ¶
UTF8ToGsm0338 convert a UTF8 string to a gsm0338 equivalent chars
The encoding is driven from "Extended ASCII" charsets, and as such compatible with UTF8, and does not require a slice of bytes
Example ¶
package main import ( "github.com/ik5/gostrutils" ) func sendSmsMessage(msg string) bool { return true } func main() { msg := gostrutils.UTF8ToGsm0338("Please email me to user@example.com") // send the SMS message here if !sendSmsMessage(msg) { panic("Cannot send a message") } }
Output:
func Uin64Split ¶
Uin64Split get a string with separator and convert it to a slice of uint64
Example ¶
package main import ( "fmt" "github.com/ik5/gostrutils" ) func main() { fibonacci := "0, 1, 1, 2, 3, 5, 8, 13, 21, 34" list := gostrutils.Uin64Split(fibonacci, ", ") fmt.Printf("List: %+v\n", list) }
Output:
Types ¶
This section is empty.