Documentation ¶
Overview ¶
Package purell offers URL normalization as described on the Wikipedia page: http://en.wikipedia.org/wiki/URL_normalization
This file implements query escaping as per RFC 3986. It contains parts of the net/url package, modified so that some reserved characters that net/url escapes incorrectly are left unescaped. See https://github.com/golang/go/issues/5684
Functions ¶
func MustNormalizeURLString ¶
func MustNormalizeURLString(u string, f NormalizationFlags) string
MustNormalizeURLString returns the normalized string, and panics if an error occurs. It takes a URL string as input, as well as the normalization flags.
Example ¶
    normalized := MustNormalizeURLString("hTTpS://someWEBsite.com:443/Amazing%fa/url/", FlagsUnsafeGreedy)
    fmt.Print(normalized)
Output: http://somewebsite.com/Amazing%FA/url
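The Must prefix follows a common Go convention: wrap a function that returns an error and panic instead. The sketch below illustrates that convention with the standard library only. The helpers mustParse and safeHost are hypothetical names used for illustration and are not part of purell; the sketch also shows one way a caller might turn such a panic back into an error when the input is untrusted.

```go
package main

import (
	"fmt"
	"net/url"
)

// mustParse is a hypothetical helper illustrating the Must* convention:
// it returns the parsed URL, or panics if parsing fails.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}

// safeHost calls mustParse and converts a panic back into an error,
// which is one way to guard a Must* call against untrusted input.
func safeHost(raw string) (host string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("invalid URL %q: %v", raw, r)
		}
	}()
	return mustParse(raw).Host, nil
}

func main() {
	fmt.Println(safeHost("https://example.com/path"))
	fmt.Println(safeHost("http://example.com/%zz")) // invalid escape, so parsing fails
}
```

For input that is not known to be valid, the error-returning NormalizeURLString is usually the simpler choice.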
func NormalizeURL ¶
func NormalizeURL(u *url.URL, f NormalizationFlags) string
NormalizeURL returns the normalized string. It takes a parsed URL object as input, as well as the normalization flags.
Example ¶
    if u, err := url.Parse("Http://SomeUrl.com:8080/a/b/.././c///g?c=3&a=1&b=9&c=0#target"); err != nil {
        panic(err)
    } else {
        normalized := NormalizeURL(u, FlagsUsuallySafeGreedy|FlagRemoveDuplicateSlashes|FlagRemoveFragment)
        fmt.Print(normalized)
    }
Output: http://someurl.com:8080/a/c/g?c=3&a=1&b=9&c=0
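To make the transformations in this example concrete, here is a rough standard-library approximation of the same steps: lowercasing the host, resolving dot segments, collapsing duplicate slashes, and dropping the fragment. It is a sketch under stated assumptions, not purell's implementation. The function normalizeSketch is a hypothetical name, and path.Clean is swapped in as a shortcut that also strips a trailing slash and operates on the decoded path.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// normalizeSketch is a hypothetical stand-in that approximates a few
// normalizations with the standard library. It is not purell's algorithm.
func normalizeSketch(raw string) (string, error) {
	u, err := url.Parse(raw) // url.Parse already lowercases the scheme
	if err != nil {
		return "", err
	}
	u.Host = strings.ToLower(u.Host)
	if u.Path != "" {
		// path.Clean resolves "." and ".." and collapses repeated slashes.
		// Caveat: it also removes a trailing slash.
		u.Path = path.Clean(u.Path)
		u.RawPath = "" // force re-encoding from the cleaned Path
	}
	u.Fragment = "" // drop "#target"
	return u.String(), nil
}

func main() {
	s, err := normalizeSketch("Http://SomeUrl.com:8080/a/b/.././c///g?c=3&a=1&b=9&c=0#target")
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // http://someurl.com:8080/a/c/g?c=3&a=1&b=9&c=0
}
```

The query string is left in its original order in both the sketch and the example above, because FlagSortQuery is not among the flags passed.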
func NormalizeURLString ¶
func NormalizeURLString(u string, f NormalizationFlags) (string, error)
NormalizeURLString returns the normalized string, or an error if it can't be parsed into a URL object. It takes a URL string as input, as well as the normalization flags.
Example ¶
    if normalized, err := NormalizeURLString("hTTp://someWEBsite.com:80/Amazing%3f/url/", FlagLowercaseScheme|FlagLowercaseHost|FlagUppercaseEscapes); err != nil {
        panic(err)
    } else {
        fmt.Print(normalized)
    }
Output: http://somewebsite.com:80/Amazing%3F/url/
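The %3f to %3F change in this output comes from FlagUppercaseEscapes. As an illustration of what that normalization amounts to, the sketch below uppercases the hex digits of every percent-escape with a regular expression. The name uppercaseEscapes and the regexp-based approach are assumptions made for this example; purell may implement the step differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// escapeRx matches one percent-encoded octet such as %3f or %FA.
var escapeRx = regexp.MustCompile(`%[0-9a-fA-F]{2}`)

// uppercaseEscapes is a hypothetical helper that uppercases the hex digits
// of each percent-escape and leaves all other characters untouched.
func uppercaseEscapes(s string) string {
	return escapeRx.ReplaceAllStringFunc(s, strings.ToUpper)
}

func main() {
	fmt.Println(uppercaseEscapes("/Amazing%3f/url/")) // /Amazing%3F/url/
}
```

The port :80 survives in the output above because FlagRemoveDefaultPort is not among the flags passed to NormalizeURLString.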
Types ¶
type NormalizationFlags ¶
type NormalizationFlags uint
A set of normalization flags determines how a URL will be normalized.
    const (
        // Safe normalizations
        FlagLowercaseScheme           NormalizationFlags = 1 << iota // HTTP://host -> http://host, applied by default in Go1.1
        FlagLowercaseHost                                            // http://HOST -> http://host
        FlagUppercaseEscapes                                         // http://host/t%ef -> http://host/t%EF
        FlagDecodeUnnecessaryEscapes                                 // http://host/t%41 -> http://host/tA
        FlagEncodeNecessaryEscapes                                   // http://host/!"#$ -> http://host/%21%22#$
        FlagRemoveDefaultPort                                        // http://host:80 -> http://host
        FlagRemoveEmptyQuerySeparator                                // http://host/path? -> http://host/path

        // Usually safe normalizations
        FlagRemoveTrailingSlash // http://host/path/ -> http://host/path
        FlagAddTrailingSlash    // http://host/path -> http://host/path/ (should choose only one of these add/remove trailing slash flags)
        FlagRemoveDotSegments   // http://host/path/./a/b/../c -> http://host/path/a/c

        // Unsafe normalizations
        FlagRemoveDirectoryIndex   // http://host/path/index.html -> http://host/path/
        FlagRemoveFragment         // http://host/path#fragment -> http://host/path
        FlagForceHTTP              // https://host -> http://host
        FlagRemoveDuplicateSlashes // http://host/path//a///b -> http://host/path/a/b
        FlagRemoveWWW              // http://www.host/ -> http://host/
        FlagAddWWW                 // http://host/ -> http://www.host/ (should choose only one of these add/remove WWW flags)
        FlagSortQuery              // http://host/path?c=3&b=2&a=1&b=1 -> http://host/path?a=1&b=1&b=2&c=3

        // Normalizations not in the Wikipedia article, required to cover test cases
        // submitted by jehiah
        FlagDecodeDWORDHost           // http://1113982867 -> http://66.102.7.147
        FlagDecodeOctalHost           // http://0102.0146.07.0223 -> http://66.102.7.147
        FlagDecodeHexHost             // http://0x42660793 -> http://66.102.7.147
        FlagRemoveUnnecessaryHostDots // http://.host../path -> http://host/path
        FlagRemoveEmptyPortSeparator  // http://host:/path -> http://host/path

        // Convenience set of safe normalizations
        FlagsSafe NormalizationFlags = FlagLowercaseHost | FlagLowercaseScheme | FlagUppercaseEscapes | FlagDecodeUnnecessaryEscapes | FlagEncodeNecessaryEscapes | FlagRemoveDefaultPort | FlagRemoveEmptyQuerySeparator

        // Convenience set of usually safe normalizations (includes FlagsSafe)
        FlagsUsuallySafeGreedy    NormalizationFlags = FlagsSafe | FlagRemoveTrailingSlash | FlagRemoveDotSegments
        FlagsUsuallySafeNonGreedy NormalizationFlags = FlagsSafe | FlagAddTrailingSlash | FlagRemoveDotSegments

        // Convenience set of unsafe normalizations (includes FlagsUsuallySafe)
        FlagsUnsafeGreedy    NormalizationFlags = FlagsUsuallySafeGreedy | FlagRemoveDirectoryIndex | FlagRemoveFragment | FlagForceHTTP | FlagRemoveDuplicateSlashes | FlagRemoveWWW | FlagSortQuery
        FlagsUnsafeNonGreedy NormalizationFlags = FlagsUsuallySafeNonGreedy | FlagRemoveDirectoryIndex | FlagRemoveFragment | FlagForceHTTP | FlagRemoveDuplicateSlashes | FlagAddWWW | FlagSortQuery

        // Convenience set of all available flags
        FlagsAllGreedy    = FlagsUnsafeGreedy | FlagDecodeDWORDHost | FlagDecodeOctalHost | FlagDecodeHexHost | FlagRemoveUnnecessaryHostDots | FlagRemoveEmptyPortSeparator
        FlagsAllNonGreedy = FlagsUnsafeNonGreedy | FlagDecodeDWORDHost | FlagDecodeOctalHost | FlagDecodeHexHost | FlagRemoveUnnecessaryHostDots | FlagRemoveEmptyPortSeparator
    )
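The flags above form a bitmask: each flag occupies one bit, a convenience set is the bitwise OR of its members, and a single flag can be masked out of a set with &^. The following self-contained sketch shows that pattern with a hypothetical local type, flags, and three made-up constants that mirror the 1 << iota layout. It does not use purell's own identifiers, and the apply function is only a toy dispatcher.

```go
package main

import (
	"fmt"
	"strings"
)

// flags is a hypothetical local type mirroring the 1 << iota layout.
type flags uint

const (
	lowercaseHost flags = 1 << iota // bit 0
	removeTrailingSlash             // bit 1
	removeWWW                       // bit 2

	// A convenience set is just the bitwise OR of individual flags.
	greedy = lowercaseHost | removeTrailingSlash | removeWWW
)

// has reports whether every bit of flag is set in f.
func (f flags) has(flag flags) bool { return f&flag == flag }

// apply runs only the transformations whose flag is set.
func apply(host, path string, f flags) string {
	if f.has(lowercaseHost) {
		host = strings.ToLower(host)
	}
	if f.has(removeWWW) {
		host = strings.TrimPrefix(host, "www.")
	}
	if f.has(removeTrailingSlash) {
		path = strings.TrimSuffix(path, "/")
	}
	return host + path
}

func main() {
	fmt.Println(apply("WWW.Example.com", "/a/", lowercaseHost))               // www.example.com/a/
	fmt.Println(apply("WWW.Example.com", "/a/", greedy))                      // example.com/a
	fmt.Println(apply("WWW.Example.com", "/a/", greedy&^removeTrailingSlash)) // example.com/a/
}
```

With purell itself, the same &^ idiom should let a caller start from a convenience set such as FlagsUnsafeGreedy and switch off one normalization, for example FlagForceHTTP, since NormalizationFlags is an unsigned integer type.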