Documentation ¶
Overview ¶
Package sortfile provides functions to sort a file by lines, using either an in-memory sort or an external merge sort.
## Usage
```go
import "github.com/KEINOS/go-sortfile/sortfile"
```
Index ¶
- Constants
- func ExternalFile(sizeFileIn, sizeChunk datasize.InBytes, ptrFileIn io.Reader, ...) error
- func FileExists(pathFile string) bool
- func FromPath(pathFileIn, pathFileOut string, forceExternalSort bool) error
- func FromPathFunc(pathFileIn, pathFileOut string, forceExternalSort bool, ...) error
- func InMemory(numLines int, input io.Reader, output io.Writer, ...) error
Constants ¶
	const (
		LF   = "\n"    // LF is the line feed character
		CR   = "\r"    // CR is the carriage return character
		CRLF = CR + LF // CRLF is the carriage return and line feed character
	)
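The documentation does not say how the package uses these constants internally. As a rough illustration of what they are good for, the sketch below normalizes mixed line endings to LF before sorting. The constants are redeclared locally so the example is self-contained, and `normalizeNewlines` is a hypothetical helper, not part of sortfile.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies mirroring the constants documented above.
const (
	LF   = "\n"
	CR   = "\r"
	CRLF = CR + LF
)

// normalizeNewlines is a hypothetical helper (not part of sortfile) that
// converts CRLF and bare CR line endings to LF.
func normalizeNewlines(s string) string {
	// Replace CRLF first so that its CR is not converted a second time.
	s = strings.ReplaceAll(s, CRLF, LF)

	return strings.ReplaceAll(s, CR, LF)
}

func main() {
	fmt.Printf("%q\n", normalizeNewlines("Bob\r\nAlice\rCarol\n"))
	// Prints: "Bob\nAlice\nCarol\n"
}
```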
Variables ¶
This section is empty.
Functions ¶
func ExternalFile ¶
func ExternalFile(sizeFileIn, sizeChunk datasize.InBytes, ptrFileIn io.Reader, ptrFileOut io.Writer, isLess func(string, string) bool) error
ExternalFile sorts the file using external merge sort (K-way merge sort).
The isLess argument is a function that compares two lines. If isLess is nil, the default is used.
	// Default isLess function
	func isLess(a, b string) bool {
		return a < b // to reverse the sort, use a > b
	}
If sizeFileIn is smaller than sizeChunk, we recommend using InMemory instead.
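To make the idea of a K-way merge concrete, here is a minimal sketch that works on slices rather than files. It is not the package's implementation: `sortInChunks`, `mergeChunks`, `item`, and `mergeHeap` are hypothetical names, and a real external sort would write each sorted chunk to a temporary file and stream the merge instead of holding everything in memory.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// item is one candidate line plus the index of the chunk it came from.
type item struct {
	line  string
	chunk int
}

// mergeHeap is a min-heap of items ordered by isLess.
type mergeHeap struct {
	items  []item
	isLess func(a, b string) bool
}

func (h *mergeHeap) Len() int           { return len(h.items) }
func (h *mergeHeap) Less(i, j int) bool { return h.isLess(h.items[i].line, h.items[j].line) }
func (h *mergeHeap) Swap(i, j int)      { h.items[i], h.items[j] = h.items[j], h.items[i] }
func (h *mergeHeap) Push(x interface{}) { h.items = append(h.items, x.(item)) }
func (h *mergeHeap) Pop() interface{} {
	last := h.items[len(h.items)-1]
	h.items = h.items[:len(h.items)-1]

	return last
}

// sortInChunks splits lines into chunks of at most chunkSize lines and sorts
// each chunk independently. chunkSize must be greater than zero.
func sortInChunks(lines []string, chunkSize int, isLess func(a, b string) bool) [][]string {
	var chunks [][]string

	for start := 0; start < len(lines); start += chunkSize {
		end := start + chunkSize
		if end > len(lines) {
			end = len(lines)
		}

		chunk := append([]string(nil), lines[start:end]...) // copy, keep input intact
		sort.SliceStable(chunk, func(i, j int) bool { return isLess(chunk[i], chunk[j]) })
		chunks = append(chunks, chunk)
	}

	return chunks
}

// mergeChunks merges k sorted chunks by repeatedly taking the smallest head
// line from the heap and refilling from the chunk that line came from.
func mergeChunks(chunks [][]string, isLess func(a, b string) bool) []string {
	h := &mergeHeap{isLess: isLess}
	next := make([]int, len(chunks)) // next unread position in each chunk

	for i, c := range chunks {
		if len(c) > 0 {
			h.items = append(h.items, item{line: c[0], chunk: i})
			next[i] = 1
		}
	}

	heap.Init(h)

	var out []string

	for h.Len() > 0 {
		top := heap.Pop(h).(item)
		out = append(out, top.line)

		if n := next[top.chunk]; n < len(chunks[top.chunk]) {
			heap.Push(h, item{line: chunks[top.chunk][n], chunk: top.chunk})
			next[top.chunk]++
		}
	}

	return out
}

func main() {
	isLess := func(a, b string) bool { return a < b }
	lines := []string{"Eve", "Bob", "Zoe", "Alice", "Dave", "Carol", "Frank"}

	fmt.Println(mergeChunks(sortInChunks(lines, 3, isLess), isLess))
	// Prints: [Alice Bob Carol Dave Eve Frank Zoe]
}
```

The heap never holds more than one line per chunk, which is why this approach suits inputs that do not fit in memory.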
Example ¶
	package main

	import (
		"log"
		"os"
		"path/filepath"

		"github.com/KEINOS/go-sortfile/sortfile"
		"github.com/KEINOS/go-sortfile/sortfile/datasize"
	)

	func main() {
		exitOnError := func(err error) {
			if err != nil {
				log.Fatal(err)
			}
		}

		// Input and output file paths
		pathFileIn := filepath.Join("testdata", "sorted_chunks", "input_shuffled.txt")

		// Get file and memory information
		sizeFileIn, _, err := datasize.File(pathFileIn)
		exitOnError(err)

		sizeMemoryFree, err := datasize.AvailableMemory()
		exitOnError(err)

		// Open the file to read
		fileIn, err := os.Open(pathFileIn)
		exitOnError(err)

		defer fileIn.Close()

		fileOut := os.Stdout

		// External merge sort with sizeMemoryFree as the chunk size. Use the default
		// sort function (by nil).
		err = sortfile.ExternalFile(sizeFileIn, sizeMemoryFree, fileIn, fileOut, nil)
		exitOnError(err)
	}
Output:

	Alice
	Bob
	Carol
	Charlie
	Dave
	Ellen
	Eve
	Frank
	Isaac
	Ivan
	Justin
	Mallet
	Mallory
	Marvin
	Matilda
	Oscar
	Pat
	Peggy
	Steve
	Trent
	Trudy
	Victor
	Walter
	Zoe
func FileExists ¶
func FileExists(pathFile string) bool
FileExists returns true if the path exists and is a file.
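A check with this behavior can be written with the standard library alone. The sketch below is an assumption about how such a function might look, not the package's source; `isRegularFile` is a hypothetical name.

```go
package main

import (
	"fmt"
	"os"
)

// isRegularFile is a hypothetical stand-in for a FileExists-style check. It
// reports true only when the path exists and refers to a regular file.
func isRegularFile(path string) bool {
	info, err := os.Stat(path)
	if err != nil {
		return false // path is missing or cannot be inspected
	}

	return info.Mode().IsRegular()
}

func main() {
	fmt.Println(isRegularFile(os.TempDir()))                // directory: false
	fmt.Println(isRegularFile("unknown-non-existing-file")) // missing: false
}
```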
Example ¶
	package main

	import (
		"fmt"
		"os"
		"path/filepath"

		"github.com/KEINOS/go-sortfile/sortfile"
	)

	func main() {
		for _, pathTarget := range []string{
			filepath.Join("testdata", "sorted_chunks", "input_shuffled.txt"), // Existing file
			os.TempDir(),                // Exists but not a file
			"unknown-non-existing-file", // Not exists
		} {
			exists := sortfile.FileExists(pathTarget)

			fmt.Println("Is file:", exists, ":", pathTarget)
		}
	}
Output:

	Is file: true : testdata/sorted_chunks/input_shuffled.txt
	Is file: false : /var/folders/8c/lmckjks95fj4h_jqzw4v3k_w0000gn/T/
	Is file: false : unknown-non-existing-file
func FromPath ¶
func FromPath(pathFileIn, pathFileOut string, forceExternalSort bool) error
FromPath sorts the file by lines and stores the result in the given path.
It will sort in-memory if the file size is smaller than the current free memory. Otherwise it will use the external merge sort.
It is similar to FromPathFunc() but it uses the default isLess() function.
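The selection rule described above can be sketched as a small decision function. This is illustrative only: `chooseStrategy` is a hypothetical name, and the assumption that forceExternalSort bypasses the size check is inferred from the parameter name rather than stated in the documentation.

```go
package main

import "fmt"

// chooseStrategy is a hypothetical helper that mirrors the documented rule:
// sort in-memory when the file is smaller than the free memory, otherwise use
// the external merge sort. Treating forceExternalSort as an override is an
// assumption based on the parameter name.
func chooseStrategy(fileSize, freeMemory uint64, forceExternalSort bool) string {
	if forceExternalSort || fileSize >= freeMemory {
		return "external"
	}

	return "in-memory"
}

func main() {
	const oneGiB = uint64(1) << 30

	fmt.Println(chooseStrategy(145, oneGiB, false))      // in-memory
	fmt.Println(chooseStrategy(145, oneGiB, true))       // external
	fmt.Println(chooseStrategy(2*oneGiB, oneGiB, false)) // external
}
```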
Example ¶
	package main

	import (
		"fmt"
		"log"
		"os"
		"path/filepath"

		"github.com/KEINOS/go-sortfile/sortfile"
	)

	func main() {
		exitOnError := func(err error) {
			if err != nil {
				log.Fatal(err)
			}
		}

		// Input and output file paths
		pathFileIn := filepath.Join("testdata", "sorted_chunks", "input_shuffled.txt")
		pathFileOut := filepath.Join(os.TempDir(), "pkg-sortfile_example_from_path.txt")

		// Clean up the output file after the test
		defer func() {
			exitOnError(os.Remove(pathFileOut))
		}()

		// Sort file in-memory since the file size is small
		forceExternalSort := false // auto detect

		err := sortfile.FromPath(pathFileIn, pathFileOut, forceExternalSort)
		exitOnError(err)

		// Print the result
		data, err := os.ReadFile(pathFileOut)
		exitOnError(err)

		fmt.Println(string(data))
	}
Output:

	Alice
	Bob
	Carol
	Charlie
	Dave
	Ellen
	Eve
	Frank
	Isaac
	Ivan
	Justin
	Mallet
	Mallory
	Marvin
	Matilda
	Oscar
	Pat
	Peggy
	Steve
	Trent
	Trudy
	Victor
	Walter
	Zoe
func FromPathFunc ¶
func FromPathFunc(pathFileIn, pathFileOut string, forceExternalSort bool, isLess func(string, string) bool) error
FromPathFunc sorts the file by lines and stores the result in the given path.
It will sort in-memory if the file size is smaller than the current free memory. Otherwise it will use the external merge sort.
It is similar to FromPath() but it allows you to specify your own isLess() function. If isLess is nil, it will use the default isLess() function.
	// Default isLess function
	func isLess(a, b string) bool {
		return a < b // to reverse the sort, use a > b
	}
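Beyond reversing the order, isLess can encode any line ordering. As one possible example, the sketch below defines a case-insensitive comparison. `caseInsensitiveLess` is a hypothetical name, and the standard library's `sort.SliceStable` stands in for the package's sort so that the example runs on its own.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// caseInsensitiveLess is a hypothetical isLess function that ignores letter
// case when comparing two lines.
func caseInsensitiveLess(a, b string) bool {
	return strings.ToLower(a) < strings.ToLower(b)
}

func main() {
	lines := []string{"bob", "Alice", "carol", "Dave"}

	// The default byte-wise comparison (a < b) would place "Dave" before "bob".
	sort.SliceStable(lines, func(i, j int) bool {
		return caseInsensitiveLess(lines[i], lines[j])
	})

	fmt.Println(lines)
	// Prints: [Alice bob carol Dave]
}
```

A function with this signature could then be passed as the isLess argument of FromPathFunc.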
Example ¶
	package main

	import (
		"fmt"
		"log"
		"os"
		"path/filepath"

		"github.com/KEINOS/go-sortfile/sortfile"
	)

	func main() {
		exitOnError := func(err error) {
			if err != nil {
				log.Fatal(err)
			}
		}

		// Input and output file paths
		pathFileIn := filepath.Join("testdata", "sorted_chunks", "input_shuffled.txt")
		pathFileOut := filepath.Join(os.TempDir(), "pkg-sortfile_example_from_path.txt")

		// Clean up the output file after the test
		defer func() {
			exitOnError(os.Remove(pathFileOut))
		}()

		// Sort file in-memory since the file size is small
		forceExternalSort := false // auto detect

		// User defined sort function (reverse sort)
		isLess := func(a, b string) bool {
			return a > b
		}

		err := sortfile.FromPathFunc(pathFileIn, pathFileOut, forceExternalSort, isLess)
		exitOnError(err)

		// Print the result
		data, err := os.ReadFile(pathFileOut)
		exitOnError(err)

		fmt.Println(string(data))
	}
Output:

	Zoe
	Walter
	Victor
	Trudy
	Trent
	Steve
	Peggy
	Pat
	Oscar
	Matilda
	Marvin
	Mallory
	Mallet
	Justin
	Ivan
	Isaac
	Frank
	Eve
	Ellen
	Dave
	Charlie
	Carol
	Bob
	Alice
func InMemory ¶
func InMemory(numLines int, input io.Reader, output io.Writer, isLess func(string, string) bool) error
InMemory sorts the lines in-memory from the given io.Reader and writes the result to the given io.Writer. Note that the number of lines is required to be known in advance.
Usually it is recommended to use the FromPath() function which detects whether to use the in-memory sort or the external merge sort.
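Knowing the line count up front lets an implementation allocate its buffer once. The sketch below shows that idea with the standard library. It is an assumption about the general technique, not the package's code: `sortLines` and `sortText` are hypothetical names, and `bufio.Scanner` is used with its default limit of 64 KiB per line.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"sort"
	"strings"
)

// sortLines is a hypothetical in-memory line sort. numLines is used only to
// preallocate the slice, so all lines are stored without reallocation.
func sortLines(numLines int, input io.Reader, output io.Writer, isLess func(a, b string) bool) error {
	if isLess == nil {
		isLess = func(a, b string) bool { return a < b } // default ordering
	}

	lines := make([]string, 0, numLines)

	scanner := bufio.NewScanner(input)
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}

	if err := scanner.Err(); err != nil {
		return err
	}

	sort.SliceStable(lines, func(i, j int) bool { return isLess(lines[i], lines[j]) })

	for _, line := range lines {
		if _, err := fmt.Fprintln(output, line); err != nil {
			return err
		}
	}

	return nil
}

// sortText is a convenience wrapper that sorts the lines of a string.
func sortText(text string, isLess func(a, b string) bool) (string, error) {
	var out strings.Builder

	numLines := strings.Count(text, "\n")
	if err := sortLines(numLines, strings.NewReader(text), &out, isLess); err != nil {
		return "", err
	}

	return out.String(), nil
}

func main() {
	sorted, err := sortText("Carol\nAlice\nBob\n", nil)
	if err != nil {
		fmt.Println("error:", err)

		return
	}

	fmt.Print(sorted)
}
```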
Example ¶
Example of using the in-memory sort.
Note that the number of lines is required to be known in advance. Usually it is recommended to use the FromPath() function which detects whether to use the in-memory sort or the external merge sort.
	package main

	import (
		"fmt"
		"log"
		"os"
		"path/filepath"

		"github.com/KEINOS/go-sortfile/sortfile"
		"github.com/KEINOS/go-sortfile/sortfile/datasize"
	)

	func main() {
		pathFileIn := filepath.Join("testdata", "sorted_chunks", "input_shuffled.txt")

		// Get the file size and the number of lines in the file
		sizeFile, numLines, err := datasize.File(pathFileIn)
		if err != nil {
			log.Fatal(err)
		}

		fmt.Println("File size:", sizeFile)

		// Open the input file
		ptrFileIn, err := os.Open(pathFileIn)
		if err != nil {
			log.Fatal(err)
		}

		defer ptrFileIn.Close()

		// Output to stdout
		ptrFileOut := os.Stdout

		// Custom sort function as reverse alphabetical order
		isLess := func(a, b string) bool {
			return a > b
		}

		// Sort the file in-memory using the custom isLess function. Pass nil
		// instead to use the default isLess function.
		if err := sortfile.InMemory(numLines, ptrFileIn, ptrFileOut, isLess); err != nil {
			log.Fatal(err)
		}
	}
Output:

	File size: 145 Bytes
	Zoe
	Walter
	Victor
	Trudy
	Trent
	Steve
	Peggy
	Pat
	Oscar
	Matilda
	Marvin
	Mallory
	Mallet
	Justin
	Ivan
	Isaac
	Frank
	Eve
	Ellen
	Dave
	Charlie
	Carol
	Bob
	Alice
Types ¶
This section is empty.
Directories ¶
Path | Synopsis |
---|---|
chunk | Package chunk is a chunk file manager. |
datasize | Package datasize defines the type InBytes which represents a size in bytes. |
inmemory | Package inmemory provides sorting algorithms for in-memory data. |