runes

package
v0.0.0-...-50d4735

Warning: This package is not in the latest version of its module.
Published: Dec 9, 2024
License: Apache-2.0, BSD-3-Clause
Imports: 2
Imported by: 0

Documentation

Overview

Package runes provides interfaces and utilities for working with runes.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Buffer

type Buffer interface {
	Get(i int) rune
	Slice(i, j int) string
	Len() int
}

Buffer is an interface for accessing a contiguous array of code points.
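
To make the contract concrete, here is a minimal sketch of a type that satisfies Buffer. The runeBuffer type is hypothetical and not part of this package, and the half-open range semantics of Slice(i, j) are an assumption modeled on Go slice expressions, not something the documentation states.

```go
package main

import "fmt"

// Buffer mirrors the interface documented above.
type Buffer interface {
	Get(i int) rune
	Slice(i, j int) string
	Len() int
}

// runeBuffer is a hypothetical Buffer backed by a plain []rune.
// It is the simplest possible implementation, not the package's own.
type runeBuffer []rune

// Get returns the code point at index i.
func (b runeBuffer) Get(i int) rune { return b[i] }

// Slice returns the code points in [i, j) as a string (assumed semantics).
func (b runeBuffer) Slice(i, j int) string { return string(b[i:j]) }

// Len returns the number of code points, not the number of bytes.
func (b runeBuffer) Len() int { return len(b) }

func main() {
	// "héllo, 世界" written with escapes so the code points are unambiguous.
	var buf Buffer = runeBuffer([]rune("h\u00e9llo, \u4e16\u754c"))
	fmt.Println(buf.Len())          // 9 code points (14 bytes of UTF-8)
	fmt.Println(string(buf.Get(1))) // é
	fmt.Println(buf.Slice(7, 9))    // 世界
}
```

Indexing by code point is the point of the interface: the same positions on the original string would require decoding UTF-8 from the start each time.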

func NewBuffer

func NewBuffer(data string) Buffer

NewBuffer returns an efficient implementation of Buffer for the given text based on the ranges of the encoded code points contained within.

Code points are stored in an array of byte, uint16, or rune, whichever is the narrowest type that can hold every code point in the text. Each index therefore refers to exactly one code point, without always requiring an array of rune. Initially, all code points are assumed to be less than or equal to '\u007f'. If that holds for the whole text, the underlying storage is a byte array containing only ASCII characters. If a code point above this range but less than or equal to '\uffff' is encountered, a uint16 array is allocated, the elements of the previous byte array are copied into it, and processing continues. If no larger code point appears, the underlying storage is a uint16 array containing only Unicode characters in the Basic Multilingual Plane. If a code point above '\uffff' is encountered, a rune array is allocated, the previous elements of the byte or uint16 array are copied into it, and processing continues. In that case, the underlying storage is a rune array that can hold any Unicode character.
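
The widening strategy described above can be sketched as follows. This is an illustration under stated assumptions, not the package's source: the packed type and the pack function are hypothetical names, and the real implementation may differ in how it allocates and copies.

```go
package main

import "fmt"

// packed is a hypothetical holder for the chosen storage.
// Only the slice that matches width is populated.
type packed struct {
	width int // bytes per code point: 1, 2, or 4
	b     []byte
	u     []uint16
	r     []rune
}

// pack stores data in the narrowest array that fits every code point,
// widening the storage the first time a larger code point appears.
func pack(data string) packed {
	p := packed{width: 1}
	for _, c := range data {
		switch {
		case p.width == 1 && c > 0xffff:
			// byte -> rune: copy the ASCII prefix into a rune array.
			p.r = make([]rune, len(p.b))
			for i, v := range p.b {
				p.r[i] = rune(v)
			}
			p.b, p.width = nil, 4
		case p.width == 1 && c > 0x7f:
			// byte -> uint16: copy the ASCII prefix into a uint16 array.
			p.u = make([]uint16, len(p.b))
			for i, v := range p.b {
				p.u[i] = uint16(v)
			}
			p.b, p.width = nil, 2
		case p.width == 2 && c > 0xffff:
			// uint16 -> rune: copy the BMP prefix into a rune array.
			p.r = make([]rune, len(p.u))
			for i, v := range p.u {
				p.r[i] = rune(v)
			}
			p.u, p.width = nil, 4
		}
		// Append the current code point to whichever array is active.
		switch p.width {
		case 1:
			p.b = append(p.b, byte(c))
		case 2:
			p.u = append(p.u, uint16(c))
		default:
			p.r = append(p.r, c)
		}
	}
	return p
}

func main() {
	fmt.Println(pack("abc").width)           // 1: ASCII only
	fmt.Println(pack("h\u00e9").width)       // 2: stays within the BMP
	fmt.Println(pack("a\U0001F600").width)   // 4: beyond the BMP
}
```

Text that is entirely ASCII costs one byte per code point under this scheme, and the copy on widening happens at most twice per input.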

func NewBufferAndLineOffsets

func NewBufferAndLineOffsets(data string) (Buffer, []int32)

NewBufferAndLineOffsets returns an efficient implementation of Buffer for the given text, chosen based on the ranges of the encoded code points it contains, together with the line offsets of the text.

Code points are stored in an array of byte, uint16, or rune, whichever is the narrowest type that can hold every code point in the text. Each index therefore refers to exactly one code point, without always requiring an array of rune. Initially, all code points are assumed to be less than or equal to '\u007f'. If that holds for the whole text, the underlying storage is a byte array containing only ASCII characters. If a code point above this range but less than or equal to '\uffff' is encountered, a uint16 array is allocated, the elements of the previous byte array are copied into it, and processing continues. If no larger code point appears, the underlying storage is a uint16 array containing only Unicode characters in the Basic Multilingual Plane. If a code point above '\uffff' is encountered, a rune array is allocated, the previous elements of the byte or uint16 array are copied into it, and processing continues. In that case, the underlying storage is a rune array that can hold any Unicode character.
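
The documentation does not spell out the convention used for the returned line offsets. The sketch below assumes one plausible reading: each offset is the code point index immediately after a newline. The lineStarts function is a hypothetical name for illustration, and the package's actual offsets may follow a different convention.

```go
package main

import "fmt"

// lineStarts returns, for every '\n' in data, the code point index
// of the position just past it. This convention is an assumption.
func lineStarts(data string) []int32 {
	var offs []int32
	idx := int32(0)
	for _, r := range data {
		idx++ // count code points, not bytes
		if r == '\n' {
			offs = append(offs, idx)
		}
	}
	return offs
}

func main() {
	// Code points: é(0) 1(1) \n(2) b(3) \n(4) c(5)
	fmt.Println(lineStarts("\u00e91\nb\nc")) // [3 5]
}
```

Because the offsets count code points, they line up with Buffer indices. Byte offsets for the same text would be 4 and 6, since é occupies two bytes in UTF-8.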
