Documentation ¶
Overview ¶
Package exporter provides tools for extracting chat session data from JSON files and converting it into various formats such as CSV and JSON datasets. This package is designed to facilitate the analysis and processing of chat data, making it easier to perform tasks such as data visualization, reporting, or feeding the data into machine learning models.
The exporter package defines several types to represent chat sessions, messages, and associated metadata. It also includes functions to read chat session data from JSON files, convert sessions to CSV with different formatting options, create separate CSV files for sessions and messages, and extract sessions to a JSON format suitable for use with Hugging Face datasets.
Usage:
To read chat sessions from a JSON file and convert them to a CSV format:
store, err := exporter.ReadJSONFromFile("path/to/chat-sessions.json")
if err != nil {
	log.Fatal(err)
}
// ConvertSessionsToCSV writes the CSV to output.csv and returns only an error.
err = exporter.ConvertSessionsToCSV(store.ChatNextWebStore.Sessions, exporter.FormatOptionInline, "output.csv")
if err != nil {
	log.Fatal(err)
}
To create separate CSV files for sessions and messages:
err := exporter.CreateSeparateCSVFiles(store.ChatNextWebStore.Sessions, "sessions.csv", "messages.csv")
if err != nil {
	log.Fatal(err)
}
To extract chat sessions to a JSON dataset:
datasetJSON, err := exporter.ExtractToDataset(store.ChatNextWebStore.Sessions)
if err != nil {
	log.Fatal(err)
}
fmt.Println(datasetJSON)
The package supports handling of IDs and other fields that may be represented as either strings or integers in the source JSON by using the custom StringOrInt type.
Index ¶
- func ConvertSessionsToCSV(sessions []Session, formatOption int, outputFilePath string) error
- func CreateSeparateCSVFiles(sessions []Session, sessionsFileName string, messagesFileName string) error
- func ExtractToDataset(sessions []Session) (string, error)
- type ChatNextWebStore
  - func ReadJSONFromFile(filePath string) (ChatNextWebStore, error)
- type Mask
- type Message
- type Session
- type Stat
- type Store
- type StringOrInt
  - func (soi *StringOrInt) UnmarshalJSON(data []byte) error
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ConvertSessionsToCSV ¶
func ConvertSessionsToCSV(sessions []Session, formatOption int, outputFilePath string) error
ConvertSessionsToCSV writes a slice of Session objects to the CSV file at outputFilePath. The formatOption parameter selects the CSV layout. It returns an error if the format option is invalid or if writing the CSV data fails.
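The layouts selected by formatOption are not spelled out here, so the following is only a minimal sketch of what an inline layout could look like, assuming one row per session with all messages joined into a single column. The helper names writeInlineCSV and inlineCSV, the column names, and the " | " separator are hypothetical and not part of the package; Message and Session are trimmed copies of the package types.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// Message and Session are trimmed copies of the package types.
type Message struct {
	Role    string
	Content string
}

type Session struct {
	ID       string
	Topic    string
	Messages []Message
}

// writeInlineCSV is a hypothetical helper: it writes one row per session and
// joins all of the session's messages into a single "messages" column.
func writeInlineCSV(w io.Writer, sessions []Session) error {
	cw := csv.NewWriter(w)
	if err := cw.Write([]string{"id", "topic", "messages"}); err != nil {
		return err
	}
	for _, s := range sessions {
		parts := make([]string, 0, len(s.Messages))
		for _, m := range s.Messages {
			parts = append(parts, m.Role+": "+m.Content)
		}
		row := []string{s.ID, s.Topic, strings.Join(parts, " | ")}
		if err := cw.Write(row); err != nil {
			return err
		}
	}
	// The writer is buffered, so flush and then report any deferred error.
	cw.Flush()
	return cw.Error()
}

// inlineCSV renders the same CSV into a string instead of a file.
func inlineCSV(sessions []Session) (string, error) {
	var sb strings.Builder
	if err := writeInlineCSV(&sb, sessions); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := inlineCSV([]Session{{
		ID:    "s1",
		Topic: "Greeting",
		Messages: []Message{
			{Role: "user", Content: "hi"},
			{Role: "assistant", Content: "hello"},
		},
	}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Writing against an io.Writer keeps the sketch independent of the file system; a file-based variant would pass the result of os.Create instead.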
func CreateSeparateCSVFiles ¶
func CreateSeparateCSVFiles(sessions []Session, sessionsFileName string, messagesFileName string) error
CreateSeparateCSVFiles creates two separate CSV files for sessions and messages from a slice of Session objects. It takes the file names as parameters and returns an error if the files cannot be created or if writing the data fails.
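Splitting sessions and messages into two files is essentially a relational layout: each message row carries the ID of the session it belongs to. The sketch below illustrates that idea under assumed column names; writeSeparateCSV and separateCSV are hypothetical helpers, not the package's implementation, and the types are trimmed copies of the package types.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// Message and Session are trimmed copies of the package types.
type Message struct {
	ID      string
	Role    string
	Content string
}

type Session struct {
	ID       string
	Topic    string
	Messages []Message
}

// writeSeparateCSV is a hypothetical helper: it writes a sessions table and a
// messages table, linked by the session_id column.
func writeSeparateCSV(sessionsW, messagesW io.Writer, sessions []Session) error {
	sw := csv.NewWriter(sessionsW)
	mw := csv.NewWriter(messagesW)

	if err := sw.Write([]string{"session_id", "topic", "message_count"}); err != nil {
		return err
	}
	if err := mw.Write([]string{"session_id", "message_id", "role", "content"}); err != nil {
		return err
	}
	for _, s := range sessions {
		if err := sw.Write([]string{s.ID, s.Topic, strconv.Itoa(len(s.Messages))}); err != nil {
			return err
		}
		for _, m := range s.Messages {
			if err := mw.Write([]string{s.ID, m.ID, m.Role, m.Content}); err != nil {
				return err
			}
		}
	}
	// Both writers are buffered; flush each and surface any deferred error.
	sw.Flush()
	if err := sw.Error(); err != nil {
		return err
	}
	mw.Flush()
	return mw.Error()
}

// separateCSV renders both tables into strings instead of files.
func separateCSV(sessions []Session) (sessionsCSV, messagesCSV string, err error) {
	var sb, mb strings.Builder
	if err := writeSeparateCSV(&sb, &mb, sessions); err != nil {
		return "", "", err
	}
	return sb.String(), mb.String(), nil
}

func main() {
	s, m, err := separateCSV([]Session{{
		ID:    "s1",
		Topic: "Greeting",
		Messages: []Message{
			{ID: "m1", Role: "user", Content: "hi"},
			{ID: "m2", Role: "assistant", Content: "hello"},
		},
	}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(s)
	fmt.Print(m)
}
```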
func ExtractToDataset ¶
func ExtractToDataset(sessions []Session) (string, error)
ExtractToDataset converts a slice of Session objects into a JSON-formatted string suitable for use as a dataset in machine learning applications. It returns an error if marshaling the sessions into JSON fails.
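The exact record layout of the dataset is not documented here. As a rough sketch, assuming each message becomes one flat record, the conversion could look like the following; the record type, its field names, and the toDataset helper are hypothetical and chosen only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message and Session are trimmed copies of the package types.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type Session struct {
	ID       string    `json:"id"`
	Messages []Message `json:"messages"`
}

// record is an assumed flat layout: one JSON object per message.
type record struct {
	SessionID string `json:"session_id"`
	Role      string `json:"role"`
	Content   string `json:"content"`
}

// toDataset is a hypothetical helper that flattens sessions into records and
// marshals them as a JSON array.
func toDataset(sessions []Session) (string, error) {
	// Start from an empty, non-nil slice so empty input marshals to [] rather than null.
	records := []record{}
	for _, s := range sessions {
		for _, m := range s.Messages {
			records = append(records, record{SessionID: s.ID, Role: m.Role, Content: m.Content})
		}
	}
	out, err := json.Marshal(records)
	if err != nil {
		return "", fmt.Errorf("marshal dataset: %w", err)
	}
	return string(out), nil
}

func main() {
	out, err := toDataset([]Session{{
		ID:       "s1",
		Messages: []Message{{Role: "user", Content: "hi"}},
	}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

A flat array of uniform records is used here because it is simple to load into tabular tooling; the package itself may nest messages under their sessions instead.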
Types ¶
type ChatNextWebStore ¶
type ChatNextWebStore struct {
	ChatNextWebStore Store `json:"chat-next-web-store"`
}
ChatNextWebStore is a wrapper for Store that aligns with the expected JSON structure for a chat-next-web-store object.
func ReadJSONFromFile ¶
func ReadJSONFromFile(filePath string) (ChatNextWebStore, error)
ReadJSONFromFile reads a JSON file from the given file path and unmarshals it into a ChatNextWebStore struct. It returns an error if the file cannot be opened, the JSON is invalid, or the JSON format does not match the expected ChatNextWebStore format.
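To make the expected input shape concrete, here is a minimal sketch of reading and decoding such a file. The helpers readStore and parseStore are hypothetical, the types are trimmed copies of the package types, and treating a missing sessions array as a format error is an assumption about how a format mismatch might be detected.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// Trimmed copies of the package types, keeping only a few fields.
type Session struct {
	ID    string `json:"id"`
	Topic string `json:"topic"`
}

type Store struct {
	Sessions []Session `json:"sessions"`
}

type ChatNextWebStore struct {
	ChatNextWebStore Store `json:"chat-next-web-store"`
}

// parseStore is a hypothetical helper that decodes raw JSON and rejects input
// without a sessions array (an assumed format check).
func parseStore(data []byte) (ChatNextWebStore, error) {
	var store ChatNextWebStore
	if err := json.Unmarshal(data, &store); err != nil {
		return ChatNextWebStore{}, fmt.Errorf("invalid JSON: %w", err)
	}
	if store.ChatNextWebStore.Sessions == nil {
		return ChatNextWebStore{}, errors.New(`no sessions found under "chat-next-web-store"`)
	}
	return store, nil
}

// readStore is a hypothetical helper that reads the file and hands its
// contents to parseStore.
func readStore(filePath string) (ChatNextWebStore, error) {
	data, err := os.ReadFile(filePath)
	if err != nil {
		return ChatNextWebStore{}, fmt.Errorf("read %s: %w", filePath, err)
	}
	return parseStore(data)
}

func main() {
	input := `{"chat-next-web-store":{"sessions":[{"id":"s1","topic":"Greeting"}]}}`
	store, err := parseStore([]byte(input))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	sessions := store.ChatNextWebStore.Sessions
	fmt.Println(len(sessions), sessions[0].Topic)
}
```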
type Mask ¶
type Mask struct {
	ID        StringOrInt `json:"id"` // Use the custom type for ID
	Avatar    string      `json:"avatar"`
	Name      string      `json:"name"`
	Lang      string      `json:"lang"`
	CreatedAt int64       `json:"createdAt"` // Assuming it's a Unix timestamp
}
Mask represents an anonymization mask for a participant in a chat session, including the participant's ID, avatar link, name, language, and creation timestamp.
type Message ¶
type Message struct {
	ID      string `json:"id"`
	Date    string `json:"date"`
	Role    string `json:"role"`
	Content string `json:"content"`
}
Message represents a single message within a chat session, including metadata like the ID, date, role of the sender, and the content of the message itself.
type Session ¶
type Session struct {
	ID                 string    `json:"id"`
	Topic              string    `json:"topic"`
	MemoryPrompt       string    `json:"memoryPrompt"`
	Stat               Stat      `json:"stat"`
	LastUpdate         int64     `json:"lastUpdate"` // Changed to int64 assuming it's a Unix timestamp
	LastSummarizeIndex int       `json:"lastSummarizeIndex"`
	Mask               Mask      `json:"mask"`
	Messages           []Message `json:"messages"`
}
Session represents a single chat session, including session metadata, statistics, messages, and the mask for the participant.
type Stat ¶
type Stat struct {
	TokenCount int `json:"tokenCount"`
	WordCount  int `json:"wordCount"`
	CharCount  int `json:"charCount"`
}
Stat represents statistics for a chat session, such as the count of tokens, words, and characters.
type Store ¶
type Store struct {
	Sessions []Session `json:"sessions"`
}
Store encapsulates a collection of chat sessions.
type StringOrInt ¶
type StringOrInt string
StringOrInt is a custom type to handle JSON values that can be either strings or integers (Magic Golang 🎩 🪄). It implements the json.Unmarshaler interface to handle this mixed type when unmarshaling JSON data.
func (*StringOrInt) UnmarshalJSON ¶
func (soi *StringOrInt) UnmarshalJSON(data []byte) error
UnmarshalJSON is a custom unmarshaler for StringOrInt that tries to unmarshal JSON data as a string, and if that fails, as an integer, which is then converted to a string.
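The string-then-integer fallback described above can be sketched as follows. This is an illustration of the technique rather than the package's actual source; the error message and the decodeID helper are assumptions added for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// StringOrInt mirrors the documented type: the value is always stored as a string.
type StringOrInt string

// UnmarshalJSON first tries to decode a JSON string; if that fails, it tries
// a JSON integer and stores its decimal representation.
func (soi *StringOrInt) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err == nil {
		*soi = StringOrInt(s)
		return nil
	}
	var n int64
	if err := json.Unmarshal(data, &n); err != nil {
		return fmt.Errorf("StringOrInt: %s is neither a string nor an integer: %w", data, err)
	}
	*soi = StringOrInt(strconv.FormatInt(n, 10))
	return nil
}

// decodeID is a hypothetical helper that decodes an object with an "id" field.
func decodeID(raw string) (string, error) {
	var v struct {
		ID StringOrInt `json:"id"`
	}
	if err := json.Unmarshal([]byte(raw), &v); err != nil {
		return "", err
	}
	return string(v.ID), nil
}

func main() {
	for _, raw := range []string{`{"id":"abc"}`, `{"id":100000}`} {
		id, err := decodeID(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(id)
	}
}
```

Because the method has a pointer receiver, fields of type StringOrInt are picked up automatically when they are decoded as part of an addressable struct, as in decodeID.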