Documentation ¶
Index ¶
- Constants
- Variables
- func AddFileToS3(sess *session.Session, inFile string, s3file *S3File) (err error)
- func BuildDate() time.Time
- func BuildNumber() int64
- func CheckIfDbExists(db *sql.DB, dbName string) (exists bool, err error)
- func CreateDatabase(conf PGConfig) error
- func CreateDumpFile(conf PGConfig, dumpfilePath, schemaPrefix string, ...) error
- func DropDatabase(conf PGConfig) error
- func DropPublicTables(conf PGConfig) error
- func ExecPostgresCmd(name string, args ...string) error
- func ExecPostgresCommandOutErr(stdOut, stdErr io.Writer, name string, arg ...string) error
- func GetAllProceduresInSchema(conf PGConfig, schema string) ([]string, error)
- func GetAllSchemaColumns(db *sql.DB) (*sql.Rows, error)
- func GetAllTablesInSchema(conf PGConfig, schema string) ([]string, error)
- func GetFileFromS3(sess *session.Session, s3file *S3File, loadFile string) (err error)
- func GetSchemaColumnEquals(db *sql.DB, schema string) (*sql.Rows, error)
- func GetSchemaColumnsLike(db *sql.DB, schemaPrefix string) (*sql.Rows, error)
- func GetSchemasInDatabase(conf PGConfig, excludeSchemas []string) ([]string, error)
- func GetTableRowCountsInDB(conf PGConfig, schemaPrefix string, excludeTable []string) (*[]RowCounts, error)
- func KillDatabaseConnections(db *sql.DB, dbName string) (err error)
- func LoadFile(conf PGConfig, filePath string) (err error)
- func OpenDB(conf PGConfig) (*sql.DB, error)
- func ProcessDumpFile(mapper *DBMapper, src, dst, preProcessFile, postProcessFile string, ...) error
- func ProcessorAddress(cmap *ColumnMapper, input string) (string, error)
- func ProcessorAlphaNumericScrambler(cmap *ColumnMapper, input string) (string, error)
- func ProcessorCity(cmap *ColumnMapper, input string) (string, error)
- func ProcessorCompanyName(cmap *ColumnMapper, input string) (string, error)
- func ProcessorEmailAddress(cmap *ColumnMapper, input string) (string, error)
- func ProcessorEmptyJson(cmap *ColumnMapper, input string) (string, error)
- func ProcessorFirstName(cmap *ColumnMapper, input string) (string, error)
- func ProcessorFullName(cmap *ColumnMapper, input string) (string, error)
- func ProcessorIBANScrambler(_ *ColumnMapper, input string) (string, error)
- func ProcessorIPv4(cmap *ColumnMapper, input string) (string, error)
- func ProcessorIdentity(cmap *ColumnMapper, input string) (string, error)
- func ProcessorLastName(cmap *ColumnMapper, input string) (string, error)
- func ProcessorPhoneNumber(cmap *ColumnMapper, input string) (string, error)
- func ProcessorRandomBoolean(cmap *ColumnMapper, input string) (string, error)
- func ProcessorRandomCountryCode(_ *ColumnMapper, _ string) (string, error)
- func ProcessorRandomDate(cmap *ColumnMapper, input string) (string, error)
- func ProcessorRandomDigits(cmap *ColumnMapper, input string) (string, error)
- func ProcessorRandomUUID(cmap *ColumnMapper, input string) (string, error)
- func ProcessorScrubString(cmap *ColumnMapper, input string) (string, error)
- func ProcessorState(cmap *ColumnMapper, input string) (string, error)
- func ProcessorStateAbbrev(cmap *ColumnMapper, input string) (string, error)
- func ProcessorUserName(cmap *ColumnMapper, input string) (string, error)
- func ProcessorZip(cmap *ColumnMapper, input string) (string, error)
- func RenameDatabase(db *sql.DB, fromName, toName string) (err error)
- func SQLCommandFile(conf PGConfig, filepath string, ignoreErrors bool) error
- func VerifyRowCount(conf PGConfig, filePath string) (err error)
- func Version() string
- func WriteConfigSkeleton(dbmap *DBMapper, filepath string) error
- type ColumnMapper
- type CountryCode
- type DBMapper
- type LineState
- type PGConfig
- func (conf *PGConfig) BaseDSN() string
- func (conf *PGConfig) BaseURI() string
- func (conf *PGConfig) DSN() string
- func (conf *PGConfig) LoadFromCLI(host, username, password, database string, port int32, disableSSL bool)
- func (conf *PGConfig) LoadFromEnv(debugNum int64, prefix, suffix string)
- func (conf *PGConfig) URI() string
- type ProcessorDefinition
- type ProcessorFunc
- type RowCounts
- type S3File
Constants ¶
const (
	StateChangeTokenBeginCopy = "COPY"
	StateChangeTokenEndCopy   = "\\."
)
StateChangeTokenBeginCopy is the token used to notify the processor that we have hit SQL-COPY in the dump file.
StateChangeTokenEndCopy is the token used to notify the processor that we are done with SQL-COPY.
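The two tokens bracket the data rows of a COPY block in a plain-text dump. The following is a minimal sketch of how a line scanner could switch state on them. The `copyRows` helper and the lowercase constant names are hypothetical and are not gonymizer's actual parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Local copies of the two tokens, so the sketch is self-contained.
const (
	stateChangeTokenBeginCopy = "COPY"
	stateChangeTokenEndCopy   = "\\."
)

// copyRows returns the data rows found between a COPY statement and its
// terminating "\." line. Hypothetical helper, for illustration only.
func copyRows(dump string) []string {
	var rows []string
	inCopy := false
	scanner := bufio.NewScanner(strings.NewReader(dump))
	for scanner.Scan() {
		line := scanner.Text()
		switch {
		case !inCopy && strings.HasPrefix(line, stateChangeTokenBeginCopy+" "):
			inCopy = true // the lines that follow are data rows
		case inCopy && line == stateChangeTokenEndCopy:
			inCopy = false // end of the data block
		case inCopy:
			rows = append(rows, line)
		}
	}
	return rows
}

func main() {
	dump := "SET search_path = public;\n" +
		"COPY public.users (id, name) FROM stdin;\n" +
		"1\tAlice\n" +
		"2\tBob\n" +
		"\\.\n" +
		"SELECT 1;\n"
	for _, row := range copyRows(dump) {
		fmt.Printf("%q\n", row)
	}
}
```

Only the lines between the two tokens are candidates for anonymization; everything else passes through untouched.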
Variables ¶
var AlphaNumericMap = map[string]map[string]string{}
AlphaNumericMap is used to keep scrambled alphanumeric strings consistent. For example, when we scramble values such as Social Security Numbers, the map records each change so that if we run across the same SSN again we scramble it to the value we already have.
var CountryCodes []CountryCode
var IBANMap = map[string]string{}
IBANMap is the Global IBANs map for all IBANs we anonymize.
var ProcessorCatalog map[string]ProcessorFunc
ProcessorCatalog is the function map that points each Processor to its entry function. All Processors are listed in this map.
var UUIDMap = map[uuid.UUID]uuid.UUID{}
UUIDMap is the Global UUID map for all UUIDs that we anonymize. Similar to AlphaNumericMap this map contains all UUIDs and what they are changed to. Some tables use UUIDs as the primary key and this allows us to keep consistency in the data set when anonymizing it.
Functions ¶
func AddFileToS3 ¶
AddFileToS3 will upload the supplied inFile to the supplied S3File.FilePath
func BuildDate ¶
BuildDate will return the current Unix time as the build date and time for the application.
func BuildNumber ¶
func BuildNumber() int64
BuildNumber will return the build number for the application.
func CheckIfDbExists ¶
CheckIfDbExists checks to see if the database exists using the provided db connection.
func CreateDatabase ¶
CreateDatabase will create the database that is supplied in the PGConfig.
func CreateDumpFile ¶
func CreateDumpFile(
	conf PGConfig,
	dumpfilePath, schemaPrefix string,
	excludeTables, excludeDataTables, excludeCreateSchemas, schemas []string,
) error
CreateDumpFile will create a PostgreSQL dump file from the specified PGConfig to the location, and with restrictions, that are provided by the inputs to the function.
func DropDatabase ¶
DropDatabase will drop the database that is supplied in the PGConfig.
func DropPublicTables ¶
DropPublicTables drops all tables in the public schema.
func ExecPostgresCmd ¶
ExecPostgresCmd executes the psql command, but first opens the db_test_*.log log files for debugging runtime issues using the psql command.
func ExecPostgresCommandOutErr ¶
ExecPostgresCommandOutErr is the executing function for the psql -f command. It also closes the loaded files/buffers from the calling functions.
func GetAllProceduresInSchema ¶
GetAllProceduresInSchema will return all procedures for the given schema in SQL form.
func GetAllSchemaColumns ¶
GetAllSchemaColumns will return a row pointer to a list of table and column names for the given database connection.
func GetAllTablesInSchema ¶
GetAllTablesInSchema will return a list of database tables for a given database configuration.
func GetFileFromS3 ¶
GetFileFromS3 will save the S3File to the loadFile destination.
func GetSchemaColumnEquals ¶
GetSchemaColumnEquals returns a pointer to a list of database rows containing the names of tables and columns for the provided schema (using the SQL equals operator).
func GetSchemaColumnsLike ¶
GetSchemaColumnsLike will return a pointer to a list of database rows containing the names of tables and columns for the provided schema (using the SQL LIKE operator).
func GetSchemasInDatabase ¶
GetSchemasInDatabase returns a list of schemas for a given database configuration. If an excludeSchemas list is provided GetSchemasInDatabase will leave them out of the returned list of schemas.
func GetTableRowCountsInDB ¶
func GetTableRowCountsInDB(conf PGConfig, schemaPrefix string, excludeTable []string) (*[]RowCounts, error)
GetTableRowCountsInDB collects the number of rows for each table matching the supplied schema prefix, excluding any tables listed in excludeTable. It returns a list of tables and the number of rows for each.
func KillDatabaseConnections ¶
KillDatabaseConnections will kill all connections to the provided database name.
func OpenDB ¶
OpenDB will open the database set in the PGConfig and return a pointer to the database connection.
func ProcessDumpFile ¶
func ProcessDumpFile(
	mapper *DBMapper,
	src, dst, preProcessFile, postProcessFile string,
	generateSeed bool,
) error
ProcessDumpFile will process the supplied dump file according to the supplied database map file. Setting generateSeed to true tells the function to use Go's built-in random number generator.
func ProcessorAddress ¶
func ProcessorAddress(cmap *ColumnMapper, input string) (string, error)
ProcessorAddress will return a fake address string that is compiled from the fake library
func ProcessorAlphaNumericScrambler ¶
func ProcessorAlphaNumericScrambler(cmap *ColumnMapper, input string) (string, error)
ProcessorAlphaNumericScrambler will receive the column metadata via ColumnMap and the column's actual data via the input string. The processor will scramble all alphanumeric digits and characters, but it will leave all non-alphanumerics the same without modification. These values are globally mapped and use the AlphaNumericMap to remap values once they are seen more than once.
Example: "PUI-7x9vY" = ProcessorAlphaNumericScrambler("ABC-1a2bC")
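The behaviour described above (scramble letters and digits, keep punctuation, reuse earlier results) can be sketched as follows. This is an illustration under assumptions, not the library's implementation: `scramble` and `seen` are hypothetical names, and `seen` is a flat map standing in for the nested AlphaNumericMap.

```go
package main

import (
	"fmt"
	"math/rand"
)

const (
	digits = "0123456789"
	lower  = "abcdefghijklmnopqrstuvwxyz"
	upper  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
)

// seen remembers every scrambled value so repeated inputs stay consistent.
// Hypothetical, simplified stand-in for AlphaNumericMap.
var seen = map[string]string{}

// scramble replaces each ASCII digit with a random digit and each ASCII
// letter with a random letter of the same case. All other bytes are copied
// unchanged, so separators such as "-" keep their positions.
func scramble(input string) string {
	if out, ok := seen[input]; ok {
		return out // same input always yields the same output
	}
	out := make([]byte, len(input))
	for i := 0; i < len(input); i++ {
		c := input[i]
		switch {
		case c >= '0' && c <= '9':
			out[i] = digits[rand.Intn(len(digits))]
		case c >= 'a' && c <= 'z':
			out[i] = lower[rand.Intn(len(lower))]
		case c >= 'A' && c <= 'Z':
			out[i] = upper[rand.Intn(len(upper))]
		default:
			out[i] = c
		}
	}
	seen[input] = string(out)
	return seen[input]
}

func main() {
	first := scramble("ABC-1a2bC")
	second := scramble("ABC-1a2bC")
	fmt.Println(first, second, first == second)
}
```

The lookup before scrambling is what keeps foreign keys and repeated identifiers aligned across tables.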
func ProcessorCity ¶
func ProcessorCity(cmap *ColumnMapper, input string) (string, error)
ProcessorCity will return a real city name whose Jaro-Winkler similarity to the input is >= 0.4.
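For readers unfamiliar with the metric, here is a minimal sketch of Jaro-Winkler similarity using the standard definition (prefix scale 0.1, prefix capped at 4 characters). The function names are hypothetical, and how gonymizer computes the score or selects candidates is not stated on this page.

```go
package main

import "fmt"

// jaro returns the Jaro similarity of a and b in the range [0, 1].
func jaro(a, b string) float64 {
	r1, r2 := []rune(a), []rune(b)
	if len(r1) == 0 && len(r2) == 0 {
		return 1
	}
	if len(r1) == 0 || len(r2) == 0 {
		return 0
	}
	longest := len(r1)
	if len(r2) > longest {
		longest = len(r2)
	}
	window := longest/2 - 1
	if window < 0 {
		window = 0
	}
	m1 := make([]bool, len(r1))
	m2 := make([]bool, len(r2))
	matches := 0
	for i := range r1 {
		lo, hi := i-window, i+window+1
		if lo < 0 {
			lo = 0
		}
		if hi > len(r2) {
			hi = len(r2)
		}
		for j := lo; j < hi; j++ {
			if !m2[j] && r1[i] == r2[j] {
				m1[i], m2[j] = true, true
				matches++
				break
			}
		}
	}
	if matches == 0 {
		return 0
	}
	// Count matched characters that appear in a different order.
	k, outOfOrder := 0, 0
	for i := range r1 {
		if !m1[i] {
			continue
		}
		for !m2[k] {
			k++
		}
		if r1[i] != r2[k] {
			outOfOrder++
		}
		k++
	}
	m := float64(matches)
	t := float64(outOfOrder) / 2
	return (m/float64(len(r1)) + m/float64(len(r2)) + (m-t)/m) / 3
}

// jaroWinkler boosts the Jaro score for strings sharing a common prefix.
func jaroWinkler(a, b string) float64 {
	j := jaro(a, b)
	r1, r2 := []rune(a), []rune(b)
	prefix := 0
	for prefix < 4 && prefix < len(r1) && prefix < len(r2) && r1[prefix] == r2[prefix] {
		prefix++
	}
	return j + float64(prefix)*0.1*(1-j)
}

func main() {
	fmt.Printf("%.4f\n", jaroWinkler("MARTHA", "MARHTA")) // prints 0.9611
	fmt.Println(jaroWinkler("Springfield", "Shelbyville") >= 0.4)
}
```

A threshold of 0.4 is loose: it only requires the replacement to share some characters in roughly similar positions with the original.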
func ProcessorCompanyName ¶
func ProcessorCompanyName(cmap *ColumnMapper, input string) (string, error)
ProcessorCompanyName will return a company name whose Jaro-Winkler similarity to the input is >= 0.4.
func ProcessorEmailAddress ¶
func ProcessorEmailAddress(cmap *ColumnMapper, input string) (string, error)
ProcessorEmailAddress will return an e-mail address whose Jaro-Winkler similarity to the input is >= 0.4.
func ProcessorEmptyJson ¶
func ProcessorEmptyJson(cmap *ColumnMapper, input string) (string, error)
ProcessorEmptyJson will return an empty JSON regardless of the input.
func ProcessorFirstName ¶
func ProcessorFirstName(cmap *ColumnMapper, input string) (string, error)
ProcessorFirstName will return a first name whose Jaro-Winkler similarity to the input is >= 0.4.
func ProcessorFullName ¶
func ProcessorFullName(cmap *ColumnMapper, input string) (string, error)
ProcessorFullName will return a full name whose Jaro-Winkler similarity to the input is >= 0.4.
func ProcessorIBANScrambler ¶
func ProcessorIBANScrambler(_ *ColumnMapper, input string) (string, error)
func ProcessorIPv4 ¶
func ProcessorIPv4(cmap *ColumnMapper, input string) (string, error)
func ProcessorIdentity ¶
func ProcessorIdentity(cmap *ColumnMapper, input string) (string, error)
ProcessorIdentity will skip anonymization and return the input unchanged.
func ProcessorLastName ¶
func ProcessorLastName(cmap *ColumnMapper, input string) (string, error)
ProcessorLastName will return a last name whose Jaro-Winkler similarity to the input is >= 0.4.
func ProcessorPhoneNumber ¶
func ProcessorPhoneNumber(cmap *ColumnMapper, input string) (string, error)
ProcessorPhoneNumber will return a phone number whose Jaro-Winkler similarity to the input is >= 0.4.
func ProcessorRandomBoolean ¶
func ProcessorRandomBoolean(cmap *ColumnMapper, input string) (string, error)
ProcessorRandomBoolean will return a random boolean value.
func ProcessorRandomCountryCode ¶
func ProcessorRandomCountryCode(_ *ColumnMapper, _ string) (string, error)
func ProcessorRandomDate ¶
func ProcessorRandomDate(cmap *ColumnMapper, input string) (string, error)
ProcessorRandomDate will return a random day and month, but keep year the same (See: HIPAA rules)
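As a rough illustration of randomizing day and month while keeping the year, consider the sketch below. It assumes the input uses the `YYYY-MM-DD` layout, which this page does not specify, and `randomDateSameYear` is a hypothetical name.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// randomDateSameYear picks a uniformly random day within the year of the
// input date. ASSUMPTION: input is formatted as YYYY-MM-DD.
func randomDateSameYear(input string) (string, error) {
	const layout = "2006-01-02"
	t, err := time.Parse(layout, input)
	if err != nil {
		return "", err
	}
	start := time.Date(t.Year(), time.January, 1, 0, 0, 0, 0, time.UTC)
	// 365 or 366, depending on whether the year is a leap year.
	daysInYear := int(start.AddDate(1, 0, 0).Sub(start) / (24 * time.Hour))
	return start.AddDate(0, 0, rand.Intn(daysInYear)).Format(layout), nil
}

func main() {
	out, err := randomDateSameYear("1987-06-15")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // some date in 1987
}
```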
func ProcessorRandomDigits ¶
func ProcessorRandomDigits(cmap *ColumnMapper, input string) (string, error)
ProcessorRandomDigits will return a random string of digit(s) keeping the same length of the input.
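A length-preserving digit replacement can be sketched in a few lines. `randomDigits` is a hypothetical helper, and measuring length in bytes is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomDigits returns a string of random decimal digits with the same
// byte length as input. Hypothetical helper, for illustration only.
func randomDigits(input string) string {
	out := make([]byte, len(input))
	for i := range out {
		out[i] = byte('0' + rand.Intn(10))
	}
	return string(out)
}

func main() {
	fmt.Println(randomDigits("5551234")) // seven random digits
}
```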
func ProcessorRandomUUID ¶
func ProcessorRandomUUID(cmap *ColumnMapper, input string) (string, error)
ProcessorRandomUUID will generate a random UUID and replace the input with the new UUID. The input is mapped to the output, so every later occurrence of the same input UUID is replaced with the output UUID that was created at its first occurrence.
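The first-occurrence mapping can be sketched with the standard library alone. In this sketch `uuidMap` is a string-keyed stand-in for UUIDMap, and `newUUID` and `replaceUUID` are hypothetical helpers; the package itself uses a `uuid.UUID` type.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// uuidMap remembers which replacement was issued for each input UUID.
// Hypothetical, string-keyed stand-in for UUIDMap.
var uuidMap = map[string]string{}

// newUUID builds a random (version 4) UUID string.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// replaceUUID returns the replacement for input, creating one on first use.
func replaceUUID(input string) (string, error) {
	if out, ok := uuidMap[input]; ok {
		return out, nil
	}
	out, err := newUUID()
	if err != nil {
		return "", err
	}
	uuidMap[input] = out
	return out, nil
}

func main() {
	const id = "123e4567-e89b-12d3-a456-426614174000"
	first, _ := replaceUUID(id)
	second, _ := replaceUUID(id)
	fmt.Println(first == second) // prints true
}
```

Because the map is global, a primary key and every foreign key that references it receive the same replacement.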
func ProcessorScrubString ¶
func ProcessorScrubString(cmap *ColumnMapper, input string) (string, error)
ProcessorScrubString will replace the input string with asterisks (*). Useful for blanking out password fields.
func ProcessorState ¶
func ProcessorState(cmap *ColumnMapper, input string) (string, error)
ProcessorState will return a state whose Jaro-Winkler similarity to the input is >= 0.4.
func ProcessorStateAbbrev ¶
func ProcessorStateAbbrev(cmap *ColumnMapper, input string) (string, error)
ProcessorStateAbbrev will return a state abbreviation.
func ProcessorUserName ¶
func ProcessorUserName(cmap *ColumnMapper, input string) (string, error)
ProcessorUserName will return a username whose Jaro-Winkler similarity to the input is >= 0.4.
func ProcessorZip ¶
func ProcessorZip(cmap *ColumnMapper, input string) (string, error)
ProcessorZip will return a zip code whose Jaro-Winkler similarity to the input is >= 0.4.
func RenameDatabase ¶
RenameDatabase will rename a database using the fromName to the toName.
func SQLCommandFile ¶
SQLCommandFile will run psql -f on a file and execute any queries contained in the SQL file. If ignoreErrors is true then psql will ignore errors in the file.
func VerifyRowCount ¶
VerifyRowCount will verify that the row counts in the PGConfig match the supplied CSV file (see command/dump).
func WriteConfigSkeleton ¶
WriteConfigSkeleton will save the supplied DBMapper to filepath.
Types ¶
type ColumnMapper ¶
type ColumnMapper struct {
	Comment         string
	TableSchema     string
	TableName       string
	ColumnName      string
	DataType        string
	ParentSchema    string
	ParentTable     string
	ParentColumn    string
	OrdinalPosition int
	IsNullable      bool
	Processors      []ProcessorDefinition
}
ColumnMapper is the data structure that contains all gonymizer required information for the specified column.
type CountryCode ¶
type CountryCode struct {
Code, Name string
}
type DBMapper ¶
type DBMapper struct {
	DBName       string
	SchemaPrefix string
	Seed         int64
	ColumnMaps   []ColumnMapper
}
DBMapper is the main structure for the map file JSON object and is used to map all database columns that will be anonymized.
func GenerateConfigSkeleton ¶
func GenerateConfigSkeleton(conf PGConfig, schemaPrefix string, schemas, excludeTables []string) (*DBMapper, error)
GenerateConfigSkeleton will generate a column-map based on the supplied PGConfig and previously configured map file.
func LoadConfigSkeleton ¶
LoadConfigSkeleton will load the column-map into memory for use in dumping, processing, and loading of SQL files.
func (DBMapper) ColumnMapper ¶
func (dbMap DBMapper) ColumnMapper(schemaName, tableName, columnName string) *ColumnMapper
ColumnMapper returns the address of the ColumnMapper object if it matches the given parameters; otherwise it returns nil. Special cases exist for sharded schemas using the schema-prefix. See documentation for details.
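The lookup can be pictured as a linear search over ColumnMaps. The sketch below uses trimmed-down local types and a hypothetical `lookup` method. Its handling of sharded schemas (treating any schema that starts with SchemaPrefix as equivalent) is an assumption about what the special case might look like, not the documented rule.

```go
package main

import (
	"fmt"
	"strings"
)

// Trimmed-down local copies of the types, so the sketch is self-contained.
type ColumnMapper struct {
	TableSchema string
	TableName   string
	ColumnName  string
}

type DBMapper struct {
	SchemaPrefix string
	ColumnMaps   []ColumnMapper
}

// lookup returns a pointer to the first matching entry, or nil.
// ASSUMPTION: with a non-empty SchemaPrefix, an entry whose TableSchema
// starts with the prefix matches every schema that starts with it too.
func (m DBMapper) lookup(schema, table, column string) *ColumnMapper {
	for i := range m.ColumnMaps {
		c := &m.ColumnMaps[i]
		if c.TableName != table || c.ColumnName != column {
			continue
		}
		if c.TableSchema == schema {
			return c
		}
		sharded := m.SchemaPrefix != "" &&
			strings.HasPrefix(schema, m.SchemaPrefix) &&
			strings.HasPrefix(c.TableSchema, m.SchemaPrefix)
		if sharded {
			return c
		}
	}
	return nil
}

func main() {
	m := DBMapper{
		SchemaPrefix: "shard_",
		ColumnMaps: []ColumnMapper{
			{TableSchema: "shard_1", TableName: "users", ColumnName: "email"},
			{TableSchema: "public", TableName: "orgs", ColumnName: "name"},
		},
	}
	fmt.Println(m.lookup("shard_7", "users", "email") != nil) // prints true
	fmt.Println(m.lookup("other", "orgs", "name") == nil)     // prints true
}
```

Callers should always check the result for nil before dereferencing it.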
type LineState ¶
type LineState struct {
	LineNum     int64
	IsRow       bool
	SchemaName  string
	TableName   string
	ColumnNames []string
}
LineState contains all the required information for parsing a line in the SQL dump file.
type PGConfig ¶
type PGConfig struct {
	Username      string
	Pass          string
	Host          string
	DefaultDBName string
	SSLMode       string
}
PGConfig is the main configuration structure for different PostgreSQL server configurations.
func (*PGConfig) DSN ¶
DSN will construct the data source name from the supplied data in the PGConfig. See: https://en.wikipedia.org/wiki/Data_source_name
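To make the idea concrete, the sketch below assembles a keyword/value connection string of the kind PostgreSQL clients accept. The exact string that DSN returns is not shown on this page, so the layout here is an assumption, and `pgConfig` and `dsn` are hypothetical local names.

```go
package main

import "fmt"

// pgConfig mirrors the exported PGConfig fields for this sketch only.
type pgConfig struct {
	Username      string
	Pass          string
	Host          string
	DefaultDBName string
	SSLMode       string
}

// dsn joins the fields into a keyword/value connection string.
// ASSUMPTION: values contain no spaces or quotes, so no escaping is done.
func (c pgConfig) dsn() string {
	return fmt.Sprintf(
		"host=%s user=%s password=%s dbname=%s sslmode=%s",
		c.Host, c.Username, c.Pass, c.DefaultDBName, c.SSLMode,
	)
}

func main() {
	conf := pgConfig{
		Username:      "alice",
		Pass:          "s3cret",
		Host:          "localhost",
		DefaultDBName: "app",
		SSLMode:       "disable",
	}
	fmt.Println(conf.dsn())
}
```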
func (*PGConfig) LoadFromCLI ¶
func (conf *PGConfig) LoadFromCLI(host, username, password, database string, port int32, disableSSL bool)
LoadFromCLI will load the PostgreSQL configuration using the function input variables.
func (*PGConfig) LoadFromEnv ¶
LoadFromEnv uses environment variables to load the PGConfig.
type ProcessorDefinition ¶
type ProcessorDefinition struct {
	Name string

	// optional helpers
	Max      float64
	Min      float64
	Variance float64

	Comment string
}
ProcessorDefinition is the processor data structure used to map database columns to their specified column processor.
type ProcessorFunc ¶
type ProcessorFunc func(*ColumnMapper, string) (string, error)
ProcessorFunc is a simple function prototype for the ProcessorCatalog function pointers.
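Because every processor shares this signature, processors can be stored in a map and looked up by name. The following is a minimal sketch with local stand-in types. The catalog keys, the `apply` helper, and the idea of running a column's processors in order are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Local stand-ins so the sketch compiles without the gonymizer package.
type ColumnMapper struct{ ColumnName string }

type ProcessorFunc func(*ColumnMapper, string) (string, error)

// identity returns the input unchanged.
func identity(_ *ColumnMapper, input string) (string, error) {
	return input, nil
}

// scrub replaces every byte of the input with an asterisk.
func scrub(_ *ColumnMapper, input string) (string, error) {
	return strings.Repeat("*", len(input)), nil
}

// catalog maps a processor name to its function. Keys are hypothetical.
var catalog = map[string]ProcessorFunc{
	"Identity":    identity,
	"ScrubString": scrub,
}

// apply runs the named processors in order, feeding each output into the
// next processor. ASSUMPTION: chaining order follows the slice order.
func apply(names []string, cmap *ColumnMapper, input string) (string, error) {
	out := input
	for _, name := range names {
		fn, ok := catalog[name]
		if !ok {
			return "", fmt.Errorf("unknown processor %q", name)
		}
		var err error
		if out, err = fn(cmap, out); err != nil {
			return "", err
		}
	}
	return out, nil
}

func main() {
	cmap := &ColumnMapper{ColumnName: "password"}
	out, err := apply([]string{"Identity", "ScrubString"}, cmap, "hunter2")
	fmt.Println(out, err) // prints ******* <nil>
}
```

Returning an error for an unknown name surfaces typos in a map file early instead of silently skipping a column.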
type RowCounts ¶
RowCounts is used to keep track of the number of rows for a given schema and table.
type S3File ¶
S3File is the main structure for the S3 metadata of gonymizer files.
func (*S3File) ParseS3Url ¶
ParseS3Url will parse the supplied S3 URI and load it into an S3File structure.