README
Open Source Data Anonymization and Synthetic Data Orchestration
Introduction
Neosync is an open-source, developer-first way to anonymize PII, generate synthetic data and sync environments for better testing, debugging and developer experience.
Companies use Neosync to:
- Safely test code against production data - Anonymize sensitive production data in order to safely use it locally for a better testing and developer experience
- Easily reproduce production bugs locally - Anonymize and subset production data to get a safe, representative data set that you can use to locally reproduce production bugs quickly and efficiently
- High-quality data for lower-level environments - Catch bugs before they hit production by hydrating your staging and QA environments with production-like data
- Solve GDPR, DPDP, FERPA, HIPAA and more - Use anonymized and synthetic data to reduce your compliance scope and easily comply with laws like HIPAA, GDPR, and DPDP
- Seed development databases - Easily seed development databases with synthetic data for unit testing, demos and more
Features
- Generate synthetic data based on your schema
- Anonymize existing production data for a better developer experience
- Subset your production database for local and CI testing using any SQL query
- Complete async pipeline that automatically handles job retries, failures and playback using an event-sourcing model
- Automatic referential integrity for your data
- Declarative, GitOps-based configs that run as a step in your CI pipeline to hydrate your CI DB
- Pre-built data transformers for all major data types
- Custom data transformers using JavaScript or LLMs
- Pre-built integrations with Postgres, MySQL and S3
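The anonymization and referential integrity features above can be illustrated with a small, self-contained Go sketch. This is not Neosync's implementation or API: the pseudonymizeEmail helper, the demo secret, and the in-memory user and order rows are all hypothetical. It shows one common technique, keyed hashing with HMAC-SHA256, in which the same input always maps to the same replacement, so a foreign key that references an anonymized value still matches after both tables are transformed.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// pseudonymizeEmail is a hypothetical helper, not a Neosync transformer.
// It replaces an email with a deterministic stand-in by keying HMAC-SHA256
// with a secret: equal inputs always yield equal outputs.
func pseudonymizeEmail(secret []byte, email string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(email))
	sum := hex.EncodeToString(mac.Sum(nil))
	// Keep the result email-shaped so downstream validation still passes.
	return "user_" + sum[:12] + "@example.com"
}

// user and order are made-up rows; order.UserEmail acts as a foreign key
// pointing at user.Email.
type user struct{ Email string }

type order struct {
	ID        int
	UserEmail string
}

// countMatches reports how many orders still reference an existing user.
func countMatches(users []user, orders []order) int {
	known := make(map[string]bool, len(users))
	for _, u := range users {
		known[u.Email] = true
	}
	n := 0
	for _, o := range orders {
		if known[o.UserEmail] {
			n++
		}
	}
	return n
}

func main() {
	// Assumption: a real key would come from a secret store, not source code.
	secret := []byte("demo-only-secret")
	users := []user{{"alice@corp.test"}, {"bob@corp.test"}}
	orders := []order{{1, "alice@corp.test"}, {2, "bob@corp.test"}, {3, "alice@corp.test"}}

	// Transform both tables independently with the same key.
	for i := range users {
		users[i].Email = pseudonymizeEmail(secret, users[i].Email)
	}
	for i := range orders {
		orders[i].UserEmail = pseudonymizeEmail(secret, orders[i].UserEmail)
	}

	fmt.Println("orders with a matching user:", countMatches(users, orders))
}
```

Because the mapping is deterministic, every order in the sketch still resolves to a user after transformation, even though neither table was consulted while transforming the other. The trade-off is that a deterministic mapping is only as private as its key, so the key has to be protected like any other credential.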
Getting started
Neosync ships as a fully dockerized setup, which makes it easy to get up and running.
A compose.yml file at the root contains production image refs, so you can get up and running with just a few commands without having to build anything on your system.
Neosync uses the newer docker compose command (not the legacy docker-compose binary), so be sure to have it installed on your machine.
To start Neosync, clone the repo into a local directory, make sure Docker is installed and running, and then run:
make compose/up
Neosync will now be available at http://localhost:3000.
To stop, run:
make compose/down
The production compose setup comes pre-seeded with connections and jobs to get you started. Simply run the generate and sync job to watch Neosync in action.
Kubernetes, Auth Mode and more
For more in-depth details on environment variables, Kubernetes deployments, and a production-ready guide, check out the Deploy Neosync section of our Docs.
Resources
Some resources to help you along the way:
- Docs for comprehensive documentation and guides
- Discord for discussion with the community and Neosync team
- X for the latest updates
Contributing
We love contributions big and small. Here are just a few ways that you can contribute to Neosync.
- Join our Discord channel and ask us any questions there
- Open a PR (see our instructions on developing with Neosync locally)
- Submit a feature request or bug report
Licensing
We strongly believe in free and open source software and make this repo available under the MIT Expat license.
Directories
Path | Synopsis |
---|---|
backend (module) | |
cli (module) | |
k8s-operator (module) | |
worker (module) | |
pkg/workflows/datasync/activities/shared | This package is intended to be a centralized module for any shared utilities across activities for the datasync workflow |