provly

module

v0.0.0-...-c50d277 Published: Nov 25, 2019 License: Apache-2.0

Repository

github.com/schafer14/provly

README

Provly

provly logo

Provly & Provenance

Provly is a rough implementation of the W3C's provenance standards. It maps the relationships between objects across time. Note that these may be the same object changing over time; this case is covered in the Data models section.

In order to do this Provly provides the following data models:

Entities
Activities
Agents

These models are connected through a set of relationships defined in the Prov spec.
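
As a rough illustration of how the three models and their relationships fit together, the sketch below models nodes and typed edges in memory. The relation names and their endpoint types come from the W3C PROV data model; the Go types (Kind, Node, Relation, Relate) are hypothetical and are not taken from Provly's own packages.

```go
package main

import "fmt"

// Kind distinguishes the three PROV node types listed above.
type Kind string

const (
	Entity   Kind = "entity"
	Activity Kind = "activity"
	Agent    Kind = "agent"
)

// Node is a hypothetical in-memory stand-in for a stored record.
type Node struct {
	ID   string
	Kind Kind
}

// Relation is a directed, labelled edge between two nodes.
type Relation struct {
	From  Node
	Label string
	To    Node
}

// allowed maps a few PROV relation names to the (from, to) kinds they connect.
var allowed = map[string][2]Kind{
	"used":              {Activity, Entity},
	"wasGeneratedBy":    {Entity, Activity},
	"wasDerivedFrom":    {Entity, Entity},
	"wasAssociatedWith": {Activity, Agent},
	"wasAttributedTo":   {Entity, Agent},
}

// Relate builds an edge and rejects unknown labels or mismatched endpoint kinds.
func Relate(from Node, label string, to Node) (Relation, error) {
	kinds, ok := allowed[label]
	if !ok {
		return Relation{}, fmt.Errorf("unknown relation %q", label)
	}
	if from.Kind != kinds[0] || to.Kind != kinds[1] {
		return Relation{}, fmt.Errorf("%s cannot link %s to %s", label, from.Kind, to.Kind)
	}
	return Relation{From: from, Label: label, To: to}, nil
}

func main() {
	seeds := Node{ID: "seed-packet", Kind: Entity}
	planting := Node{ID: "planting", Kind: Activity}

	rel, err := Relate(planting, "used", seeds)
	fmt.Println(rel, err)
}
```

Checking endpoint kinds at construction time keeps malformed edges out of the graph; whether Provly validates relations this way is not stated in this README.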

The goal of this particular provenance implementation is to map activities in the research process to make scientific reproducibility easier. To accomplish this goal, the implementation strays from the W3C reference where necessary.

Example

graph

This is an example graph that might be stored in Provly to describe a hypothetical experiment. The experiment revolves around a packet of seeds. Most papers written about such an experiment would only describe the packet of seeds; linking to this prov graph also describes the origin of the seed packet and the analysis that was done to produce the paper. We can see who was involved in each step of the experiment, and where entities or agents represent software, hashes of the documents are given to ensure exact replication.

Getting Started

This section describes setting up Provly on your machine for development or personal use. If you are interested in using an existing Provly instance through an API, contact the repo owners.

Required software:

Go v1.13
Docker

# Start the database
make start-db

# Start tracing server
make start-zipkins

# Run migrations
make migrate

# Start server
go run ./cmd/provly-api --zipkin-reporter-uri=0.0.0.0:9411

Command default options

--web-api-host=0.0.0.0:3000
--web-debug-host=0.0.0.0:4000
--web-read-timeout=10s
--web-write-timeout=10s
--web-shutdown-timeout=5s
--db-user=root
--db-host=[http://localhost:8529]
--db-name=provly
--zipkin-local-endpoint=0.0.0.0:3000
--zipkin-reporter-uri=http://zipkins:9411/api/v2/spans
--zipkin-service-name=provly-api
--zipkin-probability=0.05

Testing

go test ./...

Loading demo data

The data used to create the diagram above can be loaded into the database by running:

make demo

Points of interest

There are now four services that you can interact with to help with development.

API - running on :3000
Monitoring & Debug - running on :4000
Arango Graph DB - running on :8529
Zipkins Tracing - running on :9411

Data models

A goal of provenance is to track relationships between entities across time, using activities as the main catalyst for change. This model is often conceptually different from the data models used in applications. Understanding these differences is key to using Provly effectively.

While most applications build up relationships between different objects at a single point in time (normally as current as possible), Provly builds up relationships between versions of a single object across a time range. As a result, each item in Provly has two identifiers: a canonical ID, which identifies the resource to the outside world, and a provenance ID, which identifies a particular version of that resource.

If this is hard to conceptualize, consider the old puzzle: "If the blade of an axe is replaced, and then its handle, is it still the same axe?" This could be modelled in Provly as follows:

data model diagram

As you can see, as the axe goes through transformations its canonical ID does not change, but it gets a new provenance ID after each transformation.
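
The axe example can be sketched in a few lines of Go. This is a simplified in-memory illustration, not Provly's storage model: the Version and History types are hypothetical, and the provenance ID scheme (canonical ID plus a version counter) is an assumption chosen only to make the output readable.

```go
package main

import "fmt"

// Version is one point-in-time state of a resource.
type Version struct {
	CanonicalID  string // stable for the lifetime of the resource
	ProvenanceID string // unique to this particular version
	Description  string
}

// History collects every version that shares one canonical ID.
type History struct {
	canonicalID string
	versions    []Version
}

// NewHistory starts a history with an initial version.
func NewHistory(canonicalID, description string) *History {
	h := &History{canonicalID: canonicalID}
	h.Revise(description)
	return h
}

// Revise records a transformation: same canonical ID, fresh provenance ID.
// The "<canonical>/v<n>" scheme is illustrative only.
func (h *History) Revise(description string) Version {
	v := Version{
		CanonicalID:  h.canonicalID,
		ProvenanceID: fmt.Sprintf("%s/v%d", h.canonicalID, len(h.versions)+1),
		Description:  description,
	}
	h.versions = append(h.versions, v)
	return v
}

// Versions returns a copy of the recorded versions, oldest first.
func (h *History) Versions() []Version {
	return append([]Version(nil), h.versions...)
}

func main() {
	axe := NewHistory("axe", "original blade and handle")
	axe.Revise("blade replaced")
	axe.Revise("handle replaced")

	for _, v := range axe.Versions() {
		fmt.Printf("%s  %s  %s\n", v.CanonicalID, v.ProvenanceID, v.Description)
	}
}
```

Looking up the canonical ID answers "which axe?", while a provenance ID answers "which state of that axe?".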

Contributing

All contributions are welcome. Please contact the authors to get involved!

Directories

Path	Synopsis
cmd
provly	provly runs all services related to provly
provly-admin
provly-api
provly-api/internal/handlers
provly-ui
provly-ui/internal/handlers
sidecar/metrics
sidecar/metrics/internal/collector
sidecar/metrics/internal/publisher
sidecar/metrics/internal/publisher/datadog
sidecar/metrics/internal/publisher/expvar
internal
mid
platform/database
platform/database/databasetest
platform/router
prov
prov/activity
prov/agent
prov/arango
prov/entity
schema
tests
