tromos-ce

Published: Oct 20, 2019 License: Apache-2.0


= (Tromos) Transparent Online Management Of Storage


Storage is an unbalanced world full of trade-offs and compromises. If you want scalability, you have to sacrifice consistency. If you want performance, you have to sacrifice real-time monitoring.
If you want a high ingestion rate, you have to sacrifice either read performance or raw capacity. There is no one-size-fits-all solution, since every application has its own requirements and characteristics.
Unfortunately, the faster the application birth rate increases, the faster these applications (and their requirements) diverge.

In an ideal world, every application would run atop a storage system tailored to its storage requirements. We have already seen the merits of this strategy with Docker containers, which provide a customized
execution environment tailored to the runtime requirements of the application. However, a storage system is a complex piece of software that usually takes years of development and hardening. As a result, the rate at which new systems emerge cannot keep up with the birth rate of applications.

link:http://www.tromos.io/download[**Tromos Community Edition**] (or *Tromos-CE*) aims to solve precisely this problem. Its goal is to be the *easiest and fastest way to design and deploy customized storage containers.*

It does so by breaking the primitives of distributed storage systems into narrow-scoped elements, which application architects can later compose in arbitrary ways. Using the provided domain-specific language, architects can configure a data-management environment tailored to the requirements of the application at hand, simply by choosing the appropriate combination of components (e.g., distribution logic, in-transit processing, consistency level, data layout).


If the application does not require strong semantics, do not apply such a policy. If the application requires strong semantics, add them as a plugin, and do the same for any other storage aspect.
*The general principle behind storage containers is: if specific functionality is needed, add it as a plugin. If it is not needed, skip it to avoid unnecessary overheads.*


*Storage Containers* own the managed files but do not own the underlying raw storage (e.g., filesystem, key-value databases, or cloud storage solutions), which they merely use for data persistence.
In other words, storage containers are middleware that separates changes made to application code by science users from changes made to I/O actions by developers or administrators.


 
[caption="",link=https://gitlab.com/tromos/tromos-ce/raw/master/docs/images/storagecontainers.png]
image::docs/images/storagecontainers.png[800,800]


== Microservices

The Tromos framework consists of Microservices with well-defined APIs whose backends are swappable through plugins. Next, we present the basic architecture of each Microservice.
For the plugins that each Microservice can consume, please consult the link:https://gitlab.com/tromos/tromos-ce/tree/master/hub[Tromos HUB].


==== Devices 
Devices provide a convenient and flexible way to abstract the various backends. Simple connectors do not suffice as they suffer from the link:https://thenewstack.io/avoiding-least-common-denominator-approach-hybrid-clouds/[least-common denominator] problem.
Instead, Tromos provides a framework for mapping several processing layers into virtual Devices. (If you are familiar with link:https://en.wikipedia.org/wiki/Device_mapper[Device Mapper],
this part of Tromos can be regarded as the user-space equivalent of Device Mapper).


[caption="",link=https://gitlab.com/tromos/tromos-ce/blob/master/docs/images/DeviceService.png]
image::./docs/images/DeviceService.png[300,300]
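
The layering idea described above can be sketched in Go as a chain of decorators. This is only an illustration under stated assumptions, not the Tromos API: the `Device` interface and the `memDevice`, `gzipLayer`, and `limitLayer` types are hypothetical names invented for this example. A persistent backend sits at the bottom, and each translator wraps the layer beneath it.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// Device is a hypothetical, minimal storage interface (NOT the real Tromos API).
type Device interface {
	Put(key string, data []byte) error
	Get(key string) ([]byte, error)
}

// memDevice plays the role of a persistent backend; it keeps blobs in memory.
type memDevice struct{ blobs map[string][]byte }

func newMemDevice() *memDevice { return &memDevice{blobs: map[string][]byte{}} }

func (m *memDevice) Put(key string, data []byte) error {
	m.blobs[key] = append([]byte(nil), data...) // store a private copy
	return nil
}

func (m *memDevice) Get(key string) ([]byte, error) {
	b, ok := m.blobs[key]
	if !ok {
		return nil, fmt.Errorf("key %q not found", key)
	}
	return b, nil
}

// gzipLayer is a hypothetical translator that compresses on the way down
// and decompresses on the way up.
type gzipLayer struct{ next Device }

func (g gzipLayer) Put(key string, data []byte) error {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return err
	}
	if err := zw.Close(); err != nil {
		return err
	}
	return g.next.Put(key, buf.Bytes())
}

func (g gzipLayer) Get(key string) ([]byte, error) {
	raw, err := g.next.Get(key)
	if err != nil {
		return nil, err
	}
	zr, err := gzip.NewReader(bytes.NewReader(raw))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// limitLayer is a hypothetical translator that rejects oversized payloads.
type limitLayer struct {
	max  int
	next Device
}

func (l limitLayer) Put(key string, data []byte) error {
	if len(data) > l.max {
		return fmt.Errorf("payload of %d bytes exceeds limit of %d", len(data), l.max)
	}
	return l.next.Put(key, data)
}

func (l limitLayer) Get(key string) ([]byte, error) { return l.next.Get(key) }

func main() {
	// Stack the layers: limit -> gzip -> in-memory backend.
	var dev Device = limitLayer{max: 1 << 20, next: gzipLayer{next: newMemDevice()}}
	if err := dev.Put("greeting", []byte("hello tromos")); err != nil {
		panic(err)
	}
	got, err := dev.Get("greeting")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", got)
}
```

Because every layer satisfies the same interface, layers can be added, removed, or reordered without touching the caller, which is the property the virtual Device abstraction relies on.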

<<<
==== Coordinators
The Coordinators are quite similar to Devices, but they are responsible for the control of the metadata. As we discuss in the next section, separating the data from the metadata gives us greater flexibility, since we can scale each one and provide Quality of Service for it independently of the other.

[caption="",link=https://gitlab.com/tromos/tromos-ce/blob/master/docs/images/CoordinatorService.png]
image::./docs/images/CoordinatorService.png[300,300]

<<<
==== Processors
In its current form, a storage system is much more than a repository of data. It is also a piece of software that processes data on behalf of the application. That processing may involve a stream that ends at a single Device (e.g., encryption, compression, deduplication, filtering),
or a stream that ends at several Devices (e.g., mirroring, striping, erasure coding). The unique feature that Processors bring to the table is their ability to abstract any desired datapath into a directed acyclic graph (DAG).
Through that, advertised features (e.g., replication, striping, erasure coding) are nothing more than components of the graph of a filestore.

Yet another advantage of Processors is their ability to link to each other and form complex distributed processing networks. That comes in especially handy for HPC scenarios, where the cycles of compute nodes are too precious to waste on
simple tasks. For example, data filtering, indexing, compression, or any other similar task can be offloaded to a chained Processor running on a filtering node.



[caption="",link=https://gitlab.com/tromos/tromos-ce/blob/master/docs/images/ProcessorService.png]
image::./docs/images/ProcessorService.png[300,300]
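
To make the graph view of a datapath concrete, here is a minimal Go sketch under stated assumptions. The `Node` type, `Push` method, and `buildGraph` helper are hypothetical and do not come from Tromos. The graph applies one in-transit transformation (upper-casing, standing in for something like compression) and then mirrors the result to two leaf nodes that play the role of Devices.

```go
package main

import (
	"bytes"
	"fmt"
)

// Node is one vertex of a hypothetical processing graph (NOT the Tromos API).
// Apply transforms the incoming chunk; the result is forwarded to every child.
type Node struct {
	Name     string
	Apply    func([]byte) []byte            // optional transformation
	Sink     func(name string, data []byte) // set only on leaves (Devices)
	Children []*Node
}

// Push sends data through this node and, recursively, through its successors.
func (n *Node) Push(data []byte) {
	out := data
	if n.Apply != nil {
		out = n.Apply(data)
	}
	if n.Sink != nil {
		n.Sink(n.Name, out)
	}
	for _, c := range n.Children {
		c.Push(out)
	}
}

// buildGraph wires: upper -> mirror -> {dev0, dev1}.
// Mirroring is just a node with no transformation and two children.
func buildGraph(store map[string][]byte) *Node {
	sink := func(name string, data []byte) {
		store[name] = append([]byte(nil), data...) // keep a private copy
	}
	dev0 := &Node{Name: "dev0", Sink: sink}
	dev1 := &Node{Name: "dev1", Sink: sink}
	mirror := &Node{Name: "mirror", Children: []*Node{dev0, dev1}}
	upper := &Node{Name: "upper", Apply: bytes.ToUpper, Children: []*Node{mirror}}
	return upper
}

func main() {
	store := map[string][]byte{}
	buildGraph(store).Push([]byte("hello"))
	fmt.Printf("dev0=%s dev1=%s\n", store["dev0"], store["dev1"])
}
```

In this sketch mirroring is not a special feature of the store; it is only a particular shape of the graph, which mirrors the point made above about replication and striping.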


==== Client Middleware

The middleware is a lightweight library that enables the client to access the various Microservices and create Meshes of them. For example, the client can partition the keyspace across
several Coordinators and create a composite Namespace, or distribute the data across Devices. Similarly, it can decide which Processor to use to perform in-transit processing of the data
before it reaches the Devices. Although the middleware provides a user-friendly API, it is also equipped with gateways so that clients can benefit from Tromos without having to modify the source
of their applications. For example, when using the FUSE gateway, application architects can mount the Storage Container like a normal filesystem, while still controlling the properties of the virtual storage through the container's Manifest.

[caption="",link=https://gitlab.com/tromos/tromos-ce/blob/master/docs/images/Middleware.png]
image::docs/images/middleware.png[400,400]
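
Partitioning the keyspace across several Coordinators can be illustrated with a small consistent-hash ring. The sketch below is an assumption-laden illustration, not the code of any Tromos plugin: the `Ring` type, `NewRing`, `Locate`, the FNV hash, and the number of virtual points per Coordinator are all choices made for this example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Ring is a hypothetical consistent-hash ring (NOT the Tromos implementation).
type Ring struct {
	points []uint32          // sorted positions on the ring
	owner  map[uint32]string // position -> coordinator name
}

func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// NewRing places `replicas` virtual points on the ring for every coordinator.
func NewRing(replicas int, nodes ...string) *Ring {
	r := &Ring{owner: map[uint32]string{}}
	for _, n := range nodes {
		for i := 0; i < replicas; i++ {
			p := hashKey(fmt.Sprintf("%s#%d", n, i))
			r.points = append(r.points, p)
			r.owner[p] = n
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// Locate returns the coordinator that owns the first point at or after the
// key's hash, wrapping around at the end of the ring.
func (r *Ring) Locate(key string) string {
	if len(r.points) == 0 {
		return ""
	}
	h := hashKey(key)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owner[r.points[i]]
}

func main() {
	ring := NewRing(64, "coord0", "coord1", "coord2")
	for _, path := range []string{"/data/a.bin", "/data/b.bin", "/logs/run.txt"} {
		fmt.Printf("%s -> %s\n", path, ring.Locate(path))
	}
}
```

The appeal of this scheme for a composite Namespace is that adding a Coordinator only moves keys onto the new Coordinator; keys never shuffle between the existing ones.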



link:http://www.tromos.io/docs/overview/introduction/[Learn more] 

<<<


== Using Tromos

Before starting, it is advisable to visit the Tromos link:https://gitlab.com/tromos/tromos-ce/blob/master/docs/tutorial/README.adoc[tutorial] and learn how to design your own manifest.
After following its steps, you will be able to create and mount your virtual storage system with the `tromos-cli` command.

    $ go get -v gitlab.com/tromos/tromos-ce/cmd/tromos-cli
    $ tromos-cli gateway fuse --mountpoint $MOUNTPOINT --manifest $MANIFEST 

`$MOUNTPOINT` is the location where the storage container will be mounted (e.g., `/tmp/test`), and `$MANIFEST` is the specification of the virtual system.
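
If you prefer to launch the gateway from a Go program rather than from a shell, the same invocation can be assembled with `os/exec`. This is a minimal sketch: the `fuseMountCommand` helper is a hypothetical name, and the only command-line arguments used are the ones shown in the shell example above. The sketch builds the command and prints it without running it.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// fuseMountCommand builds (but does not run) the tromos-cli invocation.
// The helper name is hypothetical; the arguments mirror the shell example.
func fuseMountCommand(mountpoint, manifest string) *exec.Cmd {
	cmd := exec.Command("tromos-cli", "gateway", "fuse",
		"--mountpoint", mountpoint,
		"--manifest", manifest)
	// Forward the tool's output to the current terminal.
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	cmd := fuseMountCommand(os.Getenv("MOUNTPOINT"), os.Getenv("MANIFEST"))
	fmt.Println("would run:", cmd.Args)
	// Replace the line above with cmd.Run() to actually start the gateway.
}
```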


.tutorial.yml
[source, yaml]
----

Name: Tutorial
Description: This file describes a storage container


# The Middleware section defines the plugins that will be used
# by the client-side middleware
Middleware:
    DeviceManager:
        plugin: gitlab.com/tromos/hub/selector/random
    Namespace: 
        plugin: gitlab.com/tromos/hub/selector/consistenthash


Devices:
    "dev0":
        Persistent:
            plugin: gitlab.com/tromos/hub/device/filesystem
            family: os
            path: gitlab.com/tromos/scratch/hdd0

        Translators:
            "0":
                plugin: gitlab.com/tromos/hub/device/blob
                blocksize: 2M
    "dev1":
        Persistent:
            plugin: gitlab.com/tromos/hub/device/googledrive
            credentials: /home/myuser/credentials.json
        Translators:
            "0":
                plugin: gitlab.com/tromos/hub/device/throttler
                rate: 500MB
                capacity: 1B
                regulate: channel
            "1":
                plugin: gitlab.com/tromos/hub/device/blob
                blocksize: 2M

Coordinators:
    "coord0":
        Persistent:
            plugin: gitlab.com/tromos/hub/coordinator/boltdb
            path: /tmp/databases/coord0
        Translators:
            "0":
                plugin: gitlab.com/tromos/hub/coordinator/sequencer
                blockw2r: true
                blockw2w: true
----


== Contributing

If you are as excited as we are about the evolution of storage systems and distributed processing, feel free to join our community!
Contributions are greatly appreciated, whether that be feedback, code contributions, or even discussion!


==== Feedback

We are always happy to receive feedback!

* Do any of the commands have surprising effects, output, or results?
* Do you have workflows that the tool supports well, or doesn't support at all?
* Do you have suggestions centered on the user experience (UX) of the tool?

Let us know by filing an issue, describing what you did or wanted to do, what you expected to happen, and what actually happened.


==== Code

The maintainers actively manage the issues list and try to highlight issues suitable for newcomers.

If you want to contribute:

* fork the project
* do your hack
* create a pull request!

Before starting any work, please either comment on an existing issue or file a new one.



==== Contact Information

You can contact the author of Tromos at nikolaidis.fotis@gmail.com



== Licensing

Tromos is licensed under the Apache License, Version 2.0. See
link:https://gitlab.com/tromos/tromos-ce/blob/master/LICENSE[LICENSE] for the full license text.



== FAQ

Tromos is a new project, so things are fragile. Here we list the known issues that may cause inconvenience.

==== What is the error: no matching versions for query "latest"?

* If you experience an error like link:https://github.com/golang/go/issues/27215[go get ... no matching versions for query "latest"], try upgrading your Go version.

==== I'm experiencing data corruption when I have concurrent access to more than 30 files
* Tromos tries to minimize the resources it needs and therefore makes extensive use of pools. The number 30 corresponds to the number of instances
that are waiting in the pool. If you want to serve more than 30 files, please change the `MaxConcurrentChannels` variable defined in configuration/default. You must
also take into account that a file may have both a writer and a reader, so provision 2 channels per file to be on the safe side.
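
The sizing rule above is simple arithmetic, sketched here in Go. The `requiredChannels` helper and the `channelsPerFile` constant are hypothetical names used for illustration; only `MaxConcurrentChannels` comes from Tromos.

```go
package main

import "fmt"

// channelsPerFile follows the rule of thumb above: one reader plus one writer.
const channelsPerFile = 2

// requiredChannels returns a safe value for MaxConcurrentChannels given the
// number of files that may be open concurrently. Hypothetical helper.
func requiredChannels(concurrentFiles int) int {
	if concurrentFiles < 0 {
		return 0
	}
	return concurrentFiles * channelsPerFile
}

func main() {
	// Serving 100 concurrent files calls for 200 channels.
	fmt.Println(requiredChannels(100)) // prints 200
}
```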

==== Runtime error: Elements in the pool have been exhausted
* Pools are periodically exhausted. That is a normal situation, which Tromos handles transparently to the user. This specific error occurs when the available resources are not sufficient to satisfy the minimum that an operation requires. For example, when mirroring your data into 4 locations, you must have at least 4 devices. Otherwise, you get the above error.
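
The mirroring example amounts to a precondition that can be checked before any data is written. The Go sketch below illustrates that check under stated assumptions: `pickMirrorTargets` is a hypothetical helper, and its error text is invented for this example rather than taken from Tromos.

```go
package main

import (
	"errors"
	"fmt"
)

// pickMirrorTargets returns `replicas` devices to mirror onto, or an error
// when fewer devices are available than the operation needs.
// Hypothetical helper; Tromos' own selection logic may differ.
func pickMirrorTargets(devices []string, replicas int) ([]string, error) {
	if replicas <= 0 {
		return nil, errors.New("replicas must be positive")
	}
	if len(devices) < replicas {
		return nil, fmt.Errorf("need %d devices for mirroring, only %d available",
			replicas, len(devices))
	}
	return devices[:replicas], nil
}

func main() {
	devices := []string{"dev0", "dev1"}
	if _, err := pickMirrorTargets(devices, 4); err != nil {
		fmt.Println("error:", err)
	}
}
```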




