OSCAR - Open Source Serverless Computing for Data-Processing Applications
Introduction
OSCAR is an open-source platform to support the event-driven serverless
computing model for data-processing applications. It can be automatically
deployed on multi-Clouds, and even on low-powered devices, to create highly-parallel event-driven
data-processing serverless applications along the computing continuum. These applications execute on customized runtime
environments provided by Docker containers that run on elastic Kubernetes clusters.
Information on how to deploy an OSCAR cluster using the Infrastructure Manager can be found at: https://grycap.github.io/oscar/deploy-im-dashboard/
For more documentation visit https://grycap.github.io/oscar/
NOTE: If you detect inaccurate or unclear information in the documentation, please report it to us by either opening an issue or contacting us at products@grycap.upv.es
Overview
Why OSCAR
FaaS platforms are typically oriented to the execution of short-lived functions,
coded in a certain programming language, in response to events. Scientific
applications can greatly benefit from this event-driven computing paradigm in
order to trigger on demand the execution of a resource-intensive application
that requires processing a certain file that was just uploaded to a storage
service. This requires additional support for the execution of generic
applications in existing open-source FaaS frameworks.
To this aim, OSCAR supports the
High Throughput Computing Programming Model
initially introduced by the SCAR framework,
which creates highly-parallel event-driven data-processing serverless applications
that execute on customized runtime environments provided by Docker containers
that run on AWS Lambda.
With OSCAR, users upload files to a data storage back-end and this automatically
triggers the execution of parallel invocations to a service responsible for
processing each file. Output files are delivered into a data storage back-end
for the convenience of the user. The user only specifies the Docker image and
the script to be executed, inside a container created out of that image,
to process a file that will be automatically made available to the
container. The deployment of the computing infrastructure and its scalability
are abstracted away from the user. Synchronous invocations are also supported to create
scalable HTTP-based endpoints for triggering containerised applications.
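The file-triggered model described above can be pictured with a toy, in-memory simulation. This is only a sketch under stated assumptions and does not use OSCAR's API: the `Service` class, the `process_uploads` function, the image and script names, and the dictionaries standing in for storage buckets are all hypothetical, and a thread pool stands in for the parallel container invocations.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Service:
    """Hypothetical stand-in for a service (not OSCAR's real data model)."""
    image: str                         # Docker image name (illustrative value)
    script: str                        # script run in the container (illustrative value)
    handler: Callable[[bytes], bytes]  # simulates what the script does to one file


def process_uploads(service: Service, input_bucket: Dict[str, bytes]) -> Dict[str, bytes]:
    """Run one simulated invocation per uploaded file, in parallel."""
    output_bucket: Dict[str, bytes] = {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Each uploaded file triggers its own independent invocation.
        futures = {name: pool.submit(service.handler, data)
                   for name, data in input_bucket.items()}
        # Each result is delivered to the output "bucket" under the same name.
        for name, future in futures.items():
            output_bucket[name] = future.result()
    return output_bucket


# Usage: two uploaded files are processed by an upper-casing "script".
service = Service(image="example/image:latest", script="script.sh",
                  handler=lambda data: data.upper())
uploads = {"a.txt": b"hello", "b.txt": b"world"}
results = process_uploads(service, uploads)
print(results)
```

In this sketch the user-facing inputs are just the image, the script, and the uploaded files; how many workers run and where they run is hidden inside `process_uploads`, mirroring the way the platform hides infrastructure and scaling.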
Components
OSCAR runs on an elastic Kubernetes cluster that is deployed using:
- IM, an open-source virtual infrastructure
provisioning tool for multi-Clouds.
The following components are deployed inside the Kubernetes cluster to support the enactment of the OSCAR platform:
- CLUES, an elasticity manager that
horizontally scales in and out the number of nodes of the Kubernetes cluster
according to the workload.
- MinIO, a high-performance distributed object storage
server that provides an API compatible with S3.
- Knative, a serverless framework to serve
container-based applications for synchronous invocations (default Serverless
Backend).
- OSCAR Manager, the main API, responsible for the management of the services and the integration of the different components.
- OSCAR UI, an easy-to-use web-based graphical user interface aimed at end users.
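The horizontal scaling performed by the elasticity manager in the list above can be illustrated with a toy calculation. The sketch below is not CLUES's actual policy or API: the `desired_nodes` function and its parameters (`pending_jobs`, `jobs_per_node`, `min_nodes`, `max_nodes`) are hypothetical names chosen only to show how a node count might follow the workload within fixed bounds.

```python
import math


def desired_nodes(pending_jobs: int, jobs_per_node: int,
                  min_nodes: int, max_nodes: int) -> int:
    """Return a node count that fits the workload, clamped to the given bounds."""
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    # Nodes needed to host all pending jobs, rounding up.
    needed = math.ceil(pending_jobs / jobs_per_node)
    # Scale out when the workload grows and scale in when it shrinks,
    # but never leave the [min_nodes, max_nodes] range.
    return max(min_nodes, min(max_nodes, needed))


# Usage: with 4 jobs per node and between 1 and 5 nodes allowed.
for jobs in (0, 10, 100):
    print(jobs, desired_nodes(jobs, jobs_per_node=4, min_nodes=1, max_nodes=5))
```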
As external storage providers, the following services can be used:
- External MinIO servers, which may run in clusters other than
the one hosting the platform.
- Amazon S3, an object storage service
that offers industry-leading scalability, data availability, security, and
performance in the AWS public Cloud.
- Onedata, the global data access solution for science,
used in the EGI Federated Cloud.
- dCache, a system for storing and retrieving huge amounts of data, distributed among a large number of heterogeneous server nodes, under a single virtual filesystem tree with a variety of standard access methods.
An OSCAR cluster can be easily deployed via the IM Dashboard
on any major public and on-premises Cloud provider, including the EGI Federated Cloud.
Licensing
OSCAR is licensed under the Apache License, Version 2.0. See
LICENSE for the full
license text.
Acknowledgements
This development is partially funded by the EGI Strategic and Innovation Fund.
It is also partially funded by the project AI-SPRINT "AI in Secure Privacy-Preserving Computing Continuum", which has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under Grant 101016577.
Additional funding comes from Grant PDC2021-120844-I00, funded by Ministerio de Ciencia e Innovación/Agencia Estatal de Investigación/ 10.13039/501100011033 and by “European Union NextGenerationEU/PRTR”, and from Grant PID2020-113126RB-I00, funded by Ministerio de Ciencia e Innovación/Agencia Estatal de Investigación/ 10.13039/501100011033.
This software has received a silver badge according to the Software Quality Baseline criteria defined by the EOSC-Synergy project.
Please acknowledge the use of OSCAR by citing the following scientific
publications (preprints available):
Sebastián Risco, Germán Moltó, Diana M. Naranjo and Ignacio Blanquer. (2021). Serverless Workflows for Containerised Applications in the Cloud Continuum. Journal of Grid Computing, 19(3), 30. https://doi.org/10.1007/s10723-021-09570-2
Alfonso Pérez, Sebastián Risco, Diana M. Naranjo, Miguel Caballer and Germán Moltó. (2019). Serverless Computing for Event-Driven Data Processing Applications. 2019 IEEE International Conference on Cloud Computing (CLOUD 2019). https://ieeexplore.ieee.org/document/8814513/