Capabilities

HPC and HTC Application Execution and Data Management

Unified API for scheduling jobs to run on a variety of remote resources, including supercomputers, Kubernetes clusters, physical servers and virtual machines.
Automates the data management lifecycle associated with a job, including staging input data to the execution target and archiving job outputs to storage resources.
Leverages containerized application assets to enable portability, and reduces the overall time-to-solution by utilizing data locality and other “smart scheduling” techniques.
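
The job lifecycle described above (stage inputs, run, archive outputs) can be sketched as follows. This is only an illustration under stated assumptions: JobRequest, plan_lifecycle, the field names, the system names and the store:// URI are all hypothetical and are not the Tapis job schema or client API.

```python
from dataclasses import dataclass, field


@dataclass
class JobRequest:
    """Hypothetical job description; field names are illustrative only."""
    app_id: str
    execution_system: str
    inputs: dict                      # input name -> source URI
    archive_system: str
    parameters: dict = field(default_factory=dict)


def plan_lifecycle(job):
    """Return the ordered steps a scheduler would carry out for this job."""
    # 1. Stage every input to the execution target.
    steps = [f"stage {uri} -> {job.execution_system}" for uri in job.inputs.values()]
    # 2. Run the containerized application there.
    steps.append(f"run {job.app_id} on {job.execution_system}")
    # 3. Archive the outputs to a storage resource.
    steps.append(f"archive outputs -> {job.archive_system}")
    return steps


job = JobRequest(
    app_id="example-app",
    execution_system="hpc-cluster",
    inputs={"dataset": "store://project/data.csv"},
    archive_system="archive-store",
)
for step in plan_lifecycle(job):
    print(step)
```

The point of the sketch is that the caller describes the job once and the platform derives the staging and archiving steps, rather than the user scripting each transfer.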

Functions-as-a-service, Sensor Data/Streaming Data API & Events-Driven Workloads

Provides a distributed computing platform where computational primitives are based on Docker container images; containers are executed on Tapis cloud infrastructure in response to messages sent over HTTP.
Triggers executions based on Tapis events (e.g., file uploads or data streams) and automatically scales functions to run in parallel as message load increases.
“Serverless” computing model: no servers for the end user to manage.
Stores and retrieves sensor data for batch job processing, with support for temporal and spatial indexes and queries.
Automated, event-driven data stream processing workflows with integration into Tapis functions.
Automated data management and scheduled archiving based on programmable policies.
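
A minimal sketch of the event-triggered, load-scaled model described above, assuming a simple in-process registry. The names on, dispatch and workers_needed, the "file_upload" event type, and the scaling thresholds are hypothetical and chosen for illustration; they do not reflect how Tapis implements functions or autoscaling.

```python
import math

# Registry mapping an event type to the functions that should run for it.
handlers = {}


def on(event_type):
    """Decorator that registers a function for one event type."""
    def register(fn):
        handlers.setdefault(event_type, []).append(fn)
        return fn
    return register


@on("file_upload")
def process_upload(event):
    return f"processing {event['path']}"


def dispatch(event):
    """Run every function registered for the event's type."""
    return [fn(event) for fn in handlers.get(event["type"], [])]


def workers_needed(pending_messages, per_worker=10, max_workers=8):
    """Scale out with queue depth, capped at max_workers (assumed policy)."""
    if pending_messages <= 0:
        return 0
    return min(max_workers, math.ceil(pending_messages / per_worker))


print(dispatch({"type": "file_upload", "path": "/data/a.csv"}))
print(workers_needed(25))
```

The end user only writes the handler; deciding how many copies run in parallel is left to the platform, which is what makes the model "serverless" from the user's side.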

Highly Scalable Document Store and Metadata API

Store and scale research data collections to billions of documents serialized using popular data formats such as JSON.
Configure custom indexes to optimize performance for specific usage patterns.
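
To illustrate why a custom index helps a specific usage pattern, here is a small in-memory sketch. The Collection class and its methods are hypothetical and stand in for a real document store; indexed values are assumed to be hashable.

```python
import json
from collections import defaultdict


class Collection:
    """Toy document collection with optional per-field indexes."""

    def __init__(self):
        self.docs = []
        self.indexes = {}  # field -> {value -> [positions in self.docs]}

    def create_index(self, field):
        """Build an index over documents that are already stored."""
        index = defaultdict(list)
        for pos, doc in enumerate(self.docs):
            if field in doc:
                index[doc[field]].append(pos)
        self.indexes[field] = index

    def insert(self, raw_json):
        """Parse a JSON document, store it, and update every index."""
        doc = json.loads(raw_json)
        pos = len(self.docs)
        self.docs.append(doc)
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]].append(pos)

    def find(self, field, value):
        """Use the index when one exists; otherwise scan every document."""
        if field in self.indexes:
            return [self.docs[p] for p in self.indexes[field].get(value, [])]
        return [d for d in self.docs if d.get(field) == value]


samples = Collection()
samples.insert('{"site": "A", "temp": 21.5}')
samples.create_index("site")
samples.insert('{"site": "B", "temp": 19.0}')
print(samples.find("site", "B"))
```

Queries on an indexed field avoid the full scan, which is the trade-off an index buys at the cost of extra work on every insert.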

Identity, Authorization and Federated Security

Federated, decentralized model where each site/institution can manage the credentials and other secrets needed to access their compute resources.
Robust authorization based on scalable permissions, groups and roles.
Pluggable identity provider to leverage local or regional identities or a federated provider such as InCommon.
Secure authentication with short-term tokens based on OAuth2.
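
The combination of short-term tokens and role-based authorization can be sketched as below. This is not an OAuth2 implementation and not the Tapis security API: AccessToken, issue_token, is_authorized, the role names and the one-hour lifetime are assumptions made for illustration.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessToken:
    """Hypothetical token: who it is for, what roles it carries, when it expires."""
    subject: str
    roles: frozenset
    expires_at: float  # seconds since the epoch


def issue_token(subject, roles, ttl_seconds=3600, now=None):
    """Create a token that stops being valid ttl_seconds after 'now'."""
    now = time.time() if now is None else now
    return AccessToken(subject, frozenset(roles), now + ttl_seconds)


def is_authorized(token, required_role, now=None):
    """Allow the request only if the token is unexpired and carries the role."""
    now = time.time() if now is None else now
    return now < token.expires_at and required_role in token.roles


token = issue_token("alice", {"job_submitter"}, ttl_seconds=3600, now=1000.0)
print(is_authorized(token, "job_submitter", now=2000.0))
print(is_authorized(token, "job_submitter", now=5000.0))
```

Because the token expires on its own, a leaked token is only useful for a limited window, which is the main reason for keeping lifetimes short.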

Containers, Reproducibility & Smart Scheduling

The API remembers inputs and parameters used for each job so that computations can be repeated.
The API tracks which users modified which assets (files, apps, actors, permissions, etc.).
Supports gathering usage metrics for reporting to funding agencies as well as security incident analysis.
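
One way to picture how remembering inputs and parameters makes a computation repeatable is the sketch below. The functions job_record, fingerprint and remember, and the in-memory history dictionary, are hypothetical; the source does not describe how the API stores this information.

```python
import hashlib
import json


def job_record(app_id, inputs, parameters):
    """Bundle everything needed to repeat a run into one plain dictionary."""
    return {"app_id": app_id, "inputs": inputs, "parameters": parameters}


def fingerprint(record):
    """Hash a canonical JSON form so identical runs map to the same ID."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


history = {}  # fingerprint -> record (stand-in for persistent storage)


def remember(record):
    """Store the record and return the ID needed to look it up later."""
    key = fingerprint(record)
    history[key] = record
    return key


first = job_record("example-app", {"dataset": "data.csv"}, {"threads": 4, "seed": 7})
key = remember(first)

# Repeating the computation means looking the record up and resubmitting it.
repeat = history[key]
print(repeat == first)
```

Sorting keys before hashing means two requests that differ only in the order of their parameters are recognized as the same computation.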