Capabilities
HPC and HTC Application Execution and Data Management
- Unified API for scheduling jobs to run on a variety of remote resources, including supercomputers, Kubernetes clusters, physical servers and virtual machines.
- Automates the data management lifecycle associated with a job, including staging input data to the execution target and archiving job outputs to storage resources.
- Leverages containerized application assets to enable portability, and reduces overall time-to-solution through data locality and other “smart scheduling” techniques.
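
The stage, execute, archive lifecycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Tapis API: `run_job` and `word_count` are hypothetical names, a local temporary directory stands in for the execution target, and a local folder stands in for the storage resource.

```python
import shutil
import tempfile
from pathlib import Path


def run_job(inputs, app, archive_dir):
    """Stage inputs, run the app in a scratch directory, archive its outputs."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as scratch:
        work = Path(scratch)
        # 1. Stage: copy every input file to the (simulated) execution target.
        for src in inputs:
            shutil.copy(src, work / Path(src).name)
        staged = {p.name for p in work.iterdir()}
        # 2. Execute: the app reads and writes inside the working directory.
        app(work)
        # 3. Archive: copy files the app created to (simulated) storage.
        archived = []
        for out in sorted(work.iterdir()):
            if out.is_file() and out.name not in staged:
                shutil.copy(out, archive_dir / out.name)
                archived.append(out.name)
    return archived


def word_count(work):
    """Hypothetical application: count words in every staged .txt file."""
    total = sum(len(p.read_text().split()) for p in work.glob("*.txt"))
    (work / "count.out").write_text(str(total))
```

Calling `run_job([path_to_input], word_count, "archive")` returns the names of the archived output files; the scratch directory is removed once the job finishes, so only archived outputs survive.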
Functions-as-a-service, Sensor Data/Streaming Data API & Events-Driven Workloads
- Provides a distributed computing platform where computational primitives are based on Docker container images; containers are executed on Tapis cloud infrastructure in response to messages sent over HTTP.
- Triggers executions based on Tapis events (e.g., file uploads or data streams) and automatically scales functions to run in parallel as message load increases.
- “Serverless” computing model: end users have no servers to manage.
- Stores and retrieves sensor data for batch job processing, with support for temporal and spatial indexes and queries.
- Automated, event-driven data stream processing workflows with integration into Tapis functions.
- Automated data management and scheduled archiving based on programmable policies.
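
Scaling functions with message load can be illustrated with a simple sizing rule. The sketch below is an assumption made for illustration, not the platform's actual scaling policy: `desired_workers`, `per_worker`, and `max_workers` are hypothetical names, and scaling to zero when idle is an assumed behavior.

```python
import math


def desired_workers(queued_messages, per_worker=10, max_workers=8):
    """Return how many function instances to run for the current backlog."""
    if queued_messages <= 0:
        return 0  # assumed: scale to zero when there is nothing to process
    # One instance per `per_worker` queued messages, rounded up...
    needed = math.ceil(queued_messages / per_worker)
    # ...but never more than the configured ceiling.
    return min(needed, max_workers)
```

With these defaults, a backlog of 25 messages asks for three instances, and very large backlogs are capped at `max_workers`.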
Highly Scalable Document Store and Metadata API
- Store and scale research data collections to billions of documents serialized using popular data formats such as JSON.
- Configure custom indexes to optimize performance for specific usage patterns.
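
To show why a custom index helps a specific usage pattern, here is a toy in-memory collection of JSON documents. It is a sketch only: the `Collection` class and its methods are hypothetical and are not the Metadata API, and indexed field values are assumed to be hashable (strings, numbers).

```python
import json
from collections import defaultdict


class Collection:
    """Toy in-memory document collection with optional per-field indexes."""

    def __init__(self):
        self.docs = []
        self.indexes = {}  # field name -> {value: [positions in self.docs]}

    def create_index(self, field):
        # Build the index from documents already in the collection.
        index = defaultdict(list)
        for pos, doc in enumerate(self.docs):
            if field in doc:
                index[doc[field]].append(pos)
        self.indexes[field] = index

    def insert(self, raw_json):
        doc = json.loads(raw_json)
        self.docs.append(doc)
        pos = len(self.docs) - 1
        # Keep every existing index up to date.
        for field, index in self.indexes.items():
            if field in doc:
                index[doc[field]].append(pos)

    def find(self, field, value):
        if field in self.indexes:
            # Indexed lookup: jump straight to the matching documents.
            return [self.docs[p] for p in self.indexes[field].get(value, [])]
        # No index: fall back to scanning every document.
        return [d for d in self.docs if d.get(field) == value]
```

Both code paths in `find` return the same documents; the index trades extra memory and insert-time work for lookups that no longer scan the whole collection.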
Identity, Authorization and Federated Security
- Federated, decentralized model where each site/institution can manage the credentials and other secrets needed to access their compute resources.
- Robust authorization based on scalable permissions, groups and roles.
- Pluggable identity provider support to leverage local or regional identities or a federated provider such as InCommon.
- Secure authentication with short-term tokens based on OAuth2.
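
The combination of short-lived tokens and role-based permissions can be sketched as follows. This is not an OAuth2 implementation and not the Tapis security API: the `Token` class, the `ROLE_PERMISSIONS` table, the permission strings, and the 15-minute default lifetime are all assumptions made for illustration.

```python
import time
from dataclasses import dataclass

# Hypothetical role table: each role grants a set of permissions.
ROLE_PERMISSIONS = {
    "reader": {"files:read"},
    "analyst": {"files:read", "jobs:submit"},
}


@dataclass(frozen=True)
class Token:
    user: str
    roles: tuple
    expires_at: float  # seconds since the epoch


def issue_token(user, roles, ttl_seconds=900, now=None):
    """Create a token that stops working after `ttl_seconds`."""
    now = time.time() if now is None else now
    return Token(user, tuple(roles), now + ttl_seconds)


def is_allowed(token, permission, now=None):
    """Allow a request only if the token is fresh and a role grants it."""
    now = time.time() if now is None else now
    if now >= token.expires_at:
        return False  # expired tokens grant nothing
    return any(permission in ROLE_PERMISSIONS.get(r, set()) for r in token.roles)
```

Because the token expires on its own, a leaked token is only useful for a short window, while the role table keeps permission changes in one place.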
Containers, Reproducibility & Smart Scheduling
- The API remembers inputs and parameters used for each job so that computations can be repeated.
- The API tracks which users modified which assets (files, apps, actors, permissions, etc.).
- Supports gathering usage metrics for reporting to funding agencies as well as security incident analysis.
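
Remembering a job's inputs and parameters is what makes it repeatable. The sketch below shows one way such a record could work; `JobRecord`, `record_job`, and `resubmit` are hypothetical names rather than part of the API, and parameter values are assumed to be JSON-serializable.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    app: str
    inputs: tuple
    parameters: tuple  # sorted (name, value) pairs
    submitted_by: str

    def fingerprint(self):
        """Hash of what was computed, independent of who submitted it."""
        payload = json.dumps([self.app, self.inputs, self.parameters])
        return hashlib.sha256(payload.encode()).hexdigest()


def record_job(history, app, inputs, parameters, user):
    """Append an immutable record of the submission to the history."""
    rec = JobRecord(app, tuple(inputs), tuple(sorted(parameters.items())), user)
    history.append(rec)
    return rec


def resubmit(history, record, user):
    """Repeat an earlier computation with identical inputs and parameters."""
    return record_job(history, record.app, record.inputs,
                      dict(record.parameters), user)
```

Two records with the same fingerprint describe the same computation, while `submitted_by` keeps track of who ran it, which is the kind of information usage reporting and incident analysis rely on.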