athena-sdk-lite: Overview
Version: 0.1.0
Audience: anyone hearing about this for the first time (stakeholders, new engineers, or partner teams evaluating fit).
What it is
A small Python library for building data + AI workflows as DAGs. You import it, declare the nodes you want (a Postgres read, an AI classification, a branch, a transform), wire them with inputs=, and the library produces a workflow object you can validate, visualize, and run locally.
    from athena_sdk_lite import Workflow
    from athena_sdk_lite.nodes import postgres, ai_tagging, output

    with Workflow("ticket-triage") as wf:
        rows = postgres("load", operation="select", query="SELECT ...", connection={...})
        tagged = ai_tagging("classify", inputs=rows, agent_url="https://...")
        output("results", inputs=tagged, format="json")

    print(wf.visualize())   # ascii DAG
    issues = wf.validate()  # [] if good
    result = wf.run()       # local, in-process
That's the whole API a normal user sees. No CLI. No backend. No API key. No managers / mixins / codegen.
What problem it solves
Today, "build a pipeline that touches a database and an AI model" is solved by writing scattered scripts — one for the DB pull, one to call the model, one to format output, glued together with cron and a Slack message. Each script is bespoke; there's no shared shape; testing is ad-hoc; reasoning about what runs when is hard.
This SDK gives a single shape, a Workflow of typed nodes, that:
- Reads top-to-bottom like a script (no hidden framework magic)
- Validates structurally before you run it (wf.validate())
- Renders an ASCII diagram on demand (wf.visualize())
- Runs locally with no service dependency
- Can be extended without monkey-patching when you outgrow the 11 built-in node helpers
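As an illustration of what "validates structurally" can mean for a DAG, the sketch below checks two things: every input refers to a declared node, and the graph has no cycle. It is a generic example, not the checks `wf.validate()` actually performs; the function name `validate_dag` and the dict-of-lists input format are assumptions. It mirrors the convention shown earlier of returning an empty list when nothing is wrong.

```python
def validate_dag(nodes):
    """Return a list of issue strings; an empty list means the graph looks sound.

    `nodes` maps each node name to the list of upstream node names it reads from.
    """
    issues = []

    # Check 1: every referenced input must be a declared node.
    for name, inputs in nodes.items():
        for src in inputs:
            if src not in nodes:
                issues.append(f"{name}: unknown input '{src}'")

    # Check 2: cycle detection with Kahn's algorithm. Repeatedly remove nodes
    # whose inputs are all resolved; anything left over sits on or behind a cycle.
    indegree = {
        name: sum(1 for src in inputs if src in nodes)
        for name, inputs in nodes.items()
    }
    ready = [name for name, degree in indegree.items() if degree == 0]
    while ready:
        current = ready.pop()
        for name, inputs in nodes.items():
            hits = inputs.count(current)
            if hits:
                indegree[name] -= hits
                if indegree[name] == 0:
                    ready.append(name)

    stuck = sorted(name for name, degree in indegree.items() if degree > 0)
    if stuck:
        issues.append("cycle detected; unresolved nodes: " + ", ".join(stuck))
    return issues


good = {"load": [], "classify": ["load"], "results": ["classify"]}
print(validate_dag(good))
```

Running checks like these before execution means a typo in a node name or an accidental loop surfaces immediately, rather than partway through a run that has already touched a database.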
Who uses it
- End users building one-off or recurring workflows in Python. They write 5–50 line scripts using the 11 starter helpers.
- Wrapper authors packaging domain logic (e.g. pharma-workflows, marketing-pipelines) on top. They register custom helpers, compose sub-workflows, and add lifecycle hooks. End users of their package never see the wrapping layer.
Position in the broader stack
              ┌──────────────────────────────┐
              │    end-user Python script    │  ← you import & write here
              │     (or wrapper package)     │
              └──────────────┬───────────────┘
                             │
              ┌──────────────▼───────────────┐
              │       athena-sdk-lite        │  ← thin, obvious surface
              │        (this package)        │
              └──────────────┬───────────────┘
                             │
              ┌──────────────▼───────────────┐
              │      vendored _engine/       │  ← workflow execution
              │      (from athena-sdk)       │
              └──────────────┬───────────────┘
                             │
         ┌───────────────────┼───────────────────┐
         ▼                   ▼                   ▼
    ┌──────────┐        ┌──────────┐        ┌──────────┐
    │ Postgres │        │    S3    │        │  Athena  │
    │          │        │          │        │  agent   │
    └──────────┘        └──────────┘        └──────────┘
The library is local-only by design. There is no service to deploy, no backend to call. The vendored engine inside _engine/ does the actual node execution; the SDK is a typed builder on top.
What's in the box (the starter set)
| Helper | Purpose |
|---|---|
| pubmed | Biomedical literature search |
| postgres | DB select / insert / update / upsert |
| s3 | Object storage read / write |
| local_file | Read CSV / JSON / Excel from the local filesystem |
| http | Generic HTTP request |
| ai_tagging | Athena agent / classification call |
| filter | Row filter (eq/gt/contains/...) |
| transform | User-supplied Python code |
| output | Terminal sink (json/csv/text) |
| branch | Two-way conditional (engine if node) |
| merge | Fan-in (join or concat) |
For anything outside these 11, the escape hatch wf.add_node(name, type, category, config, inputs) reaches any node type the underlying engine supports.
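A wrapper author might hide the escape hatch behind a small helper of their own. The sketch below assumes only the positional signature given above, `add_node(name, type, category, config, inputs)`. Everything else is hypothetical: the `redis_lookup` node type, the `"data"` category, the config keys, and the return value. `RecordingWorkflow` is a stand-in that records calls so the example runs without the library installed.

```python
class RecordingWorkflow:
    """Stand-in for the real Workflow: it only records add_node calls."""

    def __init__(self):
        self.calls = []

    def add_node(self, name, type, category, config, inputs):
        self.calls.append(
            {"name": name, "type": type, "category": category,
             "config": config, "inputs": inputs}
        )
        return name  # assumption: the real method may return a node handle instead


def redis_lookup(wf, name, inputs, key_field):
    """Hypothetical helper for a node type outside the 11 built-ins."""
    return wf.add_node(
        name,
        "redis_lookup",            # hypothetical engine node type
        "data",                    # hypothetical category
        {"key_field": key_field},  # hypothetical config shape
        inputs,
    )


wf = RecordingWorkflow()
handle = redis_lookup(wf, "cache", inputs="load", key_field="ticket_id")
print(wf.calls[0]["type"])
```

Keeping the engine-specific strings inside one helper means end users of the wrapper package call `redis_lookup(...)` the same way they call the built-in helpers.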
What it does NOT do
Stated up front so expectations are clear:
- No remote execution. Workflows run in your Python process. Production deployment is a separate concern (see the full athena-sdk or nexus-backend for hosted execution).
- No scheduler. Cron, Airflow, or a wrapping process supplies the trigger.
- No registry/UI. Workflows live as Python files in your repo.
- No state store. Each run is independent; persistence is the user's responsibility (write to Postgres, S3, etc. via the node helpers).
- No CLI. It's a Python library. Compose with subprocess, make, or your own entrypoint if you need command-line invocation.
When to use it vs. something heavier
| Use this when | Reach for something heavier when |
|---|---|
| Workflow runs in one Python process | You need distributed execution across machines |
| You want to author and test locally | You need a UI / registry / scheduler |
| The 11 helpers + escape hatch cover your nodes | You need first-class support for many bespoke node types |
| You want stakeholders to read the workflow code | You need non-engineers to author workflows |
Pointers
- Architecture (how the parts fit): architecture.md
- Reference / how to use each feature: technical.md
- Worked examples: examples/01_pubmed_to_ai.py through examples/11_triage_pipeline.py
- Public-API source files (read these first if you're contributing): src/athena_sdk_lite/workflow.py, nodes.py, _compat.py