Features¶

For Infrastructure/Ops¶

De-centralized worker deployment¶

Scripts are deployed from the web UI
Versioned, can co-exist in many versions at once
If you can publish to PyPi, it can be deployed

Real-time monitoring¶

Jobs send their status (success / fail) and logs in real time
Notifications can be configured per job to be triggered in case of failure or missed mark (should be running but is not)

Centralized secrets management¶

Scripts pull their configuration from the “vault”
Secrets are stored in a central place
Secrets are never persisted on disk
ACL apply to secret access

Multiple authentication backends (local, ldap, …)¶

Authentication
- LDAP
- Okta
- Local accounts
Authorization
- Groups
- Roles

For Developers¶

Workers as simple python scripts¶

Scripts are simple Python scripts
You can use whatever library you want, as long as it can be pip-installed
No hard requirement on using Kirby at all
Configuration is passed through env variables

File-based local unit-testing¶

Kirby is meant to be testable
All input / output in Kirby is a message
Messages are simulated using folders and files
Comes with integrated testing helpers

Simple integration to web frameworks¶

Designed to allow swap-in replacement of Celery
- ex: produce data in an ETL script, use it straight in your app, or the other way around
Chord / group logic is replaced by stream piping
Higher throughput and parallelism
Simpler status management and failure recovery
Not limited to running jobs, integrations can also serve as a view on all Kirby-managed data
Can be used in templates, views, etc.

For Data Engineers¶

Scheduled tasks with exceptions¶

Schedule can be configured with cron-like syntax
Add exceptions to pause or slow down a job from running (when source is broken for instance)
Automatically back to normal schedule at the end of the exception, no need to undo them

Rewind/replay of tasks¶

Rewind a data source and all downstream processing will flow naturally (but will flag the data as the result of a re-run in case it matters)
Replay production data in preprod or even on a developer laptop → testing with prod data = easier development / debugging

Data traceability with data dependency graph documentation¶

Each process adds itself in the message headers, to allow end-to-end data traceability
Visualize data flows in the web UI to understand
- where your data comes from
- when it was processed
- what transformations were applied

Computed tables¶

Kirby State objects allows you to keep a persistent object that can be updated by scripts
By default, a state only shows its latest version, but each mutation is versioned and can be recovered (only limited by disk space)
Useful for “hot” data with expensive computation costs

Read the Docs v: latest

Versions: latest

Downloads: pdf; html; epub

On Read the Docs: Project Home; Builds

Free document hosting provided by Read the Docs.