Python SDK
Build and register algorithms with Orca in Python
The Orca Python SDK lets you define custom Python algorithms and register them with the Orca framework. With it, you can quickly integrate your analytics or ML logic into Orca’s distributed DAG execution engine.
Getting Started
Before using the SDK, ensure Orca Core is running. If not, follow the Quickstart Guide.
You can test whether Orca is running with the command:
orca status
1 - Install the Python SDK
Install the SDK into your Python project:
pip install orca-python
2 - Define an Algorithm
Create a Python file, e.g. processor.py, define a processor, then attach an algorithm to it:
from orca_python import Processor
import time

proc = Processor("ml")

@proc.algorithm("MyAlgo", "1.0.0", "Every30Second", "1.0.0")
def my_algorithm() -> dict:
    time.sleep(5)
    return {"result": 42}
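The same decorator pattern should let you attach additional algorithms to the same processor. Here is a minimal sketch; the second algorithm’s name and body are hypothetical:
# Hypothetical second algorithm on the same processor, triggered by the same window type.
@proc.algorithm("MyOtherAlgo", "1.0.0", "Every30Second", "1.0.0")
def my_other_algorithm() -> dict:
    # Any Python logic can live here - this one just sums a few numbers.
    return {"total": sum(range(10))}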
3 - Start the Processor
Invoke the Register() and Start() functions on the processor to register it with Orca Core:
...
if __name__ == "__main__":
    proc.Register()
    proc.Start()
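Putting the two snippets together, the complete processor.py looks like this:
from orca_python import Processor
import time

proc = Processor("ml")

@proc.algorithm("MyAlgo", "1.0.0", "Every30Second", "1.0.0")
def my_algorithm() -> dict:
    # Simulate a long-running computation, then return a result.
    time.sleep(5)
    return {"result": 42}

if __name__ == "__main__":
    proc.Register()
    proc.Start()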
Now, grab the environment variables exposed by orca status:
$ orca status
PostgreSQL: running
Connection string: postgresql://orca:orca@localhost:32768/orca?sslmode=disable
Redis: running
Connection string: redis://localhost:32769
Orca: running
Connection string: grpc://localhost:32770
→ Set these environment variables in your Orca SDKs to connect them to Orca:
→ ORCASERVER=grpc://localhost:32770
→ HOST=172.18.0.1
The important lines are:
→ Set these environment variables in your Orca SDKs to connect them to Orca:
→ ORCASERVER=grpc://localhost:32770
→ HOST=172.18.0.1
A PORT environment variable is also required by the processor:
PORT=50505
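Rather than prefixing every command, you can export these variables once per shell session. The values below come from the example orca status output above and will differ on your machine:
export ORCASERVER=grpc://localhost:32770
export HOST=172.18.0.1
export PORT=50505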
Now run the processor:
ORCASERVER=grpc://localhost:32770 HOST=172.18.0.1 PORT=50505 python processor.py
If everything worked, you should see logs like the following:
$ ORCASERVER=grpc://localhost:32770 HOST=172.18.0.1 PORT=50505 python processor.py
2025-05-18 10:46:23,689 - orca_python.main - INFO - Registering algorithm: MyAlgo_1.0.0 (window: Every30Second_1.0.0)
2025-05-18 10:46:23,689 - orca_python.main - INFO - Preparing to register processor 'ml' with Orca Core
2025-05-18 10:46:23,706 - orca_python.main - INFO - Algorithm registration response recieved: received: true message: "Successfully registered processor"
2025-05-18 10:46:23,706 - orca_python.main - INFO - Starting Orca Processor 'ml' with Python 3.13.1 (main, Feb 15 2025, 16:27:20) [GCC 13.3.0]
2025-05-18 10:46:23,706 - orca_python.main - INFO - Initialising gRPC server with 10 workers
2025-05-18 10:46:23,709 - orca_python.main - INFO - Server listening on address 0.0.0.0:50505
2025-05-18 10:46:23,710 - orca_python.main - INFO - Server started successfully
2025-05-18 10:46:23,710 - orca_python.main - INFO - Server is ready for requests
Congrats - your processor is now ready to accept processing requests!
4 - Emit a Processing Window
Now that your processor is up and running, you need to trigger it.
You do this by emitting a window to Orca Core; the algorithms in the processor are triggered by matching windows.
If we look in our algorithm registration:
@proc.algorithm("MyAlgo", "1.0.0", "Every30Second", "1.0.0")
We see that the algorithm is triggered by a window type of Every30Second with version 1.0.0.
So, we can build a triggering function that emits a window of this type every 30 seconds. We’ll put this function in a file called window.py:
import time
from orca_python import EmitWindow, Window

def emitWindow():
    now = int(time.time())
    window = Window(time_from=now - 30, time_to=now, name="Every30Second", version="1.0.0", origin="Example")
    EmitWindow(window)
And we’ll use the schedule package to run this function every 30 seconds. First, install it:
pip install schedule
Then add the scheduling loop to window.py:
import schedule

schedule.every(30).seconds.do(emitWindow)

if __name__ == "__main__":
    while True:
        schedule.run_pending()
        time.sleep(1)
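If you’d rather not add a dependency, a plain loop achieves the same effect. This is a minimal sketch that reuses the emitWindow function and the time import defined above:
# Alternative to the schedule-based loop: emit a window every 30 seconds.
if __name__ == "__main__":
    while True:
        emitWindow()
        time.sleep(30)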
Now, in a separate terminal, we can run this file to trigger our processor every 30 seconds:
$ ORCASERVER=grpc://localhost:32770 HOST=172.18.0.1 PORT=50505 python window.py
After waiting a bit (up to 30 seconds), we should see windows being emitted to Orca Core:
$ ORCASERVER=grpc://localhost:32770 HOST=172.18.0.1 PORT=50505 python window.py
2025-05-18 16:31:44,314 - orca_python.main - INFO - Emitting window: Window(time_from=1747582274, time_to=1747582304, name='Every30Second', version='1.0.0', origin='Example')
2025-05-18 16:31:44,339 - orca_python.main - INFO - Window emitted: status: PROCESSING_TRIGGERED
2025-05-18 16:32:14,351 - orca_python.main - INFO - Emitting window: Window(time_from=1747582304, time_to=1747582334, name='Every30Second', version='1.0.0', origin='Example')
2025-05-18 16:32:14,379 - orca_python.main - INFO - Window emitted: status: PROCESSING_TRIGGERED
The processor logs show that the processing request has been received from Orca Core and that the algorithm is running successfully:
2025-05-18 16:31:44,341 - orca_python.main - INFO - Received DAG execution request with 1 algorithms and ExecId: ccb19b964d084da489841b4e6588cbd6
2025-05-18 16:31:44,342 - orca_python.main - INFO - Running algorithm MyAlgo_1.0.0
2025-05-18 16:31:49,344 - orca_python.main - INFO - Completed algorithm: MyAlgo
2025-05-18 16:32:14,383 - orca_python.main - INFO - Received DAG execution request with 1 algorithms and ExecId: c8166cf7310f43e5a09fd70f0542813d
2025-05-18 16:32:14,384 - orca_python.main - INFO - Running algorithm MyAlgo_1.0.0
2025-05-18 16:32:19,385 - orca_python.main - INFO - Completed algorithm: MyAlgo
Congratulations 🎉. You’ve built an end-to-end analytics scheduler that will scale seamlessly with your throughput.
Next Steps
Check out more Python examples in our GitHub repo.
Then take a deeper look at the Orca architecture, including how to access the results from your algorithms efficiently - yes, Orca tracks everything.