Feature: Task scheduling loop and a task history system #421

JannikSt · 2025-06-11T02:37:51Z

Description

We need to build a proper task scheduling loop and a task history system.

Right now, our orchestrator works like a one-time dispatcher for a group of nodes. Once a group is formed and assigned a task, it's stuck with that task forever, even after completing it. We need a system that recognizes when a task is finished and intelligently assigns a new one.

Additionally, the orchestrator has no memory. Once a task is done and a new one is (theoretically) assigned, all records of the previous task's execution (who ran it, whether it succeeded) are lost. We need to create a history to track completed work.

The current implementation has two main shortcomings:

No Task Re-scheduling:
The NodeGroupsPlugin assigns a task to a group of nodes, and this assignment is permanent. When a node reports its task as COMPLETED in a heartbeat, the scheduler doesn't recognize this. In the next heartbeat, it simply re-assigns the exact same task to the node because the group's task assignment is never cleared. This means a group of nodes can only ever execute one task.

No Task History:
The system only tracks tasks that are currently defined and the single task a node is actively working on. There is no persistent record of completed or failed tasks. If a task is deleted through the API, every trace of it is gone.

Proposed Solution:

To implement a robust scheduling and history system, we need to introduce two new concepts:

Implement a "Group Task State" Manager
This manager would be responsible for tracking the lifecycle of a task within a node group.

Listen for Completion: In the heartbeat route (src/api/routes/heartbeat.rs), when a node reports a terminal task state (COMPLETED or FAILED), we need to check whether all other nodes in its group have also finished the same task.
Free Up the Group: Once all nodes in a group have finished the task (COMPLETED or FAILED), the manager should clear the task assignment for that group. This involves deleting the group_task:<group_id> key in Redis.
Enable Re-scheduling: With the group's task assignment cleared, the next time a node from that group sends a heartbeat, the scheduler (src/plugins/node_groups/scheduler_impl.rs) will see that the group is idle and assign it a new task from the queue of available tasks.
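The three steps above could be sketched roughly as follows. This is a minimal in-memory illustration, not the orchestrator's actual code: GroupTaskStateManager, register_group, on_heartbeat, schedule_if_idle and the Running state are hypothetical names, and a HashMap stands in for the group_task:<group_id> key in Redis.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Task states a node may report in a heartbeat. COMPLETED and FAILED come
/// from the issue text; `Running` is an assumption for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TaskState {
    Running,
    Completed,
    Failed,
}

impl TaskState {
    /// A state is terminal once the node will do no more work on the task.
    fn is_terminal(self) -> bool {
        matches!(self, TaskState::Completed | TaskState::Failed)
    }
}

/// Hypothetical in-memory stand-in for the Redis-backed group state.
#[derive(Default)]
pub struct GroupTaskStateManager {
    /// group_id -> task_id; plays the role of the `group_task:<group_id>` key.
    assignments: HashMap<String, String>,
    /// group_id -> ids of the nodes that belong to the group.
    members: HashMap<String, HashSet<String>>,
    /// (group_id, node_id) -> last state reported for the assigned task.
    reported: HashMap<(String, String), TaskState>,
}

impl GroupTaskStateManager {
    pub fn register_group(&mut self, group_id: &str, nodes: &[&str]) {
        let set = nodes.iter().map(|n| n.to_string()).collect();
        self.members.insert(group_id.to_string(), set);
    }

    pub fn current_task(&self, group_id: &str) -> Option<&str> {
        self.assignments.get(group_id).map(String::as_str)
    }

    /// Records a heartbeat report. Returns the task id if this report was the
    /// last one needed and the group's assignment has just been cleared.
    pub fn on_heartbeat(
        &mut self,
        group_id: &str,
        node_id: &str,
        task_id: &str,
        state: TaskState,
    ) -> Option<String> {
        // Ignore reports for a task the group is not (or no longer) assigned.
        if self.current_task(group_id) != Some(task_id) {
            return None;
        }
        self.reported
            .insert((group_id.to_string(), node_id.to_string()), state);

        // The group is done only when every member reported a terminal state.
        let all_done = self.members.get(group_id)?.iter().all(|n| {
            self.reported
                .get(&(group_id.to_string(), n.clone()))
                .map_or(false, |s| s.is_terminal())
        });
        if !all_done {
            return None;
        }
        // Free the group: drop its per-node reports and its assignment.
        self.reported.retain(|(g, _), _| g.as_str() != group_id);
        self.assignments.remove(group_id)
    }

    /// Assigns the next queued task if the group is idle; otherwise does nothing.
    pub fn schedule_if_idle(
        &mut self,
        group_id: &str,
        queue: &mut VecDeque<String>,
    ) -> Option<String> {
        if self.assignments.contains_key(group_id) {
            return None;
        }
        let next = queue.pop_front()?;
        self.assignments.insert(group_id.to_string(), next.clone());
        Some(next)
    }
}
```

In this sketch the heartbeat handler would call on_heartbeat for every report and schedule_if_idle afterwards, so a group picks up the next task on the first heartbeat after its last member finishes. A Redis-backed version would need the "all nodes finished" check and the key deletion to be atomic (for example via a transaction or script), which this sketch does not address.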

Create a Task History Store
This would provide the missing long-term memory for the orchestrator.

Create a "Completed Tasks" Store: When the "Group Task State" manager determines a task is finished, instead of just deleting the assignment, it should also move a record of that task to a new "history" or "completed" list in Redis.
Store Execution Details: This history record should contain the original task details, the final status (COMPLETED, FAILED), the completion timestamp, and which group/nodes executed it.
Expose History via API: We could then create new API endpoints, like GET /tasks/history, to allow users to query past task executions. This would be invaluable for debugging, monitoring, and analytics.
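One possible shape for the history record and store is sketched below. All names here (TaskHistoryRecord, TaskHistoryStore, FinalStatus, the field names) are hypothetical, the task details are reduced to a single string, and a VecDeque stands in for the Redis list; the real schema is still to be decided.

```rust
use std::collections::VecDeque;

/// Final outcome of a task, matching the COMPLETED / FAILED states above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FinalStatus {
    Completed,
    Failed,
}

/// Hypothetical history entry with the fields listed in the proposal.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct TaskHistoryRecord {
    pub task_id: String,
    /// Stand-in for the original task details (assumed to be serializable).
    pub task_details: String,
    pub final_status: FinalStatus,
    /// Completion time as seconds since the Unix epoch (an assumed encoding).
    pub completed_at_unix: u64,
    pub group_id: String,
    pub node_ids: Vec<String>,
}

/// In-memory stand-in for a "completed tasks" list in Redis.
#[derive(Default)]
pub struct TaskHistoryStore {
    /// Newest record first, so recent history is cheap to read.
    records: VecDeque<TaskHistoryRecord>,
}

impl TaskHistoryStore {
    /// Appends a record; called when a group finishes its task.
    pub fn record(&mut self, rec: TaskHistoryRecord) {
        self.records.push_front(rec);
    }

    /// Returns up to `limit` records, newest first (what GET /tasks/history
    /// might serve).
    pub fn recent(&self, limit: usize) -> Vec<&TaskHistoryRecord> {
        self.records.iter().take(limit).collect()
    }

    /// Returns all records with the given final status, newest first.
    pub fn by_status(&self, status: FinalStatus) -> Vec<&TaskHistoryRecord> {
        self.records
            .iter()
            .filter(|r| r.final_status == status)
            .collect()
    }
}
```

Because each record carries its own copy of the task details, the history would survive deletion of the task through the API. How a group-level final status is derived when some nodes report COMPLETED and others FAILED is left open here.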

JannikSt added this to the release:0.2.12 milestone Jun 11, 2025

JannikSt added the enhancement, P2, area:orchestrator, and size:L labels Jun 11, 2025
