| author | soryu <soryu@soryu.co> | 2026-04-30 10:43:31 +0100 |
|---|---|---|
| committer | GitHub <noreply@github.com> | 2026-04-30 10:43:31 +0100 |
| commit | c3e97bbcc32bd18d9344dd44cc54dfcdce32100b (patch) | |
| tree | 7fe772669a614968fff9510abbc054091baf75e2 /makima/src/server/handlers/directives.rs | |
| parent | 2b695485753d55f956746b73c31c2deba0ed0a29 (diff) | |
fix(directive): cancel orphaned planner and kick reconciler on goal update (#104)
Resolves the user-visible bug where editing a directive's goal mid-flight
shows "saved" but does not actually replan. The running planner kept emitting
add-step calls based on the OLD goal while a fresh planner was supposed to
take over, and the user had to wait up to 15s for the next reconciler tick
before any replanning even started.
## What was happening
PUT /api/v1/directives/{id}/goal already had two paths:
- Small change + planner running → SendMessage interrupt + KEEP orchestrator.
- Everything else → clear orchestrator_task_id and let phase_replanning
spawn a new planner on the next 15s tick.
The "everything else" path cleared the directive's pointer to the planner
task but never cancelled the task itself. The task kept executing and could
race the new planner by adding more steps from the stale plan. Worse, those
new steps could push MAX(steps.created_at) past the just-bumped
goal_updated_at, suppressing phase_replanning entirely.
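The suppression described above can be illustrated with a small sketch. It assumes the replanning check boils down to comparing `goal_updated_at` against `MAX(steps.created_at)`, as the text suggests. The function name `needs_replanning` and the plain integer timestamps are hypothetical and are not taken from the repository.

```rust
/// Hypothetical stand-in for the replanning predicate: a directive needs
/// replanning when its goal was edited after its newest step was created.
/// Timestamps are plain seconds here; the real query works on DB columns.
fn needs_replanning(goal_updated_at: u64, step_created_at: &[u64]) -> bool {
    match step_created_at.iter().max() {
        // Assumption: directives with no steps are left to initial planning.
        None => false,
        Some(&newest_step) => goal_updated_at > newest_step,
    }
}
```

With a goal edit at t=100 and existing steps at t=40 and t=90, the check fires. If an orphaned planner then adds a step at t=101, the same check stops firing even though that step came from the stale plan.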
## Fix
1. New helper `try_cancel_running_planner()` (orchestration/directive.rs):
sends `InterruptTask { graceful: true }` to the daemon owning the
orchestrator task and marks the task `interrupted` in the DB. All errors
are logged and swallowed so the goal update still completes.
2. update_goal handler calls the helper whenever it is about to take the
"clear orchestrator_task_id" branch, so the orphaned planner stops
producing stale-plan steps before its DB linkage is cut.
3. New `AppState::directive_kick` (tokio::sync::Notify) lets the handler
signal the reconciler to run a tick immediately. The reconciler loop in
server/mod.rs now selects between its 15s interval and the notify, so the
user no longer waits up to 15s after editing a goal before replanning
actually starts. update_goal calls `kick_directive_reconciler()` after
the goal is persisted (both paths).
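Item 3 can be sketched without tokio. The commit uses `tokio::sync::Notify` and a select over the 15s interval; the version below swaps in a std-only analog, a bounded `std::sync::mpsc` channel read with `recv_timeout`, purely to show the wake-on-kick-or-interval shape. `Kick`, `kick_channel`, `wait_for_tick`, and `Wake` are hypothetical names, not the repository's API.

```rust
use std::sync::mpsc::{sync_channel, Receiver, RecvTimeoutError, SyncSender};
use std::time::Duration;

/// Why the reconciler loop woke up.
#[derive(Debug, PartialEq)]
enum Wake {
    Kicked,
    Interval,
    Shutdown,
}

/// Handle held by request handlers (hypothetical analog of `directive_kick`).
#[derive(Clone)]
struct Kick(SyncSender<()>);

impl Kick {
    /// Best-effort nudge. Returns true if a wake-up was queued, false if one
    /// was already pending or the reconciler side has gone away.
    fn kick(&self) -> bool {
        self.0.try_send(()).is_ok()
    }
}

/// Capacity 1 means repeated kicks before the next tick coalesce into one.
fn kick_channel() -> (Kick, Receiver<()>) {
    let (tx, rx) = sync_channel(1);
    (Kick(tx), rx)
}

/// Block until a kick arrives or `interval` elapses, whichever comes first.
fn wait_for_tick(rx: &Receiver<()>, interval: Duration) -> Wake {
    match rx.recv_timeout(interval) {
        Ok(()) => Wake::Kicked,
        Err(RecvTimeoutError::Timeout) => Wake::Interval,
        Err(RecvTimeoutError::Disconnected) => Wake::Shutdown,
    }
}
```

A reconciler loop would call `wait_for_tick(&rx, Duration::from_secs(15))` and run one tick on either `Kicked` or `Interval`. One difference from an interval timer: in this analog every kick restarts the timeout.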
## Why not also loosen `get_directives_needing_replanning`?
The query already covers the common cases once the orphan-cancel lands:
without a still-running orphan adding fresh steps, goal_updated_at reliably
exceeds MAX(steps.created_at) after a goal edit. Loosening the predicate
would risk spurious replans for directives that legitimately have no steps yet
(those are handled by `phase_planning`).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'makima/src/server/handlers/directives.rs')
| -rw-r--r-- | makima/src/server/handlers/directives.rs | 50 |
1 files changed, 40 insertions, 10 deletions
diff --git a/makima/src/server/handlers/directives.rs b/makima/src/server/handlers/directives.rs
index 01c4659..44bf4ac 100644
--- a/makima/src/server/handlers/directives.rs
+++ b/makima/src/server/handlers/directives.rs
@@ -20,7 +20,8 @@ use crate::db::models::{
 use crate::db::repository;
 use crate::orchestration::directive::{
     build_cleanup_prompt, build_order_pickup_prompt, classify_goal_change,
-    try_interrupt_planner_with_goal_edit, GoalChangeKind, GoalEditInterruptResult,
+    try_cancel_running_planner, try_interrupt_planner_with_goal_edit,
+    GoalChangeKind, GoalEditInterruptResult,
 };
 use crate::server::auth::Authenticated;
 use crate::server::messages::ApiError;
@@ -895,6 +896,25 @@ pub async fn update_goal(
     // SendMessage and adjust in-flight. Otherwise, fall through to the normal
     // path which clears orchestrator_task_id and lets phase_replanning kick
     // in on the next tick.
+    //
+    // CRITICAL: when going down the "clear" path, we must also CANCEL the
+    // running planner. Otherwise the orphaned task keeps producing add-step
+    // calls based on the old goal, racing the freshly-spawned replanner.
+    if !interrupted {
+        if let Some(ref current) = current {
+            if let Some(orch_task_id) = current.orchestrator_task_id {
+                if let Err(e) = try_cancel_running_planner(pool, &state, id, orch_task_id).await {
+                    tracing::warn!(
+                        directive_id = %id,
+                        task_id = %orch_task_id,
+                        error = %e,
+                        "Failed to cancel orphaned planner — proceeding with clear anyway"
+                    );
+                }
+            }
+        }
+    }
+
     let update_result = if interrupted {
         repository::update_directive_goal_keep_orchestrator(pool, auth.owner_id, id, &req.goal)
             .await
@@ -902,22 +922,32 @@ pub async fn update_goal(
         repository::update_directive_goal(pool, auth.owner_id, id, &req.goal).await
     };
 
-    match update_result {
+    let response = match update_result {
         Ok(Some(directive)) => Json(directive).into_response(),
-        Ok(None) => (
-            StatusCode::NOT_FOUND,
-            Json(ApiError::new("NOT_FOUND", "Directive not found")),
-        )
-        .into_response(),
+        Ok(None) => {
+            return (
+                StatusCode::NOT_FOUND,
+                Json(ApiError::new("NOT_FOUND", "Directive not found")),
+            )
+            .into_response();
+        }
         Err(e) => {
             tracing::error!("Failed to update goal: {}", e);
-            (
+            return (
                 StatusCode::INTERNAL_SERVER_ERROR,
                 Json(ApiError::new("UPDATE_FAILED", &e.to_string())),
             )
-            .into_response()
+            .into_response();
         }
-    }
+    };
+
+    // Nudge the directive reconciler so the user does not wait up to 15s for
+    // the next interval tick before the new planner is spawned (clear path) or
+    // the small-edit interrupt is consumed (keep path). Best-effort: if the
+    // channel is full or closed we just rely on the normal interval.
+    state.kick_directive_reconciler();
+
+    response
 }
 
 // =============================================================================
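
The control flow this diff gives `update_goal` can be modelled in isolation. The sketch below is a simplified stand-in, not the handler itself: `update_goal_flow`, `Response`, `Effects`, and the closure parameters are hypothetical, and the HTTP, database, and daemon layers are replaced by plain closures so the ordering of side effects can be checked.

```rust
/// Outcome of the sketched handler; stands in for the HTTP response.
#[derive(Debug, PartialEq)]
enum Response {
    Ok,
    NotFound,
    UpdateFailed(String),
}

/// Records side effects so their order can be inspected afterwards.
#[derive(Default)]
struct Effects {
    log: Vec<&'static str>,
}

fn update_goal_flow(
    interrupted: bool,
    orchestrator_task: Option<u32>,
    cancel: impl Fn(u32) -> Result<(), String>,
    persist: impl Fn() -> Result<Option<()>, String>,
    fx: &mut Effects,
) -> Response {
    // Clear path only: try to stop the old planner. A failure is recorded
    // and ignored so the goal update still goes through.
    if !interrupted {
        if let Some(task_id) = orchestrator_task {
            match cancel(task_id) {
                Ok(()) => fx.log.push("cancelled"),
                Err(_) => fx.log.push("cancel failed (ignored)"),
            }
        }
    }

    // Persist the goal. The failure branches return early, so no kick is
    // sent when nothing was saved.
    let response = match persist() {
        Ok(Some(())) => Response::Ok,
        Ok(None) => return Response::NotFound,
        Err(e) => return Response::UpdateFailed(e),
    };

    // Reached only after a successful update, on both the keep and clear paths.
    fx.log.push("kick");
    response
}
```

Binding the `match` result to `response` and returning early from the failure arms is what lets the kick sit after the match without also firing for the `NOT_FOUND` and `UPDATE_FAILED` cases.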
