
Since our launch on March 14th, the Errand AI team has been hard at work delivering meaningful improvements to your personal task automation platform. Here’s everything we’ve shipped in the past two weeks.
New Features
Task Generators Settings Page
We’ve extracted email task-generation configuration into a dedicated Task Generators settings page. You can now:
- Configure email-triggered task generation with custom profiles
- Set poll intervals for email monitoring
- Use the new Task Prompt field to append custom instructions to every email-triggered task
- Enable/disable email task generation independently
This gives you granular control over how incoming emails translate to automated tasks.
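As a rough sketch, a generator profile bundles the settings above; the field names here are illustrative, not the actual Errand AI schema:

```python
from dataclasses import dataclass

# Illustrative sketch only -- field names are assumptions, not the real schema.
@dataclass
class EmailTaskGeneratorProfile:
    name: str
    enabled: bool = True                 # toggle generation independently
    poll_interval_seconds: int = 300     # how often the mailbox is polled
    task_prompt: str = ""                # appended to every generated task

profile = EmailTaskGeneratorProfile(
    name="support-inbox",
    poll_interval_seconds=120,
    task_prompt="Always reply in a formal tone.",
)
```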
Reasoning Model Support
We’ve added comprehensive support for reasoning-capable LLMs:
- LiteLLM Model Registry Caching: We now cache LiteLLM’s public model registry locally for fast lookups
- Reasoning Model Detection: The system automatically detects which models support reasoning capabilities
- Smart Warnings: The Default Model selector now warns when you select a non-reasoning model (since the task-runner benefits from reasoning for complex workflows)
- Model-Aware Token Limits: Title generation now respects each model’s actual max_output_tokens instead of using hardcoded values
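To illustrate the registry-backed lookups, here's a minimal sketch against a stand-in copy of the cached registry (the real LiteLLM file, model_prices_and_context_window.json, carries many more fields and models):

```python
# Tiny stand-in for the cached LiteLLM registry; values are illustrative.
REGISTRY = {
    "o3-mini": {"max_output_tokens": 100_000, "supports_reasoning": True},
    "gpt-4o-mini": {"max_output_tokens": 16_384},
}

def supports_reasoning(model: str) -> bool:
    # Drives the Default Model selector warning for non-reasoning models.
    return bool(REGISTRY.get(model, {}).get("supports_reasoning", False))

def registry_max_tokens(model: str, default: int = 4096) -> int:
    # Registry value beats a hardcoded fallback (the old behavior).
    return REGISTRY.get(model, {}).get("max_output_tokens", default)
```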
Enhanced Task Log Rendering
Your task execution logs just got a major upgrade:
- Turn-Based Grouping: Events are now grouped by LLM turn for easier reading
- Thinking Placeholders: See when the model is “thinking” before tool calls arrive
- Tool Status Indicators: Visual indicators for tool execution status
- Duration Display: See exactly how long each tool call took
- MCP Connection Summaries: Collapsed views for MCP server connection details
- Suppressed HTTP Noise: httpx INFO logging is now hidden to reduce clutter
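Turn-based grouping boils down to bucketing an append-only event stream by turn number; a minimal sketch with a hypothetical event shape:

```python
from itertools import groupby

# Hypothetical event shape: (turn_number, event_type, payload).
events = [
    (1, "thinking", None),
    (1, "tool_call", "fetch_page"),
    (1, "tool_result", "ok (1.2s)"),
    (2, "thinking", None),
    (2, "message", "Done."),
]

def group_by_turn(events):
    # groupby only merges adjacent items, so the log must already be
    # ordered by turn -- true for an append-only event stream.
    return {turn: list(evts) for turn, evts in groupby(events, key=lambda e: e[0])}

turns = group_by_turn(events)
```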
Playwright MCP Improvements
For browser automation tasks:
- Non-Headless Mode with Xvfb: Playwright MCP now runs in headed mode with a virtual display for better anti-detection capabilities
- Works out of the box with the official playwright/mcp image—no custom build needed
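The launch amounts to wrapping the MCP server in xvfb-run so the "headed" browser gets a virtual display; the package name and flags below are assumptions for illustration, not our exact invocation:

```python
def build_browser_command(headed: bool = True) -> list[str]:
    # Hypothetical launcher sketch; package name and flags are assumptions.
    cmd = ["npx", "@playwright/mcp@latest"]
    if headed:
        # xvfb-run -a allocates a free virtual X display for the headed browser.
        return ["xvfb-run", "-a", *cmd]
    return [*cmd, "--headless"]
```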
MCP Tool Auto-Discovery
Weaker models (like gpt-oss:20b) sometimes call MCP tools without first calling discover_tools. The system now:
- Auto-enables undiscovered tools on first use instead of blocking them
- Logs warnings for observability
- Doesn’t count auto-enables toward retry limits
- Uses a strengthened catalog prompt to guide models toward proper protocol adherence
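The gate behaves roughly like this sketch (class and method names are made up for illustration):

```python
import logging

log = logging.getLogger("mcp.gate")

class ToolGate:
    """Sketch of the auto-enable behavior; names are illustrative."""

    def __init__(self):
        self.enabled: set[str] = set()
        self.retries_used = 0

    def discover(self, *tools: str) -> None:
        self.enabled.update(tools)

    def check(self, tool: str) -> bool:
        if tool not in self.enabled:
            # Auto-enable instead of blocking; warn for observability.
            log.warning("tool %s used before discover_tools; auto-enabling", tool)
            self.enabled.add(tool)
            # Deliberately does NOT touch self.retries_used.
        return True
```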
Smarter Task Descriptions
When you create tasks with scheduling language (e.g., “In two hours, publish one of the approved tweets”), the LLM now:
- Returns a cleaned description with timing references removed
- Routes tasks to review with a “Needs Info” tag when the LLM cannot extract a meaningful description
- Falls back gracefully to raw input on LLM failures
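The decision flow can be sketched as a small function, with the LLM call stubbed out (names here are illustrative):

```python
def describe_task(raw: str, llm_extract) -> tuple[str, list[str]]:
    """Return (description, tags). llm_extract stands in for the LLM call
    that strips timing language; it may return empty text or raise."""
    try:
        cleaned = llm_extract(raw)
    except Exception:
        return raw, []                 # graceful fallback to the raw input
    if not cleaned or not cleaned.strip():
        return raw, ["Needs Info"]     # route to review for a human look
    return cleaned.strip(), []
```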
Updated Setup Wizard
The setup wizard’s Step 2 has been completely modernized:
- Now uses the proper LLM Provider API instead of legacy settings
- Detects environment-sourced providers and shows fields as read-only
- Creates providers via API during connection testing
- Saves model settings as proper {provider_id, model} objects
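Storing the selection as a {provider_id, model} object makes provider lookup unambiguous, unlike a bare model string; a tiny illustration (the provider data here is made up):

```python
# Illustrative provider table; real entries come from the LLM Provider API.
PROVIDERS = {"prov-1": {"name": "OpenAI", "base_url": "https://api.openai.com/v1"}}

# Structured setting instead of a bare model string.
model_setting = {"provider_id": "prov-1", "model": "gpt-4o-mini"}

def resolve(setting: dict) -> tuple[dict, str]:
    # The provider_id resolves the provider directly -- no string guessing.
    provider = PROVIDERS[setting["provider_id"]]
    return provider, setting["model"]
```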
Bug Fixes
TCP Keepalive for Advisory Lock Connection
Fixed: After pod kill (SIGKILL/OOM), the sync DB connection holding the advisory lock wasn’t closed cleanly, causing Postgres to keep sessions alive for hours. This blocked new pods from acquiring the leader lock.
Solution: Added libpq TCP keepalive parameters (idle=10s, interval=10s, count=3) so Postgres detects dead connections within ~40 seconds. Also raised lock-wait log messages from DEBUG to INFO for better production visibility.
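For reference, these are the libpq parameters in question; the helper below is just a sketch for key/value-style DSNs:

```python
# libpq TCP keepalive settings matching the fix. Worst-case detection is
# roughly keepalives_idle + keepalives_interval * keepalives_count = 40s.
KEEPALIVE_PARAMS = {
    "keepalives": 1,            # enable TCP keepalive on the socket
    "keepalives_idle": 10,      # seconds idle before the first probe
    "keepalives_interval": 10,  # seconds between probes
    "keepalives_count": 3,      # failed probes before the connection drops
}

def with_keepalives(dsn: str) -> str:
    """Append libpq keepalive options to a key/value-style DSN (sketch)."""
    extra = " ".join(f"{k}={v}" for k, v in KEEPALIVE_PARAMS.items())
    return f"{dsn} {extra}".strip()
```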
Empty LLM Response Handling
Fixed: The task-runner was silently reporting success when the LLM returned empty responses (observed when a model’s output was misclassified as reasoning_content by LiteLLM).
Solution: Now validates that final_output is non-empty before reporting success—emits a structured error event and exits with code 1 when responses are empty or whitespace-only.
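The validation itself is straightforward; a sketch with illustrative function and event names:

```python
def finish(final_output, emit_event) -> int:
    """Validate the runner's final output before declaring success (sketch).
    Returns the process exit code the caller should use."""
    if final_output is None or not final_output.strip():
        # Structured error event instead of a silent false success.
        emit_event({"type": "error", "reason": "empty_llm_response"})
        return 1
    emit_event({"type": "success", "output": final_output})
    return 0
```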
Repeat Interval Parsing
Fixed: schedule_task with human-readable intervals like “7 days” or “weekly” was failing silently because parse_interval only accepted compact format.
Solution: Added normalize_interval() that converts human-readable formats to compact format before parsing. Also validates and normalizes at write time, returning immediate errors for unparseable intervals.
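A minimal sketch of the normalization, assuming a compact format like 7d / 1w (the real alias and unit tables may differ):

```python
import re

# Illustrative tables; the real alias set may differ.
ALIASES = {"hourly": "1h", "daily": "1d", "weekly": "1w", "monthly": "1mo"}
UNITS = {"minute": "m", "hour": "h", "day": "d", "week": "w", "month": "mo"}

def normalize_interval(text: str) -> str:
    """Convert '7 days' / 'weekly' to a compact form like '7d' (sketch)."""
    t = text.strip().lower()
    if t in ALIASES:
        return ALIASES[t]
    m = re.fullmatch(r"(\d+)\s*(minute|hour|day|week|month)s?", t)
    if m:
        return m.group(1) + UNITS[m.group(2)]
    if re.fullmatch(r"\d+(m|h|d|w|mo)", t):
        return t                      # already compact
    # Immediate error at write time instead of a silent scheduling failure.
    raise ValueError(f"unparseable interval: {text!r}")
```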
Task-Runner Output Truncation
Fixed: Tool call arguments exceeding output token limits caused truncated JSON, leading to cascading failures: MCP tool rejection followed by LiteLLM HTTP 500 on subsequent turns.
Solution:
- Added pattern-based max output token lookup for Claude, OpenAI, and Gemini model families
- Fixed sanitization to scan Responses API function_call items correctly
- Injects truncation recovery messages into tool outputs when repair succeeds
- Added MAX_OUTPUT_TOKENS environment variable override
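A sketch of the pattern-based lookup with the environment override; the per-family limits below are illustrative numbers, not the real table:

```python
import os
import re

# Illustrative family limits -- not the actual per-model values.
FAMILY_LIMITS = [
    (re.compile(r"claude-"), 8_192),
    (re.compile(r"(gpt-4o|o\d)"), 16_384),
    (re.compile(r"gemini-"), 8_192),
]

def max_output_tokens(model: str, default: int = 4096) -> int:
    # The environment override wins (variable name from the notes above).
    env = os.environ.get("MAX_OUTPUT_TOKENS")
    if env:
        return int(env)
    for pattern, limit in FAMILY_LIMITS:
        if pattern.search(model):
            return limit
    return default
```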
Malformed Tool Call Resilience
Fixed: Malformed LLM tool calls (truncated JSON in arguments) caused unrecoverable errors.
Solution:
- Sanitizes truncated JSON before it reaches LiteLLM
- Classifies API errors as transient vs non-retryable
- Retries transient failures with exponential backoff (3 attempts, 2s/4s delays)
Model Name Slash Parsing
Fixed: Model names containing slashes (like “bedrock/gpt-oss:20b”) caused crashes due to slash-based prefix parsing in MultiProvider.
Solution: Use OpenAIProvider directly on RunConfig, bypassing the problematic prefix parsing.
What’s Next
We’re continuing to improve Errand AI based on your feedback. Coming soon:
- Enhanced observability for long-running tasks
- Additional platform integrations
- More robust error recovery for edge cases
Errand AI is your personal task automation platform—running on your hardware, keeping your data under your control.