
Since our launch on March 14th, the Errand AI team has been hard at work delivering meaningful improvements to your personal task automation platform. Here’s everything we’ve shipped in the past two weeks.
New Features
Task Generators Settings Page
We’ve extracted email task-generation configuration into a dedicated Task Generators settings page. You can now:
- Configure email-triggered task generation with custom profiles
- Set poll intervals for email monitoring
- Use the new Task Prompt field to append custom instructions to every email-triggered task
- Enable/disable email task generation independently
This gives you granular control over how incoming emails translate to automated tasks.
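As a rough sketch, a generator profile bundles the settings above; the field names here are illustrative, not the actual Errand AI schema:

```python
from dataclasses import dataclass

# Illustrative sketch only -- field names are assumptions, not the real schema.
@dataclass
class EmailTaskGeneratorProfile:
    name: str
    enabled: bool = True                 # toggle generation independently
    poll_interval_seconds: int = 300     # how often the mailbox is polled
    task_prompt: str = ""                # appended to every generated task

profile = EmailTaskGeneratorProfile(
    name="support-inbox",
    poll_interval_seconds=120,
    task_prompt="Always reply in a formal tone.",
)
```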
Reasoning Model Support
We’ve added comprehensive support for reasoning-capable LLMs:
- LiteLLM Model Registry Caching: We now cache LiteLLM’s public model registry locally for fast lookups
- Reasoning Model Detection: The system automatically detects which models support reasoning capabilities
- Smart Warnings: The Default Model selector now warns when you select a non-reasoning model (since the task-runner benefits from reasoning for complex workflows)
- Model-Aware Token Limits: Title generation now respects each model’s actual max_output_tokens instead of using hardcoded values
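To illustrate the registry-backed lookups, here's a minimal sketch against a stand-in copy of the cached registry (the real LiteLLM file, model_prices_and_context_window.json, carries many more fields and models):

```python
# Tiny stand-in for the cached LiteLLM registry; values are illustrative.
REGISTRY = {
    "o3-mini": {"max_output_tokens": 100_000, "supports_reasoning": True},
    "gpt-4o-mini": {"max_output_tokens": 16_384},
}

def supports_reasoning(model: str) -> bool:
    # Drives the Default Model selector warning for non-reasoning models.
    return bool(REGISTRY.get(model, {}).get("supports_reasoning", False))

def registry_max_tokens(model: str, default: int = 4096) -> int:
    # Registry value beats a hardcoded fallback (the old behavior).
    return REGISTRY.get(model, {}).get("max_output_tokens", default)
```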
Enhanced Task Log Rendering
Your task execution logs just got a major upgrade:
- Turn-Based Grouping: Events are now grouped by LLM turn for easier reading
- Thinking Placeholders: See when the model is “thinking” before tool calls arrive
- Tool Status Indicators: Visual indicators for tool execution status
- Duration Display: See exactly how long each tool call took
- MCP Connection Summaries: Collapsed views for MCP server connection details
- Suppressed HTTP Noise: httpx INFO logging is now hidden to reduce clutter
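Turn-based grouping boils down to bucketing an append-only event stream by turn number; a minimal sketch with a hypothetical event shape:

```python
from itertools import groupby

# Hypothetical event shape: (turn_number, event_type, payload).
events = [
    (1, "thinking", None),
    (1, "tool_call", "fetch_page"),
    (1, "tool_result", "ok (1.2s)"),
    (2, "thinking", None),
    (2, "message", "Done."),
]

def group_by_turn(events):
    # groupby only merges adjacent items, so the log must already be
    # ordered by turn -- true for an append-only event stream.
    return {turn: list(evts) for turn, evts in groupby(events, key=lambda e: e[0])}

turns = group_by_turn(events)
```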
Playwright MCP Improvements
For browser automation tasks:
- Non-Headless Mode with Xvfb: Playwright MCP now runs in headed mode with a virtual display for better anti-detection capabilities
- Works out of the box with the official playwright/mcp image—no custom build needed
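The launch amounts to wrapping the MCP server in xvfb-run so the "headed" browser gets a virtual display; the package name and flags below are assumptions for illustration, not our exact invocation:

```python
def build_browser_command(headed: bool = True) -> list[str]:
    # Hypothetical launcher sketch; package name and flags are assumptions.
    cmd = ["npx", "@playwright/mcp@latest"]
    if headed:
        # xvfb-run -a allocates a free virtual X display for the headed browser.
        return ["xvfb-run", "-a", *cmd]
    return [*cmd, "--headless"]
```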
MCP Tool Auto-Discovery
Weaker models (like gpt-oss:20b) sometimes call MCP tools without first calling discover_tools. The system now:
- Auto-enables undiscovered tools on first use instead of blocking them
- Logs warnings for observability
- Doesn’t count auto-enables toward retry limits
- Uses a strengthened catalog prompt to guide models toward proper protocol adherence
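The gate behaves roughly like this sketch (class and method names are made up for illustration):

```python
import logging

log = logging.getLogger("mcp.gate")

class ToolGate:
    """Sketch of the auto-enable behavior; names are illustrative."""

    def __init__(self):
        self.enabled: set[str] = set()
        self.retries_used = 0

    def discover(self, *tools: str) -> None:
        self.enabled.update(tools)

    def check(self, tool: str) -> bool:
        if tool not in self.enabled:
            # Auto-enable instead of blocking; warn for observability.
            log.warning("tool %s used before discover_tools; auto-enabling", tool)
            self.enabled.add(tool)
            # Deliberately does NOT touch self.retries_used.
        return True
```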
Smarter Task Descriptions
When you create tasks with scheduling language (e.g., “In two hours, publish one of the approved tweets”), the LLM now:
- Returns a cleaned description with timing references removed
- Routes tasks to review with a “Needs Info” tag when the LLM cannot extract a meaningful description
- Falls back gracefully to raw input on LLM failures
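The decision flow can be sketched as a small function, with the LLM call stubbed out (names here are illustrative):

```python
def describe_task(raw: str, llm_extract) -> tuple[str, list[str]]:
    """Return (description, tags). llm_extract stands in for the LLM call
    that strips timing language; it may return empty text or raise."""
    try:
        cleaned = llm_extract(raw)
    except Exception:
        return raw, []                 # graceful fallback to the raw input
    if not cleaned or not cleaned.strip():
        return raw, ["Needs Info"]     # route to review for a human look
    return cleaned.strip(), []
```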
Updated Setup Wizard
The setup wizard’s Step 2 has been completely modernized:
- Now uses the proper LLM Provider API instead of legacy settings
- Detects environment-sourced providers and shows fields as read-only
- Creates providers via API during connection testing
- Saves model settings as proper {provider_id, model} objects
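Storing the selection as a {provider_id, model} object makes provider lookup unambiguous, unlike a bare model string; a tiny illustration (the provider data here is made up):

```python
# Illustrative provider table; real entries come from the LLM Provider API.
PROVIDERS = {"prov-1": {"name": "OpenAI", "base_url": "https://api.openai.com/v1"}}

# Structured setting instead of a bare model string.
model_setting = {"provider_id": "prov-1", "model": "gpt-4o-mini"}

def resolve(setting: dict) -> tuple[dict, str]:
    # The provider_id resolves the provider directly -- no string guessing.
    provider = PROVIDERS[setting["provider_id"]]
    return provider, setting["model"]
```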
Bug Fixes
TCP Keepalive for Advisory Lock Connection
Fixed: After pod kill (SIGKILL/OOM), the sync DB connection holding the advisory lock wasn’t closed cleanly, causing Postgres to keep sessions alive for hours. This blocked new pods from acquiring the leader lock.
Solution: Added libpq TCP keepalive parameters (idle=10s, interval=10s, count=3) so Postgres detects dead connections within ~40 seconds. Also raised lock-wait log messages from DEBUG to INFO for better production visibility.
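For reference, these are the libpq parameters in question; the helper below is just a sketch for key/value-style DSNs:

```python
# libpq TCP keepalive settings matching the fix. Worst-case detection is
# roughly keepalives_idle + keepalives_interval * keepalives_count = 40s.
KEEPALIVE_PARAMS = {
    "keepalives": 1,            # enable TCP keepalive on the socket
    "keepalives_idle": 10,      # seconds idle before the first probe
    "keepalives_interval": 10,  # seconds between probes
    "keepalives_count": 3,      # failed probes before the connection drops
}

def with_keepalives(dsn: str) -> str:
    """Append libpq keepalive options to a key/value-style DSN (sketch)."""
    extra = " ".join(f"{k}={v}" for k, v in KEEPALIVE_PARAMS.items())
    return f"{dsn} {extra}".strip()
```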
Empty LLM Response Handling
Fixed: The task-runner was silently reporting success when the LLM returned empty responses (observed when a model’s output was misclassified as reasoning_content by LiteLLM).
Solution: Now validates that final_output is non-empty before reporting success—emits a structured error event and exits with code 1 when responses are empty or whitespace-only.
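The validation itself is straightforward; a sketch with illustrative function and event names:

```python
def finish(final_output, emit_event) -> int:
    """Validate the runner's final output before declaring success (sketch).
    Returns the process exit code the caller should use."""
    if final_output is None or not final_output.strip():
        # Structured error event instead of a silent false success.
        emit_event({"type": "error", "reason": "empty_llm_response"})
        return 1
    emit_event({"type": "success", "output": final_output})
    return 0
```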
Repeat Interval Parsing
Fixed: schedule_task with human-readable intervals like “7 days” or “weekly” was failing silently because parse_interval only accepted compact format.
Solution: Added normalize_interval() that converts human-readable formats to compact format before parsing. Also validates and normalizes at write time, returning immediate errors for unparseable intervals.
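A minimal sketch of the normalization, assuming a compact format like 7d / 1w (the real alias and unit tables may differ):

```python
import re

# Illustrative tables; the real alias set may differ.
ALIASES = {"hourly": "1h", "daily": "1d", "weekly": "1w", "monthly": "1mo"}
UNITS = {"minute": "m", "hour": "h", "day": "d", "week": "w", "month": "mo"}

def normalize_interval(text: str) -> str:
    """Convert '7 days' / 'weekly' to a compact form like '7d' (sketch)."""
    t = text.strip().lower()
    if t in ALIASES:
        return ALIASES[t]
    m = re.fullmatch(r"(\d+)\s*(minute|hour|day|week|month)s?", t)
    if m:
        return m.group(1) + UNITS[m.group(2)]
    if re.fullmatch(r"\d+(m|h|d|w|mo)", t):
        return t                      # already compact
    # Immediate error at write time instead of a silent scheduling failure.
    raise ValueError(f"unparseable interval: {text!r}")
```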
Task-Runner Output Truncation
Fixed: Tool call arguments exceeding output token limits caused truncated JSON, leading to cascading failures: MCP tool rejection followed by LiteLLM HTTP 500 on subsequent turns.
Solution:
- Added pattern-based max output token lookup for Claude, OpenAI, and Gemini model families
- Fixed sanitization to scan Responses API function_call items correctly
- Injects truncation recovery messages into tool outputs when repair succeeds
- Added MAX_OUTPUT_TOKENS environment variable override
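A sketch of the pattern-based lookup with the environment override; the per-family limits below are illustrative numbers, not the real table:

```python
import os
import re

# Illustrative family limits -- not the actual per-model values.
FAMILY_LIMITS = [
    (re.compile(r"claude-"), 8_192),
    (re.compile(r"(gpt-4o|o\d)"), 16_384),
    (re.compile(r"gemini-"), 8_192),
]

def max_output_tokens(model: str, default: int = 4096) -> int:
    # The environment override wins (variable name from the notes above).
    env = os.environ.get("MAX_OUTPUT_TOKENS")
    if env:
        return int(env)
    for pattern, limit in FAMILY_LIMITS:
        if pattern.search(model):
            return limit
    return default
```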
Malformed Tool Call Resilience
Fixed: Malformed LLM tool calls (truncated JSON in arguments) caused unrecoverable errors.
Solution:
- Sanitizes truncated JSON before it reaches LiteLLM
- Classifies API errors as transient vs non-retryable
- Retries transient failures with exponential backoff (3 attempts, 2s/4s delays)
Model Name Slash Parsing
Fixed: Model names containing slashes (like “bedrock/gpt-oss:20b”) caused crashes due to slash-based prefix parsing in MultiProvider.
Solution: Use OpenAIProvider directly on RunConfig, bypassing the problematic prefix parsing.
What’s Next
We’re continuing to improve Errand AI based on your feedback. Coming soon:
- Enhanced observability for long-running tasks
- Additional platform integrations
- More robust error recovery for edge cases
Errand AI is your personal task automation platform—running on your hardware, keeping your data under your control.