Hacker News

Show HN: Fiscal – An Agent Friendly CLI for Actual Budget

Mon, 02/23/2026 - 9:51am

I built Fiscal (fscl), a headless CLI for Actual Budget that's optimized for AI agents running in the terminal (Claude Code, OpenClaw, etc.).

It acts as an Actual Budget client, so it can sync with an existing Actual server. I built it because I wanted an agent-friendly way to handle repetitive budgeting work while still being able to review everything in the Actual web dashboard.

Site/docs: https://fiscal.sh
GitHub: https://github.com/fiscal-sh/fscl

Comments URL: https://news.ycombinator.com/item?id=47123049

Points: 1

# Comments: 0

Categories: Hacker News

Show HN: Zero-allocation and SIMD-accelerated CSV iterator in Zig

Mon, 02/23/2026 - 9:01am

I needed a CSV library in Zig, so I hand-rolled one. Later I came back to it, made it avoid allocations entirely, and then went down a rabbit hole of performance tuning, learning a ton in the process.

This is the result. I also added a benchmarking library and a blog post that explains the implementation details; all are available on the repo page.

I presented this at a local Zig meetup and it landed well, so I figured I'd post it here as well.
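The zero-allocation idea (which the post's Zig code takes much further with SIMD) can be sketched in Python: instead of materializing a string per field, yield offsets into a single buffer. This is an illustrative sketch of the technique only, not the library's code, and it ignores quoting:

```python
# Illustrative sketch of zero-allocation CSV field iteration (not the
# Zig library's code): yield (start, end) offsets into one buffer
# instead of building a substring per field. Quoted fields are ignored.
def iter_fields(buf: bytes):
    start = 0
    for i, byte in enumerate(buf):
        if byte in (0x2C, 0x0A):  # ',' or '\n' ends a field
            yield (start, i)
            start = i + 1
    if start < len(buf):
        yield (start, len(buf))

data = b"id,name\n1,alice\n2,bob"
view = memoryview(data)  # slicing a memoryview does not copy bytes
fields = [bytes(view[s:e]) for s, e in iter_fields(data)]
```

Callers that only need to compare or parse fields can work directly on the offset pairs and never allocate per field at all.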

Comments URL: https://news.ycombinator.com/item?id=47122443

Points: 1

# Comments: 0

Categories: Hacker News

Show HN: Attest – Test AI agents with 8-layer graduated assertions

Mon, 02/23/2026 - 9:00am

I built Attest because every team I've seen building AI agents ends up writing the same ad-hoc pytest scaffolding — checking if the right tools were called, if cost stayed under budget, if the output made semantic sense. It works until the agent gets complex; then it collapses.

60–70% of what makes an agent correct is fully deterministic: tool call schemas, execution order, cost budgets, content format. Routing all of this through an LLM judge is expensive, slow, and unnecessarily non-deterministic. Attest exhausts deterministic checks first and only escalates when necessary.
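The deterministic-first escalation described above can be sketched as follows. This is a hypothetical illustration: the `evaluate` function, check names, and trace shape are assumptions, not Attest's real API.

```python
# Minimal sketch of "deterministic checks first" evaluation; these
# function and field names are illustrative, not Attest's actual API.
def evaluate(trace, checks):
    """Run cheap deterministic checks in order and stop at the first
    failure, so an expensive LLM judge is only reached if all pass."""
    for name, check in checks:
        if not check(trace):
            return (name, False)
    return ("all_passed", True)

trace = {"tools": ["lookup_user", "reset_password"], "cost_usd": 0.005}
checks = [
    ("schema", lambda t: isinstance(t.get("tools"), list)),
    ("cost", lambda t: t["cost_usd"] < 0.05),
    ("tool_order", lambda t: t["tools"] == ["lookup_user", "reset_password"]),
    # an LLM-judge check would go last, invoked only when all of the above pass
]
result = evaluate(trace, checks)
```

Ordering checks from cheapest to most expensive is what keeps a failing trace from ever paying for an LLM call.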

The 8 layers: schema validation → cost/perf constraints → trace structure (tool ordering, loop detection) → content validation → semantic similarity via local ONNX embeddings (no API key) → LLM-as-judge → simulation with fault injection → multi-agent trace tree evaluation.

Example:

from attest import agent, expect
from attest.trace import TraceBuilder

@agent("support-agent")
def support_agent(builder: TraceBuilder, user_message: str):
    builder.add_tool_call(name="lookup_user", args={"query": user_message}, result={...})
    builder.add_tool_call(name="reset_password", args={"user_id": "U-123"}, result={...})
    builder.set_metadata(total_tokens=150, cost_usd=0.005, latency_ms=1200)
    return {"message": "Your temporary password is abc123."}

def test_support_agent(attest):
    result = support_agent(user_message="Reset my password")
    chain = (
        expect(result)
        .cost_under(0.05)
        .tools_called_in_order(["lookup_user", "reset_password"])
        .output_contains("temporary password")
        .output_similar_to("password has been reset", threshold=0.8)
    )
    attest.evaluate(chain)

The .output_similar_to() call runs locally via ONNX Runtime — no embeddings API key required. Layers 1–5 are free or near-free. The LLM judge is only invoked for genuinely subjective quality assessment.

Architecture: single Go binary engine (1.7ms cold start, <2ms for 100-step trace eval) with thin Python and TypeScript SDKs. All evaluation logic lives in the engine — both SDKs produce identical assertion results. 11 adapters covering OpenAI, Anthropic, Gemini, Ollama, LangChain, Google ADK, LlamaIndex, CrewAI, and OpenTelemetry.

v0.4.0 adds continuous evaluation with σ-based drift detection, a plugin system, result history, and CLI scaffolding. The engine and Python SDK are stable across four releases. The TypeScript SDK is newer: the API is stable but hasn't been battle-tested at scale yet.
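As a rough illustration of the general σ-based drift technique (an assumption about the approach, not Attest's implementation): flag a new result when it deviates from the mean of recent history by more than k standard deviations.

```python
import statistics

# Illustrative sketch of sigma-based drift detection, not Attest's
# code: flag a new score when it falls more than k standard deviations
# away from the mean of the recent score history.
def drifted(history, new_value, k=3.0):
    mean = statistics.fmean(history)
    sigma = statistics.pstdev(history)
    return abs(new_value - mean) > k * sigma

history = [0.91, 0.89, 0.90, 0.92, 0.90]
in_band = drifted(history, 0.90)      # small deviation, not drift
out_of_band = drifted(history, 0.55)  # large drop, flagged as drift
```

With a tight history like the one above, a score of 0.55 sits far outside the 3σ band while 0.90 sits inside it.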

The simulation runtime is the part I'm most curious about feedback on. You can define persona-driven simulated users (friendly, confused, adversarial), inject faults (latency, errors, rate limits), and run your agent against all of them in a single test suite. Is this useful in practice for CI, or is it a solution looking for a problem?

Apache 2.0 licensed. No platform to self-host, no BSL, no infrastructure requirements.

GitHub: https://github.com/attest-framework/attest
Examples: https://github.com/attest-framework/attest-examples
Website: https://attest-framework.github.io/attest-website/
Install: pip install attest-ai / npm install @attest-ai/core

Comments URL: https://news.ycombinator.com/item?id=47122431

Points: 1

# Comments: 0

Categories: Hacker News

Seymour: Live Programming for the Classroom

Mon, 02/23/2026 - 8:59am
Categories: Hacker News

VoxClaw – Give your Claw a voice

Mon, 02/23/2026 - 8:59am

Article URL: https://malpern.github.io/VoxClaw/

Comments URL: https://news.ycombinator.com/item?id=47122401

Points: 1

# Comments: 1

Categories: Hacker News

Overengineering a Static Website

Mon, 02/23/2026 - 8:58am
Categories: Hacker News

Ohm v18

Mon, 02/23/2026 - 8:54am

Article URL: https://ohmjs.org/blog/ohm-v18

Comments URL: https://news.ycombinator.com/item?id=47122352

Points: 1

# Comments: 0

Categories: Hacker News

Moving Beyond the IDE with Intent

Mon, 02/23/2026 - 8:51am
Categories: Hacker News

Show HN: Fundraising events across the developer tools ecosystem

Mon, 02/23/2026 - 8:49am

Article URL: https://ci.vc

Comments URL: https://news.ycombinator.com/item?id=47122293

Points: 1

# Comments: 0

Categories: Hacker News

Crab Mentality

Mon, 02/23/2026 - 8:48am
Categories: Hacker News

Show HN: AgentWard – After an AI agent deleted files, I built a runtime enforcer

Mon, 02/23/2026 - 8:47am

I've spent time working on AI safety and kept running into the same problem: AI agents have far more access than they need, and the only thing stopping them from misusing it is a prompt. Prompts can be ignored. They can be overridden by prompt injection. They're not enforcement — they're a suggestion.

AgentWard is a proxy layer that sits between your agent and its tools and enforces permissions in code, outside the LLM context window. No matter what the model decides, the policy is what actually runs.

What it does:

- Scans your OpenClaw skills and flags risky permissions
- Detects dangerous skill combinations — pairs that are low-risk individually but become high-risk when chained together (email + web browser → data exfiltration path)
- Enforces a YAML policy at runtime — ALLOW, BLOCK, APPROVE, REDACT
- Logs everything for audit

Getting started is one command: agentward init. It scans, shows your risk profile, and wraps your environment with a sensible default policy in under two minutes.

Honest caveats: currently tested on OpenClaw skills and Mac only. MCP server support and Windows are on the roadmap — contributions welcome. This is early and rough in places, but the core enforcement works. I'm sharing it now because the problem is real and getting worse fast. Would love feedback from anyone running agents in production.

GitHub: github.com/agentward-ai/agentward
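The core enforcement idea can be sketched as a lookup that runs outside the model, assuming a policy keyed by tool name. The tool names and structure here are hypothetical, not AgentWard's actual policy format:

```python
# Hypothetical sketch of policy enforcement outside the LLM context
# window (not AgentWard's code). The dict mirrors what a YAML policy
# might declare, e.g.:
#   tools:
#     read_email: ALLOW
#     delete_file: BLOCK
#     send_email: APPROVE
policy = {"read_email": "ALLOW", "delete_file": "BLOCK", "send_email": "APPROVE"}

def enforce(tool_name: str) -> str:
    """Return the action for a requested tool call. Unknown tools fall
    back to BLOCK, so a model cannot gain access by inventing a tool."""
    return policy.get(tool_name, "BLOCK")

decisions = [enforce(t) for t in ("read_email", "send_email", "mystery_tool")]
```

Because the decision is a plain lookup in code, prompt injection can change what the model asks for but not what the proxy allows.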

Comments URL: https://news.ycombinator.com/item?id=47122277

Points: 1

# Comments: 0

Categories: Hacker News
