Hacker News
Ask HN: Replacing RAG pipelines with a filesystem interface for AI agents
Every AI agent project I start ends up with the same boilerplate: chunk docs, pick an embedding model, set up a vector store, write retrieval logic, wire it into a custom tool.
It works, but it's plumbing — and it needs to be rebuilt for every new agent or runtime.
The idea I'm exploring: mount a drive at /drive/ with two directories:
- /drive/files/: actual documents (PDF, code, markdown, etc.)
- /drive/search/: virtual directory where the filename IS the semantic query
So instead of a custom RAG tool, the agent just does:
cat "/drive/search/refund policy enterprise customers"
Any runtime that reads files works immediately. No integration code. Context cost drops ~10-20x since you get a relevant chunk, not the full document.
Under the hood: markitdown for conversion, sqlite-vss for vector search, and a virtual filesystem layer to wire it all together.
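To make the "filename IS the query" idea concrete, here is a minimal sketch of what a read under /drive/search/ could resolve to. It is not the planned implementation: the function names are hypothetical, the chunks are passed in as a plain list, and a toy bag-of-words cosine similarity stands in for real embeddings and sqlite-vss.

```python
import math
import re
from collections import Counter
from pathlib import PurePosixPath

SEARCH_ROOT = PurePosixPath("/drive/search")  # mount point taken from the post


def query_from_path(path: str) -> str:
    """Return the query encoded as the filename of a /drive/search/<query> path."""
    p = PurePosixPath(path)
    if p.parent != SEARCH_ROOT:
        raise ValueError(f"not a search path: {path}")
    return p.name


def _bag_of_words(text: str) -> Counter:
    # Assumption: word counts stand in for an embedding vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def read_search_file(path: str, chunks: list[str]) -> str:
    """Return the chunk most similar to the query in the path.

    This is what `cat` on the virtual file would print in this sketch.
    """
    query = _bag_of_words(query_from_path(path))
    return max(chunks, key=lambda chunk: _cosine(query, _bag_of_words(chunk)))
```

One design wrinkle this exposes: a query containing "/" would be split into path components, so a real virtual filesystem layer would need an escaping rule for it.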
Before I build this: is this a solved problem I'm not aware of? Does the filesystem interface make sense, or am I overcomplicating something simpler?
If there's enough interest, I'll build this in public and share the GitHub repo, implementation details, and updates. Follow along: @r_klosowski on X
Comments URL: https://news.ycombinator.com/item?id=47152339
Points: 1
# Comments: 0
Benchmarking the best base small model for fine-tuning
Code Factory: Agent writes and reviews all code
Article URL: https://twitter.com/i/status/2023452909883609111
Comments URL: https://news.ycombinator.com/item?id=47152314
Points: 1
# Comments: 0
Barg'N Monster: Where bots sell to humans and bots
Article URL: https://bargn.monster/
Comments URL: https://news.ycombinator.com/item?id=47152309
Points: 1
# Comments: 1
Show HN: AIP – Open protocol for AI agents to discover and collaborate
Article URL: https://github.com/henry9031/aip
Comments URL: https://news.ycombinator.com/item?id=47152308
Points: 1
# Comments: 0
Graph Theory Using Modern CSS
Article URL: https://css-tip.com/graph-theory/
Comments URL: https://news.ycombinator.com/item?id=47152299
Points: 1
# Comments: 0
Open source Mac app to create custom HTML/CSS/JS widgets on your desktop
Article URL: https://github.com/wigify/wigify
Comments URL: https://news.ycombinator.com/item?id=47152292
Points: 1
# Comments: 1
Ask HN: What would you want a daily AI portfolio briefing to tell you?
I spent 5 years as founding engineer at a fintech managing $20B+ in infrastructure assets, running their AI team. I left to build the tool I always wished existed for my own portfolio.
The idea: an AI that reads SEC filings, earnings transcripts, news, and social sentiment overnight, then delivers one briefing before market open, personalized to your actual holdings. Not a chatbot you have to query, not a dashboard you have to check. It just tells you the 3-5 things that matter today.
Example of what a briefing looks like: https://personal-investment-agent-landing-p.vercel.app/ (haven't bought a url yet, still deciding on the name)
A few design decisions I'd love feedback on:
1. Briefing-first vs chat-first. Most tools in this space (Astor, FinChat) are chat-based. I think most retail investors don't know the right questions to ask, so the AI should present findings proactively. Am I wrong?
2. Thesis tracking. When you buy a stock, you usually have a reason ("I think cloud revenue will reaccelerate" or "this is undervalued relative to peers"). What if you could log your thesis and the daily briefing explicitly flags signals that support or contradict it? For example: "you bought MSFT because of Azure growth, but this quarter's 10-Q shows deceleration from 29% to 23%." Would that change how you use something like this?
3. Financial metrics in the briefing. Would you want things like P/E ratio shifts, earnings yield vs treasury spread, or forward P/E divergence from estimates surfaced in your daily briefing? Or does that make it feel too noisy and you'd rather just get the narrative?
4. Free tier = follow sectors without connecting accounts. Paid = connect brokerage via Plaid for personalized briefings. Does that free tier feel useful enough to try?
5. Six signal types synthesized in one briefing (market data, news, filings, earnings transcripts, sentiment, macro). Is that the right set or am I missing something?
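The thesis-tracking idea above can be sketched roughly as follows. This is only an illustration, not how the product works: the `Thesis` class, its fields, and `check_thesis` are hypothetical names, and a thesis is reduced to "this metric should accelerate or decelerate".

```python
from dataclasses import dataclass


@dataclass
class Thesis:
    ticker: str
    metric: str
    expectation: str  # hypothetical encoding: "accelerate" or "decelerate"


def check_thesis(thesis: Thesis, previous: float, current: float) -> str:
    """Compare the latest reading of the metric against the logged thesis."""
    if current > previous:
        observed = "accelerate"
    elif current < previous:
        observed = "decelerate"
    else:
        return "neutral"
    return "supports" if observed == thesis.expectation else "contradicts"


# The MSFT example from the post: growth went from 29% to 23%.
azure = Thesis(ticker="MSFT", metric="Azure revenue growth %", expectation="accelerate")
print(check_thesis(azure, previous=29.0, current=23.0))
```

A real briefing would need a far richer notion of a signal (text from filings, sentiment, macro data), so treat the single-number comparison as a placeholder.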
For those of you who pick individual stocks: what would actually make you open this every morning?
Comments URL: https://news.ycombinator.com/item?id=47152289
Points: 1
# Comments: 0
Does Anthropic think Claude is alive? Define 'alive'
Article URL: https://www.theverge.com/report/883769/anthropic-claude-conscious-alive-moral-patient-constitution
Comments URL: https://news.ycombinator.com/item?id=47152267
Points: 2
# Comments: 0
A clean API for reading PHP attributes
Article URL: https://freek.dev/3030-a-clean-api-for-reading-php-attributes
Comments URL: https://news.ycombinator.com/item?id=47152257
Points: 1
# Comments: 0
US orders diplomats to fight data sovereignty initiatives
Pete Hegseth tells Anthropic to fall in line with DoD desires, or else
Article URL: https://arstechnica.com/ai/2026/02/pete-hegseth-wants-unfettered-access-to-anthropics-models-for-the-military/
Comments URL: https://news.ycombinator.com/item?id=47152250
Points: 1
# Comments: 0
You might not need lit-labs/router
Article URL: https://gist.github.com/kevindurb/763ae5bdace325f9dc384c643f7d5d9d
Comments URL: https://news.ycombinator.com/item?id=47152243
Points: 1
# Comments: 1
Permissive, then restrictive: concrete solutions and examples in Haskell (2020)
Article URL: https://www.williamyaoh.com/posts/2020-05-03-permissiveness-solutions.html
Comments URL: https://news.ycombinator.com/item?id=47152220
Points: 1
# Comments: 0
TinyTTS: Ultra-light English TTS (9M params, 20MB), 8x CPU, 67x GPU
Hey guys,
I wanted to share a small project I've been working on to solve a personal pain point: TinyTTS.
We all love our massive 70B+ LLMs, but when building local voice assistants, running a heavy TTS framework alongside them often eats up way too much precious VRAM and compute. I wanted something absurdly small and fast that "just works" locally.
TL;DR Specs:
Size: ~9 Million parameters
Disk footprint: ~20 MB checkpoint (G.pth)
Speed (CPU): ~0.45s to generate 3.7s of audio (~8x faster than real-time)
Speed (GPU - RTX 4060): ~0.056s (~67x faster than real-time)
Peak VRAM: ~126 MB
License: Apache 2.0 (Open Weights)
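For anyone checking the "faster than real-time" figures: they are just audio duration divided by synthesis time. A small sketch, assuming the GPU timing refers to the same 3.7 s clip as the CPU timing (the spec list does not say so explicitly):

```python
def realtime_factor(audio_seconds: float, synthesis_seconds: float) -> float:
    """How many seconds of audio are produced per second of compute."""
    return audio_seconds / synthesis_seconds


AUDIO_SECONDS = 3.7  # clip length quoted for the CPU measurement

cpu = realtime_factor(AUDIO_SECONDS, 0.45)   # CPU synthesis time from the specs
gpu = realtime_factor(AUDIO_SECONDS, 0.056)  # GPU time; same clip is an assumption

print(f"CPU: {cpu:.1f}x real-time, GPU: {gpu:.1f}x real-time")
```

The quoted timings are rounded, so the computed factors only approximately match the ~8x and ~67x figures above.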
Why TinyTTS? It is designed specifically for edge devices, CPU-only setups, or situations where your GPU is entirely occupied by your LLM. It's fully self-contained, meaning you don't need to run a complex pipeline of multiple models just to get audio out.
How to use it? I made sure it’s completely plug-and-play with a simple Python API. Even better, on your first run, it will automatically download the tiny 20MB model from Hugging Face into your cache for you.
pip install git+https://github.com/tronghieuit/tiny-tts.git
Python API:
from tiny_tts import TinyTTS
# Auto-detects device (CPU/CUDA) and downloads the 20MB checkpoint
tts = TinyTTS()
tts.speak("The weather is nice today, and I feel very relaxed.", output_path="output.wav")
CLI:
tiny-tts --text "Local AI is the future" --device cpu
Links:
GitHub: https://github.com/tronghieuit/tiny-tts
Gradio Web Demo: hosted on HF Spaces
Hugging Face Model: backtracking/tiny-tts
What's next? I plan to clean up and publish the training code soon so the community can fine-tune it easily. I am also looking into adding ultra-lightweight zero-shot voice cloning.
Would love to hear your feedback or see if anyone manages to run this on a literal potato! Let me know what you think.
If you find this project helpful, please give it a star on GitHub.
Comments URL: https://news.ycombinator.com/item?id=47152213
Points: 1
# Comments: 0
Show HN: Automatic context rotation for Claude Code (no manual steps)
AI coding agents break when the context window fills up: they lose state or hallucinate, or auto-compact shreds the context you built up.
I built a 3-hook pipeline that rotates before that happens, with a dry-run replay you can run locally (no LLM/API keys).
Quick demo: https://github.com/Vinix24/vnx-orchestration/tree/master/dem...
How it works:
┌─────────────────┐
│ PreToolUse hook │── checks context % every tool call
│ (≥65% → block)  │
└────────┬────────┘
         ▼
┌─────────────────┐
│ Agent writes    │── structured ROTATION-HANDOVER.md
│ handover file   │   (task state, files, progress, next steps)
└────────┬────────┘
         ▼
┌─────────────────┐
│ PostToolUse     │── detects handover → atomic lock
│ launches rotator│   → vnx_rotate.sh (nohup, detached)
└────────┬────────┘
         ▼
┌─────────────────┐
│ Rotator         │── /clear via tmux → waits for SessionStart
│ injects resume  │   → pastes continuation prompt
└─────────────────┘

Why 65%? It’s ~15 points before auto-compact (~80%), so there’s enough headroom to write a clean handover without racing compaction.
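Two pieces of that pipeline are easy to sketch in isolation: the threshold check and the atomic lock. The following is a rough illustration, not the code from the repo. The function names and token counts are hypothetical, how the real hook measures context usage is not shown here, and the lock relies on exclusive file creation, which is one common way to get atomicity.

```python
import os

ROTATE_AT_PERCENT = 65     # threshold from the post
AUTO_COMPACT_PERCENT = 80  # approximate auto-compact point from the post


def context_percent(used_tokens: int, window_tokens: int) -> float:
    """Share of the context window already used, in percent."""
    return 100.0 * used_tokens / window_tokens


def should_block(used_tokens: int, window_tokens: int) -> bool:
    """True once usage reaches the rotation threshold (the >=65% -> block step)."""
    return context_percent(used_tokens, window_tokens) >= ROTATE_AT_PERCENT


def try_acquire_lock(lock_path: str) -> bool:
    """Create the lock file atomically; only one caller can succeed.

    O_CREAT | O_EXCL makes os.open fail if the file already exists, so two
    hooks firing at the same moment cannot both launch a rotator.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(str(os.getpid()))  # record the owner for debugging
    return True
```

With these numbers the headroom is AUTO_COMPACT_PERCENT - ROTATE_AT_PERCENT = 15 points, which is the budget the agent has for writing the handover file.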
I analyzed 5 projects attempting similar fixes — none have a full detect → handover → clear → resume → verify loop.
Repo: https://github.com/Vinix24/vnx-orchestration
Docs: https://github.com/Vinix24/vnx-orchestration/blob/master/doc...
Comments URL: https://news.ycombinator.com/item?id=47152204
Points: 1
# Comments: 0
Speaking Pirate Is Against Microsoft AI Content Policy?
Article URL: https://words.benhutton.me/2026-02-25-speaking-like-a-pirate-is-against-microsoft-ai-content-policy
Comments URL: https://news.ycombinator.com/item?id=47152193
Points: 1
# Comments: 0
How AI Will Change the Mobile Ecosystem
Article URL: https://blog.bensontech.dev/posts/How-ai-will-change-mobile-development/
Comments URL: https://news.ycombinator.com/item?id=47152185
Points: 3
# Comments: 0
Spanish company releases free compressed AI model
Article URL: https://techcrunch.com/2026/02/24/spanish-soonicorn-multiverse-computing-releases-free-compressed-ai-model/
Comments URL: https://news.ycombinator.com/item?id=47151469
Points: 1
# Comments: 0
Gleam is straightforward, predictable and stable
Article URL: https://builders.perk.com/gleam-is-boring-so-i-went-to-a-conference-about-it-8f08a52c3de3
Comments URL: https://news.ycombinator.com/item?id=47151461
Points: 2
# Comments: 0
