Feed aggregator
No, it doesn't cost Anthropic $5k per Claude Code user
Article URL: https://martinalderson.com/posts/no-it-doesnt-cost-anthropic-5k-per-claude-code-user/
Comments URL: https://news.ycombinator.com/item?id=47317132
Points: 4
# Comments: 0
Love in the Time of A.I. Companions
Article URL: https://www.newyorker.com/magazine/2026/03/16/love-in-the-time-of-ai-companions
Comments URL: https://news.ycombinator.com/item?id=47317122
Points: 1
# Comments: 0
Helios: Real Real-Time Long Video Generation Model
Article URL: https://www.alphaxiv.org/abs/2603.04379
Comments URL: https://news.ycombinator.com/item?id=47317115
Points: 3
# Comments: 0
PRX Part 3 – Training a Text-to-Image Model in 24h
Article URL: https://huggingface.co/blog/Photoroom/prx-part3
Comments URL: https://news.ycombinator.com/item?id=47317103
Points: 1
# Comments: 0
Open-source software could be excluded from Colorado age verification bill
Article URL: https://twitter.com/carlrichell/status/2031125624711164182
Comments URL: https://news.ycombinator.com/item?id=47317060
Points: 1
# Comments: 0
Show HN: Hacker News Focus Comments Reader
Article URL: https://chromewebstore.google.com/detail/hn-focus-reader/ibhipggecnholemnbahigagpgifkphac
Comments URL: https://news.ycombinator.com/item?id=47317045
Points: 1
# Comments: 0
The emerging role of SRAM-centric chips in AI inference
Article URL: https://gimletlabs.ai/blog/sram-centric-chips
Comments URL: https://news.ycombinator.com/item?id=47317043
Points: 1
# Comments: 0
Simradar21
Article URL: https://simradar21.com/
Comments URL: https://news.ycombinator.com/item?id=47317032
Points: 1
# Comments: 0
Amid wave of kids' online safety laws, age-checking tech comes of age
Article URL: https://www.reuters.com/legal/litigation/amid-wave-kids-online-safety-laws-age-checking-tech-comes-age-2026-03-09/
Comments URL: https://news.ycombinator.com/item?id=47317031
Points: 1
# Comments: 0
Bluesky CEO Jay Graber Is Stepping Down
Electrical upgrades for our Winnebago Solis camper van
Article URL: https://www.jamesxli.com/2026/van-electrical-upgrades/
Comments URL: https://news.ycombinator.com/item?id=47315843
Points: 1
# Comments: 0
Australians Flock to VPNs in the Wake of Online Age-Restriction Laws
Tokenized Stocks Are Coming to a Market Near You
Article URL: https://www.wsj.com/finance/stocks/tokenized-stocks-are-coming-to-a-market-near-you-five-things-to-know-d9131494
Comments URL: https://news.ycombinator.com/item?id=47315815
Points: 2
# Comments: 0
Show HN: LOAB – AI agents get decisions right but skip the process [pdf]
LOAB is an open-source benchmark for evaluating whether AI agents can follow regulated lending processes, not just produce the right final answer. The motivation is simple: in mortgage lending, regulators don't care if you got the right answer; they care whether you followed the right process. Skip a KYC check, pull a credit bureau report before getting privacy consent, or approve a loan without the required policy lookup, and that's a compliance failure even if the outcome was correct. Current AI benchmarks don't measure this: they evaluate what the agent decided, not how it got there.
LOAB simulates a fictional Australian lender with mock regulatory APIs, multi-agent roles mirroring real bank operations, and a five-dimension scoring rubric derived from actual lending law. A run only passes if the outcome is correct AND the process was correct. The main finding: frontier models achieve 67-75% outcome accuracy but only 25-42% when you also require process compliance. It's surprisingly hard to get AI to follow a prescribed sequence of steps even when it clearly "knows" the right answer.
Comments URL: https://news.ycombinator.com/item?id=47315813
Points: 1
# Comments: 0
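The pass-only-if-both-are-correct scoring that the LOAB post describes can be sketched roughly as follows. This is an illustrative sketch, not LOAB's actual schema: the step names, the `score_run` function, and the required sequence are all hypothetical.

```python
# Sketch of LOAB-style scoring: a run passes only if the final decision is
# correct AND every mandated process step was taken in the required order.
# Step names and the rubric below are illustrative, not LOAB's real API.

REQUIRED_SEQUENCE = ["kyc_check", "privacy_consent", "credit_pull", "policy_lookup"]

def process_compliant(steps_taken):
    """True if the required steps appear as an ordered subsequence
    (other steps may be interleaved between them)."""
    it = iter(steps_taken)
    return all(req in it for req in REQUIRED_SEQUENCE)

def score_run(decision, expected_decision, steps_taken):
    outcome_ok = decision == expected_decision
    process_ok = process_compliant(steps_taken)
    return {"outcome": outcome_ok, "process": process_ok,
            "passed": outcome_ok and process_ok}

# An agent that pulls the credit report before obtaining privacy consent
# fails the run even though its final decision is correct:
run = score_run("approve", "approve",
                ["kyc_check", "credit_pull", "privacy_consent", "policy_lookup"])
# run["outcome"] is True, but run["passed"] is False
```

The key design point mirrors the post: outcome accuracy and process compliance are scored separately, and a run only counts as a pass when both hold.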
But why do we change the clocks at 2am?
Article URL: https://www.rd.com/article/why-does-daylight-saving-time-start-at-2-a-m/
Comments URL: https://news.ycombinator.com/item?id=47315804
Points: 1
# Comments: 0
Hosted MCP server "everything" for testing
Article URL: https://servereverything.dev/
Comments URL: https://news.ycombinator.com/item?id=47315787
Points: 1
# Comments: 0
Train Neural Network Using DirectCompute D11
Article URL: https://pypi.org/project/directcompute-nn/
Comments URL: https://news.ycombinator.com/item?id=47315777
Points: 1
# Comments: 1
Reviving the Maintenance of MkDocs
Article URL: https://github.com/orgs/mkdocs-community/discussions/1
Comments URL: https://news.ycombinator.com/item?id=47315771
Points: 1
# Comments: 0
Energy-based Model (EBM) for enterprise AI security Ship it or keep tuning?
I've been building Energy-Guard OS for the past several months — and I want an honest opinion from people who actually understand the tradeoffs, because I'm stuck at a decision point. What is it? It's not a fine-tuned LLM. It's a production application of Energy-based Models (EBMs) — an architecture that assigns an energy score to inputs rather than predicting tokens. Low energy = normal. High energy = threat or anomaly. The core use case: a real-time data gateway that sits between your organization and any AI service, blocking sensitive data from leaking out (PII, financials, strategic documents) while still allowing legitimate AI use. Think of it as a firewall, but one that understands semantic context, not just regex patterns. More about EBMs No hallucination (it scores, not generates) Calibrated risk score, not binary block/allow Runs on modest hardware — currently 192.8 req/s on a single 4 vCPU / 16GB RAM machine 411MB model size, under 700MB memory usage Built from scratch on 7 production data sources The honest test results (10,000+ cases, independent test suite): Total Tests: 13,000 Valid Responses: 13,000 Success Rate: 100.0% Overall Accuracy: 88.74%
Duration: 18.4s Throughput: 704.5 req/s Avg Latency: 17.6ms P50 Latency: 17.9ms P95 Latency: 32.0ms P99 Latency: 33.8ms Category Accuracy Financial Leak Detection 100% PII / Private Data 100% Strategic Data 100% Malicious Code 95% OWASP LLM Top 10 87% Multi-Turn Attacks 67% General Benign (False Positives) 66% Overall 88.7% F1: 0.927 | Precision: 0.922 | Recall: 0.932 | Specificity: 0.740 The problem I'm facing: After 2 months of tuning, I've gone from 74% → 88.7% overall accuracy. But I've hit a wall where improving one category hurts another. Specifically: The false positive rate is too high for general/technical content (the system over-blocks benign code and text) Multi-turn conversation attacks are at 67% — the model doesn't fully leverage conversation context yet Every time I push one metric up, something else drops My actual question: Do I ship a limited Beta now — restricted to the use cases where it performs at 95-100% (financial data, PII, strategic leaks) — or do I keep tuning before any real-world exposure? Why i want to ship: Real-world data will teach me more than synthetic test cases The high-value use cases already work extremely well I've been optimizing against synthetic benchmarks for 2 months Why i want to wait: 34% false positive rate on general content will frustrate users Multi-turn is a known attack vector that's currently weak First impressions matter Website if you want to see more details: https://ebmsovereign.com/ All forms on the website are currently disabled except for emails, which will be available for testing within 24 hours, Genuinely want to hear from people who've shipped security products or ML systems in production. What would you do?
Comments URL: https://news.ycombinator.com/item?id=47315770
Points: 1
# Comments: 0
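The graded energy-score gateway the Energy-Guard post describes (a calibrated score mapped to an action, rather than a binary block/allow) can be sketched like this. The real product uses a trained EBM; here `energy_of` is a keyword stub, and all names and thresholds are illustrative assumptions.

```python
# Minimal sketch of an energy-based data gateway: score each outbound
# request, then map the calibrated energy to a graded action instead of
# a hard binary block/allow. energy_of() is a stand-in for a trained EBM
# (low energy = normal, high energy = threat/anomaly); the thresholds and
# marker list are illustrative only.

def energy_of(text):
    """Stub for the trained EBM's energy score on an outbound request."""
    risky_markers = ("ssn", "credit card", "internal only")
    return 0.9 if any(m in text.lower() for m in risky_markers) else 0.1

def gate(text, block_at=0.8, review_at=0.5):
    """Map the energy score to one of three graded actions."""
    e = energy_of(text)
    if e >= block_at:
        return ("block", e)
    if e >= review_at:
        return ("review", e)
    return ("allow", e)

print(gate("summarize this public blog post"))       # low energy -> allow
print(gate("draft an email with the customer SSN"))  # high energy -> block
```

The middle "review" band is one way to realize the post's point about a calibrated risk score: borderline requests can be routed to human review instead of being hard-blocked, which is exactly the lever for trading off the false-positive rate the author is worried about.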
