Feed aggregator

Chicken Nuget

Hacker News - Fri, 03/13/2026 - 3:20am
Categories: Hacker News

Show HN: ROI-first AI automation framework for B2B companies

Hacker News - Fri, 03/13/2026 - 3:11am

We've been working on practical AI automation systems focused on measurable business impact.

Many companies implement AI but fail to connect it to real ROI.

This project focuses on CRM automation, AI sales assistants and intelligent workflows.

https://roihacking.ai

Comments URL: https://news.ycombinator.com/item?id=47361508

Points: 1

# Comments: 0

Categories: Hacker News

Ask HN: What benchmarks do you trust most when comparing large LLMs?

Hacker News - Fri, 03/13/2026 - 3:06am

I was reading a research paper that compares Nemotron-3-Super-120B, GPT-OSS-120B, and Qwen3.5-122B. It looks at how these models perform on benchmarks such as IFBench, SWE-Bench, Tau Bench, and RULER.

One thing that stood out was the trade-off between accuracy and inference throughput, especially with formats like NVFP4 vs BF16.

I'm interested to know which benchmarks folks here actually rely on when evaluating models for real-life tasks. What works best for you?

Do you rely more on reasoning benchmarks, coding benchmarks, or long-context tests?

Comments URL: https://news.ycombinator.com/item?id=47361482

Points: 1

# Comments: 0

Categories: Hacker News

Ask HN: Resources for a conceptual model of LLMs as applicable to coding?

Hacker News - Fri, 03/13/2026 - 2:51am

I am trying to understand LLMs conceptually well enough to be able to predict their capabilities (and limitations) when it comes to generating code. Is that even a sensible goal? Are there good resources?

So far I've looked at:

1. Vibe Coding, by Steve Yegge and Gene Kim (https://www.amazon.in/Vibe-Coding-Building-Production-grade-Software/dp/1966280025). This has some practical examples and many guidelines. But there is not much theory, and it does not explain LLMs conceptually AFAICT.

2. Build an LLM from Scratch, by Sebastian Raschka (https://www.manning.com/books/build-a-large-language-model-from-scratch). Seems in-depth. But I don't really want to build an LLM.

3. AI Engineering, by Chip Huyen (https://www.amazon.in/AI-Engineering-Building-Applications-Foundation/dp/1098166302). This seems promising, although it is not coding focussed.

Perhaps something like How Claude Code Works (https://code.claude.com/docs/en/how-claude-code-works) but fleshed out in more detail.

Thanks.

Comments URL: https://news.ycombinator.com/item?id=47361403

Points: 1

# Comments: 0

Categories: Hacker News

World Vibe Web: a distributed, open-source app store

Hacker News - Fri, 03/13/2026 - 2:47am

Article URL: https://wvw.dev

Comments URL: https://news.ycombinator.com/item?id=47361380

Points: 2

# Comments: 0

Categories: Hacker News

Country Filter for X/Twitter

Hacker News - Fri, 03/13/2026 - 2:46am

Article URL: https://geofilterx.com/

Comments URL: https://news.ycombinator.com/item?id=47361373

Points: 1

# Comments: 0

Categories: Hacker News

Agentic, fully-automated reverse engineering

Hacker News - Fri, 03/13/2026 - 2:37am

Article URL: https://github.com/amruth-sn/kong

Comments URL: https://news.ycombinator.com/item?id=47361337

Points: 1

# Comments: 0

Categories: Hacker News

National Dex

Hacker News - Fri, 03/13/2026 - 2:36am

Article URL: https://nationaldex.io/

Comments URL: https://news.ycombinator.com/item?id=47361331

Points: 2

# Comments: 0

Categories: Hacker News

Pirate Bananagrams

Hacker News - Fri, 03/13/2026 - 2:32am

Article URL: https://piratebanana.com/

Comments URL: https://news.ycombinator.com/item?id=47361318

Points: 1

# Comments: 1

Categories: Hacker News

Ceno, browse the web without internet access

Hacker News - Fri, 03/13/2026 - 2:30am

Article URL: https://ceno.app/en/index.html?

Comments URL: https://news.ycombinator.com/item?id=47361313

Points: 1

# Comments: 0

Categories: Hacker News
