Feed aggregator
In Search of Types [pdf]
Article URL: https://www.humprog.org/%7Estephen/papers/kell14in-author-version.pdf
Comments URL: https://news.ycombinator.com/item?id=42045459
Points: 1
# Comments: 0
Amazon's Echo Pop Just Returned to Its All-Time Lowest Price Before Black Friday
For Windows 10 Holdouts, One More Year of Support Will Cost $30
MA could be the next state to get rid of the subminimum wage for tipped workers
Nvidia ousts Intel from Dow Jones Index after 25-year run
Article URL: https://arstechnica.com/ai/2024/11/nvidia-ousts-intel-from-dow-jones-index-after-25-year-run/
Comments URL: https://news.ycombinator.com/item?id=42045072
Points: 2
# Comments: 0
Down in the Mantle
Article URL: https://www.science.org/content/blog-post/down-mantle
Comments URL: https://news.ycombinator.com/item?id=42045049
Points: 1
# Comments: 0
Black Friday Gaming Deal: Backbone's Nifty Cloud Gaming Controllers Are 40% Off
Public Dataset of Social Media Discourse about the 2024 U.S. Election
Article URL: https://arxiv.org/abs/2411.00376
Comments URL: https://news.ycombinator.com/item?id=42045007
Points: 1
# Comments: 0
OpenPaX, a New Linux Memory Security Patch, Arrives
Article URL: https://thenewstack.io/openpax-a-new-linux-memory-security-patch-arrives/
Comments URL: https://news.ycombinator.com/item?id=42044998
Points: 1
# Comments: 0
Astronomers urge FCC to halt satellite megaconstellation launches
Article URL: https://www.space.com/space-exploration/satellites/astronomers-urge-fcc-to-halt-satellite-megaconstellation-launches
Comments URL: https://news.ycombinator.com/item?id=42044964
Points: 3
# Comments: 1
Show HN: Fuzzy deduplicate any CSV using vector embeddings
I made an app to fuzzy-deduplicate my Google Sheets and CRM records
- No manual configuration required
- Works out-of-the-box on most data types (e.g. people, companies, product catalogs)
Implementation details:
- Embeds records using an E5-family model
- Performs similarity search using DuckDB with its vector similarity extension
- Does last-mile comparison and merges duplicates using Claude
Demo video: https://youtu.be/7mZ0kdwXBwM
GitHub repo (Apache 2.0 licensed): https://github.com/SnowPilotOrg/dedupe_it
Background story: My company has a table for tracking leads, which includes website visitors, demo form submissions, app signups, and manual entries. It’s full of duplicates. And writing formulas to merge those dupes has been a massive PITA.
I figured that an LLM could handle any data shape and give me a way to deal with tricky custom rules like “treat international subsidiaries as distinct from their parent company”.
The challenging part was avoiding an NxN comparison matrix. The solution I came up with was to first narrow the search space using vector embeddings + semantic similarity search, and then use a generative LLM only to compare each record against a few nearest neighbors and merge the duplicates.
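To make the two-stage idea concrete, here is a minimal sketch in plain Python. It is not the code from the repo: the character n-gram `embed` function is a toy stand-in for the E5 model, the brute-force neighbor scan stands in for the DuckDB similarity search, and `stub_judge` is a hypothetical placeholder for the Claude comparison. The names `candidate_pairs`, `dedupe`, `k`, and `min_sim` are assumptions made up for illustration.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Toy embedding: character n-gram counts (stand-in for an E5-family model)."""
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_pairs(records, k=2, min_sim=0.5):
    """Keep only each record's k most similar neighbors as candidate pairs."""
    vecs = [embed(r) for r in records]
    pairs = set()
    for i, v in enumerate(vecs):
        # Brute-force scan for clarity; a vector index would replace this loop.
        scored = sorted(
            ((cosine(v, w), j) for j, w in enumerate(vecs) if j != i),
            reverse=True,
        )
        for sim, j in scored[:k]:
            if sim >= min_sim:
                pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


def stub_judge(a, b):
    """Hypothetical placeholder for the LLM comparison: same first word."""
    first = lambda s: s.lower().split()[0].strip(".,")
    return first(a) == first(b)


def dedupe(records, judge=stub_judge, k=2, min_sim=0.5):
    """Group record indices into duplicate clusters using union-find."""
    parent = list(range(len(records)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in candidate_pairs(records, k, min_sim):
        if judge(records[i], records[j]):  # expensive call, made only on candidates
            parent[find(j)] = find(i)

    groups = {}
    for idx in range(len(records)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())
```

With five records such as "Acme Corp", "ACME Corporation", "Globex Inc", "Initech LLC", and "acme corp.", the judge is called on three candidate pairs instead of all ten possible pairs. At most `k` pairs per record reach the expensive comparison, so the number of judge calls grows linearly with the dataset rather than quadratically.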
Some cool attributes of this approach:
- Can work incrementally (no reprocessing the entire dataset)
- Allows processing all records in parallel
- Composes with deterministic dedupe rules
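As a rough illustration of the incremental and deterministic-rule points, the sketch below assigns each incoming record to an existing cluster or starts a new one, without touching records that were already processed. Everything here is an assumption for illustration: the `IncrementalDeduper` class, the `name` and `email` fields, the exact-email rule, and the use of `difflib.SequenceMatcher` as a stand-in for the embedding search plus LLM comparison.

```python
from difflib import SequenceMatcher


class IncrementalDeduper:
    """Assigns each new record to an existing cluster or opens a new one."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.records = []      # (record, cluster id) pairs seen so far
        self.by_email = {}     # deterministic rule: exact email match
        self.next_cluster = 0

    def _similar(self, a, b):
        # Stand-in for embedding similarity followed by an LLM comparison.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= self.threshold

    def add(self, record):
        email = (record.get("email") or "").strip().lower()
        # 1. The deterministic rule runs first and short-circuits the fuzzy step.
        cluster = self.by_email.get(email) if email else None
        # 2. Otherwise compare against existing records only (a linear scan here;
        #    a vector index would return just the nearest neighbors).
        if cluster is None:
            for existing, cid in self.records:
                if self._similar(existing["name"], record["name"]):
                    cluster = cid
                    break
        # 3. No match: the record starts its own cluster.
        if cluster is None:
            cluster = self.next_cluster
            self.next_cluster += 1
        self.records.append((record, cluster))
        if email:
            self.by_email.setdefault(email, cluster)
        return cluster
```

Each call to `add` only reads the state built by earlier calls, which is why new rows can be handled without reprocessing the whole table.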
Lmk any feedback on how to make this better!
Comments URL: https://news.ycombinator.com/item?id=42044962
Points: 2
# Comments: 0
Walmart’s $15 Roku Smart Bulb Deal Will Light Up the Room, Not Your Wallet
Perplexity CEO offers to replace striking NYT staff with AI
Article URL: https://techcrunch.com/2024/11/04/perplexity-ceo-offers-to-replace-striking-nyt-staff-with-ai/
Comments URL: https://news.ycombinator.com/item?id=42044956
Points: 9
# Comments: 3
Digital "AVATAR" therapy for distressing voices in psychosis
Article URL: https://www.nature.com/articles/s41591-024-03252-8
Comments URL: https://news.ycombinator.com/item?id=42044950
Points: 1
# Comments: 0
Netflix Bullish on Gen AI for Games After Laying Off Human Game Developers
Article URL: https://www.404media.co/netflix-games-ai-exec/
Comments URL: https://news.ycombinator.com/item?id=42044949
Points: 4
# Comments: 1
My Favorite Bluetooth Speaker Is $30 Off and Makes for a Great Gift
Sweden scraps plans for 13 offshore windfarms over Russia security fears
Live Types in a TypeScript Monorepo
Article URL: https://colinhacks.com/essays/live-types-typescript-monorepo
Comments URL: https://news.ycombinator.com/item?id=42044935
Points: 1
# Comments: 0