|
Read In Depth
|
ARC-AGI-3
ARC-AGI-3 just dropped, introducing a new scoring metric (RHAE, Relative Human Action Efficiency) that measures not just whether AI solves tasks but how efficiently it does so relative to humans. Current frontier models like Gemini and Opus score around 0.2% on these visual reasoning puzzles, making this a stark measure of how far we are from fluid general reasoning. The technical report is worth reading if you think carefully about what benchmarks actually measure.
hn/Best Stories
|
ARC AGI 3 scores are not calculated the same way as ARC AGI 1 or 2
Critical methodological context for ARC-AGI-3: the scoring function (RHAE) is fundamentally different from ARC-AGI-1/2, making direct comparisons misleading. The post breaks down how per-level action efficiency is scored relative to human baselines, which is worth understanding before drawing conclusions from the leaderboard numbers.
reddit/r/singularity
|
Running Tesla Model 3's computer on my desk using parts from crashed cars
A hardware hacker salvaged a Tesla Model 3's MCU from crashed cars and got it running on a desk, the kind of build-it-from-scratch-to-understand-it project that reveals how the system actually works. Likely involves low-level boot, proprietary protocols, and creative reverse engineering. 710 HN points suggest it delivers.
hn/Best Stories
|
Miscellanea: The War in Iran
A Collection of Unmitigated Pedantry (a military historian's blog) offers ground-level analytical commentary on the Iran war, the kind of structured, evidence-based reasoning that cuts through media noise. 551 HN points suggest it's substantive rather than punditry. Essential context for understanding a conflict that's reshaping energy markets and global supply chains you probably care about.
hn/Best Stories
|
A post-transformer architecture just crushed LLMs on Sudoku Extreme. Is the transformer hitting a reasoning wall nobody wants to talk about?
A post-transformer architecture from Pathway reportedly dominates LLMs on Sudoku Extreme, suggesting that pure chain-of-thought scaling may not be sufficient for certain constraint-satisfaction reasoning tasks. If the claims hold up, this is a meaningful data point in the debate about transformer architectural limits, relevant both technically and for understanding where the field is headed.
reddit/r/singularity
|
Antimatter has been transported for the first time
Antimatter has been physically transported for the first time, a genuine physics milestone. Antimatter has historically only been studied at the accelerator facilities where it's created; moving it opens up new experimental possibilities and is a meaningful step toward practical antimatter research. Worth a read if you follow fundamental physics.
hn/Best Stories
|
Is Big Tech Facing a Big Tobacco Moment?
Back-to-back jury verdicts finding Meta and YouTube negligent for addictive design are being compared to the Big Tobacco moment, a potential inflection point for platform liability law. The DealBook piece analyzes what this means structurally for tech business models, not just as a legal story. Relevant for anyone thinking about platform defensibility and regulatory moats.
nyt/Business
|
Thoughts on slowing the fuck down
The author of Pi (a popular open-source coding agent framework) writes about deliberately slowing down amid the AI velocity arms race, questioning whether speed of shipping is actually the right metric. With 956 HN points and context from Simon Willison that this person built real production infrastructure, it's a practitioner's reflection worth taking seriously rather than dismissing as burnout blogging.
hn/Best Stories
|
“Lethality” Used to Be a Pentagon Buzzword. Now It’s a Worldview.
NYT Magazine examines how 'lethality' has evolved from Pentagon jargon into an actual governing philosophy under Hegseth, and how it obfuscates as much as it clarifies, just differently than old defense-speak did. A substantive look at how language shapes institutional culture and decision-making in the military, which is now running a real war.
nyt/Top Stories
|
Has modern cinema replaced tragedy with psychology?
A sharp critical argument that modern cinema has substituted psychological explanation for genuine tragic structure, using Inglourious Basterds as a case study where evil is legible as behavior but never reaches the depth of what evil *is*. The kind of ideas-forward film criticism that connects to broader questions about how storytelling shapes moral understanding.
reddit/r/TrueFilm
|
|
FYI
|
OpenAI Is Shutting Down Sora, Its A.I. Video Generator
OpenAI is shutting down Sora just three months after signing a multiyear deal to bring Disney characters to the platform. Such a fast product death after a high-profile launch deal raises questions about OpenAI's product strategy and whether video generation as a standalone product has a viable moat.
nyt/Technology
|
First-ever American AI Jobs Risk Index released by Tufts University
Tufts University released the first American AI Jobs Risk Index, estimating 2.7M to 19.5M U.S. job displacements within 2 to 5 years depending on adoption speed, with a central estimate around 9.3M. Useful data point for thinking about labor market effects, though wide confidence intervals reflect genuine uncertainty about adoption pace.
reddit/r/singularity
|
Bernie Sanders and AOC introduce bill to pause building of new datacenters
Sanders and AOC introduced a bill to pause construction of new data centers, framed around energy and labor concerns. Long-shot legislation, but signals growing political traction for AI infrastructure regulation. Worth tracking as a potential constraint on compute expansion, which directly affects the industry you work in.
reddit/r/singularity
|
The EU still wants to scan your private messages and photos
The EU's push to scan private messages and photos for CSAM is back on the table after a surprise Parliament vote shifted from blanket surveillance to targeted monitoring. Still a live policy threat to end-to-end encryption in Europe: the creator of Fight Chat Control confirmed the campaign is needed again. Relevant for anyone thinking about privacy infrastructure and EU tech regulation.
hn/Best Stories
|
I went through congressional trade filings from the week before the Iran war started. An Intelligence Committee member bought Exxon twice in early February and.....
A Reddit user dug through congressional trade filings and found a House Intelligence Committee member bought Exxon twice in early February and sold a cybersecurity stock the day before bombs dropped on Iran. Not conclusive, but the pattern is striking and connects to the broader story of possible insider trading around Trump's Iran-related posts that's already drawing regulatory scrutiny.
reddit/r/investing
|