Author: Benj Edwards

Researchers astonished by tool’s apparent success at revealing AI’s hidden motives

By Benj Edwards, March 15, 2025

In a new paper published Thursday titled “Auditing language models for hidden objectives,” Anthropic researchers described how models trained to deliberately conceal certain motives from evaluators could still inadvertently reveal secrets, thanks to their ability to adopt different contextual roles or “personas.” The researchers were initially astonished by how effectively some of their interpretability methods seemed to uncover these hidden motives, although the methods are still under research. While the research involved models trained specifically to conceal motives from automated software evaluators called reward models (RMs), the broader purpose of studying hidden objectives is to prevent future scenarios where powerful…

AI coding assistant refuses to write code, tells user to learn programming instead

By Benj Edwards, March 14, 2025

On Saturday, a developer using Cursor AI for a racing game project hit an unexpected roadblock when the programming assistant abruptly refused to continue generating code, instead offering some unsolicited career advice. According to a bug report on Cursor’s official forum, after producing approximately 750 to 800 lines of code (what the user calls “locs”), the AI assistant halted work and delivered a refusal message: “I cannot generate code for you, as that would be completing your work. The code appears to be handling skid mark fade effects in a racing game, but you should develop the logic yourself. This…

AI search engines cite incorrect sources at an alarming 60% rate, study says

By Benj Edwards, March 14, 2025

Even when these AI search tools cited sources, they often directed users to syndicated versions of content on platforms like Yahoo News rather than original publisher sites. This occurred even in cases where publishers had formal licensing agreements with AI companies. URL fabrication emerged as another significant problem. More than half of citations from Google’s Gemini and Grok 3 led users to fabricated or broken URLs resulting in error pages. Of 200 citations tested from Grok 3, 154 resulted in broken links. These issues create significant tension for publishers, which face difficult choices. Blocking AI crawlers might lead to loss…

Anthropic CEO floats idea of giving AI a “quit job” button, sparking skepticism

By Benj Edwards, March 13, 2025

Anthropic CEO Dario Amodei raised a few eyebrows on Monday after suggesting that advanced AI models might someday be provided with the ability to push a “button” to quit tasks they might find unpleasant. Amodei made the provocative remarks during an interview at the Council on Foreign Relations, acknowledging that the idea “sounds crazy.” “So this is—this is another one of those topics that’s going to make me sound completely insane,” Amodei said during the interview. “I think we should at least consider the question of, if we are building these systems and they do all kinds of things like…

What does “PhD-level” AI mean? OpenAI’s rumored $20,000 agent plan explained.

By Benj Edwards, March 12, 2025

On the Frontier Math benchmark by EpochAI, o3 solved 25.2 percent of problems, while no other model has exceeded 2 percent, suggesting a leap in mathematical reasoning capabilities over the previous model.

Benchmarks vs. real-world value

Ideally, potential applications for a true PhD-level AI model would include analyzing medical research data, supporting climate modeling, and handling routine aspects of research work. The high price points reported by The Information, if accurate, suggest that OpenAI believes these systems could provide substantial value to businesses. The publication notes that SoftBank, an OpenAI investor, has committed to spending $3 billion on OpenAI’s agent products…

OpenAI pushes AI agent capabilities with new developer API

By Benj Edwards, March 11, 2025

Developers using the Responses API can access the same models that power ChatGPT Search: GPT-4o search and GPT-4o mini search. These models can browse the web to answer questions and cite sources in their responses. That’s notable because OpenAI says the added web search ability dramatically improves the factual accuracy of its AI models. On OpenAI’s SimpleQA benchmark, which aims to measure confabulation rate, GPT-4o search scored 90 percent, while GPT-4o mini search achieved 88 percent—both substantially outperforming the larger GPT-4.5 model without search, which scored 63 percent. Despite these improvements, the technology still has significant limitations. Aside from issues…
