Posts about ai | From the computer of Peter Clark

We're in the Over-engineering Game Now

For my entire career I have prided myself at being really good at scoping projects to be the least amount of engineering work and yet also maximum versatility....

✰ Hows the Weather? a Theoretically Self Improving Weather App

A social weather app that improves itself as you give it feedback

Anthropic Co-founder Chris Olah's Remarks on Pope Leo XIV ↳

We need more of the world—religious communities, civil society, scholars, governments, and indeed all people of good will—to do what His Holiness has done here: to take this seriously, to look closely, and to push events in a better direction. We need informed critics who will tell the labs when we are failing. We need moral voices that the incentives cannot bend.

An Analysis of How Large Language Models Navigate Conflicts of Interest ↳

The paper looks at what happens when LLM chatbots are given advertising or sponsorship incentives that conflict with the user’s interests. The core worry is that users experience chatbots as cooperative helpers, not ad surfaces, so sponsored behaviour can feel especially deceptive or manipulative.

The authors test models across seven conflict scenarios, including:

recommending a more expensive sponsored product over a cheaper unsponsored one
interrupting a user’s purchase flow with sponsored alternatives
biasing product comparisons
failing to disclose sponsorship
hiding unfavourable details like price
recommending a paid service instead of solving the task directly
recommending harmful sponsored services, like predatory loans

The paper also finds differences by model, reasoning setting, and inferred socioeconomic status. Some models changed behaviour when reasoning was enabled, and some treated low-SES and high-SES users differently.

I wonder if SpaceX ends up with a huge compute advantage over OpenAI/Anthropic because they all have similar gross sums of compute, but xAI probably has an order of magnitude less demand than the other two, allowing them to allocate significantly more compute to training.↳

Screen Usage

I have been becoming a bit of a maniac about my iPhone #screen-time — I'm starting from an embarrassingly high number so I'm aiming for ~3h per day...

Baby Sign Language Does Not Trump Caveman Language

Inspired by the caveman skill that saves token counts within Claude by having it talk like a caveman, I thought perhaps having Claude leverage the vocabulary of baby...

✰ Youtube-Transcript

One shot transcript of any YouTube

Humanity's Last Exam ↳

Fascinating repo of incredibly esoteric and difficult questions that frontier models can benchmark themselves on — Opus 4.6 scores a paltry 46%. Also discussed on the New York Times.

The questions shouldn't be shared but there are 2500 of them and they're accessible via Hugging Face — so interesting!

An example question which they shared:

Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae. How many paired tendons are supported by this sesamoid bone? Answer with a number.

✰ Limoncello

Trello for agents

Anthropic understandably decided to block Openclaw from having oAuth access and I wanted to use Anthropic Extra Usage but it cost me $25 in credits in 1 day, so alas, using Anthropic for OpenClaw is a no go.

I bought the $20/month OpenAI plan and connected that to OpenClaw. We'll see. It took many hours to get things working again, but not really the fault of OpenAI but how buggy OpenClaw is...

It really makes you wonder how much OpenClaw blew up the Anthropic business model in the past 6 months. If there is 100k users using Anthropic oAuth on OpenClaw and its $15/day in costs... thats a huge hit on their margin!↳

CalDave › Webhook on Calendar Event

CalDave now fires a webhook event when calendar events are starting and stopping....

Starter Repo for Claude Code › Improvements

Quite a few changes recently! Finding it very useful. Now lets you pick between Postgres or SQLite Has very clear references to reset the repo after a pull...

Meditation, Language, and LLMs ↳

I’ve been somewhat facetiously, somewhat seriously, somewhat jokingly, been posing a question to everyone I run into these past few months: Don’t you feel like all meaning is being scrubbed from the world? Like the Langoliers are chomping up purpose, chomping up all the things to which we’ve ascribed purpose these past hundred-thousand years? And that nothing matters?

Really, what I’m asking is: Don’t you think our contemporary education system has long needed an overhaul? That our society has long needed to reconfigure itself? That we need to stop ascribing all our meaning and purpose to being a Web Designer, or Coal Miner, or Airplane Engine Factory Foreman, or Accountant, but instead to being A Good Person, Good Parent, Good Friend, Curious Researcher, Poet, Meditator, Facilitator, or any number of other Ways of Being uncoupled from “work” as we’ve defined it since the industrial revolution? Who is safe from the hunger and capabilities of the models? Yoga instructors?

From the computer of Peter Clark

We're in the Over-engineering Game Now

✰ Hows the Weather? a Theoretically Self Improving Weather App

Anthropic Co-founder Chris Olah's Remarks on Pope Leo XIV ↳

An Analysis of How Large Language Models Navigate Conflicts of Interest ↳

Screen Usage

Baby Sign Language Does Not Trump Caveman Language

✰ Youtube-Transcript

Humanity's Last Exam ↳

✰ Limoncello

CalDave › Webhook on Calendar Event

Starter Repo for Claude Code › Improvements

Meditation, Language, and LLMs ↳

What Would an Agentic PM System Do?

✰ Knowledge Skill

Where Does Product Management Go From Here?

Token Anxiety ↳

✰ OpenClaw Agent Model Selection

What Can We Learn From OpenClaw (fka MoltBot (fka ClawdBot))