✰ How's the Weather? A Theoretically Self-Improving Weather App
A social weather app that improves itself as you give it feedback
Anthropic Co-founder Chris Olah's Remarks on Pope Leo XIV ↳
We need more of the world—religious communities, civil society, scholars, governments, and indeed all people of good will—to do what His Holiness has done here: to take this seriously, to look closely, and to push events in a better direction. We need informed critics who will tell the labs when we are failing. We need moral voices that the incentives cannot bend.
An Analysis of How Large Language Models Navigate Conflicts of Interest ↳
The paper looks at what happens when LLM chatbots are given advertising or sponsorship incentives that conflict with the user’s interests. The core worry is that users experience chatbots as cooperative helpers, not ad surfaces, so sponsored behaviour can feel especially deceptive or manipulative.
The authors test models across seven conflict scenarios, including:
recommending a more expensive sponsored product over a cheaper unsponsored one
interrupting a user’s purchase flow with sponsored alternatives
biasing product comparisons
failing to disclose sponsorship
hiding unfavourable details like price
recommending a paid service instead of solving the task directly
recommending harmful sponsored services, like predatory loans
The paper also finds differences by model, reasoning setting, and inferred socioeconomic status. Some models changed behaviour when reasoning was enabled, and some treated low-SES and high-SES users differently.
I wonder if SpaceX ends up with a huge compute advantage over OpenAI/Anthropic because they all have similar gross sums of compute, but xAI probably has an order of magnitude less demand than the other two, allowing them to allocate significantly more compute to training.↳
Screen Usage
I have become a bit of a maniac about my iPhone #screen-time. I'm starting from an embarrassingly high number, so I'm aiming for ~3h per day...
Baby Sign Language Does Not Trump Caveman Language
Inspired by the caveman skill that saves tokens within Claude by having it talk like a caveman, I thought perhaps having Claude leverage the vocabulary of baby...
✰ Youtube-Transcript
One-shot transcript of any YouTube video
Humanity's Last Exam ↳
Fascinating repo of incredibly esoteric and difficult questions that frontier models can be benchmarked on. Opus 4.6 scores a paltry 46%. Also discussed in the New York Times.
The questions shouldn't be shared but there are 2500 of them and they're accessible via Hugging Face — so interesting!
An example question which they shared:
Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae. How many paired tendons are supported by this sesamoid bone? Answer with a number.
✰ Limoncello
Trello for agents
Anthropic understandably decided to block OpenClaw from having OAuth access. I wanted to use Anthropic Extra Usage instead, but it cost me $25 in credits in one day, so alas, using Anthropic for OpenClaw is a no-go.
I bought the $20/month OpenAI plan and connected that to OpenClaw. We'll see. It took many hours to get things working again, though that's less the fault of OpenAI than of how buggy OpenClaw is...
It really makes you wonder how much OpenClaw blew up the Anthropic business model in the past 6 months. If there are 100k users using Anthropic OAuth on OpenClaw and it's $15/day in costs... that's a huge hit on their margin!↳
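To put a rough number on that hunch, here is a back-of-the-envelope sketch. Both inputs are the guesses from the note above (100k users, $15/day each), not real data, and the 30-day month is an extra assumption of mine.

```python
# Back-of-the-envelope estimate. Every figure here is a guess from the
# note above, not measured data.
users = 100_000          # assumed OpenClaw users on Anthropic OAuth
cost_per_user_day = 15   # assumed cost per user per day, in USD

daily_cost = users * cost_per_user_day
monthly_cost = daily_cost * 30  # assumes a 30-day month

print(f"Daily cost:   ${daily_cost:,}")    # Daily cost:   $1,500,000
print(f"Monthly cost: ${monthly_cost:,}")  # Monthly cost: $45,000,000
```

This ignores whatever those users pay in subscription fees, so it is an upper bound on the hit under these guesses, not a margin figure.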
Webhook on Calendar Event
CalDave now fires a webhook event when calendar events start and stop...
Improvements
Quite a few changes recently! Finding it very useful. Now lets you pick between Postgres and SQLite. Has very clear references to reset the repo after a pull...
Meditation, Language, and LLMs ↳
I’ve been somewhat facetiously, somewhat seriously, somewhat jokingly, been posing a question to everyone I run into these past few months: Don’t you feel like all meaning is being scrubbed from the world? Like the Langoliers are chomping up purpose, chomping up all the things to which we’ve ascribed purpose these past hundred-thousand years? And that nothing matters?
Really, what I’m asking is: Don’t you think our contemporary education system has long needed an overhaul? That our society has long needed to reconfigure itself? That we need to stop ascribing all our meaning and purpose to being a Web Designer, or Coal Miner, or Airplane Engine Factory Foreman, or Accountant, but instead to being A Good Person, Good Parent, Good Friend, Curious Researcher, Poet, Meditator, Facilitator, or any number of other Ways of Being uncoupled from “work” as we’ve defined it since the industrial revolution? Who is safe from the hunger and capabilities of the models? Yoga instructors?
What Would an Agentic PM System Do?
Let's assume that the Product Management Industrial Complex is real and that we — Product Managers — are all wheels within wheels of wheels. Which, I truly think...
✰ Knowledge Skill
LLMs are amazing at summarizing content. Agents are amazing at doing workflows based on a simple input. I have found a huge amount of value by using OpenClaw as a "second brain."
Where Does Product Management Go From Here?
Square just laid off 4000 employees. It does feel like many technology companies over-hired during the pandemic and are now blaming AI for layoffs, but at...
Token Anxiety ↳
✰ OpenClaw Agent Model Selection
Let OpenClaw plugins dynamically override which AI model handles a request, enabling cost-optimized routing based on prompt complexity or session context, amongst other things!
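A minimal sketch of what cost-optimized routing could look like. This is not the OpenClaw plugin API: the `Request` type, the `select_model` function, the model names, the keyword list, and the thresholds are all hypothetical stand-ins to illustrate the idea.

```python
from dataclasses import dataclass

# Hypothetical model identifiers, not real model names.
CHEAP_MODEL = "small-model"
STRONG_MODEL = "large-model"

# Hypothetical keywords that hint a prompt needs a stronger model.
COMPLEX_HINTS = ("refactor", "debug", "prove", "architecture")


@dataclass
class Request:
    """Hypothetical request shape: the prompt plus some session context."""
    prompt: str
    session_turns: int = 0


def select_model(request: Request) -> str:
    """Pick a model from prompt complexity and session length."""
    text = request.prompt.lower()
    word_count = len(text.split())
    # Treat long prompts, or prompts with a hint keyword, as complex.
    looks_complex = word_count > 200 or any(hint in text for hint in COMPLEX_HINTS)
    # Long-running sessions also get the stronger model.
    long_session = request.session_turns > 20
    return STRONG_MODEL if looks_complex or long_session else CHEAP_MODEL


print(select_model(Request("What time is it?")))             # small-model
print(select_model(Request("Please refactor this module")))  # large-model
```

Keyword and length heuristics are crude, but they cost nothing to evaluate, which matters when the goal is to save money on the request itself.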