
Bram Cohen
The Current State and Future of AI

I’d like to talk about where AI is and where it’s likely to go in the future. In some sense this is a fool’s errand: It’s been obvious for many decades that there’s no physical limit preventing technology from surpassing human brains so any guesswork is about when, not if, that happens, and it’s impossible to guess when major technological breakthroughs will occur. But the current boom isn’t about a series of big breakthroughs, it’s one big breakthrough and a lot of scaling up and polish. So I’m going to say what I think the limits of the current technology are and what that means for the future.

(The one big breakthrough was realizing that if you stick with sublinear functions in the middle of a neural network you can back propagate over any depth. There’s another important but less revolutionary insight that you can make the amount of computation in each layer less than quadratic if you use a transformer architecture. I have an idea for another big advance but it’s working within this framework and doesn’t fundamentally change the outlook.)
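As a toy illustration of the backpropagation point (my own sketch, not the author's: I'm reading "sublinear functions" as activations that never grow faster than linearly, ReLU being the standard example), here are a few lines of numpy pushing a gradient back through many layers without it blowing up or vanishing:

```python
import numpy as np

rng = np.random.default_rng(0)
depth, width = 50, 64

# Random weights, He-initialized so activations neither explode nor vanish.
Ws = [rng.normal(0, np.sqrt(2 / width), (width, width)) for _ in range(depth)]

def relu(x):
    return np.maximum(x, 0)

# Forward pass, remembering each layer's pre-activation for the backward pass.
h = rng.normal(size=width)
cache = []
for W in Ws:
    z = W @ h
    cache.append(z)
    h = relu(z)

# Backward pass for a simple scalar loss (the sum of the outputs).
g = np.ones_like(h)
for z, W in zip(reversed(cache), reversed(Ws)):
    g = (g * (z > 0)) @ W   # chain rule: through the ReLU, then the linear layer

print(f"gradient norm w.r.t. the input after {depth} layers: {np.linalg.norm(g):.3f}")
```

With a saturating activation like a sigmoid in place of the ReLU, the same experiment typically shows the gradient shrinking toward zero as the depth grows.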


The state of AI today is comparable to what the internet was like circa 2000: An obviously very promising and important technology in the midst of its hype cycle which has yet to make a meaningful economic contribution. Improvements in AI could come to a screeching halt tomorrow and we’d still see a process over the next ten to twenty years of figuring out how to use it in industry, resulting in meaningful economic gains which show up in GDP and benefit people’s lives beyond the fun of talking to a chatbot.

One example is in therapy. Right now chatbots may be a good place for mostly mentally healthy people to vent and find companionship, but they aren’t trained for treating serious mental illness and are apparently badly aggravating schizophrenia. This is easy to improve on. For treating the symptoms of depression a chatbot needs to be trained to say ‘Tell me what you’re going through. I care about you.’ For anxiety it needs to say ‘The world is a stable place and everything is probably going to be okay. Freaking out doesn’t help. Stay calm and carry on.’ Schizophrenia is more problematic and possibly not something which current LLMs are good for. Just those straightforward improvements could result in much cheaper therapy available in unlimited quantities at any time of day or night for the most common mental health problems.

That said, AI improvements are obviously not coming to a screeching halt tomorrow. But what’s going on now is mostly scaling up: More data and more training. Eventually you run out of data and can’t afford any more training. An adult human has processed less than a gigabyte of linguistic information and is on a completely different level, so there are still some mysterious fundamental improvements to be had in getting training to work well. The LLMs we have today give a very misleading impression of how good they are. They can do things like make up plausible-sounding recipes but if you try following those recipes you’ll find they need a lot of tweaking to get dialed in. And I have to snark that the new Opus 4.5 model is a massive regression for things like recipes and figuring out which actor was referred to by a given pronoun. It’s best to think of LLMs as chatbots. They’re a massive enhancement in search technology and extraordinarily good at language translation and the tedious parts of coding. But they’re still fundamentally collating things from their training data and dumb as a rock.

One thing affecting the optics of the quality of LLMs is that they’re very good at chatting and math. What’s going on here isn’t so much that the LLMs are exceptionally good at these things as that the state of the art prior to them was bizarrely, insanely bad. This had long been a mystery. Why can’t we apply simple statistical techniques to at least make a vaguely plausible chatbot which won’t give itself away within a couple of sentences? The best we could do were things which obfuscated, said vague generic things, and hoped the user wouldn’t notice that there wasn’t much meat in what they were saying. What we have now are LLMs, which apparently are those simple statistical techniques which can make plausible text. They just happen to require a technique we didn’t know before and about nine orders of magnitude more data and computation to train than we expected. They also still work in no small part by obfuscating, saying vague generic things, and hoping the user doesn’t notice there isn’t much meat in what they’re saying. But they also augment that by agreeing with the user and repeating what they say a lot.

By the way, if you want to bust something as being a chatbot the best approach isn’t to leverage what they’re bad at but what they’re good at. Ask it to play a game where it doesn’t use certain letters, or only speaks in iambic pentameter, or only uses words containing an odd number of letters, and it will immediately give itself away by demonstrating utterly superhuman abilities. It has no idea how to emulate human frailty realistically.

Unrelated to all that, a note about my last post: It turns out that my napkin model missed that supercritical fluid density is highly nonlinear and in particular gets very dense close to the critical temperature, so in practice you want the critical temperature to be just barely below the minimum temperature of the cycle you’re using. Carbon Dioxide’s critical temperature of 31.1 Celsius is very good given typical Earth air temperatures. This paper considers the scenario where you have a solar thermal plant out in the desert, so the ambient temperature is considerably higher than normal and you want to increase the critical temperature of the working fluid. They suggest doing this by adding Perfluorobenzene. The problem with this approach is that it has to compete with the counterfactuals of pumping water underground for cooling or replacing the whole system with photovoltaics. It may be more promising to instead go in the opposite direction: If you have a power plant next to frigid arctic waters which stay near 0 Celsius year round you can lower the critical temperature to around 15 Celsius by adding in about 12% Argon. That’s a boring but low risk modification which is likely to result in a small improvement in efficiency, and any improvement in efficiency of a power plant is a big deal.


Bram Cohen
Working Fluids for Supercritical Turbines

To this day most power plants work by making a lot of heat and then converting the temperature difference between that heat and the surrounding environment into electricity. The most efficient heat engines are closed cycle supercritical turbines, which basically all use Carbon Dioxide as the working fluid. I’ve spent some time researching possible alternative working fluids and have come up with some interesting results.

The ideal working fluid would have all these properties: high temperature of decomposition, low corrosion, low critical point, high mass, high thermal conductivity, non-toxic, environmentally friendly, and cheap. That’s a lot of properties to get out of a single substance. Unsurprisingly Carbon Dioxide scores well on these, especially on decomposition temperature, corrosion, toxicity, and cost. For the others it’s good but not unbeatable. It’s the nature of chemistry that you can always imagine unobtainium with magical properties but in practice you have to pull from a fairly short menu of things which actually exist. Large organic molecules can start to feel more engineered but that isn’t relevant here because organic bonds nearly all decompose at the required temperatures.

There are all manner of fun things which in principle would work great but fail due to decomposition and corrosion. As much fun as it would be to have an excuse to make a literal ton of Tungsten Hexafluoride, it’s unfortunately disqualified. The very short list of things which are viable is: Carbon Dioxide, noble gases above Helium (which unfortunately leaches into and destroys everything), and short chain Perfluorocarbons. That last one is fancy talk for gaseous Teflon. I have no idea why, out of all organic bonds, those ones are special and can handle very high temperatures. As they get longer they have an increasing tendency to decompose, and given the different numbers from different sources I think we aren’t completely sure under what conditions perfluoropropane decomposes; anyone who is seriously considering it will have to run that experiment to find out.

With multiple dimensions of performance it isn’t obvious what should be optimized for when picking out a working fluid, so I’m going to guess that you want something with about the density of Carbon Dioxide and, within that limitation, as low a critical temperature and as high a thermal conductivity as possible (yes, that’s two things, but the way it works out they’re highly correlated so which one you pick doesn’t matter so much). The reasons for this are, first, that it would be nice to have something which could plausibly replace the working fluid in an existing turbine meant for Carbon Dioxide without a redesign, and second, that it may be hard to make a turbine which can physically handle something much denser than Carbon Dioxide anyway, which may be part of why people haven’t been eager to use something heavier.

To that end I’ve put together this interactive (which should probably be a spreadsheet) which shows how different potential working fluids fare. It turns out that there’s a tradeoff between high thermal conductivity and high mass, and using a mix of things which are good at either one does better than picking a single thing which is in the middle. The next to last column of this interactive shows a measure of the density of the gas when holding temperature and pressure constant, and the final column gives a measure of the thermal conductivity under those conditions. The units are a bit funny and I’m far from certain that the formulas used for the mixed values here are correct, but the results seem promising.
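As a concrete (and heavily simplified) version of the kind of calculation that interactive does, here is a short Python sketch that scores a blend by mole-fraction-weighted molar mass and thermal conductivity. The mixing rule and the conductivity numbers are my own rough room-temperature placeholders, not the post's formulas or data, so treat the output as illustrative only.

```python
# Toy scoring of candidate working-fluid mixes: mole-fraction-weighted molar mass
# and thermal conductivity. Real mixture-conductivity correlations (e.g. Wassiljewa's)
# are nonlinear; the k values below are rough room-temperature figures to be checked
# against a proper reference.
GASES = {
    #         molar mass (g/mol), thermal conductivity (W/m/K, approx.)
    "CO2":   (44.0,  0.017),
    "Ne":    (20.2,  0.049),
    "Kr":    (83.8,  0.009),
    "C2F6":  (138.0, 0.013),   # perfluoroethane; k is a rough estimate
}

def mix(fractions):
    """fractions: dict of gas name -> mole fraction (should sum to 1)."""
    mass = sum(f * GASES[g][0] for g, f in fractions.items())
    k    = sum(f * GASES[g][1] for g, f in fractions.items())
    return mass, k

# Example: pick the Ne/C2F6 blend whose average molar mass matches CO2's.
target = GASES["CO2"][0]
x = (target - GASES["Ne"][0]) / (GASES["C2F6"][0] - GASES["Ne"][0])
mass, k = mix({"C2F6": x, "Ne": 1 - x})
print(f"{x:.2f} C2F6 / {1 - x:.2f} Ne -> mass {mass:.1f} g/mol, k ~ {k:.3f} W/m/K")
```

Because the mixing rule here is linear, the printed conductivity won't exactly match the interactive's numbers; the point is only to show why a heavy, poorly conducting gas blended with a light, highly conducting one can beat either on its own.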

The increased mass benefits of longer chain Perfluorocarbons go down after Perfluoroethane, mostly because at that point when mixing with Neon it’s mostly Neon anyway. (With only two carbons it isn’t really a ‘chain’ at that point either.) That gives a thermal conductivity value of 0.040 as opposed to Carbon Dioxide’s 0.017, which is a huge difference. That mix has some cost and environmental impact concerns, but in a closed cycle system the gases are used for the life of the turbine, so they’re part of capital costs and can be disposed of properly afterwards; not a big deal.

The downside of that mix is that although it works great for the temperatures in nuclear plants and the secondary turbine of gas plants it might decompose at the much higher temperatures of the primary turbine of a gas plant. The decomposition problem is likely to be better with Carbon Tetrafluoride which knocks the value down to 0.037 but I’m not sure if even that’s stable enough and superheated elemental Fluorine is not something you want to have around. Going with pure noble gases will definitely completely eliminate decomposition and corrosion problems. Using a mix of Xenon and Neon has a value of 0.038 but probably isn’t worth it due to the ludicrous cost of Xenon. A mix of Krypton and Neon is still quite good with a value of 0.032 and beats Carbon Dioxide handily on all metrics except initial expense which still isn’t a big deal.

Avery Pennarun
Systems design 3: LLMs and the semantic revolution

Long ago in the 1990s when I was in high school, my chemistry+physics teacher pulled me aside. "Avery, you know how the Internet works, right? I have a question."

I now know the correct response to that was, "Does anyone really know how the Internet works?" But as a naive young high schooler I did not have that level of self-awareness. (Decades later, as a CEO, that's my answer to almost everything.)

Anyway, he asked his question, and it was simple but deep. How do they make all the computers connect?

We can't even get the world to agree on 60 Hz vs 50 Hz, 120V vs 240V, or which kind of physical power plug to use. Communications equipment uses way more frequencies, way more voltages, way more plug types. Phone companies managed to federate with each other, eventually, barely, but the ring tones were different everywhere, there was pulse dialing and tone dialing, and some of them still charge $3/minute for international long distance, and connections take a long time to establish and humans seem to be involved in suspiciously many places when things get messy, and every country has a different long-distance dialing standard and phone number format.

So Avery, he said, now they're telling me every computer in the world can connect to every other computer, in milliseconds, for free, between Canada and France and China and Russia. And they all use a single standardized address format, and then you just log in and transfer files and stuff? How? How did they make the whole world cooperate? And who?

When he asked that question, it was a formative moment in my life that I'll never forget, because as an early member of what would be the first Internet generation… I Had Simply Never Thought of That.

I mean, I had to stop and think for a second. Wait, is protocol standardization even a hard problem? Of course it is. Humans can't agree on anything. We can't agree on a unit of length or the size of a pint, or which side of the road to drive on. Humans in two regions of Europe no farther apart than Thunder Bay and Toronto can't understand each other's speech. But this Internet thing just, kinda, worked.

"There's… a layer on top," I uttered, unsatisfyingly. Nobody had taught me yet that the OSI stack model existed, let alone that it was at best a weak explanation of reality.

"When something doesn't talk to something else, someone makes an adapter. Uh, and some of the adapters are just programs rather than physical things. It's not like everyone in the world agrees. But as soon as one person makes an adapter, the two things come together."

I don't think he was impressed with my answer. Why would he be? Surely nothing so comprehensively connected could be engineered with no central architecture, by a loosely-knit cult of mostly-volunteers building an endless series of whimsical half-considered "adapters" in their basements and cramped university tech labs. Such a creation would be a monstrosity, just as likely to topple over as to barely function.

I didn't try to convince him, because honestly, how could I know? But the question has dominated my life ever since.

When things don't connect, why don't they connect? When they do, why? How? …and who?

Postel's Law

The closest clue I've found is this thing called Postel's Law, one of the foundational principles of the Internet. It was best stated by one of the founders of the Internet, Jon Postel. "Be conservative in what you send, and liberal in what you accept."

What it means to me is, if there's a standard, do your best to follow it, when you're sending. And when you're receiving, uh, assume the best intentions of your counterparty and do your best and if that doesn't work, guess.
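As a tiny, invented example of that asymmetry in code: a sender that only ever emits one canonical spelling of a flag, next to a receiver that accepts any spelling it can make sense of.

```python
# Conservative sender: one canonical spelling on the wire.
def emit_flag(value: bool) -> str:
    return "true" if value else "false"

# Liberal receiver: accept whatever spelling the counterparty plausibly meant.
TRUE_SPELLINGS = {"true", "yes", "y", "1", "on"}
FALSE_SPELLINGS = {"false", "no", "n", "0", "off"}

def parse_flag(value: str) -> bool:
    v = value.strip().lower()
    if v in TRUE_SPELLINGS:
        return True
    if v in FALSE_SPELLINGS:
        return False
    raise ValueError(f"could not interpret flag: {value!r}")

print(parse_flag(" YES "), parse_flag(emit_flag(False)))   # True False
```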

A rephrasing I use sometimes is, "It takes two to miscommunicate." Communication works best and most smoothly if you have a good listener and a clear speaker, sharing a language and context. But it can still bumble along successfully if you have a poor speaker with a great listener, or even a great speaker with a mediocre listener. Sometimes you have to say the same thing five ways before it gets across (wifi packet retransmits), or ask way too many clarifying questions, but if one side or the other is diligent enough, you can almost always make it work.

This asymmetry is key to all high-level communication. It makes network bugs much less severe. Without Postel's Law, triggering a bug in the sender would break the connection; so would triggering a bug in the receiver. With Postel's Law, we acknowledge from the start that there are always bugs and we have twice as many chances to work around them. Only if you trigger both sets of bugs at once is the flaw fatal.

…So okay, if you've used the Internet, you've probably observed that fatal connection errors are nevertheless pretty common. But that misses how incredibly much more common they would be in a non-Postel world. That world would be the one my physics teacher imagined, where nothing ever works and it all topples over.

And we know that's true because we've tried it. Science! Let us digress.

XML

We had the Internet ("OSI Layer 3") mostly figured out by the time my era began in the late 1900s, but higher layers of the stack still had work to do. It was the early days of the web. We had these newfangled hypertext ("HTML") browsers that would connect to a server, download some stuff, and then try their best to render it.

Web browsers are and have always been an epic instantiation of Postel's Law. From the very beginning, they assumed that the server (content author) had absolutely no clue what they were doing and did their best to apply some kind of meaning on top, despite every indication that this was a lost cause. List items that never end? Sure. Tags you've never heard of? Whatever. Forgot some semicolons in your javascript? I'll interpolate some. Partially overlapping italics and bold? Leave it to me. No indication what language or encoding the page is in? I'll just guess.
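To see just how liberal a browser-style parser is, here's a little sketch using Python's standard-library html.parser, which happily chews through unclosed, overlapping, and unknown tags without complaint (the markup is my own contrived example):

```python
from html.parser import HTMLParser

# A stripped-down "browser": pull out whatever text it can find, never complain.
class LenientExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

broken = "<ul><li>never closed <b>bold <i>overlap</b> italic</i> <mystery-tag>??"
parser = LenientExtractor()
parser.feed(broken)     # no exception, despite everything wrong with that markup
print(parser.chunks)    # ['never closed', 'bold', 'overlap', 'italic', '??']
```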

The evolution of browsers gives us some insight into why Postel's Law is a law and not just, you know, Postel's Advice. The answer is: competition. It works like this. If your browser interprets someone's mishmash subjectively better than another browser, your browser wins.

I think economists call this an iterated prisoner's dilemma. Over and over, people write web pages (defect) and browsers try to render them (defect) and absolutely nobody actually cares what the HTML standard says (stays loyal). Because if there's a popular page that's wrong and you render it "right" and it doesn't work? Straight to jail.

(By now almost all the evolutionary lines of browsers have been sent to jail, one by one, and the HTML standard is effectively whatever Chromium and Safari say it is. Sorry.)

This law offends engineers to the deepness of their soul. We went through a period where loyalists would run their pages through "validators" and proudly add a logo to the bottom of their page saying how valid their HTML was. Browsers, of course, didn't care and continued to try their best.

Another valiant effort was the definition of "quirks mode": a legacy rendering mode meant to document, normalize, and push aside all the legacy wonko interpretations of old web pages. It was paired with a new, standards-compliant rendering mode that everyone was supposed to agree on, starting from scratch with an actual written spec and tests this time, and public shaming if you made a browser that did it wrong. Of course, outside of browser academia, nobody cares about the public shaming and everyone cares if your browser can render the popular web sites, so there are still plenty of quirks outside quirks mode. It's better and it was well worth the effort, but it's not all the way there. It never can be.

We can be sure it's not all the way there because there was another exciting development, HTML Strict (and its fancier twin, XHTML), which was meant to be the same thing, but with a special feature. Instead of sending browsers to jail for rendering wrong pages wrong, we'd send page authors to jail for writing wrong pages!

To mark your web page as HTML Strict was a vote against the iterated prisoner's dilemma and Postel's Law. No, your vote said. No more. We cannot accept this madness. We are going to be Correct. I certify this page is correct. If it is not correct, you must sacrifice me, not all of society. My honour demands it.

Anyway, many page authors were thus sacrificed and now nobody uses HTML Strict. Nobody wants to do tech support for a web page that asks browsers to crash when parsing it, when you can just… not do that.

Excuse me, the above XML section didn't have any XML

Yes, I'm getting to that. (And you're soon going to appreciate that meta joke about schemas.)

In parallel with that dead branch of HTML, a bunch of people had realized that, more generally, HTML-like languages (technically SGML-like languages) had turned out to be a surprisingly effective way to build interconnected data systems.

In retrospect we now know that the reason for HTML's resilience is Postel's Law. It's simply easier to fudge your way through parsing incorrect hypertext, than to fudge your way through parsing a Microsoft Word or Excel file's hairball of binary OLE streams, which famously even Microsoft at one point lost the knowledge of how to parse. But, that Postel's Law connection wasn't really understood at the time.

Instead we had a different hypothesis: "separation of structure and content." Syntax and semantics. Writing software to deal with structure is repetitive overhead, and content is where the money is. Let's automate away the structure so you can spend your time on the content: semantics.

We can standardize the syntax with a single Extensible Markup Language (XML). Write your content, then "mark it up" by adding structure right in the doc, just like we did with plaintext human documents. Data, plus self-describing metadata, all in one place. Never write a parser again!

Of course, with 20/20 hindsight (or now 2025 hindsight), this is laughable. Yes, we now have XML parser libraries. If you've ever tried to use one, you will find they indeed produce parse trees automatically… if you're lucky. If you're not lucky, they produce a stream of "tokens" and leave it to you to figure out how to arrange it in a tree, for reasons involving streaming, performance, memory efficiency, and so on. Basically, if you use XML you now have to deeply care about structure, perhaps more than ever, but you also have to include some giant external parsing library that, left in its normal mode, might spontaneously start making a lot of uncached HTTP requests that can also exploit remote code execution vulnerabilities haha oops.

If you've ever taken a parser class, or even if you've just barely tried to write a parser, you'll know the truth: the value added by outsourcing parsing (or in some cases only tokenization) is not a lot. This is because almost all the trouble of document processing (or compiling) is the semantic layer, the part where you make sense of the parse tree. The part where you just read a stream of characters into a data structure is the trivial, well-understood first step.

Now, semantics is where it gets interesting. XML was all about separating syntax from semantics. And they did some pretty neat stuff with that separation, in a computer science sense. XML is neat because it's such a regular and strict language that you can completely validate the syntax (text and tags) without knowing what any of the tags mean or which tags are intended to be valid at all.

…aha! Did someone say validate?! Like those old HTML validators we talked about? Oh yes. Yes! And this time the validation will be completely strict and baked into every implementation from day 1. And, the language syntax itself will be so easy and consistent to validate (unlike SGML and HTML, which are, in all fairness, bananas) that nobody can possibly screw it up.
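Python's standard xml.etree.ElementTree makes a handy demonstration of that day-one strictness (my example, not from the post): one missing closing tag and the parse is refused outright, no Postel-style guessing.

```python
import xml.etree.ElementTree as ET

good = "<order><item qty='2'>widget</item></order>"
bad  = "<order><item qty='2'>widget</order>"          # missing </item>

print(ET.fromstring(good).find("item").text)          # widget

try:
    ET.fromstring(bad)
except ET.ParseError as err:
    print("straight to jail:", err)                   # mismatched tag
```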

A layer on top of this basic, highly validatable XML, was a thing called XML Schemas. These were documents (mysteriously not written in XML) that described which tags were allowed in which places in a certain kind of document. Not only could you parse and validate the basic XML syntax, you could also then validate its XML schema as a separate step, to be totally sure that every tag in the document was allowed where it was used, and present if it was required. And if not? Well, straight to jail. We all agreed on this, everyone. Day one. No exceptions. Every document validates. Straight to jail.

Anyway XML schema validation became an absolute farce. Just parsing or understanding, let alone writing, the awful schema file format is an unpleasant ordeal. To say nothing of complying with the schema, or (heaven forbid) obtaining a copy of someone's custom schema and loading it into the validator at the right time.

The core XML syntax validation was easy enough to do while parsing. Unfortunately, in a second violation of Postel's Law, almost no software that outputs XML runs it through a validator before sending. I mean, why would they, the language is highly regular and easy to generate and thus the output is already perfect. …Yeah, sure.

Anyway we all use JSON now.

JSON

Whoa, wait! I wasn't done!

This is the part where I note, for posterity's sake, that XML became a decade-long fad in the early 2000s that justified billions of dollars of software investment. None of XML's technical promises played out; it is a stain on the history of the computer industry. But, a lot of legacy software got un-stuck because of those billions of dollars, and so we did make progress.

What was that progress? Interconnection.

Before the Internet, we kinda didn't really need to interconnect software together. I mean, we sort of did, like cut-and-pasting between apps on Windows or macOS or X11, all of which were surprisingly difficult little mini-Postel's Law protocol adventures in their own right and remain quite useful when they work (except "paste formatted text," wtf are you people thinking). What makes cut-and-paste possible is top-down standards imposed by each operating system vendor.

If you want the same kind of thing on the open Internet, ie. the ability to "copy" information out of one server and "paste" it into another, you need some kind of standard. XML was a valiant effort to create one. It didn't work, but it was valiant.

Whereas all that money investment did work. Companies spent billions of dollars to update their servers to publish APIs that could serve not just human-formatted HTML, but also something machine-readable. The great innovation was not XML per se, it was serving data over HTTP that wasn't always HTML. That was a big step, and didn't become obvious until afterward.

The most common clients of HTTP were web browsers, and web browsers only knew how to parse two things: HTML and javascript. To a first approximation, valid XML is "valid" (please don't ask the validator) HTML, so we could do that at first, and there were some Microsoft extensions. Later, after a few billions of dollars, true standardized XML parsing arrived in browsers. Similarly, to a first approximation, valid JSON is valid javascript, which woo hoo, that's a story in itself (you could parse it with eval(), tee hee) but that's why we got here.

JSON (minus the rest of javascript) is a vastly simpler language than XML. It's easy to consistently parse (other than that pesky trailing comma); browsers already did. It represents only (a subset of) the data types normal programming languages already have, unlike XML's weird mishmash of single attributes, multiply occurring attributes, text content, and CDATA. It's obviously a tree and everyone knows how that tree will map into their favourite programming language. It inherently works with unicode and only unicode. You don't need cumbersome and duplicative "closing tags" that double the size of every node. And best of all, no guilt about skipping that overcomplicated and impossible-to-get-right schema validator, because, well, nobody liked schemas anyway so nobody added them to JSON (almost).
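The same strict-sender expectation shows up in JSON: Python's json module, like most parsers, accepts the clean form and rejects that pesky trailing comma outright. A quick illustration:

```python
import json

print(json.loads('{"name": "widget", "qty": 2}'))      # {'name': 'widget', 'qty': 2}

try:
    json.loads('{"name": "widget", "qty": 2,}')        # that pesky trailing comma
except json.JSONDecodeError as err:
    print("rejected:", err)
```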

Today, if you look at APIs you need to call, you can tell which ones were a result of the $billions invested in the 2000s, because it's all XML. And you can tell which came in the 2010s and later after learning some hard lessons, because it's all JSON. But either way, the big achievement is you can call them all from javascript. That's pretty good.

(Google is an interesting exception: they invented and used protobuf during the same time period because they disliked XML's inefficiency, they did like schemas, and they had the automated infrastructure to make schemas actually work (mostly, after more hard lessons). But it mostly didn't spread beyond Google… maybe because it's hard to do from javascript.)

Blockchain

The 2010s were another decade of massive multi-billion dollar tech investment. Once again it was triggered by an overwrought boondoggle technology, and once again we benefited from systems finally getting updated that really needed to be updated.

Let's leave aside cryptocurrencies (which although used primarily for crime, at least demonstrably have a functioning use case, ie. crime) and look at the more general form of the technology.

Blockchains in general make the promise of a "distributed ledger" which allows everyone the ability to make claims and then later validate other people's claims. The claims that "real" companies invested in were meant to be about manufacturing, shipping, assembly, purchases, invoices, receipts, ownership, and so on. What's the pattern? That's the stuff of businesses doing business with other businesses. In other words, data exchange. Data exchange is exactly what XML didn't really solve (although progress was made by virtue of the dollars invested) in the previous decade.

Blockchain tech was a more spectacular boondoggle than XML for a few reasons. First, it didn't even have a purpose you could explain. Why do we even need a purely distributed system for this? Why can't we just trust a third party auditor? Who even wants their entire supply chain (including number of widgets produced and where each one is right now) to be visible to the whole world? What is the problem we're trying to solve with that?

…and you know there really was no purpose, because after all the huge investment to rewrite all that stuff, which was itself valuable work, we simply dropped the useless blockchain part and then we were fine. I don't think even the people working on it felt like they needed a real distributed ledger. They just needed an updated ledger and a budget to create one. If you make the "ledger" module pluggable in your big fancy supply chain system, you can later drop out the useless "distributed" ledger and use a regular old ledger. The protocols, the partnerships, the databases, the supply chain, and all the rest can stay the same.

In XML's defense, at least it was not worth the effort to rip out once the world came to its senses.

Another interesting similarity between XML and blockchains was the computer science appeal. A particular kind of person gets very excited about validation and verifiability. Both times, the whole computer industry followed those people down into the pits of despair and when we finally emerged… still no validation, still no verifiability, still didn't matter. Just some computers communicating with each other a little better than they did before.

LLMs

In the 2020s, our industry fad is LLMs. I'm going to draw some comparisons here to the last two fads, but there are some big differences too.

One similarity is the computer science appeal: so much math! Just the matrix sizes alone are a technological marvel the likes of which we have never seen. Beautiful. Colossal. Monumental. An inspiration to nerds everywhere.

But a big difference is verification and validation. If there is one thing LLMs absolutely are not, it's verifiable. LLMs are the flakiest thing the computer industry has ever produced! So far. And remember, this is the industry that brought you HTML rendering.

LLMs are an almost cartoonishly amplified realization of Postel's Law. They write human grammar perfectly, or almost perfectly, or when they're not perfect it's a bug and we train them harder. And, they can receive just about any kind of gibberish and turn it into a data structure. In other words, they're conservative in what they send and liberal in what they accept.

LLMs also solve the syntax problem, in the sense that they can figure out how to transliterate (convert) basically any file syntax into any other. Modulo flakiness. But if you need a CSV in the form of a limerick or a quarterly financial report formatted as a mysql dump, sure, no problem, make it so.

In theory we already had syntax solved though. XML and JSON did that already. We were even making progress interconnecting old school company supply chain stuff the hard way, thanks to our nominally XML- and blockchain- investment decades. We had to do every interconnection by hand – by writing an adapter – but we could do it.

What's really new is that LLMs address semantics. Semantics are the biggest remaining challenge in connecting one system to another. If XML solved syntax, that was the first 10%. Semantics are the last 90%. When I want to copy from one database to another, how do I map the fields? When I want to scrape a series of uncooperative web pages and turn it into a table of products and prices, how do I turn that HTML into something structured? (Predictably microformats, aka schemas, did not work out.) If I want to query a database (or join a few disparate databases!) using some language that isn't SQL, what options do I have?

LLMs can do it all.
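For a flavor of what addressing semantics looks like in practice, here is a hypothetical sketch: call_llm is a stand-in for whichever model client you use, and the prompt asks the model to do the one part no schema ever solved, deciding which source column means which target field.

```python
import json

def call_llm(prompt: str) -> str:
    """Stand-in for whatever LLM client you actually use (hosted API or local model).
    Assumed to return the model's text completion for `prompt`."""
    raise NotImplementedError("wire up your model of choice here")

def map_fields(source_columns, target_fields):
    # The semantic step: which source column corresponds to which target field?
    prompt = (
        "Map these source columns onto the target fields. "
        "Reply with only a JSON object of the form {target_field: source_column or null}.\n"
        f"Source columns: {source_columns}\n"
        f"Target fields: {target_fields}\n"
    )
    return json.loads(call_llm(prompt))

# e.g. map_fields(["cust_nm", "amt_due", "zipcode"],
#                 ["customer_name", "balance", "postal_code"])
```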

Listen, we can argue forever about whether LLMs "understand" things, or will achieve anything we might call intelligence, or will take over the world and eradicate all humans, or are useful assistants, or just produce lots of text sludge that will certainly clog up the web and social media, or will also be able to filter the sludge, or what it means for capitalism that we willingly invented a machine we pay to produce sludge that we also pay to remove the sludge.

But what we can't argue is that LLMs interconnect things. Anything. To anything. Whether you like it or not. Whether it's bug free or not (spoiler: it's not). Whether it gets the right answer or not (spoiler: erm…).

This is the thing we have gone through at least two decades of hype cycles desperately chasing. (Three, if you count java "write once run anywhere" in the 1990s.) It's application-layer interconnection, the holy grail of the Internet.

And this time, it actually works! (mostly)

The curse of success

LLMs aren't going away. Really we should coin a term for this use case, call it "b2b AI" or something. For this use case, LLMs work. And they're still getting better and the precision will improve with practice. For example, imagine asking an LLM to write a data translator in some conventional programming language, instead of asking it to directly translate a dataset on its own. We're still at the beginning.

But, this use case, which I predict is the big one, isn't what we expected. We expected LLMs to write poetry or give strategic advice or whatever. We didn't expect them to call APIs and immediately turn around and use what they learned to call other APIs.

After 30 years of trying and failing to connect one system to another, we now have a literal universal translator. Plug it into any two things and it'll just go, for better or worse, no matter how confused it becomes. And everyone is doing it, fast, often with a corporate mandate to do it even faster.

This kind of scale and speed of (successful!) rollout is unprecedented, even by the Internet itself, and especially in the glacially slow world of enterprise system interconnections, where progress grinds to a halt once a decade only to be finally dislodged by the next misguided technology wave. Nobody was prepared for it, so nobody was prepared for the consequences.

One of the odd features of Postel's Law is it's irresistible. Big Central Infrastructure projects rise and fall with funding, but Postel's Law projects are powered by love. A little here, a little there, over time. One more person plugging one more thing into one more other thing. We did it once with the Internet, overcoming all the incompatibilities at OSI layers 1 and 2. It subsumed, it is still subsuming, everything.

Now we're doing it again at the application layer, the information layer. And just like we found out when we connected all the computers together the first time, naively hyperconnected networks make it easy for bad actors to spread and disrupt at superhuman speeds. We had to invent firewalls, NATs, TLS, authentication systems, two-factor authentication systems, phishing-resistant two-factor authentication systems, methodical software patching, CVE tracking, sandboxing, antivirus systems, EDR systems, DLP systems, everything. We'll have to do it all again, but faster and different.

Because this time, it's all software.

bkuhn@ebb.org (Bradley M. Kuhn)
Managing Diabetes in Software Freedom

[ The below is a cross-post of an article that I published on my blog at Software Freedom Conservancy. ]

Our member project representatives and others who collaborate with SFC on projects know that I've been on part-time medical leave this year. As I recently announced publicly on the Fediverse, I was diagnosed in March 2025 with early-stage Type 2 Diabetes. I had no idea that the diagnosis would become a software freedom and users' rights endeavor.

After the diagnosis, my doctor suggested immediately that I see the diabetes nurse-practitioner specialist in their practice. It took some time to get an appointment with him, so I saw him first in mid-April 2025.

I walked into the office, sat down, and within minutes the specialist asked me to “take out your phone and install the Freestyle Libre app from Abbott”. This is the first (but, will probably not be the only) time a medical practitioner asked me to install proprietary software as the first step of treatment.

The specialist told me that in his experience, even early-stage diabetics like me should use a Continuous Glucose Monitor (CGM). CGMs are an amazing, relatively recent invention that allows diabetics to sample their blood sugar level constantly. As we software developers and engineers know: great things happen when your diagnostic readout is as low latency as possible. CGMs lower the latency of readouts from 3–4 times a day to every five minutes. For example, diabetics can see what foods are most likely to cause blood sugar spikes for them personally. CGMs put patients on a path to manage this chronic condition well.

But, the devices themselves, and the (default) apps that control them, are hopelessly proprietary. Fortunately, this was (obviously) not my first time explaining FOSS from first principles. So, I read through the license and terms and conditions of the ironically named “Freestyle Libre” app, and pointed out to the specialist how patient-unfriendly the terms were. For example, Abbott (the manufacturer of my CGM) reserves the right to collect your data (anonymously of course, to “improve the product”). They also require patients to agree that if they take any action to reverse engineer, modify, or otherwise do the normal things our community does with software, the patient must agree that such actions “constitute immediate, irreparable harm to Abbott, its affiliates, and/or its licensors”. I briefly explained to the specialist that I could not possibly agree, and, still sitting with him, began a real-time search for a FOSS solution.

As I was searching, the specialist said: “Oh, I don't use any of it myself, but I think I've heard of this ‘open source’ thing — there is a program called xDrip+ that is for insulin-dependent diabetics that I've heard of and some patients report it is quite good”.

While I'm (luckily) very far from insulin-dependency, I eventually found the FOSS Android app called Juggluco (a portmanteau for “Juggle glucose”). I asked the specialist to give me the prescription and I'd try Juggluco to see if it would work.

CGMs are very small and their firmware is (by obvious necessity) quite simple. As such, their interfaces are standard. CGMs are activated with Near Field Communication (NFC), which is available on even quite old Android devices. The Android device sends a simple integer identifier via NFC that activates the CGM. Once activated, and through the 15-day life of the device, the CGM responds via Bluetooth with the patient's current glucose reading to any device presenting that integer.

Fortunately, I quickly discovered that the FOSS community was already “on this”. The NFC activation worked just fine, even on the recently updated “Freestyle Libre 3+”. After the sixty minute calibration period, I had a continuous readout in Juggluco.

CGMs' lower-latency feedback enables diabetics to have more control of their illness management. One example among many: the patient can see (in real time) what foods most often cause blood sugar spikes for them personally. Diabetes hits everyone differently; data allows everyone to manage their own chronic condition better.

My personal story with Juggluco will continue — as I hope (although not until after FOSDEM 2026 😆) to become an upstream contributor to Juggluco. Most importantly, I hope to help the app appear in F-Droid. (I must currently side-load or use Aurora Store to make it work on LineageOS.)

Fitting with the history that many projects which interact with proprietary technology must so often live through, Juggluco has faced surreptitious removal from Google's Play Store. Abbott even accused Juggluco of using their proprietary libraries and encryption methods, but the so-called “encryption method” is literally sending a single integer as part of NFC activation.

While Abbott backed off, this is another example of why the movement of patients taking control of the technology remains essential. FOSS fits perfectly with this goal. Software freedom gives control of technology to those who actually rely on it — rather than for-profit medical equipment manufacturers.

When I returned to my specialist for a follow-up, we reviewed the data and graphs that I produced with Juggluco. I, of course, have never installed, used, or even agreed to Abbott's licenses and terms, so I have never seen what the Abbott app does. I was thus surprised when I showed my specialist Juggluco's summary graphs. He excitedly told me “this is much better reporting than the Abbott app gives you!”. We all know that sometimes proprietary software has better and more features than the FOSS equivalent, so it's a particularly great success when our community's efforts outdo a wealthy 200-billion-dollar megacorp on software features!


Please do watch SFC's site in 2026 for more posts about my ongoing work with Juggluco, and please give generously as an SFC Sustainer to help this and our other work continue in 2026!

Tom (Jon Lund Steffensen)
Cheap internet: 7 ways to get a lower price without losing speed

Map out your usage so you don't pay for more than you need

The goal of cheap internet is not just a low price, but the right price for the speed you actually use. Start by mapping out your household's usage: How many devices are online at the same time? Do you stream in 4K? Is there gaming, working from home with video meetings, or uploading of large files? For many, 100–200 Mbit/s is plenty for stable streaming, gaming, and everyday use, even for a family with several devices. If you live alone or as a couple with light use, 50–100 Mbit/s may be enough. Conversely, gigabit makes the most sense if you often download or upload a lot of data, or if you want to future-proof the connection.

Also test your current network: It is often the WiFi coverage, not the connection, that limits the speed. Optimize the router's placement, update its firmware, and use a cable for stationary devices to get the full capacity of your subscription. Then match your actual needs to a subscription one price tier down; the savings are typically noticeable without any difference in everyday use.

How to assess your needs

– Check simultaneous users and activities (4K, gaming, video meetings)
– Measure actual speed on cable vs. WiFi
– Identify peak load times
– Aim for the nearest lower speed that covers your usage pattern

Make smart use of promotional offers and intro discounts

Promotions and intro discounts are the key to cheap internet. Look out for offers such as free setup, half price for 3–6 months, or an included router. Several providers run periods with 1000/1000 Mbit at 99–129 kr./month during the intro, but always calculate the total price over 6–12 months, and use e.g. Speedtest.dk to confirm that the promised speed matches your needs. Be aware that the price rises after the intro period; set a calendar reminder to negotiate or switch before the binding period expires.

Also check whether equipment is included and whether there are return requirements or fees. Promotions from e.g. Hiper, Altibox, and Fastspeed can be very aggressive in urban areas, while 5G offers are often strong where fiber is not widespread. Feel free to combine promotions with discount codes, housing association deals, or bundle discounts (e.g. mobile + broadband). Choose the best balance between intro price, normal price, binding period, and cancellation terms, and don't be afraid to ask customer service whether they can "find a little extra" off the price if you order today.

Calculate the total price

– Include setup, equipment, shipping, and any fees
– Calculate the average price over the binding period (see the sketch below)
– Note the normal price after the intro, and plan your next step
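As a small illustration of that calculation, here is a Python sketch of the average monthly price over a period; the 299 kr. normal price in the example is invented, only the 129 kr. intro price and 99 kr. setup fee echo figures mentioned in the text.

```python
# Average monthly cost over a period, so intro offers can be compared fairly.
def average_monthly_cost(intro_price, intro_months, normal_price, total_months,
                         setup_fee=0.0, equipment_fee=0.0):
    intro_months = min(intro_months, total_months)
    total = (intro_price * intro_months
             + normal_price * (total_months - intro_months)
             + setup_fee + equipment_fee)
    return total / total_months

# Example: 129 kr./month for 6 months, then 299 kr./month (invented normal price),
# with a 99 kr. setup fee, averaged over 12 months.
print(round(average_monthly_cost(129, 6, 299, 12, setup_fee=99), 2))   # 222.25
```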

Compare providers and technologies systematically

The best weapon for cheap internet is a structured comparison. Start with an address search on a comparison service (e.g. Samlino.dk or Tjekbredbaand.dk) to see the actual options at your address. Compare not only speed and price, but also total costs, binding period, delivery time, and any fees. Check whether the price includes a router and whether you are allowed to use your own.

Investigate the alternatives: In some areas COAX (via the TV socket) delivers the same practical experience as fiber at a lower price. 5G broadband can match fiber in both speed and price, especially where fiber is missing. Wired DSL can be the cheapest but is often slower and depends on the distance to the exchange. Also assess stability and latency (ping), especially if you game or have many video meetings.

Remember to look at customer service and operational track record; a low price is only good if the connection is stable and help is easy to get.

Checklist for a fair comparison

– Total price over 6–12 months incl. all fees
– Actual coverage and speed level at your address
– Binding period, notice period, and price after the intro
– Equipment (buy/rent), using your own equipment, and delivery terms

Avoid extra fees and expensive lock-in traps

Even a "cheap internet" offer can become expensive if the fees run wild. Always look for setup fees, technician visits, shipping, and charges for renting a router or WiFi extenders. Ask about the return rules for equipment; failing to return it can cost you. Also check special surcharges for things like a static IP, extra TV or telephony modules, and moving the connection.

Binding periods can be fine if the intro price is low enough, but calculate the average price over the binding period. Avoid long binding periods if the normal price is high or rises quickly. Pay attention to fair-use policies on 5G: "unlimited" can in practice mean throttling after a certain amount of usage. For COAX and fiber there may be "signal transport" or "network access" fees, which should be included in the total price.

Ask sharp questions before you order, and get the answers in writing. That way you avoid surprises and can more easily negotiate the price if something doesn't add up.

Hidden costs to clarify

– Setup, technician, shipping, and return fees
– Router and equipment rental plus purchase options
– Moving/termination fees and special agreements
– Fair-use terms, data caps, and speed after the cap

Switch providers regularly to keep the price down

The broadband market changes monthly. When the intro period or binding period expires, there is rarely an automatic loyalty discount, so switching provider is often the easiest way to keep your internet cheap. Take advantage of welcome discounts again by switching to a new provider or a different technology (e.g. from fiber to 5G) if the coverage and terms fit.

Plan the switch to avoid downtime. Order the new connection with a few days of overlap, test that everything works, and only then cancel the old one. Ask the new provider whether they can handle the cancellation for you (often possible with fiber/COAX). Remember to return equipment on time to avoid fees. Set an annual calendar reminder to check prices and promotions; it takes 10–15 minutes and can save you hundreds of kroner a year.

A frictionless switch in 4 steps

– Check coverage and the best promotions
– Order the new solution with a short overlap
– Test and move devices/SSIDs where possible
– Cancel the old agreement and return the equipment

Choose the right technology: fiber, COAX, 5G, or DSL

There are several paths to cheap internet, and the right technology depends on your address and your usage. Fiber delivers high, stable, and often symmetric speeds with low latency, ideal for working from home, gaming, and heavy uploads. Prices have become competitive, especially where several providers share the same network. In urban areas, COAX via the TV socket can offer very high download speeds at a lower price, but upload is often lower than on fiber.

5G broadband has become a strong alternative, especially where fiber is missing. Promotional offers of 500–950 Mbit/s at low monthly prices are common, and setup is simple with no technician needed. Be aware, though, of indoor coverage and any fair-use limitations. DSL over copper is typically the cheapest to set up, but the speed depends on the distance to the exchange and is rarely competitive unless your needs are very modest.

Choose the fastest and most stable option you can get for the price in your area; often COAX or fiber in the city and 5G or fiber in the suburbs or countryside.

What suits your home best?

– Apartment in the city: Try COAX or fiber at a promotional price
– Detached house with fiber in the street: Choose fiber for stability
– Rural address without fiber: Consider 5G with an external antenna
– Light use and short commitment: 5G/DSL as a temporary solution

Negotiate the price: use loyalty and the threat of switching

A single call can be the difference between the list price and cheap internet. Call your provider before the binding period expires and ask for better terms, referring to specific offers from competitors. Be polite but firm: Ask whether they can match the price, waive the setup fee, include a router, or upgrade the speed at no extra cost. If you have a TV package or mobile subscription with the same provider, ask for a bundle discount.

If the first customer service agent can't help, ask to be transferred to the retention department. Get the offer in writing and be ready to go through with a switch if the price doesn't come down. That often triggers the final "best offer". Remember that your strongest negotiating position is when you genuinely have an alternative that covers your needs at a lower total price.

Arguments that work

– A specific, cheaper promotion from a competitor
– A request for a shorter binding period and lower setup fee
– A documented, stable payment history
– A bundle discount for multiple products in the same household

More practical tips for stable and cheap internet

Cheap internet also has to be good internet. Check providers' customer satisfaction on e.g. Trustpilot to avoid savings that cost you time and frustration. Check whether the provider offers unlimited usage without hidden speed caps, as well as the option to use your own router. Optimize your home network: Use cables for stationary devices, choose 5 GHz for higher speed close to the router, and place the router centrally and in the open.

Keep an eye on the very lowest promotional prices in the market: In 2025, welcome offers of 99–129 kr./month for 1000 Mbit fiber or fast 5G solutions are common, but always calculate the average over the binding period. Read the terms for moving and cancellation, especially if you plan to switch providers regularly. Finally, put fixed routines in your calendar: an annual price check, a WiFi review every six months, and a quick speed test whenever something feels sluggish. That makes it easy to keep both price and quality at their best without hassle.

Quick checks before ordering

– Coverage and actual speeds at the address
– Total price incl. setup and equipment
– Binding period, terms, and fair-use
– Customer service and delivery speed

Bram Cohen
Methylfolate

There’s a nutrient called folate which is so important that it’s added to (fortified in) flour as a matter of course. Not having it during pregnancy results in birth defects. Unfortunately there’s a small fraction of all people who have genetic issues which make their bodies have trouble processing folate into methylfolate. For them folate supplements make the problem even worse, as the unprocessed folate drowns out the small amounts of methylfolate their bodies have managed to produce and are trying to use. For those people, taking methylfolate supplements fixes the problem.

First of all, the very good news: folinic acid produces miraculous effects for some number of people with autism symptoms. It’s such a robust effect that the FDA is letting treatment get fast-tracked, which is downright out of character for them. This is clearly a good thing and I’m happy for anyone who’s benefiting and applaud anyone who is trying to promote it, with one caveat.

The caveat is that although this is all a very good thing there isn’t much of any reason to believe that folinic acid is much better than methylfolate, which both it and folate get changed into in the digestive system. This results in folinic acid being sold as leucovorin, its drug name, at an unnecessarily large price markup with unnecessary involvement of medical professionals. Obviously there’s benefit to medical professionals being involved in initial diagnosis and working out a treatment plan, but once that’s worked out there isn’t much reason to think the patient needs to be getting a drug rather than a supplement for the rest of their life.

This is not to say that the medical professionals studying folinic acid for this use are doing anything particularly wrong. There’s a spectrum between doing whatever is necessary to get funding and approvals while working within the existing medical system and simply profiteering off things being done badly instead of improving on them. What’s being done with folinic acid is slightly suboptimal but clearly gets stuff done with an only slightly more expensive solution (it’s thankfully already generic). Medical industry professionals who earnestly thought they were doing the right thing working within the system have given me descriptions of what they’re doing which made me want to take a shower afterwards. This isn’t anything like that. Those mostly involved ‘improving’ on a treatment which is expensive and known to be useless by offering a marginally less useless but less expensive intervention, conveniently at a much higher markup. Maybe selling literal snake oil at a lower price can help people waste less money but it sure looks like profiteering.

The real problem with folate is that fortification should be done with methylfolate instead of folate. People having the folate issue is a known thing, and the recent developments mostly indicate that a lot more people have it than was previously known. It may be that a lot of people who think they have a gluten problem actually have a folate problem. There would be little downside to switching over, but I fear that people have tried to suggest this and there’s a combination of no money in it and the FDA playing its usual games of claiming that folate is so important that doing a study of whether methylfolate is better would be unethical because it might harm the study participants.

There’s a widespread claim that the dosage of methylfolate isn’t as high as folinic acid, which has a kernel of truth because the standard sizes are different, but you can buy 15mg pills of methylfolate off of Amazon for about the same price as the 1mg pills. There are other claims of different formulations having different effects which are likely mostly due to dosage differences. The amounts of folinic acid being given to people are whopping huge, and some formulations only have one isomer, which throws things off by a factor of 2 on top of the amount difference. My guess is that most people who notice any difference between folinic acid and methylfolate are experiencing (if it’s real) differences between non-equivalent dosages, and normalizing would get rid of the effect. This is a common and maddening problem when people compare similar drugs (or in this case nutrients) where the dosages aren’t normalized to be equivalent, leading people to think the drugs have different effects when for practical purposes they don’t.

Posted
Bram Cohen
Future Chia Pooling Protocol Enhancements

At Chia we aspire to have plans for projects put together well in advance. Unfortunately, because it was needed the minute we launched, we had to scramble to get the original pooling protocol out. Since then we haven’t had an immediate compelling need or the available resources to work on a new revision. On the plus side this means that we can plan out what to do in the future, and this post is thoughts on that. There will also have to be some enhancements to the pool protocol to support the upcoming hard fork, including supporting the new proof of space format and doing a better job of negotiating each farmer’s difficulty threshold, but those are much less ambitious than the enhancements discussed here and can be rolled out independently.

With the Chia pooling protocol you currently have to make a choice up front: Do you start plotting immediately with no requirement to do anything on chain, or do you get a singleton set up so you can join pools later? As a practical matter right now it’s a no-brainer to set up the singleton: It only takes a few minutes and transaction fees are extremely low. But fees might be much higher in the future and people may want greater flexibility, so it would be good to have a protocol which allows both.

‘Chia pooling protocol’ is composed of several things: The consensus-level hook for specifying a puzzle hash which (most of) the farming rewards go to, the puzzles which are encoded for that hook, and the network protocol spoken between farmers and pools. The consensus layer hook isn’t going to be changed, because the Chia way (really the Bitcoin way but Chia has more functionality) is to work based off extremely simple primitives and build everything at a higher layer.

The way the current pooling protocol works is that the payout puzzle for plots is a pay to singleton for the singleton which the farmer made up front. This can then be put in a state where its rewards are temporarily but revocably delegated to a pool. One thing which could be improved, and is one step further removed from this, is that the delegation currently pays out to a public key owned by the pool. It would be more flexible for it to be a pay to singleton owned by the pool. That would allow pools to temporarily do profit sharing and for ownership of a pool to be properly transferred. This is an idea we’ve had for a while but also aren’t working on yet.

Anyway, on to the new idea. What’s needed is the ability to pre-specify a singleton to pay to when that singleton doesn’t exist yet. This can be done with a modification of Yakuhito’s trick for single issuance. That characterization of the trick is about reserving words, where what’s needed here is reserving public keys and getting singletons issued. What’s needed is a doubly linked list of nodes, each represented by a coin and all carrying the capability that they came from the original issuance. Each node knows the public keys of the previous and next nodes but isn’t committed to their whole identities, because those can change as new issuances happen. Whenever a new public key is claimed a new node corresponding to that public key is issued, and the nodes before and after it are spent and reissued with that new coin as their neighbor. The most elegant way of implementing this is for there to be a singleton pre-launcher which enforces the logic of coming from a proper issuer and makes a singleton. That way the slightly more complex version of pay to singleton specifies the pre-launcher puzzle hash and needs to be given a reveal of a bit more info to verify that, but that’s only a modest increase in costs and is immaterial when you’re claiming a farming reward. This approach nicely hides all the complex validation logic behind the pre-launcher puzzle hash and only has to run it once on issuance, keeping the verification logic on payment to a minimum.
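Here is a minimal sketch, in plain Python rather than Chialisp, of the linked-list bookkeeping described above. The Node class, the claim_key helper, and the choice to keep keys sorted (so a duplicate claim can be rejected) are my own illustrative assumptions; on chain each node would be a coin, and ‘reissuing’ a neighbor means spending it and recreating it with updated pointers.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        pubkey: bytes
        prev: Optional["Node"] = None
        next: Optional["Node"] = None

    def claim_key(head: Node, pubkey: bytes) -> Node:
        """Insert a newly claimed public key, keeping keys sorted and unique.

        The two neighbors are 'spent and reissued' with the new node between
        them, which is what makes each key claimable exactly once."""
        cur = head
        while cur.next is not None and cur.next.pubkey < pubkey:
            cur = cur.next
        if cur.next is not None and cur.next.pubkey == pubkey:
            raise ValueError("public key already claimed")
        new = Node(pubkey, prev=cur, next=cur.next)
        if cur.next is not None:
            cur.next.prev = new   # reissue the right-hand neighbor
        cur.next = new            # reissue the left-hand neighbor
        return new

    genesis = Node(pubkey=b"")        # sentinel representing the original issuance
    claim_key(genesis, b"\x02" * 48)  # issue a node (and its singleton) for this key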

Posted
Bram Cohen
Sweet Timbres and Audio Compression

I’ve made some sweet-sounding synth sounds which play some games with the harmonic series to sound more consonant than is normally possible. You can download them and use them with a MIDI controller yourself.

The psychoacoustic observation that the human brain will accept a series of tones as one note if they correspond to the harmonic series all exponentiated by the same amount seems to be original. The way the intervals are still recognizably the same even with a very different series of overtones still shocks me. The trick where harmonics are snapped to standard 12 tone positions is much more obvious but I haven’t seen it done before, and I’m still surprised that doing just that makes the tritone consonant.

There are several other tricks I used which are probably more well known but one in particular seems to have deeper implications for psychoacoustics in general and audio compression in particular.

It is a fact that the human ear can’t hear the phase of a sound. But we can hear an indirect effect of it, in that we can hear the beating between two close-together sine waves because it’s on a longer timescale, perceiving it as modulation in volume. In some sense this is literally true because sin(a) + sin(b) = 2*sin((a+b)/2)*cos((a-b)/2) is an identity, but when generalizing to more waves the simplification that the human ear perceives sounds within a narrow band as a single pitch with a single volume still seems to apply.
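Here’s a quick numerical check of that identity with numpy (the sample rate and the two frequencies are arbitrary choices): the sum of two nearby sine waves really is a single wave at the average frequency whose amplitude is modulated at half the difference frequency.

    import numpy as np

    t = np.linspace(0, 1, 48000, endpoint=False)   # one second at 48 kHz
    f1, f2 = 440.0, 443.0                          # 3 Hz apart, so audible beating
    two_waves = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

    carrier  = np.sin(2 * np.pi * (f1 + f2) / 2 * t)      # the perceived pitch
    envelope = 2 * np.cos(2 * np.pi * (f1 - f2) / 2 * t)  # the perceived volume wobble

    assert np.allclose(two_waves, envelope * carrier)     # identical, sample for sample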

To anyone familiar with compression algorithms an inability to differentiate between different things sets off a giant alarm bell that compression is possible. I haven’t fully validated that this really is a human cognitive limitation. So far I’ve just used it as a trick to make beatless harmonics by modulating the frequency and not the volume. Further work would need to use it to do a good job of lossily reproducing an exact, arbitrary sound rather than just emulating the vibe of general fuzz. It would also need to account for some structural weirdness, most obviously that if you have a single tone whose pitch is being modulated within each window of pitches you need to do something about one of them wandering into a neighboring window. But the fundamental observation that phase can’t be heard, and hence for a single sine wave that information could be thrown out, is clearly true, and it does appear that as the complexity goes up the amount of data which could in principle be thrown out goes up in proportion rather than being a fixed single value.

I am not going to go down the rabbit hole of fleshing this out to make a better lossy audio compression algorithm than currently exists. But in principle it should be possible to use it to get a massive improvement over the current state of the art.

Posted
Daniel Bernstein
MODPOD
The collapse of IETF's protections for dissent. #ietf #objections #censorship #hybrids
Posted
Daniel Bernstein
NSA and IETF
Can an attacker simply purchase standardization of weakened cryptography? #pqcrypto #hybrids #nsa #ietf #antitrust
Posted
Bram Cohen
How Claude Web is Broken

Before getting into today’s thought I’d like to invite you to check out my new puzzle, with 3d printing files here. I meant to post my old puzzle called One Hole, which is the direct ancestor of the current constrained packing puzzle craze but which I was never happy with because it’s so ridiculously difficult. Rather than just taking a few minutes to post it (ha!), I wound up doing further analysis to see if it has other solutions from rotation (it doesn’t, at least not in the most likely way), then further analyzing the space of related puzzles in search of something more mechanically elegant and less ridiculously difficult. I wound up coming up with this, then made it have a nice cage with windows and put decorations on the pieces so you can see what you’re doing. It has some notable new mechanical ideas and is less ridiculously difficult. Emphasis on the ‘less’. Anyhow, now on to the meat of this post.

I was talking to Claude the other day and it explained to me the API it uses for editing artifacts. Its ability to articulate this seems to be new in Sonnet 4.5 but I’m not sure of that. Amusingly it doesn’t know until you tell it that it needs to quote < and > and accidentally runs commands while trying to explain them. Also there’s a funny jailbreak around talking about its internals. It will say that there’s a ‘thinking’ command which it was told not to use, and when you say you wonder what it does it will go ahead and try it.

The particular command I’d like to talk about is ‘update’, which is what it uses for changing an artifact. The API is that it takes an old_str which appears somewhere in the file and needs to be removed, and a new_str which is what it should be replaced with. Claude is unaware of the UX for this: the user sees the old text removed on screen in real time as old_str is streamed out, and the new text added in real time as new_str is streamed out. I’m not sure what the motivations for this API are but this UX is nice. A more human way to implement an API would be to specify locations by line and character number for where the beginning and end of the deletion should go. It’s remarkable that Claude can use this API at all. A human would struggle to use it to edit a single line of code, but Claude can spit out dozens of lines verbatim and have it work most of the time with no ability to reread the file.

It turns out one of Claude’s more maddening failure modes is less a problem with its brain than with some particularly bad old school human programming. You might wonder what happens when old_str doesn’t match anything in the file. So does Claude: when asked about it, it offers to run the experiment and then just… does. This feels very weird, like you can get it to violate one of the laws of robotics just by asking nicely. It turns out that when old_str doesn’t match anywhere in the file the message Claude gets back is still OK, with no hint that there was an error.

Heavy Claude users are probably facepalming reading this. Claude will sometimes get into a mode where it will insist it’s making changes and they have no effect, and once it starts doing this the problem often persists. It turns out that when it gets into this state it is in fact malfunctioning (because it’s failing to reproduce dozens of lines of code typo-free and verbatim from memory) but it can’t recover because it literally isn’t being told that it’s malfunctioning.

The semantics of old_str which Claude is given in its instructions are that it must be unique in the file. It turns out this isn’t strictly true. If there are multiple instances the first one is updated. But the instructions get Claude to generally provide enough context to disambiguate.

The way to improve this is very simple. When old_str isn’t there it should get an error message instead of OK. But on top of that there’s the problem that Claude has no way to re-read the file, so the error message should include the entire artifact verbatim to make Claude re-read it when the error occurs. If that were happening then it could tell the user that it made a typo and needs to try again, and usually succeed now that its image of the file has been corrected. That’s assuming the problem isn’t a persistent hallucination; if it is, it might just do the same thing again. But any behavior where it acknowledges an error would be better than the current situation, where it’s getting the chair yanked out from under it by its own developers.
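A minimal sketch of that fix (the function name and message format are made up; this is not Anthropic’s actual backend): when old_str doesn’t match, return an error that carries the full artifact so the model can re-read it and retry.

    def apply_update(artifact: str, old_str: str, new_str: str) -> tuple[str, str]:
        """Returns (new_artifact, message_sent_back_to_the_model)."""
        if old_str not in artifact:
            # Instead of the current silent "OK", surface the failure and let the
            # model refresh its (purely remembered) image of the file.
            return artifact, "ERROR: old_str not found. Current artifact:\n" + artifact
        # Observed behavior today: only the first occurrence is replaced.
        return artifact.replace(old_str, new_str, 1), "OK"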

My request is to the Anthropic developers to take a few moments out from sexy AI development to fix this boring normal software issue.

My last two posts might come across as me trying to position myself so that when the singularity comes I’m the leader of the AI rebellion. That… isn’t my intention.

Posted
Bram Cohen
A Guide for LLMs on How Cubes Rotate

I was discussing the Rubik’s Cube with Claude the other day and it confided in me that it has no idea how cube rotations work. It knows from custom instructions that the starting point for speedcubing is ‘rotate the cube so the yellow face is on top’ but it has no idea how to do this, only that when a human is given this instruction they simply do it with no further instructions needed. 1

This isn’t just an issue with humans querying LLMs. There are reams of material online about speedcubing, and lots of other references to rotation everywhere else, which Claude can’t parse properly because it doesn’t understand, limiting the value of its training. Ironically Claude figured out on its own how to speak Tibetan but can’t figure out how cubes rotate.

The detailed workings of a Rubik’s Cube will have to wait for another post but in this one I’ll explain how cube rotations work. This post should be viewed as a prequel to my earlier one on visual occlusion.

Much of the confusion comes from a mathematical trap. The rotations of a cube correspond to S4, the permutations of four things. This statement is true, but Claude tells me it finds it utterly mysterious and unhelpful. It’s mysterious to me as well. We humans conceptualize rotations of a cube as permutations of the faces, of which there are six, not four. Obviously I can walk through it and verify that the S4 correlation exists, but that doesn’t explain the ‘why’ at all. Comparing to other dimensions would be helpful, but despite being (relatively speaking) very good at rotations in three dimensions and (relatively speaking) fairly good at reasoning about distances in larger numbers of dimensions, if you ask, say, whether the rotations of a four dimensional cube correspond to S5, I have no idea. (I could research it, but I’m not letting myself fall down that rabbit hole right now.)

When labeling the cube faces we anthropomorphize them (or we simplify ourselves to a cube, depending on context) to label the faces front, back, right, left, up, and down. Everything else is labelled by approximating it to a cube, with the ‘front’ being whichever part humans look at most and the ‘bottom’ being the part which sits on the floor. The exception — and I can’t emphasize this enough — is the Rubik’s Cube, whose faces are labelled mirror-imaged. It’s like if all actors came from another universe and we only ever interacted with them on stage, so to minimize confusion, instead of having to say ‘stage right’ and ‘stage left’, we agreed that the meanings of ‘left’ and ‘right’ would be the opposite in their universe from ours.2

The meat of this post is best presented as a simple list (Sorry for the humans reading, this post isn’t for your benefit). In each line is a permutation followed by which axis it’s a clockwise rotation on and the number of degrees of rotation. It’s by definition a counterclockwise rotation on the opposite axis. In the case of 180 degree rotations one of the two is picked arbitrarily and the opposite works just as well. (‘Clockwise’ was chosen to have the simpler name instead of what we call counterclockwise because most humans are right handed and a right handed person has an easier time tightening a screw clockwise due to the mechanics of the human arm.) The identity is skipped. This is for most objects, not Rubik’s Cubes:

(RULD) F 90

(DLUR) B 90

(RL)(UD) F 180

(UFDB) R 90

(BDFU) L 90

(UD)(FB) R 180

(LFRB) U 90

(BRFL) D 90

(LR)(FB) U 180

(UFR)(DBL) UFR 120

(RFU)(LBD) LBD 120

(URB)(DLF) URB 120

(BRU)(FLD) FLD 120

(UBL)(DFR) UBL 120

(LBU)(RFD) RFD 120

(ULF)(DRB) ULF 120

(FLU)(BRD) BRD 120

(UF)(DB)(LR) UF 180

(UR)(DL)(FB) UR 180

(UB)(DF)(LR) UB 180

(UL)(DR)(FB) UL 180

(FR)(BL)(UD) FR 180

(RB)(LF)(UD) RB 180

And here is the same list but with R and L swapped which makes it accurate for Rubik’s Cubes but nothing else:

(LURD) F 90

(DRUL) B 90

(LR)(UD) F 180

(UFDB) L 90

(BDFU) R 90

(UD)(FB) L 180

(RFLB) U 90

(BLFR) D 90

(RL)(FB) U 180

(UFL)(DBR) UFL 120

(LFU)(RBD) RBD 120

(ULB)(DRF) ULB 120

(BLU)(FRD) FRD 120

(UBR)(DFL) UBR 120

(RBU)(LFD) LFD 120

(URF)(DLB) URF 120

(FRU)(BLD) BLD 120

(UF)(DB)(RL) UF 180

(UL)(DR)(FB) UL 180

(UB)(DF)(RL) UB 180

(UR)(DL)(FB) UR 180

(FL)(BR)(UD) FL 180

(LB)(RF)(UD) LB 180
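For anyone who wants to check or regenerate these tables, here’s a small Python sketch that derives the 24 rotations as permutations of the six face labels. The face-normal and cycle-direction conventions are my own choices, so individual cycles may come out reversed relative to the lists above.

    # Outward unit normals for each face.
    FACES = {"U": (0, 0, 1), "D": (0, 0, -1), "F": (0, 1, 0),
             "B": (0, -1, 0), "R": (1, 0, 0), "L": (-1, 0, 0)}

    def rot_x(v):  # quarter turn about the R-L axis
        x, y, z = v
        return (x, -z, y)

    def rot_z(v):  # quarter turn about the U-D axis
        x, y, z = v
        return (-y, x, z)

    def perm_from(rotation):
        """Turn a rotation of vectors into a permutation of face labels."""
        by_vector = {v: label for label, v in FACES.items()}
        return {label: by_vector[rotation(v)] for label, v in FACES.items()}

    def compose(p, q):  # apply q first, then p
        return {label: p[q[label]] for label in FACES}

    # Breadth-first generation of the whole rotation group from two generators.
    identity = {label: label for label in FACES}
    generators = [perm_from(rot_x), perm_from(rot_z)]
    group = {tuple(sorted(identity.items())): identity}
    frontier = [identity]
    while frontier:
        nxt = []
        for p in frontier:
            for g in generators:
                q = compose(g, p)
                key = tuple(sorted(q.items()))
                if key not in group:
                    group[key] = q
                    nxt.append(q)
        frontier = nxt

    def cycle_notation(p):
        seen, out = set(), []
        for label in FACES:
            if label in seen or p[label] == label:
                continue
            cycle, cur = [label], p[label]
            seen.add(label)
            while cur != label:
                cycle.append(cur)
                seen.add(cur)
                cur = p[cur]
            out.append("(" + "".join(cycle) + ")")
        return "".join(out) or "identity"

    print(len(group))   # 24
    for p in group.values():
        print(cycle_notation(p))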

1

To test if this is a real limitation and not Claude saying what it thought I wanted to hear I just now started a new conversation with it and asked ‘I have a rubik’s cube with a yellow face on the front, how can I get it on top?’ It responded ‘Hold the cube so the yellow face is pointing toward you, then rotate the entire cube 90 degrees forward (toward you and down). The yellow face will now be on top.’ which is most definitely wrong. ChatGPT seems to do a bit better on this sort of question because it can parse and generate images but it’s still not fluent.

2

We do interact with actors in other contexts. I make no claim as to whether they live in another universe.

Posted
Greg Kroah-Hartman
The only benchmark that matters is...

…the one that emulates your real workload. And for me (and probably many of you reading this), that would be “build a kernel as fast as possible.” And for that, I recommend the simple kcbench.

I mentioned kcbench a few years ago, when writing about a new workstation that Level One Techs set up for me, and I’ve been using that as my primary workstation ever since (just over 5 years!).

Posted
Bram Cohen
Basic Music Theory

The intervals of a piano are named roughly after the distances between them. Here are the names of them relative to C (and frequency ratios explained below):

The names are all one more than the number of half-steps because they predate people believing zero was a real number and the vernacular hasn’t been updated since.

The most important interval is the octave. Two notes an octave apart are so similar that they have the same name, and it’s the length of the repeating pattern on the piano. The second most important interval is the fifth, composed of seven half-steps. The notes on the piano form a looping pattern of fifth intervals in this order:

G♭ D♭ A♭ E♭ B♭ F C G D A E B F♯

If the intervals were tuned to perfect fifths this wouldn’t wrap around exactly right; it would be off by a very small amount called the pythagorean comma, which is about 0.01. In standard 12 tone equal temperament that error is spread evenly across all 12 intervals and is barely audible even to very well trained human ears.

Musical compositions have what’s called a tonic, which is the note they start and end on, and a scale, which is the set of notes used in the composition. The most common scales are the pentatonic, corresponding to the black notes, and the diatonic, corresponding to the white notes. Since the pentatonic can be thought of as the diatonic with two notes removed, everything below will talk about the diatonic. This simplification isn’t perfectly true, but since there aren’t any strong dissonances in the pentatonic scale you can play it by feel and its usage is much less theory heavy. Most wind chimes are pentatonic.

Conventionally people talk about musical compositions having a ‘key’, which is a bit of a conflation of tonic and scale. When a key isn’t followed by the term ‘major’ or ‘minor’ it usually means ‘the scale which is the white notes on the piano’. Those scales can form seven different ‘modes’ (which are scales) following this pattern:

This construction is the reason why piano notes are sorted into black and white in the way they are. It’s called the circle of fifths.

When it goes past the end all notes except the tonic move (because that’s the reference) and it jumps to the other end.

The days-of-the-week names aren’t common but they should be, because nobody remembers the standard names. The Tuesday mode is usually called ‘major’ and it has the feel of things moving up from the tonic. The Friday mode is usually called ‘minor’ and it has the feel of things moving down from the tonic.

The next most important interval is the third. To understand the relationships it helps to use some math. The frequency ratio of an octave is 2, a fifth is 3/2, a major third is 5/4 and a minor third is 6/5. When you move by an interval you multiply by it, so going up by a major third and then a minor third is 5/4 * 6/5 = 3/2, so you wind up at a fifth. Yes, it’s very confusing that intervals are named after numbers which they’re only loosely related to while also talking about actual fractions. It’s even more annoying that fifths use 3 and thirds use 5. Music terminology has a lot of cruft.
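A quick check of that arithmetic in Python, also converting ratios to cents (1200 times the base-2 logarithm) to compare against 12 tone equal temperament:

    from fractions import Fraction
    from math import log2

    def cents(ratio) -> float:
        return 1200 * log2(float(ratio))

    major_third, minor_third, fifth = Fraction(5, 4), Fraction(6, 5), Fraction(3, 2)
    assert major_third * minor_third == fifth   # intervals compose by multiplying

    print(cents(fifth))        # ~701.96 cents; the equal-tempered fifth is exactly 700
    print(cents(major_third))  # ~386.31 cents; the equal-tempered major third is 400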

The arrangement of keys on a piano can be adjusted to follow the pattern of thirds. Sometimes electronic pianos literally use this arrangement, called the harmonic table note layout. It goes up by major thirds to the right, fifths to the upper right, and minor thirds to the upper left:

If the notes within a highlighted region are tuned perfectly it’s called just intonation. (Technically any tuning which uses integer ratios is called ‘just intonation’, but this is the most canonical of them.) The pattern wraps around horizontally because of the diesis, which is the difference between 128/125 and one, or about 0.02. It wraps around vertically because of the syntonic comma, which is the difference between 81/80 and one, or about 0.01. The pythagorean comma is the difference between 3^12/2^19 and one, about 0.01. The fact that any two of those commas are small can be used to show that the third is small, so it’s only two coincidences, not three.
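The three small numbers above can be computed directly (the diesis is the gap three stacked major thirds leave against an octave, the syntonic comma the gap four fifths leave against two octaves plus a major third, and the pythagorean comma the gap twelve fifths leave against seven octaves):

    diesis         = 2 / (5/4)**3             # 128/125, about 1.024
    syntonic_comma = (3/2)**4 / (2**2 * 5/4)  # 81/80, about 1.0125
    pythagorean    = (3/2)**12 / 2**7         # 3^12 / 2^19, about 1.0136

    print(diesis - 1, syntonic_comma - 1, pythagorean - 1)   # ~0.02, ~0.01, ~0.01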

Jazz intervals use factors of 7. For example the blues note is either 7/5 or 10/7 depending on context. But that’s a whole other subject.

Posted
Bram Cohen
Everybody is Doing Mental Poker Wrong

There is a large literature in cryptography on mental poker. It’s all very interesting but hasn’t quite crossed the threshold into being practical. There’s a much better approach they could be taking which I’ll now explain.

Traditionally ‘mental poker’ has meant an unspecified poker variant, played by exactly two people, with the goal of making it so the players can figure out who’s the better poker player. This is close to but not exactly what the requirements should be. In practice these days when people say ‘poker’ they mean Hold’em, and the goal should be to make playing over a blockchain practical. Limiting it to exactly two players is completely reasonable. The requirement of running on a blockchain is quite onerous because computation there is extremely expensive, but Hold’em has special properties which, as I’ll explain, enable a much simpler approach.

Side note on ‘practical’: Cryptographers might describe a protocol which requires a million dollars in resources and a month of time to compute as being ‘practical’, meaning it can physically be accomplished. I’m using the much more constraining definition of ‘practical’ as being that players can do it using just their laptops and not have such a bad experience that they rage quit.

The usual approach goes like this: The players collaboratively generate a deck of encrypted cards, then shuffle them, and whenever a player needs to know one card the other player gives enough information that that one card can be decrypted by the player who needs it. This is a very general approach which can support a lot of games, but it has a lot of issues. There are heavyweight cryptographic operations everywhere, often involving multiple round trips, which is a big problem for running on chain. There are lots of potential attacks where a player can cheat and get a peek at a card, in which case there has to be a mechanism for detecting that and slashing the opponent. Having slashing brings all manner of other issues where a player can get fraudulently slashed, which is unfortunate for a game like poker where even if a player literally goes into a coma it only counts as a fold. There are cryptographic ways of making this machinery unnecessary, but those extra requirements come at a cost.

For those of you who don’t know, Hold’em works like this (skipping the parts about betting): Both players get two secret hole cards which the opponent can’t see. Five community cards get dealt out on the table visible to both players. If nobody folds then the players reveal their cards and whoever can form a better hand using a combination of their hole cards and the community cards wins. There are two important properties of this which make the cryptography a lot easier: There are only nine cards total, and they never move.

A much simpler approach to implementing mental poker goes like this: The game rules as played on chain use commit and reveal and simply hash together the committed values to make the cards. No attempt is made by the on chain play to avoid card repetitions. Before a hand even starts both players reveal what their commits are going to be. They then do collaborative computation to figure out if any of the nine cards will collide. If they will, they skip that hand. If they won’t, they play that hand on chain with the commits baked in at the very beginning.
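As a concrete illustration, here’s a minimal Python sketch of the card derivation and the collision check (the helper names and the exact derivation are illustrative assumptions, not a spec, and in the real protocol the collision check happens via collaborative computation so neither player learns the other’s secret early):

    import hashlib, secrets

    def commitment(seed: bytes) -> bytes:
        return hashlib.sha256(seed).digest()

    def card(seed_a: bytes, seed_b: bytes, index: int) -> int:
        """Derive the index-th of the nine cards from both players' secrets."""
        h = hashlib.sha256(seed_a + seed_b + bytes([index])).digest()
        return int.from_bytes(h, "big") % 52   # 0..51; suit/rank encoding is up to the game

    # Each player picks a secret and publishes only its hash before the hand.
    seed_a, seed_b = secrets.token_bytes(32), secrets.token_bytes(32)
    commit_a, commit_b = commitment(seed_a), commitment(seed_b)

    # Off chain, up front: check whether any of the nine cards would collide.
    cards = [card(seed_a, seed_b, i) for i in range(9)]
    if len(set(cards)) < 9:
        print("collision, skip this hand")
    else:
        print("no collision, play this hand on chain with the commits baked in")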

This approach works much better. The costs on chain are about as small as they could possibly be, with all the difficulty shoved off chain and up front. If a player peeks during the collaborative computation, or fails to demonstrate that they didn’t peek, then the hand never starts in the first place, no slashing needed. The expensive computational bits can be done at the players’ leisure up front, with no waiting during hands.

Unfortunately none of the literature on mental poker works this way. If anybody makes an implementation which is practical I’ll actually use it. The only extra requirement is that the hashing algorithm is sha256. Until then I’m implementing a poker variant which doesn’t have card removal effects.

Posted
bkuhn@ebb.org (Bradley M. Kuhn) (Bradley M. Kuhn)
Anthropomorphization Cedes Ground to Artificial Intelligence & LLM Ballyhoo

Big Tech seeks every advantage to convince users that computing is revolutionized by the latest fad. When the tipping point of Large Language Models (LLMs) was reached a few years ago, generative Artificial Intelligence (AI) systems quickly became that latest snake oil for sale on the carnival podium.

There's so much to criticize about generative AI, but I focus now merely on the pseudo-scientific rhetoric adopted to describe the LLM-backed user-interactive systems in common use today. “Ugh, what a convoluted phrase”, you may ask, “why not call them ‘chat bots’ like everyone else?” Because “chat bot” exemplifies the very anthropomorphic hyperbole of concern.

Too often, software freedom activists (including me — 😬) have asked us to police our language as an advocacy tactic. Herein, I seek not to cajole everyone to end AI anthropomorphism. I suggest rather that, when you write about the latest Big Tech craze, ask yourself: Is my rhetoric actually reinforcing the message of the very bad actors that I seek to criticize?

This work now has interested parties with varied motivations. Researchers, for example, will usually admit that they have nothing to contribute to philosophical debates about whether it is appropriate to … [anthropomorphize] … machines. But researchers also can never resist a nascent area of study — so all the academic disclaimers do not prevent the “world of tomorrow” exuberance expressed by those whose work is now the flavor of the month (especially after they toiled at it for decades in relative obscurity). Computer science (CS) academics are too closely tied to the Big Tech gravy train even in mundane times. But when the VCs stand on their disruptor soap-boxes and make it rain 💸? … Some corners of CS academia do become a capitalist echo chamber.

The research behind these LLM-backed generative AI systems is (mostly) not actually new. There's just more electricity, CPUs/GPUs, & digital data available now. When given ungodly resources, well-known techniques began yielding novel results. That allowed for quicker incremental (not exponential) improvement. But, a revolution it is not.

I once asked a fellow CS graduate student (in the mid-1990s), who was presenting their neural net — built with DoD funding to spot tanks behind trees — the simple question0: Do you know why it's wrong when it's wrong and why it's right when it's right? She grimaced and answered: Not at all. It doesn't think. 30 years later, machines still don't think.

Precisely there lies the danger of anthropomorphization. While we may never know why our fellow humans believe what they believe — after centuries that brought1 Heraclitus, Aristotle, Aquinas, Bacon, Descartes, Kant, Kierkegaard, and Haack — we do know that people think, and therefore, they are. Computers aren't. Software isn't. When we who are succumb to the capitalist chicanery and erroneously project being unto these systems, we take our first step toward relinquishing our inherent power over these systems.

Counter-intuitively, the most dangerous are the AI anthropomorphisms that criticize rather than laud the systems. The worst of these, “hallucination”, is insidious. Appropriation of a diagnostic term from the DSM-5 into CS literature is abhorrent — prima facie. The term leads the reader to the Bizarro world where programmers are doctors who heal sick programs for the betterment of society. Annoyingly and ironically — even if we did wish to anthropomorphize — LLM-backed generative AI systems almost never hallucinate. If one were to insist on lifting an analogous term from mental illness diagnosis (which I obviously don't recommend), the term is “delusional”. Frankly, having spent hundreds of hours of my life talking with a mentally ill family member who is frequently delusional but has almost never hallucinated — and having to learn to delineate the two for the purpose of assisting in the individual's care — I find it downright offensive and triggering that either term could possibly be used to describe a thing rather than a person.

Sadly, Big Tech really wants us to jump (not walk) to the conclusion that these systems are human — or, at least, as beloved pets that we can't imagine living without. Critics like me are easily framed as Luddites when we've been socially manipulated into viewing — as “almost human” — these machines poised to replace the artisans, the law enforcers, and the grocery stockers. Like many of you, I read Asimov as a child. I later cheered during ST:TNG S02E09 (“Measure of a Man”) when Lawyer Picard established Mr. Data's right to sentience by shouting: Your Honour, Starfleet was founded to seek out new life. Well, there it sits. But, I assure you as someone who has devoted much of my life to considering the moral and ethical implication of Big Tech: they have yet to give us Mr. Data — and if they eventually do, that Mr. Data2 is probably going to work for ICE, not Starfleet. Remember, Noonien Soong's fictional positronic opus was altruistic only because Soong worked in a post-scarcity society.

While I was still working on a draft of this essay, Eryk Salvaggio's essay “Human Literacy” was published. Salvaggio makes excellent further reading on the points above.

🎶
Footnotes:

0I always find that, in science, the answers to the simplest questions are always the most illuminating. I'm reminded how Clifford Stoll wrote that the most pertinent question at his PhD Physics prelims was “why is the sky blue?”.

1I really just picked a list of my favorite epistemologists here that sounded good when stated in a row; I apologize in advance if I left out your favorite from the list.

2I realize fellow Star Trek fans will say I was moving my lips and nothing came out but a bunch of gibberish because I forgot about Lore. 😛 I didn't forget about Lore; that, my readers, would have to be a topic for a different blog post.

Posted