The lower-post-volume people behind the software in Debian. (List of feeds.)
A while ago I was looking at Rust-based parsing of HID reports but, surprisingly, outside of C wrappers and the usual cratesquatting I couldn't find anything ready to use. So I figured, why not write my own, NIH style. Yay! Gave me a good excuse to learn API design for Rust and whatnot. Anyway, the result of this effort is the hidutils collection of repositories which includes commandline tools like hid-recorder and hid-replay but, more importantly, the hidreport (documentation) and hut (documentation) crates. Let's have a look at the latter two.
Both crates were intentionally written with minimal dependencies; they currently only depend on thiserror, and arguably even that dependency can be removed.
HID Usage Tables (HUT)
As you know, HID Fields have a so-called "Usage" which is divided into a Usage Page (like a chapter) and a Usage ID. The HID Usage tells us what a sequence of bits in a HID Report represents, e.g. "this is the X axis" or "this is button number 5". These usages are specified in the HID Usage Tables (HUT) (currently at version 1.5 (PDF)). The hut crate is generated from the official HUT json file and contains all current HID Usages together with the various conversions you will need to get from a numeric value in a report descriptor to the named usage and vice versa. Which means you can do things like this:
    let gd_x = GenericDesktop::X;
    let usage_page = gd_x.usage_page();
    assert!(matches!(usage_page, UsagePage::GenericDesktop));

Or the more likely need: convert from a numeric page/id tuple to a named usage.
    let usage = Usage::new_from_page_and_id(0x1, 0x30); // GenericDesktop / X
    println!("Usage is {}", usage.name());

90% of this crate is the various conversions from a named usage to the numeric value and vice versa. It's a huge crate in that there are lots of enum values but the actual functionality is relatively simple.
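To make the page/id relationship concrete: a HID usage is two 16-bit numbers, and the HID specification also defines a 32-bit form with the Usage Page in the upper 16 bits. Here is a minimal standalone sketch of that packing using only the standard library. The function names are made up for illustration and are not the hut crate's API.

```rust
/// Pack a Usage Page and Usage ID into the 32-bit usage form
/// (Usage Page in the upper 16 bits, Usage ID in the lower 16 bits).
/// Hypothetical helper, not part of the hut crate.
fn pack_usage(page: u16, id: u16) -> u32 {
    (u32::from(page) << 16) | u32::from(id)
}

/// Split a 32-bit usage back into its (page, id) pair.
/// Hypothetical helper, not part of the hut crate.
fn unpack_usage(usage: u32) -> (u16, u16) {
    ((usage >> 16) as u16, (usage & 0xffff) as u16)
}
```

With these, Generic Desktop / X (page 0x01, id 0x30) packs to 0x00010030, and unpacking gives the pair back.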
hidreport - Report Descriptor parsing
The hidreport crate is the one that can take a set of HID Report Descriptor bytes obtained from a device and parse the contents. It can also extract the value of a HID Field from a HID Report, given the HID Report Descriptor. Let's assume we have a bunch of bytes that are a HID report descriptor read from the device (or sysfs). Then we can do this:
    let rdesc: ReportDescriptor = ReportDescriptor::try_from(bytes).unwrap();

I'm not going to copy/paste the code to run through this report descriptor but suffice to say it will give us access to the input, output and feature reports on the device together with every field inside those reports. Now let's read from the device and parse the data for whatever the first field is in the report (this is obviously device-specific, could be a button, a coordinate, anything):
    let input_report_bytes = read_from_device();
    let report = rdesc.find_input_report(&input_report_bytes).unwrap();
    let field = report.fields().first().unwrap();
    match field {
        Field::Variable(var) => {
            let val: u32 = var.extract(&input_report_bytes).unwrap().into();
            println!("Field {:?} is of value {}", field, val);
        },
        _ => {}
    }

The full documentation is of course on docs.rs and I'd be happy to take suggestions on how to improve the API and/or add features not currently present.
hid-recorder
The hidreport and hut crates are still quite new but we have an existing test bed that we use regularly. The venerable hid-recorder tool has been rewritten twice already. Benjamin Tissoires' first version was in C, then a Python version of it became part of hid-tools and now we have the third version written in Rust. It has a few nice features over the Python version and we're using it heavily for e.g. udev-hid-bpf debugging and development. An example output of that is below and it shows that you can get all the information out of the device via the hidreport and hut crates.
    $ sudo hid-recorder /dev/hidraw1
    # Microsoft Microsoft® 2.4GHz Transceiver v9.0
    # Report descriptor length: 223 bytes
    # 0x05, 0x01,    // Usage Page (Generic Desktop)    0
    # 0x09, 0x02,    // Usage (Mouse)                   2
    # 0xa1, 0x01,    // Collection (Application)        4
    # 0x05, 0x01,    // Usage Page (Generic Desktop)    6
    # 0x09, 0x02,    // Usage (Mouse)                   8
    # 0xa1, 0x02,    // Collection (Logical)            10
    # 0x85, 0x1a,    // Report ID (26)                  12
    # 0x09, 0x01,    // Usage (Pointer)                 14
    # 0xa1, 0x00,    // Collection (Physical)           16
    # 0x05, 0x09,    // Usage Page (Button)             18
    # 0x19, 0x01,    // UsageMinimum (1)                20
    # 0x29, 0x05,    // UsageMaximum (5)                22
    # 0x95, 0x05,    // Report Count (5)                24
    # 0x75, 0x01,    // Report Size (1)                 26
    ... omitted for brevity
    # 0x75, 0x01,    // Report Size (1)                 213
    # 0xb1, 0x02,    // Feature (Data,Var,Abs)          215
    # 0x75, 0x03,    // Report Size (3)                 217
    # 0xb1, 0x01,    // Feature (Cnst,Arr,Abs)          219
    # 0xc0,          // End Collection                  221
    # 0xc0,          // End Collection                  222
    R: 223 05 01 09 02 a1 01 05 01 09 02 a1 02 85 1a 09 ... omitted for brevity
    N: Microsoft Microsoft® 2.4GHz Transceiver v9.0
    I: 3 45e 7a5
    # Report descriptor:
    # ------- Input Report -------
    # Report ID: 26
    # Report size: 80 bits
    # | Bit: 8 | Usage: 0009/0001: Button / Button 1 | Logical Range: 0..=1 |
    # | Bit: 9 | Usage: 0009/0002: Button / Button 2 | Logical Range: 0..=1 |
    # | Bit: 10 | Usage: 0009/0003: Button / Button 3 | Logical Range: 0..=1 |
    # | Bit: 11 | Usage: 0009/0004: Button / Button 4 | Logical Range: 0..=1 |
    # | Bit: 12 | Usage: 0009/0005: Button / Button 5 | Logical Range: 0..=1 |
    # | Bits: 13..=15 | ######### Padding |
    # | Bits: 16..=31 | Usage: 0001/0030: Generic Desktop / X | Logical Range: -32767..=32767 |
    # | Bits: 32..=47 | Usage: 0001/0031: Generic Desktop / Y | Logical Range: -32767..=32767 |
    # | Bits: 48..=63 | Usage: 0001/0038: Generic Desktop / Wheel | Logical Range: -32767..=32767 | Physical Range: 0..=0 |
    # | Bits: 64..=79 | Usage: 000c/0238: Consumer / AC Pan | Logical Range: -32767..=32767 | Physical Range: 0..=0 |
    # ------- Input Report -------
    # Report ID: 31
    # Report size: 24 bits
    # | Bits: 8..=23 | Usage: 000c/0238: Consumer / AC Pan | Logical Range: -32767..=32767 | Physical Range: 0..=0 |
    # ------- Feature Report -------
    # Report ID: 18
    # Report size: 16 bits
    # | Bits: 8..=9 | Usage: 0001/0048: Generic Desktop / Resolution Multiplier | Logical Range: 0..=1 | Physical Range: 1..=12 |
    # | Bits: 10..=11 | Usage: 0001/0048: Generic Desktop / Resolution Multiplier | Logical Range: 0..=1 | Physical Range: 1..=12 |
    # | Bits: 12..=15 | ######### Padding |
    # ------- Feature Report -------
    # Report ID: 23
    # Report size: 16 bits
    # | Bits: 8..=9 | Usage: ff00/ff06: Vendor Defined Page 0xFF00 / Vendor Usage 0xff06 | Logical Range: 0..=1 | Physical Range: 1..=12 |
    # | Bits: 10..=11 | Usage: ff00/ff0f: Vendor Defined Page 0xFF00 / Vendor Usage 0xff0f | Logical Range: 0..=1 | Physical Range: 1..=12 |
    # | Bit: 12 | Usage: ff00/ff04: Vendor Defined Page 0xFF00 / Vendor Usage 0xff04 | Logical Range: 0..=1 | Physical Range: 0..=0 |
    # | Bits: 13..=15 | ######### Padding |
    ##############################################################################
    # Recorded events below in format:
    # E: . [bytes ...]
    #
    # Current time: 11:31:20
    # Report ID: 26 /
    #     Button 1: 0 | Button 2: 0 | Button 3: 0 | Button 4: 0 | Button 5: 0 | X: 5 | Y: 0 |
    #     Wheel: 0 |
    #     AC Pan: 0 |
    E: 000000.000124 10 1a 00 05 00 00 00 00 00 00 00
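The bit ranges in the field listing above map directly onto the bytes of the recorded event: HID reports are little-endian and bits are counted from the least significant bit of the first byte. Here is a standalone sketch of decoding a field by hand, using only the standard library. The helper names are made up for illustration; this is not how the hidreport crate is implemented, just the same arithmetic spelled out.

```rust
/// Extract bits `first..=last` of a report as an unsigned value.
/// Bit 0 is the least significant bit of byte 0 (HID bit ordering).
/// Hypothetical helper, not the hidreport crate's API.
fn extract_unsigned(report: &[u8], first: usize, last: usize) -> Option<u32> {
    let nbits = last.checked_sub(first)? + 1;
    if nbits > 32 || last / 8 >= report.len() {
        return None;
    }
    let mut value: u64 = 0;
    for i in 0..nbits {
        let bit = first + i;
        let set = (report[bit / 8] >> (bit % 8)) & 1;
        value |= u64::from(set) << i;
    }
    Some(value as u32)
}

/// Same as `extract_unsigned`, but sign-extends the field, which is what a
/// field with a negative logical minimum (like X above) needs.
fn extract_signed(report: &[u8], first: usize, last: usize) -> Option<i32> {
    let nbits = (last.checked_sub(first)? + 1) as u32;
    let raw = extract_unsigned(report, first, last)?;
    let shift = 32 - nbits;
    // Move the field's top bit into the i32 sign bit, then shift back arithmetically.
    Some(((raw << shift) as i32) >> shift)
}
```

Fed the ten bytes from the event line above (1a 00 05 00 00 ...), bits 0..=7 give the Report ID 26 and bits 16..=31 give X = 5, matching the annotated output.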
In this weekend’s edition of ‘Bram gets nerd sniped by something ridiculous so makes a blog post about it to make it somebody else’s problem’ Mark Rober said something about ‘A Lava Lamp made out of real lava’. Unfortunately he just poured lava on a regular lava lamp to destroy it but this does raise the question of whether you could have a real lava lamp which uses a molten salt instead of water.
First the requirements. The lamp is made of three substances: the lamp itself, the ‘liquid’ inside, and the ‘solid’ inside. The lamp must be transparent and remain solid across the range of temperatures used. The ‘liquid’ must be solid at room temperature, become liquid at a high but not too high temperature, and be transparent in its liquid phase. It should also be opaque in its solid phase to give a cool reveal of what the thing does as it heats up but that’s hard to avoid. The ‘solid’ should have a melting point higher than the ‘liquid’ but not so high that it softens the lamp and be opaque. The density of the ‘solid’ should be just barely below that of the ‘liquid’ in its melted form and just barely above in its solid form to give it that distinctive lava lamp buoyancy effect. The ‘solid’ and ‘liquid’ should not react with each other or stick to the lamp or decompose over time.
That was a lot of requirements, but it does seem to be possible to meet them. The choice for the lamp is obvious: Borosilicate glass. That’s physically strong, transparent, can withstand big temperature changes (due to low thermal expansion) and is chemically inert. All the same reasons why it’s ideal for cookware. It doesn’t get soft until over 800C, so the melting points of the other materials should be well below that.
For the ‘liquid’ there also turns out to only be one real option: Zinc Chloride. That’s transparent and has a melting point of 290C and a density of 2.9 (it’s also opaque at room temperature). The other transparent salts aren’t dense enough.
For the ‘solid’ there once again only seems to be one option: Boron Trioxide. That has a melting point of 450C and a density of 2.46. Every other oxide has a density which is way too high, but this one overshoots it a bit. It’s much easier to get the densities closer together by mixing the Boron Trioxide with something heavy than the Zinc Chloride with something light, so some Lead(II) oxide can be mixed in. That has a density of 9.53 so not much of it is needed and a melting point of 888C so the combined melting point will still be completely reasonable. (Due to eutectic-type effects it might be barely higher at all.) It should also add some color, possibly multiple ones because the colors formed depend on how it cools. Bismuth(III) oxide should also work and may be a bit more colorful.
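To put a rough number on ‘not much of it is needed’, here is a back-of-the-envelope sketch. It assumes ideal mixing (volumes simply add), which real glasses won’t follow exactly, and it uses the density figures quoted above, with the 2.9 quoted for Zinc Chloride as a starting target. The function names are made up for illustration.

```rust
/// Volume fraction of the heavy additive needed for an ideal mixture
/// (additive volumes, an assumption) to reach the `target` density.
fn volume_fraction(base: f64, additive: f64, target: f64) -> f64 {
    (target - base) / (additive - base)
}

/// The same amount expressed as a fraction of the mixture's mass.
fn mass_fraction(base: f64, additive: f64, target: f64) -> f64 {
    volume_fraction(base, additive, target) * additive / target
}
```

With base 2.46 (Boron Trioxide), additive 9.53 (Lead(II) oxide) and target 2.9 this comes out to roughly 6% by volume, or about 20% by mass. The real recipe would need tuning by experiment, since the target is the density of the molten salt at operating temperature rather than the room temperature figure.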
I’m crossing my fingers a bit on these things not reacting but given that they’re glasses and salts it seems reasonable. The glasses may have a bit of a tendency to stick to each other. Hopefully not so much because one is a solid at these temperatures and the other is a liquid, but it’s probably a good idea to coat the top and bottom of the insides of the lamp with Silicon and to use an overall shape where the pieces inside never come close to the walls, in particular having an inverted cone shape at the bottom and a similar tapering at the top. The whole lamp should also be sealed because oxygen and water might react at the high temperatures reached, and there should be an argon bubble at the top because there is some expansion and contraction going on. Those same concerns apply to regular lava lamps which explains a lot about how they’re shaped.
Anyone who wants to should feel free to try this build. You don’t need any more permission from me. I’d like to see it happen and don’t have the time to spend on building it myself.
Let’s say you happen to need to color code something and want RGB values corresponding to all the common color names. Let’s further say that you’re an obsessive maniac and would like to avoid losing your sanity studying color theory just for the sake of this one silly task. How do you do it? Easy, leverage the work I already did for you. Here’s the chart, followed by caveats and methodology:
First the caveats: The level of saturation of these colors varies a lot mostly because sRGB sucks and your monitor can only display faded versions of some of them. (For colorblind accessibility it might be a good idea to use slightly lower saturation.) A luminance high enough to make yellow have decent saturation washes out other things so this was set to a consistent level which is a reasonable compromise. (This isn’t the fault of sRGB, it’s a limitation of human eyes.) Purple hues at angles 300 and 320 are both things which my eyes accept as the single ideal of Purple and don’t realize are two different things until I see them next to each other. The value given is midway between. Reasonable descriptions of them are ‘Royal Purple’ and ‘Common Purple’. They have an analogous relationship to the one between Blue and Cyan.
The methodology behind this first has to answer the question ‘What is a color?’ For the purposes of this exercise we’ll just pretend that hues are colors. The next question is why particular positions in the hue continuum count as colors. Hues are a twisting road. In particular places the road bends, making a gradient crossing over it look not like a straight line. The places where those bends happen we call colors. Which bends get a name is dependent on where you set the threshold and cultural factors. The exact point where the bend happens is also hard to define exactly. I located them by the highly scientific process of picking them out with my own two eyes.
There’s a standard statement of what the common color words are in English to which I’m adding Cyan and Pink. Cyan is a proper name for what’s usually called ‘Light Blue’, a name which makes no sense because both Cyan and Blue can appear at any amount of luminance. It may be that Cyan is denied a proper common name because our displays can barely show it. The biggest limitation of sRGB by far is that it can only display Cyan with poor saturation. Pink I put in both because it’s a primary color (as is Cyan) and because it’s a very common color word in English, mostly denied its proper place out of an absurd bit of misogyny and homophobia. It’s especially funny that in printing Pink is euphemized as ‘Magenta’ even though the shade which is used is a light one and the common usage of ‘Magenta’ is to refer to a darker shade.
One important thing to note is that the primary colors, which have sRGB values set to 00 and FF, are NOT on ideal color hues. Those colors correspond to the most saturated things your display can show, which is important, but the positioning of RGB was selected first and foremost for them to be 120 degrees from each other to maximize how much color display could be done with only three phosphors. They happen to be very close to the true ideals but not exactly.
To pick out exact shades I used an OKHSL color picker with saturation 100% and light and dark luminance at 65% and 43%. There are still a few artifacts of OKHSL, in particular its hues bend a tiny bit in the darker shades. To compensate I picked out color angles which are good at both high and low luminance, mostly resulting in Cyan being shifted over because in darker shades Teal is elbowing it out of the way. One thing OKHSL does NOT do is maintain the property that two colors which are opposite each other in space are true opposites, which is annoying because oppositeness is one of the most objective things you can say about colors. (That could probably be improved by having brightness correction be done by cubing a single value instead of three values separately but I personally am not going to put together such a proposal.)
Annoyingly the human perceptual system doesn’t see fit to put color angles with canonical names opposite each other, instead placing them roughly evenly but with seemingly random noise added. This of course creates problems for color wheels, which want both to show what colors are opposites and what the color names are.
If you want to know how bad your computer display is, go here, select sRGB for the gamut, OKLab for the geometric color coordinate space, and Color for the spectral colors, and rotate it around. You’ll get a much better sense of what’s going on rotating it in 3D yourself, but I’ll do some explaining. Here’s a screenshot showing just how much of the color space your monitor is missing:
The colored shape shows the extremes of what your monitor can display. It’s warped to reflect the cognitive distance between colors, so the distances in space reflect the apparent distance between the colors in your brain. Ideally that shape would fill the entire basket, which represents all the colors your eyes can perceive. You might notice that it comes nowhere close. It’s a stick going from black at the bottom to white at the top, with just enough around it that you can get colors but the saturation is poor.
The biggest chunks missing from this are that there’s very little bright blue and dark cyan. This may be why people mischaracterize cyan as ‘light blue’. Our display technologies are literally incapable of displaying a highly saturated light blue or a highly saturated dark cyan. It’s likely that most of the paintings from Picasso’s blue period can’t be displayed properly on a monitor, and that he was going with blues not as a gimmick but because it’s literally half the human cognitive space. If you have the ability to make a display or physical object without the standard restrictions go with bright blue or dark cyan. Even better contrast them with each other and break everybody’s brains.
Sadly this situation is basically unfixable. The Rec2020 standard covers much more of the color space, but you can’t simply display sRGB values as Rec2020 values. That will result in more intense colors, but because the original inputs weren’t designed for it the effect will be cartoony and weird. You can simply display the correct values specified by sRGB, but that will waste the potential of the display. If there was content recorded for sRGB which specified that in its format it would display very well, but that has a chicken and egg problem, and displaying input recorded for Rec2020 on a legacy sRGB display is even worse than going the other way around. Maybe about the best you can do is have a Rec2020 display which applies a superlinear saturation filter to sRGB input so low saturations are true but ‘fully saturated’ values look more intense.
This is an example of how modern televisions do a huge amount of processing of their input before displaying it and it’s extremely difficult to disentangle how good the physical display is from the quality of the processing software. Another example of that is in the display of gradients. A 10 bit color display will naturally display gradients much better than an 8 bit color display, but an 8 bit color display can dither the errors well enough to be nearly imperceptible. The problem then is that causes flicker due to the dithering changing between frames. There are ways of fixing this by keeping information between frames but I don’t think there’s an open source implementation of this for televisions to use. One has to assume that many if not nearly all of the proprietary ones do it.
Speaking of which, this is a problem with how software in general handles color precision. It’s true that 8 bits is plenty for display, but like with audio you should keep all intermediate representations with much greater precision and only smash them down when making the final output. Ideally operating systems would pretend that final display had 16 bit color and fix it on final display, or even in the monitor. Lossy video compression in particular inexplicably gives 8 bit color output resulting in truly awful dark gradients. The standard Python image libraries don’t even have an option for higher color precision resulting in them producing terrible gradients. This should be completely fixable.
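A toy illustration of why intermediate precision matters: take a full 8 bit gradient, darken it by a factor of 8 and brighten it back, once storing the intermediate in 8 bits and once in 16 bits. This is a contrived sketch with made-up function names, not code from any real image pipeline.

```rust
use std::collections::BTreeSet;

/// Darken by 8x and brighten by 8x, storing the intermediate in 8 bits.
fn roundtrip_8bit(v: u8) -> u8 {
    let dark = v / 8; // the low three bits are lost here
    dark * 8
}

/// The same two operations, with the intermediate kept in 16 bits.
fn roundtrip_16bit(v: u8) -> u8 {
    let wide = u16::from(v) << 8; // widen before doing any math
    let dark = wide / 8;
    let bright = dark * 8;
    (bright >> 8) as u8 // only smash down to 8 bits at the very end
}

/// Number of distinct output levels when a full 0..=255 gradient is processed.
fn distinct_levels(f: fn(u8) -> u8) -> usize {
    (0..=255u8).map(f).collect::<BTreeSet<u8>>().len()
}
```

The 8 bit path leaves only 32 distinct levels out of 256, which is exactly the kind of banding that shows up in dark gradients, while the 16 bit path returns all 256.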
Popping back up the stack I’d like to fantasize about a data format for display technology which supports the entire range of human perceptible colors. This would encode color as three values: x, y, and luma. x would go between 0 and 2 with y between 0 and 1. It’s a little hard to describe what exactly these values mean, but (0, 0) would be red, (1, 0) yellow, (2, 0) green, (2, 1) cyan, (1, 1) blue, and (0, 1) pink. The outer edge goes around the color wheel keeping opposite colors opposite and doing an okay job of corresponding with cognitive space even in raw form. You could make an approximate rendering of this in sRGB as a smushed color wheel but by definition the outer edge of that would be horrendously faded compared to how it should look. Luminance should work as it implicitly does in RGB: Luminance 0 is exactly black and x and y have no effect. As it goes up the cognitive width which x and y represent increases up until the midway point, then it shrinks again until it gets to luminance 1 which is exactly white and x and y again have no effect. This shrinking at the top is to reflect how real displays work. If you want to get much better bright colors you can use the luminance of 1/2 as white at the expense of everything being darker. Many movies do this, which makes them look great when covering a whole display but dark when open in a window next to true white on a monitor.
Deep diving on color theory of course has given me multiple ideas for hobby software projects, most of which I’ll probably never get around to because many things don’t pan out but mostly because I have limited hobby coding time so things get triaged heavily. If anybody wants to beat me to the punch on these please go ahead:
A color cycling utility which instead of rotating the color space reflects it. Usually when color cycling there’s one position which is true, then it rotates until all hues are changed to their opposite, then it rotates back around the other way. This would instead at all times have two opposite hues which are true and two opposite colors which are flipped and cycle which those are. Ideally this would be implemented by converting to okHSL, changing H to (d-H) % 1 and converting back again. As you change d it will cycle. You could trivially change it to a very nice traditional color cycler using (d+H) % 1.
A color cycling utility which allows user controlled real time three dimensional color rotation. If you take an input image, convert it into the okLab color space and shrink its color space to fit in a sphere centered at (0.76, 0, 0) with radius 0.125 then this can be done without hitting any values which are unreachable in sRGB. The interface for rotating it should be similar to this one. In the past when I’ve tried doing these sorts of three dimensional color rotations they’ve looked horrible when white-black gets off axis. Hopefully that’s because the cognitive distance in that direction is so much greater than in the color directions and keeping everything uniform will fix it, but it may be that getting white and black inverted at all fundamentally looks weird.
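For the first idea, the hue arithmetic on its own is tiny. Here is a sketch of just that step, with hue as a fraction of a full turn in [0, 1). The conversion to and from okHSL is left out, and the function names are made up for illustration.

```rust
/// Reflecting cycler: hue H becomes (d - H) mod 1.
/// `rem_euclid` keeps the result in [0, 1) even when d - H is negative.
fn reflect_hue(d: f64, h: f64) -> f64 {
    (d - h).rem_euclid(1.0)
}

/// Traditional rotating cycler: hue H becomes (d + H) mod 1.
fn rotate_hue(d: f64, h: f64) -> f64 {
    (d + h).rem_euclid(1.0)
}
```

For any d the reflection leaves the two opposite hues d/2 and d/2 + 0.5 unchanged, and sends the hues a quarter turn away from them (d/2 ± 0.25) to their opposites. For example with d = 0.3, hues 0.15 and 0.65 stay put while 0.4 becomes 0.9.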
TLDR: if you know what EVIOCREVOKE does, the same now works for hidraw devices via HIDIOCREVOKE.
The HID standard is the most common hardware protocol for input devices. In the Linux kernel HID is typically translated to the evdev protocol which is what libinput and all Xorg input drivers use. evdev is the kernel's input API and used for all devices, not just HID ones.
evdev is mostly compatible with HID but there are quite a few niche cases where they differ a fair bit. And some cases where evdev doesn't work well because of different assumptions, e.g. it's near-impossible to correctly express a device with 40 generic buttons (as opposed to named buttons like "left", "right", ...[0]). In particular for gaming devices it's quite common to access the HID device directly via the /dev/hidraw nodes. And of course for configuration of devices accessing the hidraw node is a must too (see Solaar, openrazer, libratbag, etc.). Alas, /dev/hidraw nodes are only accessible as root - right now applications work around this by either "run as root" or shipping udev rules tagging the device with uaccess.
evdev too can only be accessed as root (or the input group) but many many moons ago when dinosaurs still roamed the earth (version 3.12 to be precise), David Rheinsberg merged the EVIOCREVOKE ioctl. When called the file descriptor immediately becomes invalid, any further reads/writes will fail with ENODEV. This is a cornerstone for systemd-logind: it hands out a file descriptor via DBus to Xorg or the Wayland compositor but keeps a copy. On VT switch it calls the ioctl, thus preventing any events from reaching said X server/compositor. In turn this means that a) X no longer needs to run as root[1] since it can get input devices from logind and b) X loses access to those input devices at logind's leisure so we don't have to worry about leaking passwords.
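The revoke pattern itself is simple enough to model in a few lines. The sketch below is an in-process toy, not the kernel interface: there is no ioctl call in it, the types and names are invented for illustration, and the error variant merely stands in for ENODEV.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

#[derive(Debug, PartialEq)]
enum ReadError {
    /// Stand-in for the ENODEV a revoked fd returns.
    NoDevice,
}

/// The privileged side (logind in the text): keeps the ability to revoke.
struct Broker {
    revoked: Arc<AtomicBool>,
}

/// The unprivileged side (X server or compositor): can read until revoked.
struct Handle {
    revoked: Arc<AtomicBool>,
}

impl Broker {
    /// Open the "device" and hand out a handle while keeping a revoke switch.
    fn open() -> (Broker, Handle) {
        let flag = Arc::new(AtomicBool::new(false));
        (Broker { revoked: Arc::clone(&flag) }, Handle { revoked: flag })
    }

    /// Invalidate the handle that was handed out, e.g. on VT switch.
    fn revoke(&self) {
        self.revoked.store(true, Ordering::SeqCst);
    }
}

impl Handle {
    /// Returns a dummy event byte, or an error once access has been revoked.
    fn read_event(&self) -> Result<u8, ReadError> {
        if self.revoked.load(Ordering::SeqCst) {
            Err(ReadError::NoDevice)
        } else {
            Ok(0x2a)
        }
    }
}
```

The important property is that the holder of the handle cannot undo the revocation; it has to go back to the broker and ask for a new one.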
Real-time forward to 2024 and kernel 6.12 has now gained the HIDIOCREVOKE ioctl for /dev/hidraw nodes. The corresponding logind support has also been merged. The principle is the same: logind can hand out an fd to a hidraw node and can revoke it at will so we don't have to worry about data leakage to processes that should no longer receive events. This is the first of many steps towards more general HID support in userspace. It's not immediately usable since logind will only hand out those fds to the session leader (read: compositor or Xorg) so if you as an application want that fd you need to convince your display server to give it to you. For that we may have something like the inputfd Wayland protocol (or maybe a portal but right now it seems a Wayland protocol is more likely). But that aside, let's hooray nonetheless. One step down, many more to go.
One of the other side-effects of this is that logind now has an fd to any device opened by a user-space process. With HID-BPF this means we can eventually "firewall" these devices from malicious applications: we could e.g. allow libratbag to configure your mouse's buttons but block any attempts to upload a new firmware. This is very much an idea for now, there's a lot of code that needs to be written to get there. But getting there we can now, so full of optimism we go[2].
[0] to illustrate: the button that goes back in your browser is actually evdev's BTN_SIDE and BTN_BACK is ... just another button assigned to nothing particular by default.
[1] and c) I have to care less about X server CVEs.
[2] mind you, optimism is just another word for naïveté
Matt Parker asks for an ‘ideal’ jigsaw with two solutions:
The criteria seem to be that the jigsaw should be (a) 25 pieces in a 5x5 grid (b) each edge type occurs exactly twice (c) there are exactly two solutions (d) in the two different solutions the maximum number of pieces are reoriented upside down. He’s done a search using a computer but that turns out to be unnecessary.
There’s a very simple mathematical trick which can be applied here. If the rows and columns are numbered 1-5, first swap columns 2 and 4, then swap rows 2 and 4, like this:
Then you flip it all upside down, and presto, all the criteria satisfied perfectly. Except… not quite. The problem is that this doesn’t result in two solutions, it results in four. You can independently decide whether to swap the rows and the columns. The most elegant fix for this seems to be to change the shapes of the pieces a bit so that instead of each piece just sharing an edge with the ones in the left, right, up, and down directions they also share with the ones in the upper right and lower left. This not only enforces that there are exactly two solutions, it also preserves the property that each edge type occurs exactly twice, even on the diagonal edges.
For roughly the last two years I’ve been working on bringing gaming to Chia. This will have:
Playing games of skill for real money (specifically XCH)
Real time play, without waiting for the blockchain to make every move
Enforcement of the game rules so no cheating
No casino so no fee paid to a casino. The only extra money you lose out on is transaction fees which are currently near zero.
Legal in many more jurisdictions than casino gaming. There are meaningful legal distinctions between games of skill and games of chance and games played with and without an intermediary. These are intentional parts of the law and not a loophole.
Whoever you’re playing against won’t be able to skip out on their debts. Money won or lost is transferred immediately.
Those are, to put it mildly, huge features, and it’s been a big engineering lift to make them possible. Current status is that we’re finishing up debugging the core of it (you can follow ongoing development here). Soon we’ll start building it out into a real application, which will be normal software development instead of a science fair project, and that is expected to ship in a timeframe of months.
At a high level the way it works is that when two people want to play a session they use an offer and acceptance to set up a state channel at the beginning, which takes about a minute for the transaction to go through on chain. (State channels are very similar to payment channels like those used in Bitcoin but can do more things, like support gaming, which is a problem because Bitcoin can’t support gaming at all, not even in a state channel.) Then they play over that state channel and when they’re done they close out the session with another transaction which pays out the amount they had left at the end. If there’s a dispute in the middle (which there’s no reason for unless one player tries to misbehave or has a serious technical problem) then whatever games are pending get played out on chain. Poker is a fairly good fit for this because a session is many short games instead of one long game.
The restrictions of this medium are that it’s restricted to turn based games with very few moves for exactly two players. To get an idea of why more than two players is a problem see envy-free cake cutting. There needs to be as few turns as possible to make the fallback to playing on chain not excessively slow. What exactly is possible in turn based games is a little hard to convey but there will be a suite of (fun and addictive!) games shipped with it initially which does a good job of showing what things are possible and how to implement them. Randomness can be done using commit and reveal, but supporting card replacement value like happens in Poker is problematic. The vast bulk of the academic work on supporting ‘Mental Poker’ is on handling that one not very important feature. Instead we’ll start with a Poker variant which uses an infinite deck because that makes the problem go away completely. (If you search for ‘commit and reveal’ there are a lot of crypto projects claiming they’re doing something cutting edge and amazing and that it has to be done on chain. That isn’t true. It’s trivial and can be done over state channels.)
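To show how little there is to commit and reveal, here is a sketch of the message flow between two players. Big caveat: the Rust standard library has no cryptographic hash, so `digest` below uses `DefaultHasher` purely as a stand-in, and a real implementation must swap in something like SHA-256. All names here are invented for illustration and have nothing to do with the actual Chia gaming code.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Stand-in digest. NOT cryptographically secure; a real implementation
/// needs a proper hash function such as SHA-256 here.
fn digest(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish()
}

/// What Alice keeps secret until the reveal step.
struct Opening {
    salt: [u8; 16], // random padding so the value can't be guessed from the hash
    value: u8,      // Alice's contribution to the shared randomness
}

/// Step 1: Alice sends only this commitment to Bob.
fn commit(opening: &Opening) -> u64 {
    let mut buf = Vec::with_capacity(17);
    buf.extend_from_slice(&opening.salt);
    buf.push(opening.value);
    digest(&buf)
}

/// Step 2: Bob replies with his own value in the clear.
/// Step 3: Alice reveals her opening, Bob checks it against the commitment,
/// and both sides combine the two values into one shared random byte.
fn shared_byte(commitment: u64, alice: &Opening, bob_value: u8) -> Option<u8> {
    if commit(alice) != commitment {
        return None; // Alice tried to change her value after seeing Bob's
    }
    Some(alice.value ^ bob_value)
}
```

Neither player can bias the result: Alice is locked in before she sees Bob’s value, and Bob picks his without knowing hers. Nothing in this flow touches the chain unless someone refuses to reveal.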
Since any new on-chain programming language in Bitcoin will have to be supported forever there will always be the objection that whatever is proposed doesn’t do enough and will just be legacy baggage after something better is deployed later. The obvious fix is to make an environment which can do everything but that runs headlong into ‘everything’ being scary. In Chia we’ve gone full steam ahead with the Bitcoin Script research program so I can tell you what ‘everything’ is and what the objections to parts of it are. Some of these may sound inane to you, but trust me these have all been major sources of angst in the past. If people are able to come to agreement about some or all of them now then great. I’m posting this to at least help have a conversation about it. The counterarguments for pieces of functionality given below are taken from Bitcoin lore which may not be well documented or even well remembered any more as there’s been turnover among core developers, but these are all objections given by important core developers in years past.
Turing complete language/Covenants/Capabilities
These three are lumped together because in general if a Turing complete language is allowed that naturally produces covenants and capabilities as a side effect. In the past Bitcoin developers were skeptical of covenants in general but seem to have mostly gotten over it now. The difference has to do with whether a covenant is attached as a rider or something which the recipient has to fully opt in to. There’s nothing inherently wrong with funds being restricted. The sender of funds can always do an equivalent thing by simply not sending the funds. The problem is if a recipient thinks that they’ve gotten funds but those funds are actually encumbered by rules they don’t realize are there. Since all serious proposals for covenants in Bitcoin involve them being enforced by scriptpubkeys this is taken care of.
Capabilities haven’t been discussed as much but the Chialisp trick for implementing them once you have a Turing complete language seems to be a no-brainer. All that’s necessary is for a scriptpubkey to have some way of knowing what UTXO it’s a part of and to have sandboxing possible in the language.
Deduplicating code
As soon as you allow a Turing complete language on chain you run headlong into having programs which are so long that they’re cost prohibitive to deploy. The only practical way around this is to allow code to be reused. There are two approaches to this: either allow code reuse within a block, or allow references to code which was explicitly made available from older blocks. The problem with the first one of these is that it makes the costs of including transactions in blocks multidimensional with there being group discounts. The problem with the second one is that it can cause a transaction to go from valid to invalid if it depended on the previous block and that gets reorged out. Neither of these is perfect, but both are workable and at least one of them has to be done to have any hope of widespread complex programs without a zillion little forks adding each of them.
Access to time
In Bitcoin currently timelocks are restricted to being able to make requirements that particular sends only happen after a particular time. The theory is that transaction validity should be strictly monotonic: A transaction can go from invalid to valid because enough time passed, but not from valid to invalid. That way things which are in the mempool only get kicked out when they succeed or the fee rate gets above them.
Obviously there’s a big exception to mempool monotonicity: Transactions can already get kicked out because one of their inputs got spent. That is a very narrow and well defined property but then the current time is also a very narrow and well defined property. Handling the mempool edge cases is some work but not a super compelling argument against the functionality.
The most basic use of time functionality is to keep transactions from ever getting into a limbo ‘failed’ state. If a transaction doesn’t go through for too long you probably want to give up on it, but then it’s in the limbo state of not knowing whether it will get through eventually. You can force it to fail by spending one of its inputs, but if fees were too high for the original transaction to go through then fees may be too high for the cancellation transaction to go through and you’re just stuck. Allowing transactions to simply say that they can’t be spent past a certain time would fix this elegantly.
A strong argument for a supported not valid after is that you can implement a very bad one. For example you can have UTXOs which are anyone can spend but can’t be spent until a set time, and you make a transaction which has to include one of those on the assumption that someone will spend them as soon as it becomes possible. This is extremely janky but it’s the sort of thing which people are likely to do if a supported not valid after is not available.
One argument against allowing not valid after is that it may allow transactions to game their priority by expiring soon. Thankfully due to real decentralization a miner making one block has no idea whether it’s them or some unrelated miner making the next block and it’s usually someone unrelated. The incentives work out that it’s best for miners to let in whatever makes them the most in transaction fees, so there doesn’t seem to be a real issue here.
A related problem with not valid after is that it can allow someone making a transaction to incentivize a miner to set the time of the current block earlier than is accurate by making a high fee transaction which is dependent on an earlier time. In Chia this is avoided by making the relevant time be the one from the previous transaction block, which mostly fixes the problem because of the same real decentralization as the previous issue. There’s a separate mechanism for a coin to assert that it’s spent in the same block as it’s created because that relies on a concept of ‘now’ for a time which technically hasn’t been set yet. This approach works fine but the subtle details need to be hashed out carefully.
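The time rules discussed above can be sketched as a validity window checked against the timestamp of the previous block. Everything here (the class, the field names, the use of plain integer timestamps) is hypothetical and only illustrates the idea; it is not Bitcoin’s or Chia’s actual interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimeWindow:
    # Existing style of timelock: invalid until this time is reached.
    not_valid_before: Optional[int] = None
    # Hypothetical expiry: invalid once this time has passed.
    not_valid_after: Optional[int] = None


def is_valid(window: TimeWindow, previous_block_time: int) -> bool:
    """Check the window against the previous block's timestamp.

    Using the previous block's time rather than the current block's
    means the miner of the current block cannot shift the reference
    time to let in a high fee transaction.
    """
    if window.not_valid_before is not None:
        if previous_block_time < window.not_valid_before:
            return False
    if window.not_valid_after is not None:
        if previous_block_time > window.not_valid_after:
            return False
    return True


window = TimeWindow(not_valid_before=100, not_valid_after=200)
states = [is_valid(window, t) for t in (50, 150, 250)]
```

With only not_valid_before set, validity is monotonic: once a transaction becomes valid it stays valid. Setting not_valid_after is what lets it go from valid back to invalid, which is the property the mempool has to handle.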
Assert UTXO spent
It would be good to have a better answer to how to add fees to a transaction after it’s done being created. One approach would be to allow one transaction to depend on another one without the second one opting in. This violates the principle that transactions shouldn’t be able to influence transactions outside of themselves, but since it can only do so in a positive direction it doesn’t allow censorship and it allows for simple solutions to things which are otherwise huge headaches. Reducing the level of dependence to only allow one transaction to require a certain UTXO be spent instead of specifying the full details of the other transaction would be even less deviation from the principle.
I keep posting about the importance of functions inside of deep neural networks being sublinear but haven’t given an exact definition of that before. It’s sublinearity in the computer science asymptotic sense. The Taylor expansion should not only have a linear bound but should either go to zero or at least have the positive and negative directions go to different asymptotics. If the function is defined by different formulas in different sections that criterion should apply to all of them. [1]
With that out of the way here’s how common activation functions look with a new suggestion at the bottom.
This is as nonlinear as you can get a monotonic sublinear function to be and is trivial to compute. The one big problem is that it has that kink at 0. Is it too much to ask for a function to be continuously differentiable?
It’s okay to have insecurities about some areas being completely occluded and hence stop responding to training but if it’s that much of a problem for you you don’t have what it takes to work with DNNs and should go back to playing with linear functions.
GELU
We get it, you got rid of that kink, but the requirements specified a function which is monotonic, can you not read? Also maybe don’t use functions which are so obscure that I can’t figure out how to enter them into Wolfram Alpha.
Thank you for following the requirements and there’s some argument to using Softmax here since you’re probably using it elsewhere anyway. But it does seem to take a large area to smooth that kink out and never quite gets to exactly RELU in either direction.
RELU with Sinusoidal Smoothing (RELUSS)
This is my new idea. Not only is the kink completely smoothed out, it’s done with a simple quick to calculate function which meets the requirements and reverts completely to RELU outside of that area.
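The formula itself isn’t reproduced in the text here, so the following is only one function with the stated properties, not necessarily the exact definition intended: it is exactly 0 below the smoothing region, exactly x above it, and joins the two with a sine-shaped slope. The half-width of pi/2 is an assumption.

```python
import math

HALF_WIDTH = math.pi / 2  # assumed size of the smoothed region on each side of 0


def reluss(x: float, a: float = HALF_WIDTH) -> float:
    """RELU with a sinusoidally smoothed kink on [-a, a]."""
    if x <= -a:
        return 0.0
    if x >= a:
        return float(x)
    return (x + a) / 2 - (a / math.pi) * math.cos(math.pi * x / (2 * a))


def reluss_slope(x: float, a: float = HALF_WIDTH) -> float:
    """Derivative of reluss: rises smoothly from 0 to 1 across [-a, a]."""
    if x <= -a:
        return 0.0
    if x >= a:
        return 1.0
    return (1 + math.sin(math.pi * x / (2 * a))) / 2
```

The slope is continuous at both ends of the smoothed region (0 at -a and 1 at a), so the function is continuously differentiable, and since the slope never goes negative the function stays monotonic.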
[1] Don’t ask about x*sin(x)
Over the last months I've started looking into a few of the papercuts that affect graphics tablet users in GNOME. So now that most of those have gone in, let's see what has happened:
Calibration fixes and improvements (GNOME 47)
The calibration code, a descendant of the old xinput_calibrator tool, was in pretty rough shape and didn't work particularly well. That's now fixed and I've made the calibrator a little bit easier to use too. Previously the timeout was quite short which made calibration quite stressful; that timeout is now per target rather than for the whole calibration process. Likewise, the calibration targets now accept larger variations - something probably not needed for real use-cases (you want the calibration to be exact) but it certainly makes testing easier since clicking near the target is good enough.
The other feature added was to allow calibration even when the tablet is manually mapped to a monitor. Previously this only worked in the "auto" configuration but some tablets don't correctly map to the right screen and lost calibration abilities. That's fixed now too.
A picture says a thousand words, except in this case where the screenshot provides no value whatsoever. But here you have it anyway.
Generic tablet fallback (GNOME 47)
Traditionally, GNOME would rely on libwacom to get some information about tablets so it could present users with the right configuration options. The drawback was that a tablet not recognised by libwacom didn't exist in GNOME Settings - and there was no immediately obvious way of fixing this, the panel either didn't show up or (with multiple tablets) the unrecognised one was missing. The tablet worked (because the kernel and libinput didn't require libwacom) but it just couldn't be configured.
libwacom 2.11 changed the default fallback tablet to be a built-in one since this is now the most common unsupported tablet we see. Together with the new fallback handling in GNOME Settings this means that any unsupported tablet is treated as a generic built-in tablet and provides the basic configuration options for those (Map to Monitor, Calibrate, assigning stylus buttons). The tablet should still be added to libwacom but at least it's no longer a requirement for configuration. Plus there's now a link to the GNOME Help to explain things. Below is a screenshot of what this looks like (after modifying my libwacom to no longer recognise the tablet, poor Intuos).
Monitor mapping names (GNOME 47)
For historical reasons, the names of the display in the GNOME Settings Display configuration differed from the one used by the Wacom panel. Not ideal and that bit is now fixed with the Wacom panel listing the name of the monitor and the connector name if multiple monitors share the same name. You get the best value out of this if you have a monitor vendor with short names. (This is not a purchase recommendation).
Highlighted SVGs (GNOME 46)
If you're an avid tablet user, you may have multiple stylus tools - but it's also likely that you have multiple tools of the same type which makes differentiating them in the GUI hard. Which is why they're highlighted now - if you bring the tool into proximity, the matching image is highlighted to make it easier to know which stylus you're about to configure. Oh, and in the process we added a new SVG for AES styli too to make the picture look more like the actual physical tool. The <blink> tag may no longer be cool but at least we can disco our way through the stylus configuration now.
More Pressure Curves (GNOME 46)
GNOME Settings historically presents a slider from "Soft" to "Firm" to adjust the feel of the tablet tip (which influences the pressure values sent to the application). Behind the scenes this was converted into a set of 7 fixed curves but thanks to an old mutter bug those curves only covered a small amount of the possible range. This is now fixed so you can really go from pencil-hard to jelly-soft and the slider now controls an almost-continuous range instead of just 7 curves. Behold, a picture of slidery goodness:
Miscellaneous fixes
And of course a bunch of miscellaneous fixes. Things that I quickly found were support for Alt in the tablet pad keymappings, fixing of erroneous backwards movement when wrapping around on the ring, a long-standing stylus button mismatch, better stylus naming and a rather odd fix causing configuration issues if the eraser was the first tool ever to be brought into proximity.
There are a few more things in the pipe but I figured this is enough to write a blog post so I no longer have to remember to write a blog post about all this.
SwiftNavigation
To celebrate that RealityKit is coming to macOS, iOS and iPadOS and is no longer limited to visionOS, I am releasing SwiftNavigation for RealityKit.
Last year, as I was building a game for VisionPro, I wanted the 3D characters I placed in the world to navigate the world, go from one point to another, avoid obstacles and have those 3D characters avoid each other.
Almost every game engine in the world uses the C++ RecastNavigation library to do this - Unity, Unreal and Godot all use it.
SwiftNavigation was born: it is both a Swift wrapper around the underlying C++ library, making extensive use of Swift's C++ interoperability, and a direct integration into the RealityKit entity system.
This library is magical, you create a navigation mesh from the world that you capture and then you can query it for paths to navigate from one point to another or you can create a crowd controller that will automatically move your objects.
Until I have the time to write full tutorials, your best bet is to look at the example project that uses it.
My grandmother used to make a recipe from an old newspaper clipping. After decades the original clipping started to crumble so she replaced it with a new clipping when the newspaper re-ran the recipe. I struggled but eventually succeeded in making a recipe which matched my childhood memories. Sadly my childhood memories were romanticized and my grandmother’s original recipe didn’t make the pancakes stay floofed after they were done cooling off, but I hope you enjoy this improved version.
These pancakes rise by water under the batter turning into steam, so to keep the pan from getting cooled off by the batter it’s important to cook them in an iron skillet which has been given time to heat all the way through.
3 eggs
70 grams flour
120 grams milk
1 gram nutmeg
1 gram mint oil
Pinch of salt
6 grams powdered sugar
3 grams Lemon juice powder
20 grams Ghee
Preheat the oven with a skillet inside to 400 degrees. Leave it in for 15 more minutes after preheating. Mix together eggs, flour, milk, nutmeg, mint oil, and salt and beat thoroughly. When the oven is heated add ghee to the pan and put it back in to melt (about 2 minutes). After it’s done melting, pour batter on top. Bake for 20 minutes. Thoroughly mix powdered sugar and lemon juice powder and put it in a dusting wand. Sift completely over top.
Back in the day when presumably at least someone was young, the venerable xsetwacom tool was commonly used to configure wacom tablet devices on Xorg [1]. This tool is going dodo in Wayland because, well, a tool that is specific to an X input driver kinda stops working when said X input driver is no longer being used. Such is technology, let's go back to sheep farming.
There's nothing hugely special about xsetwacom, it's effectively identical to the xinput commandline tool except for the CLI that guides you towards the various wacom driver-specific properties and knows the right magic values to set. Like xinput, xsetwacom has one big peculiarity: it is a fire-and-forget tool and nothing is persistent - unplugging the device or logging out would vanish the current value without so much as a "poof" noise [2].
It also somewhat clashes with GNOME (or any DE, really). GNOME configuration works so that GNOME Settings (gnome-control-center) and GNOME Tweaks write the various values to the gsettings. mutter [3] picks up changes to those values and in response toggles the X driver properties (or in Wayland the libinput context). xsetwacom short-cuts that process by writing directly to the driver but properties are "last one wins" so there were plenty of use-cases over the years where changes by xsetwacom were overwritten.
Anyway, there are plenty of use-cases where xsetwacom is actually quite useful, in particular where tablet behaviour needs to be scripted, e.g. switching between pressure curves at the press of a button or key. But xsetwacom cannot work under Wayland because a) the xf86-input-wacom driver is no longer in use, b) only the compositor (i.e. mutter) has access to the libinput context (and some behaviours are now implemented in the compositor anyway) and c) we're constantly trying to think of new ways to make life worse for angry commenters on the internets. So if xsetwacom cannot work, what can we do?
Well, most configurations possible with xsetwacom are actually available in GNOME. So let's make those available to a commandline utility! And voila, I present to you gsetwacom, a commandline utility to toggle the various tablet settings under GNOME:
$ gsetwacom list-devices
devices:
- name: "HUION Huion Tablet_H641P Pen"
  usbid: "256C:0066"
- name: "Wacom Intuos Pro M Pen"
  usbid: "056A:0357"
$ gsetwacom tablet "056A:0357" set-left-handed true
$ gsetwacom tablet "056A:0357" set-button-action A keybinding "<Control><Alt>t"
$ gsetwacom tablet "056A:0357" map-to-monitor --connector DP-1
Just like xsetwacom was effectively identical to xinput but with a domain-specific CLI, gsetwacom is effectively identical to the gsettings tool but with a domain-specific CLI. gsetwacom is not intended to be a drop-in replacement for xsetwacom, the CLI is very different. That's mostly on purpose because I don't want to have to chase bug-for-bug compatibility for something that is very different after all.
I almost spent more time writing this blog post than on the implementation so it's still a bit rough. Also, (partially) due to how relocatable schemas work error checking is virtually nonexistent - if you want to configure Button 16 on your 2-button tablet device you can do that. Just don't expect 14 new buttons to magically sprout from your tablet. This could all be worked around with e.g. libwacom integration but right now I'm too lazy for that [4]
Oh, and because gsetwacom writes the gsettings configuration it is persistent, GNOME Settings will pick up those values and they'll be re-applied by mutter after unplug. And because mutter-on-Xorg still works, gsetwacom will work the same under Xorg. It'll also work under the GNOME derivatives as long as they use the same gsettings schemas and keys.
Le utilitaire est mort, vive le utilitaire!
[1] The git log claims libwacom was originally written in 2009. By me. That was a surprise...
[2] Though if you have the same speakers as I do you at least get a loud "pop" sound whenever you log in/out and the speaker gets woken up
[3] It used to be gnome-settings-daemon but with mutter now controlling the libinput context this all moved to mutter
[4] Especially because I don't want to write Python bindings for libwacom right now
As I’ve mentioned previously if you want eventually consistent version control, meaning whatever order you merge things together has no impact on the final result, you not only need to have a very history aware merging algorithm, you also need canonical ordering of the lines. This cleanly dodges around the biggest issue in version control, which is what should you do when one person merges AXB and AYB as AXYB and another person merges them together as AYXB and then you try to merge both of those together. None of the available options are good, so you have to keep it from ever happening in the first place. Both people need to be shown AXYB as the order of lines in the merge conflict (or the other order as long as it’s consistent) and that way if either of them decided to change it to AYXB then that was a proactive change made afterwards and is not only a winner of the later meta-merge conflict, there isn’t even a conflict at all, it merges cleanly.
This flies in the face of how UX normally works on merge conflicts, which orders the conflicting regions by whether they’re ‘local’ or ‘remote’. How to do ordering better is an involved subject which I’ve covered thoroughly in older posts and won’t rehash here, but conflict UX I want to talk about more. Since the order of lines and whether they should be included if everything is smashed together blindly is assumed to be handled, that creates a question of how to detect and present conflicts. What’s going to be needed is a way of marking particular lines as conflicts and figuring out what should be marked. There should be some format of special lines similar to the conflict markers people are already familiar with as a way of presenting them to users in files. That format should include a way of saying which of the two sides individual lines came from.
The general idea is to determine ‘which side each line came from’ and if two lines whose ancestry is different are ‘too close together’ then they’re both marked as being in conflict. If successive lines have the same ancestry then if one of them is in conflict it taints the others. The simplest approach is that a single line of code which is present on both sides ends regions of conflict. Arguably it should be more than one line to declare peace, or that empty or whitespace only lines shouldn’t count towards it. I’m going to assume the simplest approach for a proof of concept.
An important case is when Alice adds a line to a function and Bob deletes the entire function. Obviously that should somehow be presented as a conflict but deleted lines are crucial to it. For that reason there needs to be some way of showing deleted lines in the conflict, definitely with proper annotations around them and possibly with the individual deleted lines commented out.
To detect conflicts each line is marked as ‘peaceful’, ‘skip’, ‘Alice won’, ‘Bob won’ or ‘both won’. Once all lines are marked then the ones which are marked skip are, well, skipped. Other lines which border lines with a different marking which is not peaceful are marked as in conflict. Finally tainting is spread to neighboring lines which have the same state. Deleted lines are only presented to the user if they’re in conflict.
What to do in each case is best presented as a laundry list, so here goes. Each case is final-Alice-Bob.
missing missing missing: skip
missing missing present: Alice
missing present missing: Bob
missing present present: both (this is an unusual case but it can happen)
present missing missing: both (similar to the previous case)
present missing present: Bob
present present missing: Alice
present present present: peaceful
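The laundry list above can be sketched as a lookup table plus the marking and tainting passes. This is a rough proof of concept, not a finished implementation: the input format (one tuple of presence flags per line of the weave) and all of the names are assumptions, and it reads the bordering rule as applying only to lines which are themselves not peaceful, so that a single peaceful line ends a conflict region.

```python
from enum import Enum


class Mark(Enum):
    PEACEFUL = "peaceful"
    SKIP = "skip"
    ALICE = "Alice won"
    BOB = "Bob won"
    BOTH = "both won"


# Keyed by (in final, in Alice's version, in Bob's version).
TABLE = {
    (False, False, False): Mark.SKIP,
    (False, False, True): Mark.ALICE,
    (False, True, False): Mark.BOB,
    (False, True, True): Mark.BOTH,
    (True, False, False): Mark.BOTH,
    (True, False, True): Mark.BOB,
    (True, True, False): Mark.ALICE,
    (True, True, True): Mark.PEACEFUL,
}


def find_conflicts(lines):
    """lines: list of (text, final, alice, bob) in canonical order.

    Returns (text, mark, in_conflict) for every line shown to the user.
    """
    kept = [(text, TABLE[(f, a, b)], f) for text, f, a, b in lines]
    kept = [entry for entry in kept if entry[1] is not Mark.SKIP]
    marks = [mark for _, mark, _ in kept]
    count = len(kept)
    conflict = [False] * count

    # A non-peaceful line conflicts if a neighbor has a different
    # marking which is also not peaceful.
    for i in range(count):
        if marks[i] is Mark.PEACEFUL:
            continue
        for j in (i - 1, i + 1):
            if 0 <= j < count and marks[j] not in (Mark.PEACEFUL, marks[i]):
                conflict[i] = True

    # Spread the taint through runs of neighbors in the same state.
    for i in range(1, count):
        if conflict[i - 1] and marks[i] is marks[i - 1]:
            conflict[i] = True
    for i in range(count - 2, -1, -1):
        if conflict[i + 1] and marks[i] is marks[i + 1]:
            conflict[i] = True

    # Deleted lines are only presented if they are in conflict.
    return [
        (text, mark, hit)
        for (text, mark, present), hit in zip(kept, conflict)
        if present or hit
    ]


# Alice adds a line to a function and Bob deletes the entire function.
example = [
    ("import os", True, True, True),
    ("def f():", False, True, False),
    ("    a()", False, True, False),
    ("    new()", True, True, False),
    ("    b()", False, True, False),
    ("print('done')", True, True, True),
]
result = find_conflicts(example)
```

In this example the added line and all of the deleted function lines come back marked as in conflict, while the untouched lines on either side stay peaceful.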
That seems to handle all the edge cases properly and covers the last of the theoretical details I needed to work out.
When a user resolves a conflict and does a commit it should first throw an error if conflict markers weren’t removed, then should assume the user edited the clean merge they would have seen if each line were presented verbatim without checking for conflicts. When doing a diff between the complete weave and the user’s final file version it should probably more heavily weight lines which were present than lines which were deleted but I’m not sure what the best way of doing that is and will probably make a prototype which doesn’t have any such heuristic.