The lower-post-volume people behind the software in Debian. (List of feeds.)

I'm turning off commenting for my blogs. While I've enjoyed some feedback, the time wasted to moderate spam posts just isn't worth it. Thank you, spammers! :-(
Posted Sat Jul 23 21:11:00 2016 Tags:

As of tonight, I have a new challenge in my life.

Sifu Dale announced tonight that he plans to send a team to next year’s Quoshu, a national-level martial-arts competition held every year in northern Maryland in late summer.

I told Sifu I want to go and compete in weapons, at least. He was like, “Well, of course,” as though he’d been expecting it. Which is maybe a little surprising and flattering considering I’ll be pushing 60 then.

It’ll mean serious training for the next year, and maybe a pretty humiliating experience if it turns out I’m too old and slow. But I want to try, because it’s national-level competition against fighters from dozens of different styles, and…frankly, I’m tired of not having any clear idea how good I am. Winning would be nice, but what I really want is to measure myself against a bigger talent pool.

The thing is, on the limited evidence I have, the possibilities range from “Eric is a clumsy goof who powers through weak opposition just by being a little stronger and more aggressive” to “Eric is a genuinely powerful and clever fighter who even national-level competitors had better take seriously.” It’s really hard for me to tell.

I’ve tended to look pretty good at schools where the style matched my physical capabilities. I was a duffer at aikido and undistinguished at MMA, but put me in a style that’s about hitting things or swinging weapons and I shine. You really, really don’t want to be in the way when I strike at full power; I never do it against my training partners because I don’t want to break them. On one occasion at my MMA school when I was practicing short punches against a padded structural beam I vibrated the building. Not kidding!

I also take hits very well when they happen. My sifu often tells new students “Hit him as hard as you can. You can’t hurt him,” which claim is funny because it’s largely true. By the time they can put enough mv**2 on target to make me flinch they’re well beyond being newbies. Generally if he doesn’t say this the student has trained in another striking style before.

On the other hand, I’m only tested against a relatively small population, and it’s not clear that my upper-body strength is the kind of advantage against genuinely skilled opponents that it is when you’re, say, trying to vibrate a building. I’m slow on my feet and my balance is iffy because cerebral palsy. And there are lots of people who can do technique better than me.

If I’m really good, then it’s because (a) I’m strong and tough, (b) I’m aggressive, and (c) I have a kind of low cunning about fighting and do things my opponents don’t expect and aren’t prepared for. I know where my tiger is. An awful lot of people who are better martial technicians than me don’t, and that is a fact.

But I don’t know what percentile this puts me in if you could match me against a hundred people who have also been training for years and are among the best at their schools. In a year and change maybe I will. It’s worth the effort to find out.

Posted Fri Jul 22 11:34:59 2016 Tags:

Don't panic. Of course it isn't. Stop typing that angry letter to the editor and read on. I just picked that title because it's clickbait and these days that's all that matters, right?

With the release of libinput 1.4 and the newest feature to add tablet pad mode switching, we've now finished the TODO list we had when libinput was first conceived. Let's see what we have in libinput right now:

  • keyboard support (actually quite boring)
  • touchscreen support (actually quite boring too)
  • support for mice, including middle button emulation where needed
  • support for trackballs including the ability to use them rotated and to use button-based scrolling
  • touchpad support, most notably:
    • proper multitouch support on touchpads [1]
    • two-finger scrolling and edge scrolling
    • tapping, tap-to-drag and drag-lock (all configurable)
    • pinch and swipe gestures
    • built-in palm and thumb detection
    • smart disable-while-typing without the need for an external process like syndaemon
    • more predictable touchpad behaviours because everything is based on physical units [2]
    • a proper API to allow for kinetic scrolling on a per-widget basis
  • tracksticks work with middle button scrolling and communicate with the touchpad where needed
  • tablet support, most notably:
    • each tool is a separate entity with its own capabilities
    • the pad itself is a separate entity with its own capabilities and events
    • mode switching is exported by the libinput API and should work consistently across callers
  • a way to identify if multiple kernel devices belong to the same physical device (libinput device groups)
  • a reliable test suite
  • Documentation!
The side-effect of libinput is that we are also trying to fix the rest of the stack where appropriate. Mostly this meant pushing stuff into systemd/udev so far, with the odd kernel fix as well. Specifically the udev bits means we
  • know the DPI density of a mouse
  • know whether a touchpad is internal or external
  • fix up incorrect axis ranges on absolute devices (mostly touchpads)
  • try to set the trackstick sensitivity to something sensible
  • know when the wheel click is less/more than the default 15 degrees
And of course, the whole point of libinput is that it can be used from any Wayland compositor and take away most of the effort of implementing an input stack. GNOME, KDE and enlightenment already use libinput, and so does Canonical's Mir. And some distributions use libinput as the default driver in X through xf86-input-libinput (Fedora 22 was the first to do this). So overall libinput is already quite a success.

The hard work doesn't stop, of course; there are still plenty of areas where we need to be better. And of course, new features come as HW manufacturers bring out new hardware. I already have touch arbitration on my todo list. But it's nice to wave at this big milestone as we pass it on the way to the glorious future of perfect, bug-free input. At this point, I'd like to extend my thanks to all our contributors: Andreas Pokorny, Benjamin Tissoires, Caibin Chen, Carlos Garnacho, Carlos Olmedo Escobar, David Herrmann, Derek Foreman, Eric Engestrom, Friedrich Schöller, Gilles Dartiguelongue, Hans de Goede, Jackie Huang, Jan Alexander Steffens (heftig), Jan Engelhardt, Jason Gerecke, Jasper St. Pierre, Jon A. Cruz, Jonas Ådahl, JoonCheol Park, Kristian Høgsberg, Krzysztof A. Sobiecki, Marek Chalupa, Olivier Blin, Olivier Fourdan, Peter Frühberger, Peter Hutterer, Peter Korsgaard, Stephen Chandler Paul, Thomas Hindoe Paaboel Andersen, Tomi Leppänen, U. Artie Eoff, Velimir Lisec.

Finally: libinput was started by Jonas Ådahl in late 2013, so it's already over 2.5 years old. And the git log shows we're approaching 2000 commits and a simple LOCC says over 60000 lines of code. I would also like to point out that the vast majority of commits were done by Red Hat employees; I've been working on it pretty much full-time since 2014 [3]. libinput is another example of Red Hat putting money, time and effort into the less press-worthy plumbing layers that keep our systems running. [4]

[1] Ironically, that's also the biggest cause of bugs because touchpads are terrible. synaptics still only does single-finger with a bit of icing and on bad touchpads that often papers over hardware issues. We now do that in libinput for affected hardware too.
[2] The synaptics driver uses absolute numbers, mostly based on the axis ranges for Synaptics touchpads making them unpredictable or at least different on other touchpads.
[3] Coincidentally, if you see someone suggesting that input is easy and you can "just do $foo", their assumptions may not match reality.
[4] No, Red Hat did not require me to add this. I can pretty much write what I want in this blog and these opinions are my own anyway and don't necessarily reflect Red Hat yadi yadi ya. The fact that I felt I had to add this footnote to counteract whatever wild conspiracy comes up next is depressing enough.

Posted Wed Jul 20 00:45:00 2016 Tags:

More and more distros are switching to libinput by default. That's a good thing but one side-effect is that the synclient tool does not work anymore [1], it just complains that "Couldn't find synaptics properties. No synaptics driver loaded?"

What is synclient? A bit of history first. Many years ago the only way to configure input devices was through xorg.conf options, there was nothing that allowed for run-time configuration. The Xorg synaptics driver found a solution to that: the driver would initialize a shared memory segment that kept the configuration options and a little tool, synclient (synaptics client), would know about that segment. Calling synclient with options would write to that SHM segment and thus toggle the various options at runtime. Driver and synclient had to be of the same version to know the layout of the segment and it's about as secure as you expect it to be. In 2008 I added input device properties to the server (X Input Extension 1.5 and it's part of 2.0 as well of course). Rather than the SHM segment we now had a generic API to talk to the driver. The API is quite simple, you effectively have two keys (device ID and property number) and you can set any value(s). Properties literally support just about anything but drivers restrict what they allow on their properties and which value maps to what. For example, to enable left-finger tap-to-click in synaptics you need to set the 5th byte of the "Synaptics Tap Action" property to 1.

xinput, a commandline tool and debug helper, has a generic API to change those properties so you can do things like xinput set-prop "device name" "property name" 1 [2]. It does a little bit under the hood but generally it's pretty stupid. You can run xinput set-prop and try to set a value that's out of range, or try to switch from int to float, or just generally do random things.

We were able to keep backwards compatibility in synclient, so where before it would use the SHM segment it would now use the property API, without the user interface changing (except the error messages are now standard Xlib errors complaining about BadValue, BadMatch or BadAccess). But synclient and xinput use the same API to talk to the server and the server can't tell the difference between the two.

Fast forward 8 years and now we have libinput, wrapped by the xf86-input-libinput driver. That driver does the same as synaptics, the config toggles are exported as properties and xinput can read and change them. Because really, you do the smart work by selecting the right property names and values and xinput just passes on the data. But synclient is broken now, simply because it requires the synaptics driver and won't work with anything else. It checks for a synaptics-specific property ("Synaptics Edges") and if that doesn't exist it complains with "Couldn't find synaptics properties. No synaptics driver loaded?". libinput doesn't initialise that property, it has its own set of properties. We did look into whether it's possible to have property-compatibility with synaptics in the libinput driver but it turned out to be a huge effort, flaky reliability at best (not all synaptics options map into libinput options and vice versa) and the benefit was quite limited. Because, as we've been saying since about 2009 - your desktop environment should take over configuration of input devices, hand-written scripts are dodo-esque.

So if you must insist on shellscripts to configure your input devices use xinput instead. synclient is like fsck.ext2, on that glorious day you switch to btrfs it won't work because it was only designed with one purpose in mind.

[1] Neither does syndaemon btw but its functionality is built into libinput so that doesn't matter.
[2] xinput set-prop --type=int --format=32 "device name" "hey I have a banana" 1 2 3 4 5 6 and congratulations, you've just created a new property for all X clients to see. It doesn't do anything, but you could use those to attach info to devices. If anything was around to read that.

Posted Fri Jul 15 07:47:00 2016 Tags:

xinput is a commandline tool to change X device properties. Specifically, it's a generic interface to change X input driver configuration at run-time, used primarily in the absence of a desktop environment or just for testing things. But there's a feature of xinput that many don't appear to know: it resolves device and property names correctly. So plenty of times you see advice to run a command like this:


xinput set-prop 15 281 1
This is bad advice, it's almost impossible to figure out what this is supposed to do, it depends on the device ID never changing (spoiler: it will) and the property number never changing (spoiler: it will). Worst case, you may suddenly end up setting a different property on a different device and you won't even notice. Instead, just use the built-in name resolution features of xinput:

xinput set-prop "SynPS/2 Synaptics TouchPad" "libinput Tapping Enabled" 1
This command will work regardless of the device ID for the touchpad and regardless of the property number. Plus it's self-documenting. This has been possible for many many years, so please stop using the number-only approach.
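
For scripts that wrap xinput, one way to keep the name-based form is to assemble the arguments in a small helper. The following is a minimal Python sketch, not part of xinput itself: the helper names set_prop_command and set_prop are hypothetical, and the helper only builds the command shown above without querying the X server.

```python
import subprocess


def set_prop_command(device, prop, *values):
    """Build an `xinput set-prop` argument list that uses names, not numbers.

    Hypothetical helper: it rejects purely numeric device IDs and property
    numbers because those can change between sessions.
    """
    for label, name in (("device", device), ("property", prop)):
        if str(name).strip().isdigit():
            raise ValueError(f"use the {label} name, not the number {name!r}")
    return ["xinput", "set-prop", device, prop, *[str(v) for v in values]]


def set_prop(device, prop, *values):
    """Run the command; needs xinput installed and a running X server."""
    subprocess.run(set_prop_command(device, prop, *values), check=True)
```

Calling set_prop("SynPS/2 Synaptics TouchPad", "libinput Tapping Enabled", 1) would run the same command as above, while set_prop_command("15", "281", 1) raises ValueError instead of silently targeting whatever device currently has ID 15.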
Posted Thu Jul 14 05:44:00 2016 Tags:

I wrote a Twitter Bot that tweets real-time updates from FiveThirtyEight's 2016 U.S. presidential election forecast. This is the latest version of the model that correctly predicted 49 of 50 states in the 2008 election and all 50 states in 2012.

Follow @polling_2016 on Twitter!

Details: The system pulls updates from FiveThirtyEight throughout the day, but will only tweet when the model is updated and the probabilities change. It uses the polls-only model. Typically it generates between zero and three tweets a day. It is written using open source software and runs on Linux. This is unaffiliated with FiveThirtyEight or my employer.
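
The "only tweet when the probabilities change" rule can be sketched as a comparison between the last forecast seen and the newest one. This is an illustration in Python, not the bot's actual code: the function name, the dictionary layout and the one-decimal rounding are all assumptions.

```python
def changed_probabilities(previous, current, digits=1):
    """Return the entries whose probability differs after rounding.

    `previous` and `current` map a candidate label to a win probability in
    percent. Rounding to `digits` decimals (an assumption) keeps tiny
    floating-point differences from triggering a tweet.
    """
    changes = {}
    for name, prob in current.items():
        old = previous.get(name)
        if old is None or round(old, digits) != round(prob, digits):
            changes[name] = (old, prob)
    return changes


def should_tweet(previous, current):
    """Tweet only when at least one probability actually moved."""
    return bool(changed_probabilities(previous, current))
```

With this rule, a day on which the model is re-run but produces the same rounded numbers generates no tweets, which fits the "between zero and three tweets a day" figure above.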

Posted Wed Jul 13 17:00:00 2016 Tags:

In an earlier post, I explained how we added graphics tablet pad support to libinput. Read that article first, otherwise this article here will be quite confusing.

A lot of tablet pads have mode-switching capabilities. Specifically, they have a set of LEDs and pressing one of the buttons cycles the LEDs. And software is expected to map the ring, strip or buttons to different functionality depending on the mode. A common configuration for a ring or strip would be to send scroll events in mode 1 but zoom in/out when in mode 2. On the Intuos Pro series tablets that mode switch button is the one in the center of the ring. On the Cintiq 21UX2 there are two sets of buttons, one left and one right and one mode toggle button each. The Cintiq 24HD is even more special, it has three separate buttons on each side to switch to a mode directly (rather than just cycling through the modes).

In the upcoming libinput 1.4 we will have mode switching support in libinput, though modes themselves have no real effect within libinput, it is merely extra information to be used by the caller. The important terms here are "mode" and "mode group". A mode is a logical set of button, strip and ring functions, as interpreted by the compositor or the client. How they are used is up to them as well. The Wacom control panels for OS X and Windows allow mode assignment only to the strip and rings while the buttons remain in the same mode at all times. We assign a mode to each button so a caller may provide differing functionality on each button. But that's optional, having an OS X/Windows-style configuration is easy, just ignore the button modes.

A mode group is a physical set of buttons, strips and rings that belong together. On most tablets there is only one mode group but tablets like the Cintiq 21UX2 and the 24HD have two independently controlled mode groups - one left and one right. That's all there is to mode groups, modes are a function of mode groups and can thus be independently handled. Each button, ring or strip belongs to exactly one mode group. And finally, libinput provides information about which button will toggle modes or whether a specific event has toggled the mode. Documentation and a starting point for which functions to look at is available in the libinput documentation.
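
To make the two terms concrete, here is a small Python model of how a caller might track modes per mode group. It is only a sketch: ModeGroup, RING_ACTIONS and ring_action are hypothetical names and not the libinput API, modes are numbered from zero here, and only the "toggle button cycles through the modes" case is covered (not the direct-select buttons of the Cintiq 24HD).

```python
from dataclasses import dataclass, field


@dataclass
class ModeGroup:
    """One physical set of buttons, rings and strips sharing a mode."""
    name: str
    num_modes: int
    toggle_buttons: set = field(default_factory=set)
    mode: int = 0

    def button_pressed(self, button):
        """Cycle to the next mode if this button is a mode toggle.

        Returns True when the press changed the mode.
        """
        if button in self.toggle_buttons and self.num_modes > 1:
            self.mode = (self.mode + 1) % self.num_modes
            return True
        return False


# Caller-side policy: what the ring does in each mode (example from the text).
RING_ACTIONS = {0: "scroll", 1: "zoom"}


def ring_action(group):
    """Look up what a ring event in this group should do right now."""
    return RING_ACTIONS.get(group.mode, "unassigned")
```

A tablet with a left and a right group would hold two ModeGroup instances; pressing the toggle button on the left group changes what the left ring does and leaves the right ring alone.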

Mode switching on Wacom tablets is actually software-controlled. The tablet relies on some daemon running to intercept button events and write to the right sysfs files to toggle the LEDs. In the past this was handled by e.g. a callout by gnome-settings-daemon. The first libinput draft implementation took over that functionality so we only have one process to handle the events. But there are a few issues with that approach. First, we need write access to the sysfs file that exposes the LED. Second, running multiple libinput instances would result in conflicts during LED access. Third, the sysfs interface is decidedly nonstandard and quite quirky to handle. And fourth, the most recent device, the Express Key Remote, has hardware-controlled LEDs.

So instead we opted for a two-factor solution: the non-standard sysfs interface will be deprecated in favour of a proper kernel LED interface (/sys/class/leds/...) with the same contents as other LEDs. And second, the kernel will take over mode switching using LED triggers that are set up to cover the most common case - hitting a mode toggle button changes the mode. Benjamin Tissoires is currently working on those patches. Until then, libinput's backend implementation will just pretend that each tablet only has one mode group with a single mode. This allows us to get the rest of the userstack in place and then, once the kernel patches are in a released kernel, switch over to the right backend.

Posted Mon Jul 11 01:28:00 2016 Tags:

The Washington Post is running a story alleging that surveys show gun ownership in the U.S. is at a 40-year low. I won’t link to it.

This is at the same time gun sales are at record highs.

The WaPo’s explanation is, basically, that all these guns are being bought by the same fourteen survivalists in Idaho.

Mine is that the number of gun owners with a justified fear that “surveys” are a data-gathering tool for confiscations is also at a record high, and therefore that the number lying to nosy strangers about having no guns is at a record high.

I think there’s a way to discriminate between these cases on the evidence.

It’s not NICS records, because those get destroyed after a timeout. Thankfully…

In any consumer market, a reliable way to tell if it’s broadening or narrowing is whether manufacturers’ and retailers’ product ranges are expanding or contracting. SKUs are expensive; having more complicates everybody’s supply chains and planning and accounting.

In a broadening market, the variety of consumer preferences is increasing. It makes sense to chase them with product variations. In a narrowing one the opposite is true, and you shed SKUs that no longer carry the overhead of their differentiation.

In early-stage technologies this effect can be masked by the normal culling of product types that happens as a technology stabilizes. There was much more variety in personal computers in 1980 than there is now! But firearms are not like this; they’re a mature technology.

So a productive question to ask is this: is the huge upswing in gun sales being accompanied by a broadening of product ranges? Google-fu did not provide a definite answer, but I can think of several indicators.

A big one is the explosion in sales of aftermarket parts for AR-15 customization. If that’s a sign of a contracting market, I’ll eat the grips on my Kimber. Another is the way new product classes keep coming out and being difficult to buy until gunmakers tool up to meet demand. The most recent case of this I know of was subcompact (3.5″-barrel) .45ACPs.

Open question for my blog regulars: can we find good public measures for SKU diversity in this space?

Posted Sun Jul 3 11:37:51 2016 Tags:

Haven’t been blogging for a while because I’ve been deep in coding and HOWTO-writing. Follows the (slightly edited) text of an email I wrote to the NTPsec devel list that I think might be of interest to a lot of my audience.

One of the questions I get a lot is: How do you do it? And what is “it”, anyway? The question seems like an inquiry into the mental stance that a systems architect has to have to do his job.

So, um, this is it. If you read carefully, I think you’ll learn a fair bit even if you haven’t a clue about NTP itself.

Today, after a false start yesterday and a correction, I completed a patch sequence that makes a significant structural change to NTP that isn’t just removing cruft.

This is kind of a first. Yes, I’ve made some pretty dramatic changes to the code over the last year, but other than the not-yet-successful TESTFRAME scaffolding they were almost all bug fixes, refactorings, or creative removals. The one exception, JSON reporting from ntpdig, was rather trivial.

[What I didn’t say to the list, because they already know it, is that the code was such a rubble pile that it actually took that year to clean up to the point where a change like this was reasonable to attempt.]

What I’ve succeeded in doing is almost completely removing from the code the assumption that refclock addresses necessarily have the special form 127.127.t.u. The only code that still believes this is in the ntp.conf configuration parser, and the only reason *it* still believes this is in order not to break the existing syntax of refclock declarations.

(In fact, clock addresses do still have this form internally, but that is only to avoid surprising older ntpq instances; nothing in the NTPsec code now requires it.)

I’ve also made substantial progress towards eliminating driver-type magic numbers from the code. The table that used to indirect from driver-type numbers to driver-type shortnames is gone; instead, the driver shortname string is what it should be – an element of the driver method table – and there is only one type-number-to-driver indirection, a table in refclock_conf.c.

This is all clearing the decks for a big user-visible change. I’m going to fix the frighteningly awful refclock declaration syntax. Consider this example:

# Uses the shared-memory driver, accepting fixes from a running gpsd
# instance watching one PPS-capable GPS. Accepts in-band GPS time (not
# very good, likely to have jitter in the 100s of milliseconds) on one
# unit, and PPS time (almost certainly good to 1 ms or less) on
# another.  Prefers the latter.
 
# GPS Serial data reference (NTP0)
server 127.127.28.0
fudge 127.127.28.0 refid GPS
 
# GPS PPS reference (NTP1)
server 127.127.28.1 prefer
fudge 127.127.28.1 refid PPS

The misleading “server” keyword for what is actually a reference clock. The magic 127.127.t.u address, which is the only way you *know* it’s a reference clock. Some attributes of the clock being specified in a mystery ‘fudge’ command only tied in by the magic server address. The magic driver type number 28. The fail is strong here. The only excuse for this garbage (and it’s not much of one – Mills was smart enough to know better) is that it was designed decades ago in a more primitive time.

Here’s how I think it should look:

refclock shm unit 0 refid GPS
refclock shm unit 1 prefer refid PPS

No magic IPv4 address, no split syntax, no driver type number (it’s been replaced by the driver shortname “shm”). It should be less work to get the rest of the way to this (while still supporting the old syntax for backward compatibility) than I’ve done already – I’ve already written the grammar, only the glue code still needs doing.
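
The mapping from the old form to the proposed one is mechanical, which a short Python sketch can show. This is an illustration only, not NTPsec code: the function names are made up, the shortname table contains just the one entry given in the text (28 is "shm"), and it simply carries the options over in the order they appear rather than checking them against the real grammar.

```python
# Driver type number -> shortname. Only 28 -> "shm" comes from the text;
# a real table would cover every driver.
DRIVER_SHORTNAMES = {28: "shm"}


def parse_refclock_address(address):
    """Split a magic 127.127.t.u address into (driver type, unit).

    Returns None for anything that is not a refclock address.
    """
    octets = address.split(".")
    if len(octets) != 4 or octets[:2] != ["127", "127"]:
        return None
    return int(octets[2]), int(octets[3])


def convert(lines):
    """Merge old-style server/fudge pairs into one refclock line per clock."""
    clocks = {}  # (driver type, unit) -> option words, in first-seen order
    for line in lines:
        words = line.split("#", 1)[0].split()  # drop comments
        if len(words) < 2 or words[0] not in ("server", "fudge"):
            continue
        clock = parse_refclock_address(words[1])
        if clock is not None:
            clocks.setdefault(clock, []).extend(words[2:])
    return [
        " ".join(["refclock", DRIVER_SHORTNAMES[driver], "unit", str(unit), *opts])
        for (driver, unit), opts in clocks.items()
    ]
```

Fed the four server/fudge lines from the example above, convert() produces the two refclock lines shown here; ordinary "server" lines with real host addresses are left alone.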

An unobvious benefit of this change is that the driver reference pages are going to become a lot less mystifying. I can still remember how and why my head hurt on first reading them. Removing the magic addresses and mystery numbers will help a lot.

Along the way I learned a lot about how ntpq and mode 6 responses work. (Like NTP in general, it’s an odd combination of elegant structural ideas with an astonishing accumulation of cruft on top.) In order to remove the magic-address assumptions from ntpq I had to add another variable, “displayname”, to the set you get back when you request information about a peer. In effect, ntpd gets to say “*this* is how you should label this peer”, and ntpq uses that to decorate the clock entries in its -p output.

This has the minor downside that new ntpqs will display 127.127.28.0 (rather than “SHM(0)”) when querying Classic ntpd, which doesn’t ship that variable. Oh well…almost everyone disables remote querying anyway. It was the right thing to do; ntpq has no business knowing about driver type numbers.

(Grrrrr…Actually, *nobody* has any business knowing about driver type numbers. Things that have names should be referred to by name. Making humans maintain a level of indirection from names to numbers is perverse, that’s the kind of detail we have computers to track. Or, to put it slightly differently, “1977 called – it wants its ugly kluge back.”)

It’s easy for codebases this size to wind up as huge balls of mud. There are several nearly equivalent ways to describe my job as a systems architect; one of them centers on enforcing proper separation of concerns so collapse-to-mudball is prevented. The changes I’ve just described are a significant step in the good direction.

Posted Sun Jun 26 02:51:12 2016 Tags:

I ran some quick numbers on the last retargeting period (blocks 415296 through 416346 inclusive) which is roughly a week’s worth.

Blocks were full: median 998k mean 818k (some miners blind mining on top of unknown blocks). Yet of the 1,618,170 non-coinbase transactions, 48% were still paying dumb, round fees (like 5000 satoshis). Another 5% were paying dumb round-numbered per-byte fees (like 80 satoshi per byte).

The mean fee was 24051 satoshi (~16c), the mean fee rate 60 satoshi per byte. But if we look at the amount you needed to pay to get into a block (using the second cheapest tx which got in), the mean was 16.81 satoshis per byte, or about 5c.

tl;dr: It’s like a tollbridge charging vehicles 7c per ton, but half the drivers are just throwing a quarter as they drive past and hoping it’s enough. It really shows fees aren’t high enough to notice, and transactions don’t get stuck often enough to notice. That’s surprising; at what level will they notice? What wallets or services are they using?
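
For anyone who wants to repeat this kind of count, the classification can be sketched in a few lines of Python. The post does not say exactly how "round" was defined, so the thresholds below (a multiple of 1000 satoshis, or a whole multiple of 10 satoshis per byte) are assumptions, as are the function names and the 226-byte transaction size used in the note that follows.

```python
def fee_rate(fee, size):
    """Fee rate in satoshis per byte (fee in satoshis, size in bytes)."""
    return fee / size


def classify_fee(fee, size):
    """Label a fee as 'round-absolute', 'round-per-byte' or 'other'.

    Assumed definitions of 'round': a multiple of 1000 satoshis, or a
    whole multiple of 10 satoshis per byte.
    """
    if fee > 0 and fee % 1000 == 0:
        return "round-absolute"
    if fee > 0 and fee % size == 0 and (fee // size) % 10 == 0:
        return "round-per-byte"
    return "other"
```

Under these assumptions, a 226-byte transaction paying a flat 5000 satoshis counts as "round-absolute" and works out to about 22 satoshis per byte, while one paying 80 satoshis per byte (18080 satoshis) counts as "round-per-byte"; both are above the 16.81 satoshis per byte that was needed on average.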

Posted Wed Jun 15 03:00:05 2016 Tags:
A few notes on technology-fueled normalization of lynch mobs targeting both the accuser and the accused. #ethics #crime #punishment
Posted Tue Jun 7 01:00:47 2016 Tags:

I’ve been learning more about tinkering with electronics lately, soldering and casemodding and that sort of thing. The major reason for this is NTPsec-related and will be discussed in a near-future post, but here is an early consequence unrelated to that project:

Converting a PS/2 TrackMan Marble to USB

Posted Thu Jun 2 13:42:55 2016 Tags:

Union syntax

(I'm trying to do this as a quick post in response to some questions I received on this topic. I realize this will probably reopen the whole discussion about the best syntax for types, but sorry folks, PEP 484 was accepted nearly a year ago, after many months of discussions and hundreds of messages. It's unlikely that any idea you can think of here would be new. This post just explains the rationale of one particular decision and tries to put it in some context.)
I've heard some grumbling about the union syntax in PEP 484: Union[X, Y, Z] (where X, Y and Z are arbitrary type expressions). In the past people have suggested X|Y|Z for this, or (X, Y, Z) or {X, Y, Z}. Why did we go with the admittedly clunkier Union[X, Y, Z]?

First of all, despite all the attention drawn to it, unions are actually a pretty minor feature, and you shouldn't be using them much. So you also shouldn't care that much.

Why not X|Y|Z?

This won't fly because we want compatibility with versions of Python 3 that were already frozen (see below). We want to be able to express e.g. a union of int and str, which under this notation would be written as int|str. But for that to fly we'd have to modify the builtin 'type' class to implement __or__ -- and that wouldn't fly on already-frozen Python versions. Supporting X|Y only for types (like List) imported from the typing module and some other notation for builtin types would only sow confusion. So X|Y|Z is out.

Why not {X, Y, Z}?

That's the set with elements X, Y and Z, using the builtin set notation. We can usefully consider types to be sets of values, and this makes a union a set of values too (that's why it's called union :-).

However, {X, Y, Z} confuses the set of types with the set of values, which I consider a mortal sin. This would just cause endless confusion.

This notation would also confuse things when taking the union of several classes that overlap, e.g. if we have classes B and C, where C inherits from B, then the union of B and C is just B. But the builtin set doesn't see it that way. In contrast, the X|Y notation could actually solve this (since in principle we could overload __or__ to do whatever we want), and the Union[] operator ("functor"?) from PEP 484 indeed solves this -- in this example Union[B, C] returns the (non-union) type B, both in the type checker and at runtime.

Why not (X, Y, Z)?

That's the tuple (X, Y, Z). It has the same disadvantages as {X, Y, Z}, but at least it has the advantage of being similar to how unions are expressed as arguments to isinstance(), for example isinstance(x, (int, str, list)) or isinstance(x, (Sequence, Mapping)). (Similarly the except clause: try: ... / except (KeyError, IndexError): ...)

Another problem with tuples is that the tuple syntax is already overloaded in so many ways that it would be confused with other uses even more easily. One particular confusion would be other generic types, for which we'd still want to use square brackets. (You can't really beat Iterable[int] for clarity if you have an iterable of integers. :-) Suppose you have a sequence of values that could be integers or strings. In PEP 484 notation we write this as Sequence[Union[int, str]]. Using the tuple notation we'd want to write this as Sequence[(int, str)]. But it turns out that the __getitem__ overload on the metaclass can't tell the difference between Sequence[(int, str)] and Sequence[int, str] -- and we would like to reject the latter as a mistake since Sequence[] is a generic class over a single parameter. (An example of a generic class over two parameters would be Mapping[K, V].) Disambiguating all this would place us on very thin ice indeed.
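
The __getitem__ ambiguity is easy to see in isolation. In this small sketch, SubscriptProbe and Seq are hypothetical stand-ins (not the typing module's implementation) whose metaclass simply hands back whatever the subscript delivered:

```python
class SubscriptProbe(type):
    """Metaclass whose __getitem__ returns exactly what it was given."""

    def __getitem__(cls, params):
        return params


class Seq(metaclass=SubscriptProbe):
    """Stand-in for a generic class such as Sequence."""


explicit_tuple = Seq[(int, str)]  # the would-be "union" spelling
two_params = Seq[int, str]        # the mistake we would like to reject

# Both subscripts arrive as the same tuple, so they cannot be told apart.
same = explicit_tuple == two_params
```

Here same is True and both values are the tuple (int, str); only a single-item subscript such as Seq[int] arrives as a bare type.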

The nail in this idea's coffin is the competing idea of using (X, Y, Z) to indicate a tuple with three items, with respective types, X, Y and Z. At first sight this seems an even better use of the tuple syntax than unions would be, and tuples are way more common than unions. But it runs afoul of the same problems with Foo[(X, Y)] vs. Foo[X, Y]. (Also, there would be no easy way to describe what PEP 484 calls Tuple[X, ...], i.e. a variable-length tuple with uniform item type X.)

PS. Why support old Python 3 versions?

The reason for supporting older versions is adoption. Only a relatively small crowd of early adopters can upgrade to the latest Python version as soon as it's out; the rest of us are stuck on older versions (even Python 2.7!).

So for PEP 484 and the typing module, we wanted to support 3.2 and up -- we chose 3.2 because it's the newest Python 3 supported by some older but still popular Ubuntu and Debian distributions. (Also, 3.0 and 3.1 were too immature at their time of release to ever have a large following.)

There's a typing package that you can install easily using pip, and this defines all sorts of useful things for typing, from Any and Union to generic versions of List and Sequence. But such a package can't modify existing builtins like int or list.

(Eventually we also added Python 2.7 support, using type comments for function signatures.)
Posted Wed May 18 18:55:00 2016 Tags:

Type annotations for fspath

Python 3.6 will have a new dunder protocol, __fspath__(), which should be supported by classes that represent filesystem paths. Examples of such classes are the pathlib.Path family and os.DirEntry (returned by os.scandir()).

You can read more about this protocol in the brand new PEP 519. In this blog post I’m going to discuss how we would add type annotations for these additions to the standard library.

I’m making frequent use of AnyStr , a quite magical type variable predefined in the typing module. If you’re not familiar with it, I recommend reading my blog post about AnyStr . You may also want to read up on generics in PEP 484 (or read mypy’s docs on the subject).

Adding os.scandir() to the stubs for os.py

For practice, let’s see if we can add something to the stub file for os.py. As of this writing there’s no typeshed information for os.scandir() , which I think is a shame. I think the following will do nicely. Note how we only define DirEntry  and scandir() for Python versions >= 3.5. (Mypy doesn’t support this yet, but it will soon, and the example here still works — it just doesn’t realize scandir()  is only available in Python 3.5.) This could be added to the end of stdlib/3/os/__init__.pyi:

from typing import Generic, AnyStr, overload, Iterator

if sys.version_info >= (3, 5):

    class DirEntry(Generic[AnyStr]):
        name = ...  # type: AnyStr
        path = ...  # type: AnyStr
        def inode(self) -> int: ...
        def is_dir(self, *, follow_symlinks: bool = ...) -> bool: ...
        def is_file(self, *, follow_symlinks: bool = ...) -> bool: ...
        def is_symlink(self) -> bool: ...
        def stat(self, *, follow_symlinks: bool = ...) -> stat_result: ...

    @overload
    def scandir() -> Iterator[DirEntry[str]]: ...
    @overload
    def scandir(path: AnyStr) -> Iterator[DirEntry[AnyStr]]: ...

Deconstructing this a bit, we see a generic class (that’s what the Generic[AnyStr]  base class means) and an overloaded function.  The scandir() definition uses @overload because it can also be called without arguments. We could also write it as follows; it’ll work either way:

    @overload
    def scandir(path: str = ...) -> Iterator[DirEntry[str]]: ...
    @overload
    def scandir(path: bytes) -> Iterator[DirEntry[bytes]]: ...

Either way there really are three ways to call scandir() , all three returning an iterable of DirEntry objects:

  • scandir() -> Iterator[DirEntry[str]] 
  • scandir(str) -> Iterator[DirEntry[str]] 
  • scandir(bytes) -> Iterator[DirEntry[bytes]] 
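
The three call forms listed above can also be observed at runtime. The following is a small sketch, assuming Python 3.5 or later; name_types() is a hypothetical helper, and the temporary directory and file name exist only for illustration.

```python
import os
import tempfile


def name_types(path=None):
    """Return the set of types seen in entry.name for one directory."""
    entries = os.scandir() if path is None else os.scandir(path)
    return {type(entry.name) for entry in entries}


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the directory is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()

    str_types = name_types(tmp)                 # str argument
    bytes_types = name_types(os.fsencode(tmp))  # bytes argument

# No argument: scans the current directory and yields str names.
default_types = name_types()

print(str_types, bytes_types, default_types)
```

This is only the runtime side of the story; the stubs are what let a type checker reach the same conclusion without running anything.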

Adding os.fspath()

Next I’ll show how to add os.fspath() and how to add support for the __fspath__()  protocol to DirEntry .

PEP 519 defines a simple ABC (abstract base class), PathLike , with one method, __fspath__() . We need to add this to the stub for os.py , as follows:

class PathLike(Generic[AnyStr]):
    @abstractmethod
    def __fspath__(self) -> AnyStr: ...

That’s really all there is to it (except for the sys.version_info  check, which I’ll leave out here since it doesn’t really work yet). Next we define os.fspath() , which wraps this protocol. It’s slightly more complicated than just calling its argument’s __fspath__()  method, because it also handles strings and bytes. So here it is:

@overload
def fspath(path: PathLike[AnyStr]) -> AnyStr: ...
@overload
def fspath(path: AnyStr) -> AnyStr: ...

Easy enough! Next is to update the definition of DirEntry. That’s easy too: in fact we only need to make it inherit from PathLike[AnyStr]; the rest is the same as the definition I gave above:

class DirEntry(PathLike[AnyStr], Generic[AnyStr]):
    # Everything else unchanged!

The only slightly complicated bit here is the extra base class Generic[AnyStr] . This seems redundant, and in fact PEP 484 says we can leave it off, but mypy doesn’t support that yet, and it’s quite harmless — this just rubs into mypy’s face that this is a generic class of one type variable (the by-now famous AnyStr ).

Finally we need to make a similar change to the stub for pathlib.py . Again, all we need to do is to make PurePath  inherit from PathLike[str] , like so:

from os import PathLike

class PurePath(PathLike[str]):
    # Everything else unchanged!

However, here we don’t add Generic , because this is not a generic class! It inherits from PathLike[str] , which is quite un-generic, since it’s PathLike specialized for just str .

Note that we don’t actually have to define the __fspath__()  method in these stubs — we’re not supposed to call them directly, and stubs don’t provide implementations, only interfaces.

Putting it all together, we see that it’s quite elegant:

for a in os.scandir('.'):
    b = os.fspath(a)
    # Here, the typechecker will know that the type of b is str!

The derivation that b has type str  is not too complicated: first, os.scandir('.')  has a str  argument, so it returns an iterator of DirEntry  objects parameterized with str , which we write as DirEntry[str] . Passing this DirEntry[str]  to os.fspath()  then takes the first of that function’s two overloads (the one with PathLike[AnyStr] ), since it doesn’t match the second one ( DirEntry  doesn’t inherit from AnyStr , because it’s neither a str  nor bytes ). Further the AnyStr type variable in PathLike[AnyStr] is solved to stand for just str , because DirEntry[str]  inherits from PathLike[str] . This is the specialized version of what the code says: DirEntry[AnyStr]  inherits from PathLike[AnyStr] .
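
The static derivation has a runtime counterpart that is easy to check. This is a sketch assuming Python 3.6 or later, where os.fspath() exists; the Checkout class and describe() helper are hypothetical names made up for the example.

```python
import os
import pathlib


class Checkout:
    """Hypothetical path-like class; behaves like PathLike[str]."""

    def __init__(self, root: str) -> None:
        self.root = root

    def __fspath__(self) -> str:
        return self.root


def describe(path):
    """Return the type name and value that os.fspath() produces for path."""
    result = os.fspath(path)
    return type(result).__name__, result


print(describe(Checkout("/srv/repo")))           # goes through __fspath__()
print(describe(pathlib.PurePosixPath("a/b")))    # pathlib paths are str-based
print(describe("plain/str"))                     # str passes straight through
print(describe(b"plain/bytes"))                  # so does bytes
```

Anything that is neither str, bytes, nor path-like is rejected by os.fspath() with a TypeError, which matches the fact that neither overload in the stub would apply.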

Okay, so maybe that last paragraph was intermediate or advanced. And maybe it could be expanded. Maybe I’ll write another blog about how type inference works, but there’s a lot on that topic, and other authors have probably already written better introductory material about generics (in other languages, though).

Making things accept PathLike

There’s a bit of cleanup work that I’ve left out. PEP 519 says that many stdlib functions that currently take strings for pathnames will be modified to also accept PathLike . For example, here’s how the signatures for os.scandir()  would change:

@overload
def scandir() -> Iterator[DirEntry[str]]: ...
@overload
def scandir(path: AnyStr) -> Iterator[DirEntry[AnyStr]]: ...
@overload
def scandir(path: PathLike[AnyStr]) -> Iterator[DirEntry[AnyStr]]: ...

The first two entries are unchanged; I’ve just added a third overload. (Note that the alternative way of defining scandir() would require more changes — an indication that this way is more natural.)

I also tried doing this with a union:

@overload
def scandir() -> Iterator[DirEntry[str]]: ...
@overload
def scandir(path: Union[AnyStr, PathLike[AnyStr]]) -> Iterator[DirEntry[AnyStr]]: ...

But I couldn’t get this to work, so the extra overload is probably the best we can do. Quite a few functions will require a similar treatment, sometimes introducing overloading where none exists today (but that shouldn’t hurt anything).

A note about pathlib : since it only deals with strings, its methods (the ones that PEP 519 says should be changed anyway) should use PathLike[str]  rather than PathLike[AnyStr] .

Acknowledgments

(Thanks for comments on the draft to Stephen Turnbull, Koos Zevenhoven, Ethan Furman, and Brett Cannon.)
Posted Wed May 18 14:06:00 2016 Tags:

The AnyStr type variable

I was drafting a blog post on how to add type annotations for the new __fspath__()  protocol (PEP 519) when I realized that I should write a separate post about AnyStr . So here it is.

A simple function on strings

Let’s write a function that surrounds a string in parentheses. We’ll put it in a file named demo.py :

def parenthesize(s):
    return '(' + s + ')'

It works, too:

>>> from demo import parenthesize
>>> print(parenthesize('hola'))
(hola)

Of course, if you pass it something that’s not a string it will fail:

>>> parenthesize(42)
Traceback (most recent call last):
  File "demo.py", line 1, in <module>
  File "demo.py", line 2, in parenthesize
TypeError: Can't convert 'int' object to str implicitly

Adding type annotations

Using PEP 484 type annotations we can clarify our little function’s signature:

def parenthesize(s: str) -> str:
    return '(' + s + ')'

Nothing to it, right? Even if you’ve never heard of PEP 484 before you can guess what this means. (Note that PEP 484 also says that the runtime behavior is unchanged. The calls I showed above will still have exactly the same effect, including the TypeError raised by parenthesize(42) .)

Polymorphic functions

Now suppose this is actually part of a networking app and we need to be able to parenthesize byte strings as well as text strings. Here’s how you’d implement that:

def parenthesize(s):
    if isinstance(s, str):
        return '(' + s + ')'
    elif isinstance(s, bytes):
        return b'(' + s + b')'
    else:
        raise TypeError(f"That's not a string, it's a {type(s)}")  # See PEP 498

With a fancy word we call that a polymorphic function. How do you write a signature for such a function? For the answer we have to dive a little deeper into PEP 484. It defines a nifty operator named Union  that lets us state that a type can be either this or that (or something else). In our case, it’s either str  or bytes , so we can write it like this:

from typing import Union

def parenthesize(s: Union[str, bytes]) -> Union[str, bytes]:
    if isinstance(s, str):
    # Etc.

Now let’s write a little main program with a bug, to show off the type checker:

from demo import parenthesize

a = parenthesize('hello')
b = parenthesize(b'hola')
c = a + b  # <-- bug here
print(c)

When we try to run this, the two parenthesize()  calls work fine (yay polymorphism!) but we get a TypeError on the last line:

$ python3 main.py 
Traceback (most recent call last):
  File "main.py", line 5, in <module>
    c = a + b  # <-- bug here
TypeError: Can't convert 'bytes' object to str implicitly

The reason should be pretty obvious: in Python 3 you can’t mix bytes and str objects. And when we type-check this program using mypy we indeed get a type error:

$ mypy main.py 
main.py:5: error: Unsupported operand types for + (likely involving Union)

Debugging the bug

So let’s try a program without a bug:

from demo import parenthesize

a = parenthesize('hello')
b = parenthesize('hola')
c = a + b  # <-- no bug here
print(c)

Run it and it works great:

$ python3 main.py
(hello)(hola)

So the type checker should be happy too, right?

$ mypy main.py
main.py:5: error: Unsupported operand types for + (likely involving Union)

Whoops! The same error. What happened? Of course, I set you up, so I can explain something about type checking.

The trouble with tribbles unions

The type checker takes the signature at face value, so that when checking the call, it infers the type Union[str, bytes]  for every call to parenthesize() , regardless of what the arguments are. This is because, for most functions of even modest complexity, a type checker doesn’t understand enough about what’s going on in the function body, so it just has to believe the types in the signature (even though in this particular case it would probably be easy enough to do better).

In our test program the types of a  and b  are both inferred to be exactly what parenthesize()  claims to return, i.e., both variables have the type Union[str, bytes] . The type checker then analyzes the expression a + b , and for this it discovers a problem: if a is either str or bytes, and so is b , then the +  operator may be invoked on any of these combinations of types: str + str , str + bytes , bytes + str , or bytes + bytes . But only the first and the last are valid! In Python 3, str + bytes  or bytes + str  are invalid operations.
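
The four combinations are easy to enumerate by hand, but here is a small sketch that tries them all on Python 3. The try_add() helper and the sample values are made up for the illustration.

```python
import itertools


def try_add(left, right):
    """Return True if left + right works, False if it raises TypeError."""
    try:
        left + right
    except TypeError:
        return False
    return True


samples = {"str": "x", "bytes": b"x"}

# Map each (left type, right type) pair to whether + is valid for it.
results = {
    (left_name, right_name): try_add(left_value, right_value)
    for (left_name, left_value), (right_name, right_value)
    in itertools.product(samples.items(), repeat=2)
}

for pair, ok in sorted(results.items()):
    print(pair, "valid" if ok else "invalid")
```

A type checker that only knows "Union[str, bytes] + Union[str, bytes]" has to assume that any of these four pairs may occur, including the two invalid ones.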

Aside: Even in Python 2, those two are suspect: while 'x' + u'y' indeed works (returning u'xy'), other combinations will raise UnicodeDecodeError, e.g.:

>>> 'Franç' + u'ois'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4:
ordinal not in range(128)

Anyway, the type checker doesn’t like this business, and it rejects operations on Unions where some combinations are invalid. What can we do instead?

Function overloading

One option would be function overloading. PEP 484 defines a magical decorator, @overload , which lets us get around this problem. We could write something like this:

from typing import overload

@overload
def parenthesize(s: str) -> str: ...
@overload
def parenthesize(s: bytes) -> bytes: ...

This tells the type checker that if the argument is a str , the return value is also a str , and similarly for bytes . Unfortunately @overload  is only allowed in stub files, which are a kind of interface definition files that show a type checker the signatures of a module’s contents without giving the implementation.

Type variables

Fortunately there’s an even better way, using type variables. This is how it goes:

from typing import TypeVar

S = TypeVar('S')

def parenthesize(s: S) -> S:
    if isinstance(s, str):
        return '(' + s + ')'
    elif isinstance(s, bytes):
        return b'(' + s + b')'
    else:
        raise TypeError("That's not a string, dude! It's a %s" % type(s))

Well… Almost. Our main.py program (unchanged from above) now gets a clean bill of health, but when we type-check this version we get errors on both return  lines:

demo.py: note: In function "parenthesize":
demo.py:7: error: Incompatible return value type: expected S`-1, got builtins.str
demo.py:9: error: Incompatible return value type: expected S`-1, got builtins.bytes

This is a bit hard to fathom, but the fix is what I was leading up to anyway, so I’ll reveal it now:

from typing import TypeVar

S = TypeVar('S', str, bytes)

def parenthesize(s: S) -> S:
    if isinstance(s, str):
        return '(' + s + ')'
    elif isinstance(s, bytes):
        return b'(' + s + b')'
    else:
        raise TypeError("That's not a string, dude! It's a %s" % type(s))

The only changed line is this one:

S = TypeVar('S', str, bytes)

This notation is called a type variable with value restriction. Yes, it’s a mouthful; we sometimes also call it a constrained type variable. S is a type variable restricted to a set of types. It also has the advantage of telling the type checker that types other than str or bytes are not acceptable. Without that, a call like this would have been considered valid:

x = parenthesize(42)

because the original type variable (without the restrictions) doesn't tell mypy that this is a bad idea.

In fact, this particular use case (a type variable constrained to str or bytes) is so commonly needed that it's predefined in the typing module, and all we have to do is import it:

from typing import AnyStr

def parenthesize(s: AnyStr) -> AnyStr:
    # Etc. -- trust me, it works!
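
For completeness, here is the whole function spelled out with AnyStr, followed by the earlier main program. It is the same body as before; only the annotation changed.

```python
from typing import AnyStr


def parenthesize(s: AnyStr) -> AnyStr:
    if isinstance(s, str):
        return '(' + s + ')'
    elif isinstance(s, bytes):
        return b'(' + s + b')'
    else:
        raise TypeError("That's not a string, dude! It's a %s" % type(s))


a = parenthesize('hello')    # a type checker sees str here
b = parenthesize('hola')     # and str here
c = a + b                    # so str + str is accepted
print(c)

d = parenthesize(b'hola')    # and bytes here
```

With this signature, a call such as parenthesize(42) is something a type checker can flag, and it still raises TypeError at runtime as before.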

Real-world use of AnyStr

In fact, this is how many polymorphic functions in the os  and os.path  modules are defined. For example, in the stub for os.py  we find definitions like the following:

def link(src: AnyStr, link_name: AnyStr) -> None: ...

and also this:

def split(path: AnyStr) -> Tuple[AnyStr, AnyStr]: ...

These show us a bit more of the power of type variables: the signature for link()  indicates that either both arguments must be str  or both must be bytes ; split()  demonstrates that the type variable may also occur in more complex constructs: splitting a str returns a tuple of two str objects, while splitting bytes returns a tuple of two bytes  objects.
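
The runtime behavior that these signatures describe can be checked with a short sketch. It uses posixpath (the POSIX flavor of os.path) so the results do not depend on the platform; the can_join() helper is hypothetical, and join() stands in for link() as a function whose arguments must all be str or all be bytes.

```python
import posixpath

# split(): str in, tuple of str out; bytes in, tuple of bytes out.
str_parts = posixpath.split("/usr/lib/python3")
bytes_parts = posixpath.split(b"/usr/lib/python3")


def can_join(first, second):
    """Return True if posixpath.join() accepts this pair of arguments."""
    try:
        posixpath.join(first, second)
    except TypeError:
        return False
    return True


print(str_parts, bytes_parts)
print(can_join("a", "b"), can_join(b"a", b"b"), can_join("a", b"b"))
```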

That’s all I wanted to share about AnyStr . Thanks for comments on the draft to Stephen Turnbull, Koos Zevenhoven, Ethan Furman, and Brett Cannon.

Posted Tue May 17 16:53:00 2016 Tags:
How quantum cryptographers are stealing a quarter of a billion Euros from the European Commission. #qkd #quantumcrypto #quantummanifesto
Posted Mon May 16 22:05:36 2016 Tags:

Recently Dave Täht wrote a blog post investigating latency and WiFi scanning and came across NetworkManager’s periodic scan behavior.  When a WiFi device scans it obviously must change from its current radio channel to other channels and wait for a short amount of time listening for beacons from access points.  That means it’s not passing your traffic.

With a bad driver it can sometimes take 20+ seconds and all your traffic gets dropped on the floor.

With a good driver scanning takes only a few seconds and the driver breaks the scan into chunks, returning to the associated access point’s channel periodically to handle pending traffic.  Even with a good driver, latency-critical applications like VOIP or gaming will clearly suffer while the WiFi device is listening on another channel.

So why does NetworkManager periodically scan for WiFi access points?

Roaming

Whenever your WiFi network has multiple access points with the same SSID (or a dual-band AP with a single SSID) you need roaming to maintain optimal connectivity and speed.  Jumping to a better AP requires that the device know what access points are available, which means doing a periodic scan like NetworkManager does every 2 minutes.  Without periodic scans, the driver must scan at precisely the worst moment: when the signal quality is bad, and data rates are low, and the risk of disconnecting is higher.

Enterprise WiFi setups make the roaming problem much worse because they often have tens or hundreds of access points in the network and because they typically use high-security 802.1x authentication with EAP.  Roaming with 802.1x introduces many more steps to the roaming process, each of which can fail the roaming attempt.  Strategies like pre-authentication and periodic scanning greatly reduce roaming errors and latency.

User responsiveness and Location awareness

The second reason for periodic scanning is to maintain a list of access points around you for presentation in user interfaces and for geolocation in browsers that support it.  Up until a couple years ago, most Linux WiFi applets displayed a drop-down list of access points that you could click on at any time.  Waiting for 5 to 15 seconds for a menu to populate or ‘nmcli dev wifi list’ to return would be annoying.

But with the proliferation of WiFi networks (often more than 30 or 40 if you live in a flat) those lists became less and less useful, so UIs like GNOME Shell moved to a separate window for WiFi lists.  This reduces the need for a constantly up-to-date WiFi list and thus for periodic scanning.

To help support these interaction models and click-to-scan behaviors like Mac OS X or Maemo, NetworkManager long ago added a D-Bus API method to request an out-of-band WiFi scan.  While it’s pretty trivial to use this API to initiate geolocation or to refresh the WiFi list based on specific user actions, I’m not aware of any clients using it well.  GNOME Shell only requests scans when the network list is empty and plasma-nm only does so when the user clicks a button.  Instead, UIs should simply request scans periodically while the WiFi list is shown, removing the need for yet another click.

WHAT TO DO

If you don’t care about roaming, and I’m assuming David doesn’t, then NetworkManager offers a simple solution: lock your WiFi connection profile to the BSSID of your access point.  When you do this, NetworkManager understands that you do not want to roam and will disable the periodic scanning behavior.  Explicitly requested scans are still allowed.
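
One way to set the BSSID lock is through nmcli, by writing the 802-11-wireless.bssid property of the connection profile. The sketch below only builds the command line; the profile name and BSSID are placeholders, bssid_lock_command() is a hypothetical helper, and actually running the command is left commented out because it changes system configuration.

```python
import re
import subprocess

# Six colon-separated hex octets, e.g. 00:11:22:33:44:55.
_BSSID_RE = re.compile(r"^[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}$")


def bssid_lock_command(connection_id, bssid):
    """Build the nmcli argv that pins a WiFi profile to one access point."""
    if not _BSSID_RE.match(bssid):
        raise ValueError("not a BSSID: %r" % (bssid,))
    return [
        "nmcli", "connection", "modify", connection_id,
        "802-11-wireless.bssid", bssid,
    ]


# Placeholder values; substitute your own profile name and AP address.
command = bssid_lock_command("Home WiFi", "00:11:22:33:44:55")
print(" ".join(command))

# To apply it for real (requires NetworkManager and suitable permissions):
# subprocess.run(command, check=True)
```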

You can also advocate that your favorite WiFi interface add support for NetworkManager’s RequestScan() API method and begin requesting periodic scans when WiFi lists are shown or when your browser uses geolocation.  When most do this, perhaps NetworkManager could be less aggressive with its own periodic scans, or perhaps remove them altogether in favor of a more general solution.

That general solution might involve disabling periodic scanning when the signal strength is extremely good and scanning more aggressively when signal strength drops below a threshold.  But signal strength drops for many reasons like turning on a microwave, closing doors, turning on Bluetooth, or even walking to the next room, and triggering a scan then still interrupts your VOIP call or low ping headshot.  This also doesn’t help people who aren’t close to their access point, leading to the same scanning problem David talks about if you’re in the basement but not if you’re in the bedroom.

Another idea would be to disable periodic scanning when latency critical applications are active, but this requires that these applications consistently set the IPv4 TOS field or use the SO_PRIORITY socket option.  Few do so.  This also requires visibility into kernel mac80211 queue depths and would not work for proprietary or non-mac80211-based drivers.  But if all the pieces fell into place on the kernel side, NetworkManager could definitely do this while waiting for applications and drivers to catch up.
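
For application authors, marking a socket as latency critical takes only a couple of setsockopt() calls. This is a sketch, not something NetworkManager consumes today: the DSCP value 46 (Expedited Forwarding) and priority 6 are illustrative choices, the helper names are made up, and SO_PRIORITY is Linux-specific, so it is looked up defensively.

```python
import socket


def dscp_to_tos(dscp):
    """Convert a 6-bit DSCP code point into the IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp << 2  # the DSCP occupies the upper six bits of the TOS byte


def mark_latency_critical(sock, dscp=46, priority=6):
    """Set the IPv4 TOS field and, where available, SO_PRIORITY on sock."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    so_priority = getattr(socket, "SO_PRIORITY", None)  # absent on some platforms
    if so_priority is not None:
        sock.setsockopt(socket.SOL_SOCKET, so_priority, priority)


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        mark_latency_critical(sock)
        print(hex(sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)))
```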

If you’ve got other ideas, feel free to propose them.

Posted Mon May 16 17:43:57 2016 Tags:

I've posted in the past about the Oracle vs. Google case. I'm for the moment sticking to my habit of only commenting when there is a clear court decision. Having been through litigation as the 30(b)(6) witness for Conservancy, I'm used to court testimony and why it often doesn't really matter in the long run. So much gets said by both parties in a court case that it's somewhat pointless to begin analyzing each individual move, unless it's for entertainment purposes only. (It's certainly as entertaining as most TV dramas, really, but I hope folks who are watching step-by-step admit to themselves that they're just engaged in entertainment, not actual work. :)

I saw a lot go by today with various people as witnesses in the case. About the only part that caught my attention was that Classpath was mentioned over and over again. But that's not for any real salient reason, only because I remember so distinctly sitting in a little restaurant in New Orleans with RMS and Paul Fisher, talking about how we should name this yet-to-be-launched GNU project “$CLASSPATH”. My idea was that it was a shell variable that would expand to /usr/lib/java, so, in my estimation, it was a way to name the project “User Libraries for Java” without having to say the words. (For those of you that were still children in the 1990s, trademark aggression by Sun at the time on their word mark for “Java” was fierce; it was worse than the whole problem with the Unix trademark, which led in turn to the GNU name.)

But today, as I saw people all over the Internet quoting judges, lawyers and witnesses saying the word “Classpath” over and over again, it felt a bit weird to think that, almost 20 years ago sitting in that restaurant, I could have said something other than Classpath and the key word in Court today might well have been whatever I'd said. Court cases are, as I said, dramatic, and as such, it felt a little like having my own name mentioned over and over again on the TV news or something. Indeed, I felt today like I had some really pointless, one-time-use superpower that I didn't know I had at the time. I now further have this feeling of: “darn, if I knew that was the one thing I did that would catch on this much, I'd have tried to do or say something more interesting”.

Naming new things, particularly those that have to replace other things that are non-Free, is really difficult, and, at least speaking for myself, I definitely can't tell when I suggest a name whether it is any good or not. I actually named another project, years later, that could theoretically get mentioned in this case, Replicant. At that time, I thought Replicant was a much more creative name than Classpath. When I named Classpath, I felt it was a somewhat obvious corollary to the “GNU's Not Unix” line of thinking. I also recall distinctly that I really thought the name lost all its cleverness when the $ and the all-caps were dropped, but RMS and others insisted on that :).

Anyway, my final message today is to the court transcribers. I know from chatting with the court transcribers during my depositions in Conservancy's GPL enforcement cases that technical terminology is really a pain. I hope that the term I coined that got bandied about so much in today's testimony was not annoying to you all. Really, no one thinks about the transcribers in all this. If we're going to have lawsuits about this stuff, we should name stuff with the forethought of making their lives easier when the litigation begins. :)

Posted Sat May 14 01:14:54 2016 Tags:

While most of the NTPsec team was off at Penguicon, the NTP Classic people shipped a release patched for eleven security vulnerabilities in their code. Which might have been pretty embarrassing, if those vulnerabilities were in our code, too. People would be right to wonder, given NTPsec’s security focus, why we didn’t catch all these sooner.

In fact, we actually did pre-empt most of them. The attack surface that eight of these eleven security bugs penetrate isn’t present at all in NTPsec. The vulnerabilities were in bloat and obsolete features we’ve long since removed, like the Mode 7 control channel.

I’m making a big deal about this because it illustrates a general point. One of the most effective ways to harden your code against attack – perhaps the most effective – is to reduce its attack surface.

Thus, NTPsec’s strategy all along has centered on aggressive cruft removal. This strategy has been working extremely well. Back in January our 0.1 release dodged two CVEs because of code we had already removed. This time it was eight foreclosed – and I’m pretty sure it won’t be the last time, either. If only because I ripped out Autokey on Sunday, a notorious nest of bugs.

Simplify, cut, discard. It’s often better hardening than anything else you can do. The percentage of NTP Classic code removed from NTPsec is up to 58% now, and could easily hit 2/3rds before we’re done.

Posted Thu May 5 03:16:04 2016 Tags:

A recurring question I encounter is whether uinput or evdev should be the approach to implement some feature the user cares about. This question is unfortunately wrongly framed, as uinput and evdev have no real overlap and work independently of each other. This post outlines what the differences are. Note that "evdev" here refers to the kernel API, not to the X.Org evdev driver.

First, the easy flowchart: do you have to create a new virtual device that has a set of specific capabilities? Use uinput. Do you have to read and handle events from an existing device? Use evdev. Do you have to create a device and read events from that device? You (probably) need two processes, one doing the uinput bit, one doing the evdev bit.

Ok, let's talk about the difference between evdev and uinput. evdev is the default input API that all kernel input device nodes provide. Each device provides one or more /dev/input/eventN nodes that a process can interact with. This usually means checking a few capability bits ("does this device have a left mouse button?") and reading events from the device. The events themselves are in the form of struct input_event, defined in linux/input.h, and consist of an event type (relative, absolute, key, ...) and an event code specific to the type (x axis, left button, etc.). See linux/input-event-codes.h for a list, or linux/input.h in older kernels. Specific to evdev is that events are serialised: they are framed by events of type EV_SYN and code SYN_REPORT. Anything before a SYN_REPORT should be considered one logical hardware event. For example, if you receive an x and y movement within the same SYN_REPORT frame, the device has moved diagonally.
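
To make the SYN_REPORT framing concrete, here is a minimal sketch that splits a raw byte stream into frames. It assumes the classic struct input_event layout (a timeval of two native longs, then a 16-bit type, a 16-bit code and a 32-bit signed value) and works on synthetic data rather than a real device node; frames() is a made-up helper, and for real devices libevdev remains the better choice.

```python
import struct

# Assumed layout of struct input_event: timeval (two longs), type, code, value.
EVENT = struct.Struct("llHHi")

# Values from linux/input-event-codes.h.
EV_SYN, EV_REL = 0x00, 0x02
SYN_REPORT = 0x00
REL_X, REL_Y = 0x00, 0x01


def frames(raw):
    """Yield lists of (type, code, value), one list per SYN_REPORT frame."""
    frame = []
    for offset in range(0, len(raw) - EVENT.size + 1, EVENT.size):
        _sec, _usec, etype, code, value = EVENT.unpack_from(raw, offset)
        if etype == EV_SYN and code == SYN_REPORT:
            yield frame  # everything collected so far is one hardware event
            frame = []
        else:
            frame.append((etype, code, value))


# Synthetic stream: a diagonal movement, then a purely horizontal one.
raw = b"".join([
    EVENT.pack(0, 0, EV_REL, REL_X, 5),
    EVENT.pack(0, 0, EV_REL, REL_Y, -3),
    EVENT.pack(0, 0, EV_SYN, SYN_REPORT, 0),
    EVENT.pack(0, 100, EV_REL, REL_X, 1),
    EVENT.pack(0, 100, EV_SYN, SYN_REPORT, 0),
])

parsed = list(frames(raw))
print(parsed)
```

The first frame carries both an x and a y delta, so it represents a single diagonal movement; the second frame is a separate, x-only movement.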

Any event coming from the physical hardware goes into the kernel's input subsystem and is converted to an evdev event that is then available on the event node. That's pretty much it for evdev. It's a fairly simple API but it does have some quirks that are not immediately obvious so I recommend using libevdev whenever you actually need to communicate with a kernel device directly.

uinput is something completely different. uinput is a kernel device driver that provides the /dev/uinput node. A process can open this node, write a bunch of custom commands to it and the kernel then creates a virtual input device. That device, like all others, presents a /dev/input/eventN node. Any event written to the /dev/uinput node will re-appear in that /dev/input/eventN node, and a device created through uinput looks pretty much like a physical device to a process. You can detect uinput-created virtual devices, but usually a process doesn't need to care, so all the common userspace (libinput, Xorg) doesn't bother. The evemu tool is one of the most commonly used applications using uinput.

Now, there is one thing that may cause confusion: to set up a uinput device you'll have to use the familiar evdev type/code combinations (followed by a couple of uinput-specific ioctls). Events written to uinput also use the struct input_event form, so looking at uinput code one can easily mistake it for evdev code. Nevertheless, the two serve completely different purposes. As with evdev, I recommend using libevdev to initialise uinput devices. libevdev has a couple of uinput-related functions that make life easier.

Below is a basic illustration of how things work together. The physical devices send their events through the event nodes and libinput is a process that reads those events. evemu talks to the uinput module and creates a virtual device which then too sends events through its event node - for libinput to read.

Posted Wed May 4 23:42:00 2016 Tags: