This feed omits posts by jwz. Just 'cause.
Paul Watson of Sea Shepherd has been arrested on apparently bogus charges from 2002.
Indian police have attacked women who reported violence. A protest against this was banned.
Hard evidence that sharing movies doesn't mean movies can't be profitable.
The article repeats the movie company propaganda by calling sharing "piracy". To use that term is to support the War on Sharing; please join me in refusing to use it.
Twenty years after the Rio Earth summit, our environment has got worse.
Nuclear reactors are a bad investment. UK electric companies have funny ways of saying that.
The Heartland Institute is taking plenty of heat for its absurd billboards that compared recognition of global heating to being a mass murderer.
It is a pleasure to see greedy bastards stumble, but those that wish to pay for global heating denialism will find another organ to use.
This discussion of the death penalty in the US ends with an interesting observation of where it is used and why.
If you want to discourage something, the effective way is to punish it most of the time. Punishing it severely but rarely tends not to work.
Carl Malamud campaigns to make US legal requirements publicly available on the Internet.
Don't Buy the Spin: How Cutting the Pentagon's Budget Could Boost the Economy.
First giflib release since I reassumed the lead. Short version: lots of useless old cruft thrown out, everything Coverity-scanned, one minor resource leak found and fixed.
As I’ve previously noted, this code was in astonishingly good shape considering its great age. I vigorously beat the dust out of it with Coverity and cppcheck, but found only one very minor bug that way – a malloc leak following a malloc failure in the code that makes color-table structures. I think it is rather likely this case has never actually been triggered.
I retired six utilities, added a bunch of documentation and made it HTML-able, fixed a minor bug in how output GIF versions are computed in an upward-compatible way, and fixed a thread-safety problem. I added a rudimentary regression-test suite; this could use some more work. All tracker bugs have been resolved and closed.
Next release, 5.0, will make one very minor change in the API near extension blocks.
First off, "world domination" is not the only metric, nor the most useful one in every case. We have tens of millions of users around the world and I'm sure they'd appreciate it if we didn't forget them. I am one of them, and I know I certainly feel that way. You may be as well.
There's another aspect to that article: it suggests concentrating on mobile. Now .. where have I heard that before? Oh, right: everyone saying the desktop is dead, long live the web, we should focus all our efforts there.
Wake up call #1: hundreds of millions of laptop and desktop systems are sold each year. It's a market that isn't going away. Nothing is "killing" it. It is being displaced to some extent, but it isn't going away. It's less interesting because it isn't growing, and the corporate drive for ever increasing profits thus stamps it as "mature, boring." This is different from "dead."
Wake up call #2: there is no reason we can't do desktop and mobile and web. Yes, "and", not "or". Free software projects could create very compelling horizontal integration between these sectors as long as we treat them as not being mutually exclusive choices. This is part of the strategy of both Apple and Microsoft (and others), and the market would berate either for saying that they were abandoning some of these tech segments to focus exclusively on one. In KDE, our focus on the desktop has been extended to devices and the web in the last few years, and that's a good thing, something that should be supported. Which brings me to:
Wake up call #3: If people engaged in supporting Free software can't manage to keep long term focus, not freak out and continue to support the efforts that are ongoing ... we're screwed. We are, and will be, our own best friends or our own worst enemies. It starts by not telling others to stop supporting the efforts of thousands of volunteers and companies from around the world. That is, simply put, a betrayal.
A sophisticated view would be an examination of how we can draw together the efforts and successes of mobile for the desktop to give it a boost; to analyze how Free software desktop products and Free software mobile and web products can integrate and work well together.
There are projects and teams out there doing exactly that right now. Several teams in KDE are doing exactly that, and we mean business. It would be nice to not have to keep pulling knives out of our backs from journalists as we continue pushing forward. Long live Free software on the desktop, mobile, web and server!
My regular readers will know that (a) I’ve recently been pounding bugs out of GPSD with Coverity, and (b) I hate doing stupid clicky-dances on websites when I think I ought to be able to shove them a programmatically-generated job card that tells them what to do.
So, here’s a side-effect of my recent work with Coverity: coverity-submit. Set up a config file once, and afterwards just run coverity-submit in your project directory and stand back. Supports multiple projects. Because, manularity is evil.
Here’s the HTML documentation.
BitTorrent was NEVER the Performance Nightmare
BitTorrent is a lightning rod on two fronts: it is used to download large files, which the MPAA sees as a nightmare to their business model, and BitTorrent has been a performance nightmare to ISP’s and some users. Bram Cohen has taken infinite grief for BitTorrent over the years, when the end user performance problems are not his fault.
Nor is TCP the performance problem, despite Bram Cohen’s recent flame about TCP on his blog.
I blogged about this before but several key points seem to have been missed by most: BitTorrent was never the root cause of most of the network speed problems BitTorrent triggered when BitTorrent deployed. The broadband edge of the Internet was already broken when BitTorrent deployed, with vastly too much uncontrolled buffering, which we now call bufferbloat. As my demonstration video shows, even a single simple TCP file copy can cause horrifying speed loss in an overbuffered network. Speed != bandwidth, despite what the ISP’s marketing departments tell you.
The Bufferbloat Nightmare
We can set BitTorrent and Ledbat aside here for a bit: they are not the actual problem and solution to our performance problems, and never were. Bufferbloat, and the amazingly stupid edge of the broadband Internet are.
The AQM algorithm called RED was invented in the 1990s by Sally Floyd and Van Jacobson to control internet router congestion in router queues, but history has shown that RED is fundamentally flawed and usually left unused. RED cannot handle the variable bandwidth problem we have in the edge of the Internet. These results were never properly published, for humorous and sad reasons. At best, RED is a tool that may be helpful in large Internet routers until a better algorithm is available.
The article published last week entitled “Controlling Queue Delay” by Kathie Nichols and Van Jacobson in the section “Understanding Queues,” helps greatly in understanding queuing, as well as introducing a novel AQM algorithm called CoDel (“coddle”) to manage buffers which works radically better than RED ever did. A clocked window protocol such as TCP (or others) can end up with a standing queue, adding delay that will never go away, killing speed. That standing queue cannot dissipate, and in fact (since TCP sees no timely loss), it slowly grows over time, as you can see in my original traces, and the delay grows and grows until any size buffer you care to name fills.
Worse yet, TCP’s responsiveness to competing flows is quadratic: 10 times too much buffering means TCP gets out of the way 100 times more slowly. Buffer queues must be managed; that is what an AQM does by signalling the endpoints that the buffers are filling. But it is hard to figure out when to signal. No fixed buffer size can ever be correct, particularly in the edge of the Internet.
The fundamental problem is bufferbloat, and the amazingly stupid devices we have in the edge of the internet. These devices and computers typically have but one horribly bloated queue, with no queue management in them. The CoDel algorithm attacks the fundamental problem of standing queues, which RED did not; RED also required manual tuning. CoDel keeps buffers working the way they should: removing the standing queue, running usually nearly empty, so they can absorb bursts of packets that are inevitable in packet switched networks. The amount of buffering then becomes (probably almost) irrelevant, other than possibly costing money and power. For reasons I’ll cover shortly in another blog post, we think additional measures are also necessary; but the missing piece to solving bufferbloat has been a fully adaptive AQM algorithm that works well. The rest is engineering.
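To make the idea of removing a standing queue concrete, here is a heavily simplified sketch of the kind of decision CoDel makes each time a packet leaves the queue. It is a paraphrase of the published description, not the Linux implementation. The 5 ms target and 100 ms interval are assumed defaults taken from the Nichols and Jacobson article, the class and method names are invented for illustration, and several details of the real algorithm (such as how the drop count carries over between dropping episodes) are left out.

```python
import math

TARGET_MS = 5.0      # tolerated standing-queue delay (assumed default)
INTERVAL_MS = 100.0  # how long delay must persist before acting (assumed default)


class CoDelSketch:
    """Simplified drop decision; names are illustrative, not a real API."""

    def __init__(self):
        self.first_above_time = None  # deadline after which delay counts as persistent
        self.dropping = False         # are we currently in a dropping episode?
        self.count = 0                # drops made in the current episode
        self.drop_next = 0.0          # time (ms) of the next scheduled drop

    def _delay_persisted(self, sojourn_ms, now_ms):
        """True once queue delay has stayed at or above target for a full interval."""
        if sojourn_ms < TARGET_MS:
            # The queue drained below target: it was a burst, not a standing queue.
            self.first_above_time = None
            return False
        if self.first_above_time is None:
            # First packet seen above target: start the clock.
            self.first_above_time = now_ms + INTERVAL_MS
            return False
        return now_ms >= self.first_above_time

    def should_drop(self, sojourn_ms, now_ms):
        """Decide at dequeue time whether to drop (or ECN-mark) this packet."""
        persisted = self._delay_persisted(sojourn_ms, now_ms)
        if self.dropping:
            if not persisted:
                self.dropping = False
                return False
            if now_ms >= self.drop_next:
                # Delay is still too high: drop again, and sooner next time.
                self.count += 1
                self.drop_next += INTERVAL_MS / math.sqrt(self.count)
                return True
            return False
        if persisted:
            self.dropping = True
            self.count = 1
            self.drop_next = now_ms + INTERVAL_MS / math.sqrt(self.count)
            return True
        return False
```

The point of the sketch is that a burst which clears within the interval is never penalised, while delay that persists gets signalled to the endpoints at a steadily increasing rate until the standing queue goes away.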
Without some mechanism to signal the endpoints to adjust their speed in the face of buffers filling (either by packet drop or ECN), we’ll continue to have problems with everything (including HTTP) and everything built on top of TCP. CoDel, we believe, is the tool here. Knowing a queue is filling excessively and managing it is a fundamental improvement over trying to infer queue filling from delay.
Running CoDel code for Linux (dual licensed BSD/GPL2) is already staged in net-next for the next Linux merge window. Testing continues, but initial results match CoDel simulations. CoDel works, and works well.
You can build a delay based TCP, as was done in TCP Vegas, but it can lose out to conventional TCP’s and has some other unsolved problems. No vendor is going to want to ship something that can make their systems work worse relative to competitors. Getting everyone to convert at once all over the Internet is a non-starter. I do not see a path forward in that direction.
The complete bufferbloat solution includes deploying CoDel in our operating systems and in our home routers (which come from the factory with firmware that is based on at least five year old antique code); our broadband gear (which comes with a single queue, no classification or “fair” queuing) needs upgrade or replacement. Bufferbloat is also hiding elsewhere in the Internet, including our cellular wireless systems.
Back to BitTorrent and its history, and what we can learn from the incorrectly diagnosed nightmare it triggered.
My cable modem is more than 10 times overbuffered according to the grossly flawed 100ms rule of thumb, even with a 2Mbps up-link. The netalyzr data shows my modem is typical. My brother’s DSL connection, rather than the 1.2 seconds up-link buffering, has 6 seconds of buffering, at least 60 times the worse than useless traditional 100ms “rule of thumb”. Tail drop is the worst of all possible worlds; you delay signalling the endpoints until the last possible instant.
At the time BitTorrent deployed, most cable customers had only a 256K to 768k uplink, with the same DOCSIS 2 modem; so rather than the 1.2 seconds I did on a 2Mbps uplink, it was correspondingly worse, and was comparable to my brother’s current DSL service.
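A rough back-of-the-envelope check of those numbers, as a sketch. It assumes the modem's buffer is a fixed number of bits, so the time needed to drain it scales inversely with the uplink rate, and it treats "2Mbps" as 2,000,000 bits per second. The function and constant names are illustrative only.

```python
# Sketch: how long a fixed-size modem buffer takes to drain at different
# uplink rates. The constant buffer size is an assumption, not a measured
# property of any particular modem.

MEASURED_DELAY_S = 1.2           # uplink buffering seen on the 2Mbps uplink
MEASURED_UPLINK_BPS = 2_000_000  # treating 2Mbps as 2,000,000 bits/second

# Implied buffer size in bits: delay multiplied by rate.
BUFFER_BITS = MEASURED_DELAY_S * MEASURED_UPLINK_BPS


def drain_delay_seconds(uplink_bps, buffer_bits=BUFFER_BITS):
    """Seconds needed to empty a full buffer at the given uplink rate."""
    return buffer_bits / uplink_bps


if __name__ == "__main__":
    for label, rate in (("2Mbps", 2_000_000), ("768k", 768_000), ("256k", 256_000)):
        print(f"{label:>6}: {drain_delay_seconds(rate):.1f} s of buffering")
```

Under these assumptions the 768k and 256k tiers come out at roughly 3 and 9 seconds of buffering, which brackets the 6 seconds seen on the DSL line.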
BitTorrent filled these buffers. It was one of the first applications to be left running that would routinely fill the uplinks. BitTorrent was damned by association since it was often found running at the time the network engineers looked to see why the customer was complaining.
ISP’s reduced their nightmare overnight with a configuration change: Comcast, for example, upped their minimum upstream bandwidth to 784Kb, so the most bloated buffers became 1/3 the size, measured in time, overnight. Many customers had long since bought the 2Mbps upstream service. The video demonstrates just how bad typical bufferbloat is on 2Mbps. 784k (with the same cable modem) will be three times worse!
BitTorrent may have problems that make it disliked by ISP’s (having to do with large scale traffic shifting), but ISP’s really hate customers calling up unhappy: this comes directly out of their bottom line profit, and is a competitive problem as well (to the extent there is competition between ISP’s; at my house, there is none).
There is one fly in the ointment for uTP/Ledbat, however. Since CoDel will allow us to finally attack the real problem and keep delays low all the time, Ledbat will no longer sense delay and will cease to be effective at keeping out of the way of TCP. Ledbat behaves like TCP Reno in that case. This is what diffserv and “fair” queuing techniques were invented for. HTTP TCP traffic should have interactive priority, while downloads of all sorts, including HTTP downloads, BitTorrent, scp, and other bulk transport, should have lower priority. So if uTP/Ledbat also marks their traffic, we can deal with it in the edge of the network where we deploy CoDel and keep it from interfering with other traffic. We have to make our home routers and broadband less stupid; AQM is necessary, but not sufficient to get us a “real time” Internet. More about this soon.
Our edge network devices and our computers are stupid and broken. Most ISP senior executives likely still think it was BitTorrent causing their nightmares. It wasn’t BitTorrent’s fault at all, in the end. It was bufferbloat. Anything else you do with TCP or other protocols can and does cause serious problems. So long as we are stupid enough to think memory is free and how buffering is handled doesn’t matter, we are doomed. So get off of Bram’s case.
Network Neutrality and a Call for Transparency
Those who think they understand the network neutrality debate triggered by BitTorrent and do not understand bufferbloat and its history are wrong. Both sides of this debate need to step back and rethink what happened in its light. A new application, BitTorrent, deployed which caused ISP’s real severe operational nightmares. Their phones were ringing off the hook with unhappy customers due to horrifying performance. I made the same service calls about terrible performance, but I wasn’t using BitTorrent.
The ISP’s bufferbloat nightmare was hidden: no ISP wanted to admit they had a serious performance problem in public, and they misdiagnosed the real cause. BitTorrent is often left running for long periods; so it often happened to be present when ISP’s would troubleshoot. In secret, measures were taken to try to control the nightmare. This lack of transparency was the root cause of the blow-up. Opacity contributed to a half-decade delay in diagnosing and understanding bufferbloat. It will take at least half a decade to deploy fixes everywhere they are needed.
We can expect future problems like this unless there is much greater transparency into operational issues occurring in networks in the Internet. The Internet engineering community as a whole did not have enough eyes on the problem to diagnose bufferbloat properly when bufferbloat first became severe. A very senior ISP engineer played the key role in bufferbloat’s final diagnosis, handing me the largest number of the pieces needed to assemble my dismal puzzle, and closing the loop to both ICSI and Dave Clark’s warning to him about the “big buffer problem”, but diagnosis could and should have happened five years earlier. Problems take much longer to solve when few people (even the very capable ones at that ISP) have access to the information needed for diagnosis.
When similar events happen in the future, what should we do then? How do we quickly diagnose and fix problems, rather than blaming the mostly innocent, and causing complete confusion on the root cause? What do we do while figuring out how to fix problems and deploying the fix? Sometimes it will be a simple and quick fix. Sometimes the fix will be hard and lengthy, as in bufferbloat. Sometimes the fix may be to the application itself, if it is badly designed (and we should think about whether the network needs ways to protect itself). Sometimes it will make sense to manage traffic, in some (neutral) way, temporarily.
When will the next operational nightmare occur? And how long will it take for us to figure out what is going on when it happens? Will the right people be contacted quickly? How and where do operational problems get raised, and to whom, with what expertise when they occur? How, when, where, and with whom is information shared for diagnosis? Should there be consequences for hiding operational problems? How do we best observe the operating Internet? The need for transparency is the fundamental issue.
We are flying on an Internet airplane in which we are constantly swapping the wings, the engines, and the fuselage, with most of the cockpit instruments removed but only a few new instruments reinstalled. It crashed before; will it crash again?
For the next growing nightmare is certainly already hidden somewhere.
It is when, not if, the next nightmare arrives to haunt us.
I’ve been pounding on GPSD with the Coverity static analyzer’s self-build procedure for several days. It is my great pleasure to report that we have just reached zero defect reports in 72.8KLOC. Coverity says this code is clean. And because I think this should be an example unto others, I shall explain how I think others can do likewise.
OK, if you’re scratching your head…Coverity is a code-analysis tool – an extremely good one, probably at this moment the best in the world (though LLVM’s open-source ‘scan-build’ is chasing it and seems likely to pass it sometime down the road). It’s proprietary and normally costs mucho bucks, but as a marketing/goodwill gesture the company allows open source projects to register with them and get remote use of an instance hosted at the company’s data center.
I dislike proprietary tools in general, but I also believe GPSD’s reliability is extremely important. Navigation systems are life-critical – bugs in them can kill people. Therefore I’ll take all the help I can get pushing down our error rate, and to hell with ideological purity if that gets in the way.
Coverity won’t find everything, of course – it’s certainly not going to rescue you from a bad choice of algorithm. But it’s very, very good at finding the sorts of lower-level mistakes that human beings are very bad at spotting – memory allocation errors, resource leaks, null-pointer dereferences and the like. These are what drive bad code to crashes, catatonia, and heisenbugs.
Excluding false positives and places Coverity was being a bit anal-retentive without finding an actual bug, I found 13 real defects on this pass – all on rarely-used code paths, which makes sense for reasons I’ll explain shortly. That’s less than 1 defect per 5 KLOC (KLOC = 1000 logical lines of code) which is pretty good considering our last scan was in 2007. Another way to look at that data is that, even while adding large new features like AIS support and NMEA2000 and re-engineering the entire reporting protocol, we’ve introduced a bit fewer than three detectable defects per year in the last five years.
Those of you who are experienced software engineers will be picking your jaws up off the floor at that statistic. Those of you who aren’t – this is at least two orders of magnitude better than typical. There are probably systems architects at Fortune 500 companies who would kill their own mothers for defect rates that low. Mythically, military avionics software and the stuff they load on the Space Shuttle is supposed to be this good, except I’ve heard from insiders that rather often it isn’t.
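The arithmetic behind those figures can be checked directly. This is only a sketch that plugs in the numbers quoted above (13 defects, 72.8 KLOC, five years since the previous scan); the variable names are illustrative.

```python
# Figures quoted in the post.
DEFECTS_FOUND = 13
CODE_SIZE_KLOC = 72.8
YEARS_SINCE_LAST_SCAN = 5  # previous scan was in 2007

# Defect density expressed both ways round.
kloc_per_defect = CODE_SIZE_KLOC / DEFECTS_FOUND
defects_per_kloc = DEFECTS_FOUND / CODE_SIZE_KLOC

# Rate at which detectable defects were introduced.
defects_per_year = DEFECTS_FOUND / YEARS_SINCE_LAST_SCAN

if __name__ == "__main__":
    print(f"one defect per {kloc_per_defect:.1f} KLOC")
    print(f"{defects_per_kloc:.2f} defects per KLOC")
    print(f"{defects_per_year:.1f} defects per year")
```

That works out to one defect per 5.6 KLOC and 2.6 defects per year, consistent with "less than 1 defect per 5 KLOC" and "a bit fewer than three" per year.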
So, how did we do it? On no budget and with all of three core developers, only one working anywhere even near full time?
You’ll be expecting me to say the power of open source, and that’s not wrong. Sunlight is the best disinfectant, many eyeballs make bugs shallow, etc. etc. While I agree that’s next to a necessary condition for defect rates this low, it’s not sufficient. There are very specific additional things we did – things I sometimes had to push on my senior devs about because they at times looked like unnecessary overhead or obsessive tailchasing.
Here’s how you engineer software for zero defects:
1. Be open source.
And not just because you get helpful bug reports from strangers, either, although that does happen and can be very important. Actually, my best bug-finders are semi-regulars who don’t have commit access to the code but keep a close eye on it anyway. Like, there’s this Russian guy who often materializes on IRC late at night and can barely make himself understood in English, but his patches speak clearly and loudly.
But almost as importantly, being open source plugs you into things like the Debian porterboxes. A couple of weeks ago I spent several days chasing down port failures that I thought might indicate fragile or buggy spots in the code. It was hugely helpful that I could ssh into all manner of odd machines running Linux, including a System 390 mainframe, and run my same test suite on all of them to spot problems due to endianness or word-size or signed-char-vs.-unsigned-char differences.
Closed-source shops, in general, don’t have any equivalent of the Debian porterboxes because they can’t afford them – their support coalition isn’t broad enough. When you play with the open-source kids, you’re in the biggest gang with the best toys.
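For readers who have not been bitten by the portability differences mentioned above, here is a small illustration of two of them. It is written in Python rather than C purely for brevity, and uses the standard `struct` module to mimic what happens when the same bytes are interpreted under different byte-order or signedness assumptions. It is not taken from GPSD.

```python
import struct

# Endianness: the same 32-bit value laid out on little- and big-endian hosts.
value = 0x01020304
little = struct.pack("<I", value)  # bytes as a little-endian machine stores them
big = struct.pack(">I", value)     # bytes as a big-endian machine stores them

# Reading big-endian bytes with little-endian assumptions silently yields
# a different number; nothing crashes, the data is just wrong.
misread = struct.unpack("<I", big)[0]

# Signed vs. unsigned char: the byte 0xFF is -1 on platforms where plain
# char is signed and 255 where it is unsigned.
raw = b"\xff"
as_signed = struct.unpack("b", raw)[0]
as_unsigned = struct.unpack("B", raw)[0]

if __name__ == "__main__":
    print(little.hex(), big.hex(), hex(misread))
    print(as_signed, as_unsigned)
```

Bugs of this kind tend to stay invisible on the developer's own machine, which is why running one test suite across many architectures is so useful.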
2. Invest your time heavily in unit tests and regression tests.
GPSD has around 90 unit tests and regression tests, including sample device output for almost every sensor type we support. I put a lot of effort into making the tests easy and fast to run so they can be run often – and they are, almost every time executable code is modified. This makes it actively difficult for random code changes to break our device drivers without somebody noticing right quick.
Which isn’t to say those drivers can’t be wrong, just that the ways they can be wrong are constrained to be through either (a) a protocol-spec-level misunderstanding of what the driver is supposed to be doing, or (b) an implementation bug somewhere in the program’s state space that is obscure and difficult to reach. Coverity only turned up two driver bugs – static buffer overruns in methods for changing the device’s reporting protocol and line speed that escaped notice because they can’t be checked in our test harnesses but only on a live device.
This is also why Coverity didn’t find defects on commonly-used code paths. If there’d been any, the regression tests probably would have smashed them out long ago. I put in a great deal of boring, grubby, finicky work getting our test framework in shape, but it has paid off hugely.
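The general shape of such a regression harness can be sketched in a few lines. This is not GPSD's actual test driver: the file layout (a captured `.log` sample next to a `.chk` file holding the expected output) and the names `run_regressions` and `toy_decode` are assumptions made for illustration.

```python
import pathlib


def toy_decode(raw):
    """Stand-in for a real device driver: emit one report line per input line."""
    return "".join(f"sentence:{line.decode('ascii')}\n" for line in raw.splitlines())


def run_regressions(sample_dir, decode):
    """Replay every captured sample and compare against its stored expected output.

    Returns the names of the samples whose output no longer matches.
    """
    failures = []
    for sample in sorted(pathlib.Path(sample_dir).glob("*.log")):
        expected_path = sample.with_suffix(".chk")  # foo.log -> foo.chk
        actual = decode(sample.read_bytes())
        # A missing check file counts as a failure rather than a silent pass.
        if not expected_path.exists() or actual != expected_path.read_text():
            failures.append(sample.name)
    return failures
```

Calling `run_regressions("samples", toy_decode)` returns an empty list when every sample still decodes to its recorded output. Because a run like this is cheap, it can be repeated after every change to executable code.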
3. Use every fault scanner you can lay your hands on.
Ever since our first Coverity scan in 2007 I’d been trying to get a repeat set up, but Coverity was unresponsive and their internal processes clearly rather a shambles until recently. But there were three other static analyzers I had been applying on a regular basis – splint, cppcheck, and scan-build.
Of these, splint is (a) the oldest, (b) the most effective at turning up bugs, and (c) far and away the biggest pain in the ass to use. My senior devs dislike the cryptic, cluttery magic comments you have to drop all over your source to pass hints to splint and suppress its extremely voluminous and picky output, and with some reason. The thing is, splint checking turns up real bugs at a low but consistent rate – one or two each release cycle.
cppcheck is much newer and much less prone to false positives. Likewise scan-build. But here’s what experience tells me: each of these three tools finds overlapping but different sets of bugs. Coverity is, by reputation at least, capable enough that it might dominate one or more of them – but why take chances? Best to use all four and constrain the population of undiscovered bugs into as small a fraction of the state space as we can.
And you can bet heavily that as new fault scanners for C/C++ code become available I’ll be jumping right on top of them. I like it when programs find low-level bugs for me; that frees me to concentrate on the high-level ones they can’t find.
4. Be methodical and merciless.
I don’t think magic or genius is required to get defect densities as low as GPSD’s. It’s more a matter of sheer bloody-minded persistence – the willingness to do the up-front work required to apply and discipline fault scanners, write test harnesses, and automate your verification process so you can run a truly rigorous validation with the push of a button.
Many more projects could do this than do. And many more projects should.
The GPS with my magic modification that makes it into a 1ms-accurate time source over USB arrived here last week. And…wow. It works. Not only is it delivering 1PPS where I can see it, it’s the best GPS I’ve ever handled on a couple other axes as well, including superb indoor performance. Despite the fact that it’s been sitting on my desk five feet from a window blocked by large trees, it acquired sat lock in seconds and (judging by the steadily blinking LED) doesn’t appear to have lost it even transiently at any time since.
(Fun fact about that blinking LED on your GPS – that’s actually being lit up by the 1PPS pulse! Yes, the dumb flashing LED telling you your GPS has a fix is actually marking top-of-second with 50ns atomic-clock accuracy – kind of like using an F16 to deliver junk mail.)
I’m kind of boggled, actually. This device, my very first hardware hack, went from mad gleam in my eye to shipping for production in less than ten weeks. No, you can’t easily buy one yet, but that’ll change within a few weeks when the first U.S. retailer lands a shipment.
Um, so maybe I really am Manfred Macx after all? I have spent an awful lot of time pulling people into agalmic positive-sum games, and the hypervelocity hack of the market I’ve just done (make a bunch of other people rich and empowered with a simple idea and some connective juice) is very much the same sort of thing Manfred does all through Charles Stross’s novel Accelerando. The guys on the thumbgps-devel list think this is hilarious and have talked the Chinese into nicknaming the device the Macx-1. Two of them are now addressing me as ‘Manfred’ in a ha-ha-only-serious way; I am not sure I approve of this.
The Chinese we’re dealing with (the company is Navisys) seem to be enjoying all this. Of course they make agreeable noises at customers as a matter of commercial reflex, and it’s not easy to be sure through the slightly stiff Chinglish they speak, but…I think they actually like us. I think they’re not used to having customers that are interesting and know their engineering and make jokes at the same time. It seems to have been a fun ride for all parties involved.
The non-Plain-Jane concept designs that the thumbgps list was kicking around haven’t completely died as topics of discussion, but the existence of real hardware for cheap does tend to concentrate minds on it. The other company I was talking with, UniTraq, hasn’t been heard from in a couple of weeks; perhaps they lost interest after we downchecked the CP2101 USB adapter in their prototypes.
Dunno what the quantity-one retail price in the U.S. will be yet, but a little birdie tells me Navisys is quoting less than $30 qty 100, so make your own guess about retailer markup. No, it’s not on the Navisys website yet, but they are taking bulk orders. Ask for the Macx-1 by name – formally it’s a revision of the GR601W, but they had to shift from a dongle to a mouse enclosure for the prototypes at least and it’s unknown to me whether the older designation will survive. I suspect the Chinese are still thinking out how exactly to market this thing.
There’s an opportunity here for anyone in the retail consumer-electronics biz. This is a great product – inexpensive, well designed, almost uniquely capable. My opinion of uBlox (the GPS chip’s vendor) has gone way, way up; this beats the snot out of the SiRF-II- and SiRF-III-based designs I’m used to even if you ignore the timing-source use.
It’s pretty hard to see how this project could have gone better, actually. Now it’s time for phase II, where we use a hundred or so copies of the Macx-1 to build the Cosmic Background Bufferbloat Detector and fix the Internet.
As I mentioned in a previous blog entry, we'll be shipping the Vivaldi tablet computer with 1GB of RAM .. and today I can tell you even more good news: we've doubled the internal storage to 8GB as well. We'll be settling on the USA pricing shortly as well, and I think people will be pleasantly surprised with where that lands.
Purchase orders for the first production runs of devices have gone in. This is truly a "point of no return" for the project, and that is very, very satisfying to have reached. We have some of the typical right-to-the-wire engineering work to do on the software side, but then we'll be pulling all the triggers and emails will pour forth and sales will open.
We've been signing letters of understanding with various companies as part of an effort to build a partner network around Make·Play·Live. This will allow us to provide services and support around Vivaldi, the Add-Ons App and future efforts that would be impossible to do otherwise. I've got a whole separate blog entry brewing about it which I'll release after the official announcement next week.
Speaking of sales, yesterday the first Make·Play·Live accounts were created using the Add-ons App. It was also the day that the credit card processing system went live, making addons.makeplaylive.com a fully operational battle station. In step with this, today is the last day of development for version 1.0 of the Add-Ons app. We'll start loading content on the server soon and move it over to an SSL secured home on port 80. A development installation will remain up on port 3000, and we intend to keep that open to the public as well.
Aaaand, I have another long blog entry in the "philosopher/pragmatist" series already written and waiting for one more round of editing before pushing "publish" on it.
Somewhere in all this, I also managed to catch up with mailing lists and push a fix to libplasmagenericshell in kde-workspace that fixes a crash when loading themes without window shadows. I have to say "thanks" to Marius Cirsta from Frugalware for doing the detective work that tracked down the source of the problem and made my job simple.
Installing the tree
Getting that setup is quite easy these days:

    mkdir -p xorg/util
    git clone git://anongit.freedesktop.org/git/xorg/util/modular xorg/util/modular
    cd xorg
    ./util/modular/build.sh --clone --autoresume built.modules /opt/xorg

That takes a while but if any component fails to build (usually due to missing dependencies) just re-run the last command. The built.modules file contains all successfully built modules and the script will simply continue from the last component. Despite the name, build.sh will also install each component into the specified prefix.
You get everything here, including a shiny new copy of xeyes. Yes, what you always wanted, I know
Note that build.sh is just a shell script, so you can make changes to it. Disable the parts you don't want (fonts, for example) by commenting them out. Or alternatively, generate a list of all modules, remove the ones you don't want or need and build with that set only:

    ./util/modular/build.sh -L > module_list
    vim module_list # you can skip fonts, apps (except xkbcomp) and exotic drivers
    ./util/modular/build.sh --clone --autoresume built.modules --modfile module_list /opt/xorg

Either way, you end up with /opt/xorg/bin/Xorg, the X server binary. I just move my system-installed one aside and then symlink the new one.

    sudo mv /usr/bin/Xorg /usr/bin/Xorg_old
    sudo ln -s /opt/xorg/bin/Xorg /usr/bin/Xorg

Next time when gdm starts the server, it'll start the one from git. You can now update modules from git one-by-one as you need to and just run make install in all of them. Alternatively, running the build.sh script again without the --clone parameter will simply git pull in each module.
Setting up the environment
I then define a few environment variables. In my .zshrc I have
alias mpx=". $HOME/.exportrc.xorg"
and that file contains
export PKG_CONFIG_PATH=/opt/xorg/lib/pkgconfig:/opt/xorg/share/pkgconfig
export LD_LIBRARY_PATH=/opt/xorg/lib/
export PATH=/opt/xorg/bin:$PATH
export ACLOCAL="aclocal -I /opt/xorg/share/aclocal"
export MANPATH=/opt/xorg/share/man/
So running "mpx" will start git versions of the clients, link clients against git versions of the libraries, or build against git versions of the protocol.
Why this setup?
The biggest advantage of this setup is simple: the system install doesn't get touched at all, and if the git X server breaks, changing the symlink back to /usr/bin/Xorg_old gives me a working X again. And it's equally quick to test Fedora rpms: just flick the symlink back and restart the server. I have similar trees for gnome, wayland, and a few other large projects.
It also makes it simple to test if a specific bug is a distribution bug or an upstream bug. Install the matching X server branch instead of master and with a bit of symlink flicking you can check if the bug reproduces in both. For example, only a few weeks ago I noticed that xinput got BadAtom errors when run from /usr/bin but not when run from /opt/xorg/bin. It turned out to be something fixed in upstream libXi but not yet backported to Fedora.
The drawback of this setup is that whenever the xorg-x11-server-Xorg module is updated, I need to move and symlink again. That could be automated with a script but so far I've just been too lazy to do it.
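The move-and-symlink step described above could be scripted along these lines. This is only a sketch: the function names use_git_xorg and use_system_xorg are made up for illustration, and the paths are passed in as arguments so the logic can be tried on a scratch directory before it goes anywhere near /usr/bin.

```shell
#!/bin/bash
# Hypothetical helpers (not part of build.sh or any X.Org tooling).

# use_git_xorg LINK GIT_BIN BACKUP
# Make LINK a symlink to the git-built server, keeping the packaged
# binary as BACKUP.
use_git_xorg() {
    local link=$1 git_bin=$2 backup=$3
    # After a package update LINK is a regular file again: move it aside,
    # replacing any older backup.
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        mv -f "$link" "$backup" || return 1
    fi
    ln -sfn "$git_bin" "$link"
}

# use_system_xorg LINK BACKUP
# Flick the symlink back to the packaged binary saved as BACKUP.
use_system_xorg() {
    local link=$1 backup=$2
    [ -e "$backup" ] || return 1   # nothing saved, nothing to switch to
    ln -sfn "$backup" "$link"
}
```

With the paths from this post, and run as root, that would be use_git_xorg /usr/bin/Xorg /opt/xorg/bin/Xorg /usr/bin/Xorg_old after each package update, and use_system_xorg /usr/bin/Xorg /usr/bin/Xorg_old to get the packaged server back.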
[Update 11.05.12: typo and minor fixes, explain build.sh -L]
Background: a not very clear article in ACM Queue led to a post by Bram Cohen claiming TCP sucks.
The first article is long and seems technically correct, although in my opinion it over-emphasizes unnecessary details and under-emphasizes some really key points. The second article then proceeds to misunderstand many of those key points and draw invalid conclusions, while attempting to argue in favour of a solution (uTP) that is actually a good idea. So I'm writing this, I suppose, to refute the second article in order to better support its thesis. That makes sense, right? No? Well, I'm doing it anyway.
First of all, the main problem we're talking about here, "bufferbloat," is actually two problems that we'd better separate. To oversimplify only a little, problem #1 is oversized upstream queues in your cable modem or DSL router. Problem #2 is oversized queues everywhere else on the Internet.
The truth is, for almost everyone reading this, you don't care even a little bit about problem #2. It isn't what makes your Internet slow. If you're running an Internet backbone, maybe you care because you can save money, in which case, go hire a consultant to figure out how to fine tune your overpriced core routers. Jim Gettys and others are on a crusade to get more ISPs to do this, which I applaud, but that doesn't change the fact that it's irrelevant to me and you because it isn't causing our actual problem. (Van Jacobson points this out a couple of times in the course of the first article, but one gets the impression nobody is listening. I guess "the Internet might collapse" makes a more exciting article.)
What I want to concentrate on is problem #1, which actually affects you and which you have some control over. The second article, although it doesn't say so, is also focused on that. The reason we care about that problem is that it's the one that makes your Internet slow when you're uploading stuff. For example, when you're running (non-uTP) BitTorrent.
This is where I have to eviscerate the second article (which happens to be by the original BitTorrent guy) a little. I'll start by agreeing with his main point: uTP, used by modern BitTorrent implementations, really is a very good, very pragmatic, very functional, already-works-right-now way to work around those oversized buffers in your DSL/cable modem. If all your uploads use uTP, it doesn't matter how oversized the buffers are in your modem, because they won't fill up, and life will be shiny.
The problem is, uTP is a point solution that only solves one problem, namely creating a low-priority uplink suitable for bulk, non-time-sensitive uploads that intentionally give way to higher priority stuff. If I'm videoconferencing, I sure do want my BitTorrent to scale itself back, even down to zero, in favour of my video and audio. If I'm waiting for my browser to upload a file attachment to Gmail, I want that to win too, because I'm waiting for it synchronously before I can get any more work done. In fact, if me and my next-door neighbour are sharing part of the same Internet link, I want my BitTorrent to scale itself back even to help out his Gmail upload, in the hope that he'll do the same for me (automatically of course) when the time comes. uTP does all that. But for exactly that reason, it's no good for my Gmail upload or my ssh sessions or my random web browsing. If I used uTP for all those things, then they'd all have the same priority as BitTorrent, which would defeat the purpose.
That gives us a clue to the problem in Cohen's article: he's really disregarding how different protocols interoperate on the Internet. (He discounts this as "But game theory!" as if using sarcasm quotes would make game theory stop predicting inconvenient truths.) uTP was *designed* to interact well with TCP. It was also designed for a world with oversized buffers. TCP, of course, also interacts well with TCP, but it never considered bufferbloat, which didn't exist at the time. Our bufferbloat problems - at least, the thing that turns bufferbloat from an observation into a problem - come down to just that: TCP's designers couldn't design for it, because it didn't exist.
Oddly enough, fixing TCP to work around bufferbloat is pretty easy. The solution is "latency-based TCP congestion control," the most famous implementation of which is TCP Vegas. Sadly, when you run it or one of its even better successors, you soon find out that old-style TCP always wins, just like it always wins over uTP, and for exactly the same reason. That means, essentially, that if anyone on the Internet is sharing bandwidth with you (they are), and they're running traditional-style TCP (virtually everyone is), then TCP Vegas and its friends make you a sucker with low speeds. Nobody wants to be a sucker. (This is the game theory part.) So you don't want to run latency-based TCP unless everyone else does first.
If you're Bram Cohen, you decide this state of affairs "sucks" and try to single-handedly convince everyone on the Internet to simultaneously upgrade their TCP stack (or replace it with uTP; same undeniable improvement, same difficulty). If you co-invented the Internet, you probably gave up on that idea in the 1970's or so, and are thinking a little more pragmatically. That's where RED (and its punny successors like BLUE) come in.
Now RED, as originally described, is supposed to run on the actual routers with the actual queues. As long as you know the uplink bandwidth (which your modem does know, outside annoyingly variable things like wireless), you can fairly easily tune the RED algorithm to an appropriate goal queue length and off you go.
By the way, a common misconception about RED, one which VJ briefly tried to dispel in the first article ("mark or drop it") but which is again misconstrued in Cohen's article, is that if you use traditional TCP plus RED queuing, you will still necessarily have packet loss. Not true. The clever thing about RED is you start managing your queue before it's full, which means you don't have to drop packets at all - you can just add a note before forwarding that says, "If I weren't being so nice to you right now, I would have dropped this," which tells the associated TCP session to slow down, just like a dropped packet would have, without the inconvenience of actually dropping the packet. This technique is called ECN (explicit congestion notification), and it's incidentally disabled on most computers right now because of a tiny minority of servers/routers that still explode when you try to use it. That sucks, for sure, but it's not because of TCP, it's because of poorly-written software. That software will be replaced eventually. I assure you, fixing ECN is a subset of replacing the TCP implementation for every host on the Internet, so I know which one will happen sooner.
(By the way, complaints about packet dropping are almost always a red herring. The whole internet depends on packet dropping, and it always has, and it works fine. The only time it's a problem is with super-low-speed interactive connections like ssh, where the wrong pattern of dropped packets can cause ugly multi-second delays even on otherwise low-latency links. ECN solves that, but most people don't use ssh, so they don't care, so ECN ends up being a low priority. If you're using ssh on a lossy link, though, try enabling ECN.)
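For anyone who wants to try that on Linux, ECN is controlled by the net.ipv4.tcp_ecn sysctl. A minimal sketch, assuming a Linux machine and root access (other operating systems use different knobs):

```shell
# Show the current setting:
#   0 = ECN off
#   1 = request ECN on outgoing connections and accept it on incoming ones
#   2 = only use ECN when the other side asks for it
sysctl net.ipv4.tcp_ecn

# Turn ECN on until the next reboot.
sudo sysctl -w net.ipv4.tcp_ecn=1

# To keep it across reboots, add this line to /etc/sysctl.conf:
#   net.ipv4.tcp_ecn = 1
```

If some site suddenly stops responding after this, setting the value back is the quickest way to check whether one of those exploding servers or routers is in the path.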
The other interesting thing about RED, somewhat glossed over in the first article, is VJ's apology for mis-identifying the best way to tune it. ("...it turns out there's nothing that can be learned from the average queue size.") His new recommendation is to "look at the time you've had a queue above threshold," where the threshold is defined as the long-term observed minimum delay. That sounds a little complicated, but let me paraphrase: if the delay goes up, you want to shrink the queue. Obviously.
To shrink the queue, you "mark or drop" packets using RED (or some improved variant).
When you mark or drop packets, TCP slows down, reducing the queue size.
In other words, you just implemented latency-based TCP. Or uTP, which is really just the same thing again, at the application layer.
There's a subtle difference though. With this kind of latency-self-tuning RED, you can implement it at the bottleneck and it turns all TCP into latency-sensitive TCP. You no longer depend on everyone on the Internet upgrading at once; they can all keep using traditional TCP, but if they're going through a bottleneck with this modern form of RED, that bottleneck will magically keep its latencies low and sharing fair.
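The "time above threshold" rule from the last few paragraphs can be sketched in a few lines. This is a toy, not anybody's actual router code: the function name should_mark, the variable names, and the 5 ms and 100 ms constants are all assumptions made for illustration, and the small slack above the observed minimum delay is there only so that ordinary jitter doesn't count as a standing queue.

```shell
#!/bin/bash
# Toy sketch: signal congestion (mark or drop) only once the queueing
# delay has stayed above the threshold for a whole interval.

TARGET_MS=5        # assumed slack allowed above the observed minimum delay
INTERVAL_MS=100    # assumed time the delay may stay high before we act

min_delay_ms=-1    # lowest delay seen so far (-1 = nothing seen yet)
above_since_ms=-1  # when the delay first went above the threshold (-1 = it isn't)

# should_mark NOW_MS DELAY_MS
# Sets the global $mark to 1 if this packet should be marked or dropped,
# else 0. Uses globals, not output, so state survives between calls.
should_mark() {
    local now=$1 delay=$2
    mark=0
    # Track the long-term minimum delay.
    if [ "$min_delay_ms" -lt 0 ] || [ "$delay" -lt "$min_delay_ms" ]; then
        min_delay_ms=$delay
    fi
    if [ "$delay" -le $((min_delay_ms + TARGET_MS)) ]; then
        above_since_ms=-1        # queue has drained: reset the timer
    elif [ "$above_since_ms" -lt 0 ]; then
        above_since_ms=$now      # just crossed the threshold: start timing
    elif [ $((now - above_since_ms)) -ge "$INTERVAL_MS" ]; then
        mark=1                   # above threshold for a full interval
    fi
}
```

A router would call should_mark once per packet it is about to forward, with the current time and how long that packet sat in the queue, and then set the ECN bit (or drop the packet) whenever $mark comes back as 1. The senders on either end need no changes, which is the whole point.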
Phew. Okay, in summary:
- If you can convince everybody on the internet to upgrade, use latency-sensitive TCP. (Bram Cohen)
- Else If you can control your router firmware, use RED or BLUE. (Jim Gettys and Van Jacobson)
- Else If you can control your app, use uTP for bulk uploads. (Bram Cohen)
- Else You have irreconcilable bufferbloat.
Or, tl;dr:
- Yes. If you use BitTorrent, enable uTP and mostly you'll be fine.
setlocal noexpandtab shiftwidth=8 tabstop=8
The alternative is to add a snippet to the file itself, but not every maintainer is happy with that.
/* vim: set noexpandtab tabstop=8 shiftwidth=8: */

