Some of my thoughts, filtered slightly for public consumption.

Django is Dangerous


Django is a web framework that consists of everything a backend engineer would think a website might need. It's an ORM tool, a templating engine, a content management system, an admin console, and a collection of utilities. It was invented to power my small-town newspaper's website[0], for which it may have been well-suited. But it has two massive flaws that cripple many of the websites that use it today.

No Data Integrity

Data integrity problems are particularly insidious, because they tend to lie dormant until a site starts to get a lot of traffic and accumulates lots of data, and because bad data, unlike bad code, often cannot be fixed. Django provides plenty of data integrity footguns for the careless.

ATOMIC_REQUESTS = False by Default

Among many other things, Django provides an ORM: instead of using SQL directly, you create "models", which are Python classes with "fields" that represent various pieces of data, and the framework handles their serialization and deserialization. In doing so, it also tries to hide the finer points of database interaction from you.

Among these points is transaction management, where you decide which series of database queries need to take place as atomic transactions[1][2]. Since the framework has no insight into the conceptual relationships between queries, it has essentially two choices: the most fine-grained strategy (only individual queries are atomic) or the most coarse-grained strategy (every HTTP request is processed atomically). These choices offer the classic trade-off between performance and safety, which everyone agrees should default towards safety. Django allows the user to control this behavior via the ATOMIC_REQUESTS setting, but it defaults to False—the unsafe but more performant choice.
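Opting back into safety is a one-line change per database connection. A minimal sketch of the relevant setting (the engine and database name here are placeholders):

```python
# settings.py -- illustrative database configuration; only the
# ATOMIC_REQUESTS line is the point, the rest is placeholder.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "example_db",
        "ATOMIC_REQUESTS": True,  # wrap every view in a single transaction
    }
}
```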

Common Django usage patterns exacerbate this terrible design decision. Most Django views[3] look like the following:

def my_view(request):
    instance = GrabBagOfData.objects.get(...)   # retrieve and deserialize a model instance from the database
    instance.some_godawful_expensive_method()   # modify the instance object
    instance.save()                             # serialize and store the modified instance
    return HttpResponse(...)

Unless explicitly told otherwise, the save method writes every field on the model back to the database, including those which have not been modified—silently overwriting any changes that have been made since the model instance was retrieved. Since GrabBagOfData probably serves numerous distinct purposes, all sorts of AJAX requests are firing off concurrently to modify different fields and trampling each other's updates. Hell, some_godawful_expensive_method is often so slow that the user will manually trigger new requests before it has finished.
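The lost-update hazard is easy to reproduce without Django at all. Here is a self-contained simulation in which a "row" is just a dict and two concurrent requests write the whole row back, as save() does by default (the field names are the hypothetical ones used later in this post):

```python
# Self-contained simulation of the lost-update hazard: a "row" is a
# dict, and two concurrent requests write the entire row back.
db = {"wheel_count": 4, "is_motorcycle": False}

a = dict(db)  # request A deserializes the full row
b = dict(db)  # request B does too, concurrently

a["wheel_count"] = 2       # A fixes the wheel count
b["is_motorcycle"] = True  # B fixes the motorcycle flag

db.update(a)  # A saves every field
db.update(b)  # B saves every field, restoring the stale wheel_count

print(db["wheel_count"])  # 4: A's update was silently lost
```

In real Django code, save(update_fields=["wheel_count"]) writes only the named columns, which avoids this particular trap—at the cost of remembering to use it everywhere.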

get_or_create and update_or_create are non-atomic

These methods allow you to get or update a model instance specified by certain parameters, if it exists, and to create such an instance if it does not. What is the point of using these methods instead of, say, trying a get and falling back to a create? One would probably expect them to prevent data races, i.e. if an instance is created after the get fails but before the create is issued, these methods should prevent a duplicate instance from being created. But the Django documentation gently notes otherwise, in an unhighlighted paragraph of text more than a page into the notes for get_or_create:

This method is atomic assuming correct usage, correct database configuration, and correct behavior of the underlying database. However, if uniqueness is not enforced at the database level for the kwargs used in a get_or_create call (see unique or unique_together), this method is prone to a race-condition which can result in multiple rows with the same parameters being inserted simultaneously.

In fact, the implementation makes no attempt to prevent data races, relying entirely on the database[4]. To add insult to injury, the documentation recommends lowering the MySQL isolation level[5], thereby making your entire system less safe.
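What a race-safe version has to look like, given that only the database can arbitrate, is "optimistic get, then insert, treating a uniqueness violation as losing the race". A toy in-memory sketch (all names are illustrative; the dict stands in for a table with a unique key):

```python
# Toy sketch of a race-safe get-or-create, not Django's actual code.
class IntegrityError(Exception):
    pass

rows = {}  # key -> value, i.e. the "table"

def insert(key, value):
    if key in rows:              # the database-level uniqueness constraint
        raise IntegrityError(key)
    rows[key] = value

def get_or_create(key, value):
    try:
        return rows[key], False  # optimistic get
    except KeyError:
        try:
            insert(key, value)   # may lose a race with another writer...
            return rows[key], True
        except IntegrityError:
            return rows[key], False  # ...in which case the row now exists

obj, created = get_or_create("a", 1)
print(created)  # True
obj, created = get_or_create("a", 2)
print(created)  # False: the existing row wins
```

Without the uniqueness constraint, the IntegrityError is never raised, and two racing callers simply both insert—which is exactly the failure mode the documentation describes.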

Validation defined on the model is only enforced on the form

Django models can be associated with one or more "forms", which are Python representations of HTML forms used to create and update model instances. Forms have a lot of default behavior, particularly validation, which is controlled by properties of the model. In fact, many properties of the model exist only to control the forms' default behavior. Because nobody ever modifies a model instance except through its form, right? And good luck keeping track of which constraints are where. Non-null? That's on the model. Define choices on a field, explicitly enumerating what values it can have? That's on the form. Uniqueness? On the model. Decimal field? The model will take any string!

This inconsistent validation also results in a classic terrible user experience: forms, pre-populated with the existing data for an object, that cannot be submitted because the existing data is invalid.

Invalid values are silently coerced to None

Not consistently of course. I can't tell you how many times I've run something like the following:

>>> Model.objects.filter(field__isnull=True)    # retrieve every model instance where field is None
[]                                              # looks like there are no such instances
>>> Model.objects.get(id=12345).field is None   # check whether a particular model instance has field set to None
True                                            # it does

What has happened here is that the serialized model instance has an invalid value, but the value is not NULL, so the first query finds nothing. The second query deserializes the instance, and upon encountering the invalid value, silently treats it as if it were NULL, instead of raising an error like any sane function would. This tragedy is particularly common with datetimes, since poorly-behaved applications tend to ignore all the nastiness involved in handling datetimes correctly[6][7].
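The failure mode is easy to sketch without Django: a deserializer that swallows parse errors turns "invalid but not NULL" into None. (This is an illustration of the dynamic, not Django's actual implementation.)

```python
from datetime import datetime

def lenient_parse(raw):
    """Silently coerce garbage to None -- the behavior complained about above."""
    try:
        return datetime.strptime(raw, "%Y-%m-%d")
    except (TypeError, ValueError):
        return None

stored = "2015-02-30"         # invalid date sitting in a non-NULL column
print(lenient_parse(stored))  # None: isnull lookups miss the row,
                              # but deserialization "finds" None
```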

Duplicate fields abound

Since queries can't use computed columns and adding a real column is only a makemigrations away, models steadily accrue duplicate fields that represent the same data in different ways. Since Django doesn't support complex constraints, these fields inevitably drift out of sync. Pretty soon half of your Vehicles have is_motorcycle == True and wheel_count == 4, and you're not sure which field to trust (hint: neither).

One of the great things about Python is that you can refactor inconsistent properties like this with the @property decorator. But while the ORM allows you to access columns as properties, the reverse is not true, so you have to manually refactor every query.
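In plain Python the refactor looks like this, using the hypothetical Vehicle from above. The catch is exactly as stated: the ORM cannot translate such a property back into SQL, so every query that filtered on the duplicate column must be rewritten by hand.

```python
class Vehicle:
    def __init__(self, wheel_count):
        self.wheel_count = wheel_count  # the one authoritative field

    @property
    def is_motorcycle(self):
        # derived on access instead of stored as a second, driftable column
        return self.wheel_count == 2

v = Vehicle(wheel_count=2)
print(v.is_motorcycle)  # True
```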

Dynamic Templating

Dynamic templating gives your views an enormous surface and makes them effectively untestable[8]. But suppose that you come up with a hack to kinda-sorta test a page. Django's templating engine offers two different kinds of template inheritance, the extends and include tags. This alone means that a single template may have content spread over many files, but it gets worse. While a template can only extend one other template, it can include arbitrarily many, and can even include templates dynamically, making it nearly impossible to tell what the template looks like after resolving inheritance, or even how many distinct templates you have. I've worked on systems where nobody knew the order of magnitude of that number, and nobody would be surprised to learn it was in the millions.

Forms, our friends from the previous section, are also the kings of dynamic templating—large, critical chunks of entirely dynamically created HTML—and as such bring their own pains. Change the name of a field? Since <input> names and ids are dynamically generated, now your JavaScript and your CSS are broken! A designer wants to make an adjustment? Chances are they'll have to touch the Python code too. What could go wrong?

Performance naturally suffers too. In addition to the obvious inefficiency of building every page per-request, you have to compress them per-request as well! Or you would, if you didn't have to abandon compression completely to prevent BREACH attacks, and as a result send several times as much data with each response.


These are far from all the problems with Django[9][10], but what makes them the most insidious is that they grow with scale. Most of these are minor issues when you are small and your project is simple. And Django is useful, which is why so many projects pick it up early on. Then as your project grows, these problems get worse and affect more users.

Can Django be fixed? Most of the data integrity issues are implementation mistakes that could be fixed with relatively little work and without introducing too much backwards incompatibility. Performance would certainly suffer, particularly for get_or_create and update_or_create, but performance has never been a priority for Django. Moving validation to the model would be the primary source of backwards incompatibility, but it seems likely that any system which relies on storing invalid values is already broken. It would be difficult to offer a good way around creating duplicate fields, but this at least is a problem with most frameworks. Dynamic templating, on the other hand, was a fundamental design mistake, and migrating away from it would be almost as much work as switching frameworks altogether, so it will probably never leave Django.

In the best possible outcome, where Django fixes most of its data integrity issues and various other warts, I still would recommend against using it.

  1. ^

    The Lawrence Journal-World, served HTTP-only and from the "www2" subdomain.

  2. ^

    Atomicity is a central concept in concurrent computing, and a full introduction would not fit in a footnote. In brief, an atomic transaction is a collection of database queries which cannot be interrupted by another query, and which from the perspective of another query all appear to execute at the same time.

  3. ^

    Making this choice for the developer requires turning autocommit on, in direct contradiction of PEP 249.

  4. ^

    Views are the functions that process incoming requests and return responses.

  5. ^

    The implementation assumes that creating a duplicate will raise an IntegrityError, which is only the case if duplicates violate a database constraint.

  6. ^

    From the documentation:

    If you are using MySQL, be sure to use the READ COMMITTED isolation level rather than REPEATABLE READ (the default), otherwise you may see cases where get_or_create will raise an IntegrityError but the object won't appear in a subsequent get() call.
  7. ^

    Leap Day, Daylight Savings Time, and months indexed from 0 (thanks JavaScript) are all common causes.

  8. ^

    A surprisingly common response is "don't have poorly behaved applications modifying your database". I wonder what kind of utopia these people live in, where they control, or even know about, all the applications modifying their database.

  9. ^

    I would like to thank Tim Best for pointing this out.

  10. ^

    Like the queryset methods first() and earliest(field): earliest(field) is like order_by(field).first(), except when the queryset is empty, in which case first() returns None while earliest(field) raises an ObjectDoesNotExist exception.

  11. ^

    Or the FileField and FieldFile API, where the documentation claims that FieldFile behaves like Python's file object, for instance documenting its open method as:

    Behaves like the standard Python open() method and opens the file associated with this instance in the mode specified by mode.

    However, instead of returning an open file (which is helpfully a context manager) like the builtin open(), it returns None, so none of the same patterns apply.

Towards Universal HTTPS Adoption


HTTPS was invented in 1994. We are now on the verge of a milestone: it is estimated that by the end of 2016, adoption will reach 50%. But how the hell has it taken more than two decades to secure half of the internet?

HTTP Considered Harmful

When HTTPS was implemented, it was widely thought that encryption was only necessary in rare cases, such as financial transactions. Indeed, it is not immediately obvious that most internet traffic should be encrypted. But the internet has become far broader than the original implementers expected. People post their inner thoughts and secrets on message boards. They learn about drugs, about sexuality, about politics. They watch porn, revealing their sexual orientation. All of these activities have the potential to destroy friendships, relationships, jobs and families. And even if these activities aren't illegal now, they may become so (potentially ex post facto[0]) when someone changes countries, or when their country changes on them.

Unencrypted traffic also poses security threats for encrypted traffic. Someone's password for their gardening forum is likely to be their password for their email. When they click on a link on a news site[1] they trust, and it takes them to what looks like their bank's website, they don't bother to double-check. Third parties are left vulnerable too—injected JavaScript can make them an unwitting participant in a DDoS attack[2].

Obstacles to HTTPS Adoption

The main obstacle to HTTPS adoption is a small set of browser security policies:

  1. Mixed content (e.g. images loaded over HTTP from an HTTPS page) triggers a noticeable warning.
  2. Mixed content JavaScript or CSS is blocked outright by default.
  3. The Referer header is stripped from plain HTTP requests made from HTTPS pages.

Further note that, even when a website has successfully addressed mixed content, the mere threat of a warning or error can dissuade a sysadmin (or their boss) from deploying HTTPS.

These policies made sense to the original implementers given their assumptions about how HTTPS would be used. Any financial transaction was simply expected to be performed over HTTPS, so the downside of reduced adoption didn't come up. Surely the user should be warned if images on the page could be spoofed, hence the noticeable warning for mixed content. Loading JS or CSS insecurely allows an active attacker to compromise the page entirely, so it is totally inappropriate for financial transactions and should fail by default. And since sending the referer[3] with insecure requests could leak sensitive data in the path[4], it should definitely be stripped in these cases.

But for most websites the benefit is indeed minor. Allowing mixed content CSS or JavaScript lets an active attacker[5] entirely compromise the webpage, but allowing mixed content images gives them little practical advantage over a passive attacker, while sending the Referer header gives them none. Meanwhile these browser behaviors offer relatively little extra protection against passive attacks, which, in an era of ubiquitous data collection, are far easier to perform and hence of far greater concern for most websites[6]. Mixed content warnings are almost always triggered by static images, which are rarely[7] sensitive. Eavesdropping on CSS and JavaScript requests may reveal the specific page a user is on within a website, but it does not reveal the page's user-specific content[8]. Thus the impact of allowing such mixed content is similar to that of passing the Referer header, which directly reveals which page the user is on.


There should be a way for websites to provide improved security without being burdened by these policies. This is not to say that browsers should never enforce them; as discussed above, they are entirely appropriate for financial transactions and other high-value targets for active attacks. It would not even be appropriate to show the usual HTTP icon on such pages when one of these policies is violated, because doing so would reduce the incentive to make sure such violations never occur. But I think a middle ground exists which would be radically better than the status quo.

There has been some progress along these lines. The W3C defines an unsafe-url referrer policy, which tells the browser to send the referer with plain HTTP requests, as if the page were loaded over HTTP itself. Unfortunately, referrer policies are not supported by IE, rendering them effectively useless for most pages.
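For browsers that do support the W3C spec, opting in amounts to a single response header. The raw-response construction below is only for illustration:

```python
# Building a raw HTTP response that opts into the W3C unsafe-url
# referrer policy; everything except the Referrer-Policy line is
# boilerplate.
headers = [
    ("Content-Type", "text/html"),
    ("Referrer-Policy", "unsafe-url"),  # send the full referer even over HTTP
]

status = "HTTP/1.1 200 OK\r\n"
response = status + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
print("Referrer-Policy: unsafe-url" in response)  # True
```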

I propose a broader, simpler solution: a No-Security-Claim header. If a page is sent with the No-Security-Claim header, it should be presented as if it were accessed over HTTP, even when accessed over HTTPS, and without regard to mixed content. Additionally, the browser should send the referer as if the page were accessed over HTTP. This would allow websites to provide the partial security discussed above at much lower cost.

Honestly, I am not hopeful that any solution would ever be ported to old browsers (i.e. IE), and so universal HTTPS adoption is probably at least a decade off. The best time for this proposal would have been 20 years ago. But I hold out some slim hope that implementing No-Security-Claim would be a simple matter of changing a few conditionals, and so could sneak into a security update. Until then, the battle continues.

  1. ^

    In law, "ex post facto" refers to a law that is applied to actions that predate the passage of the law. While such laws are prohibited in many countries (including the United States), they are not uncommon historically, and like any law, a law against ex post facto laws can be revoked.

  2. ^

    The vast majority of news sites are currently HTTP-only.

  3. ^

    See the CloudFlare blog for an explanation of JavaScript-based DDoS attacks, and Netresec's analysis of China's use of this and other attacks against GitHub.

  4. ^

    The attentive reader may note that "referer" is not spelled consistently throughout. This is intentional. When referring to the Referer header, it is spelled "referer", as this is how the header is spelled in the HTTP specification and in every major implementation. However, the W3C's referrer policy specification uses the standard English spelling "referrer", and so that spelling is used in reference to said specification.[9]

  5. ^

    Defense in depth aside, putting sensitive data in the path is a terrible idea. Paths are shared casually, often via URL shortening services, which make them brute-force enumerable.

  6. ^

    Internet traffic is vulnerable to two different kinds of attacks: passive attacks, which just read the data being sent over the network, and active attacks, which block traffic being sent over the network and substitute it with their own. Performing an active attack requires having full control over one of the computers on the route between the user and the server, while a passive attack requires only the ability to steal data from one of these computers, possibly after the fact, or the ability to surreptitiously tap into a fiber-optic cable.

  7. ^

    In fact, thanks to Edward Snowden, we know that the Five Eyes agencies are performing such attacks routinely and at a massive scale. And they are probably not alone.

  8. ^

    Sexually explicit images being an exception.

  9. ^

    This is assuming these are requests for static resources. Dynamic resources such as JSON should almost always be requested from servers under the control of the website's owner, and so if the page can be served over HTTPS, it should be easy to serve them over HTTPS too.

  10. ^

    The use of the word "refer" in discussing the spelling of "referer" is also intentional, just in case the facts of the matter did not make you sad enough.

Analyzing the Effects of Voting


I haven't seen a persuasive analysis of whether and how one should vote in a US presidential election[0]; every analysis I've seen either concludes that voting is pointless (at least outside of a swing state), or ignores the Electoral College completely. I intend to fill this gap.

Tipping the Election

Obviously one vote is statistically extremely unlikely to affect the outcome of the election. The naive analysis suggests that the probability your vote tips the election is the probability of a tie assuming you don't vote.

Some people would object to this analysis here, contending that a single voter never decides an election, because if an election is within a single vote then it ultimately gets decided by court battles over recounts. There is some truth to this, but I don't think it significantly affects the probability of your vote deciding the election. Almost certainly there is some negative margin against which your candidate could not possibly win a legal challenge, and some positive margin that would make their victory immune to one. If we let p_x denote the probability that your candidate will win given x votes, then the sum of p_x − p_{x−1} (which is the impact of your vote) over this margin is 1. If we further assume that the probability of any given x within the margin is constant[1], then the expected impact of your vote, should it fall in the margin, is 1/M, where M is the size of the margin. Since the probability of any given outcome within the margin is assumed to be constant, it is the same as the probability of a tie, so the probability of your vote falling in the margin is P(tie) × M. Thus the probability of your vote affecting the outcome is P(tie) × M × 1/M = P(tie), as in the naive analysis.
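The cancellation is exact, as a quick numeric check confirms (the margin size and per-outcome probability are made-up round numbers, chosen as powers of two so the floating-point comparison is exact):

```python
p_outcome = 2 ** -20  # assumed probability of each specific outcome in the margin
M = 1024              # assumed size of the contested margin, in votes

p_in_margin = p_outcome * M  # chance your vote lands inside the margin
impact = 1 / M               # expected impact of a vote that does

print(p_in_margin * impact == p_outcome)  # True: identical to the naive P(tie)
```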

The low influence of a single vote on the outcome of the election is counter-balanced by the large impact affecting its outcome would have. I'm inclined to believe that the two roughly cancel, so that a perhaps one-in-several-million chance of affecting the outcome is worth the effort of voting.

Effects in a Safe State

But I live in California, where my vote is almost 1,000 times less likely than the average vote to affect the outcome[2]. So then what are the effects of my vote? It nudges various statistics a little.

The Popular Vote

My vote increases my candidate's popular vote total. If they win, it slightly improves their electoral mandate. If they lose, it slightly erodes their opponent's mandate. It also increases the chance of my candidate winning the popular vote but losing the electoral college[3], and decreases the chance of the reverse.

Given George W. Bush's presidency, I think the impact of electoral mandates is relatively minor[4], and can mostly be ignored for this analysis. Similarly, we seem to have handled the electoral/popular vote mismatch without much consequence, although this upcoming election may be different given its ugly tone.

Increased Turnout Among My Demographic Groups

My vote also increases turnout among members of demographic groups I belong to (white, college-educated, 18-29 years old), probably increasing their political clout and furthering causes they support. Do I agree with them on the whole, or in particular on issues where their political clout is likely to tip the scales? Perhaps. At least the 18-29 year-old demographic is badly under-represented, and has serious long-term concerns which most (older) voters do not.

Increased Total Turnout

Any vote increases total voter turnout, which probably increases confidence in our Democracy. In my view, this is the strongest argument for voting.

Democracy is famously "the worst form of Government except all those other forms that have been tried"[5]. Voters are often short-sighted, poorly informed and easily misled, while politicians lie and pander to them. But while rule by strength is oft contested, and heirs have long raised armies to contest each others' claim to the throne, Democracy has given the United States peaceful transitions of power for 150 years. I believe this is because Democracy grants legitimacy to a leader in a way that is uniquely hard to overcome by martial force, based on the understanding that the results of an election reflect the will of the people. When more than half of adult citizens vote, this is a strong claim. But if this number slides to 40%, 30%? At some point the loser, or anyone else with a popular following, can credibly reject the President's authority.

This effect is inherently hard to measure. But the collapse of our Democracy would, I think, be at least 100x worse than the average difference between presidential candidates[6], likely resulting in a Third and Final World War given the US's dominant role in global politics. Thus even a weak relationship between voting and peaceful transitions of power would be on par with a one-in-a-million chance to determine the next President. So even if you live in a safe state, if you are eligible, please vote.

  1. ^

    I'm only considering the presidential election because it's the most universal. State and local elections are often more important, but those dynamics vary heavily between state and locality.

  2. ^

    This should be fairly accurate for small margins, say 0.5%.

  3. ^

    This number comes from FiveThirtyEight's election forecast, which computes the "voter power index" of a voter in each state, defined as the probability of both their state being decided by a single vote and their state tipping the electoral college. I use the "now-cast" version of the forecast, because it assumes the difference between the current polls and the results is entirely polling error, rather than allowing for some movement in the polls before the election. This more accurately reflects the knowledge I will have on or near election day, when I actually decide whether to vote. Note also that the voting power indices don't actually display values as low as 0.001; values below 0.1 are displayed as "<0.1". However the actual value can be inferred from the width of the bar used to illustrate it. At time of writing, Alaska has a voting power index of 4.3 and a bar with width 45%, while California has a bar with width 0.01156%, giving a voting power index of 0.0011.

  4. ^

    Of course the popular vote count is not precise, and in the event of a result within the margin of error it is unlikely an "official" winner would be designated since it makes no legal difference. So to be more precise, your vote increases the strength of your candidate's claim to having won the popular vote.

  5. ^

    Admittedly, it is difficult to tell how large the effect of 9/11 was on Bush's power.

  6. ^

    Winston Churchill, speech to the House of Commons, November 11, 1947. Appears in Winston S. Churchill: His Complete Speeches, 1897–1963, ed. Robert Rhodes James, vol. 7, p. 7566 (1974).

  7. ^

    Admittedly, given some of Trump's statements about disputing the election, jailing his opponent, admiring dictators, and bombing families of militants, this election may be an exception.

You Are Probably Misusing Opacity


I have noticed a trend among designers of using opacity/alpha to control the color of text and icons. This is even explicitly recommended by the Material Design specification, Google's official style guide:

Black or white text that is transparent remains legible and vibrant against background color changes. This makes it more flexible than grey text in the same contexts.

The spec gives the example of grey text vs semi-transparent black text on a magenta background. It explains:

Grey text (hex value of #727272) on a white background becomes hard to read if the background color changes to magenta.

This argument might be more convincing if it weren't written in grey[0] text, explicitly violating its own recommendation. The spec also dictates different colors and opacities for light vs dark backgrounds, so it seems they do not entirely buy the background-independence argument. Indeed there are two sides to this story, as the table below illustrates.

Semi-Transparent Black            | Grey
Is better for light backgrounds.  | Is better for dark backgrounds.

Based on this alone, I would consider the battle between color and opacity to be a wash. But using opacity comes with some excellent footguns, since there are two ways to set the opacity of text, neither of which does quite what you want. You can set the color property, but this requires overriding the base color as well as the alpha channel[1], so it's unsuitable for CSS classes meant to apply to diverse elements. Alternatively, you can set the opacity property, completely avoiding that problem. But opacity doesn't just apply to text: it applies to all content. If your content is just a block of raw text or a single icon this is fine, but if you try to style text or an icon within that block of text, the opacities stack instead of the inner opacity overriding the outer. And if any parent has its color set with an alpha channel, that alpha value stacks as well.

To illustrate the mess this can create, consider the following CSS applied to three nested divs:

.secondary-rgba {
    color: rgba(0, 0, 0, .54);
}
.secondary-opacity {
    opacity: .54;
}

This text has the "secondary-rgba" class, setting its color to black and .54 opaque.
This text has the "secondary-opacity" class, setting its opacity to .54. Because color is inherited and stacks with opacity, it is ~.29 opaque.
This text also has the "secondary-opacity" class. Because its parent's opacity and its inherited color stack with its own opacity, it is ~.16 opaque.

Yes, you really have to read that. At .54 opacity[2], the text is a bit of a strain to read, but not too bad. At .29 it's difficult, and at .16 it's downright frustrating.
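The numbers come straight from multiplication: the effective opacity is the product of every stacked layer.

```python
base = 0.54  # the Material Design "secondary text" value used above

print(round(base, 2))         # 0.54 -- outer div: one layer of alpha
print(round(base * base, 2))  # 0.29 -- inherited color alpha times own opacity
print(round(base ** 3, 2))    # 0.16 -- parent opacity stacks on top of both
```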

There's an even subtler way using opacity can introduce bugs: it triggers a new stacking context. This changes the order in which elements are composited, often causing the semi-opaque element to eclipse other elements, as happens to icons in Twitter's mobile site:

This even bit me when I switched to using a fixed header nav for this blog (on desktop), with the inner divs in the example above eclipsing the nav.

What should we do? My dream is that we can return to readable #000 (or even #333) text everywhere, avoiding all of these problems. Unfortunately, I don't expect designers to abandon faded text any time soon. But maybe we can convince them to at least use color rather than opacity or alpha in their specifications.

  1. ^

    Specifically, a hex value of #616161; I'm not sure where they get #727272, which is not the color of their example, and is almost completely illegible.

  2. ^

    For example, if you create a class .semi-transparent { color: rgba(0, 0, 0, .54); } and use it to override the property color: red on an element, then the resulting element will be semi-transparent black instead of semi-transparent red.

  3. ^

    .54 opacity is the Material Design standard for "secondary" text, whatever that means.

Against AI Risk


If you are not familiar with the Rationalist community or AI risk, this post will mean little to you.

I keep running into the Rationalist community of late, and almost every time I encounter the idea of "AI risk". Many[0] Rationalists, seemingly very intelligent and thoughtful people[1], believe the primary threat to Humanity is the possibility of a super-intelligent AI, one so far beyond humans that the pursuit of its own goals incidentally destroys us. I believe this fixation[2] is deeply misguided.

The arguments I've seen[3] can be boiled down to two points:

  1. The danger posed by an entity is a function of how powerful it is and how misaligned its objectives are with ideal human values.
  2. A sufficiently advanced AI would be so powerful that even the slightest misalignment with human values would result in human extinction, or very close.

I actually agree on both points, but disagree that they form a strong argument to worry about AI risk—at least as usually defined by Rationalists. The real AI risk isn't an all-powerful savant which misinterprets a command to "make everyone on Earth happy" and destroys the Earth. It's a military AI that correctly interprets a command to kill a particular group of people, so effectively that its masters start thinking about the next group, and the next. It's smart factories that create a vast chasm between a new, tiny Hyperclass and the destitute masses. I can't say how far off these are, but surely they are nearer than super-intelligent general AI.

Nor is AI a unique threat when viewed through this lens. Technology exists to make people and institutions more powerful. This is a good thing to the extent that the people and institutions in question are "good". But AI is hardly the only technology powerful enough to turn dangerous people into existential threats. We already have nuclear weapons, which, like almost everything else, are always getting cheaper to produce. Income inequality is already rising at a breathtaking pace. The internet has given birth to history's most powerful surveillance system and tools of propaganda.

My plea to Rationalists is to consider these problems first. Technology in general poses an existential risk on a much shorter time-scale than super-intelligent AI does. We as a species will need general solutions to this problem. We will need to prevent "radicals"—for increasingly tame definitions of the term—from acquiring ever-more-common technology. At the same time we will need protection from our protectors, whose power will only increase and become easier to abuse. Society will need radical transformation. I suspect that after this transformation, AI risk will be a radically different problem, if it still exists at all.

It may be tempting to argue that even if AI risk is not one of the most important problems to work on today, it still deserves more attention than it gets now[4]. On the surface this is a reasonable argument. Of course society has room for people to consider multiple problems, and it is even healthy to do so. But I would expect this argument to be least persuasive to Rationalists. For one, while it makes sense on a societal level, on an individual level (assuming, safely, that this post does not convince everyone) it's a classic failure to treat opportunity cost as true cost. For another, I suspect that most Rationalists have a conviction that their attention is far more valuable than average. This is not a criticism; personally I find it hard to go through life believing otherwise. But this conviction carries with it a duty not to squander your attention.

  1. ^

    The weasel-word "many" is intentional, as I have no concrete idea how many Rationalists worry about AI risk. Anecdotally it seems the majority do, but my experience is colored by selection bias (often I don't know someone is a Rationalist until AI risk comes up) and confirmation bias. I would love to see statistics on this.

  2. ^

    Talking with Rationalists often feels like looking into a mirror to me—but the reflection is inexplicably off, like I've stepped into the Twilight Zone. We have the same interests, the same problems, favor the same style of writing, gravitate towards the same media, and combined these lead to very similar worldviews. But then I catch a glimpse of something completely bizarre—polyamory, cuddle piles, or most often, AI risk. Part of my motivation for this post is an emotional desire to reconcile this unsettling reflection by correcting it, although I do not dismiss the possibility that I will be the one persuaded.

  3. ^

    I believe the sheer coolness of the idea of AI risk, combined with the (plausible) view that Rationalists are uniquely well-equipped to fight it, causes this fixation. Eliezer Yudkowsky and his Machine Intelligence Research Institute probably acted as a catalyst. Unfortunately a full examination of how AI risk came to consume the community is beyond the scope of this post.

  4. ^

    The best summary of these arguments is from Scott Alexander at SlateStarCodex, an excellent example of a smart person I agree with on almost everything, but whose concern with AI risk baffles me.

  5. ^

    In the same post, Scott Alexander notes that Floyd Mayweather was paid ten times more for a single boxing match than has ever been spent on studying AI risk. But this comparison is pointless—you have no influence over the money going to Floyd Mayweather to punch people (assuming you don't pay to watch boxing), while you have control over your own money and attention going to study AI risk.

Lessons from Milo


If you do not know who Milo Yiannopoulos is, stop reading now and hope never to learn.

Milo Yiannopoulos has fallen. As a self-identifying liberal, I hope he never returns. But before we liberals celebrate too much, we must remember that it was the Right that took him down[0]. Instead, we should take a moment to reflect on why America was subjected to Milo in the first place.

It was the Left that enabled Milo. His scheme was simple, yet shockingly effective:

  1. A conservative college group invites him to speak on campus.
  2. University administrators, who believe in, are publicly committed to, and in many cases are legally bound by[1] the ideal of Free Speech, allow him to speak and urge their students to follow suit.
  3. Liberals show up to protest en masse.
  4. Extreme elements associated with the Left commit violence.[2]
  5. Milo is blocked from speaking, freeing him from having to say anything interesting or worthwhile.
  6. The media covers the shit out of "riots", because it makes ratings.
  7. Milo gets quoted in media and featured on TV as a tie-in to these stories, getting a vastly larger platform than a lecture hall.
  8. The university is forced to publicly apologize to him and the group that invited him, who in turn feel vindicated.

In short, we get played. Hard.

Note step 4 in this process—the use of violence by elements of the Left to block Milo from speaking. This was most prominently on display when he came to Berkeley, where the resulting protests-turned-riots received extensive national news coverage. A lot of generally thoughtful, well-intentioned liberals I know defended this, on the grounds that Milo's speeches were sufficiently harmful[3] that mild violence[4] was acceptable. Free speech ideals aside, this defense is incoherent. Long-term, these reactions to Milo gave him his platform, and without them he would not be a threat in the first place. Short-term, blocking him from speaking at a lecture hall in no way prevents him from saying whatever he wants; in the best case his supporters will watch him on Facebook and YouTube, but in the worst—as at Berkeley—he will be interviewed on cable and garner 1000x more viewers. If these tactics appear successful, because he does not say whatever you were trying to prevent him from saying, then either he wasn't going to in the first place, or he is choosing not to in order to incentivize violence and you are playing directly into his hands.

If violent protests didn't stop Milo from doing harm, what did they do? We have already seen that they are a key part of his scheme, both to extract political capital from (liberal) universities and to generate publicity. For Milo, this meant attention, an editorship at Breitbart, speaking fees, and a $250,000 book deal. I doubt he cared about anything else. But the collateral damage was great. Most news viewers didn't see a triumph of tolerance at Berkeley, or even a defeat of free speech. They saw men in black masks beating a man on the ground. They heard Milo's dark warnings about these "intolerant liberals" who would be coming for them next. And finally they understood that they need President Trump to protect them.

Are these viewers wrong? Probably. Milo was intentionally drawing the ire of the Left, and perhaps uniquely suited for it. But can they really be blamed for overlooking this fact, given the Left did as well? Perhaps they had nothing to fear as long as they were less offensive than Milo. And yet:

Given the above, it seems rational for them to view liberals as a threat, if only because they have an incomplete picture. And even if it weren't, irrational fears matter just as much in a democracy.

The Left needs to stop getting played. It's not just headache-inducing; it's deeply harmful to our political goals. As a meta-point, we need to stop lashing out at our allies when they tell us we're being played, or when they warn us that our enemy's "enemy" is not our friend[5]. We need to examine why it is that we seem systematically vulnerable to this kind of bait, and what other ways we are sabotaging ourselves. We could accomplish such great things! Instead, we got Donald Trump elected.

  1. ^

    The video that brought Milo down was dug up and spread around by the conservative twitter account @ReaganBattalion, in response to Milo's invitation to CPAC. The Washington Post published a timeline of the fallout from this and subsequent videos, which came almost entirely from conservative groups.

  2. ^

    Public universities cannot impose disparate requirements on invited speakers on the basis of the content of their speech, except within very narrow limits recognized in 1st Amendment jurisprudence.

  3. ^

    According to Berkeley's official statement, the violence was almost entirely the work of a group that marched in from off-campus—the so-called "Black Bloc". It is difficult to confirm this account, but it seems plausible to me. Unfortunately, it's all but irrelevant in the eyes of the broader world.

  4. ^

    The example I have usually heard cited is that he outed a transgender student by name and made some remarks about them that could be interpreted as condoning violence against them. Whether being outed as transgender on the Berkeley campus puts one at risk of serious harm is debatable, but it is certainly wrong.

  5. ^

    Berkeley wasn't that violent, but I—and perhaps a million others—saw:

    • A man beaten on the ground with sticks.
    • A woman hit in the face with a stick.
    • An event light, a tree, and numerous trash cans set on fire.
    • A car's windshield smashed in—which then drove off with a rioter on the hood.
    • A driver pepper-sprayed—this one because some rioters had confused said car with the aforementioned one, despite them looking nothing alike.
  6. ^

    The author of these tweets, Zeynep Tufekci, is an academic sociologist who sometimes writes for the New York Times. I highly recommend following her on twitter for insightful perspectives on modern social phenomena and how social media drives them, which she should be expanding upon in her forthcoming book. She also has her own analysis of Milo's shtick in tweetstorm form.

The Cult of Averages


Society is obsessed with statistics, and even more so with boiling statistics down to a single number—generally the arithmetic mean. We want to believe our judgements are based on vast amounts of data, but shy away from the complexity of real analysis, and so we turn to the mean as our savior. Most visibly, we cite differences in means between demographic groups, or between time periods, to justify our political narratives. Yet we never question what these means mean[0][1].

Ill-Founded Means

The mean of a data set is its sum divided by its size. There is no self-evident interpretation of this number for arbitrary kinds of data, although people seem to assume otherwise. Next time a mean of a data set comes up, ask yourself whether it makes sense to talk about the sum of that data set. If not, how does dividing it by the number of data points make any more sense?

Many commonly-cited means lack any concrete interpretation. Consider mean GPAs, which are in fact means of means (and hence doubly silly). I'm sure you've read more than one piece citing differences in mean high-school GPAs between demographic groups—men and women, rich and poor, black and white[2]—as evidence of various structural failures. Ignore their conclusions for a minute[3], and try to explain how to interpret the mean GPA, without simply repeating the definition. "On average, how well students are doing" is a common attempt, but completely meaningless. "How well the average student is doing" is slightly better, but there is no such thing as a mean student; that would be like having a mean parent. In what sense is a C "half-way" between an A and an F, much less a C-average student "half-way" between a straight-A student and a straight-F student? What if someone re-numbered the GPA system so that a B was worth 2.5 points instead of 3? This would vastly change the mean GPA, but it has no real significance to anything we might want to measure. We're trying to use math to derive meaning from meaningless numbers. Garbage in, garbage out.
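The re-numbering point is easy to check directly. A minimal sketch in Python (all grades invented): changing only the point value of a B flips which group has the higher mean GPA, even though no student's grades change and the new scale still orders grades the same way.

```python
# A toy illustration with invented grades: re-numbering the GPA scale
# flips which group has the higher mean, even though every individual
# grade and every individual ranking stays exactly the same.
from statistics import mean

group_x = ["B", "B", "B", "A"]   # mostly-B students
group_y = ["A", "A", "C", "C"]   # polarized students

standard = {"A": 4.0, "B": 3.0, "C": 2.0}
renumbered = {"A": 4.0, "B": 2.5, "C": 2.0}  # only the B changes

def mean_gpa(grades, scale):
    return mean(scale[g] for g in grades)

print(mean_gpa(group_x, standard), mean_gpa(group_y, standard))      # 3.25 vs 3.0
print(mean_gpa(group_x, renumbered), mean_gpa(group_y, renumbered))  # 2.875 vs 3.0
```

Under the standard scale, group X looks better; under the renumbered one, group Y does. Any conclusion that depends on the arbitrary choice of scale is not a conclusion about the students.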

Another good example is IQ. A bedrock of modern threads of racism is the assertion that certain races have higher average IQs. But this is a very strange number to pay attention to if you know the definition of IQ: for a given test (with numeric scores presumed to correlate with intelligence), the IQ defined by that test is the unique increasing function of the score such that it is normally distributed across the population with mean 100 and standard deviation 15. Whether that makes any sense to you or not, it should be clear that a person's IQ is the result of a complex mathematical operation. It is difficult to interpret as it is. But how do you interpret the sum of two results of this complex operation? Divided by two? Can you expect that any conclusions you can draw from IQ apply to means of IQs? Of course not.
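For concreteness, here is that definition in miniature, sketched with invented raw scores and Python's standard library: each score is mapped to its percentile rank in the sample, then through the inverse CDF of a Normal(100, 15) distribution. The transform preserves rankings, but because it is highly nonlinear, the mean of two IQs bears no fixed relationship to the underlying scores.

```python
# A sketch of the definition: IQ is a rank-based transform of raw test
# scores onto a Normal(100, 15) curve. The scores here are synthetic.
from statistics import NormalDist

scores = [12, 15, 15, 18, 22, 25, 31, 40, 55, 90]  # skewed raw scores
n = len(scores)
iq_curve = NormalDist(mu=100, sigma=15)

def iq(score):
    # fraction of the sample scoring strictly below, plus half of any
    # ties (a midrank), mapped through the inverse normal CDF
    below = sum(s < score for s in scores)
    ties = sum(s == score for s in scores)
    p = (below + 0.5 * ties) / n
    return iq_curve.inv_cdf(p)

# The transform is monotone, so rankings survive...
assert iq(90) > iq(55) > iq(40)
# ...but it is highly nonlinear: the mean of two IQs is not the IQ of
# the mean score.
print((iq(12) + iq(90)) / 2, iq((12 + 90) / 2))
```

The average of the lowest and highest scorers' IQs is 100 by symmetry of the ranks, while the IQ of their average score is well above 110: averaging happens on two different scales, and they disagree.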

Abusing Means (and Medians)

Some means have a reasonable interpretation, but are used to draw broad conclusions they cannot actually support. Incomes are a good example. Mean incomes are a sensible concept, as two people can in fact pool their incomes and split the result. Of course, they generally don't do so, thus mean income is only a narrowly useful concept. For example, it is useful for evaluating how much a government would have to spend per citizen if it instituted a flat-rate income tax. Similarly, it is useful for estimating the total size of an economy. But outside these rare cases, the mean usually tells you little. Income taxes usually aren't flat-rate. Someone twice as rich doesn't, on average[4], buy twice as much rice; but on average they buy more than twice as many sports cars[5].

Comparisons again get us in trouble, since the mean is rarely what we actually want to compare. If the mean income of one group is lower than that of another, what does that mean exactly? Is there some sense in which the people in one group are uniformly paid less? Perhaps; this is certainly the narrative that is used when talking about disadvantaged groups in the United States—that they are reliably paid less for the same work. Differences in mean income are often used to support this intuition. But are the people of South Korea (per capita GDP, PPP: $37,900) uniformly paid so much less than the people of Qatar (per capita GDP, PPP: $129,700)? Suddenly our intuition disagrees with what we thought the mean was telling us.

The basic assumption in these kinds of comparisons is that the distributions being compared are similar in shape, and are just shifted up or down. Sometimes this is true, but clearly sometimes it is not. Note that in the preceding examples I haven't made any claims about the shapes of these distributions—I don't know what they look like, and chances are neither do you. Nobody seems to be asking, and if they are, the data isn't being widely published.

Things get worse when we start comparing across time, because even if we think we know the distribution of a data set, this gives us no reason to think we know the distribution of the changes of that data set over time. Even the mean's sophisticated cousin—the median—often misleads us here[6]. Real median income in the United States has been rising since the end of the Great Depression, with a few brief interruptions. But what story does this tell? The story we want to hear is that across shorter time spans, individual people are making more money than they used to; and across longer time spans, children make more money than their parents did at their age, or at least over their lives. In this case, we know the reality is different. If we instead group workers by date of birth, we see that median lifetime wages have been falling for decades. If we further segment by sex, we see that men's lifetime wages have been falling since the cohort born in 1942(!), with falling inequality in women's wages making up the difference for the median across sexes until women's wages began stagnating in the 1980's[7]. So how can median incomes still be rising, albeit slowly? We're getting older, and old people make more money[8].
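The aging effect is easy to reproduce in a toy model (all wage and cohort numbers invented): every cohort earns less at a given age than its predecessor did at that age, yet the overall median rises, purely because the population shifts toward older, better-paid workers.

```python
# A toy illustration with invented numbers: the overall median wage
# rises between two periods even though every cohort earns less at a
# given age than the cohort before it did.
from statistics import median

# period 1: many young workers (earning 40), fewer old ones (earning 60)
period1 = [40] * 6 + [60] * 4
# period 2: the young cohort has aged, now earning 55 (less than the 60
# its predecessors earned at the same age), and the new young cohort is
# smaller and earns 35 (less than the 40 its predecessors earned)
period2 = [35] * 2 + [55] * 6

print(median(period1), median(period2))  # 40 -> 55
```

The median "rose" from 40 to 55 while every worker's cohort is doing worse than the one before it: the summary statistic and the lived story point in opposite directions.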

Nothing is Normal

This is a special case of a more general phenomenon. When people think about differences in means or medians, they often have a mental picture of a pair of bell curves, centered around different points, and base their conclusions on that picture. Implicitly, they are assuming that the distributions in question are so-called normal distributions.

Despite the unfortunate name, most distributions are not normal. Normal distributions are strikingly common by a statistician's standards, which is to say they resemble a non-trivial minority of the distributions one comes across. Specifically, any distribution that arises as the sum of a large number of independent, similarly distributed variables will be approximately normal. Every one of these caveats is important. GPAs are a sum[9], but not of independent variables—of course students who do well in one class tend to do well in another. Individual income is generally not a sum of more than one source of income, and very rarely more than a few.
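The independence caveat can be demonstrated numerically. In this sketch, a sum of many independent uniform draws puts about 68% of its samples within one standard deviation of its mean, as a normal distribution would, while a sum of perfectly correlated "copies" of a single draw remains uniform-shaped (about 58%), no matter how many terms are summed.

```python
# Sums of independent variables drift toward normal; sums of dependent
# variables need not. Here the "correlated" sum is just one uniform
# draw scaled up, so its shape never changes.
import random
from statistics import mean, pstdev

random.seed(0)
N, TERMS = 20_000, 30

independent = [sum(random.random() for _ in range(TERMS)) for _ in range(N)]
correlated = [TERMS * random.random() for _ in range(N)]  # one shared draw

def frac_within_1sd(xs):
    m, s = mean(xs), pstdev(xs)
    return sum(abs(x - m) <= s for x in xs) / len(xs)

print(frac_within_1sd(independent))  # ~0.68, the normal-distribution value
print(frac_within_1sd(correlated))   # ~0.58, the uniform-distribution value
```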

Among the hotly discussed statistics above, only IQ is normally distributed—since it is defined as such. But this is misleading. As previously argued, nobody cares about IQ qua IQ; they care about it as a proxy for something else they can't measure. But what are the chances that something else is normally distributed? Low. Furthermore, while IQ is defined to be normally distributed, measurements of it rely on transforming test scores into IQs based on a finite sample of scores. At either end of the score distribution, sample size falls and sample error grows. So someone whose score on one IQ test translates to an IQ of 160 (4 standard deviations above the median—by definition) will likely have a very different score on another test, and it is unlikely that only 0.003% of people would score higher.

So even when something appears to be normally distributed, we should only trust that within a couple standard deviations. And yet everyone wants to hear about outliers. Here in Silicon Valley, the vogue claim among male chauvinists is that, while perhaps on average women are as smart as men, men have higher variance in IQ, and therefore at the extreme high end of the IQ distribution (e.g., them and their friends), almost everyone is male (e.g., them and their friends)[10]. While it is true that the ratio of a higher-variance normal distribution to a lower-variance one approaches infinity at the ends of the distribution, this is not true of approximately normal distributions, so in light of the previous paragraph this claim amounts to statistical bunk.
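To be fair to the claim, the mathematics does hold for exact normals, as a quick computation with Python's standard library shows: the density ratio of a slightly higher-variance normal to a lower-variance one grows without bound in the tails. The objection above is that real distributions are only approximately normal out to a couple of standard deviations, where this ratio is still modest.

```python
# For exact normal distributions, the density ratio of a ~10% more
# variable curve to a less variable one rises steadily in the tails;
# at the center the higher-variance curve is actually *less* dense.
from statistics import NormalDist

lo = NormalDist(mu=0, sigma=1.0)
hi = NormalDist(mu=0, sigma=1.1)   # ~10% higher standard deviation

for x in [0, 1, 2, 3, 4, 5]:
    print(x, hi.pdf(x) / lo.pdf(x))
```

The ratio starts below 1 at the center and climbs past 5 by five standard deviations; the extreme-tail argument only bites far beyond the region where "approximately normal" can be trusted.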

Doing it Right

The median is a simple and obvious improvement over the mean for most things. But why constrain ourselves to a single number? We live in an age of graphical media—why not show the distribution? If the conclusion is not obvious from the distribution, then either the effect is very small, or the statistical analysis required to reach it is complex and easy for non-experts to get wrong. In the first case it should only be of interest to domain experts, while in the second it can only be evaluated by them. Either way, this kind of statistical analysis is best left to peer-reviewed journal articles.

Of course, the reason we do things wrong isn't because doing things right is hard. It's because doing things wrong gives media companies plausible deniability to write "news" stories with shocking and yet clear-cut conclusions supporting a popular ideological position, ideal for social-media driven advertising.

It is difficult to get a man to understand something, when his salary depends upon his not understanding it! — Upton Sinclair[11]

  1. ^

    In short: means are only meaningful inside vector spaces, and functions of means are only meaningful when the function is linear.

  2. ^

    I'm punny, sue me.

  3. ^

    Racial classifications are also ill-defined, and becoming increasingly divorced from reality as intermarriage increases. The computer scientist Les Earnest has written on this from a CS point of view.

  4. ^

    They're probably right that there are structural failures in our school systems that substantially hinder the educations of various groups. This is certainly what my intuition tells me. But just because a given interpretation of the data supports your intuition doesn't mean that interpretation is right. Have courage—cite your intuition instead of deeply flawed data analysis!

  5. ^

    Hopefully you're asking what average—mean or median? Here I mean mean.

  6. ^

    Formally, both of these are nonlinear functions of income.

  7. ^

    In fact, while the mean of the individual changes over time at least equals the change in the means, the median is non-linear, so for medians we don't even have that guarantee.

  8. ^

    This Washington Post article is unusual in that it exploits graphics to clearly explain these trends and avoids major statistical pitfalls.

  9. ^

    I suspect that the underlying data in this study also takes into account falling labor force participation[12] in a way that most median wage statistics ignore, namely by counting workers who stop working; unfortunately I do not have the underlying data to back this up.

  10. ^

    At least if you fix the number of classes.

  11. ^

    Not just in Silicon Valley, mind you. Social scientists as prominent as Steven Pinker have made similar claims appealing to normal distributions of aptitude.

  12. ^

    I, Candidate for Governor: And How I Got Licked (1935), ISBN 0-520-08198-6; repr. University of California Press, 1994, p. 109.

  13. ^

    Labor force participation is the fraction of all adults (16+) who are in the labor force, i.e. who count towards the "official" U-3 unemployment rate. Participation has been buoyed by women leaving the home, but despite this the participation rate has fallen 4 percentage points in the last 20 years (BLS, note that you have to change the time range to see 20 years). Part of this is the US's aging population, which wouldn't affect the study of lifetime earnings, but a large part is that 5.3 million more working-age adults have been added to disability rolls over the same time (Washington Post article on rising disability, another rare example of good statistical analysis in media, which avoids many pitfalls by looking at changes county-by-county).

Please Lie


You ought to tell more white lies.

This may seem like strange moral advice, but we live in strange times. Times of ubiquitous surveillance, where your purchases, your web browsing habits, your social media use, the TV shows and movies you watch, and a hundred other data streams are fed into hidden, unaccountable computer models to determine whether you will get a job offer, a loan, or an insurance policy, and how hard companies will be able to squeeze you when they sell to you[0].

It is relatively obvious that such surveillance may change what's in your best interests, although part of the problem with this system is that it's so hard to tell what is and isn't in your interests anymore. But it may not be obvious—to someone unfamiliar with the computer models—how it affects your moral duties, and in particular how it imposes a duty to lie.

A brief discussion of these models is necessary. You have probably heard the terms artificial intelligence and machine learning. These are broad terms for a large class of technologies, which mostly fall under two umbrellas—statistical models and neural networks. These technologies differ in important ways, but they share a common operating principle: they find correlations in large data sets and use them to develop heuristics.

This is similar to much of human learning. When applied to large groups of people, we call the resulting heuristics stereotypes, and we are often uncomfortable with their moral implications. The primary difference between the computer's heuristics and our stereotypes is the depth and breadth of information available to it. Instead of having a stereotype that young black men are likely to commit crimes, for example, it may decide that young black men who have lived in certain zip codes and shop at Walmart are unusually likely to commit crimes, while those who have lived in other zip codes and shop at vintage record stores are less likely to. Often, machine learning models will have tens of thousands of such heuristics, and this precision makes them powerful. They really will zero in on which groups are likely to commit a crime, or pay back a loan, or get injured, or are willing to pay the most for a given product.

Because these models are based on correlations, your behavior impacts how the model views everyone. When the model learns something new about you—say you make a purchase, or an insurance claim, or you just go another year without getting arrested—the model will take everything it knows about you—your age, gender, race, zip code, credit history, political views, hobbies, favorite TV shows, who you're friends with on social media, etcetera—and update its predictions about everyone else who has any of these characteristics in common with you. Perhaps you buy overpriced headphones, and the model predicts it can overcharge your Facebook friends (and their friends) who watch the same TV shows as you for audio equipment. Perhaps you are healthy and shop at Trader Joe's, and the model slightly decreases health insurance premiums for other Trader Joe's shoppers while slightly increasing them for everyone else. Perhaps you become a regular at a cafe, and the model concludes that people at that cafe are likely to vote Democrat and are fruitful targets for Republican turnout-suppression advertisements.

These examples are clearly cherry-picked, but they fall into broad trends brought on by such models:

They make wealthy companies wealthier. Wealthy companies are able to gather more data and develop better models than smaller competitors, and can use these to more effectively advertise to customers, and to do what economists call price discrimination—charging different people different prices because some are willing to pay more than others. Amazon is particularly notorious for this, and while to my knowledge they have backed off on literally charging different prices to different consumers at the same time and in the same neighborhood, they adjust prices so frequently that the effect is the same. The consequence of this, according to classical economic theory, is that while some consumers will pay less than they would previously, as the company's price discrimination improves, all surplus value will eventually be captured by the company. Classical economics also says that price discrimination should be impossible without barriers to entry—which we have already seen these models provide.

They reinforce the powerful. Besides making the wealthy wealthier, they allow political campaigns to parlay advertising dollars into votes more effectively. Outside democracies they are far more powerful, providing tools to detect dissidents and to target disinformation campaigns[1].

They entrench poverty. Companies charge poor people more for things, because they can't afford to delay their purchases, or because they're less savvy consumers. Recall my opening examples of how these models are used for employment, loans, and insurance policies, and combine that with the fact that people in historically poor groups are statistically riskier to employ, loan to or insure, so will be punished by these models. More directly concerning is that these models are sometimes used by the US criminal justice system in pre-trial release hearings and even sentencing, and they heavily penalize black people[2].

More abstractly, I expect they will have a chilling effect on society. Eventually people will realize that they're being penalized for many of their day-to-day actions, but they still won't know what they're being penalized for. Uncertainty breeds hesitance. I expect people will become reluctant to talk online, reluctant to shop, reluctant to subscribe to newspapers and magazines. Or they'll become duplicitous, trying to shape their online lives and public habits in a way they think favors them. Being surrounded by either, or both, will be pretty intolerable.

Abstaining from training these models is very, very hard. You can't shop online or use a credit card, you can't use social media, can't sign up for anything requiring a Facebook login[3], can't use store discount cards, can't subscribe to anything, can't use map-based services on your phone, etcetera. You'll be at a significant disadvantage in life and most people will regard you as a social pariah. And even if you do abstain, you'll be alone, and it will make very little difference.

Fortunately there is a much easier and more effective way: lie. When a store asks for your phone number or your zip code, lie. Lie on your Facebook profile, or use a second Facebook profile when browsing and signing up for things[4]. When using a ride-sharing service, put in the address next door to your destination. If you have the technical skill to do so, spoof your location on your phone.

This is far better than abstention. As models become more precise in their predictions, they are relying on smaller and smaller signals in the underlying data sets. A random lie will tend to be a strong signal in a different direction than any real trend, exerting disproportionate influence on the model. Instead of merely negating your own influence on the model, you're counteracting other people's too.

  1. ^

    Incidentally, almost everyone I've talked to about this either does not believe it is happening (often likening it to a conspiracy theory), while conceding that it would be horrifying if it were, or is well aware of it and untroubled by it. The latter is much more common in Silicon Valley; after all, they can hardly deny the explicit purpose of so many of the tools and businesses they have built.

  2. ^

    Disinformation campaigns are particularly useful for those in a position of power, because confusion and disorganization undermine any attempt to change the status quo. For much more on this and other authoritarian uses of technology, I recommend the work of Zeynep Tufekci.

  3. ^

    The linked article claims that a particular model is "biased against blacks", which is true in a loose, colloquial sense of the word, but is not true in the sense that statisticians use the word. What the article really shows is that there is disparate impact—black people suffer much more from use of this model than white people do, even those who will not jump bail or re-offend. The paper on which the article is based shows that the model has similar precision and recall for black and white people, meaning it is not biased in the statistical sense. But it also shows that the false positive rate is much higher for black people. This is a natural statistical consequence of the fact that the model is imprecise and that black people jump bail or re-offend at much higher rates than white people. But it's still noteworthy from a public policy point of view. It would be silly to dismiss the article because it doesn't show statistical bias—after all, a model that only used race as input and rated black people twice as risky as white people would not be biased in the statistical sense but most people would agree that it is racist.

  4. ^

    For example, most dating apps require a Facebook login. This may seem pretty trivial unless you are a single 20-something in an area with an unfavorable gender ratio, in which case it will seem anything but.

  5. ^

    This is against Facebook's terms of service, but they can go fuck themselves.

What Mathematics has Lost


Maryam Mirzakhani passed away recently. She was a first-rate mathematician, and the greatest living researcher in my former field of polygonal billiards. I fear that with her, we have lost our best chance of resolving the questions that once possessed me. But I have realized that we have actually lost much more; mathematics has not just lost her future work, but much of her past as well.

Mathematics is, at its heart, not a collection of theorems. It is about human understanding. It is often said among mathematicians that proofs are more important than theorems, and that definitions are more important than proofs. Mathematics is about understanding rather than knowledge, about finding frameworks and developing intuitions for logical structures. The first and still greatest insight into polygonal billiards was the unfolding: the observation that, instead of visualizing a billiard ball bouncing off the edge of a polygon, you can visualize the polygon reflecting across that edge while the ball continues in a straight line (I have built a demo of this here). From a coordinate geometry perspective, this does not change the calculations at all—a computer plotting the trajectory has to do just as much work as before. Yet all important theorems of polygonal billiards, including Mirzakhani's, flow from this observation.
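The unfolding can even be checked numerically. Here is a small sketch of the one-bounce case on the unit square (the starting point, direction, and sample times are invented for illustration; the linked demo is the real thing): reflecting the trajectory at the wall x = 1 gives exactly the same points as letting the ball run straight and then folding the plane back across that wall.

```typescript
// Hypothetical one-bounce check of the unfolding on the unit square.

type Vec = [number, number];

const start: Vec = [0.2, 0.5];   // ball's starting position
const dir: Vec = [1.0, 0.25];    // heading right and slightly up

// The ball hits the right wall x = 1 at this time.
const tHit = (1 - start[0]) / dir[0];

// Real billiard trajectory: the x-velocity flips sign at the wall.
function bounced(t: number): Vec {
  if (t <= tHit) return [start[0] + t * dir[0], start[1] + t * dir[1]];
  const hitY = start[1] + tHit * dir[1];
  return [1 - (t - tHit) * dir[0], hitY + (t - tHit) * dir[1]];
}

// Unfolded trajectory: reflect the square instead, so the ball goes straight.
function unfolded(t: number): Vec {
  return [start[0] + t * dir[0], start[1] + t * dir[1]];
}

// Folding the straight line back across the wall (x -> 2 - x) recovers the
// bounced trajectory exactly; the two pictures describe the same dynamics.
for (const t of [1.0, 1.25, 1.5]) {
  const [xu, yu] = unfolded(t);
  const [xb, yb] = bounced(t);
  console.log(Math.abs(2 - xu - xb) < 1e-9 && Math.abs(yu - yb) < 1e-9);
}
```

As the paragraph above says, the computer does just as much arithmetic either way; the gain is entirely in what a human can see.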

While we treasure Mirzakhani's theorems, and more so her proofs, the true value of her work lay in her unique intuitions that made them possible. Yet these are almost entirely absent from her papers. We can guess at them from the definitions she introduces, from the nature and structure of her proofs. But the originals were never truly laid out on paper, and only incompletely shared with her collaborators. To an extent, they died with her.

I do not mean to single out Mirzakhani as a bad writer. She was one of the better writers in our field, and we have only lost so much understanding with her because there was so much to lose. But it is not our custom to write down our intuitions. I have heard the process of writing a math paper described as "doing everything possible to disguise the fact that it was written by a human being". We have been trained to write mathematics in such a way that readers (within our specialized sub-field) may (if with difficulty) verify that our claims are correct, but never understand how or why we arrived at them. They are only truly comprehensible if you already share much of the underlying intuition of the author. To really read them, you must be able to build up the same intuitions the author did, using their writing only to check your work as you go. This has value, but not nearly as much as presenting the intuitions directly.

It does not have to be this way. These intuitions are developed through conference talks, seminars, and informal discussions. It is largely a collaborative process, and one carried out through words. Words like "squiggly-ness" and "kinda looks like", that embarrass us in formal writing, that we are loath to put to paper lest they undermine our illusion that our mathematical process is as clean and logical as the product. But this is not the main way our pride gets in the way of explaining our underlying intuitions. In the words of the great geometer Mikhael Gromov[0]:

This common and unfortunate fact of the lack of an adequate presentation of basic ideas and motivations of almost any mathematical theory is, probably, due to the binary nature of mathematical perception: either you have no inkling of an idea or, once you have understood it, this very idea appears so embarrassingly obvious that you feel reluctant to say it aloud; moreover, once your mind switches from the state of darkness to the light, all memory of the dark state is erased and it becomes impossible to conceive the existence of another mind for which the idea appears nonobvious.

I think he is too generous in saying that mathematicians lose the memory of their ignorance—rather I think we willfully pretend it never existed. As writers, we need only get over ourselves and admit in our writing that we are human. This burden falls on readers as well—we must not sneer at writers who deign to be understood.

There is one significant hurdle, which is that outside of collaboration, mathematics relies on the strange convergence of individual mathematicians' intuitions after staring at a problem for long enough. This is what my well-meaning teachers meant when they told me that there was no substitute for hard work, that there was no book that could explain things to me, that even the best books could only show me what it was I had to learn. And while I have gained much intuition through collaboration, they were right that some of it I was only able to find through solitary thought. While I'm confident that other mathematicians converge on these intuitions—otherwise they could not write the proofs they do—I have no idea how to share them.

Why do we develop convergent intuitions, and how can they be shared? This is surely the most important question in mathematics, and yet it receives no formal attention from practicing mathematicians. Instead it is relegated to philosophers of mathematics. Is this really a satisfying state of affairs? If philosophy has made progress here, it has not reached the practice of mathematics. Today, the real substance of mathematics remains locked away in a few mortal minds.

  1. ^

    Berger, Marcel. "Encounter with a geometer, Part II." (2000).

A Front-End Debugging Adventure


Like many tricky bugs, this one begins with a ticket from QA containing an unholy incantation which, when uttered, will summon a demon to lay waste to the app. In this case: "Go to this page in Firefox, double-click on X, then on Y, then on Z, then click on W."

First I must reduce this incantation to something sane. It turns out that X and Y are superfluous; double-clicking on Z will (sometimes, and only in Firefox) throw a silent exception, leaving the app in an inconsistent state, which is only apparent to the user upon clicking on W. The exception is useless—the stack trace, like any stack trace for JS code generated from Dart, ends in Array.prototype.slice.apply(arguments)—but this is nearly a smoking gun: Z has a double-click listener on it, which must be triggering the exception. Commenting out the meat of the double-click listener[0] does not fix the problem, though. Neither does removing it entirely.

The only other cause I can think of is the mousedown listener. Double-clicking more slowly reveals that the exception is actually thrown before the mouse button is released the second time. So this bug is disguised—it's actually a bug in the mousedown listener, but one only triggered the second time, only if the second click comes fast enough, and only in Firefox.

Clearly we need to dig into the meat of the mousedown listener. Nothing in it looks browser-dependent—no fancy APIs are being called, no prefixed styles are applied. It could be caused by one of Firefox's nearly endless supply of open bugs, but having dealt with a number of those before, my intuition suggests otherwise. The only other plausible difference between browsers I can think of is timing—Firefox's layout algorithms are much slower than Chrome's—which could trigger a race condition that went unnoticed in Chrome.

Some background: This code is part of a fancy table component, and allows users to click and drag the border between columns to resize them. It handles this by installing a mousedown listener on a thin div along the border, which on mousedown installs mousemove and mouseup listeners, and adds a line to the DOM that moves with the user's cursor (because fancy). The mouseup listener then removes the line. For performance reasons, these DOM modifications are performed asynchronously. It is here that I expect a race condition.

Like stack traces, breakpoints in the compiled JS are useless, so I fall back to classic printf debugging[1]. Comparing the logs produced in Chrome with Firefox, I see the following ordering (with numerous ultimately irrelevant steps removed of course):

Chrome          Firefox
mousedown       mousedown
line added      line added
mouseup         mouseup
line removed    mousedown
mousedown       Exception
line added

Clearly the second mousedown event is somehow interfering with the mouseup handler removing the line. But a single click, no matter how quick, does not trigger the exception—what is special about the second mousedown?

Digging into the mousedown and mouseup handlers, the normal order of events is:

  1. The mousedown handler creates a div, storing it as a private class member _lineElem, and schedules it to be added to the DOM.
  2. Once _lineElem is added to the DOM, the mouseup handler is installed.
  3. When the mouseup handler is triggered, it removes _lineElem from the DOM.

Given that we observed the second mousedown event before the line was removed, the buggy order of events must be:

  1. The mousedown handler creates a div, storing it as a private class member _lineElem, and schedules it to be added to the DOM.
  2. Once _lineElem is added to the DOM, the mouseup handler is installed.
  3. The mousedown handler creates a new div, overwriting _lineElem, and schedules it to be added to the DOM.
  4. The mouseup handler is triggered, and tries to remove _lineElem from the DOM.

But the line currently attached to the DOM is no longer referenced by _lineElem—that points to a new element that isn't yet in the DOM. When the mouseup handler tries to remove it, an exception is thrown and Angular cannot complete the digest cycle.

The fix is simple enough—only set _lineElem once the element has been added to the DOM.
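The race and the fix can both be reproduced in miniature. In the sketch below, the DOM is reduced to an array, the browser's asynchronous DOM work is reduced to microtasks, and every name except _lineElem is invented; it replays the Firefox timing with and without the fix.

```typescript
// Minimal model of the column-resize race (illustrative, not the real code).

type Elem = { id: number };

class ColumnResizer {
  private _lineElem: Elem | null = null;
  private dom: Elem[] = [];
  private nextId = 0;
  log: string[] = [];

  // If `fixed` is true, _lineElem is only set once the element is in the DOM.
  constructor(private fixed: boolean) {}

  onMousedown(): Promise<void> {
    this.log.push("mousedown");
    const line: Elem = { id: this.nextId++ };
    if (!this.fixed) this._lineElem = line; // BUG: set before insertion
    return Promise.resolve().then(() => {
      this.dom.push(line);
      this.log.push("line added");
      if (this.fixed) this._lineElem = line; // FIX: set after insertion
    });
  }

  onMouseup(): Promise<void> {
    this.log.push("mouseup");
    return Promise.resolve().then(() => {
      const line = this._lineElem; // may have been overwritten by now
      const i = line === null ? -1 : this.dom.indexOf(line);
      if (i < 0) {
        this.log.push("Exception"); // removing a detached element throws
        return;
      }
      this.dom.splice(i, 1);
      this.log.push("line removed");
    });
  }
}

// Firefox timing: the second mousedown sneaks in between the mouseup event
// and the asynchronous removal of the line.
async function fastDoubleClick(r: ColumnResizer): Promise<void> {
  await r.onMousedown();          // first press: line 0 lands in the DOM
  const removal = r.onMouseup();  // release: removal is only scheduled...
  const second = r.onMousedown(); // ...when the second press arrives
  await Promise.all([removal, second]);
}

async function main() {
  const buggy = new ColumnResizer(false);
  await fastDoubleClick(buggy);
  console.log(buggy.log.join(" | "));
  // mousedown | line added | mouseup | mousedown | Exception | line added

  const fixed = new ColumnResizer(true);
  await fastDoubleClick(fixed);
  console.log(fixed.log.join(" | "));
  // mousedown | line added | mouseup | mousedown | line removed | line added
}

main();
```

With the fix, the scheduled removal still sees the element that is actually in the DOM, and the second line is attached and tracked afterwards as usual.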

  1. ^

    And waiting 5 minutes for the Dart to re-compile to JS.

  2. ^

    In Dart, one actually calls print with an interpolated string.

Electronic Health Records Considered Harmful


Reading An American Sickness, a blood-boiling portrait of the American healthcare system by former doctor and NYT reporter Elisabeth Rosenthal, has led me to reflect on my own small role in this tragedy, back when I was a junior software engineer working at an EHR company. Rosenthal only briefly touches on the role of medical software in rising costs, but where she does, her analysis fits with my experience.

For background, most hospitals and doctor's offices will use a single, monolithic piece of software called an Electronic Health Records system or EHR for all of their day-to-day needs, e.g. scheduling appointments, charting patients' visits, prescribing medications, ordering lab tests, and often billing. Insofar as other "endpoint" medical software exists, e.g. imaging software on an MRI machine, it generally feeds directly into the EHR system. Thus EHR systems largely determine the medical software landscape.

Rosenthal cites two ways in which EHRs contribute to rising prices:

  1. Barriers to competition: EHRs are not compatible with each other, and endpoint medical software tends to be compatible with a specific EHR. Thus patients are forced to use the same, high-priced hospital labs, radiologists and specialists for all of their care.
  2. Costly defaults: Doctors will almost never choose any but the default option for drugs, lab orders, imaging or specialist referrals. Hospitals know this, and configure their EHRs to default to high-priced in-hospital options. Some EHRs, such as PracticeFusion[0], make commissions off of orders and prescriptions made through their systems, and set defaults accordingly.

An understated yet critical factor is that EHRs are horrible. They are incredibly hard to use and chock-full of bugs, and there's a good chance that the data one doctor puts in won't be anything like what the next one pulls up. This results in ubiquitous inefficiency and the occasional deadly mistake. But how did it come to this?

As Rosenthal notes, EHR adoption was very fast and is at nearly 100% now, due largely to a government program called Meaningful Use, which offered huge subsidies to doctors who used a "Meaningful Use certified" EHR system. This incentive was so large that no uncertified software could possibly compete in the market, so every EHR vendor raced to get certified.

However well-intentioned, it delivered a one-two punch to medical software as a whole. On one front, it required EHR software to implement a long list of underspecified requirements, such as:

(14) Patient list creation. Enable a user to electronically and dynamically select, sort, access, and create patient lists by: date and time; and based on each one and at least one combination of the following data: (i) Problems; (ii) Medications; (iii) Medication allergies; (iv) Demographics; (v) Laboratory tests and values/results; and (vi) Ambulatory setting only. Patient communication preferences.

It is unclear what any of this means. What exactly does it mean to sort a list of patients by "problems"? Sort by the number of problems? That doesn't seem very useful. Sort by the name of the kind of problem, alphabetically? What if they have multiple problems? Sort by severity? How do you rank the severity of the problems then? And so on for every item in this list. As a result, EHR vendors had no idea what to implement, and so implemented the easiest thing that sort of looked like the requirements. Even if the sorting did not work as the engineers intended, the certifiers had no grounds to object, since they had no specifications as to how it should work.

As one of my then colleagues put it:

It was like the Health Department requiring every restaurant to serve each of 1000 dishes, where item 697 was 'Hamburger: has ketchup', and you could serve a brick with ketchup to the inspector and get a pass.

And serve a brick with ketchup they all did; it was the only practical way to meet all of the requirements on time.

On the other front, Meaningful Use has us stuck with the first EHRs to meet its underspecified requirements. Since EHRs are so vast and incompatible, almost no doctor is ever willing to switch EHRs. Thus any new EHR must break into the market by targeting doctors still using paper and pen. Yet Meaningful Use artificially drove almost every doctor to adopt an EHR, so there are almost no doctors for a new EHR to target.

Furthermore, "meaningful" is not at all how I would describe the use of EHRs in American medicine. A shocking amount of the time, EHRs are treated as just one more sink into which data must be manually entered (never to see the light of day again), or as one more source of mutually conflicting data about a patient to find, shake your head at and ignore. What's worse, this may be the correct approach to dealing with current EHRs.

Rosenthal is right that it is practically impossible to transfer data between EHRs, or to use endpoint equipment not specifically built for that EHR, reducing competition and causing wasteful duplication of tests and examinations. Lack of interoperability is nothing new in the software world, but EHRs have definitely refined the art. The most obvious cause is the woeful inadequacy of HL7, the "standard" data format implemented by most healthcare software. It has a host of flaws, but the largest is that it offers far too much flexibility, and as a result all implementations emit very different HL7 for the same data, while only accepting a very limited and arbitrary subset of HL7.

There is some hope in this area: the FHIR draft standard has been gaining traction and has some early implementations. It rectifies the most serious flaws in HL7 by building on more modern data formats (namely JSON and XML) and thereby making it harder to write an overspecialized parser.
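To illustrate the difference (with schematic, invented messages, not output from any real system), two HL7 v2 implementations can render the same patient in structurally different ways. Both of the following are plausible PID segments for one person, yet a parser written against the first feed will likely choke on the second:

```
System A:  PID|1||4021^^^HOSP^MR||DOE^JOHN||19800101|M
System B:  PID|1||4021^^^HOSP^MR~999-55-1234^^^USA^SS||Doe^John^Q^^Mr.^^L||19800101120000|M
```

A FHIR resource for the same patient leaves far less room to improvise, because field names and value formats are pinned down:

```json
{
  "resourceType": "Patient",
  "identifier": [{ "system": "http://hospital.example.org/mrn", "value": "4021" }],
  "name": [{ "family": "Doe", "given": ["John"] }],
  "gender": "male",
  "birthDate": "1980-01-01"
}
```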

But there are two less obvious yet equally crippling limitations to interoperability: network access and authentication. Even if systems can parse each other's data, how are they supposed to locate that data, and how are they supposed to be granted authorization for it? Most systems are made "accessible" over some sort of network, but in bizarre and limited ways.

Should you actually get network access to a system, how are you supposed to obtain authorization to send or receive data for a patient? Outside of systems installed by the same vendor—which of course are granted full access to each other—the "standard" solution is to implement patient accounts and an OAuth or (shudder) SAML system by which patients can provide authorization to other systems. But even when these are implemented correctly (a rare occurrence), it is rarely clear how to contact the patients with an authorization request, and the patients rarely know that these accounts have been made for them and thus have no idea what to do when they receive a request.

These failures are probably a result of the strong incentives that local monopolies have against interoperability. But it is unclear how to counteract these incentives without first fixing the problem of local monopolies. Normally government regulation could intervene, but regulation has lagged too far behind technology and cannot currently address these issues; the attempts within Meaningful Use to do so were useless at best.

Going Forward

If I had to propose a solution, I would steer clear of having the government regulate EHR software directly and instead have them attempt to force interoperability, so that an actually decent EHR would stand a chance in the market. I would ask them to play the role of data broker. Suppose the government ran a service that all EHRs had to register with, where a patient could log in, view the data each EHR had on them in a standardized format, and check a box for each EHR they wanted to have access to that data.

This system would not have to be good, so long as it was usable. Of course I would expect EHRs to only send partial data to it, and do so unreliably, so it would be important that the agency managing it respond to patient complaints with swift and hefty fines.

There are some obvious criticisms of this plan:

Authentication remains a difficult problem, although at least under this system it only needs to be solved once. For authenticating patients, the system could use the same standards the IRS uses for tax filings. These are not perfect, but if someone can access and manipulate your tax filings, that's usually worse than accessing your medical records. Authenticating EHRs could be fairly simple using existing technologies, for example by having the EHRs use client certificates when sending data to and retrieving data from the system. A more complex authentication system like OAuth would allow EHRs to integrate with the system without waiting for the system to approve them and load their client certificates—relying entirely on the patient's confirmation through the OAuth flow instead—but it is unclear to me that this is worth the extra complexity for both the system and EHRs, or even desirable.

In a more perfect world we wouldn't trust the government with our medical records, but the healthcare data privacy ship has set sail, circumnavigated the globe, caught fire and sunk. By and large the government already has our medical records, whether through Medicare, Medicaid or Meaningful Use audits. So do EHR systems, whose security is a bad joke among infosec professionals—in fact, this is how I became interested in infosec, which is my current field. Doctors' offices and hospitals are even worse: all you have to do is fax them a request for information with a patient's name and date of birth—which can even be wrong—and they will fax medical records back!

While the US Government has had some high-profile tech screw-ups (cough cough), under the Obama administration it built some good teams at the US Digital Service and especially 18F, which have helped many government agencies run their tech competently. These people also know how to keep data encrypted at rest and in transit, which would be better security than the status quo.

At the end of the day, I don't know whether doctors would use this system. Doctors by and large claim to hate current EHRs, and that they want better. But their stated preferences and their revealed preferences are different things, and I've usually been disappointed. If nothing else, this system could at least allow savvy patients with serious conditions to own their data and be their own doctors, which increasingly seems like their only option.

  1. ^

    Until recently, that is—PracticeFusion has gone out of business. Good riddance.