Some of my thoughts, filtered slightly for public consumption.

A Front-End Debugging Adventure

Like many tricky bugs, this one begins with a ticket from QA containing an unholy incantation which, when uttered, will summon a demon to lay waste to the app. In this case: "Go to this page in Firefox, double-click on X, then on Y, then on Z, then click on W."

First I must reduce this incantation to something sane. It turns out that X and Y are superfluous; double-clicking on Z will (sometimes, and only in Firefox) throw a silent exception, leaving the app in an inconsistent state, which only becomes apparent to the user upon clicking on W. The exception itself is useless (the stack trace, like any stack trace for JS code generated from Dart, ends in Array.prototype.slice.apply(arguments)), but this is nearly a smoking gun: Z has a double-click listener on it, which must be triggering the exception. Commenting out the meat of the double-click listener[0] does not fix the problem, though. Neither does removing the listener entirely.

The only other cause I can think of is the mousedown listener. Double-clicking slowly reveals that the exception is actually thrown before the mouse button is released the second time. So this bug is disguised—it's actually a bug in the mousedown listener, but only triggered the second time, and only if it's fast enough, and only in Firefox.

Clearly we need to dig into the meat of the mousedown listener. Nothing in it looks browser-dependent—no fancy APIs are being called, no prefixed styles are applied. It could be caused by one of Firefox's nearly endless supply of open bugs, but having dealt with a number of those before, my intuition suggests otherwise. The only other plausible difference between browsers I can think of is timing—Firefox's layout algorithms are much slower than Chrome's—which could trigger a race condition that went unnoticed in Chrome.

Some background: This code is part of a fancy table component, and allows users to click and drag the border between columns to resize them. It handles this by installing a mousedown listener on a thin div along the border, which on mousedown installs mousemove and mouseup listeners, and adds a line to the DOM that moves with the user's cursor (because fancy). The mouseup listener then removes the line. For performance reasons, these DOM modifications are performed asynchronously. It is here that I expect a race condition.

Like stack traces, breakpoints in the compiled JS are useless, so I fall back to classic printf debugging[1]. Comparing the logs produced in Chrome with Firefox, I see the following ordering (with numerous ultimately irrelevant steps removed of course):

Chrome          Firefox
mousedown       mousedown
line added      line added
mouseup         mouseup
line removed    mousedown
mousedown       Exception
line added

Clearly the second mousedown event is somehow interfering with the mouseup handler removing the line. But a single click, no matter how quick, does not trigger the exception. What is special about the second mousedown?

Digging into the mousedown and mouseup handlers, the normal order of events is:

  1. The mousedown handler creates a div, storing it as a private class member _lineElem, and schedules it to be added to the DOM.
  2. Once _lineElem is added to the DOM, the mouseup handler is installed.
  3. When the mouseup handler is triggered, it removes _lineElem from the DOM.

Given that we observed the second mousedown event before the line was removed, the buggy order of events must be:

  1. The mousedown handler creates a div, storing it as a private class member _lineElem, and schedules it to be added to the DOM.
  2. Once _lineElem is added to the DOM, the mouseup handler is installed.
  3. The mousedown handler creates a new div, overwriting _lineElem, and schedules it to be added to the DOM.
  4. The mouseup handler is triggered, and tries to remove _lineElem from the DOM.

But the line currently attached to the DOM is no longer referenced by _lineElem—that points to a new element that isn't yet in the DOM. When the mouseup handler tries to remove it, an exception is thrown and Angular cannot complete the digest cycle.
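The race can be reproduced without a browser. What follows is a minimal sketch in TypeScript, not the app's actual Dart code: FakeDom, WriteQueue, BuggyResizer and replayDoubleClick are all hypothetical names, and the sketch assumes the line's removal is one of the asynchronous DOM writes, so it can still be pending when the second mousedown arrives.

```typescript
// Hypothetical stand-in for the DOM: removing a node that is not attached throws.
class FakeDom {
  private attached = new Set<object>();

  add(node: object): void {
    this.attached.add(node);
  }

  remove(node: object): void {
    if (!this.attached.has(node)) {
      throw new Error("node is not attached");
    }
    this.attached.delete(node);
  }
}

// Models the asynchronous DOM writes: tasks run only when flush() is called.
class WriteQueue {
  private tasks: Array<() => void> = [];

  schedule(task: () => void): void {
    this.tasks.push(task);
  }

  flush(): void {
    const pending = this.tasks;
    this.tasks = [];
    for (const task of pending) task();
  }
}

class BuggyResizer {
  readonly log: string[] = [];
  private lineElem: object | null = null;
  private dom: FakeDom;
  private queue: WriteQueue;

  constructor(dom: FakeDom, queue: WriteQueue) {
    this.dom = dom;
    this.queue = queue;
  }

  onMouseDown(): void {
    this.log.push("mousedown");
    const elem = {};
    this.lineElem = elem; // the bug: overwritten before elem is attached
    this.queue.schedule(() => {
      this.dom.add(elem);
      this.log.push("line added");
    });
  }

  onMouseUp(): void {
    this.log.push("mouseup");
    this.queue.schedule(() => {
      // Reads lineElem when the write runs, not when mouseup fired.
      this.dom.remove(this.lineElem as object);
      this.log.push("line removed");
    });
  }
}

// Replays a double-click. With slowFlush set, the pending removal is still
// queued when the second mousedown arrives (the Firefox ordering).
function replayDoubleClick(slowFlush: boolean): string[] {
  const queue = new WriteQueue();
  const resizer = new BuggyResizer(new FakeDom(), queue);
  try {
    resizer.onMouseDown();
    queue.flush();
    resizer.onMouseUp();
    if (!slowFlush) queue.flush();
    resizer.onMouseDown();
    queue.flush();
  } catch (_err) {
    resizer.log.push("Exception");
  }
  return resizer.log;
}
```

Calling replayDoubleClick(false) yields the Chrome column of the log above, and replayDoubleClick(true) yields the Firefox column, ending in the exception.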

The fix is simple enough—only set _lineElem once the element has been added to the DOM.
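As a rough illustration of that fix, here is a sketch in TypeScript rather than the real Dart. FakeDom, WriteQueue, FixedResizer and replayWithFix are hypothetical names, and the null check in the mouseup path is an extra safeguard that I am assuming, not something the original code is known to contain.

```typescript
// Hypothetical stand-in for the DOM: removing a node that is not attached throws.
class FakeDom {
  private attached = new Set<object>();

  add(node: object): void {
    this.attached.add(node);
  }

  remove(node: object): void {
    if (!this.attached.has(node)) {
      throw new Error("node is not attached");
    }
    this.attached.delete(node);
  }

  get size(): number {
    return this.attached.size;
  }
}

// Models the asynchronous DOM writes: tasks run only when flush() is called.
class WriteQueue {
  private tasks: Array<() => void> = [];

  schedule(task: () => void): void {
    this.tasks.push(task);
  }

  flush(): void {
    const pending = this.tasks;
    this.tasks = [];
    for (const task of pending) task();
  }
}

class FixedResizer {
  private lineElem: object | null = null;
  private dom: FakeDom;
  private queue: WriteQueue;

  constructor(dom: FakeDom, queue: WriteQueue) {
    this.dom = dom;
    this.queue = queue;
  }

  onMouseDown(): void {
    const elem = {};
    this.queue.schedule(() => {
      this.dom.add(elem);
      this.lineElem = elem; // the fix: set only once elem is attached
    });
  }

  onMouseUp(): void {
    this.queue.schedule(() => {
      const elem = this.lineElem;
      if (elem !== null) { // assumed safeguard, not from the original code
        this.dom.remove(elem);
        this.lineElem = null;
      }
    });
  }
}

// Replays the slow (Firefox) ordering: the first removal is still pending
// when the second mousedown arrives.
function replayWithFix(): { threw: boolean; attached: number } {
  const dom = new FakeDom();
  const queue = new WriteQueue();
  const resizer = new FixedResizer(dom, queue);
  let threw = false;
  try {
    resizer.onMouseDown();
    queue.flush();
    resizer.onMouseUp();
    resizer.onMouseDown(); // arrives before the pending removal has run
    queue.flush();
    resizer.onMouseUp();
    queue.flush();
  } catch (_err) {
    threw = true;
  }
  return { threw, attached: dom.size };
}
```

In this model the pending removal still sees the first line, because _lineElem's counterpart is not reassigned until the second line is actually attached, so both lines are removed and nothing throws.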

[0] And waiting 5 minutes for the Dart to re-compile to JS.

[1] In Dart, one actually calls print with an interpolated string.