Formerly a mathematician, I started programming on the back-end, specializing in algorithm design and implementation, in C if I needed the performance or in Haskell otherwise. After a couple years I started branching out, doing front-end development on Android and the web, and tackling a broader range of applied problems. For a while I was a full-stack engineer working in Python/Django and JavaScript at drchrono, an electronic health records startup. I've now moved all the way to Frontend Engineer, working at Google in Mountain View, CA.

You can see many of my projects here, as well as on my GitHub.

Password Strength


Mark Burnett's release of 10M passwords got me wondering just how strong people's passwords are on average. SPOILER: NOT VERY. So I designed a Python tool to estimate how many bits of entropy a password has against a strong attacker. It considers several plausible strategies an attacker might use, and returns the lowest number of bits the password has relative to any strategy.

The strategies used are:

These strategies cover most common methods of password creation. To use XKCD's example, "Tr0ub4dor&3" is a bad password while "correcthorsebatterystaple" is good. This is actually slightly off: "Troubador" is actually spelled "Troubadour". With corrected spellings, we have:

Password XKCD's estimate My estimate
Tr0ub4dour&3 28 30
correcthorsebatterystaple 44 50

Applied to Mark's password list, the results are alarming:

Threat Model Appropriate Services Bits of Entropy # Users % Users
Your little sibling Piggy bank combination 0-19 3178432 31.8%
Mischievous friends Forum accounts 20-29 3280626 32.8%
Jealous ex Photo storage, social media 30-39 1976416 19.8%
Organized crime Financial accounts 40-49 1009985 10.1%
Governments Email, keyrings 50-59 308812 3.1%
Determined governments PGP keys, underground networks 60-999 245702 2.5%

What caused this situation, and what can we do about it? The primary problem is that we've trained anti-patterns into users. The usual process of creating a password goes something like this, breaking as soon as the user is allowed:

> Enter password: password
Error: password must contain a number
> Enter password: passw0rd
Error: password must use mixed case
> Enter password: Passw0rd
Error: password must contain a symbol
> Enter password: Passw0rd!
Error: password must not contain a word
> Enter password: P4ssw0rd!

Combine this with bizarre length limits (USAA, for example, only allows passwords up to 12 characters, which is what originally prompted my password generator), and it is easy to see why most passwords are simple patterns applied to single words. Compare instead a password like "correcthorsebatterystaple", which fails even the first test. To get something like real security and overcome people's training, I think we need intelligent password strength tests on websites, combined with correct instructions on making strong passwords.

Password Generator


This combines a master password with the name of a service and takes the SHA-256 hash. An (old) bash version is available on my GitHub, and an android version is available on the Google Play store as SHA-PASS, with code also on GitHub. This is a Javascript version, the source of which is here.

Budget Supercomputer


Like most hackers, I have far more uses for computing power than I have computing power. Unfortunately, even Cray's new budget supercomputer, the XC30-AC, is a tad out of my price range at $250k. Fortunately you can buy that much processing power on AWS for $10/hr and hack together a Beowulf cluster. Who could resist?

The Problem

I was doing some research with a professor on the representation theory of the symmetric group SnS_n, for which we needed to compute some very large and expensive matrices called character tables. The algorithmic and computational difficulty of the problem is nicely summed up by the complexity: T(n)Θ(n5/2+2/2p(n)1+2)T(n) \in \Theta\left(n^{5/2+\sqrt2/2}p(n)^{1+\sqrt2}\right) Here p(n)p(n) is the number of partitions of nn, which grows roughly like exp(n)\mathrm{exp}(\sqrt{n}). In my case n=46n=46, hence the necessity of a supercomputer. The paper describing the algorithm I was implementing claims a lower complexity by a factor of n\sqrt{n}, but this analysis rests on the assumption that sparse polynomials of length nn but arbitrary degree and arity can be added to dense polynomials in O(n)O(n) time. If there's a machine architecture that can do that, I would dearly like to know.

The code for all this is on GitHub, and the math is discussed in more depth on my mathematics page. Don't worry if the math is Greek to you. The key facts are that these computations are 1) very hard and 2) embarrassingly parallel.


AWS offers three different types of computation instances: Reserved, On-Demand and Spot. Reserved instances are for long-term applications, so we have to decide between on-demand and spot. On-demand instances give you a guaranteed instance for as long as you want it. You bid on a spot instance; as soon as the price falls below your bid, you'll get the instance, and as soon as the price rises above, you'll get kicked off. No saving, no warning, just poof! But spot instances are generally much cheaper than on-demand, so this is what we'll use.

You might expect (like me) that the spot price of computing instances would vary about like the spot prices of commodities. It doesn't. Spot Prices on AWS The price is more or less constant most of the time, but occasionally you get a gigantic spike. However, since the average price is so much lower than on-demand, we will be using spot instances.

Instance Type

It is tempting to just go with c3.8xlarge instances and be done with it. But since we know systems don't always perform as we expect, it is worth testing. This turns out to be wise: I got almost 100% CPU usage on c3.4xlarge instances or smaller, but only %65 usage on c3.8xlarge instances. (I have since performed the same test on AWS's new c4.8xlarge instances, which seem to have fixed the parallelization issues with the c3.8xlarge and give me almost 100% usage.) c3.4xlarge performance I decided to launch 16 c3.4xlarge instances as slaves, for a total of 256 cores. For a master I used an on-demand t2.medium instance, which offers decent performance when only used intermittently, as was the case here.

Beowulf Cluster

The AWS offers a Hadoop service which makes distributing computations trivial, but it costs almost as much as the spot instances themselves. We're too cheap for that.

Since this problem is embarrassingly parallel, all the nodes need to do is agree on are indices from 0-15, and then they need only write their results every few minutes to a shared directory which can then be put together the master. Enter the hack: I used NFS to create the shared directory, then had each node write its IP address as a filename. Once the node detected that there were 16 such files, it selected the index of its own IP address among the 16 and started computing.

# connect to master
sudo mkdir /mnt/master
sudo mount -t nfs %ipaddr%:%dir% /mnt/master

# perform handshake to get unique id
cd /mnt/master/handshake
touch "$HOSTNAME"
while [ $(ls -1 | wc -l) -lt %num_nodes% ]; do
id=$(ls -1 | grep -n "$HOSTNAME" | sed "s/:.*//")
let id=id-1


Working with distributed, virtualized systems is not without its pains. One of my machines' NFS client just died a few hours in (at the start of my classes, of course) and had to be restarted. Thankfully my program saved its progress every few minutes, so this was relatively costless. All but one instance finished at roughly the same time, after about 16 hours. For some reason--which must be related to virtualization somehow--one of my instances took twice as long as all the others. It almost certainly wasn't poor balancing; it would be a very strange world indeed where the complexity of the characters of S46S_{46} depends heavily on their index modulo 16.

Storing Data in GIFs


Did you know that Flickr offers 2TB of free storage, but only for images? If only there were a way around that...

These programs encode and decode arbitrary files as uncompressed GIFs. The files increase in size by roughly 1/7, because uncompressed GIFs only allow bytes in the range 0x00-0x7F in the data block. The encoding maps every block of 7 bytes into a block of 8 bytes where the first byte stores the first bit of each other byte.

For example, the encoding program and its source:

GIF'ed executable GIF'ed source

You can see the 340-byte hard-coded header clearly in both images. In the executable on the left it's the middle pinkish strip; in the source on the right it's the big section with diagonal green lines going through it.

Formal Morality


My project to mathematize everything could never be complete without mathematizing morality. I model a moral theory as a groupoid of preorders of probability distributions of worldstates. So it's pretty simple really. A lot of mathematical setup finally gives us the function we need:

-- judges which state is Just (punnyness >9000), or Nothing
judge :: MoralTheory -> PossibleState -> PossibleState -> Maybe PossibleState
judge theory state1 state2 = case (comp (veil theory state1) (veil theory state2), comp (veilMinusSelf theory state1) (veilMinusSelf theory state2)) of
	(PGT, PGT) -> Just state1
	(PLT, PLT) -> Just state2
	_ -> Nothing
		comp = pCompare $ fst $ head theory

If you're philosophically inclined you might notice the function "veil", which is indeed a reference to Rawls. A more in-depth discussion and the code itself can be found on GitHub.

Running a toy moral theory with 2 actors and 3 worldstates, we get:

Worldstates: ["Alice falling in love","Bob falling in love","Both getting $10,000"]
Alice's values: Lexicographical (Alice in love, Bob in love, money)
Bob's values: Bob in love with probability x is valued the same as money with probability 3x, Alice in love is valued least.
Alice -> Bob Correspondence: 
        Alice falling in love -> Bob falling in love
        Bob falling in love -> Alice falling in love
        Both getting $10,000 -> Both getting $10,000

Alice's duties: 
        (ProbDist [("Alice falling in love",0.0),("Bob falling in love",1.0),("Both getting $10,000",0.0)],ProbDist [("Alice falling in love",0.0),("Bob falling in love",0.0),("Both getting $10,000",1.0)])
        Judgement: Just (ProbDist [("Alice falling in love",0.0),("Bob falling in love",1.0),("Both getting $10,000",0.0)])
        Alice must choose Bob falling in love over the money, because both agree that the veiled situation of falling in love half the time is more valuable than 100% chance of getting the money. The unveiled situation happens to agree with Alice's personal values as well.

        (ProbDist [("Alice falling in love",1.0),("Bob falling in love",0.0),("Both getting $10,000",0.0)],ProbDist [("Alice falling in love",0.0),("Bob falling in love",0.0),("Both getting $10,000",1.0)])
        Judgement: Nothing
        Alice values falling in love over getting the money, and Bob is indifferent for his equivalent situation. This would result in a duty to Alice, except it would be a duty to herself, which is impossible.

        (ProbDist [("Alice falling in love",0.75),("Bob falling in love",0.25),("Both getting $10,000",0.0)],ProbDist [("Alice falling in love",0.0),("Bob falling in love",0.0),("Both getting $10,000",1.0)])
        Judgement: Nothing
        Alice, were she in Bob's shoes, would value Bob falling in love over the money at any probability, but when her values are not imposed on Bob he does not value the options this way. Thus no duty is created.

Bob's duties: 
        (ProbDist [("Alice falling in love",1.0),("Bob falling in love",0.0),("Both getting $10,000",0.0)],ProbDist [("Alice falling in love",0.0),("Bob falling in love",0.0),("Both getting $10,000",1.0)])
        Judgement: Just (ProbDist [("Alice falling in love",1.0),("Bob falling in love",0.0),("Both getting $10,000",0.0)])
        The second case above, from Bob's perspective. While Alice cannot have a duty to herself, Bob can have a duty to her. Note that the unveiled situation disagrees with Bob's own values, so in this case the moral theory has nontrivial implications.

Mouse Events Demo

I've noticed a lot of confusion around how mouse events work, such as the difference between mouseover and mouseenter, what order events are fired in, and what happens when the user moves their mouse around while holding down a mouse button. I've made a demo with an explanation of the various events; it's relagated to its own page to avoid polluting event listeners.