Modelling the bodybuilding process

One of the biggest challenges in starting the process of bodybuilding is looking in the mirror every day. You see the person you are, weighed down by the baggage of the person you've been seeing for years - not the person you're intending to become.

Visualisation is key to motivation.

To help with that, I've been working on some... projections of the effects of my workouts, plotting ahead towards the 9/1 deadline I've set for myself for Burning Man 2015.

Improving the model of base card value

So while the previous model had a good fit, we can do better.

First, it's worth noting that the previous model was trained only on creatures; weapons with no card text weren't included in the fit, and were instead scored against the creature model.

By adding a categorical value for the card type, we can include both sets of data in the model, and provide a better prediction for both creatures and weapons.

Lastly, we add a term to the model to account for card balance, penalizing cards that are "lopsided" towards attack or health.
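
Under the hood, a categorical (factor) term like the card type is just an extra indicator column in the design matrix. Here's a minimal numpy sketch of that design matrix - made-up card data, and ordinary least squares standing in for the poisson GLM - including the sqrt(abs(Attack - Health)) balance term described above:

```python
import numpy as np

# Made-up card data for illustration: (attack, health, card_type, mana).
# card_type 0 = creature, 1 = weapon -- the categorical term becomes an
# indicator column once it's coded into the design matrix.
cards = np.array([
    [1, 1, 0, 1],
    [2, 3, 0, 2],
    [3, 2, 0, 2],
    [4, 5, 0, 4],
    [6, 7, 0, 6],
    [3, 2, 1, 2],   # a weapon
    [5, 2, 1, 4],   # another weapon
])
attack, health, ctype, mana = cards.T.astype(float)

# Design matrix: intercept, Attack, Health, the card-type indicator, and
# the "lopsidedness" penalty term sqrt(|Attack - Health|).
X = np.column_stack([
    np.ones_like(attack),
    attack,
    health,
    ctype,
    np.sqrt(np.abs(attack - health)),
])

coef, *_ = np.linalg.lstsq(X, mana, rcond=None)
print(dict(zip(["intercept", "attack", "health", "card_type", "lopsided"], coef)))
```

Interaction terms like Attack:CardType_q2 in the fitted summary are simply elementwise products of these columns.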

Predicted vs actual for fitted values.  Remember that horizontal values (actual) are quantized integers; vertical values (predicted cost) are not.

An updated poisson(link="identity") model, including the card type as a categorical (factor) attribute of the model.

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-0.80723  -0.05861   0.03219   0.11557   0.88626  

                                              Estimate Std. Error z value Pr(>|z|)
(Intercept)                                    0.05952    0.45112   0.132    0.895
Attack                                         0.06746    1.17126   0.058    0.954
Health                                         0.81846    1.14619   0.714    0.475
sqrt(abs(Attack - Health))                    -0.29185    0.98007  -0.298    0.766
CardType_q2                                   -2.39374    3.99784  -0.599    0.549
Attack:sqrt(abs(Attack - Health))              0.25817    0.53455   0.483    0.629
Health:sqrt(abs(Attack - Health))             -0.18654    0.64174  -0.291    0.771
Attack:CardType_q2                             0.07223    2.55269   0.028    0.977
Health:CardType_q2                             0.98153    2.49477   0.393    0.694
sqrt(abs(Attack - Health)):CardType_q2         0.73281    2.73863   0.268    0.789
Attack:sqrt(abs(Attack - Health)):CardType_q2  0.46050    1.66056   0.277    0.782
Health:sqrt(abs(Attack - Health)):CardType_q2 -0.67290    1.20226  -0.560    0.576

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 59.0578  on 45  degrees of freedom
Residual deviance:  3.7054  on 34  degrees of freedom
AIC: 148.7

Number of Fisher Scoring iterations: 6

...and for a preview of just how much card mechanics beyond the Mana/Attack/Health relationship affect the true card cost, here's a plot of the predicted base vs actual for all minions and weapons.

Predicted "base value" vs the actual value, highlighting the effects of other card mechanics on the true cost of a card.

Base Card Value: Gaussian vs Poisson

All of these mathematical endeavours begin with the presumption that Blizzard has a secret formula it uses to compute the amount of mana a card ought to cost. The value printed on the card will differ from that formula's output for one of three reasons: the influence of mechanics, the effects of rounding (because a card costs 1 or 2 mana, not 1.5), and tweaking based on how it plays (variance introduced through human evaluation).

What we're ultimately trying to build (for the base card value) is a model of the basic relationships between the values on the card (Attack, Health, and Mana) and these other effects, to 'reverse-engineer' the basic formula.
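
The rounding effect is worth making concrete. Here's a toy sketch of the "secret formula" idea - a clean linear rule whose coefficients are invented for illustration (they're not Blizzard's), quantized because printed mana is always an integer:

```python
# Toy sketch: a hypothetical linear base-cost rule plus integer rounding.
# The 0.5 / 0.45 coefficients are invented for illustration only.
def base_cost(attack, health):
    return 0.5 * attack + 0.45 * health

cards = [(1, 1), (2, 3), (3, 2), (4, 5), (7, 6)]
for atk, hp in cards:
    raw = base_cost(atk, hp)
    printed = round(raw)  # quantization: mana is an integer
    print(f"{atk}/{hp}: formula says {raw:.2f}, card prints {printed}")
```

Mechanics and human tweaking then add variance on top of that quantized value.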

A standard lm model analyzes the variance of a dataset under the assumption that the errors in your data are gaussian, and produces a continuous linear predictor for those values.

As always, model fitting is part art, part science, and part throwing shit at the wall to see what works. Or, it is when I do it, at any rate.


Given our belief that there's a direct, additive relationship between the base card value and the attack/health values on the card, a glm with a poisson family should, in theory, get you a better fit than the gaussian; it better represents the expected behaviour of a count variable. But is it true?

First, the original, gaussian LM fit from last time.

Call:
lm(formula = Mana ~ Attack + Health, data = dataset, subset = CardType == 1 & CardText == "" & Mana > 0)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.9940 -0.2844  0.1968  0.2218  0.7611 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.16376    0.14999  -1.092    0.283    
Attack       0.50626    0.06172   8.202 2.28e-09 ***
Health       0.43566    0.06107   7.134 4.27e-08 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4999 on 32 degrees of freedom
Multiple R-squared:  0.9359,    Adjusted R-squared:  0.9319 
F-statistic: 233.6 on 2 and 32 DF,  p-value: < 2.2e-16

The autoplot for lm(formula = Mana ~ Attack + Health), showing a fit using the gaussian family.

Now let's change tack.

If we presume that our outcome value is "count"-like - additive in nature from the base values on the card - we can switch to the generalized linear model, and switch to a poisson distribution - with intriguing results.

glm(formula = Mana ~ Attack + Health, family = family, data = dataset, subset = CardType == 1 & CardText == "" & Mana > 0)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.0093  -0.2435  -0.1382   0.2194   0.9077  

            Estimate Std. Error z value Pr(>|z|)  
(Intercept) -0.17849    0.23141  -0.771   0.4405  
Attack       0.14891    0.06234   2.389   0.0169 *
Health       0.16466    0.06622   2.487   0.0129 *
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 44.9140  on 34  degrees of freedom
Residual deviance:  4.4248  on 32  degrees of freedom
AIC: 101.1

Number of Fisher Scoring iterations: 4

The autoplot for glm(formula = Mana ~ Attack + Health, family=poisson), showing a fit using the poisson family.

Note the differences in the two sets of graphs:

  • Residuals are healthier. The residuals on the poisson fit have a tighter spread; the LM range was -1.9940 to 0.7611 (~2.75), while our Poisson GLM's is -1.0093 to 0.9077 (~1.9).

  • Residual Q-Q is better. The two graphs are night and day; you now see a nice, clean, quantized Q-Q plot for the residuals, showing the stair-stepping you'd expect when you know that whatever magic formula exists has rounding applied (because mana is an integer).

  • Cook's Distance improved. We go from having some pretty strange outliers to being within 0.06 on all modelled values.

Scale-Location and Residuals Vs Leverage also see huge improvements over their normal counterpart.

In short, the poisson family appears to do a much better job of estimating the base value of the card than the normal family.
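
For intuition about what glm is doing differently, here's a bare-bones numpy sketch (not the code used above) of the iteratively reweighted least squares loop behind a poisson GLM with an identity link, run on invented vanilla-creature data:

```python
import numpy as np

def poisson_identity_irls(X, y, iters=25):
    """Fit a Poisson GLM with an identity link by iteratively reweighted
    least squares -- a bare-bones sketch of what R's glm() does."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary LS as a start
    for _ in range(iters):
        mu = np.clip(X @ beta, 1e-6, None)  # fitted means must stay positive
        W = 1.0 / mu                        # Poisson variance equals the mean
        # Weighted least squares step; with an identity link the working
        # response is just y itself.
        XtW = X.T * W
        beta = np.linalg.solve(XtW @ X, XtW @ y)
    return beta

# Invented vanilla creatures: columns are intercept, Attack, Health.
X = np.array([[1, 1, 1], [1, 2, 3], [1, 3, 2], [1, 4, 5],
              [1, 5, 4], [1, 6, 7], [1, 7, 6]], dtype=float)
mana = np.array([1, 2, 2, 4, 4, 6, 6], dtype=float)

print(poisson_identity_irls(X, mana))
```

The weights 1/mu are what make low-cost cards count more per unit of error, matching the count-variable assumption.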

pMana and Base Card Value

So, once you've scrubbed the zero-mana cards out of scoring (they're a problem disconnected from value - a zero-mana card costs you a deck slot, but the card in isolation is always a pretty good deal), you clean up a few of the outliers and end up with a pretty solid qqplot.

QQ plot of predicted vs. actual, monsters > 0 mana with no additional mechanics.

QQ plot of predicted vs. actual, for *all* monsters and weapons (including those with mechanics)

Our goal in computing a base card value is to look purely at the numbers on the card, without considering the effects of the card mechanics, the rule violations that appear on the card. You do this for a few reasons:

  • Many mechanical effects have dependencies that are situational or may just not go off the way you hope they would when you built the deck.
  • Some cards have a "Silence" mechanic that wipes the card text, leaving you with just the base minion.
  • Dependencies on combos and card synergies require careful deck engineering.

Most of your deck should be stable, dependable, and work towards a single goal; every card in your deck should help you reach that goal. Like building your first boat, or house, the temptation is to throw every cool thing you ever saw into a deck; what usually happens next is a catastrophic chain of losses.

What makes building a valuation like this interesting is the stuff at either end, the outliers on the outskirts of Value Town. I'm also pleasantly surprised to find a fairly normal distribution.

The outliers are more-or-less who you'd expect them to be.

At the bottom of the value pile is the Molten Giant, at a mana cost of 20 for an 8/8 creature that's really only worth 7-8 mana. It's all in the rule violation of the card text: Costs (1) less for each damage your hero has taken.
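
That card text is, conveniently, already a formula. A trivial sketch (assuming, as in the game, that a cost can't drop below zero):

```python
# The Molten Giant's rule violation as a one-liner: printed cost 20,
# reduced by 1 for each point of damage your hero has taken, floored at 0.
def molten_giant_cost(hero_damage, printed_cost=20):
    return max(0, printed_cost - hero_damage)

for dmg in (0, 10, 20, 29):
    print(f"hero damage {dmg}: costs {molten_giant_cost(dmg)} mana")
```

A base-value model that only sees "20 mana, 8/8" has no way to price that discount, which is exactly why the card lands at the bottom of the pile.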

At the top of the heap is Mukla's Big Brother... a massive 10/10 creature that costs 6 mana but should cost 9. Again, the card text says it all: So strong! And only 6 Mana?!

Two more cards without card text do well, here:

  • Emerald Drake, costing 4 for a 7/6 creature, is well known to be great value; at a 92/100 value rating, it's a great choice.
  • Blood Fury is a great value at 3 mana for a 3/8 card.

The other cards above 90% are all either great value or have debilitating problems with their additional mechanics.

  • Ancient Watcher at 2 mana for a 4/5 looks good until you see that its big restriction is that it can't attack.
  • Injured Blademaster has a debilitating Battlecry that deals 4 damage to himself: a 4/7 creature that becomes a 4/3 the moment you play him.
  • Earth Elemental is great value at 5 for a 7/8, but has an Overload: (3) that locks three of your mana crystals for a whole turn.

Millhouse Manastorm and Flame Imp are right up there with Ancient Watcher, but again, those debilitating mechanical violations come in to destroy their utility.

Even the Oasis Snapjaw and Chillwind Yeti appear in the correct order. (Yeti > Snapjaw, in case you didn't know.)

A cursory review of cards shows a pretty good match for general consensus on perceived value, so I'm going to roll with this for a V1.

Histogram of card$baseScore generated from linear regression

Note that it's hard to read too much from the distribution here; this is data that came out of the model, which itself presumes a normal distribution of the source data. I'd have to switch to a glm or a bayesian model to avoid making that assumption about the source data...

Which is something I'll look at in future passes. For now, I'm comfortable that the fit passes basic sniff tests.

Hearthstone: Mana cost

When evaluating the base cost of a card, you might be tempted to say that most of the cost lives in the base attributes; so how would you evaluate that statement for truthiness?

Looking at a linear regression fit from all Minions with an expressed cost >0:

Linear regression model of all Minion cards with mana cost > 0

First, just look at the quantization in the residuals-vs-fitted. Pretty, isn't it? That suggests that the mechanics associated with these cards have clear, distinguishable values; this is Blizzard's own statisticians at work.

Next is the fit to a normal distribution; not bad, and as you'd expect, the outliers are the ones whose mechanics strongly influence mana cost (in either direction).

    Min      1Q  Median      3Q     Max 
-4.7648 -0.5829 -0.1133  0.4963 11.3935 

            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.02670    0.14274  -0.187    0.852    
Attack       0.54873    0.04468  12.282   < 2e-16 ***
Health       0.53042    0.04092  12.962   < 2e-16 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.21 on 270 degrees of freedom
Multiple R-squared:  0.7617,    Adjusted R-squared:   0.76 
F-statistic: 431.6 on 2 and 270 DF,  p-value: < 2.2e-16

So a basic LM fit is surprisingly expressive - more so by far than I was expecting - and it matches Trump's view that the base cost of a card is a very important factor. In fact, even without filtering out all of the cards that represent more unusual cases, it covers more than 76% of the variance in the dataset.

We can do better, though - if we're looking to fit a model for base cost, let's restrict the dataset to those cards that don't express any other mechanics.

In other words, let's go build a linear model that fits only the relationship between Mana, Attack, and Health for minions with no other mechanics.
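
That restricted fit can be sketched in a few lines of numpy - hypothetical card rows, ordinary least squares via lstsq, and an R-squared computed by hand:

```python
import numpy as np

# Hypothetical rows: (attack, health, mana, has_other_mechanics).
cards = [
    (1, 1, 1, False), (2, 3, 2, False), (3, 2, 2, False),
    (4, 5, 4, False), (6, 7, 6, False),
    (2, 2, 3, True),   # e.g. a card whose text inflates its cost
    (4, 5, 2, True),   # e.g. undercosted, with a drawback in its text
]

# Keep only vanilla minions: mana > 0 and no other mechanics.
plain = [(a, h, m) for a, h, m, mech in cards if not mech and m > 0]
A = np.array([[1.0, a, h] for a, h, _ in plain])   # intercept, Attack, Health
y = np.array([float(m) for *_, m in plain])

coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
ss_res = float(np.sum((y - pred) ** 2))
ss_tot = float(np.sum((y - y.mean()) ** 2))
r_squared = 1 - ss_res / ss_tot
print("coefficients:", coef, "R^2:", round(r_squared, 3))
```

Predicting the held-out mechanic cards with the same coefficients then shows how far their text pushes them off the base-value line.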

Linear regression model of all Minion cards with mana cost > 0 and no other mechanics.

The resulting fit is better, too - we're at 93% of the variance of the data covered by the model.

    Min      1Q  Median      3Q     Max 
-1.9916 -0.2755  0.2203  0.2287  0.7730 

            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.17189    0.13566  -1.267    0.213    
Attack       0.50416    0.05960   8.460 3.56e-10 ***
Health       0.43905    0.05918   7.419 7.90e-09 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4879 on 37 degrees of freedom
Multiple R-squared:  0.9338,    Adjusted R-squared:  0.9302 
F-statistic: 261.1 on 2 and 37 DF,  p-value: < 2.2e-16

Of course, the problem with this is that we're now restricted to 37 degrees of freedom, and there's still quite a bit of scatter between the fit and residuals.

In fact, if you take this model and use it to predict mana cost for every card, as if each card had no mechanics beyond its base values, you get something like this:

It bodes well for my tuning of cost evaluation models for Deckalytics.

Weathrman is no more...

Weathrman was built around a simple idea; search Flickr for a photograph taken near you, showing weather conditions and time of day roughly similar to your current location's weather conditions and time of day.

For the most part, it worked surprisingly well, given the constraints: I had limited API capabilities from Flickr, time-of-day is badly managed in images, most of Flickr's images weren't geocoded well, the API is horribly slow, and although we start searching hyperlocal and step upwards until we find something, each of those queries runs sequentially (due to both Flickr's daily query limit and what AppEngine costs me).

It's been many years since I seriously looked at the codebase; the app still worked, but the quality of the images it pulled had steadily decreased - people aren't really using Flickr like they used to.

The net effect is that the app was old (written for Froyo, and last updated in 2011), had bad image choices due to limited API control... and for a while, was actually costing me money to run, as it had enough users to hit AppEngine's limit on free CPU time.

But no longer; it's fallen out of use and gets few requests. The app has bitrotted, people have moved on from being endlessly fascinated with live wallpapers, and there's no point leaving it up.

So down it comes, three years after my last update.

If it brought you joy, thank you; if it brought you tears, I apologize.

It's ALIVE!!!!

AdSense Dashboard 3.1 is alive!!!

For those who've been wondering, life's been a little busy lately. Since moving to Mountain View, I've taken over the role of Tech Lead and Manager on the advertiser-side frontend of our Display business at Google - what we call the 'Content AdWords Frontend'.

That's left me with precious little time; I first knew about the v1.2 API six months or more before it was released, and knew I'd have to migrate the dashboard to it - but I ended up moving to Mountain View to take on this role before I had the chance to do so.

Now that my other major project, Search and Display Select, has launched, I have a moment to take a breath and fix the dashboard.

Now, keep in mind I still want everyone to move over to the official app - but it's also true that it's 4.0+ only, and I supported folks on Froyo.

Froyo is no more. If you're still on Froyo, go buy a new phone; the OS is much nicer now and you'll be happy you did.

So the AdSense Dashboard is Gingerbread or later now. I've gone and changed a few things to make it easier to maintain and take some of the pain away - including moving to Play Services for authentication. (Auth used to be particularly ugly under the hood.)

New navigation hierarchy

  • Local TimeZone everywhere. Everyone but Google thinks in their local timezone. So timezone reporting isn't optional; the app works in the timezone you gave AdSense.
  • A new widget that supports resizing and lockscreen use.
  • Goodbye, ViewPager. It was broken anyways, and we now have way too many reports to just blindly page through.
  • Hello, Navigation. The new design paradigm on Android is an ActionBar button linked to a navigation drawer; now that we have navigation, we've added more reporting.
  • New reports. Ad unit and site reporting have been added.
  • More data. A full set of metrics on all of the reports we show.
  • Pull To Refresh. Because I was wrong, Nick.
  • New icons. While playing with the navigation drawer I found we needed some kind of visual indicator. I wanted scalable icons that worked at all DPIs, but was way too lazy to actually go and make icons in all of those sizes. Enter FontAwesome, a font with a host of icons of just the right style and use case; that, plus a customised TextView that supports specifying a font, and a bit of aggressive caching of typography, and we've got some icons in the app now.
  • Use of typography. I switched everything over to Roboto, the font used in JellyBean and KitKat. This is temporary, until I can find (or get around to buying for app embedding) something like Trafalgar and Requiem.
  • API 1.4. The new API version introduced a change I've been begging for since the first version of the API; at least some of this happened because they're now seeing these problems for themselves as users of the API.
  • Rewrite of networking. I rewrote the networking to make a single batch request at the same time I moved to Play Services for authentication. The refactor cut the amount of code I had in the app by more than half.
  • Use of Play Services SDK. Play Services adds a lot of critical support for doing auth properly across a wider range of devices.

And, of course, moving off of the v1.2 API, which is what broke the app for all of October and November.

I did this in two stages - a 3.0 release in November, and a 3.1 release just a few days ago to make use of some of the earlier cleanup.

Along the way, I cleaned up a bunch of code, imported the 1.4 libraries, followed the daisy chain of required updates, moved to Android Studio, deleted all of that and switched to the maven repository, fixed all the maven conflicts, updated to later versions of support libraries, rewrote a bunch of stuff that the support library changes broke, etc. This has been, undoubtedly, a massive yak shaving exercise; but it's better off now.

We'll see where things go next; as always, send me your feature requests, complaints, and general chatter to the support address.

AdSense Dashboard Update

TL;DR: The app is broken; a fix is coming this weekend. The AdSense API we were using has been replaced by a newer version. The app will be updated this weekend with fixes and new features. In the meantime, use the official Google AdSense app in the Android Market.

So since I wrote that last message, I found myself with an unusual job offer; I left the AdSense team in London, which I'd been on for years, and moved to Mountain View to run the Display portion of the AdWords frontend and help advertisers make sense of our display ads business.

So my life has been a bit upside-down lately, and that's meant I've had very little time to launch the version of the app that wouldn't have broken when the rest of the AdSense API team did their launch. Most of my life is still in boxes, there's IKEA furniture stacked in little flat-pack cardboard boxes in our new house in Noe Valley, and the dog sees me for about three hours a day.

None of those things are excuses for having failed to deliver you an update, but I hope they go some way toward explaining why it didn't happen in time.

I'm very sorry the AdSense Dashboard isn't working for you today. I'm going to fix this very soon - this weekend - and will iterate a few times over the weekend to clean up a few long-standing bugs and handle a few errors and crashes that the occasional person has reported.

Now, it's important to state clearly that this app isn't going to live forever. I really want everyone to migrate to the official app - I worked hard to get that team officially sanctioned so they could build the best possible user experience for you and the rest of our AdSense users - and they're working hard to build out a fantastic feature set; it already has features I don't offer, like notifications. If that app has the feature set you're happy with, switch now; if it doesn't, tell them what you most wish it had by providing feedback on the app, which they're watching closely.

I miss my old team. I'm so proud of the app they've built, and the UX team has done an amazing job of making something not just functional but beautiful. Now's a great time to go and try their app again - if you haven't used it recently - and remind yourself of what having full-time engineering on a great user experience can achieve, rather than some guy in his 20% time.

If you have any comments or questions, feel free to contact me.