Learn Statistics with Slugdge


You’d think that the divine inspiration of an esoteric cult that deifies gastropods would supply sufficient lyrical fodder to last Slugdge until the end of time (or until great Mollusca’s return, whichever comes first), but it’s 2018, baby! A manifold, primordial demiurge sates his supplicants’ hunger for both carnal pleasure and hermetic wisdom, should their faith be true. You too, penitent slaves of the pod, can unlock the secrets of black geometry and stochastic processes by cracking open Mollusca’s tome of pseudopod statistics! And perhaps, in the process, you may find some wisdom of true value.

Friends, let me level with you for a minute. I need to clear the air about something. Despite all that you have been led to believe, I am not, in reality, former president George W. Bush. Nor am I even a professional blogger (as if my poor attempts at humor on Sundays ever convinced you otherwise). No, I am, by trade, a traffic safety professional. It sounds glamorous, I know, but all it really means is that I’m a nerdy schmuck who took a lot of math and statistics classes in undergrad/grad school; now I spend much of my day poring over grimoires of crash data, casting spells in Microsoft Excel and SAS to augur what road conditions will produce undue risks for road users.

I’m sorry for lying all these years. Please, take a moment to compose yourself.

Back for more? Okay, prepare to open your mind to Mollusca and his great prophet, Dr. Ezra Hauer.

As you might expect, it’s rare for me to encounter much crossover between my professional interests and my extracurricular activities; true, metal at times canvasses the grislier outcomes of motor vehicle collisions, but in all my years of service to dark (numerical) arts, never once have I encountered a band touching upon something that is foundational, intrinsic to my work.

Until now.

Late last week while returning home from a meetup in Durham, my ears were especially in tune with Slugdge’s track “Crop Killer” off their latest, greatest release, Esoteric Malacology. I was vibing to the riffs while winding along a ribbon of dark highway bisecting the dense overgrowth, when, like a bolt of lightning ripping through my thickly geek-encrusted consciousness, Matt Moss belligerently bellowed a particular, unique set of words I’d never yet heard uttered together in all of metal’s canon.

Regression to the mean, repression of the weak
fraudulent pyramid scheme of human hierarchy

HOLY SLIME MOLD! REGRESSION TO THE MEAN!

Folks, I struggle to put in words just how exciting it is for a traffic nerd to see the words “regression to the mean” in metal lyrics. This statistical principle is the bedrock of traffic safety studies, the cornerstone upon which Ezra Hauer and Susan Herbel and Mohamed Abdel-Aty built their church of safety analysis. If you were a CPA and heard a band singing about Present Net Worth, your excitement would still be less palpable than my own. REGRESSION TO THE MEAN!

Sorry, I just need a moment to compose myself.

__________

If you haven’t yet received the mark of the mollusk and lack a background in statistical studies, afford me a moment to bring you up to speed. Regression to the mean, broadly, is the tendency of randomly varying data to drift back toward an average value over time, especially after periods of extreme values. The axiom holds that, generally speaking, if some value you’re measuring seems unusually large or unusually small, part of that extremity is likely due to chance, and subsequent measurements will tend to fall closer to the mean value.

In his landmark monograph on traffic safety prediction, Observational Before-After Studies in Road Safety: Estimating the Effect of Highway and Traffic Engineering Measures on Road Safety, Dr. Ezra Hauer, Professor Emeritus of Civil Engineering at University of Toronto and grandfather to the traffic safety discipline, traced the origins of the regression to the mean phenomenon. In 1877, Sir Francis Galton, meteorologist, biologist, and statistician of some repute, studied a tendency for abnormally tall parents to sire shorter, on average, children. In essence, the heights of children actually regressed to some average. Regression to the mean can be observed in myriad other fields and data pools, from golf scores to batting averages to gambling wins.

In terms of traffic safety, the number of crashes observed on particular entities – i.e. intersections or roadway segments – tends to regress to some average that is a function of all manner of factors: the driving population’s general behaviors, traffic and pedestrian volumes at the entity, prevailing travel speeds, roadway division, lane width, land development properties, number of adjacent crosswalks, shoulder width, pavement type, etc. In practice, the phenomenon can be plotted as shown below.

Hopefully the figure above raises a number of questions, first and foremost of which is likely, “What’s the big deal?” Or, put another way, “How does regression to the mean affect crash predictions?” As you can surmise from the figure above, not accounting for regression to the mean can doom your traffic safety intervention to poor application. Say, for example, that you identify a site in your jurisdiction that had a really bad year last year (the before period); our natural instinct may be to look at this single high crash measurement and assume that that entity is dangerous and needs to be treated immediately. You may then go on to spend millions of dollars to install traffic calming devices, HAWK signals, a roundabout, what-have-you. The next year (the after period), you measure the number of crashes at this location and find that the number of crashes dropped significantly. Great job! You solved the traffic safety problem!

Or did you? How many years of data did you collect? Can you say for certain that the number of crashes you observe in the after period would not have been the natural result anyway, regardless of your intervention? After all, crashes are stochastic, random events. Is it possible that there was another site that perhaps was actually at more risk but went untreated because you did not account for regression to the mean? Could it be that a particularly dangerous intersection had an unexpectedly low crash frequency in the before period but would return to a high number of fatalities in the after period?
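For the skeptics in the pod, here is a minimal simulation sketch of that trap. It assumes, purely for illustration, 1,000 identical sites that each average 5 crashes per year and receive no treatment whatsoever; the function names, site counts, and crash rate are all hypothetical and not drawn from any real data set. It picks the 50 worst sites from the before period and checks how those same sites fare in the after period.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson-distributed count using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_untreated_sites(n_sites=1000, true_mean=5.0, top_n=50, seed=42):
    """Return (before_avg, after_avg) for the sites that looked worst before.

    Every site has the same true crash rate and nothing is ever treated,
    so any before-to-after change is pure chance.
    """
    rng = random.Random(seed)
    before = [poisson_draw(rng, true_mean) for _ in range(n_sites)]
    after = [poisson_draw(rng, true_mean) for _ in range(n_sites)]

    # Rank sites by their before-period count and keep the worst offenders.
    worst = sorted(range(n_sites), key=lambda i: before[i], reverse=True)[:top_n]

    before_avg = sum(before[i] for i in worst) / top_n
    after_avg = sum(after[i] for i in worst) / top_n
    return before_avg, after_avg


if __name__ == "__main__":
    before_avg, after_avg = simulate_untreated_sites()
    print(f"Worst sites, before period: {before_avg:.2f} crashes per site")
    print(f"Same sites, after period:   {after_avg:.2f} crashes per site")
```

Because the worst sites were chosen for their bad year, their after-period average should land back near the true rate of 5 even though nobody touched them. A naive before-after comparison would credit that drop to whatever treatment happened to be installed.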

If this seems nitpicky because, after all, aren’t you still improving safety, consider this recent example from New York City. At a workshop earlier this year, New York DOT reported major gains in pedestrian safety after installation of protected bike lanes and other measures at a number of sites; note that the conclusions highlighted in the image below are based on a relatively small sample of data, one that simply compares before and after results. Setting aside the fact that cyclist crashes actually increased, this simple comparison gives us no reason to actually believe that safety improved. If the before period measurements were merely capturing an exceptionally high period of crashes, NYDOT may have, in the best case scenario, merely aided the natural regression to the mean of safety at these locations, and in the worst case scenario, actually spent taxpayer money to make safety worse for cyclists (and the motorists in the crowd rejoice!).

Source: StreetsBlog

I don’t include this illustration to shame NYDOT; the city is, in fact, one of the current leaders in traffic safety in the United States (due to their application of the Vision Zero principles), and their long-term trend of fatalities is in decline while the rest of the country shows, sadly, the opposite trend in recent years. I reference them here to illustrate that if we are to do anything to save the nearly 40,000 lives lost in the United States each year to road traffic crashes, we need better science and better investments, especially given what little leadership and direction the current Administration has provided for infrastructure and transportation.

All of this should raise the question: if regression to the mean is so prevalent and creates such a dangerous bias in our measurements and evaluations, what do we do about it? The current state of the practice for predicting crashes in a way that circumvents the regression to the mean problem is the Empirical Bayes method. This method is threefold:

  1. Collect multiple years of data – A good rule of thumb is to have about 5 to 10 years of data in the before period; ideally, the after period should match.
  2. Use a statistical distribution that accounts for randomness – General linear regression, or the art of predicting data points based on a straight “best-fit” line, assumes a normal (i.e. bell-shaped) distribution to the data. The assumption is that most of the data will fall under the bell-shape of values (with the average value in the center) and within one or two standard deviations. Because crashes are rare, random events that result from a convergence of risky and unlikely circumstances, assuming a normal distribution does not work. Instead, current practice is to perform a negative binomial regression (essentially a curved, logarithmic line of fit), which builds on the Poisson distribution for random counts while allowing for the extra variability seen in crash data.
  3. Weight predicted crashes against observed crashes – This step is perhaps the most important because it balances the negative binomial prediction for a site against that site’s observed history. That is to say, if our negative binomial line tells us that next year we’ll have 10 fatal crashes at an intersection, but we actually had 16 the previous year, we might realistically expect 12 crashes at this site next year. Not only does that expected value give us a much more accurate prediction that isn’t overly biased by extreme values, it also gives us an idea of how many crashes we may be able to eliminate with a particular treatment. Money well spent.
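
The weighting in step 3 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a calibrated model: it uses the commonly cited Empirical Bayes weight, w = 1 / (1 + k × predicted), where k is the overdispersion parameter from the negative binomial fit. The function names are hypothetical, and the value k = 0.05 is chosen only because it reproduces the 10, 16, and 12 from the example above.

```python
def eb_weight(predicted, overdispersion):
    """Weight placed on the model prediction (between 0 and 1).

    A larger overdispersion parameter means the model is less trusted,
    so more weight shifts to the site's own observed history.
    """
    return 1.0 / (1.0 + overdispersion * predicted)


def eb_expected(predicted, observed, overdispersion):
    """Blend the model prediction with the observed count."""
    w = eb_weight(predicted, overdispersion)
    return w * predicted + (1.0 - w) * observed


if __name__ == "__main__":
    predicted = 10.0   # crashes predicted by the negative binomial model
    observed = 16.0    # crashes actually recorded at the site
    k = 0.05           # assumed overdispersion parameter (illustrative only)

    w = eb_weight(predicted, k)
    expected = eb_expected(predicted, observed, k)
    print(f"weight on prediction: {w:.3f}")
    print(f"expected crashes:     {expected:.1f}")
```

With these assumed inputs the prediction gets two-thirds of the weight and the observed count one-third, which is how 10 and 16 blend to 12. A real analysis would take k from the fitted model rather than picking it by hand.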

Of course, this particular application of regression to the mean is almost certainly not what Matt Moss had in mind for the lyrics of “Crop Killer.” No, a more careful reading seems to elucidate a commentary on the social condition. The mean in view here is not an average of traffic crashes but a natural state of social function and order, one that reactionaries and fascists seem to misunderstand.

There are no alpha wolves
Only those struggling to survive the state of nature
Domestic animals lost without the trappings of their civilization
You like to play armchair eugenicist, militant pessimist
An ivory tower that’s built upon
the product of a million labors

Regression to the mean, repression of the weak
fraudulent pyramid scheme of human hierarchy

Do you ever wonder why you’ve been deprived
the wealth of nations?
do you feel the trickle down your leg?
Preoccupied with war against your own and your own home
How is it you’re not already dead?

The regression, then, seems to be a crumbling of the all-powerful reich dreamed up in the deranged mind of the totalitarian. The regression is the downward trend to ruin of those who deem themselves superior to their fellow man. The regression is justice and order in motion.

Last week, R. M. Temin opined in his brilliant column on the reactionary nature of metal that one of the four key changes we need to see in the metal community is a shift back to lyrics that matter.

…there’s no reason lyrics in extreme metal can’t be a means of conveying an affirmatively anti-establishmentarian, anti-capitalist message that can appeal to those who are attracted to the subculture by its transgressiveness. Black Sabbath, as discussed earlier in this piece, had a great knack for using occult and Satanic imagery as metaphor to express the anxiety and discontent of the era. There’s no reason today’s bands can’t harness brutal imagery in the service of social commentary in the same way. But they must work harder at it than most have shown themselves willing to do.

Mr. Temin, Slugdge here are responding in kind. Not only do we see social commentary delivered in the esoteric fashion for which Slugdge has been made famous, but we see it nested in unique and interesting details rarely seen in extreme metal lyrics. As I stated at the beginning, this is the first time I’ve ever noticed something as intrinsic to my own profession as regression to the mean; the experience of hearing that phrase snarled out with spit and bile directed at those who would do harm to their fellow man certainly commanded my attention. Imagine if more lyricists branched out into unconventional fields and mined deeper subject matter than the same old swords’n’satan that has been the genre’s bread and butter for a millennium. Imagine if metal bands could find even more ingenious ways to appeal to the rich multicultural experience of the international brotherhood our genre pretends to foster. And imagine if those lyrics contained a message that was wholly relevant to the times.

Imagine if the lyrical mean our bands regressed to was one of import and one that spoke to the traffic engineer and the suburban teen alike.


Esoteric Malacology is out now via Willowtip. Pick it up here and follow the band on Facebook here. If you’d like to know more about the traffic safety profession, as good a place to start as any is here.
