Mathematics, Big Data, and Joss Whedon

Definition 1: The symmetric difference of two sets A and B, denoted A \Delta B, is the set of elements that belong to either A or B, but not to their intersection.

Let A be “Mathematics”, and let B be “Data Science”. This is certainly not the first article vying for attention with the latter buzzword, so I’ll go ahead and insert a few more here to help boost traffic and readership:

Analytics, Machine Learning, Algorithm,

Neural Networks, Bayesian, Big Data

These formerly technical words (except that last one) used to live solidly in the dingy faculty lounge of set A. They have since been distorted into vague corporate buzzwords, shunning their well-defined mathematical roots for the sexier company of “synergy”, “leverage”, and “swim lanes” at refined business luncheons. All of the above words have allowed themselves to become elements of the nebulous set B: “Data Science”. As the entire corporate and academic world scrambles to rebrand itself as a member of Big Data™, allow me to pause the chaos in order to reclaim set A.

This isn’t to say that set B is without its merits. Data Science is Joss Whedon, making the uncool comic books so hip that Target sells T-shirts now. The advent of powerful computational resources and a worldwide saturation of data have sparked a mathematical revival of sorts. (It is actually possible for university mathematics departments to receive funding now.) Data Science has inspired the development of methods for quantifying every aspect of life and business, many of which were forged in mathematical crucibles. Data science has built bridges between research disciplines, and sparked some taste for a subject that was previously about as appetizing to most as dry Thanksgiving turkey without gravy. Data science has driven billions of dollars in sales across every industry, customized our lives to our particular tastes, and advanced medical technology, to name a few.

Moreover, the techniques employed by data scientists have mathematical roots. Good data scientists have some mathematical background, and my buzzwords above are certainly in both sets. Clearly, A \cap B is nonempty; that is, the two sets are not disjoint. However, the symmetric difference between the two sets is large. Symbolically, |A \Delta B| \gg |A \cap B|. To avoid repeating the plethora of articles about Data Science, our focus will be on the elements of mathematics that data science lacks.
In mathematical symbols, we investigate the set A \ B.
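
For readers who prefer code to symbols, here is a minimal Python sketch of Definition 1 and of the set A \ B. The members of each set are invented purely for illustration; they are not a claim about what actually belongs to “Mathematics” or “Data Science”.

```python
# Hypothetical members, chosen only to illustrate the set operations.
A = {"proof", "rigor", "Bayesian", "algorithm"}          # "Mathematics"
B = {"Bayesian", "algorithm", "dashboard", "synergy"}    # "Data Science"

intersection = A & B      # elements in both sets (A \cap B)
sym_diff = A ^ B          # symmetric difference (A \Delta B)
only_math = A - B         # set difference (A \ B)

# Definition 1: the symmetric difference is the union minus the intersection.
assert sym_diff == (A | B) - intersection

print(sorted(only_math))
```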

Mathematics is simplification. Mathematicians seek to strip a problem bare. Just as every building has a foundation and a frame, every “applied” problem has a premise and a structure. Abstracting the problem into a mathematical realm reveals which parts of it are mere facade, however necessary they once seemed. An architect can design an entire subdivision with one floor plan, and introduce variation in cosmetic features to produce a hundred seemingly different homes. Mathematicians reverse this process, ignoring the unnecessary variation in building materials to find the underlying structure of the houses. A mathematician can solve several business problems with one good model by studying the anatomy of the problems.

Mathematics is rigor. My real analysis professor in graduate school told us that a mathematician’s job is two-fold: to break things and to build unbreakable things. We work in proofs, not judgment. Many of the data science algorithms and statistical tests that get name-dropped at parties today are actually quite rigorous, if the assumptions are met. It is disingenuous to scorn statistics as merely a tool for lying; one doesn’t blame the screwdriver that is being misused as a hammer. Mathematicians focus on these assumptions. A longer list of assumptions prior to a statement indicates a weaker statement; our goal is to strip assumptions one by one to see when the statement (or algorithm) breaks. Once we break it, we recraft it into a stronger statement with fewer assumptions, giving it more power.

Mathematics is elegance. Ultimately, this statement is a linear combination of the previous two, but still provides an insightful contrast. Data science has become a tool crib of “black box” algorithms that one employs in one’s language of choice. Many of these models have become uninterpretable blobs that churn out an answer (even a good one, by many measures of performance; pick your favorite measure: p-values, Euclidean distance, prediction error). They solve the specific problem given wonderfully, molding themselves to the given data like a good pair of spandex leggings. However, they provide no structure, no insight beyond that particular type of data. Understanding the problem takes a back seat to predictions, because predictions make money, especially before the end of the quarter. Vision is long-term and expensive. This type of thinking is short-sighted; with some investment, that singular dataset may reveal a structure that is isomorphic to another problem in an unrelated department, and even one that may be exceedingly simple in nature. In this case, mathematics can provide an interpretable, elegant solution that solves multiple problems, provides insight into behavior, and still retains predictive power.

As an example, let us examine the saturated research field of disk failures. There is certainly no shortage of papers that develop complex algorithms for disk failure prediction; typically the best-performing ones are ensemble methods of some kind. Certain errors are good predictors of disk failure, for instance, medium errors and reallocated sectors. These error counts evolve randomly, but only ever increase. A Markov chain fits this behavior perfectly, and we have developed a method to model these errors. Parameter estimation is a challenge, but the idea is simple, elegant, and interpretable. Because the mathematics is so versatile, with just one transition matrix a user can answer almost any question without needing to rerun the model. This approach allows for both predictive analytics and behavior monitoring, is quick to implement, and is analytically (in the mathematical sense) sound. The only estimation needed is in the parameters, not in the model structure itself. Good parameter estimation all but guarantees good performance.
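
To make the “one transition matrix, many questions” idea concrete, here is a minimal Python sketch. It is not the model described above: the four states (three error-count levels plus an absorbing “failed” state), the daily time step, and every probability in the matrix are assumptions invented for illustration. The matrix is upper triangular because, under these assumptions, error levels never decrease.

```python
# Hypothetical 4-state chain. States 0, 1, 2 are increasing error-count
# levels; state 3 is "failed" and is absorbing. One step = one day.
# All probabilities are made up for illustration, not estimated from data.
P = [
    [0.90, 0.08, 0.02, 0.00],
    [0.00, 0.85, 0.10, 0.05],
    [0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 1.00],
]
FAILED = 3


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(p, steps):
    """Return the steps-step transition matrix (p raised to `steps`)."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = mat_mul(result, p)
    return result


def state_distribution(p, start, steps):
    """Distribution over states after `steps` steps, starting in `start`."""
    return mat_pow(p, steps)[start]


def prob_failed_within(p, start, steps):
    """Probability the disk has failed by `steps` steps.

    Because the failed state is absorbing, being in it at time `steps`
    is the same event as having failed at some point up to then.
    """
    return state_distribution(p, start, steps)[FAILED]


if __name__ == "__main__":
    # Two different questions answered from the same matrix.
    print(prob_failed_within(P, start=0, steps=30))
    print(state_distribution(P, start=1, steps=7))
```

Both questions are answered by reading entries of a power of the same matrix, so changing the question (a different horizon, a different starting state) does not require refitting anything; only the parameters in P would need to be estimated from real data.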

There is room for both data scientists and mathematicians; the relationship between a data scientist and a mathematician is a symbiotic one. Practicality forces a quick or canned solution at times, and sometimes the time investment needed to “reinvent the wheel” when we have (almost) infinite storage and processing power at hand is not good business. Both data science and mathematics require extensive study to be effective; one course on Coursera does not make one a data scientist, just as calculus knowledge does not make one a mathematician. But ultimately, mathematics is the foundation of all science; we shouldn’t forget to build that foundation in the quest to be industry Big Data™ leaders.


~Rachel Traylor @mathpocalypse


Innovation in “A Big Company” – Part 1

I know what you’re thinking… innovation in a big company is an oxymoron, can’t be done. And you’re probably justified in thinking so. But it can be done. It has been done. And it will be done again.

What enables some products to innovate decades into their existence, while others never get past their first innovative idea? First, those product teams understand that the type of innovation needed varies over time and so they plan accordingly and measure themselves appropriately. Second, they take their successes and failures and learn from them. Finally, they look for common patterns so they can replicate their success across the entire organization.

But it starts with understanding the lifecycle of innovation.

The Innovation Lifecycle

[Figure: Innovation graphic]

The definition of innovation is “something new or different.” Most commonly, innovation is thought of as a whole new something – a new startup, a new product line, a new product class or even a net new market. Let’s call this Revolutionary Innovation. This is a well-understood model, so I will discuss it only briefly as it pertains to a big company. What I want to highlight, though, is that big companies with successful multi-billion dollar product lines offer a lot of opportunity for a whole other kind of innovation that I’ll call Evolutionary Innovation – this is innovation that happens within the confines of an existing product to keep it fresh and competitive. Such innovation can be just as rewarding as the revolutionary kind – indeed, for me, the challenge of creating something new out of existing parts and with all sorts of constraining factors is often far more rewarding than clean-slate innovation.

The innovation lifecycle of a product is depicted in the diagram. Every reasonably successful product starts life with some revolutionary innovation, either tapping into a new market or disrupting an existing market. There is always some big-bang innovation that propels it into orbit, shown as the steep blue ramp. From an innovation perspective, this is the best period of the product’s life – the most innovation happening in the shortest period of time. The main reason is that you’re operating with a clean slate, with nothing to slow the pace of innovation implementation. Once the product is established in the market, though, a major and permanent change happens – the rate of innovation slows down because innovation must be balanced with other priorities: investment in sales & marketing, closing feature gaps, servicing a rapidly growing customer base, and improving product quality in response to limitations in the early product versions exposed by hyper-growth. This is also often the time when the visionaries who drove the initial innovation depart, taking with them not just the depth of product knowledge but, more importantly, the engine of innovation. These are big headwinds that can easily stall innovation to a point where the product stagnates and becomes vulnerable. The risk is not just from the next big revolutionary innovation but just as much from other similar products that leapfrog it with evolutionary innovations. If the innovation is not revived in time, the product eventually dies (red line in the picture).

Thus, for a product to be truly successful, i.e., long-lived, it must transition from the Revolutionary innovation phase to this new phase of Evolutionary innovation, shown by the green line. In this phase, innovation occurs in waves as the technology and market landscape evolves. However, the pace of innovation tends to slow down over time as the product gets “heavier” – it takes longer to get a new idea into the product for all the reasons I mentioned before. Driving innovation into the product in such circumstances requires all kinds of skill (and immense energy) beyond just having a cool idea. You need to find a way to implement that idea that makes it easy to incorporate into the product.

Not too long ago, I was hired by the Advanced Development group of a previous employer to find a solution to the threat from a hyper-growth startup that was invading our space. I had built a similar product in the past, so I knew what it should look like architecturally. The question was, what components should I use to build it? Everybody in the group advised me – urged me – not to use the stuff that the mainstream product was based on, and instead to do it using Linux components because we could do it faster that way. I not only decided to use the existing product components, but also got the mainstream product group to loan us four engineers to work on our team. We got a full-function prototype done in less than six months and the project was a huge hit. The engineers went back to the product group with the code, and the team assigned to this new feature eventually grew to 120 people.

Why does this model matter? First, if we’re going to explore whether a big company can innovate, we need to understand that there are different kinds of innovation and be clear about which one applies when. Second, we need to understand the challenges companies face in driving this innovation and what it takes to overcome them.

Over the next few posts, we will look at some EMC case studies of evolutionary innovation. We’ll talk about why these succeeded – the techniques to overcome the headwinds and deliver innovation of both kinds in a big company.

-Sudhir Srinivasan @DoctorSudhir