In this blog post I’m going to give some perspectives on my own experience in industrial research: I’ve been in small research groups, large research labs, and something that is actually a bit hard to describe: an island of research in a sea of engineers.
I have been in the CTO office of a division of EMC since joining the company in 2009. While I started with the CTO office of Data Domain, that lasted just a few weeks before EMC acquired the company; over time, the division encompassing Data Domain and its CTO office has moved from focusing only on data protection to its current position as the Core Technologies Division (CTD) with most of the storage and data protection products offered by EMC.
Before that, I had worked in industrial research labs since obtaining my Ph.D. about 25 years ago.
I describe myself as being on an island of research in a sea of engineers because my role in EMC these days is just like it was in the “research labs” before it: a combination of coming up with novel technologies, evaluations, and other techniques that can help the business, and a strong presence in the “academic research” community and its conferences, journals, university sponsorships and collaborations, and so on. But most of the rest of the company … strike that, nearly everyone else in a technical role in the company … is more focused on the engineering side of things than worrying about papers.
Or at least they were. My job these days is also to help the rest of the engineers understand what’s involved with academic publishing and to help EMC have a larger impact in academia.
Why Companies Publish (Or Not)?
Let me take a step back and discuss corporate research and publications more generally. Historically, a number of large technology companies have had a sizable division devoted specifically to research. This would include developing new products and services as well as improving existing ones. Many employees in such an organization would come from academic research backgrounds and would participate in the “research community” by disseminating their own research results in papers at conferences and sometimes archival media such as journals. The benefits to these research organizations from publishing their work are various, including:
- Telling the world about their great ideas. This has many benefits, especially convincing others that if they work at that company, they too can work on great things. In addition, research work often builds on other work; publishing something really useful, especially if the software behind it is made available, can result in widespread adoption.
- Interactions with the academic community. Most conferences are heavily skewed toward attendees from universities, either professors or students. Participating in conferences is a way to meet academics and establish relationships that can lead to internships, full-time hiring, and collaborations. I’m sure that many of the interns I’ve mentored over the years became aware of me through my conference participation, and that many of them consider the opportunity to publish a key factor in deciding where to work. My colleagues and I have coauthored papers with most of the interns who have worked with us, sometimes numerous papers with a single student, and these collaborations have helped both the students and the company.
- Feedback. The process of submitting a paper to a conference results in considerable improvements to the paper and can result in improvements to the underlying system, by virtue of the reviews and the audience feedback when presented.
- Documenting results. Everyone is always busy, and documentation is not always a high priority. People may be more likely to document a system in an understandable fashion if the documentation is intended for a general audience, and the results from evaluating the system may be more understandable.
- Personal development. Companies recognize that some individuals value the experiences they gain from publications, including interactions at conferences, establishing reputations, and learning about other work. Some of these benefits accrue from attending conferences even without publishing, but some are unique to publications. Plus, once you publish a lot, participation on program committees is a way to gain additional experience and interactions.
But only a few companies have large, established research groups. In fact, sometimes the view of industrial research changes from one year to the next, as a result of other factors such as the state of the company or the economy as a whole.
A large company does not guarantee a large research organization (or any research organization at all). This can be for reasons such as geography: my first industrial position after obtaining my Ph.D. was with a newly formed US-based research lab for a large Japanese company. There were about 8-10 researchers there at the time. While it started with the same aspirations as the labs for the larger companies I mentioned above, a couple of years later the upper management declared that “publishing was not in the business interests” of the company. The lab director predicted correctly that the researchers would find this change of heart disturbing and choose to move to other companies or academia.
Why would some companies encourage dissemination of results while others find it contrary to their business interests? And where does EMC fit in that spectrum?
The answer to that question depends a bit on where in EMC you mean: the answer varies by organization, because there is no single central research division with the kind of mandate I mentioned above. Instead, each division has usually had an “advanced development” (AD) organization with the charter to do the sort of innovation and analysis I described earlier, though not necessarily with the expectation that its work would be shared externally. Since CTD spans so much of the company, its AD team is similar to that of a corporate-level organization, albeit small relative to EMC as a whole.
Outside CTD, publications just aren’t something the business thinks about, although there are some exceptions (e.g. corporate CTO and RSA Labs). For the rest of this post, I’ll focus on CTD, where publications are actively encouraged.
Driving Publications in a Product-Centric Division
As I said, my job of late has been to help others in the division publish their work in conferences and other similar venues. This means everything from identifying interesting work and encouraging people to work on publications all the way to working with others to write a paper. Few within the division are active “researchers” with numerous publications under their belts, but many engineers working on products are doing interesting things that would merit publication. I try to find them, if they don’t find me first, and encourage them to submit papers.
Before I continue, I need to note that it wasn’t always like this. When I joined EMC via its acquisition of Data Domain, publications were viewed by my management as useful when they provided collateral to actual products. While that is an excellent reason to publish, it tends to limit the opportunities. For instance, many consider the seminal paper on deduplicating backup systems to be the description of the Data Domain File System, published at the USENIX FAST conference in 2008. It described some features of the system, which was of great use to the academic research community, but it also helped potential buyers of the system appreciate that the technology was innovative.
The approach to publications changed when the CTO of the division changed. At the first face-to-face meeting of the small advanced development team of the backup division, Stephen Manley announced that we should try to publish more. The initial target was on the order of 3-5 papers, and it has expanded as the team has grown. It was now OK to publish interesting ideas that were not directly related to a product. Why? Because the value of publications for our group is similar to the value for the large research divisions that have emphasized publications all along. In fact there is an extra important aspect from our perspective: name recognition. I’ve attended conferences where I’ve met professors who don’t even know what EMC is, despite it being a large multinational company with an enormous presence in technologies such as storage, security, information management, and others.
In order to encourage publications, our division provides bonuses for submitting papers and for having them published, similar to bonuses for patent submissions and issuances. (The idea isn’t original to EMC; I know of at least one other company that has had publication bonuses, but it similarly had no central research organization.) Some might argue these incentives are not large enough to make the difference to someone deciding whether to work on a paper, and considering the extra effort involved (often above and beyond the responsibilities in someone’s “day job” if they are in an engineering role), this is true. But I can say from experience that it still feels good to get that extra incentive.
In addition, we have strengthened our internal processes (additional reviews to help get the best papers to the best venues). As the division’s “publications czar” I have worked with a number of authors to help guide them to identify appropriate venues, to understand the expectations of those conferences, and to improve their write-ups. I presented to the engineering community within the division to inform them about the desirability of publications, the processes they need to follow, and the resources and incentives available to them. Sometimes I have reached out to specific people I’ve heard might be good candidates to author a paper. In one case, someone who had been with the company for decades said “I thought we don’t do that!” My job is to be an evangelist and to convey our new goals … teach the old dogs new tricks in a manner of speaking.
The Early Returns
We are seeing an increase in the number of people interested in submitting papers, and it will be interesting to take stock of the success of these submissions in the coming months. We’re also broadening our scope – having previously focused on a small set of conferences such as FAST, papers from CTD are going to theory conferences, machine learning conferences, and others.
Finally, note that throughout this post, I’ve really focused on “publication strategy within a large division of a large technology company” – not the company as a whole. I hope that the policies we’ve enacted in CTD will carry over to the rest of the company, and that those groups not already sharing their successes with the academic community will get involved and achieve the same benefits we have. Regardless, publications even within one division bring benefits to EMC as a whole.
–Fred Douglis @FredDouglis