Friday, June 14, 2013

Is "data" singular or plural? Yes

At LinkedIn, the group STET: Professional Copy Editors has been considering "the data are" and "the data is." Here is the post that presented the issue: "I run into this a lot at work. Our guide calls for 'data are,' but I think it sounds awkward. I understand that 'data' is the plural of the singular Latin 'datum,' but I would argue that we use it as a collective noun, which often take singular verbs. Thoughts (and reasons) for or against?"

What follows in the comments does not quite fill one with confidence about the professionalism of copy editors. One editor consulted friends and family; one recalled a pronouncement from a journalism professor four decades previously. Most expressed some personal preference. (You will have to sign up for LinkedIn to read them.)

But at least some editors thought to consult dictionaries. The Oxford English Dictionary, for example, has a citation for data as a mass noun taking a singular verb from an 1826 number of the Edinburgh New Philosophical Journal: "Inconsistent data sometimes produces a correct result." The singular sense in computing dates from 1946. 

Merriam-Webster's Unabridged Dictionary calls data "plural in form but singular or plural in construction" and appends this concise note on usage:

"Data leads a life of its own quite independent of datum, of which it was originally the plural. It occurs in two constructions: as a plural noun (like earnings), taking a plural verb and plural modifiers (such as these, many, and a few) but not cardinal numbers, and serving as a referent for plural pronouns (such as they and them); and as an abstract mass noun (like information), taking a singular verb and singular modifiers (such as this, much, and little), and being referred to by a singular pronoun (it). Both constructions are standard. The plural construction is more common in print, evidently because the house style of several publishers mandates it."

Merriam-Webster's Dictionary of English Usage has a long and interesting entry on the career of this Latin word in English, summing up: "Data has never been the plural of a count noun in English. It is used in two constructions--plural, with plural apparatus, and singular, as a mass noun, with singular apparatus. Both constructions are fully standard at any level of formality."

The current edition of the American Heritage Dictionary finds that "singular data has become a standard usage."

Garner's Modern American Usage calls data a "skunked term," a damned-if-you-do, damned-if-you-don't word. Though he prefers using it as a plural, he ruefully recognizes that the singular sense has gained traction and is approaching "fully accepted" status.

So anyone seriously questioning whether data is singular or plural has simply not done the homework. 

That leaves only the question of whether to use it as a singular or a plural in context. 

Some editors, I gather from the LinkedIn responses, are shackled to scientific or technical style guides so rigid as to make a hard-shelled acolyte of the Associated Press Stylebook gasp in envy. Thus data-ever-plural can be added to the long register of pig-headed and arbitrary strictures one encounters in the workplace. Submit under protest.

Then there are the individual preferences, and several responders to the LinkedIn post inform us whether data as a plural or singular sounds good to them. Individual tastes and preferences do have a place in writing; if you dislike one of those senses, don't use it in your own writing. But unless evidence is brought to bear, your individual preference for data as a singular or plural is of no more help to me than your preference for green or red chile.

Data, the evidence plainly shows us, is in common use as a singular or plural noun. If the sense of data is "facts," then a plural verb is called for. If the sense of data is "information" or "evidence," then a singular verb is appropriate. 

And there, as Dr. Johnson would have said, is an end on't. 


  1. I think most people simply don't know how to research usage questions. People are so used to looking up the answer in a book or being told what it is that they don't realize that you can use empirical methods to investigate usage.

  2. I'm intrigued by the OED example: "Inconsistent data sometimes produces a correct result." Inconsistency arises between two or more things/measurements/values, so I would prefer the plural verb. But what has logic got to do with language?!

    IMHO there is also a call for a plural verb when discussing meta-data. "The data collected were: age, sex, height and weight." In this case the sense of "data" goes beyond "facts".

  3. The trouble with the evidence presented in dictionaries is that it's circular: copy editors consult dictionaries, which primarily record the decisions of earlier copy editors. Their main advantage over usage guides is that they don't have an ax to grind, as the latter generally do (MW(C)DEU excepted). These circles aren't necessarily vicious, notably in the case of orthography, where it is much more important to have a fixed form than what the form is. But in more subtle matters, the only use of dictionary-based evidence is to discredit false rules, not to establish true ones. That can only be done by individual taste and judgment as applied to individual writings — which is why they pay you the big bucks.