The word of which ‘data’ is purportedly the plural has simply disappeared; this means two things. Firstly, passively, it creates a linguistic space into which ‘data’ can drop – there is no ambiguity in using ‘data’ in a singular sense. Secondly, and more importantly, if ‘datum’ has effectively disappeared, it tells us that ‘data’ cannot be simply its plural; unanchored, it has moved away from this derived meaning to a distinct and independent meaning of its own. It has accordingly accreted usage rules of its own, unencumbered by any Latin past.

‘Data’ no longer means just one (damn) datum after another. Twentieth-century ‘data’ refers to a mass of raw information, which we measure rather than count, and this is as true now as it was when the word made its 1646 debut. This universal perception of data as measured rather than counted puts the word firmly and unambiguously in the same grammatical category as ‘coal’, ‘wheat’ and ‘ore’, which is that of the mass, or aggregate, noun. As such, it is always and unavoidably grammatically singular. We would never ask ‘how many wheat do you have?’ or say that ‘the ore are in the train’ if we wished to be thought competent speakers of English; in the same way, and to the same extent, we may not ask ‘how many data do you have?’ or say ‘the data are in the file’ without committing a grammatical error.

Is any clarity lost by treating the word “data” as a collective singular, other than to native speakers of Latin? (Of which there are none.)

There is no benefit to ensuring the singular/plural agreement of word roots that couldn’t be accomplished more simply by feeding Bill Safire a couple of Xanax every morning. Grammatical prescriptivism, untethered from a desire for clarity, is merely a mechanism to strictly maintain class distinctions.

