On September 5, 2018, the New York Times published what might be, according to one commentator, "the most significant and consequential op-ed ever published." (1) The article, purportedly authored by a "senior official" within the Trump administration, contained a number of explosive assertions concerning the fitness of the President and claimed that the author, together with other officials in the administration, was "working diligently from within to frustrate parts of [Trump's] agenda and his worst inclinations." (2) Specifically, the article described Trump as "impetuous, adversarial, petty and ineffective," (3) charged that "his impulsiveness results in half-baked, ill-informed, and occasionally reckless decisions," (4) and stated that he "shows a preference for autocrats and dictators ... [with] little appreciation for the ties that bind us to allied, like-minded nations." (5) As a result of the "instability" (6) displayed by the president, the author justified taking actions to thwart the president's agenda, arguing that since "the president continues to act in a manner that is detrimental to the health of our republic ... our first duty is to this country ... to preserve our democratic institutions ... until he is out of office." (7)
In just a handful of days, the op-ed garnered more than 15,000 comments. (8) The commenters spanned the political spectrum and raised numerous serious questions about the article and the actions of the author and his or her cohorts. These ranged from demands that the author and like-minded officials resign if they felt so strongly that they could not serve the aims of the president, to charges that they had engaged in a "deep state" coup d'etat against an elected president, to pleas for them to provide Congress and the American people with evidence of President Trump's unfitness for office. (9) Indeed, President Trump himself took to Twitter to denounce the New York Times and the author, suggesting that one or both could be accused of treason, and urging Attorney General Jeff Sessions to order the Justice Department to investigate as a matter of national security. (10)
One question in particular seemed to command the public's attention: which of the officials in the Trump administration had written the article? Almost immediately, the speculation began. While some amateurs attempted to see if tell-tale indications of the author's identity could be gleaned from its wording, a major news organization approached a noted forensic authorship analyst to inquire whether he would be willing to analyze the text of the op-ed to determine its likely authorship. (11)
In a case such as this, even speculations can be dangerous. The consequences of a proposed attribution of authorship--from a professional source, in particular--could be both serious and extensive. Certainly, the identified author would likely be fired from his or her position in President Trump's administration. Sympathizers, suspected or proven, would also probably be dismissed. Depending on the extent of such a purge, the administration's functionality might be severely impacted. Further, public reaction to the identification and its aftermath could affect subsequent elections, or even completely derail the political careers of implicated individuals. With the stakes being so high, any authorship attribution would need to be correct, and demonstrably so. Only a scientifically validated analysis could suffice.
WHY KNOWING WHO WROTE A QUESTIONED TEXT MATTERS
This episode demonstrates that the question "Who wrote this questioned text?" can have profound repercussions. Authorship attribution analysis--the forensic practice of examining texts to develop evidence as to the identity of the producer of a text of questioned provenance--is significant in many fields of inquiry. Literary scholars want answers to questions such as: Did Shakespeare write all the plays and sonnets attributed to him? (12) Or, in a more modern context, did the Harry Potter author J.K. Rowling also write the crime novel, The Cuckoo's Calling, under the pen name Robert Galbraith? (13) Journalists want answers to questions like: Who is the inventor of Bitcoin? (14) Historians are interested in questions such as: Who wrote the Jack the Ripper letters in Victorian England? (15) And: Which portions of the Federalist Papers were written by James Madison, which by Alexander Hamilton, and which by someone else? (16)
That last question might also be of interest to lawyers and legal scholars. In fact, for many issues in both criminal and civil contexts, authorship attribution can play a key role in arriving at the correct legal conclusion to the case. Here are just a few examples:
* Abby receives several emails threatening her life. The police want to know who wrote them.
* Bernie's new will changes his beneficiaries significantly. After Bernie's death, his former beneficiaries contest the will. Did Bernie actually compose it, or did one of the new beneficiaries do so?
* Charlene is found dead of a drug overdose, with an apparent suicide note found on her computer. Did Charlene write the suicide note, or was it written by Dana, who had a motive to kill her? (17)
* Defamatory online posts allege financial improprieties by Dean England. The dean suspects that the posts were written by ex-Professor Francis, who was denied tenure. Was the professor the author of the posts?
* Trade secrets of BizCo are revealed online. Those secrets are known to several people, including Gregory, who had signed a confidentiality agreement regarding trade secrets of BizCo. If Gregory exposed the trade secrets, BizCo intends to sue him for breach of that agreement. Who among the group of trade secret possessors actually revealed the secrets? (18)
* Henry and Isabel both submitted the same paper to a professor. Each one claims to have written the paper. Who did? (19)
* Jack signed a confession to a serious crime. He claims that the police added the incriminating parts of the "confession" to a non-incriminating part that he did write. Did Jack write the incriminating part of the confession? (20)
* Kerry is kidnapped and a ransom note is sent to Kerry's wealthy parents. The ransom is paid, and Kerry is released unharmed. The police suspect that the kidnapping was staged and that Kerry or Kerry's partner wrote the ransom note. Did either of them write it? (21)
* Larry was fired after having sent racially offensive emails to his supervisor. He argued that several other employees had access to his computer and could have written the offensive emails. Did Larry actually write them? (22)
* A series of juvenile novels was authored first by Mack, and later by Nora, after Mack's death. However, it is disputed which of the two authored a novel written while Mack was ill. Which of the authors is due the royalties to the questioned novel? (23)
* Oliver, an asylum seeker, asserts that his life would be in danger if he were returned to his native country because of published articles critical of the government that he wrote under a pseudonym. If Oliver did write the articles, and that fact became known to the regime, he should be entitled to asylum in this country due to a well-founded fear of political persecution. So, did Oliver write the articles? (24)
* Police found a manifesto threatening a terrorist attack. The police want to identify the author to prevent the planned attack. Who wrote it? (25)
* Paul was a college chum of Quentin. He claims that he and Quentin jointly developed the idea behind a company that Quentin later started, and that Quentin sent him emails promising him a 50% share in the company in recognition of his contribution. Quentin denies both writing the emails and making the promise. Did Quentin write the promissory emails? (26)
These pseudonymous and sometimes-hypothetical cases illustrate just some of the many legal contexts in which authorship attribution questions arise. Forensic examination of the characteristics of the questioned texts can help shed light on such issues.
The forensic analysis of a text in order to determine its authorship has, for decades, been accomplished by forensic linguists. These experts are trained in the science of linguistics, and have often applied their specialized expertise in sociolinguistics, discourse analysis, and pragmatics. Although the range of applications for forensic linguistics is extremely wide, (27) this Article will focus only on the authorship attribution application. Herein, we maintain that reliability is an essential component of pattern comparison forensic practices, that testing for validity and measuring of error rates are the Daubert factors (28) that best guarantee reliability, and that the practices of authorship attribution can be an effective model of how to carry out this testing.
AUTHORSHIP ATTRIBUTION ANALYSIS AS A PATTERN-BASED FORENSIC SCIENCE: FOUNDATIONS FROM THE SCIENCE OF LINGUISTICS
All pattern comparison forensic practices begin with a premise: namely, that a creator's characteristics are reflected in his or her creation, such that patterns displayed in the creation can provide evidence regarding the identity of its maker. Consider, for example, forensic odontology, which begins with the assumption that the properties of one's teeth will be reflected in any of his or her resulting bitemarks. (29) Similarly, forensic authorship attribution begins with the linguistics-based premise that language users have individual preferences and habits that determine their use of language.
Just as a community's distinctive use of language can be said to constitute a dialect, an individual's distinctive use of language is said to be his or her idiolect. (30) According to Malcolm Coulthard--one of the most prominent forensic linguists (31)--a person's idiolect "will manifest itself in distinctive and cumulatively unique rule-governed choices for encoding meaning linguistically in written and...