Inauthentic Paper Detector

Status: anti-counterfeit technology
Last year I posted about a group of MIT students who created an Automatic Scientific Paper Generator, capable of creating "random Computer Science research papers, including graphs, figures, and citations." One of the papers created by this program was accepted for presentation at the World Multi-Conference on Systemics, Cybernetics and Informatics. To stop something like this happening again, researchers at the Indiana University School of Informatics have invented an Inauthentic Paper Detector. It's supposed to be able to tell whether a paper has been written by a human or a machine. The researchers write: "The main purpose of this software is to detect whether a technical document conforms to the statistical standards of an expository text... We are trying to detect new, machine written texts that are simply generated not to have any meaning, yet appear to have meaning on the surface."

I tested the Inauthentic Paper Detector by having it analyze the last couple of entries I've written. It told me: "This text had been classified as INAUTHENTIC with a 38.4% chance of being authentic text." I guess this confirms the theory that the real Alex drowned in Loch Ness back in September 2004 and was replaced by replicant Alex. (via New Scientist)

Identity/Imposters Science

Posted on Thu Apr 27, 2006

Does your wife know?
Posted by Unfairly Balanced  in  Earth  on  Fri Apr 28, 2006  at  01:11 AM
Just tried it on an essay I wrote. 83% Authentic.

Huh. I guess the thousand monkeys working on a thousand typewriters really does produce good work. Shame about the knife fights.
Posted by Soldant  on  Fri Apr 28, 2006  at  04:51 AM
Huh. I just tested it on some writing I did about monuments around the city.

A big, fat, INAUTHENTIC.
Posted by Boo  in  The Land of the Haggii...  on  Fri Apr 28, 2006  at  05:53 AM
I just ran Henry V's St. Crispin's Day speech from Shakespeare's Henry V. The debate over authorship is now over, and Shakespeare is not the winner, since the text has only a 25.3% chance of being authentic. Perhaps this means that someone had a very powerful 16th century computer write the play? gulp
Posted by noelcoward  on  Fri Apr 28, 2006  at  07:17 AM
I daresay it doesn't work; however, none of these tests are fair, since the programme is supposed to test technical documents... whatever technical means.
Posted by outeast  on  Fri Apr 28, 2006  at  07:59 AM

Well, I've only one scientific paper to hand - a Lancet review of the Toxoplasmosis literature. Since that's a rather small sample, I added 5 abstracts ripped off PubMed:

Toxoplasmosis (seminar)

with a 38.2% chance of being authentic text

Cell aggregation of Pseudomonas aeruginosa strain PAO1 as an energy-dependent stress response during growth with sodium dodecyl sulfate.

with a 59.6% chance of being an authentic paper

Cyanide detoxification by the cobalamin precursor cobinamide

with a 33.5% chance of being authentic text

In vitro evaluation of stent patency and in-stent stenoses in 10 metallic stents using MR angiography.

with a 29.2% chance of being authentic text

Long-Term Clinical Outcome in Patients With Congenital Chloride Diarrhea

with a 12.6% chance of being authentic text

Addition of Carbenes to the Sidewalls of Single-Walled Carbon Nanotubes

with a 27.9% chance of being authentic text

OH MY GOD!!! It's SO CLEAR that there is a HORRIBLE scandal here - almost all published medical papers are inauthentic!!! Someone should be told!!!

Oh, I ran 6 real fake papers, too - all were rated inauthentic. But with 5/6 false positives, who cares?
Posted by outeast  on  Fri Apr 28, 2006  at  08:24 AM
I ran the above entry. Guess what...

This text had been classified as INAUTHENTIC
with a 12.9% chance of being authentic text
Posted by melissa  in  arkansas  on  Fri Apr 28, 2006  at  09:28 AM
>>Does your wife know?

LOL. Maybe my wife was in on it. Like a Stepford Husband kind of thing.
Posted by The Curator  in  San Diego  on  Fri Apr 28, 2006  at  09:48 AM
LOL I just ran some of my old essays from high school and TAFE through...apparently my personal philosophy paper has a 16.6% chance of being authentic.

Hmm...the most authentic-looking paper is on my strengths and weaknesses...odd...
Posted by Smerk  in  to mischief  on  Fri Apr 28, 2006  at  10:31 PM
Hmm, it doesn't seem to be all that accurate, does it? It labeled as inauthentic a number of papers I wrote on such subjects as epizoic cyanobacteria (41.4% chance of being authentic), Neo-Nazism (19.6% chance), recommendations on lab safety procedures (22.7% chance), a post-accident LPG explosion assessment (30.0% chance), a derivation from Kepler's 3rd Law of Orbital Motion (41.4% chance), an evaluation of Scientology practices (38.4% chance), an assessment of nuclear capabilities (39.8% chance), and an examination of the Druse (22.7% chance), as well as two short stories (25.7% and 30.0%).

The only one it said was authentic was a comparison of the first four books of the New Testament, with a 91.5% chance of being an authentic paper.

It did correctly identify one "paper" that I submitted that was total gibberish as being only 13.3% authentic. I suspect that it will correctly identify most inauthentic papers simply due to the fact that it considers nearly anything to be inauthentic.
Posted by Accipiter  on  Sat Apr 29, 2006  at  04:11 AM
Haha, it really works... I've copy/pasted text from the United States Declaration of Independence from the first paragraph through "He has Dissolved Representative Houses".

Result: This text had been classified as INAUTHENTIC
with a 15.2% chance of being authentic text

I say the US has a problem... the Declaration of Independence is actually the creation of high level groups of interest that rule from the shadows.

Bulls*it wink

Posted by DukeLeto  in  Bucharest  on  Sat Apr 29, 2006  at  07:28 PM
the only thing I can find that is "authentic" is the "Publication SIAM Conference on Data Mining (2006)" on their own website, ugh.
Posted by Tom  on  Sun Apr 30, 2006  at  01:01 PM
I thought this would something that could tell if paper was real. You'd get results like "No. This is very very thin wood". grin
Posted by Tom K  on  Mon May 01, 2006  at  05:56 AM
Missing a be, sorry...
Posted by Tom K  on  Mon May 01, 2006  at  06:04 AM
I just tested 3 essays my wife wrote for her master's degree - essays I KNOW she wrote herself - and all 3 were inauthentic. My wife is obviously not human, something I've been trying to tell folks for years!

Seriously though - any fool could write a silly program that comes up with these bogus results. Who knows what the actual logic is - it's probably some complete nonsense.
Posted by Seth Easton  in  Washington DC  on  Tue May 02, 2006  at  03:40 PM