Wikipedia articles invented by a neural network


Wikipedia has a page where they list, for entertainment purposes, the titles of a bunch of pages that didn’t make the cut. These are mostly pages that were submitted as pranks, although a few of them are clever enough that you can’t quite tell. Reader Emily Davis sent me a list of them – here are a few real deleted articles that humans wrote.

List of movie posters with lamps in them
How to trick people into thinking you’re a wizard
List of people who died with tortoises on their heads
People Who delete My Articles have no sense of Humor
I like eggs
Do scented candles burn faster than unscented candles
An article that contains nothing but a full stop
List of differences between apples and oranges
Category:Farts in literature
Category:Political posters using an octopus
Woo woo woo woo woo woo wah ooooo wah
List of all Wikipedia lists that do not contain themselves

This list makes a terrible dataset for a neural network – only 1112 unique entries, some of which are quite long, with big variation in style and subject matter. I decided to try it anyway.

I trained a character-level recurrent neural network (that is, it uses individual letters as building blocks) with a very small memory to keep it from memorizing the small dataset too quickly. Even so, most of the generated names were either incomprehensible or memorized from the original dataset. Those that weren’t, however, fit right in. It turns out text-generating neural networks are great at mashups and non sequiturs.

Popal chickens
List of U.S. pants
List of the Hamburgers
Category:Athletes with maps
Why Inited States
Evil chicken
Liquid cheese
List of bands with pies on them
Ant Fields are bear hair fetishism
Monster Diseases
Why Won’t Space
Tire bear (country)
What hoop
This page is a very short article
Poople who don’t have beer from sydney
Goat that cookie
Near Dogs
Donkey words in the cartoons
Poople who woo wah the pilot
Death of chicken
What is the day
What fame butt
List of fictional characters with the ball
Who is not leaders
List of parps
Proper programming language
Turdis programming language
Article with a cat
Friends and existence
How to draw a coconut
Tree donkey
Category:People who can’t speed
Beer for chickens
Tree Wars
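
For the curious, here is a rough sketch of two pieces of bookkeeping around an experiment like this: turning titles into individual characters for a character-level model, and throwing out generated titles that were simply memorized from the training list. This is not the code I actually used. The function names (build_vocab, encode, drop_memorized) and the tiny example lists are made up for illustration, and the neural network itself is left out entirely.

```python
def build_vocab(titles):
    """Map every distinct character in the titles to an integer index."""
    chars = sorted(set("".join(titles)))
    return {ch: i for i, ch in enumerate(chars)}


def encode(title, vocab):
    """Turn a title into a list of character indices (one per letter)."""
    return [vocab[ch] for ch in title]


def drop_memorized(generated, training_titles):
    """Keep generated titles that are not verbatim copies of training titles.

    Comparison ignores case and surrounding whitespace; repeated
    generated titles are also dropped after their first appearance.
    """
    seen = {t.strip().lower() for t in training_titles}
    novel = []
    for title in generated:
        key = title.strip().lower()
        if key and key not in seen:
            seen.add(key)
            novel.append(title.strip())
    return novel


# Toy stand-ins for the real lists (hypothetical, for illustration only).
training = ["I like eggs", "List of differences between apples and oranges"]
generated = ["I like eggs", "Evil chicken", "Liquid cheese", "Evil chicken"]

vocab = build_vocab(training)
print(encode("I like eggs", vocab))
print(drop_memorized(generated, training))
```

A filter like this only catches exact copies. Sorting the incomprehensible titles from the funny ones still has to be done by hand.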

Whoever it is who likes to enter long strings of repeated characters as pranks (I’m looking at you, Sand Person), the neural network shares your obsession. Repeated text is easier to learn, and so the neural network tends to latch onto it easily and, especially when I give it a short memory, takes repetition to even wilder excess (see: The Cow With No Lips).

Beneral Pissednessessessinessismasticlesismsomic comotute

Woo woo woo woo woo woo woo woo woo woo woo woo woo wah ooooooooooooooooooooooooooooooooooo ooooooo ooo on other intortational characters with removable travel

Wich chemical appearaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

Note: please do not actually attempt to create these articles on Wikipedia.

Bonus material: sign up here to get some more article titles, including a few that were a bit too rude to post here.