The other day I trained a neural net to generate cookie names, based on about 1,000 existing recipes. The resulting names (Quitterbread Bars, Hand Buttersacks, Low Fuzzy Feats, and more) were both delightfully weird and strangely plausible. People even invented delicious recipes for them. But given that I’ve trained neural networks to generate entire recipes before, why not have the neural network generate the whole recipe, not just the title?
Well, this is why.
The first neural network I tried is textgenrnn, which I’ve used to generate things like new species of snakes, names for creepy shopping malls, and terrifying robotics teams. Given 1,000 cookie recipes from David Shields’s site, textgenrnn could produce a recognizable recipe – but its titles and directions were a bit suspect.
Now, granted, it’s confused about other things, too. A memory approximately 40 characters long means that it doesn’t know how many times it has already added sugar (apparently its favorite ingredient). (Other algorithms, like GPT-2, have ways to zoom out.)
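Here’s a toy sketch of why a short memory loses track of ingredients. The 40-character window matches the description above; the recipe text itself is made up for illustration – by the time the model is deciding what to write next, most of the earlier sugar additions have already scrolled out of view:

```python
# Made-up recipe text; the model only "sees" the last WINDOW characters.
recipe = (
    "Cream butter. Add 1 cup sugar. Add eggs. "
    "Add 1 cup sugar. Mix flour. Add 1 cup sugar. "
)

WINDOW = 40  # characters of context available at each prediction step

context = recipe[-WINDOW:]
print("mentions of 'sugar' in the full recipe:", recipe.count("sugar"))
print("mentions of 'sugar' the model can see:", context.count("sugar"))
```

From inside that window, there’s no way to tell whether sugar has gone in once or five times already – so the model keeps adding it.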
I decided to see if textgenrnn would figure out recipe titles if it trained for longer. It generated the recipe above after it had seen the example recipes 3 times each. Below is what it did after another 3 looks at each recipe (6 in total). The title is… different. The recipe is more chaotic. It has at least moved on from its obsession with sugar.
After 3 more looks at the data (9 in total), things have gotten even worse, though according to the loss (the number it uses to track how closely it matches the training data), it thinks it’s doing better than ever. It seems to be freaking out in particular about the repeated + signs that some recipes use as decorative borders.
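For reference, the loss for a character-level model like this is usually cross-entropy: the average negative log of the probability the model assigned to each true next character. A tiny sketch with made-up probabilities shows how confident predictions push the number down and bad ones push it up:

```python
import math

# Made-up probabilities the model assigned to each true next character.
predicted_probs = [0.9, 0.6, 0.05]  # the 0.05 is a badly missed prediction

# Cross-entropy: average negative log-probability of the true characters.
loss = -sum(math.log(p) for p in predicted_probs) / len(predicted_probs)
print(f"cross-entropy loss: {loss:.3f}")
```

A falling loss just means the model is assigning higher probability to the training text – which, as the + signs show, is not the same thing as writing a better recipe.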
There are terms for these kinds of training disasters that sound more like warp engine malfunctions – but “mode collapse” usually applies to image-generating algorithms, and “exploding gradients” is usually signaled by the neural net thinking it’s doing worse and worse, not better and better.
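As a toy illustration of that second failure mode (this is not textgenrnn’s internals – the numbers are invented): backpropagation through a recurrent net multiplies the gradient by the recurrent weights once per time step, so a weight scale above 1 makes the gradient grow exponentially. The standard fix is gradient clipping:

```python
w = 1.5      # assumed recurrent weight scale, > 1 to force the explosion
grad = 1.0   # gradient arriving from the last time step

# Backprop through time: one multiplication per step.
for step in range(30):
    grad *= w

print(f"unclipped gradient after 30 steps: {grad:.3e}")

# Gradient clipping: cap the value to a fixed range before the update.
clip = 5.0
clipped = max(min(grad, clip), -clip)
print("clipped gradient:", clipped)
```

Thirty steps is enough to turn a gradient of 1 into one in the hundreds of thousands, which is why training frameworks clip gradients by default.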
So I moved back to another algorithm I’ve used for a long time, char-rnn, which seems to do well at long texts like recipes.
The recipes are… better. I’ll give it that much.
Some of its ingredients are questionable, and its directions certainly are. But at least (despite a memory only 50 characters long) it has managed to do the beginning, middle, and end of a long recipe. It’s often fuzzy about how to end a recipe, since my rather messy dataset has recipes ending with sources, urls, and even ISBNs. Recipe notes are also highly variable and therefore confusing.
So what happened with textgenrnn? I mentioned my difficulties with textgenrnn to the algorithm’s creator, Max Woolf, and he urged me to try again. There’s some randomness to the training process. Sometimes textgenrnn degenerates into chaos during training, and even then sometimes it pulls itself together again. When it did well, its instructions even started to make sense. You could make the Butterstrange Bars below (almost). Given this amount of randomness, it’s nice when researchers report the aggregate results of several training runs.
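Reporting aggregates is easy to sketch: run the same training setup under several random seeds and summarize the final results instead of cherry-picking one run. The `train_run` function below is a stand-in that simulates noisy training (the losses are made up, not from textgenrnn):

```python
import random
import statistics

def train_run(seed):
    """Simulate one noisy training run; returns a made-up final loss."""
    rng = random.Random(seed)
    loss = 3.0
    for epoch in range(10):
        # Each epoch improves the loss by a random amount, floored at 0.5.
        loss = max(0.5, loss - rng.uniform(0.0, 0.5))
    return loss

# Aggregate over several seeded runs rather than trusting any single one.
final_losses = [train_run(seed) for seed in range(5)]
print(f"mean final loss: {statistics.mean(final_losses):.2f} "
      f"+/- {statistics.stdev(final_losses):.2f} "
      f"over {len(final_losses)} runs")
```

A single lucky (or unlucky) run can make an algorithm look much better or worse than it typically is; the mean and spread over seeds tell you what to actually expect.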
Bonus recipe! Sign up here and you can get an Excellent (not excellent) recipe for “Mrs. Filled Cookies” along with (optionally) bonus material every time I post.