In Drawing on the Right Side of the Brain, her classic book on the intersection of neuropsychology and artistic practice, Betty Edwards describes children’s attempts to produce realistic drawings at an early age. As soon as they get out of the scribbling stage, once their language skills develop and their thinking becomes dominated by the left side of their brain—the side which uses logic, words, and symbols, as opposed to the right side, which makes intuitive connections and thinks more concretely—they often abandon the pursuit of realism and begin to populate their drawings with symbolic representations (think “a big circle with lines coming out” as a symbol for “the sun”). These symbols can easily be read as whatever the child is trying to draw; the adults praise the child’s budding skill, and the child never learns to observe and replicate the volumes, negative spaces, curves, and lines that are actually present in their view of the world around them. Edwards notes that many children suspect that their symbol-drawings really don’t look lifelike at all.
Except for the last part, this is exactly what happens with the AI art generators. An AI is given a prompt, which it interprets to mean “give me a picture of something that looks like that.” The AI is not trying to draw from life or copy some aspect of reality; it is reaching into the repository of symbols that our culture has accumulated and guessing which combination of symbols will best match our expectations. This is what a child does when asked to draw a cat or a house or whatever. Of course, the child’s interlocutor will praise the drawing and exclaim that it looks “just like a cat”—it would be cruel not to. And look at how much praise is being heaped on the AI art generators! Writing in Futurism, Tom Ward claims that AI art is better than human art. Meanwhile, Ai-Da, a humanoid robot that paints and draws, is blowing the minds of the people at The Guardian. Is it a good or a bad thing that AI uses symbols to make its art? Before answering that, I want to go into more depth about how symbolism works in the context of AI art—but to do so, I have to show you a bunch of badly drawn pictures of an annoying anime character.
Earlier this year I asked Nightcafé to create a picture of “Sailor Moon making a salad.” You can see the result immediately below. There are two places where the AI created something which looks like some leafy green lettuce; so far, so good. And there are several elements of the image which seem to be symbols for Sailor Moon—the gold balls, the red and blue stripes, some blond-hairish thing in the top right—yet these symbols are arranged incoherently.
Nightcafé allows you to “evolve” an image by running it back through the AI as its own prompt, in the hope of sharpening and refining the image a bit. So I did that, and this is what happened:
The salad imagery is still there (with one of the gold balls now in the salad), and the Sailor Moon iconography is beginning to consolidate in the center of the image (I can see an upper torso, neck, and shoulder of some sort of figure in there—if I use my imagination). The AI gave me a very good set of symbols corresponding to my initial prompt, but just like a child’s drawing of a stick figure with symbols for hands, facial features, and body in all the wrong proportions, the AI did not compose its symbols into a rational whole.
Nightcafé is rather bad at organizing an image into something that makes sense, but Craiyon (aka DALL·E mini) is much better at that sort of thing. I ran my experiment in Craiyon and was very impressed with the results; finally, the AI is listening to what I’m saying and composing a coherent image.
But there are still problems. What is going on with Sailor Moon’s face in the top left image? How is she holding the salad bowl in the center left? And the AI reverts to a vague suggestion of leafy greens in the bottom right panel—I’m bringing a lot of myself to the art when I interpret that as “salad.” We are at the level, here, of Picasso’s famous portrait of Gertrude Stein, which uses symbols of facial features to recreate Stein’s appearance, yet which does not accurately reflect the reality of what Picasso saw when he painted it.
Recently, I got an invitation to try the Midjourney beta (not because I’m special or anything; they gave invites to something like 300,000 people at once). Of course, you can imagine what I did before my free trial ran out.
There is still some nonsense here (what is going on with the hands and shoulders?), but I find this very intriguing because Sailor Moon doesn’t look like an anime character—she looks like a regular person. What does this signify? It’s as if the AI thought, “Sailor Moon is a person, so I will create the image of a person.” And instead of giving a correct rendition of what Sailor Moon looks like, we get a (more-or-less) correct portrait of a human being.
A friend of mine was recently given access to DALL·E—the premier, the big one, the one that is “so powerful that we can’t open it up to everyone yet” and he performed my experiment for me. The result is inaccurate in precisely the opposite way from what Midjourney gave me.
Here, DALL·E decided that “Sailor Moon is an anime character; I’ll draw anime characters making the salad.” But that’s not what Sailor Moon looks like in the show. The clothes are the wrong color; her hair is different. My seven-year-old daughter drew a guitar from life recently; the guitar she drew had a sound hole very close to the bridge, but her model’s sound hole was up next to the fingerboard. Same thing: the AI is using what it thinks it knows and drawing an image that it guesses will pass as an acceptable, readable symbol for what was originally asked. Although Midjourney’s Sailor Moon has a very impressively rendered face, that face is still just a symbol; it does not correspond to any real face, either in the world or in the Japanese television show that it was supposed to copy from. Similarly, DALL·E’s attempt to create an anime character results in drawings that correspond to the general type, but not to any specific anime character.
In her book, Betty Edwards describes the enormous frustration of children who do not get past the symbol-drawing stage despite their desire to produce drawings that look real, and as a consequence abandon their artistic pursuits. But there are so many opportunities for the use of symbol-drawings: illustrations, graphic design, and cartoons all fall into that category. It is unfortunate that these arts are for some reason not considered as valuable as the “Fine Arts” like painting. A deft manipulation of symbolic modes of communication is a skill that must be learned, practiced, and perfected just like the other arts. No one should say that Watchmen or Maus is bad art just because the pictures are made mostly of symbols. And art does not have to look like or be as complicated as The Maids of Honor or the Sistine Ceiling to be valuable, good, or interesting.
So it would seem that AI art already has a use case—illustrations. Of course, this leaves open the question of what will happen to all of the illustrators who will now lose their jobs to DALL·E. Or will they? Once DALL·E itself is finally released (an event which appears to be imminent), artists of all kinds can easily jump on, set up an account, and start making illustrations. And an artist who can tailor a prompt to produce exactly what they envisioned in their head will have a marketable skill just as much as one who can manipulate a brush, a pencil, or Photoshop. I’m not worried for the artists at all.