Joe Russo thinks everyone will direct and star in their own movie with AI — here’s why that’s a disaster.

Bryan Tan
May 9, 2023
Photo by Gage Skidmore — CC3.0 — https://commons.wikimedia.org/wiki/File:Joe_Russo.jpg

Joe Russo, one half of the Russo Brothers, the directors responsible for some of the most successful films in the Marvel franchise, has of late been singing the praises of the boon that new developments in generative AI will bring to movies: that it will personalize and “democratize” art in ways we can hardly imagine. Here are some quotes from his panel interview with Collider to give you a sense of his expectations.

“You could walk into your house and [say to] the AI on your streaming platform. “Hey, I want a movie starring my photoreal avatar and Marilyn Monroe’s photoreal avatar. I want it to be a rom-com because I’ve had a rough day,” and it renders a very competent story with dialogue that mimics your voice. It mimics your voice, and suddenly now you have a rom-com starring you that’s 90 minutes long. So you can curate your story specifically to you.”

“The value of it is the democratization of storytelling. That’s incredibly valuable. That means that anyone in this room could tell a story, or make a game at scale, with the help of a photoreal engine or an engine and AI tools. That, I think, is what excites me about it most.”

It’s a profound lack of wisdom that is driving this urge to impose these new technologies onto our already fragile modern psyches and then to suggest that it will actually be empowering to us. There isn’t in the entire interview any indication that this is anything other than convenient and cost saving, or somehow merely inevitable due to the march of technology. It should also be said that the setting of this conversation is a convention panel, so we cannot expect academic rigor from it, or assume that it represents his full views in any way. Yet if Russo had any personal consternation with what he describes, he did not voice it. We will proceed with the assumption that words matter. I’ve of course quoted only part of Russo’s comments here and I encourage you to read the entire interview if you are curious or doubt my intentions.

So having got some of that reactionary frustration out of my system, I want to unpack some of the implicit values here and see if we can arrive somewhere a little more reasonable.

When Russo says we can insert ourselves into dramas of our creation and meld ourselves into the lives of famous people, he assumes we actually want this. Maybe I’m just an old soul but I can’t see the appeal of watching an entire film like this. There is certainly something of a humorous novelty of the TikTok, Snapchat variety in these kinds of face swapping games, but children do not shout at their mothers “let me tell you a story,” but “tell me a story!” They do not demand to be placed into these stories either. It does not matter if the story is about an owl, a fish, a boy, or a girl, it just matters that it’s good. I cannot recognize any primal urge to be made the center of attention in this manner. We actually quite like to disappear when watching a film so that we can see how it is that others behave in extreme circumstances. We love submitting to the dramaturgy of the teller. It means we don’t need to make decisions for a short while in our sufficiently complex lives. Does anyone playing a video game name the character after themselves or attempt to render the avatar with their perfect likeness? One never forgets that one is playing not oneself in the game, but a character. Would this not strike us as a bit psychopathic? I am shocked that Russo can express excitement at this with hardly any kind of qualification.

Why Russo would suggest offloading this burden of artistry onto the viewer is something I cannot understand. Anyone who has made art knows that no work is ever finished, you simply stop working on it. On Russo’s model everyone now has this responsibility, and who will be able to resist the constant tweaking of their character, with a lingering feeling of dissatisfaction over their choices in the drama? Perhaps next time you will get the balance just right. This vision makes every work perpetually incomplete and formless in the eyes of its viewer, who now has the burden of participation in creation. One will lose the sense of unity that comes with viewing a work made by hands not one’s own.

We ought to notice yet another problem with this vision of the future. For you cannot always be the center of the film if you watch content with an audience of greater than one. Someone is going to have to impose their taste onto the story, and that will leave room for tremendous conflict. There is a neutrality that siblings, a couple, a family, or an audience in a theater can experience, a joint submission to another mind’s dream. If the film isn’t good, it’s the filmmaker’s fault, not your partner’s. In Russo’s vision, the relation is one on one, between you and the screen alone — a peep practice, pornographic in its separateness, in its fixation on novelty, and on the preferences of the user. The result of this is that people will not watch this kind of content together, but in an isolated manner. We are already trending this way with the proliferation of handheld devices and the preference for sitting close to smaller screens rather than farther from a single large one, but still the content itself provides a unifying thread. In this vision it will no longer be possible to discuss the latest episode of the latest show because everyone’s experience will be idiosyncratic.

The implications of this are significant. Our experience of reality as a coherent thing, as a story within time that we are sharing, collapses. We have already fractured society across race, class, and politics, now we can fracture it into individual units and total narcissistic incoherence. By now we should not be ignorant of the massive power by which media of all kinds shape our views of reality. Tampering with these structures ignorantly to indulge our senses will only bring more reactionary violence and extremism. The structure encourages only further fracturing of our view of reality.

I don’t think Russo’s exact vision will come true. It is far too hedonistic and unlivable, but something quite like it probably will. It will have similar characteristics and indulge the same weaknesses of our modern sensibilities. The question is a matter of degree, not of its character. A part of me fears we will get exactly what he has described. As we have seen in the last decade, we cannot rely on the metric of what people would actually enjoy in order to predict what future technologies will be successful. We are rather subjected to the fancies that the machine makers create. The devices are their art and we are a captive audience.

Here is an unpopular opinion. There is no such thing as “democratizing art.” New technologies do not empower artists. Art is not concerned with efficiency. New technologies simply impose themselves and their new standards of productivity onto artistic endeavor. If it was once difficult to kick the ball fifty meters whereas now, with my pneumatic calves, it is possible to kick it two hundred, then the goalpost has simply moved an equivalent distance. The audience is not impressed by easily surpassing yesterday’s standards of excellence. The audience for their part does not care about improving technical standards, for they are generally unaware of them and perceive primarily the content of the medium, not its composition.

One is shocked when looking at the graphics of an old video game: our mind remembers them as far more detailed and lifelike, and it is only with the procession of new technology that we notice the goalpost of quality moving. Equally so, great artistry from any era stands out as something different from technical proficiency. Notice how little one is concerned with the blocky graphics of an old Nintendo 64 game and how little this detracts from revisiting these classic games even when modern versions have far surpassed them technically. It is much the same in cinema: the tyrannosaurus in the original Jurassic Park still provokes our awe today whereas the sequels left our memory the moment we left the cinema. Of course, none of this is to say that technology has a trivial impact on art. Quite the contrary, it is indeed paradigm shifting, only its advancement is not the measure of good artistry. It represents instead a changing of the medium, a change of the playing field as it were, and the rules in which the game unfolds.

Perhaps it will be possible for everyone to render Marvel-level graphics on their own TVs to their own preference. It is of no consequence. That which was difficult with a former technology and is easy with a new one will pose no special interest to us. It shall only succeed in turning special effects into average effects. It will never be the case that great art will be easy to make, for we will only call that great which is difficult and bears the mark of inspiration and personal sacrifice. It may be that advancements in technology can make a process cheaper or swifter in man-hours, but this will do nothing to democratize the process. For still, the audience will look for excellence. As with sport, so too in art does the audience favor and venerate only those few of the greatest mastery.

A human being is not a mere consumer of content. A human wants to leave some mark on the world, to build something of worth. For decades now the most promising horizons of creation have been in the digital frontiers. Thus, that is where creatives of all kinds have directed their energies, and where their talents have borne the most fruit. The prognosticators seem to think this will continue unabated. But if this AI generative art apocalypse arrives as it is being promised to us, then the digital world will no longer offer possibilities of meaningful creation. For the endless “precession of AI simulacra” that will soon be, and already is, upon us will drown out our own songs and our own visions of beauty. As a consequence, we shall simply turn our attention back to the physical world, where we can see the impact of our labor.

We should remember that favorite buzzword: inspiration. We use this word so much it has nearly lost its value. But I suppose that it refers to an experience that ignites within us a drive to do something that we had previously imagined impossible. It does not matter how magnificent the new horizons of visual excess can be, or how personalized they can become to our tastes. It will be no more inspiring to watch an AI generated movie than it will be to watch an air cannon fire basketballs into a hoop. This and other stunts, like AlphaZero’s “mastery” of chess, are of interest only to the engineers of machines. Human competition in chess has continued to flourish in spite of AI dominance. People will tune out the noise in search of something more meaningful, something that takes place on a human scale, that bears within it the possibility of emulation.

This is the trend that no one sees coming, that no one would dare to predict, for it runs counter to the entire engine of modern life. This engine runs on the fuel of modernization and digitization, but the car is driven still by a human, who accepts these as the tools through which their potential can be actualized in the world. But if you are a driver in search of roads not traveled, and you find only busy, crowded streets full of traffic; if any time you try to get off the main highway onto an old, forgotten road, a manic frenzy of robots arrives to pave the path in front of you before you can even get off the exit ramp; then you will sooner or later stop driving the car and instead travel on foot, where the robots will not bother you.

This is what the fools who sing praises of these AI horizons do not understand. It matters not how engrossing the content they create shall be. If it does not present to us a horizon to which we can travel and leave some significant mark on the world, then it will not be of interest to us. Already the opportunities of the digital frontiers seem to diminish before us, lacking their prior luster. Everywhere creators lament the end of YouTube and of Twitter, the flattening of Instagram, the languid isolation of remote work, the decline of white-collar jobs, the collapse of tech cities like Seattle and San Francisco, and their tech stocks along with them. The collapse in the profitability of streaming; the list goes on.

The AI prophets say it’s them or bust, but the next era shall only diminish the creator economy further. All the simulation and gamification of The Digital that the tech gurus can dream up will do nothing but prolong the slow realization that the real frontier now is in the derisively titled “meatspace” that we have for decades left out to rot.

The printing press and the industrial revolution rendered uniform the once multivariate splendor and true diversity of traditional human societies. But whereas the industrial world renders reality as uniform, The Digital always tends towards the idiosyncratic due to its infinite capacity for change. In the uniform satisfaction of The Industrial, we had our material needs met, but none of the aching spiritual needs within. The Digital world then offered us the possibility of novelty, of invention, disappearance, and of proliferation of unique and hybrid forms.

Generative AI is promised to deliver on the diminishing promises of The Digital. But there is hidden in AI a new era which subverts that very promise. For AI is the opposite of idiosyncratic. It is generalizing. It engages in the slow process of mushing things together into an indecipherable grey slime: the idiosyncrasy of Wes Anderson made boring and commercial. This is a feature, not a bug. It is the very nature of a generative model, such as a large language model, trained on a dataset of preexisting artistic material to combine and average out that which already exists, not to dream new dreams. Whereas in its early days this AI technology will be used to advance the landscape of The Digital and, by extension, idiosyncrasy, in time the fundamental sameness and flatness of the generative AI landscape will reveal itself to be nothing but the toddler-like smashing together of action figures.

Artificial General Intelligence of the kind feared in movies is now a real possibility, and would represent a new category altogether. But as we have hopefully sufficiently discussed here, such a cosmic leap is not even necessary to bring massive societal change and unfathomable damage to the human psyche. The “tools” that have already been invented, if merely honed and iteratively improved, are sufficient on their own to do that, and they already are.

I know it’s bleak. But I am clinging to hope that gradually we’ll wake up to this. I am hopeful that the excess and over-stimulation that is soon to envelop us shall wear us out and awaken in us a desire for something more real. If it is too late for my generation then perhaps the next will perceive what we have not. Children can be counted on to rebel against the norms of their parents. Whether it is sooner or later, if we manage to extract ourselves from our subservience to the digital world, we must not do so in a reactionary, Luddite manner. This would be an infantile clinging to the past, dead of potential from its birth.

We must instead leverage the strengths of the digital to make the physical world particular and interesting again. There are immense possibilities in 3D printing at scale, and in similar hardware technologies, to destroy the tyrannical uniformity we have grown accustomed to since the industrial revolution. We are encouraged to find “our people” in digital communities. Why do we not build real communities? I do not mean this in a vague sense. I mean why do we not build our own homes, and our own centers of life? Why do we outsource all of this to unseeing and unfeeling developers, letting them grow rich while we languish under the ignorance of their designs? We have outsourced everything in “meatspace” to unfeeling machines, or machine-like systems of laws and contracts. We can use the digital to supplant the bland uniformity of life and bring about more bespoke systems that actually serve us and the people we love. We must view the digital world not as the realm in which our potential can be realized, but as merely a medium in which to share the exciting things that we are doing in the real one.

That is, if we can impose our standards of living onto digital systems rather than allowing them to impose their standards onto us. Maybe we can do that if we can only lift up our heads.

Bryan Tan

Atlanta-based Filmmaker, Writer/Director. Writing here about AI implications and cultural matters. https://www.bryanjtan.com https://lucidthemes.substack.com/