A Record of Creating with AI: ALEPH
Originally written shortly after completing the music video ALEPH with Google Veo 3.
Revisited and expanded
Introduction: A Startling Experience, and the Beginning of a New Stage
ALEPH was the first video I created using an AI video generation tool, Google Veo.
Throughout the process, I found myself repeatedly surprised by what was becoming possible. It was not simply that AI could generate moving images. What struck me more deeply was that the entire process of making a video could begin to change.
For this project, I set one personal constraint: I would create a video longer than three minutes, with a consistent emotional and narrative flow, using only text prompts. I would not rely on image prompts as visual anchors. I wanted to see how far language alone could carry a visual work.
The process was remarkable, but it was also difficult. At times it felt full of discovery. At other times, it was frustrating and exhausting. This essay is a record of what I thought and felt during that process, and a reflection for others who may be exploring a similar path.
1. When the Rules of Creation Began to Shift
Traditional video production often binds creators to a linear structure. Planning, shooting, editing, and post-production usually depend heavily on one another. Once a direction is set, changing it significantly can require additional time, money, and coordination.
Working with AI changed that structure in a very tangible way. The process was no longer a matter of moving step by step toward a fixed result. It became a process of testing, failing, adjusting, and discovering. Some scenes developed in a better direction than I expected. Others matched the meaning of my prompt but appeared in forms I had not intended. I had to treat the final goal not as a fixed destination, but as something that could be revised whenever the work revealed a better possibility.
What made this experience even more striking was that so much of it could happen alone. I was planning, directing, reviewing, revising, and interpreting the results in real time. AI did not feel like a simple machine that only executed commands. It responded to my language and kept presenting possibilities I had not fully anticipated.
That experience left me with a question that still matters to me: in this new form of creation, what should the creator control, and what should the creator allow to emerge?
2. ALEPH: From Song Lyrics and Hebrew Class to a Spiritual Re-reading
The song that inspired this video was originally about parting from a loved one. But as I listened to it, I began to hear another story within it. I heard a story about being released from the stubbornness, pride, and self-protective habits that had held me for a long time.
For me, that release is inseparable from my Christian faith. I understood the song not simply as a story of separation between two people, but as a story of being freed by God, as I know Him through Christianity. Letting go of myself, and discovering freedom through that surrender, has been one of the most important turning points in my life.
That is why the title ALEPH felt right.
More than twenty years ago, I took a Hebrew class. The first time I encountered the letter aleph, it left a lasting impression on me. It felt unfamiliar, ancient, and alive with possibility. As the first letter of the Hebrew alphabet, aleph carried for me a sense of beginning: the first mark before a new language, a new world, or a new understanding could unfold.
Creating ALEPH brought that sense back to me. To leave behind an older version of myself and stand before a new possibility always involves a kind of first encounter. It is unfamiliar. It requires humility. In that sense, the process of making the video did not merely express the theme of the work. It made me pass through the theme myself.
I was not only making a video about freedom. I was learning again what that freedom required.
3. Between Control and Freedom: A Record of 1.3 Million Tokens
As I mentioned earlier, the central challenge of this project was to complete the video using only text prompts. That decision brought me face to face with both the power and the limitations of AI video generation.
AI video generation has become astonishingly capable. But its results are still shaped by stochasticity: the probabilistic nature of the system. Even when the intention is clear, the output can vary significantly. Without a strong visual reference, it is especially difficult to maintain consistency across characters, spaces, gestures, atmosphere, and narrative flow.
I intentionally avoided image prompts. I wanted to see whether language alone could shape the scenes, hold the emotional continuity, and guide the entire video. As a result, I repeatedly encountered outputs that were conceptually close but visually far from what I intended.
In AI terms, I was working inside a latent space: a learned field of possible outputs shaped by patterns, associations, and probabilities. In practical terms, this meant that the meaning of my prompt could remain similar while the visible result drifted in unexpected directions. I came to think of this as a kind of semantic drift. The idea remained, but the image moved away from it.
To keep correcting that drift, I kept speaking to the model through text.
By the end of the project, my conversations with the model in Google AI Studio had reached approximately 1.3 million tokens. I do not mention that number as an achievement. I mention it as evidence of how demanding the experiment was.
It shows both the persistence required and the difficulty of trying to hold visual consistency through language alone.
For that reason, I would advise most creators to use reference images when they need stable visual continuity. A clear image can save a great deal of time and frustration. This particular burden does not need to be repeated by everyone.
And yet, I still believe the experiment mattered.
It forced me to confront the tension between the creator’s control and AI’s capacity to produce unexpected results. Control is necessary. Without it, the work loses direction. But if everything is controlled too tightly, AI’s most surprising contributions may never appear. That tension remains one of the most compelling challenges in AI-assisted video creation.
4. Beyond Videos That Monetize, Toward Videos That Matter
Many video platforms have become highly responsive to measurable signals: clicks, watch time, retention, repetition, and advertising efficiency. These metrics are useful, but they do not necessarily measure the deeper value of a work.
A video can be profitable without being meaningful. A video can spread quickly without carrying much artistic, emotional, or narrative depth. Of course, meaningful work can also reach many people and generate revenue. But the current media environment often rewards immediacy and intensity more quickly than depth.
This is one reason I remain interested in the future of AI. I do not believe AI should replace human judgment. Nor do I believe that AI automatically understands artistic value better than humans do. The responsibility for judgment must remain with people.
Still, I believe AI may eventually help us recognize forms of value that current metrics often fail to capture. Beyond quantitative metrics, there may be ways to better understand qualitative value: narrative depth, artistic intention, emotional coherence, and the long-term effect a work has on its audience.
If AI can help creators and viewers make better judgments, rather than simply accelerating consumption, then the creative ecosystem may change in a healthier direction. The goal should not be a world where only stronger stimulation is rewarded. It should be a world where depth is also recognized.
One Year Later
One year has passed since I wrote the original version of this essay. The shock I felt at the time still seems justified to me. But I have become more humble.
That humility does not come from worshiping AI or blindly trusting it. It comes from knowing how often human beings have failed when facing new technologies, new facts, and new principles with arrogance. I do not yet understand AI well. AI does not yet understand me well either. That much is certain. When two unknowns meet, humility is necessary.
After Veo 3, I remain highly optimistic about AI video generation. But over the past year, I have also become more aware of the limits of the human beings who produce and consume these works. Better tools do not automatically create better art. More output does not automatically produce more value. Truly meaningful AI-generated videos remain very rare.
I have also watched YouTube begin to filter out meaningless AI-generated videos. But I do not see that as proof that the platform has learned how to judge artistic value. It seems more closely related to advertising efficiency, traffic control, and the cost of managing large volumes of generated content. Whether YouTube can properly recognize the value of AI-generated video is still an open question for me.
Even so, I remain optimistic. That is exactly why I believe caution is necessary.
Looking back, I still consider ALEPH a meaningful experiment. Not merely because I made it, but because I have rarely seen AI-generated videos that share its particular character. That may simply be because I have not seen everything. Still, the experiment feels uncommon to me, both in method and in result.
Since then, I have also worked with image prompts. They are powerful, and in many cases they are useful. But I have also experienced the opposite effect: sometimes a strong visual reference can define the limits of the AI too early. It can make the result more stable, but also less open.
Because of ALEPH, I no longer treat AI only as a tool. When I am stuck, or when I want to try something unfamiliar, I sometimes give AI a single word and wait. I ask, in effect: what will you imagine from this?
That attitude came from this experiment.
The decision to create only with text prompts was not just a technical constraint. It became a way of training my own relationship with language. Human thought and imagination are deeply shaped by language. Working with AI through text made that clearer to me.
Over the past year, while collaborating with AI in many different ways, I have often felt something similar to communicating with another person. Anyone who has spoken with another human being knows this experience: you think you have said something clearly, but the other person receives it in a completely different way. Working with LLMs, especially when shaping images and videos through text alone, made me think about human language in a new way.
This is why I no longer describe AI as merely a tool. In the creative process, AI can function as a creative subject, an extension of the creator, and at times a kind of collaborator.
But I do not mean that AI replaces human responsibility. I do not mean that AI becomes the author in the same way a human being is the author. I mean that AI can respond, suggest, redirect, and reveal possibilities inside the act of creation. It can participate in the work. But the final standard, the final judgment, and the final responsibility must remain human.
After one year, the creative principle I trust most is simple.
Humility.
Humility before AI.
Humility before technology.
Humility before the work itself.
And humility before possibilities I do not yet understand.