By now, you've probably seen your fair share of jaw-dropping AI images on your social feeds and thought to yourself, "How are people creating these amazing images?" So, you jump onto Midjourney, eager to create your own. But what you produce isn't quite what you expected. You try again, only to be disappointed once more. Sound familiar?
To generate what you have in mind, you need to know how to prompt Midjourney's LLM the right way. To be honest, I've spent my fair share of time failing to generate what I want for Confident's weekly blog post cover images (such as the current one above).
So, are you ready for a light-hearted and interactive tutorial? Let's begin!
Getting Started with your first Midjourney artwork
To get started with Midjourney, sign up for Discord if you haven't already and complete the registration process. Once you have Discord up and running, open the Midjourney website and click "Join Beta".

Once you've signed up, you can select a paid or a free plan. Users on the free plan can generate images in any one of Midjourney's newbies channels, while paid users can send commands directly to the Midjourney bot.
To create your first image, type / followed by the imagine command. Discord will then let you enter a prompt (a description of the image you want to generate), for example: /imagine prompt: beautiful colorful horse

Congratulations! You just used Midjourney to generate your first image.
How does Midjourney work?
Midjourney uses an LLM (a large language model) to create images from text descriptions. This model has been trained on a vast array of text-image pairs, enabling it to understand and interpret the text prompts to produce similar images.
Let's break down this image creation process.
Analyzing the Prompt
The LLM starts by dissecting the prompt into its core ideas and terms. If you input something like "a photorealistic portrait of a woman," the system identifies key concepts like "photorealistic," "portrait," and "woman."
A basic Midjourney prompt looks like this:

While a more advanced prompt may look like this:

We'll get back to this later. What's important is to understand that whatever you write is used to create the latent vector in the following step.
Creating a Latent Vector
Next, the LLM translates these concepts into a latent vector. This is a numerical code that captures all the image details - its color palette, shapes, style, objects, and more.
All those parameters are used inside the model to understand your request, by matching the vector to data it already knows and has been trained on.
This is why the following tip by the official Midjourney documentation is so important:
The Midjourney Bot works best with simple, short sentences that describe what you want to see. Avoid long lists of requests. Instead of: "Show me a picture of lots of blooming California poppies, make them bright, vibrant orange, and draw them in an illustrated style with colored pencils," try: "Bright orange California poppies drawn with colored pencils."
This means you're more likely to get better results with shorter prompts!
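To make the idea of a latent vector a little more concrete, here is a toy sketch in Python. It is not how Midjourney works internally: the `embed` function, the tiny `VOCABULARY`, and the `known_concepts` dictionary are all made up for illustration, and a real model uses a learned encoder rather than word counts. The sketch only shows the general idea of turning text into numbers and matching those numbers against things the model already knows.

```python
import math
from collections import Counter

# Hypothetical, tiny vocabulary; a real model learns far richer features.
VOCABULARY = ["photorealistic", "portrait", "woman", "horse", "colorful", "pencils"]


def embed(text, vocabulary):
    """Toy stand-in for a text encoder: count vocabulary words in the text."""
    counts = Counter(text.lower().replace(",", " ").split())
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# The prompt becomes a vector of numbers.
prompt_vector = embed("a photorealistic portrait of a woman", VOCABULARY)

# Made-up "known" concepts standing in for what a model learned in training.
known_concepts = {
    "portrait photo": embed("photorealistic portrait woman", VOCABULARY),
    "horse drawing": embed("colorful horse pencils", VOCABULARY),
}

# Pick the known concept whose vector points in the most similar direction.
closest = max(
    known_concepts,
    key=lambda name: cosine_similarity(prompt_vector, known_concepts[name]),
)
print(closest)
```

Filler words such as "a" and "of" contribute nothing to the vector in this toy version, which loosely mirrors why short, focused prompts tend to work well.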
Using a Diffusion Model to generate the image
The final step of generating the image involves converting this latent vector into the actual image. This is where a diffusion model comes into play. It's a kind of AI that can form images from seemingly random patterns.
Starting from random noise, the model gradually refines the image, removing noise step by step until it reflects what the latent vector describes. This denoising is controlled, making sure the final image is clear and recognizable.
Other well-known generative AI platforms, such as Stable Diffusion, use the same technique.
This is also why, while waiting for Midjourney to complete an image, you see blurry images that eventually turn into amazing artwork.
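The gradual refinement can be imitated with a very small Python sketch. This is a simplification under loud assumptions: a real diffusion model uses a trained neural network to predict what noise to remove, guided by the latent vector, whereas this toy `denoise` function already knows the `target` and simply nudges each value toward it. All names here are hypothetical.

```python
import random


def denoise(target, steps=50, step_size=0.1, seed=0):
    """Start from random noise and move a little closer to the target each step."""
    rng = random.Random(seed)
    # Step 0: pure noise, one random value per "pixel".
    image = [rng.uniform(0.0, 1.0) for _ in target]
    history = [list(image)]
    for _ in range(steps):
        # Remove a fraction of the remaining difference from every pixel.
        image = [pixel + step_size * (goal - pixel) for pixel, goal in zip(image, target)]
        history.append(list(image))
    return history


def error(image, target):
    """Average absolute difference between an image and the target."""
    return sum(abs(p - g) for p, g in zip(image, target)) / len(target)


# A made-up five-pixel "image" standing in for what the prompt describes.
target = [0.0, 0.25, 0.5, 0.75, 1.0]
history = denoise(target)

# Print the error at a few steps to watch the picture sharpen.
for step in (0, 10, 25, 50):
    print(step, round(error(history[step], target), 4))
```

The early steps correspond to the blurry previews, and the later steps to the finished image.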

The basics
Begin with a short prompt, and focus on what you want to create - our subject. Let's say we are interested in creating a portrait of a woman. We can begin with something like this: /imagine A portrait of a young woman with light blue eyes

Once we have our initial image, it is all about iterations and improvements. We can now focus on details that matter, such as medium, mood, composition, environment.
Let's say we want to get a more realistic photo: /imagine A realistic photo of a young woman with light blue eyes

This one is more realistic; however, let's give it the touch of an old photograph. To achieve that, we can simply add a year, say, 1960.
/imagine A realistic photo of a young woman with light blue eyes, year 1960

We've come a long way by only adding small details, such as the year and the medium type (realistic).
Pro tip: The Midjourney Bot does not comprehend grammar, sentence structure, or words as humans do. Using fewer words means that each one has a more powerful influence.
Now, let's add a composition. For instance, if we want a headshot from above, we can revise our prompt accordingly: /imagine Bird-eye view realistic photo, of a young woman with light blue eyes, 1960

Pretty cool, right?
Continue experimenting with various elements such as environment, emotions, colors, and more to discover the diverse outcomes they can produce.
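If you like to keep track of your iterations, a small Python helper can assemble the prompt text for you. This is only a sketch: `build_prompt` is a hypothetical function, it does not talk to Midjourney or Discord, and you would still paste the resulting text into Discord yourself.

```python
def build_prompt(description, details=()):
    """Join a core description with optional details, separated by commas."""
    parts = [description.strip()]
    # Skip empty details so we never emit stray commas.
    parts += [detail.strip() for detail in details if detail.strip()]
    return "/imagine " + ", ".join(parts)


# The same progression as in the walkthrough: subject, then medium, then year.
iterations = [
    build_prompt("A portrait of a young woman with light blue eyes"),
    build_prompt("A realistic photo of a young woman with light blue eyes"),
    build_prompt("A realistic photo of a young woman with light blue eyes", ["year 1960"]),
]

for prompt in iterations:
    print(prompt)
```

Keeping each iteration in a list like this makes it easy to see which small change produced which result.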

Using its Large Language Model (LLM) and diffusion model, Midjourney can generate a wide range of variations based on your initial image. This gives you a great deal of flexibility and creativity in the image creation process.
By instructing the bot to produce either strong or weak variations, you can refine the output step by step. You might start with a broad concept and then progressively narrow down the details, or you could begin with a highly specific image and explore slight adjustments. The process continues until you reach a result that meets your vision or preference.

For example, asking for a strong variation will result in the following images:

Advanced techniques
Now that we understand the basics of Midjourney's LLM, we can dive into parameters. Parameters are options added to a prompt that change how an image is generated.
Changing aspect ratio
Pro tip: parameters are always added at the end of the prompt
One of the most important parameters is the aspect ratio. Midjourney's default aspect ratio is square (1:1), but what if we want to create a great cover image (such as this article's cover) or a portrait image? We just need to add --ar at the end of the prompt.
For example: /imagine Bird-eye view realistic photo, of a young woman with light blue eyes, 1960 --ar 1:2

Notice the --ar followed by the aspect ratio.
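As a rough sketch, here is how you might append the aspect ratio to a prompt string in Python and check its orientation. The helper names `with_aspect_ratio` and `orientation` are hypothetical, and the whole-number check is an assumption of this sketch rather than a statement of what Midjourney accepts. The code only builds text; it does not send anything to Midjourney.

```python
def parse_ratio(ratio):
    """Split a 'width:height' string into two positive integers."""
    parts = ratio.split(":")
    if len(parts) != 2 or not all(part.isdigit() and int(part) > 0 for part in parts):
        raise ValueError(f"expected a ratio like '1:2', got {ratio!r}")
    return int(parts[0]), int(parts[1])


def with_aspect_ratio(prompt, ratio):
    """Append --ar at the very end of the prompt, where parameters belong."""
    width, height = parse_ratio(ratio)
    return f"{prompt.rstrip()} --ar {width}:{height}"


def orientation(ratio):
    """Describe the shape of the image for a given ratio."""
    width, height = parse_ratio(ratio)
    if width == height:
        return "square"
    return "landscape" if width > height else "portrait"


base = "/imagine Bird-eye view realistic photo, of a young woman with light blue eyes, 1960"
print(with_aspect_ratio(base, "1:2"))
print(orientation("1:2"))
```

A ratio of 1:2 means the image is twice as tall as it is wide, which suits a portrait; a ratio such as 16:9 is wider than it is tall and suits a cover image.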