There’s a new sheriff in town for those of us who are excited about the eruption of mind-blowing (if ethically dubious) tools made possible by advances in generative AI image creation. The FLUX family of models from Black Forest Labs represents another leap forward in what is available to us, both in terms of the overall quality of the creative tools and the specificity of what can be asked of them.
FLUX AI vs Midjourney
As a long-time Midjourney user, I can’t help but point out that the platform hasn’t completely lost its mojo with the introduction of the FLUX suite.
When it comes to generating work that looks genuinely artistic and impactful–even tasteful–I still consider Midjourney to be the reigning champion for certain things, so long as you’re willing to do most of the work with text alone.
That being said, Midjourney is a pay-to-play tool, and it’s much (MUCH!) less flexible than an advanced diffusion model running in a work environment like ComfyUI. With some extra elbow grease, FLUX is a superior tool.

The FLUX Suite
There are a bunch of different variations within the FLUX toolset, even beyond the Big Three that Black Forest specifically offers.
There’s the Pro model, which you can only use through a (pretty reasonably priced) paid service like replicate.com or freepik.com.
Then there’s the Dev model and the Schnell (fast) model, which you can download from Hugging Face or CivitAI. The Dev model comes in at a whopping 24 gigabytes and takes time to load and generate images; it’s a monster, both in its size and its capabilities.
The Schnell model is lightweight and fast, but it still delivers impressive results (and you can run it on lighter hardware). If your GPU has less than 24 GB of VRAM, you’re going to struggle with the Dev model.
Beyond that, there are a bunch of “quantized” models that run at various levels of speed and capability (referred to as Q8, Q6, Q4, and so on). These store the model’s weights at reduced numerical precision, much like a compressed image trades a little fidelity for a much smaller file. I didn’t test every single one of them, but the online chatter is that they deliver pretty satisfactory results with much less of a computational footprint…
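To make the trade-off concrete, here’s a toy sketch of what quantization does. This is not the actual GGUF scheme the Q8/Q4 files use (those work block-by-block with more sophistication); it just shows the core idea of storing weights as small integers plus a scale factor:

```python
# Toy int8 quantization -- illustrative only, not the real GGUF format.

def quantize_int8(weights):
    """Map float weights onto 255 integer levels plus one scale factor."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q_weights, scale):
    """Recover approximate float weights from the integer levels."""
    return [q * scale for q in q_weights]

weights = [0.0213, -0.874, 0.402, 1.250, -0.033]   # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each weight now fits in 1 byte instead of 4 (fp32), at the cost of a
# small rounding error bounded by the quantization step:
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q)                   # small integers in [-127, 127]
print(max_error < scale)   # True -- the error stays below one step
```

That one-byte-per-weight storage is roughly where the 4x size reduction of a Q8 file comes from, and the coarser Q6/Q4 variants push the same trade further.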
What Makes FLUX Special?
So let’s look at some of the things that make FLUX special. First of all, check out some realistic photographic imagery. These don’t incorporate any LoRAs or other inputs (ControlNets, etc.)–just the model talking to the KSampler.

The images are basically the greatest hits of prompts that required several tries, additions, re-writes, and misfires. If you have a VERY specific image in mind, it’s going to take some time to get there, and your prompt will get longer and longer as the process continues.
That being said, the overall quality of photographic imagery is unsurpassed. When it comes to rendering people, faces, and poses, FLUX AI generates images that are stunningly realistic. Ditto for landscapes, nature images, machinery, and interiors. This is a well-trained animal, indeed.
FLUX AIn’t Perfect.
But let’s nitpick a bit. FLUX still struggles with that familiar “A.I. Look”–vaguely shiny, oversaturated, or just overly pristine. The elephant in the phony Vogue shoot looks especially… weird. And fabrics and skin tones still seem oddly airbrushed, almost as if they’re occupying a space between photography and illustration.

Working on its own, FLUX still seems to behave a little like it was trained exclusively on stock photography. Even if you ask for “candid” shots or something “shot on an iPhone,” the people tend to look like they’re straight out of central casting, with a make-up and hair crew just outside the frame.
So if you want the people in your images to look like something other than fashion models or cultural cliches, you’ll need to say so specifically. (As an aside, the FLUX Pro model purports to tackle this issue directly).
It’s also inexplicably difficult to achieve certain things… For example, in the image of the father-daughter photo in front of the pick-up truck, I requested multiple times in the prompt that they be “muddy.” It simply wouldn’t do that… I’m sure I could have gotten there eventually with the right word combination, but it can be unexpectedly time-consuming to do simple things, given that much of what these systems do seems utterly impossible.
Artwork with FLUX Models
Artwork with the FLUX models is impressive, but without a LoRA (see below!), it doesn’t quite stand up to the creativity of Midjourney.
I generally find the models to traffic in cliches, and they don’t know the names of any artists, apparently. This may be a conscious choice on the part of Black Forest Labs… naming a particular artist in a prompt is a potential copyright infringement, so maybe we should thank them for leaving it out. Midjourney seems to advertise the um… theft(?) of artistic techniques as its raison d’être.
You can get some headway by naming a particular trend or style (e.g., watercolor), but the models tend to have set ideas about the meaning of particular keywords. Longer prompts yield increasingly better results, so don’t give up at first… you can communicate with FLUX in plain English, and it does a pretty good job of composing layouts and moods as per instruction.

Using FLUX to Render Text
Another leap forward is FLUX’s capacity to handle text. You can hand it a few descriptors and a line of text, and it will generate one viable option after another, only occasionally mangling or misspelling the words (this was a pitfall of older models, and one area where Midjourney seemed to stand alone).
Check out these Logos, all generated with one-sentence prompts:

Inpainting & Outpainting Tools
One last example of the versatility of FLUX models is that they will work with inpainting and outpainting tools, so you can work on particular areas of an image, or enlarge the image and fill in the areas that are as yet un-rendered.
This tool has been available in Photoshop for a few months, so obviously this isn’t a revolutionary development, but you can do it in ComfyUI for free with a model that is exceptionally accurate and creative.
Check out these modifications to an Abe Lincoln portrait. (The grey areas are painted over and re-rendered in successive frames with the FLUX-Fill model.)

Using FLUX with ControlNets & LoRAs
Unlocking the full potential of FLUX (and putting the tools on solid footing to compete with Midjourney) requires a deeper dive into more advanced workflows where ControlNets and LoRAs become essential.
ControlNets
ControlNets are guidance tools: they take a reference image, extract its general structure, and hand that information to the main model, so you can dictate a particular composition and have FLUX work from there. Black Forest Labs released two (Depth and Canny), but there are a host of others from third-party developers.
Canny offers the model what essentially amounts to an edge filter, finding the main contours and rendering them as hard white lines.
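As a rough sketch of what that preprocessing step does, here’s a toy edge detector in plain Python. Real Canny also smooths the image, thins the edges, and applies hysteresis thresholding; this shows only the core gradient step, on a made-up 6x6 grayscale grid:

```python
# Toy edge filter: mark pixels where brightness changes sharply.
# (Illustrative only -- real Canny does smoothing, non-maximum
# suppression, and hysteresis thresholding on top of this.)

def edge_map(image, threshold=0.5):
    """Return a binary edge map from a 2D grayscale image (values 0.0-1.0)."""
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = image[y][x + 1] - image[y][x - 1]   # horizontal gradient
            gy = image[y + 1][x] - image[y - 1][x]   # vertical gradient
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[y][x] = 1   # a "hard white line" pixel
    return edges

# A dark square on a light background: edges appear around its border.
img = [[1.0] * 6 for _ in range(6)]
for y in range(2, 4):
    for x in range(2, 4):
        img[y][x] = 0.0

for row in edge_map(img):
    print("".join("#" if p else "." for p in row))
```

The resulting map of hard white lines is what the Canny ControlNet feeds to the model as a compositional skeleton.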

The depth ControlNet provides an approximate depth map.

LoRAs
A LoRA–short for “Low-Rank Adaptation”–is the ingredient that vastly expands the capacity of the FLUX toolkit and brings the versatility of the (aptly-named) base models into full bloom.
A LoRA is a sort of filter or plug-in that takes the output of the base model and modifies it according to its peculiar specialty. There are graffiti LoRAs and oil-paint LoRAs. There are LoRAs that are experts in watercolor, nature photography, impressionism, Salvador Dali, The Rock, Ariana Grande, comic book art, tarot art, and storybook illustration. And if you enjoy making titillating images of fantasy warrior princesses or anime teenagers wielding uzis, well! You have an all-you-can-eat buffet of options. Head over to CivitAI or Hugging Face to check them out.
Basically, the base model does all the heavy lifting, getting the image close to what you’re after. The LoRA steps in and provides the finishing touches, and those touches can make a HUGE difference, especially if you’re targeting a particular style.









Wrap Up
The dizzying array of offerings for AI creators points to something true of any creative undertaking: multiple tools will be required to achieve particular ends.
ComfyUI is probably the most versatile software suite to bring the various digital gadgets into one place, but I would advise anyone looking at this medium not to lose sight of the big picture.
You could use Midjourney to create a certain artistic style, bring those images into FluxGym (running inside of Pinokio) to train your own FLUX LoRA, and then use a ComfyUI network to run your LoRA with a FLUX base model (and other tools) for your final output.
In short, there is no need to discard your favorite workflows and resources because there’s a new one in the mix. The FLUX suite of models is an incredible contender when it comes to the speed and quality of the full spectrum of AI workflows, but it need not replace them. The FLUX models can simply be leveraged to improve them all.