Stability AI today launched Stable Diffusion 3 Medium, which the British startup calls its “most advanced text-to-image open model yet.”
With 2 billion parameters, SD3 Medium promises photorealistic results without complex workflows. Crucially, the model can generate these images while running on individual consumer systems.
It also overcomes common artefacts in hands and faces, Stability said.
The company built SD3 Medium to understand complex prompts involving spatial relationships, compositional elements, actions, and styles.
Typography has also been enhanced. Stability described the text generation accuracy as “unprecedented.” The company attributes these improvements to the Diffusion Transformer architecture.
Another core attraction is the model’s size. At 2 billion parameters, the model is smaller than many Stable Diffusion 3 models, which range from 800 million to 8 billion parameters.
Thanks to its low VRAM footprint, SD3 Medium is “ideal” for running on standard consumer GPUs without performance degradation, Stability said. It can also absorb nuanced details from small datasets, which enhances customisation.
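To put the parameter counts in context, here is a rough back-of-envelope sketch of how much memory the weights alone would occupy. The precisions are illustrative assumptions, not figures from Stability, and the estimate ignores everything else a real pipeline loads (text encoders, the image decoder, activations), so actual VRAM use will be higher.

```python
# Rough estimate of the memory needed just to store a model's weights.
# Assumption: every parameter is stored at the same precision.
# Not an official Stability figure; real pipelines need additional memory.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: int, precision: str = "fp16") -> float:
    """Return the size of the weights in GiB at the given precision."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


# Parameter counts quoted in the article for the SD3 family.
models = [
    ("800M", 800_000_000),
    ("2B (SD3 Medium)", 2_000_000_000),
    ("8B", 8_000_000_000),
]

for name, params in models:
    print(f"{name}: {weight_memory_gib(params):.1f} GiB at fp16")
```

By this estimate, a 2-billion-parameter model needs under 4 GiB for its weights at 16-bit precision, while an 8-billion-parameter model needs roughly four times as much.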
Christian Laforte, Stability’s co-CEO, told TNW that the startup plans to continuously improve the model.
“Stability AI will continue to push the frontier of generative AI, and will aim to retain its lead at the forefront of image generation,” he said.
Users can now test SD3 Medium via Stability’s API. The model weights are available under an open non-commercial license and a low-cost Creator License. Anyone interested in large-scale commercial use can contact the startup for licensing details.
Problems and solutions for Stability AI
SD3 Medium arrives in turbulent times for Stability.
Founded in 2020, the startup was soon acclaimed as one of generative AI’s emerging leaders. Alongside rivals Midjourney and OpenAI’s DALL-E, Stable Diffusion rose to the summit of the nascent text-to-image sub-sector. In 2022, investors valued the startup at $1bn.
Since then, however, a flurry of lawsuits and financial concerns has engulfed the business.
Artists have sued the company for training its AI models on their work without consent. Stability has also discussed a sale as it faces a cash crunch, The Information reported last month.
As the problems mounted, the company’s CEO and founder, Emad Mostaque, resigned in March. Mostaque said he was leaving to pursue decentralised AI.
The software, however, has consistently impressed. Images from SD3 Medium suggest that the performance has been further enhanced.
Further upgrades are already in the pipeline — and not just for images. According to Laforte, the company is also focusing on “multimodal efforts across video, audio, and language.”