OpenAI Opens Doors to DALL-E Text-to-Image Conversion Service • The Register


OpenAI on Wednesday made DALL-E, its cloud service for generating images from text prompts, available to the public without any waiting list. But the crowd that had gathered outside its door may have moved on.

The original DALL-E debuted in January 2021 and was superseded by DALL-E 2 in April 2022. The latest version, which offers much improved text-to-image capabilities, allowed people to sign up to use the service, but put aspiring AI artists on a waiting list – a list that hasn’t budged in the last five months for this Reg reporter. The newly opened service is called DALL-E, although it is still version 2 of the technology.

OpenAI justified the waiting list by citing the need for caution. The organization wanted to prevent users from generating violent, hateful, or pornographic images, and to prevent the creation of photorealistic images of public figures. And it has created policies to that effect, because misuse and misinformation are real concerns with machine-learning imaging technology.

“To ensure responsible use and a great experience, we will be sending invitations gradually over time,” OpenAI advised beta enrollees in April via email. “We’ll let you know when we’re ready for you.”

While OpenAI was granting access to 1,000 users per week (in May), Midjourney – an AI-powered text-to-image service – entered public beta in July. Midjourney’s Discord server, through which users interact with the service, reportedly reached around one million users by the end of July.

That was roughly the number of invites OpenAI issued at the time, following a transition to beta testing. Midjourney’s Discord server currently lists 2.7 million members, while OpenAI currently claims to have 1.5 million users.

In August, another AI image generation company released its own text-to-image model, Stable Diffusion, under a permissive CreativeML Open RAIL-M license.

The result has been a surge of interest in Stable Diffusion because people can run the code on a local computer, without worrying about fees – OpenAI and Midjourney require payment once users have exceeded their free tier allowances.

Additionally, Stable Diffusion is seen as a way to create explicit images without worrying about censorious cloud gatekeepers – whether or not those images comply with the limited (and unlikely to be enforced) restrictions of the Stable Diffusion license.

“In just a few days, there was an explosion of innovation around it,” Simon Willison, an open-source software developer, wrote in a blog post about a week after Stable Diffusion went public. “The things people are building are absolutely amazing.”

Late to the party

Barely a month later, it looks like OpenAI is late to the starting gate.

“DALL-E is open to everyone (no waiting list)!” quipped Brendan Dolan-Gavitt, an assistant professor in the Department of Computer Science and Engineering at NYU Tandon, via Twitter. “It’s amazing what a few weeks of open source competition can do ;)”

“The challenge OpenAI faces is that they are not just competing with the team behind Stable Diffusion, they are competing with thousands of researchers and engineers building new tools on top of Stable Diffusion,” Willison told The Register.

“The rate of innovation there in the last five weeks alone has been extraordinary. DALL-E is powerful software, but it’s only being improved by OpenAI themselves. It’s hard to see how they’ll be able to keep up.”

Artist Ryan Murdock (@advadnoun), who helped kick-start text-to-image AI by taking OpenAI’s CLIP model, which scores how well an image matches a text prompt, and plugging it into VQGAN, expressed a similar sentiment.

“I think OpenAI is still relevant, but DALL-E is not,” he said in a chat with The Register. “I see very few people using DALL-E in the scene because it costs money, is limited in terms of what it can or will produce, and cannot be used with exciting new research.”

Murdock also observed that the texture of DALL-E images “looks really bad because the superresolution isn’t driven by the text.”

This is an area where open source innovation has helped: among the first additions to the Stable Diffusion image generation pipeline were two code libraries, GFPGAN and Real-ESRGAN, which handle repairing AI face-rendering errors and image upscaling, respectively.

Citing the ongoing debate over the ownership of the images – many artists are unhappy that their work was used without their consent to train these models – Murdock said the ship appears to have sailed because the Stable Diffusion models now live on people’s computers. He predicts even more pushback as these AI models evolve to generate video.

Undaunted by external developments that have commoditized AI image generation, and touting more robust filtering to ensure image safety, OpenAI sees a business opportunity.

“We are currently testing a DALL-E API with several customers and are excited to soon offer it more widely to developers and enterprises so they can build applications on this powerful system,” the company said. ®

