In recent months there has been a bit of an explosion in the AI-generated art scene.

Ever since OpenAI released the weights and code for their CLIP model, various hackers, artists, researchers, and deep learning enthusiasts have figured out how to use CLIP as an effective "natural language steering wheel" for various generative models, allowing artists to create all sorts of interesting visual art merely by inputting some text – a caption, a poem, a lyric, a word – to one of these models.

For instance, inputting "a cityscape at night" produces this cool, abstract-looking depiction of some city lights:

Or asking for an image of the sunset returns this interesting minimalist thing:

Asking for "an abstract painting of a planet ruled by little castles" results in this satisfying and trippy piece:

Feed the system a portion of the poem "The Wasteland" by T.S. Eliot and you get this sublime, calming work:

You can even mention specific cultural references and it'll usually come up with something sort of accurate. Querying the model for a "studio ghibli landscape" produces a reasonably convincing result:

You can create little animations with this same method too. In my own experimentation, I tried asking for "Starry Night" and ended up with this pretty cool looking gif:

These models have so much creative power: just input some words and the system does its best to render them in its own uncanny, abstract style.

And despite the fact that the model does most of the work in actually generating the image, I still feel creative – I feel like an artist – when working with these models. It's really fun and surprising to play with: I never really know what's going to come out; it might be a trippy pseudo-realistic landscape or something more abstract and minimal. There's a real element of creativity to figuring out what to prompt the model for. The natural language input is a total open sandbox, and if you can wield words to the model's liking, you can create almost anything.

In concept, this idea of generating images from a text description is incredibly similar to OpenAI's DALL-E model (if you've seen my previous blog posts, I covered both the technical inner workings and philosophical ideas behind DALL-E in great detail). But in fact, the method here is quite different. DALL-E is trained end-to-end for the sole purpose of producing high-quality images directly from language, whereas this CLIP method is more like a beautifully hacked-together trick for using language to steer existing unconditional image-generating models.

*A high-level depiction of how DALL-E's end-to-end text-to-image generation works.*

*A high-level depiction of how CLIP can be used to generate art.*
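Concretely, the "steering wheel" works by keeping a generator frozen and nudging only its latent input so that the generated image's CLIP embedding lines up with the text's embedding. Here's a minimal numpy sketch of that optimization loop under toy assumptions: a single fixed linear map stands in for both the frozen generator and CLIP's image encoder, and a random vector stands in for CLIP's text embedding. All the names here are illustrative, not from any real library.

```python
import numpy as np

# Toy stand-ins: in real CLIP-guided art, the "generator" would be a frozen
# GAN/VQGAN and the embedding would come from CLIP's image encoder; here both
# collapse into one fixed matrix W, and text_emb plays the role of CLIP's
# text embedding for the prompt.
rng = np.random.default_rng(0)
dim_z, dim_e = 16, 32
W = rng.normal(size=(dim_e, dim_z))   # frozen "generator + image encoder"
text_emb = rng.normal(size=dim_e)     # pretend CLIP text embedding

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def grad_cosine_wrt_z(z):
    """Analytic gradient of cosine(W z, text_emb) with respect to z."""
    v = W @ z
    nv, nt = np.linalg.norm(v), np.linalg.norm(text_emb)
    # d/dv cosine(v, t) = t/(|v||t|) - (v.t) v / (|v|^3 |t|)
    g_v = text_emb / (nv * nt) - (v @ text_emb) * v / (nv**3 * nt)
    return W.T @ g_v

# The steering loop: the generator stays frozen; only the latent z moves.
z = rng.normal(size=dim_z)
z0 = z.copy()                          # keep the start for comparison
for _ in range(500):
    z += 0.1 * grad_cosine_wrt_z(z)    # gradient ascent on similarity

print(cosine(W @ z0, text_emb), "->", cosine(W @ z, text_emb))
```

In the real setup the gradient comes from backpropagating CLIP's image-text similarity score through the generator via autograd, rather than from a hand-derived formula, but the shape of the loop – frozen weights, trainable latent, similarity as the objective – is the same.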