AI Thumbnails with Stable Diffusion
Sunday, October 23rd 2022, 10:01:53 pm
So earlier today, I had a thought:
What if all of these posts had unique thumbnail images beside them?
Then reality set in:
I don’t have an illustrator or a design team. I’m just one guy with a markdown blog.
Then I had a little mini eureka moment:
What if I generated all my thumbnails using DALL-E and/or Stable Diffusion?
And so I did!
How To Generate Images Manually using DALL-E
So one can easily generate little images using OpenAI’s DALL-E API and save them with each post:
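As a sketch of what "manual" means here, a one-off script like the following could do it. This assumes the 2022-era `openai` Python package (its `Image.create` endpoint); the output directory, filename scheme, and image size are my own choices, not anything DALL-E requires:

```python
# Sketch: one-off DALL-E thumbnail generation with the `openai` Python
# package (2022-era API). Paths, size, and key handling are assumptions.
import os
import urllib.request

def thumbnail_path(slug, outdir="thumbnails"):
    """Build the output filename for a post's generated thumbnail."""
    return os.path.join(outdir, slug + ".png")

def generate_thumbnail(prompt, slug):
    """Request one image from DALL-E and download it next to the post."""
    import openai  # pip install openai
    openai.api_key = os.environ["OPENAI_API_KEY"]
    response = openai.Image.create(prompt=prompt, n=1, size="512x512")
    url = response["data"][0]["url"]
    path = thumbnail_path(slug)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    urllib.request.urlretrieve(url, path)
    return path

# Usage (needs an OPENAI_API_KEY in the environment):
#   generate_thumbnail("synth jam 6 or bitstream hymn", "synth-jam-6")
```

Repeating that by hand for every post is exactly the tedium the rest of this post tries to automate away.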
However, that process is pretty manual. The ideal would be a pipeline, or some other way to do it programmatically during the Gatsby build task (or before it).
Generating A List of Text Prompts
Regardless of which route, a list of all the blog posts needs to be generated, with their source filepaths, tags, etc., to build prompts from. Using the GatsbyJS GraphQL schema, a query like this can extract that content from all of your posts:
```graphql
query PostsTagsEtc {
  allMarkdownRemark {
    edges {
      node {
        frontmatter {
          tags
          title
          ogDescription
          slug
          date
        }
        fileAbsolutePath
      }
    }
  }
}
```
This returns some nice, structured JSON to fuel image-gen model prompts:
```json
{
  "data": {
    "allMarkdownRemark": {
      "edges": [
        {
          "node": {
            "frontmatter": {
              "tags": [
                "code",
                "update",
                "machine_learning",
                "reinforcement_learning"
              ],
              "title": "What I've Been Doing This Past Year",
              "ogDescription": null,
              "slug": "its-been-too-long",
              "date": 1646609399762
            },
            "fileAbsolutePath": "/path/to/code/website/content/blog/2022/03/06/what-ive-been-doing-this-past-year/index.md"
          }
        },
        {
          "node": {
            "frontmatter": {
              "tags": [
                "music",
                "video",
                "synths",
                "livelooping",
                "sonicpi",
                "proceduralgeneration"
              ],
              "title": "Synth Jam 6 | bitstream hymn",
              "ogDescription": null,
              "slug": "synth-jam-6-or-bitstream-hymn",
              "date": 1593132427404
            },
            "fileAbsolutePath": "/path/to/code/website/content/blog/2020/06/25/synth-jam-6-or-bitstream-hymn/index.md"
          }
        }
        // etc...
      ]
    }
  }
}
```
From here, the above results are saved as a JSON file (I’ve called it posts.json here), and then a simple Python script transforms it into an easier-to-work-with list of prompts:
```python
import json
import os

cwd = os.getcwd()
input_file_name = cwd + '/data/posts.json'
output_file_name = cwd + '/data/prompts.json'
outdir_prefix = "thumbnails"

# Read the file data
with open(input_file_name, 'r') as f:
    data = json.load(f)

output = []
for edge in data["data"]["allMarkdownRemark"]["edges"]:
    slug = edge["node"]["frontmatter"]["slug"]
    source_file_path = edge["node"]["fileAbsolutePath"]
    # This uses the slug as a prompt, but it can be any field
    prompt = " ".join(slug.split("-"))
    # Save thumbnails in a directory next to each post's index.md
    outdir = source_file_path.replace("index.md", outdir_prefix)
    output.append({
        # This is the actual prompt
        'prompt': prompt,
        'outdir': outdir
    })

# Write the output file
with open(output_file_name, 'w') as json_file:
    json.dump(output, json_file)
```
This is where most of the experimentation with better prompt-generation schemes can happen, but for the sake of example I just went with a simple transformation of my post slugs.
Afterwards, the resulting prompts.json file can be kept around and used as the source of prompts for your model.
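Run against the two posts from the query results above, the resulting prompts.json would look something like this:

```json
[
  {
    "prompt": "its been too long",
    "outdir": "/path/to/code/website/content/blog/2022/03/06/what-ive-been-doing-this-past-year/thumbnails"
  },
  {
    "prompt": "synth jam 6 or bitstream hymn",
    "outdir": "/path/to/code/website/content/blog/2020/06/25/synth-jam-6-or-bitstream-hymn/thumbnails"
  }
]
```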
Stable Diffusion Model Pipeline
For the sake of simplicity, I went with using @woctezuma’s Google Colab Notebook as a base, then updated it with some additional code to process my prompts.json file and generate images that I could download as plain files later.
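The additions boiled down to a loop like the following sketch. I’m showing it here with Hugging Face’s `diffusers` library as an assumed stand-in for the notebook’s internals; the model name and output file layout are guesses:

```python
# Sketch: feeding prompts.json into a Stable Diffusion pipeline. The
# `diffusers` library, model name, and file layout are assumptions here,
# not necessarily what the Colab notebook does internally.
import json
import os

def load_prompts(path):
    """Read the prompt/outdir pairs produced by the earlier script."""
    with open(path, 'r') as f:
        return json.load(f)

def generate_all(prompts, pipe, images_per_prompt=1):
    """Run each prompt through the pipeline and save PNGs to its outdir."""
    for entry in prompts:
        os.makedirs(entry["outdir"], exist_ok=True)
        for i in range(images_per_prompt):
            image = pipe(entry["prompt"]).images[0]
            image.save(os.path.join(entry["outdir"], f"thumb-{i}.png"))

# Usage (downloads the model; needs a GPU and `pip install diffusers`):
#   from diffusers import StableDiffusionPipeline
#   pipe = StableDiffusionPipeline.from_pretrained(
#       "CompVis/stable-diffusion-v1-4").to("cuda")
#   generate_all(load_prompts("data/prompts.json"), pipe)
```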
The first set of results didn’t look great to me, but at least they worked as a proof of concept:
Conclusion
Since I’m going for this pretty pixel-art-heavy style on the website, I don’t think I like the results enough to actually use these images from Stable Diffusion, but you can see my full notebook on Github and try it for yourself. If you don’t need “pixelart style”, then the results actually look pretty usable.
For now, I stuck to using the DALL-E generated pixel art as a small, finite library for my limited number of posts.
In the future I’d like to spend some time getting set up and familiar with InvokeAI locally to make this all run on my laptop (rather than spending money on Google AI Platform). That way, it would be easy to generate a thumbnail for each post as it’s written.
So far, InvokeAI looks really promising. Just with the demo installation and a simple prompt of:

```
invoke> a fantastic alien landscape -W576 -H512 -s60 -n4
```
I generated the following cool alien landscape images:
Now all I need to do is find a good way to give these a nice pixel-art postprocessing effect like this automatically:
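One candidate approach I may try: downscale with nearest-neighbor sampling, clamp the palette, then scale back up. A minimal sketch with Pillow, where the scale factor and palette size are guesses to tune by eye:

```python
# Sketch: a pixel-art effect via downscale + palette quantization using
# Pillow. The scale factor and palette size are guesses to tune by eye.
from PIL import Image

def pixelate(img, factor=8, colors=16):
    """Downscale, reduce the palette, then scale back up with hard edges."""
    small = img.resize(
        (img.width // factor, img.height // factor), Image.NEAREST
    )
    small = small.quantize(colors)  # clamp to a small palette
    return small.resize((img.width, img.height), Image.NEAREST)

# Usage (assuming a generated thumbnail already on disk):
#   pixelate(Image.open("thumb-0.png")).save("thumb-0-pixel.png")
```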
Alas, it’ll have to wait for the next post.
Written by Omar Delarosa who lives in Brooklyn and builds things using computers.