Holy Shift: Word Cloud Studio
As I look to be 'creative' for certain sections of my candidating portfolio, I got a bit distracted and 'vibe coded' a very cool (I think!) word cloud generator.
Geek alert!
Like many blogging platforms, Ghost (which is superb, by the way) allows users to export their articles in a pretty common JSON file. I wanted to take that export and create a word cloud. I've never really had enough content to make word clouds really 'sing', so this was a bit of a stab in the dark.
My first attempts were a bit weak:

I used a free online tool (I can't remember which), probably overshared my data, and the outcome wasn't what I had hoped. It wasn't identifying passages, wasn't applying enough weight to certain words, and it included various filler words, or stop words, from my articles. And so begins the art of word clouds...they're not straightforward.
What became very clear though, sitting with Margie and looking at this first attempt, was how we were able to connect certain words that appeared near each other. Give it a go: 'first, time, church'. 'god, people, need'. 'best, context, christ'...etc. Fascinating how it prompts thoughts. I didn't really like the output though. It wasn't 'my style'.
Cue Visual Studio Code, an AI plugin and the trend of vibe-coding...I wanted an app, my way.

It had to use either JSON or TXT files as its source, but exclude all that jazz that comes with JSON files...yet include it if the user wanted, for some reason, to cloud the lot!
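For the curious, here is a rough sketch of how that JSON-or-TXT requirement could work in Python. It is not the code from my app. The key names in CONTENT_KEYS are guesses rather than a confirmed list of Ghost's export fields, and collect_strings and load_source are made-up names for illustration.

```python
import json
from pathlib import Path

# Assumption: the JSON keys whose values count as article text.
# These names are illustrative guesses, not a confirmed Ghost field list.
CONTENT_KEYS = {"title", "plaintext"}


def collect_strings(node, wanted_keys=None, _key=None):
    """Walk parsed JSON and return the string values worth clouding.

    wanted_keys=None means "cloud the lot": every string value is kept.
    """
    if isinstance(node, dict):
        found = []
        for key, value in node.items():
            found.extend(collect_strings(value, wanted_keys, key))
        return found
    if isinstance(node, list):
        found = []
        for item in node:
            # Items in a list inherit the key the list was stored under.
            found.extend(collect_strings(item, wanted_keys, _key))
        return found
    if isinstance(node, str) and (wanted_keys is None or _key in wanted_keys):
        return [node]
    return []


def load_source(path, wanted_keys=CONTENT_KEYS):
    """Return the text to analyse from a .json or .txt file."""
    path = Path(path)
    raw = path.read_text(encoding="utf-8")
    if path.suffix.lower() == ".json":
        return "\n".join(collect_strings(json.loads(raw), wanted_keys))
    return raw  # TXT files are used as they are
```

Calling load_source("export.json") would keep only the chosen fields, while load_source("export.json", wanted_keys=None) clouds everything, slugs, IDs and all.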
I wanted some customisation for text processing. Stopwords (a new phrase for me) are words excluded from the analysis, so I needed to exclude words like 'the' and 'and', plus the words Ghost adds to the JSON as part of its export and other fillers. It also had to be flexible enough to add custom stopwords, remove stopwords (thereby including them!) and boost phrases or words by a multiplier. Riiight.
Then I thought, I need it to also detect Bible references, and if I would like those to stand out in the final output, they might need a weighting too. It became clear quite quickly that nice-looking word clouds that reflect the desired outcome are an art form that requires both mathematics and some creative massaging of data:
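To make the stopword idea concrete, here is a minimal sketch in Python. It assumes a tiny built-in stopword list and only boosts single words, not phrases. BASE_STOPWORDS and weighted_counts are hypothetical names, not what the app actually uses.

```python
import re
from collections import Counter

# Assumption: a deliberately tiny stopword list, just for illustration.
BASE_STOPWORDS = {"the", "and", "a", "of", "to", "in", "is", "it"}


def weighted_counts(text, extra_stopwords=(), keep_words=(), boosts=None):
    """Count words, skip stopwords, then apply per-word multipliers.

    extra_stopwords: additional words to exclude.
    keep_words: stopwords to put back in (thereby including them).
    boosts: mapping of word -> multiplier, e.g. {"church": 2}.
    """
    stopwords = (BASE_STOPWORDS | set(extra_stopwords)) - set(keep_words)
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in stopwords)
    weights = {word: float(n) for word, n in counts.items()}
    for word, multiplier in (boosts or {}).items():
        if word in weights:
            weights[word] *= multiplier
    # Heaviest first, ties broken alphabetically, ready to list with counts.
    return sorted(weights.items(), key=lambda item: (-item[1], item[0]))
```

The sorted list it returns is the kind of thing you could show in a table and tweak word by word before generating the cloud.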

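If you are wondering what detecting Bible references might involve, here is a rough Python sketch using a regular expression. It is only a guess at one way to do it, not the app's code. The BOOKS list is deliberately short (a real one would need every book plus abbreviations), and extract_references is a made-up name.

```python
import re

# Assumption: a short, illustrative book list, nowhere near complete.
BOOKS = ["Genesis", "Psalm", "Psalms", "Matthew", "Mark", "Luke", "John",
         "Romans", "1 Corinthians", "2 Corinthians"]

# Longest names first, so the most specific book name is tried first.
_book_alt = "|".join(re.escape(b) for b in sorted(BOOKS, key=len, reverse=True))

# Matches "Romans 8", "John 3:16" and "1 Corinthians 13:4-7".
REFERENCE = re.compile(rf"\b(?:{_book_alt})\s+\d+(?::\d+(?:[-–]\d+)?)?")


def extract_references(text, weight=1.0):
    """Return (text without references, {reference: weighted count})."""
    found = {}
    for match in REFERENCE.finditer(text):
        ref = " ".join(match.group(0).split())  # tidy up the whitespace
        found[ref] = found.get(ref, 0.0) + weight
    # Remove the references so 'john' and '3' are not counted again as words.
    remaining = REFERENCE.sub(" ", text)
    return remaining, found
```

Each reference is kept whole as a single token, and the weight parameter lets references stand out more (or less) than ordinary words.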
So why not allow my generator to also list the words used, in order, with counts, and allow adjustments to be made at a granular level? Essentially, some analysis takes place before generating:


So, looking at these, my blog up to this point used 3,273 unique tokens (words, word-combos, elements that could be 'clouded') :)
So what about weighting and layout? Well, I need to decide on the maximum number of items, the minimum and maximum font size, and what's called the curve power: essentially, how much bigger the most common words appear compared to less common ones. Word clouds are not easy:

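Here is one plausible reading of curve power, sketched in Python: scale each weight to between 0 and 1, raise it to the power, then stretch it across the font range. I am not claiming this is the exact formula the app uses. font_sizes and its default values are assumptions for illustration.

```python
def font_sizes(weights, min_font=12, max_font=96, curve_power=0.5, max_items=150):
    """Map word weights to font sizes.

    curve_power below 1 flattens the differences between words;
    above 1 it makes the most common words tower over the rest.
    """
    # Keep only the heaviest max_items words.
    top = sorted(weights.items(), key=lambda item: (-item[1], item[0]))[:max_items]
    if not top:
        return {}
    high = top[0][1]
    low = top[-1][1]
    span = high - low
    sizes = {}
    for word, weight in top:
        # Scale to 0..1; if every word weighs the same, treat them all as 1.
        scaled = 1.0 if span == 0 else (weight - low) / span
        sizes[word] = round(min_font + (max_font - min_font) * scaled ** curve_power)
    return sizes
```

With weights of 100, 25 and 0, the middle word comes out at 54 when the curve power is 0.5, 33 when it is 1, and 17 when it is 2, which shows how much that one setting changes the look.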
And last but not least, how do we want it all to look? Let's throw in some options there too:

So the outcome is something that looks a bit more like what I had imagined:

And you can give it a go yourself if you like. You can throw basically anything at it in a JSON or TXT format and tinker to your heart's content. It's specifically designed for faith-related word clouds (and so masterfully manages Bible references), but give it a go:
In three hours of distraction / focus, I've come up with an idea, tweaked and tinkered, and got my head a little around GitHub, Visual Studio Code, Render and all sorts of other things. Surface stuff only but, I think, very cool.

I'm still none the wiser about how to feature any of this in my portfolio. But I still think it's cool.