About Dis_Describer – Cold Brew Dev

The Dis_Describer Twitter bot is a Python bot that posts photos of theme parks, primarily Disney theme parks, along with captions generated by Microsoft Computer Vision.

Motivation

This project is mostly just for fun. It was inspired by the similar @citydescriber account, a love for Disney parks, and the need for a project one January afternoon.

While the AI behind Microsoft Computer Vision and similar services is quite sophisticated, theme parks, particularly Disney parks, pose a unique challenge.

You can see the AI is able to easily identify a train:

Theme parks, however, are designed to look like things other than what they are. We thus get a fun competition: the AI tries to discern what something actually is, while the theme park designers try to trick the AI into seeing something else.

The Imagineers did pretty well at building the Rock ‘n’ Roller Coaster guitar, for example:

a red guitar on a red surface pic.twitter.com/n2iZJ0rNSu
— Dis Describer (@dis_describer) January 14, 2021

Current Programming

The outline of the bot’s programming is straightforward.

The bot’s keeper maintains a folder of images. If you would like to submit a photo, please use the submissions page. Currently we are able to queue all submissions, as long as they meet the criteria listed on that page.

Data pertaining to the images are stored in an SQLite database.
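As a rough illustration of what such a database might look like, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and helper functions are all assumptions made for this example, not the bot's actual schema.

```python
import sqlite3

# Hypothetical schema: the table and column names are illustrative
# assumptions, not the bot's real layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS images (
    id        INTEGER PRIMARY KEY,
    url       TEXT NOT NULL UNIQUE,  -- where the photo is hosted
    source    TEXT,                  -- e.g. 'keeper' or 'submission'
    caption   TEXT,                  -- filled in once the photo is posted
    posted_at TEXT                   -- NULL until the photo has been posted
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_image(conn, url, source="keeper"):
    """Queue a photo; a URL that is already queued is silently skipped."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO images (url, source) VALUES (?, ?)",
            (url, source),
        )


def next_unposted(conn):
    """Return (id, url) of the oldest photo not yet posted, or None."""
    return conn.execute(
        "SELECT id, url FROM images WHERE posted_at IS NULL ORDER BY id LIMIT 1"
    ).fetchone()


def mark_posted(conn, image_id, caption):
    """Record the caption and the time the photo was posted."""
    with conn:
        conn.execute(
            "UPDATE images SET caption = ?, posted_at = datetime('now') WHERE id = ?",
            (caption, image_id),
        )
```

Keeping a nullable posted_at column is one simple way to let each run pick the next photo without tracking state anywhere else.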

The bot runs 10 times per day, at the bottom (half past) of select hours. At those times it:

Submits the URL of the hosted photo through the Microsoft Computer Vision API
Retrieves the caption
Posts the photo with the caption to Twitter via the Twitter API
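
The three steps above can be sketched roughly as follows. This is not the bot's actual code. It assumes the Computer Vision "Describe Image" REST operation at the /vision/v3.2/describe path (the API version is a guess), a placeholder endpoint and key, and a hypothetical post_tweet callable standing in for whatever Twitter client the bot uses.

```python
import json
import urllib.request


def build_describe_request(endpoint, key, image_url):
    """Build (but do not send) a POST request asking for an image description.

    The '/vision/v3.2/describe' path is an assumed API version.
    """
    return urllib.request.Request(
        url=endpoint.rstrip("/") + "/vision/v3.2/describe",
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def best_caption(response_json):
    """Pick the highest-confidence caption text, or None if there is none."""
    captions = response_json.get("description", {}).get("captions", [])
    if not captions:
        return None
    return max(captions, key=lambda c: c.get("confidence", 0.0))["text"]


def fetch_caption(endpoint, key, image_url):
    """Steps 1 and 2: submit the photo URL and retrieve the caption."""
    request = build_describe_request(endpoint, key, image_url)
    with urllib.request.urlopen(request) as response:
        return best_caption(json.load(response))


def run_once(image_url, get_caption, post_tweet):
    """One scheduled run: caption the photo, then post it.

    get_caption and post_tweet are injected callables (hypothetical names),
    so the flow can be exercised without touching either API.
    """
    caption = get_caption(image_url)
    if caption is None:
        return None  # nothing to say, so skip posting
    post_tweet(image_url, caption)  # step 3
    return caption
```

Passing the caption and posting functions in as arguments keeps the flow easy to dry-run, for example with a get_caption that returns a fixed string and a post_tweet that just prints.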

Overall, the project took a novice coder about 5 hours to complete.

Hopes For the Future

Potential goals for the project include…

Improve / automate photo sourcing. Right now photo selection is entirely manual and relies on the keeper’s photos and guest submissions, all of which are manually managed. It would be good to automate this.

Independent machine learning. The bot currently relies on the Microsoft Computer Vision API for its captions. It would be nice if it built those on its own via machine learning.

Reply reading. If the bot could read replies, it could use those replies to help it form an understanding of different images.

Discussion. The bot may even be able to develop chat functionality, allowing it to provide at least basic information about what’s in the pictures. Simple functionality might be replies to “what’s the height requirement for this ride?” More complex might be “when are wait times shortest for this ride?”