Gaël Hugo, artist in residence at Google Arts & Culture Lab, wanted to help anyone improve their doodles using AI Audio. Sketch on the canvas and receive feedback and tips for improvement. For inspiration, Google AI generates a sketch-style image based on your drawing to offer you visual cues for improvement. This first experiment focuses on the structure and style of your artwork, with more to come in the future.
The experiment uses Gemini, Google’s multimodal large language model, to analyze and understand your doodle. From that analysis, Gemini generates a script about the drawing and how to improve it, along with a prompt for generating a sketch-style image. The image prompt is then fed into an image model to generate a sketch, and the script is sent to a Google AI audio model to generate the audio in real time.
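For readers curious about how these steps fit together, the flow can be sketched roughly as below. This is only an illustration under stated assumptions, not the experiment's actual code: `DoodleAnalysis`, `Feedback`, `run_pipeline` and the three `fake_*` functions are hypothetical names, and the stand-ins return placeholder bytes instead of calling any real model. The audio is also returned in one piece here for simplicity, whereas the experiment generates it in real time.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DoodleAnalysis:
    """What the multimodal model returns in this sketch (hypothetical shape)."""
    script: str        # feedback about the drawing and how to improve it
    image_prompt: str  # prompt used to generate the sketch-style image


@dataclass
class Feedback:
    sketch: bytes  # generated sketch-style image
    audio: bytes   # generated audio for the script


def run_pipeline(
    doodle: bytes,
    analyze: Callable[[bytes], DoodleAnalysis],
    generate_sketch: Callable[[str], bytes],
    synthesize_audio: Callable[[str], bytes],
) -> Feedback:
    """Chain the three steps: analyze the doodle, then produce image and audio."""
    analysis = analyze(doodle)                       # multimodal model step
    sketch = generate_sketch(analysis.image_prompt)  # image model step
    audio = synthesize_audio(analysis.script)        # audio model step
    return Feedback(sketch=sketch, audio=audio)


# Placeholder stand-ins so the sketch runs without any model access.
def fake_analyze(doodle: bytes) -> DoodleAnalysis:
    return DoodleAnalysis(
        script="Try firmer outlines.",
        image_prompt="pencil sketch of the doodle",
    )


def fake_generate_sketch(prompt: str) -> bytes:
    return f"<sketch:{prompt}>".encode()


def fake_synthesize_audio(script: str) -> bytes:
    return f"<audio:{script}>".encode()


feedback = run_pipeline(
    b"raw doodle bytes", fake_analyze, fake_generate_sketch, fake_synthesize_audio
)
```

Passing the three steps in as functions keeps the sketch independent of any particular model or service; the image and audio steps depend only on the analysis, so they could also run side by side.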