Camera makers have used some flavor of artificial intelligence in point-and-shoot cameras in a limited way for many years now (think “auto scene modes”), but Google has taken AI and photography to an entirely new level with the introduction of the Google Clips camera. It’s the first completely independent, on-device, machine-learning-based camera that can be taught to focus on the people who are important to you. Clips is a hands-free camera that you can clip to a shelf, wear on your shirt, or set on a table, and it will find the shot.
Josh Lovejoy of Google wrote in a recent blog post that Google looked “across products to see how machine learning can stay grounded in human needs while solving for them.” Capturing great photos of friends and family, automatically, turns out to be one of those needs. So the company created Google Clips, an intelligent camera designed to capture candid photos and animated GIFs of familiar people and pets.
To help train the camera, the team picked the brains of a documentary filmmaker, a photojournalist, a fine-arts photographer, and a wedding photographer. “We began gathering footage from people on the team and trying to answer the question, ‘what makes a memorable moment?’” notes Lovejoy. A three-year process, which also included teaching the AI system what a “bad” image looked like by feeding it deliberately blurry, poorly composed, and poorly exposed images, resulted in the Google Clips camera.
Lovejoy describes the motivation for the product this way: “What if we could build a product that helped us be more in-the-moment with the people we care about? What if we could actually be in the photos, instead of always behind the camera? What if we could go back in time and take the photographs we would have taken, without having had to stop, take out a phone, swipe open the camera, compose the shot, and disrupt the moment? And, what if we could have a photographer by our side to capture more of those authentic and genuine moments of life, such as my child’s real smile? Those moments which often feel impossible to capture even if one is always behind the camera. That’s what we set out to build.”
The Google development team also taught the camera to recognize—and avoid—bad photos by feeding it samples of bad pictures.
Is it perfect? No. But the development team anticipated that. “Unlike traditional software development, Machine Learning systems will never be ‘bug-free’ because prediction is an innately fuzzy science. But it’s precisely this fuzziness that makes Machine Learning so useful,” notes Lovejoy.