Christian Thurau (Co-Founder, CBO), Ingo Bax (Co-Founder, CTO) and Florian Hoppe (Co-Founder, COO). Not pictured: Roland Memisevic (Co-Founder, Chief Scientist). © Anne Schönharting
What inspired you to found your startup?
Video is the world’s largest generator of data, captured by hundreds of millions of cameras. It follows that the pursuit of intelligent machines requires substantial advances in computer vision to analyze all the video data being produced around the world. The key moment that gave us the idea to start a business was the realization that the rate-limiting factor for advancing video understanding is a lack of the high-quality, labeled video data required to train deep neural networks.
How do you define success for yourself and your company?
Solving video understanding from a technical perspective and getting a great product off the ground and deployed in the hands of millions of users.
Is there anything you’d do differently if you could do it again?
Focus on one idea and reject opportunities outside your area of focus. There is nothing more important for a startup than staying focused in the early days.
What problem does your product solve?
Our software is a machine learning system that takes the raw pixels of a video feed as input and outputs a textual description in real time whenever an action of interest occurs. Trained on a large proprietary video database, our system delivers more accurate scene descriptions than competing offerings. It works with any camera sensor, whereas competing solutions rely on 3D sensing technologies to add depth information to the video sequence.
How are you different?
Our technological breakthroughs have been enabled by novel ways to leverage the crowd to create high-quality video data. We instruct crowdworkers to record short video clips based on carefully predefined and highly specific descriptions. This form of “crowd acting” allows us to generate large amounts of densely labeled, meaningful video segments at low cost. It enables us to build software that can detect not just objects, but also complex actions, behaviours and relationships between objects.
How often does your product/service show up in a user’s day or week?
An increasingly large number of consumers and enterprises leverage cameras and monitoring systems to record videos. We are looking to translate that content into actionable insights and information. We envision a future where our users engage with our services multiple times a day.
Impact: how are you doing good and building a better future?
We believe that video understanding will have a significant impact on the future of A.I. and, as a consequence, on the role of machines in society. Visual data is central to the way we humans learn about the world. The same is true of machines. Imagine how much we could improve the care of our aged loved ones if it were possible to install a handful of smart camera devices in fixed locations to monitor changes in seniors’ activities, aid their memory, and ultimately improve their health.
How has the startup scene in Berlin changed?
The Berlin tech scene has matured considerably. A decade ago there were only a few dozen tech startups in Berlin. Now there are close to three thousand. It’s an exciting time to found a machine learning company in Berlin.
What are the pros and cons of launching your startup from Berlin?
Berlin, with its increasingly vibrant startup community, has turned out to be a great place for attracting and training strong talent in both deep learning research and engineering. The war for talent is less intense in central Europe than in London or San Francisco, which has given us an edge over startups based in those ecosystems. Berlin is also more affordable, which has allowed us to operate frugally.
Twenty BN was featured in “The Hundert Vol. 10 – Startups of Berlin”, October 2017.