After over two years, our company has finally unveiled our hard work. We set out to rethink mobile search, and the result is Ozlo, an intelligent conversational AI. With this step we join a number of technology startups and behemoths alike who believe that an emphasis on language and conversation, rather than more pixels and pointers, is the interface of the future.
Ozlo is a focused product that helps you find food and drink via an interface that feels like text messaging. You type what you want, and through a directed conversation you iteratively home in on something delightful. You can get a higher-level overview of the product on our blog, and you can sign up today for our invite-only beta.
Ozlo is different from anything else that exists today. He's different in two important ways. The first is focus. We have built a conversational product that attacks the problem of multi-turn dialog in human language head on. Conversation is not a series of disconnected, fully formed sentences, but rather a lackadaisical and iterative process whereby speakers discover their desires as they express them. A deep commitment is required to build a product that interacts with people the way they actually communicate - and this is the challenge that we've accepted.
The second key difference is the deep technology that supports the product. Over the past two years, in order to achieve our ambitions we've had to build a lot, including a custom search engine, a graph database, a query language for interrogating the former, a term-wise pattern language, a grammar that resembles a probabilistic context-free grammar, a natural language understanding layer, a knowledge base, an ingestion and resolution pipeline and a distributed execution framework capable of creating probabilistic structured knowledge out of LOTS of chaotic signal, an inference pipeline capable of extracting knowledge from unstructured prose, a discourse system that feels like interactive fiction, a declarative rendering language, a svelte iOS app to render it, and so many things in between.
As each piece was constructed, we took considerable care not to over-build. We've asked repeatedly whether this is all really necessary to create a satisfying product that is based on human conversation. We've found that it is. For example, there really isn't a graph database in the world today that has the level of geographic awareness we need and can perform intersections measured in microseconds. This is necessary when you need to interrogate your world knowledge hundreds or thousands of times to gain confidence that you understand what in the world the user is talking about for each utterance.
Once this technology stack is realized, it becomes possible in hours to surface new knowledge in response to users' desires. It becomes possible in hours to learn new classes of expressions that people actually say. It becomes possible in weeks to learn how to talk about entirely new categories of things.
With this technology stack you have the ability to understand that a "Moscow Mule" is a type of cocktail. You also have the ability to differentiate between places you know have it on the menu and places you think probably serve a mule. The combination of concrete known facts about the world with abstract knowledge about how concepts interrelate makes magical and natural conversation possible.
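To make the distinction between known facts and probable inferences concrete, here is a minimal sketch. It is not Ozlo's actual knowledge base: the `IS_A` and `TYPICALLY_SERVES` tables, the `Place` type, and the `evidence` function are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical abstract knowledge: each concept maps to the thing it is a kind of.
IS_A = {
    "moscow mule": "cocktail",
    "cocktail": "drink",
}

# Hypothetical abstract knowledge: what a category of place typically serves.
TYPICALLY_SERVES = {
    "cocktail bar": {"cocktail"},
}


@dataclass
class Place:
    name: str
    categories: set = field(default_factory=set)  # e.g. {"cocktail bar"}
    menu_items: set = field(default_factory=set)  # concrete, known facts


def ancestors(concept):
    """Yield the concept itself, then everything it is a kind of."""
    while concept is not None:
        yield concept
        concept = IS_A.get(concept)


def evidence(place, item):
    """Classify how strongly we believe `place` serves `item`."""
    if item in place.menu_items:
        return "known"  # the item is on the menu we have on record
    kinds = set(ancestors(item))
    for category in place.categories:
        if TYPICALLY_SERVES.get(category, set()) & kinds:
            return "probable"  # inferred from what this kind of place serves
    return "unknown"
```

In this toy model, a cocktail bar with no recorded menu is still a "probable" match for a Moscow Mule, because a mule is a kind of cocktail and cocktail bars typically serve cocktails.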
Ozlo is different. He's a product focused on dialog, how humans actually talk, supported by deep tech and big data.
The power of conversational UI
Intuitive visual UI design is deep - a true art form. There are many simple questions, however, that don't translate well into a visual form. For instance, how might you design a UI that lets you search for restaurants based on their hours? Sure, there is an answer, but it is yet another widget that makes an interface feel heavy and overwhelming. With language, the answer is light, apparent and intuitive - "bars open after 10pm near evans & university". There is no complicated UI that requires multiple taps and interactions. Instead, you just do what you've been doing since your first birthday: speak.
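As a rough illustration of how much structure that one short phrase carries, the sketch below turns it into a structured query. This is a toy regular-expression parser written for this example only, not the grammar or understanding layer Ozlo uses; the `Query` type, the `PATTERN` expression, and the `parse` function are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass
class Query:
    category: str
    open_after: Optional[time] = None
    near: Optional[str] = None


# Toy pattern: "<category> [open after <hour>am|pm] [near <location>]"
PATTERN = re.compile(
    r"^(?P<category>.+?)"
    r"(?:\s+open after\s+(?P<hour>\d{1,2})(?P<ampm>am|pm))?"
    r"(?:\s+near\s+(?P<near>.+))?$"
)


def parse(utterance):
    """Return a Query for the utterance, or None if nothing matches."""
    match = PATTERN.match(utterance.strip().lower())
    if match is None:
        return None
    open_after = None
    if match.group("hour") is not None:
        # Convert a 12-hour clock reading to a 24-hour time.
        hour = int(match.group("hour")) % 12
        if match.group("ampm") == "pm":
            hour += 12
        open_after = time(hour, 0)
    return Query(match.group("category"), open_after, match.group("near"))


query = parse("bars open after 10pm near evans & university")
```

Here `query` holds a category of "bars", an opening time of 22:00, and the location text "evans & university": three constraints that a visual UI would typically need three separate widgets to collect.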
Using language as an interface makes extremely advanced features usable. This is exciting because it lifts so many limitations from our technology. Features that were once condemned as too complicated for all but the most advanced users now become expected by the most novice. Language changes everything.
High hopes and key decisions
I have high hopes for Ozlo because of the key decisions we've made in building him. Specifically, we've focused on high-leverage technology - avoiding the temptation of a flashy and shallow product that can't grow.
One pragmatic decision we made was our willingness to leverage every affordance available to us in the app to give a great experience. Despite the fact that dialog is our company focus and the thematic underpinning of our product, we complement that with whatever visual affordance works with users. This includes things like context-sensitive response suggestions throughout the conversation to constantly reinforce the breadth of language we do understand. We do this while trying to predict your next desire, putting it under a single tap.
Another helpful choice we made was to decouple speech recognition from language understanding. Our belief is that fantastic domain-aware speech recognition is not a fundamental requirement for a solid conversational UI. By setting speech recognition aside, we are able to focus all our efforts on building a foundation.
Similarly, our ambition extends far beyond just food. Our hope is that we will soon be able to widen the product, but the restraint to focus on a single domain is what has let us really build the technology that will allow us to expand. Starting with food is starting with the hardest domain: restaurants come and go every day, there are a LOT of them, and there is no authoritative data about them available - the data is noisy and scattered all over the place. Finally, questions about food are typically geographical and temporal and shaded with personal taste. This makes restaurants a beautiful playground where you can learn a lot, and these are learnings that directly translate to other topics.
Finally, while the above is true, the bit of dogma and belief that we do not compromise on is the assertion that solid language comprehension is the future. We have refused to use human intermediaries, as we wanted from day one to take this challenge head on. This has in turn focused our time investment into our ability to comprehend, and a majority of the technology we've created is in service of the ability to hold a conversation.
As Ozlo learns and grows, his comprehension will continually improve, as will the things he knows about, the types of conversations he can have, and the actions he can take. The visual UI will fall away more and more, and what will be left is a sidekick who can help you find what you want faster. You will be able to make faster and better decisions, to take action, and to re-pocket your phone and lift your eyes back to the world.
Try the public beta
The Ozlo that exists today is only a taste of what we think we can accomplish. The team is insatiable in their desire to refine, hone, improve and expand Ozlo. The time has come, however, for us to get Ozlo in people's hands - your hands - so we can begin to shape him based on how people actually want to use him.
Our commitment is to watch, and learn, and act - releasing new features and improvements frequently based on what our users tell us. So join us, give Ozlo a try, and let's see how far we can go.