AI-Powered Virtual Assistant For Ideation Sessions
Waverley has partnered with a company of thinkers and innovators to implement a new kind of virtual assistant that can turn the idea-creation process upside down.
The customer is a group of innovators united by the idea of a product to help people and companies drive their creative processes. Inspired by the capabilities of virtual smart assistants like Alexa, Siri, and Google Home, they came up with a concept of a virtual smart assistant that can work as an Ideation Facilitator. Generally, the role of Ideation Facilitator is filled by a person who is involved in group or individual brainstorming sessions. This person directs the flow of a discussion so that it naturally results in an executable idea, doable action items, or effective problem solving. The customer suggested this role could be well accomplished by an AI-powered machine.
The client envisioned the end product as an Ideation Tool in the form of a physical smart speaker that acts as a virtual Ideation Facilitator. It should be able to understand and produce human speech so that it can substitute for a live person and hold a natural conversation on any topic. By asking relevant directive questions and using a pack of creativity tools and brainstorming techniques, this virtual facilitator must guide the user towards creative ideas or brand-new conclusions that can help the user invent and develop something novel. Moreover, the Ideation Tool has to assist in validating the idea, identifying its business value, checking its novelty, filling out patent submission forms, and carrying out other activities related to bringing concepts to reality.
For Waverley engineers, this turned out to be a proof-of-concept R&D project with lots of challenges and a continuous search for improvement. We kicked off with a prototyping phase and a limited budget, which meant using some ready-to-use solutions. We chose the Google tech stack as the most advanced and suitable toolkit for real-time speech recognition and natural language processing, including Google Cloud Platform for hosting (Kubernetes, Cloud SQL), Speech-to-Text and Text-to-Speech, Dialogflow, and the Google Home smart speaker as the product hardware.
As the client brought us new ideas and requirements, our software engineers realized that some of the ready-to-use solutions were not a good fit. For example, the Google Home smart speaker with Google NLP services cannot be used for continuous dictation due to privacy restrictions. Also, the Dialogflow service does not process speech that converts to more than 256 characters, and we did not want our product to put such a restriction on target users. How would users count how many characters they pronounce? Moreover, the goal was to let the user speak for as long as they want. Meeting this need would mean developing custom software from scratch for a piece of bare smart speaker hardware.
As a result of our research, we pivoted away from the Google Home solution with its NLP services and Dialogflow. The idea of a smart speaker was temporarily put on hold when the client realized that developing custom embedded software requires more time, effort, and financial resources than they’d hoped. Thus, we shifted our main focus to the development of a web application that supports both chat and voice interfaces, creativity tools, idea storage, collaboration, patent search, and, as the outcome of the process, patent form submission. It is designed to function as a supporting tool for a live ideation facilitator rather than as a substitute for a human being.
Feature Development
At the current development stage, the product has a set of features that perform the main functions commissioned by the client:
- 1. Chatbot
Chatbot with voice recognition and synthesis ability. This feature is implemented with the help of Google Speech-to-Text and Text-to-Speech services in conjunction with our custom back-end algorithm for improved pause detection and continuous dictation ability. This was an effective solution to the problem of speech length restriction and poor pause detection. Now, the user may speak for an unlimited time, make pauses to take a breath, and finish their thoughts. Meanwhile, the app will listen and convert it all to text (with mostly accurate punctuation).
- 2. Machine Learning
The ability of the chatbot to interpret the user’s input and find relevant output. We faced a huge challenge in the form of an unusual requirement: the computer system needs to be able to understand, from an arbitrary query, which knowledge domain a user wants to explore. Likewise, it needs to be able to provide relevant responses. Machine Learning algorithms were developed to solve some of these tasks. As a result, the system can ask general directive questions, classify the user’s arbitrary answers as “yes” or “no”, and provide relevant word associations in response.
- 3. Brainstorming Methods
A set of creativity tools including the brainstorming methods of associations and Osborn’s Checklist (aka SCAMPER). During brainstorming sessions, facilitators mix a number of brainstorming methods and creativity tools in order to guide the discussion into a productive flow of thoughts and idea generation. The application relies on a predefined workflow to choose the right creativity tool, and on the ML algorithms to provide the word associations. What is more, the user is provided with two types of associations, random and relevant ones, as the client stressed the importance of both types in the creativity process.
- 4. Ideas Storage
Ideas storage. Apart from the chat flow, the user can see a list of their own and shared ideas on the left and, by picking one, see the entire previous discussion of that idea in the chat flow along with its filled-out patent submission form. As for data security, we offer reliable data encryption in addition to Google Cloud’s strong security options.
- 5. Patent Submission
Automatic completion and submission of required forms. On the right-hand side of the chat flow, the user will see a patent submission form for each idea that is automatically filled out by the Ideation Tool. It also retains the basic information about an idea, such as name/topic, abstract, business value, novelty, and other details.
- 6. Patent Search
Patent search. Using the Google Patent Search service, we also enabled the application to find patents related to the user’s idea and provide links to their detailed descriptions.
- 7. Collaboration
Collaboration and admin settings. Each user is assigned a role in the app (e.g. organization admin, ideation manager, user inventor, etc.), and each role carries its own set of permissions, such as managing users, ideas, roles, and permissions, or importing ideas. By granting other people from their organization access rights to view and comment on their ideas, users can collaborate on the same ideas and share thoughts.
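The continuous dictation described under the Chatbot feature can be illustrated with a minimal sketch. It assumes the speech recognizer returns timestamped text segments and that a silence longer than a chosen threshold marks the end of a thought. The `Segment` structure, the `merge_utterances` function, and the 2-second threshold are hypothetical and are not taken from the actual back-end algorithm.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One recognized chunk of speech (hypothetical structure)."""
    text: str
    start: float  # seconds from the start of the session
    end: float


def merge_utterances(segments, max_pause=2.0):
    """Join segments into utterances; a gap longer than max_pause ends one."""
    utterances = []
    current = []
    last_end = None
    for seg in sorted(segments, key=lambda s: s.start):
        # A long silence closes the current utterance before the new segment.
        if current and seg.start - last_end > max_pause:
            utterances.append(" ".join(current))
            current = []
        current.append(seg.text.strip())
        last_end = seg.end
    if current:
        utterances.append(" ".join(current))
    return utterances


segments = [
    Segment("I have an idea for a bike lock", 0.0, 2.4),
    Segment("that opens with a fingerprint", 3.1, 5.0),  # short breath
    Segment("and it should be waterproof", 9.5, 11.2),   # long pause
]
print(merge_utterances(segments))
```

With the sample data, the short breath keeps the first two segments in one utterance, while the longer silence starts a new one. A lower threshold makes the assistant respond sooner but risks cutting the speaker off mid-thought.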
Waverley’s Data Science specialists were tasked with two major challenges in the project:
- Information retrieval from open-ended content. Using the word association creativity tool, the user may provide input on any existing subject domain, and the system has to detect the context and provide a relevant answer. We chose the Wikipedia database as the most ample open source of content that has practically no contextual limits. Using the semantic similarity search method and the approximate similarity search algorithm, the system is able to analyze the top 100 most relevant Wikipedia articles, find the most frequently used words and phrases, and provide them to the user as relevant associations for their query. These associations are rendered in the form of a word cloud with clickable words: each word redirects the user to the patent search and wiki search pages.
- Binary NLP text classification. In order to engage in an effective discussion, the chatbot has to clarify whether it is moving in the right direction. To this end, we needed to implement sentiment analysis of the user’s speech, that is, to classify the user’s answer to a question as either positive or negative. With the help of the ML algorithm we developed, the system is able to classify arbitrary input as either a “yes” or a “no” answer.
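The word association retrieval described in the first item above can be sketched on a toy scale. This is a simplified stand-in, not the production approach: it swaps the semantic and approximate similarity search over Wikipedia for plain bag-of-words cosine similarity over three hard-coded sentences. The function names, the stopword list, and the sample texts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a real NLP resource).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is",
             "for", "with", "on", "by", "that", "are"}


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def associations(query, articles, top_articles=100, top_words=5):
    """Return the most frequent words in the articles most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, doc), doc)
              for doc in (Counter(tokenize(a)) for a in articles)]
    # Keep only articles that share at least one word with the query.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    counts = Counter()
    for _, doc in scored[:top_articles]:
        counts.update(doc)
    for word in q:  # the query words themselves are not useful associations
        counts.pop(word, None)
    return [word for word, _ in counts.most_common(top_words)]


articles = [
    "A bicycle lock is a security device used to deter bicycle theft.",
    "Bicycle theft is the crime of stealing a bicycle; a strong lock and chain deter theft.",
    "Sourdough bread is made by fermenting dough with wild yeast.",
]
print(associations("bicycle lock", articles, top_words=2))
```

In a real deployment, the word-count vectors would be replaced by text embeddings and the linear scan by an approximate nearest-neighbor index, which is what makes a corpus the size of Wikipedia searchable in interactive time.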
The tech stack for this task is based on Python and includes NumPy, scikit-learn, NLTK, and the fastText library.
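To make the binary yes/no classification more concrete, here is a minimal sketch in plain Python. The source does not say which model was used, so this example substitutes a small multinomial naive Bayes classifier with add-one smoothing. The training phrases, the `YesNoClassifier` class, and its methods are invented for illustration only.

```python
import math
import re
from collections import Counter

# Made-up training phrases; a real system would need far more data.
TRAINING = [
    ("yes", "yes"), ("yeah that sounds right", "yes"), ("sure go ahead", "yes"),
    ("absolutely correct", "yes"), ("that is right", "yes"),
    ("no", "no"), ("nope not really", "no"), ("that is wrong", "no"),
    ("not what i meant", "no"), ("no way", "no"),
]


def tokens(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class YesNoClassifier:
    """Multinomial naive Bayes over bag-of-words counts (illustrative only)."""

    def __init__(self, examples):
        self.word_counts = {"yes": Counter(), "no": Counter()}
        self.doc_counts = Counter()
        for text, label in examples:
            self.word_counts[label].update(tokens(text))
            self.doc_counts[label] += 1
        self.vocab = set(self.word_counts["yes"]) | set(self.word_counts["no"])

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokens(text):
                if word in self.vocab:
                    score += math.log(
                        (counts[word] + 1) / (total_words + len(self.vocab))
                    )
            scores[label] = score
        return max(scores, key=scores.get)


clf = YesNoClassifier(TRAINING)
print(clf.predict("yeah sure"), clf.predict("nope"))
```

A bag-of-words model like this ignores word order, so it can misread negated phrases such as "that is not right". Handling such answers reliably is one reason to move to richer features or pretrained word vectors.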
Initially, the project did not involve any BA expertise, but the client soon realized the need for a Business Analysis expert to increase the overall business efficiency of the project. At this point, the main task of our business analyst is to find ways to make the chatbot’s discussion flow with the user as natural as possible. To reach this aim, we are working out an effective workflow for the chatbot. It should be able to provide guidance in the form of subtle, unobtrusive, organic responses and questions. In addition, the system should switch smoothly between the brainstorming tools and methods it has in its toolkit.
As we work with the client in short development cycles, we communicate with the subject matter expert, the actual ideation facilitator, to discuss any improvements that can be made. As a result, we make changes, add and try out new tools and functions, and decide what to preserve and what to omit. We are still working on increasing relevance and improving question categorization and labeling, so that the system gets better hints about the discussion flow.
We have built an AI-powered product capable of facilitating individual and group brainstorming sessions. The application is designed to work in several phases: initial data collection, ideation with creativity tools and brainstorming methodologies, detail clarification, and idea submission form creation to start the patenting process. It can recognize and analyze voice input and provide relevant output as text and synthesized speech, and it offers idea storage and collaboration capabilities. A set of additional features is still in development, for example a built-in demonstration of the app functionality, an even more natural discussion flow, improved pause detection algorithms, and question labeling. We are also looking forward to resuming our work on custom embedded software to implement the initially planned smart speaker vision.