Baobab Tech Solutions  |  June 21, 2023

Unhiding knowledge: combining AI tech to re-imagine knowledge sharing

Knowledge is lost because information is hidden in vast amounts of data, documents, webinars, conversations and unpublished experiences. It is inaccessible because the content is too technical, is written in a foreign language, is in the wrong format, sits on page 134 of a report that no one has the time to read, or is experience that was never shared. Let's explore how we can leverage advancements in AI to build comprehensive knowledge systems that serve entire sectors, ultimately democratizing knowledge.

Rethinking knowledge management with AI

Traditionally, knowledge management and databases have been restricted to specific formats and structures such as libraries and searchable FAQs. I can search such a library and see that there is a webinar on my topic, but you won't convince me to watch all 60 minutes of it, even at 2x speed, to find the one nugget of information that is valuable to me.

Today, we need to shift this paradigm and use the power of AI to democratize access to knowledge. Artificial Intelligence, especially the increasingly sophisticated Large Language Models (LLMs), is revolutionizing the way we capture, store, and retrieve knowledge, making it more accessible than ever before.

Unveiling the hidden: AI in information capture

Every day, vast amounts of knowledge are hidden in PowerPoint presentations, webinars, research papers, reports, field notes, and news articles. There's a wealth of insight buried on page 276 of a report or at minute 47 of a two-hour webinar from an expert, not to mention the valuable data and experience gathered by professionals. These knowledge fragments are often lost or overlooked because of accessibility and format challenges.

AI, however, can remedy this by parsing information, translating, summarizing, classifying, structuring and storing it in a form that is easily searchable, not merely by keywords but through semantic comparison. Thus, even if your question shares no exact words with a document, an AI-enhanced database can retrieve that document when its meaning is similar.
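To make "semantic comparison" a little more concrete, here is a minimal sketch of how it is commonly done: texts are turned into vectors (embeddings) and compared by cosine similarity. The three-number vectors below are hand-made for illustration only; a real system would get them from an embedding model, and the document titles are invented.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hand-made toy "embeddings" (assumption: a real embedding model would produce these).
documents = {
    "Handpump repair guide": [0.9, 0.1, 0.0],
    "Fixing broken borehole pumps": [0.8, 0.2, 0.1],
    "Handwashing campaign report": [0.1, 0.9, 0.2],
}

def semantic_search(query_vector, docs, top_k=2):
    """Return the titles of the top_k documents closest in meaning to the query."""
    scored = [(cosine_similarity(query_vector, vec), title) for title, vec in docs.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_k]]

# Pretend embedding of the question "how do I mend a water pump?"
query = [0.85, 0.15, 0.05]
results = semantic_search(query, documents)
print(results)
```

The question never uses the words "repair" or "borehole", yet both pump documents rank above the handwashing report because their vectors point in a similar direction.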

Apart from documents and data, another significant source of hidden knowledge is the exchange between experts and clients. Valuable insights can be derived from these conversations, contributing to the larger knowledge base while maintaining information privacy and confidentiality.

Information retrieval and transformation

AI can also transform how we retrieve and interpret information. Whether it's translating a foreign language document, converting a data table or unstructured data into a visual graph, or simplifying complex technical terms, AI tools are redefining the way we understand and interact with knowledge.

Understanding inquiry intent

AI can enhance the user experience by discerning the intent behind technical inquiries, thus improving the relevance and applicability of responses. This allows users to ask questions (and get a response) in their own language, in their own style and at the level of complexity or simplicity they need, removing barriers to accessing knowledge.

Feedback systems: continuous improvement and knowledge augmentation

AI-powered systems can also incorporate feedback mechanisms. As users interact with the system and corrections or additions are made based on expert input, the knowledge base becomes smarter and more comprehensive over time. This creates a dynamic, evolving knowledge ecosystem that continuously improves, benefiting all users and the sector as a whole.


The advancements in AI are revolutionizing how we manage and access knowledge, unearthing valuable insights hidden in the vast data landscape. By leveraging AI, we can build a sector-wide knowledge and support system that makes knowledge more accessible, more understandable, and more valuable to everyone in the sector.

A sneak peek into WASH AI

Behind the scenes of our approach to unearthing hidden knowledge is WASH AI, an initiative by Baobab Tech designed to exploit the potential of AI in the Water, Sanitation, and Hygiene sector. This part might get a bit technical so feel free to ask GPT to simplify it or explain it to you ;-)

WASH AI uses multiple layers of software and Large Language Models (LLMs) as well as a technique known as Retrieval Augmented Generation (RAG), which combines the benefits of pre-trained language models and information retrieval. The full WASH AI system is complex, but it is built around two main modules.


The first module is a Parsing Module that ingests publicly available content from various sources like web pages, video transcripts, documents, reports, field notes, and audio files. It breaks down the content into manageable chunks, classifies them, transforms them, and converts them into vector embeddings. These vectors are then stored in a vector database with metadata about these information chunks.
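For readers who want to see the shape of such a pipeline, here is a minimal sketch of the chunk, embed and store steps. It is not the WASH AI code: `chunk_text`, `toy_embed`, `InMemoryVectorStore` and `ingest` are hypothetical names, the hashed bag-of-words "embedding" is a stand-in for a real embedding model, the in-memory list is a stand-in for a real vector database, and the classification and transformation steps are left out.

```python
import hashlib
import math
import re

def chunk_text(text, max_words=50, overlap=10):
    """Split text into overlapping windows of at most max_words words."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks

def toy_embed(text, dims=64):
    """Stand-in embedding: hashed bag-of-words, normalised to unit length."""
    vec = [0.0] * dims
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

class InMemoryVectorStore:
    """Stand-in for a vector database: a list of vectors plus metadata."""
    def __init__(self):
        self.records = []

    def add(self, text, metadata):
        self.records.append({"text": text, "vector": toy_embed(text), "metadata": metadata})

def ingest(store, text, source, content_type):
    """Chunk one piece of content and store each chunk with its metadata."""
    chunks = chunk_text(text)
    for index, chunk in enumerate(chunks):
        store.add(chunk, {"source": source, "type": content_type, "chunk_index": index})
    return len(chunks)

store = InMemoryVectorStore()
# 120 placeholder words standing in for a real report.
count = ingest(store, "word " * 120, source="hypothetical-report.pdf", content_type="report")
print(count, "chunks stored")
```

The overlap between neighbouring chunks is a common design choice so that a sentence cut at a chunk boundary still appears whole in at least one chunk.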

The second module is the User Inquiry Module. This part of the system is designed to interact with the user. It classifies the user's inquiry to understand its intent. In some instances, this process involves translating the question into English, the primary language used for the vector database and for semantic comparisons. This classification could include contextual focus (geography, research intention, technical troubleshooting intention, general learning intention, knowledge contribution intention, etc.).
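As a rough illustration of inquiry classification, the sketch below sorts a question into one of the intentions listed above. The keyword lists and the `classify_inquiry` function are assumptions made for this example; a production system would more likely ask an LLM to do the classification, and nothing here describes how WASH AI actually does it.

```python
# Hypothetical keyword lists; a real system would more likely use an LLM classifier.
INTENT_KEYWORDS = {
    "technical_troubleshooting": ["broken", "fix", "repair", "leak", "not working"],
    "research": ["study", "evidence", "data on", "literature"],
    "knowledge_contribution": ["i would like to share", "our experience", "we found"],
}
DEFAULT_INTENT = "general_learning"

def classify_inquiry(text):
    """Return the intent whose keywords appear most often, or the default intent."""
    lowered = text.lower()
    scores = {
        intent: sum(keyword in lowered for keyword in keywords)
        for intent, keywords in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else DEFAULT_INTENT

print(classify_inquiry("Our handpump is broken, how do we repair it?"))
print(classify_inquiry("What is community-led total sanitation?"))
```

The detected intent can then steer what happens next, for example step-by-step repair guidance for a troubleshooting question versus a broader overview for a general learning question.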

Once the inquiry is classified, the system engages in various agent tasks. It retrieves information from the semantic databases, decides which elements to prioritise, and if necessary, summarises, simplifies, and generates visualisations based on data sets. It provides information back to the user in their language and offers to explore further.
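The prioritisation and response steps can be sketched roughly as follows, assuming the retrieval step has already attached a similarity score to each chunk. The functions `prioritise` and `build_prompt`, the score threshold, the word budget and the sample chunks are all invented for illustration; the resulting prompt would then be sent to an LLM, which is not shown.

```python
def prioritise(scored_chunks, min_score=0.5, max_words=60):
    """Keep the best-scoring chunks that clear the threshold and fit a word budget."""
    selected, used = [], 0
    for chunk in sorted(scored_chunks, key=lambda c: c["score"], reverse=True):
        if chunk["score"] < min_score:
            break  # chunks are sorted, so everything after this is weaker still
        length = len(chunk["text"].split())
        if used + length > max_words:
            continue  # skip chunks that would overflow the budget
        selected.append(chunk)
        used += length
    return selected

def build_prompt(question, chunks, answer_language):
    """Assemble a retrieval-augmented prompt from the selected chunks."""
    sources = "\n".join(
        f"[{i}] {c['text']} (source: {c['source']})" for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the numbered sources below.\n"
        f"Reply in {answer_language} and cite sources by number.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )

# Invented example chunks with scores from a hypothetical retrieval step.
retrieved = [
    {"score": 0.22, "text": "Handwashing campaign attendance figures for 2019.",
     "source": "hypothetical-campaign.pdf"},
    {"score": 0.91, "text": "Replace worn cup seals on the handpump piston every six months.",
     "source": "hypothetical-manual.pdf"},
    {"score": 0.74, "text": "Webinar minute 47: a field engineer explains how to diagnose a leaking rising main.",
     "source": "hypothetical-webinar"},
]

selected = prioritise(retrieved)
prompt = build_prompt("Pourquoi ma pompe fuit-elle ?", selected, answer_language="French")
print(prompt)
```

Asking the model to cite sources by number is one way to let users trace an answer back to the report page or webinar minute it came from.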

In essence, WASH AI is not just about storing and retrieving information. It's about understanding the context and intent behind every question, ensuring the most relevant and accessible knowledge is delivered. Through such a system, we unlock the value of distributed knowledge, enabling us to make better decisions and ultimately leading to better outcomes in the WASH sector and beyond.