Knowledge Bases

Welcome to the Knowledge Bases Hub. This page links out to all of our experiments and case studies.

Hierarchical Structures

What are information hierarchies? Information hierarchies are a way to organize information so that it sets a clear context and conveys the relationship between entities and that overall context. A hierarchy is a clear signal of what information is most important, and it can help clarify the proper context.

Folders

Hierarchies can be organized into folders, sub-folders, and so on. This is a parent-child relationship.

URLs

Well-structured websites have a proper URL hierarchy that conveys the correct meaning, context, and relationship between entities.

Consider the difference between:

1. ChatbotConferences.com/conferences/2019/nyc

2. ChatbotConferences.com/new-york-city

3. ChatbotConferences.com/nyc/2019

Example one implies that there are multiple events in multiple cities. Example two shows only the city, which conveys very little, and example three conveys that there are multiple events in NYC.

The first example is the strongest. It has a ROOT of Conferences, which is the main topic of the website. The second level of the hierarchy, or SEED, is the year, and the final NODE is the city. Organizing by year is much easier than organizing by city, which is why year sits at a higher level within the hierarchy.
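To make the ROOT, SEED, NODE reading concrete, here is a minimal Python sketch that splits a URL into its path segments and labels them by depth. The function names `url_levels` and `label_levels` are hypothetical and chosen only for illustration. The labels are purely positional, so the sketch shows how deep a URL goes, not whether its ordering is a good one.

```python
from urllib.parse import urlparse


def url_levels(url):
    """Return the host and the ordered path segments of a URL."""
    # urlparse only recognizes the host when a scheme is present,
    # so add one for bare addresses like the examples above.
    if "://" not in url:
        url = "https://" + url
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    return parsed.netloc, segments


def label_levels(url):
    """Pair each path segment with a depth label (assumed naming: ROOT, SEED, NODE)."""
    labels = ["ROOT", "SEED", "NODE"]
    _, segments = url_levels(url)
    # zip stops at the shorter list, so shallow URLs simply get fewer labels.
    return list(zip(labels, segments))


if __name__ == "__main__":
    for example in (
        "ChatbotConferences.com/conferences/2019/nyc",
        "ChatbotConferences.com/new-york-city",
        "ChatbotConferences.com/nyc/2019",
    ):
        print(example, label_levels(example))
```

Only the first example fills all three levels. The second stops at a single segment, and the third has two.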

Topical Map

The goal is to create a topical map that includes all of the relevant entities within the proper contextual hierarchies, so that the LLM can quickly identify the context. A great topical map will give the LLM a lot of detail to go on, which makes it easier to answer detailed questions.

Test

The Hierarchy Test will try these ideas out to see to what degree they make a difference with LLMs. We build two chatbots for this test. The chatbots have the same information; the main difference is how the hierarchy is organized.

The Test:

1) Bot 1: The bot is trained on the following pages:

2) Bot 2: The bot is trained on a single page:

https://www.chatbotconference.com/knowledge-bases/all-agendas

Go to the Test

Documents: Context & Semantics

Does the way information is organized within a document matter? Semantics helps LLMs identify the main context of the document and the relationships of entities with each other. These relationships inform the LLM of the proper way to connect concepts to the topic, which later helps give rise to meaning.

Macro & Micro Semantics

The overall context of a document is set by the Macro and Micro Semantics. The Macro Context is the overarching topic of the document and is organized by the H1, H2, H3, and H4 tags. These tags are in order of importance and together set up the Macro Context. The H1 tag is the title of the document and represents its overall Macro Context. The H2 tags are the main topics within the document and support the overall thesis. The H3 tags are sub-topics of the H2 tags.

Consider the difference between:

1. Macro Context: H1, H2, H3, H4

2. Micro Context: Definitions, questions, phrases, and word order within each heading.
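One way to inspect the Macro Context of a page is to pull out its heading outline. The sketch below uses Python's standard `html.parser` module and assumes the document is available as an HTML string. The class name `HeadingOutline` and the function `macro_context` are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every H1 to H4 heading."""

    HEADINGS = ("h1", "h2", "h3", "h4")

    def __init__(self):
        super().__init__()
        self.outline = []    # finished headings, in document order
        self._level = None   # heading level currently open, if any
        self._buffer = []    # text pieces seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._buffer).strip()))
            self._level = None


def macro_context(html):
    """Return the heading outline of an HTML document."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


if __name__ == "__main__":
    sample = "<h1>Knowledge Bases</h1><p>Intro</p><h2>URLs</h2><h3>Examples</h3>"
    for level, text in macro_context(sample):
        print("  " * (level - 1) + text)
```

The paragraph text is skipped on purpose: it belongs to the Micro Context, which this sketch does not try to analyze.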

Well-written documents are generally well organized and easy to follow, read, and understand. They typically have a hierarchical structure that allows the reader to go deeper into a topic. Topics are laid out in a logical and coherent manner, and they often have sub-topics and supporting information. Well-written documents are able to answer our questions.

Contextual Layers

Consider the questions below. Each one is broken down into its elements so you can gain insight into how an LLM reads it. Answering questions like these is the promise of LLMs.

What are the best bikes [knowledge domain] for [functional word] short boys [contextual domain]?
What are the most useful diets [knowledge domain] for [functional word] children with insomnia [contextual domain] for kids under six [contextual layers]?

If "children" is well defined, or the document has contextual layers, then an LLM can answer a detailed question like this.

Structure

What is the overall structure of a good document?

Macro Content: H1 Title Tag
10% Summary: Extractive & Abstractive Summary

60% Main Topics: H2 Tags & Micro Context: Definitions, paragraphs, etc
30% Supplementary Context: H2 & H3 Subtopics, related topics, synonyms, antonyms, etc
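As a rough worked example of the 10/60/30 split, the sketch below turns a target length into a budget per section. It assumes the percentages refer to word counts, which the outline above does not state explicitly. The function name `section_budget` and the section keys are hypothetical.

```python
def section_budget(total_words):
    """Split a target word count using the 10/60/30 outline (assumed to be word shares)."""
    shares = {
        "summary": 10,        # extractive and abstractive summary
        "main_topics": 60,    # H2 topics plus their micro context
        "supplementary": 30,  # H2 and H3 subtopics, related topics, synonyms
    }
    # Integer arithmetic avoids floating-point rounding surprises.
    return {name: total_words * percent // 100 for name, percent in shares.items()}


if __name__ == "__main__":
    print(section_budget(1500))
```

For a 1,500-word article this gives 150 words of summary, 900 words of main topics, and 450 words of supplementary context.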

Test

We will test two articles that have the same information. Article one will have all of the attributes shared above. Article two will be missing most of them. Each article will be used to train a bot, and we can all play around with the differences.

The Test:

1) Bot 1: Doc has the following

H1 Title Tag
Summary
H2 & H3
Definitions
Supplementary Content

2) Bot 2: Doc is designed to mirror a poorly structured article

All tags converted to Paragraph Format
No summary
No supplementary content
No definitions or questions
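The "all tags converted to Paragraph Format" step for the Bot 2 document can be sketched as a simple transformation, assuming the source article is stored as HTML. This is an illustration only, not the procedure actually used for the test, and `flatten_headings` is a hypothetical name.

```python
import re

# Matches opening and closing heading tags (h1 to h6), with or without attributes.
HEADING_TAG = re.compile(r"<(/?)h[1-6]\b[^>]*>", flags=re.IGNORECASE)


def flatten_headings(html):
    """Replace every heading tag with a plain paragraph tag, dropping attributes."""
    return HEADING_TAG.sub(r"<\1p>", html)


if __name__ == "__main__":
    article = '<h1>Title</h1><p>Body</p><h2 id="a">Section</h2>'
    print(flatten_headings(article))
```

The text of each heading is preserved, so both documents keep the same information and differ only in structure.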

Go to the Test
