
In recent years, generative AI and large language models (LLMs) such as ChatGPT have attracted high expectations, but the problem of hallucination (output that deviates from fact) has also been pointed out. A framework called RAG is attracting attention as a powerful way to use LLMs while mitigating this issue. RAG is a technique that significantly improves output quality by referencing appropriate information.
FRONTEO's discovery AI "KIBIT" has an algorithm specialized for search, making it particularly effective in the "retrieval" step of RAG, where the correct information must be obtained. This article also introduces "Takumi KIBIT Zero," an AI system that combines KIBIT's search capability with an LLM, making it possible to use unorganized in-house data more simply and with higher precision than an LLM alone.
What is RAG? - A method for improving the reliability of generative AI
RAG (Retrieval-Augmented Generation), sometimes rendered as "search-augmented generation," refers to a method in which optimal information is first obtained from an external information source by separate means and then passed to the text-generation AI/LLM (large language model), which generates its answer based on that information. This improves the accuracy and reliability of the output.
The idea was presented in a 2020 paper by Meta (then Facebook), and it has gained even more attention with the recent rapid spread of generative AI such as ChatGPT. Although generative AI can produce natural, persuasive sentences, it sometimes causes a phenomenon called hallucination, in which errors and fabrications appear in the output. RAG is a measure that reduces hallucination by referring to external information, allowing generative AI to be used effectively.
The difference is similar to that between taking an exam from memory and solving exam questions with a textbook open. Generative AI such as ChatGPT learns in advance and answers using that knowledge (the model generates answers from what it has already learned), whereas RAG solves the questions while consulting the textbook (the model generates answers while referring to an information source).
Development of LLMs and text-generation AI, including ChatGPT: issues and countermeasures
Among natural language processing AIs, large language models (LLMs) are developing rapidly: not only GPT-3.5/4, on which ChatGPT is based, but also Google's PaLM and Gemini, Meta's Llama 2, and more recently Anthropic's Claude; many LLMs are being developed one after another. On the other hand, text-generation AI based on LLMs has had the problem of causing hallucination.
Causes of hallucination
Hallucination originally means "a perception of something not present," and has come to refer to the phenomenon in which generative AI such as ChatGPT states things that are not true as if they were true. One reason is that LLM-based natural language processing AI such as ChatGPT creates sentences by predicting the next word rather than by understanding the meaning of words; as a result, answers tend to prioritize contextual plausibility over factual accuracy. Hallucination can also be caused by references to information sources that are outdated or inaccurate.
Mindset as a prerequisite for measures against hallucination
The most effective way to deal with hallucination is not simply to accept AI-generated content at face value, but to acquire the literacy to judge its validity before using it.
Examples of specific countermeasures against hallucination
Techniques for reducing hallucination include prompt-based countermeasures and attaching sources to generated answers.
Measures with prompts
The AI model's answers can be constrained and controlled, for example by giving precise instructions in the prompt or by having the model respond in stages.
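As a minimal illustration of such a prompt-based constraint (the wording and the `build_prompt` helper below are assumptions made for this sketch, not a vendor-specific format):

```python
# Illustrative prompt template that restricts the model to provided material
# and asks for a staged response. The wording is an assumption for this
# sketch, not a prescribed or vendor-specific format.
def build_prompt(question: str, reference: str) -> str:
    return (
        "Answer the question using ONLY the reference material below.\n"
        "If the answer is not in the material, reply: 'I could not find this "
        "in the provided material.'\n"
        "First list the relevant passages, then give your answer step by step.\n\n"
        f"Reference material:\n{reference}\n\n"
        f"Question: {question}"
    )

print(build_prompt("What is the warranty period?",
                   "Warranty: 2 years from date of purchase."))
```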
Provide answers with sources
Another method is to generate answers with links to the referenced information sources attached. This makes it easier for users to fact-check the output and correct mistakes.
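A sketch of what attaching sources might look like in practice (the data structures and formatting here are illustrative assumptions):

```python
# Illustrative sketch: attach numbered source links to a generated answer so
# users can verify each claim. The data structures are assumptions made for
# this example.
sources = [
    {"title": "Product manual v3", "url": "https://example.com/manual"},
    {"title": "FAQ 2023",          "url": "https://example.com/faq"},
]

answer = "The warranty period is 2 years [1], and repairs are free during it [2]."

def with_citations(answer: str, sources: list[dict]) -> str:
    refs = "\n".join(
        f"[{i}] {s['title']}: {s['url']}" for i, s in enumerate(sources, start=1)
    )
    return f"{answer}\n\nSources:\n{refs}"

print(with_citations(answer, sources))
```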
How to improve LLM performance – RAG, transfer learning, and fine-tuning
Transfer learning and fine-tuning, which retrain the model on datasets related to the target task, are also used to improve the performance of LLMs (large language models) and the generative AI built on them.
What is transfer learning?
Transfer learning is a method of applying an AI model that has already been trained on one field or task to another field or purpose by adding a new output layer. Since the knowledge already acquired can be reused, a certain level of performance can be expected even on the new task.
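The following is a minimal sketch of this idea, assuming a pretrained language model from Hugging Face transformers with a new classification head; the model name and three-class task are illustrative choices, not anything prescribed by the article:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hedged sketch of transfer learning: reuse a pretrained language model and
# attach a new output layer (a classification head) for a different task.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3  # new, randomly initialized output layer
)

for param in model.bert.parameters():  # freeze the pretrained encoder
    param.requires_grad = False

# Only the new classification head is trained on the target task's data.
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-3)
```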
What is fine-tuning?
Fine-tuning is a method that starts from an existing large language model (LLM) and adjusts and optimizes the model itself on data related to the desired task, increasing its applicability to a specific domain.
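A hedged sketch of this workflow using Hugging Face transformers (the model name, dataset file, and hyperparameters are placeholders; production fine-tuning often adds parameter-efficient methods such as LoRA):

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import load_dataset

# Hedged fine-tuning sketch: continue training an existing LLM on
# task-related text. "gpt2" and "domain_corpus.txt" are placeholders.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default

dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True, max_length=512,
                    padding="max_length")
    out["labels"] = out["input_ids"].copy()  # causal LM: labels = inputs
    return out

train_data = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train_data,
)
trainer.train()
```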
Difference between RAG and transfer learning/fine-tuning
In other words, transfer learning repurposes the model's existing knowledge, and fine-tuning specializes the model itself for a task. RAG, by contrast, improves the quality of the output by referring to appropriate information before generating it, rather than by tuning the model itself.
RAG business use cases: call center and document creation applications
For example, if RAG is used in a call center that must respond appropriately to a wide variety of inquiries, the AI can search manuals and case databases and accurately draft answers and candidate responses. Operators no longer need to research each question themselves and can smoothly guide customers to the appropriate information.
Another possibility is the creation of technical documents. In the manufacturing industry and IT companies, where documents requiring highly specialized knowledge, such as product instruction manuals and technical manuals, are essential, RAG lets the AI search related technical data for the necessary information, helping produce more accurate, easier-to-understand explanations.
RAG is expected to be used in business because there are many situations that require advanced knowledge and the judgment and skills that go with it, and it has the potential to dramatically increase the intellectual productivity of a wide range of tasks.
RAG architecture and mechanism
RAG (Retrieval Augmented Generation) consists of two main steps: "Retrieval" and "Generation."
First, the user enters a question in natural language (the words people normally use). In the "Retrieval" step, related information is searched for in data prepared for the purpose and use (internal documents, past case records, etc.); relevant information is identified and extracted using the keywords and context of the question as clues.
In the "Generation" step, an answer is generated and output, using the relevant information selected in the retrieval step as a reference.

Relationship between RAG and generative AI/LLM
RAG is a framework for supplementing and strengthening the output of generative AI and LLMs. It combines the natural language processing of generative AI/LLMs with highly reliable related information, such as company-owned data, to increase the appropriateness and reliability of the output. By combining target data with generative AI and LLMs in this way, RAG is expected to create synergy and improve intellectual productivity in a wide range of fields.
To improve RAG accuracy
In order to improve the accuracy of RAG answers, it is especially important to improve the "retrieval" step. The quality of the data that serves as the information source directly affects the accuracy of the output, and the algorithm that accurately finds relevant information from the context of the question is equally critical.
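One simple way to track whether the retrieval step is actually improving (an illustrative metric, not mentioned above) is recall@k over a small labeled set of questions:

```python
# Illustrative evaluation of the retrieval step: recall@k over a small labeled
# set of question -> relevant-document pairs. The data and metric choice are
# assumptions made for this sketch.
def recall_at_k(retrieved: list[list[str]], relevant: list[set[str]], k: int) -> float:
    hits = sum(
        1 for docs, rel in zip(retrieved, relevant)
        if rel & set(docs[:k])          # any relevant doc in the top k?
    )
    return hits / len(relevant)

retrieved = [["doc3", "doc1"], ["doc2", "doc5"]]   # system output per question
relevant  = [{"doc1"},         {"doc4"}]           # ground truth per question
print(recall_at_k(retrieved, relevant, k=2))       # 0.5
```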
"Takumi KIBIT Zero" uses the discovery AI "KIBIT" to acquire information accurately in RAG's retrieval step and make use of unorganized internal data
In workplaces such as manufacturing, where a diverse range of people are involved, work techniques and skills often become siloed in individuals. Ideally, equipment failure data, product defect data, reports, and so on should be properly collected so that problems can be resolved immediately when they occur, but in practice such data and documents are often left unorganized.
"Takumi KIBIT Zero" is a system that uses KIBIT, a discovery AI with strong search capability, to further improve the accuracy of RAG and make the most of the knowledge and data a company has accumulated.
Takumi KIBIT Zero combines the AI engine KIBIT, which excels at "discovery," with an LLM/generative AI; so to speak, it "picks the best of the best," using the company-specific data at hand to accurately list the correct answers and the reference documents needed right now. This is possible because KIBIT has a unique "discovery" algorithm with the search power to find similarities and relationships.
Becoming an organization that can constantly share accumulated knowledge
Takumi KIBIT Zero can accurately extract the necessary information from large amounts of data, and when a summary of the extracted information is required, the RAG scheme uses generative AI to output one, generating a more accurate response than an LLM alone. This is far simpler and more reliable than the conventional approach of re-customizing the LLM itself.
Even with decades of unorganized data in disparate notations and formats, KIBIT's conceptual search can resolve business terminology and spelling variations, making it possible to narrow down the target materials and identify the appropriate information. As a result, users can not only quickly find accurate answers and references, but repeated use also improves their questioning and problem-solving skills, fostering an organization in which each person can understand and apply information. Optimal use of such in-house data leads to shorter on-site lead times and an organizational transformation in which knowledge is constantly shared.