This repository is the official implementation of "KCitychatBot: A knowledge graph based chatbot system for large scale CityGML dataset".
CityGML has been extensively studied due to its widespread use across various domains. However, its complex hierarchical structure still presents challenges for non-expert users. Recently, large language models (LLMs) have demonstrated significant capabilities in natural language processing (NLP) and chatbot systems. Nevertheless, LLMs heavily rely on pre-trained data, which can lead to hallucination issues and limitations in context length. To address these challenges, we first propose a novel automatic method for transforming CityGML data into knowledge graphs by leveraging a graph database and a transformation plugin. This approach effectively addresses the difficulties of storing and representing the complex structure of CityGML and can serve as an external knowledge base for chatbot systems. Second, we develop a collaborative multi-agent framework that enables natural language queries over CityGML data in a user-friendly manner. By integrating the constructed knowledge graphs with several knowledge augmentation strategies, the chatbot system implements a complete pipeline from natural language input to structured query generation, external knowledge retrieval, and optimized response generation. We conduct experiments on both the city knowledge graphs and the chatbot system to evaluate the accuracy of the knowledge graphs and the interpretability of the system’s outputs. The experimental results demonstrate that the generated knowledge graph is accurate, and the chatbot system performs well in terms of answer accuracy, relevance, and contextual coherence. These findings highlight the potential of the proposed chatbot system to lower the barrier for non-professionals interacting with CityGML data, offering both theoretical insights and practical implications for advancing CityGML applications in the era of LLMs and promoting smart city development.
Both the DeepSeek API and the OpenAI API are required: the DeepSeek API is used for chat and analysis, while the OpenAI API is used for the embedding models and evaluation. You also need to apply for a Neo4j database (free) and download the APOC plugin.
How to use this guide
To install the required packages, run the following command from the root directory of the repository:
pip install -r requirements.txt
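Since the system depends on two API keys and a Neo4j database, a small pre-flight check can catch a missing credential before anything is run. The following is a minimal sketch only: the environment variable names are hypothetical and are not taken from this repository, which may read its settings differently (for example, directly from the source files).

```python
from typing import List, Mapping

# Hypothetical variable names (assumptions for illustration);
# adapt them to however the project actually reads its settings.
REQUIRED_SETTINGS = (
    "DEEPSEEK_API_KEY",   # chat and analysis
    "OPENAI_API_KEY",     # embedding models and evaluation
    "NEO4J_URL",
    "NEO4J_USER",
    "NEO4J_PASSWORD",
    "NEO4J_DATABASE",
)


def missing_settings(env: Mapping[str, str]) -> List[str]:
    """Return the names of required settings that are absent or blank."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]
```

Calling `missing_settings(os.environ)` after `import os` returns an empty list when every setting is present, and otherwise the names that still need to be provided.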
Some important components explained:
crews/: Agent and task configuration directory
data/: Three themes and metadata from Plateau
knowledge/: External knowledge is stored in this folder
load/: Construction of the knowledge graphs
evaluate/: Automated evaluation based on evaluation_table.xlsx
chatflow.py: The whole workflow of the chatbot system
chatflow_auto.py: For automated evaluation
evaluation_examples.xlsx: Three evaluation examples
evaluation_table.xlsx: Enter your questions in the 'input' column and run chatflow_auto.py; the response and context are then written automatically
index.html: The frontend
main.py: The backend
prompts.py: The few-shot learning examples and prompts
Clone this project:
git clone [email protected]:liushouqi/KCitychatBot.git
To construct the knowledge graphs, enter the Neo4j URL, user, database name, and password in load/final.py, then run:
python final.py
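Before running a long graph construction, it can help to confirm that the database is reachable and that APOC is actually installed. The sketch below is an assumption-laden illustration, not the code in load/final.py: `Neo4jSettings`, `apoc_version`, and `check_database` are hypothetical names, and it relies on the official `neo4j` Python driver being installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Neo4jSettings:
    """Hypothetical container for the four values entered in load/final.py."""
    url: str
    user: str
    password: str
    database: str


# apoc.version() is provided by the APOC plugin; the query fails if APOC is missing.
APOC_VERSION_QUERY = "RETURN apoc.version() AS version"


def apoc_version(session) -> str:
    """Run the version query on an open session and return the APOC version string."""
    record = session.run(APOC_VERSION_QUERY).single()
    return record["version"]


def check_database(settings: Neo4jSettings) -> str:
    """Connect with the given settings and return the installed APOC version."""
    from neo4j import GraphDatabase  # imported lazily; requires the neo4j driver package

    driver = GraphDatabase.driver(settings.url, auth=(settings.user, settings.password))
    try:
        driver.verify_connectivity()  # raises if the URL or credentials are wrong
        with driver.session(database=settings.database) as session:
            return apoc_version(session)
    finally:
        driver.close()
```

If `check_database` raises an error about an unknown function, the APOC plugin is most likely not installed on the target database.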
To use this chatbot system, run:
uvicorn main:app --reload
To evaluate the system's outputs, run:
python evaluation.py
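The evaluation table workflow described earlier (questions in the 'input' column, with the response and context written back automatically) can be pictured with the following minimal sketch. It is not the logic of chatflow_auto.py: the `fill_table` function, the `answer` callable, and the 'response' and 'context' column names are assumptions made for illustration, and rows are plain dictionaries rather than spreadsheet cells.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]


def fill_table(rows: List[Row], answer: Callable[[str], Tuple[str, str]]) -> List[Row]:
    """Return copies of the rows with 'response' and 'context' filled in.

    `answer` stands in for the chatbot: it takes a question and returns
    a (response, context) pair. Rows with an empty 'input' are left unchanged.
    """
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's rows are not modified
        question = (row.get("input") or "").strip()
        if question:
            response, context = answer(question)
            new_row["response"] = response
            new_row["context"] = context
        filled.append(new_row)
    return filled
```

In practice the rows would be loaded from and saved back to evaluation_table.xlsx with a spreadsheet library, and `answer` would call the chatbot workflow instead of a stub.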
main.py, but it takes a significant amount of time. Therefore, using the backend approach is recommended.
To construct your knowledge graphs, especially for large-scale CityGML datasets, try load/final.py.
To chat with CityGML data, try crews/chatbot.py.
Remember to change URL, user...data/, otherwise your questions might be meaningless or refer to something that doesn't exist in the data.
crews/input2cypher_crew.py and crews/generate_crew.py.