mirror of https://github.com/JasonYANG170/CodeGeeX4.git
synced 2024-11-23 12:16:33 +00:00

Merge pull request #10 from XingYu-Zhong/main

add_feature: add local model invocation to repodemo

This commit is contained in: commit 49930981df
```diff
@@ -1,35 +1,37 @@
 # CodeGeeX

-# Welcome to use my chat dome application
+## Welcome to My Chat Demo Application

 This is a simple demonstration application.

-## Usage
+## Instructions

 1. Enter your question
-2. Wait for the reply
+2. Wait for a reply
 3. Enjoy the conversation!

 ## Features

-- Support multi-round dialogue
-- Support uploading local zip compressed project package, and can perform project question and answer and modify the project
+- Supports multi-turn conversations
+- Supports internet-connected Q&A
+- Allows uploading local zip project files for project-related Q&A and modifications

 ## Installation

 1. Clone the repository to your local machine
-2. Install dependencies: `pip install -r requirements.txt`
-3. Run the application: `python run.py`
+2. Set up the model; you can choose between a local model or an API model. If using a local model, set `local_model_path` in `run_local.py`
+3. For internet-connected Q&A, set the Bing Search API key (`bingsearch_api_key`) in `utils/bingsearch.py`
+4. Install dependencies: `pip install -r requirements.txt`
+5. Run the application: `chainlit run run.py --port 8888`; for the local model, `chainlit run run_local.py --port 8888`

-## Note
+## Notes

-Please ensure that your network environment can access the CodeGeeX API.
+Ensure that your network environment can access the CodeGeeX API.

 ## Disclaimer

-This application is for learning and research purposes only and shall not be used for any commercial purposes. The developer is not responsible for any loss or damage caused by the use of this application.
+This application is for educational and research purposes only. It must not be used for any commercial purposes. The developer is not responsible for any loss or damage caused by the use of this application.

-## Thank you
+## Acknowledgements

-Thank you for using our application. If you have any questions or suggestions, please feel free to contact us. We look forward to your feedback and are committed to providing you with better services.
+Thank you for using our application. If you have any questions or suggestions, please feel free to contact us. We look forward to your feedback and are committed to providing better service.
```
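Steps 2 and 3 come down to editing two module-level settings. A minimal sketch of those edits, assuming only the names this commit actually references (`local_model_path` in `run_local.py`, `bingsearch_api_key` in `utils/bingsearch.py`); both values below are placeholders you must supply yourself:

```python
# run_local.py -- point the app at a locally downloaded CodeGeeX4 checkpoint.
local_model_path = "/models/codegeex4-all-9b"  # placeholder path, not shipped with the repo

# utils/bingsearch.py -- only required for the internet-connected Q&A profile.
bingsearch_api_key = "YOUR_BING_SEARCH_V7_KEY"  # placeholder key, keep it secret
```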
```diff
@@ -13,13 +13,17 @@
 ## Features

 - Supports multi-turn conversations
+- Supports internet-connected Q&A
 - Supports uploading a local project as a zip archive; the project can then be queried and modified

 ## Installation

 1. Clone the repository to your local machine
-2. Install dependencies: `pip install -r requirements.txt`
-3. Run the application: `chain run run.py --port 8888`
+2. Set up the model; choose either a local model or an API model. If using a local model, set `local_model_path` in `run_local.py`
+3. For internet-connected Q&A, set the Bing Search API key (`bingsearch_api_key`) in `utils/bingsearch.py`
+4. Install dependencies: `pip install -r requirements.txt`
+5. Run the application: `chainlit run run.py --port 8888`; for the local model, `chainlit run run_local.py --port 8888`

 ## Notes
```
repodemo/llm/local/codegeex4.py (new file, 47 lines)

```python
from typing import Iterator

import torch
from transformers import AutoModel, AutoTokenizer


class CodegeexChatModel:
    """Thin wrapper around a locally loaded CodeGeeX4 checkpoint."""

    def __init__(self, model_name_or_path: str):
        # Prefer GPU when available; the model is very slow on CPU.
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.tokenizer = AutoTokenizer.from_pretrained(
            model_name_or_path, trust_remote_code=True
        )
        self.model = (
            AutoModel.from_pretrained(model_name_or_path, trust_remote_code=True)
            .to(self.device)
            .eval()
        )
        print("Model has been initialized.")

    def chat(self, prompt: str, temperature: float = 0.2, top_p: float = 0.95) -> str:
        """Single-shot generation; returns the complete reply as one string."""
        try:
            response, _ = self.model.chat(
                self.tokenizer,
                query=prompt,
                max_length=120000,
                temperature=temperature,
                top_p=top_p,
            )
            return response
        except Exception as e:
            return f"error: {e}"

    def stream_chat(
        self, prompt: str, temperature: float = 0.2, top_p: float = 0.95
    ) -> Iterator[str]:
        """Incremental generation; yields the cumulative reply after each step."""
        try:
            for response, _ in self.model.stream_chat(
                self.tokenizer,
                query=prompt,
                max_length=120000,
                temperature=temperature,
                top_p=top_p,
            ):
                yield response
        except Exception as e:
            yield f"error: {e}"
```
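A minimal usage sketch for this wrapper. The checkpoint path is a placeholder, and it assumes a ChatGLM-style checkpoint whose `stream_chat` yields the cumulative reply so far rather than deltas, which is exactly how `run_local.py` treats it below:

```python
from llm.local.codegeex4 import CodegeexChatModel

model = CodegeexChatModel("/models/codegeex4-all-9b")  # placeholder path

# Blocking call: one complete reply.
print(model.chat("Write quicksort in Python.", temperature=0.2))

# Streaming call: each yielded item is the full reply generated so far,
# so we print only the new suffix on each iteration.
previous = ""
for partial in model.stream_chat("Explain binary search briefly."):
    print(partial[len(previous):], end="", flush=True)
    previous = partial
```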
```diff
@@ -13,14 +13,26 @@
 ## Features

 - Supports multi-turn conversations
+- Supports internet-connected Q&A
 - Supports uploading a local project as a zip archive; the project can then be queried and modified

 ## Installation

 1. Clone the repository to your local machine
-2. Install dependencies: `pip install -r requirements.txt`
-3. Run the application: `chainlit run run.py --port 8888`
+2. Set up the model; choose either a local model or an API model. If using a local model, set `local_model_path` in `run_local.py`
+3. For internet-connected Q&A, set the Bing Search API key (`bingsearch_api_key`) in `utils/bingsearch.py`
+4. Install dependencies: `pip install -r requirements.txt`
+5. Run the application: `chainlit run run.py --port 8888`; for the local model, `chainlit run run_local.py --port 8888`

 ## Notes

 Please ensure that your network environment can access the CodeGeeX API.

 ## Disclaimer

 This application is for learning and research purposes only and must not be used for any commercial purposes. The developer is not responsible for any loss or damage caused by the use of this application.

 ## Acknowledgements

 Thank you for using our application. If you have any questions or suggestions, please feel free to contact us. We look forward to your feedback and are committed to providing better service.
```
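`utils/bingsearch.py` itself is not part of this diff; the commit only references its `bingsearch_api_key` setting and its `bing_search_prompt(query)` helper, which `run_local.py` calls to turn a user question into a search-grounded prompt. Purely as an illustration, a hypothetical sketch of what such a helper could look like against the Bing Web Search v7 REST endpoint; the prompt wording, `top_k` parameter, and response handling here are assumptions, not the repository's actual code:

```python
import requests

bingsearch_api_key = "YOUR_BING_SEARCH_V7_KEY"  # placeholder; set your own key

def bing_search_prompt(query: str, top_k: int = 3) -> str:
    """Hypothetical sketch: fetch search snippets and fold them into a prompt."""
    resp = requests.get(
        "https://api.bing.microsoft.com/v7.0/search",
        headers={"Ocp-Apim-Subscription-Key": bingsearch_api_key},
        params={"q": query, "count": top_k},
        timeout=10,
    )
    resp.raise_for_status()
    pages = resp.json().get("webPages", {}).get("value", [])
    snippets = "\n".join(f"- {p['name']}: {p['snippet']}" for p in pages)
    return (
        "Answer the question using these search results:\n"
        f"{snippets}\n\nQuestion: {query}"
    )
```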
```diff
@@ -1,2 +1,7 @@
 chainlit==1.1.305
 beautifulsoup4
+#local
+accelerate==0.31.0
+tiktoken==0.7.0
+torch==2.3.1
+transformers==4.39.0
```
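The four pins under `#local` are only needed for the local-model path (`run_local.py`); the API-based `run.py` runs without them. A quick sanity check, assuming the pinned environment above, that the local stack is importable and whether inference will land on GPU:

```python
import torch
import transformers

print("torch", torch.__version__)                     # expect 2.3.1
print("transformers", transformers.__version__)       # expect 4.39.0
print("CUDA available:", torch.cuda.is_available())   # False means slow CPU fallback
```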
repodemo/run_local.py (new file, 169 lines)

```python
import chainlit as cl
from chainlit.input_widget import Slider

from llm.local.codegeex4 import CodegeexChatModel
from prompts.base_prompt import (
    judge_task_prompt,
    get_cur_base_user_prompt,
    web_judge_task_prompt,
)
from utils.tools import unzip_file, get_project_files_with_content
from utils.bingsearch import bing_search_prompt

local_model_path = '<your_local_model_path>'
llm = CodegeexChatModel(local_model_path)


class StreamProcessor:
    """Converts cumulative stream_chat snapshots into incremental deltas."""

    def __init__(self):
        self.previous_str = ""

    def get_new_part(self, new_str):
        new_part = new_str[len(self.previous_str):]
        self.previous_str = new_str
        return new_part


@cl.set_chat_profiles
async def chat_profile():
    return [
        cl.ChatProfile(
            name="chat聊天",  # plain multi-turn chat profile
            markdown_description="聊天demo:支持多轮对话。",
            starters=[
                cl.Starter(
                    label="请你用python写一个快速排序。",
                    message="请你用python写一个快速排序。",
                ),
                cl.Starter(
                    label="请你介绍一下自己。",
                    message="请你介绍一下自己。",
                ),
                cl.Starter(
                    label="用 Python 编写一个脚本来自动发送每日电子邮件报告,并指导我如何进行设置。",
                    message="用 Python 编写一个脚本来自动发送每日电子邮件报告,并指导我如何进行设置。",
                ),
                cl.Starter(
                    label="我是一个python初学者,请你告诉我怎么才能学好python。",
                    message="我是一个python初学者,请你告诉我怎么才能学好python。",
                ),
            ],
        ),
        cl.ChatProfile(
            name="联网问答",  # internet-connected Q&A profile
            markdown_description="联网能力demo:支持联网回答用户问题。",
        ),
        cl.ChatProfile(
            name="上传本地项目",  # local project upload profile
            markdown_description="项目级能力demo:支持上传本地zip压缩包项目,可以进行项目问答和对项目进行修改。",
        ),
    ]


@cl.on_chat_start
async def start():
    settings = await cl.ChatSettings(
        [
            Slider(
                id="temperature",
                label="CodeGeeX4 - Temperature",
                initial=0.2,
                min=0,
                max=1,
                step=0.1,
            ),
            Slider(
                id="top_p",
                label="CodeGeeX4 - top_p",
                initial=0.95,
                min=0,
                max=1,
                step=0.1,
            ),
        ]
    ).send()
    cl.user_session.set("temperature", settings["temperature"])
    cl.user_session.set("top_p", settings["top_p"])
    cl.user_session.set("message_history", [])

    chat_profile = cl.user_session.get("chat_profile")
    extract_dir = "repodata"
    if chat_profile == "chat聊天":
        pass
    elif chat_profile == "上传本地项目":
        # Block until the user has uploaded a zip archive of their project.
        files = None
        while files is None:
            files = await cl.AskFileMessage(
                content="请上传项目zip压缩文件!",
                accept={"application/zip": [".zip"]},
                max_size_mb=50,
            ).send()

        text_file = files[0]
        extracted_path = unzip_file(text_file.path, extract_dir)
        files_list = get_project_files_with_content(extracted_path)
        cl.user_session.set("project_index", files_list)
        if len(files_list) > 0:
            await cl.Message(
                content="已成功上传,您可以开始对项目进行提问!",
            ).send()


@cl.on_message
async def main(message: cl.Message):
    chat_profile = cl.user_session.get("chat_profile")
    message_history = cl.user_session.get("message_history")
    message_history.append({"role": "user", "content": message.content})

    if chat_profile == "chat聊天":
        prompt_content = get_cur_base_user_prompt(message_history=message_history)

    elif chat_profile == "联网问答":
        # Ask the model whether this question needs a web search.
        judge_context = llm.chat(
            web_judge_task_prompt.format(user_input=message.content), temperature=0.2
        )
        print(judge_context)
        message_history.pop()

        if "是" in judge_context:  # the judge answered "yes": ground with search results
            prompt_tmp = bing_search_prompt(message.content)
            message_history.append({"role": "user", "content": prompt_tmp})
        else:
            message_history.append({"role": "user", "content": message.content})
        prompt_content = get_cur_base_user_prompt(message_history=message_history)

    elif chat_profile == "上传本地项目":
        # Classify the request, then prepend the indexed project files to the prompt.
        judge_context = llm.chat(
            judge_task_prompt.format(user_input=message.content), temperature=0.2
        )

        project_index = cl.user_session.get("project_index")
        index_prompt = ""
        index_tmp = """###PATH:{path}\n{code}\n"""
        for index in project_index:
            index_prompt += index_tmp.format(path=index["path"], code=index["content"])
        print(judge_context)
        prompt_content = (
            get_cur_base_user_prompt(
                message_history=message_history,
                index_prompt=index_prompt,
                judge_context=judge_context,
            )
            if "正常" not in judge_context  # "正常" marks an ordinary, non-project request
            else get_cur_base_user_prompt(message_history=message_history)
        )

    msg = cl.Message(content="")
    await msg.send()
    temperature = cl.user_session.get("temperature")
    top_p = cl.user_session.get("top_p")

    # Rough token estimate (~4 characters per token) against the 120k context window.
    if len(prompt_content) / 4 < 120000:
        stream = llm.stream_chat(prompt_content, temperature=temperature, top_p=top_p)
        stream_processor = StreamProcessor()
        for part in stream:
            text = ""
            if isinstance(part, str):
                text = stream_processor.get_new_part(part)
            elif isinstance(part, dict):
                text = stream_processor.get_new_part(part["name"] + part["content"])
            if token := (text or " "):
                await msg.stream_token(token)
    else:
        await msg.stream_token("项目太大了,请换小一点的项目。")

    message_history.append({"role": "assistant", "content": msg.content})
    await msg.update()
```