Initial commit

2024-11-23 12:16:33 +00:00 · 2024-07-05 09:33:53 +08:00 · 9e364eec5d
commit 9e364eec5d
82 changed files with 5437 additions and 0 deletions
--- a/201
+++ b/201
@ -0,0 +1,201 @@
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/
   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
   1. Definitions.
      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.
      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.
      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.
      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.
      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.
      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.
      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).
      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.
      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."
      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.
   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.
   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.
   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:
      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and
      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and
      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and
      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.
      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.
   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.
   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.
   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.
   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.
   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.
   END OF TERMS AND CONDITIONS
   APPENDIX: How to apply the Apache License to your work.
      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.
   Copyright 2024 CodeGeeX Team @ Zhipu AI
   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at
       http://www.apache.org/licenses/LICENSE-2.0
   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
--- a/70
+++ b/70
@ -0,0 +1,70 @@
 The CodeGeeX4 License
 1. 定义
 “许可方”是指分发其软件的 CodeGeeX 团队。
 “软件”是指根据本许可提供的 CodeGeeX4 模型参数。
 2. 许可授予
 根据本许可的条款和条件，许可方特此授予您非排他性、全球性、不可转让、不可再许可、可撤销、免版税的版权许可。
 本许可允许您免费使用本仓库中的所有开源模型进行学术研究，对于希望将模型用于商业目的的用户，需在[这里](https://open.bigmodel.cn/mla/form)完成登记。经过登记的用户可以免费使用本模型进行商业活动，但必须遵守本许可的所有条款和条件。
 上述版权声明和本许可声明应包含在本软件的所有副本或重要部分中。
 如果您分发或提供 THUDM / 智谱AI 关于 CodeGeeX4 开源模型的材料（或其任何衍生作品），或使用其中任何材料（包括 CodeGeeX4 系列的所有开源模型）的产品或服务，您应:
 (A) 随任何此类 THUDM / 智谱AI 材料提供本协议的副本；
 (B) 在相关网站、用户界面、博客文章、关于页面或产品文档上突出显示 “Built with CodeGeeX4”。
 如果您使用 THUDM / 智谱AI的 CodeGeeX4 开源模型的材料来创建、训练、微调或以其他方式改进已分发或可用的 AI 模型，您还应在任何此类 AI 模型名称的开头添加 “CodeGeeX4”。
 3. 限制
 您不得出于任何军事或非法目的使用、复制、修改、合并、发布、分发、复制或创建本软件的全部或部分衍生作品。
 您不得利用本软件从事任何危害国家安全和国家统一，危害社会公共利益及公序良俗，侵犯他人商业秘密、知识产权、名誉权、肖像权、财产权等权益的行为。
 您在使用中应遵循使用地所适用的法律法规政策、道德规范等要求。
 4. 免责声明
 本软件“按原样”提供，不提供任何明示或暗示的保证，包括但不限于对适销性、特定用途的适用性和非侵权性的保证。
 在任何情况下，作者或版权持有人均不对任何索赔、损害或其他责任负责，无论是在合同诉讼、侵权行为还是其他方面，由软件或软件的使用或其他交易引起、由软件引起或与之相关
 软件。
 5. 责任限制
 除适用法律禁止的范围外，在任何情况下且根据任何法律理论，无论是基于侵权行为、疏忽、合同、责任或其他原因，任何许可方均不对您承担任何直接、间接、特殊、偶然、示范性、
 或间接损害，或任何其他商业损失，即使许可人已被告知此类损害的可能性。
 6. 争议解决
 本许可受中华人民共和国法律管辖并按其解释。 因本许可引起的或与本许可有关的任何争议应提交北京市海淀区人民法院。
 请注意，许可证可能会更新到更全面的版本。 有关许可和版权的任何问题，请通过 license@zhipuai.cn 与我们联系。
 1. Definitions
 “Licensor” means the CodeGeeX Team that distributes its Software.
 “Software” means the CodeGeeX4 model parameters made available under this license.
 2. License
 Under the terms and conditions of this license, the Licensor hereby grants you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty-free copyright license.
 This license allows you to use all open source models in this repository for free for academic research. Users who wish to use the models for commercial purposes must complete registration [here](https://open.bigmodel.cn/mla/form). Registered users are free to use this model for commercial activities, but must comply with all terms and conditions of this license.
 The copyright notice and this license notice shall be included in all copies or substantial portions of the Software.
 If you distribute or provide THUDM / Zhipu AI materials on the CodeGeeX4 open source model (or any derivative works thereof), or products or services that use any materials therein (including all open source models of the CodeGeeX4 series), you should:
 (A) Provide a copy of this Agreement with any such THUDM/Zhipu AI Materials;
 (B) Prominently display "Built with CodeGeeX4" on the relevant website, user interface, blog post, related page or product documentation.
 If you use materials from THUDM/Zhipu AI's CodeGeeX4 model to create, train, operate, or otherwise improve assigned or available AI models, you should also add "CodeGeeX4" to the beginning of any such AI model name.
 3. Restrictions
 You are not allowed to use, copy, modify, merge, publish, distribute, copy or create all or part of the derivative works of this software for any military or illegal purposes.
 You are not allowed to use this software to engage in any behavior that endangers national security and unity, endangers social public interests and public order, infringes on the rights and interests of others such as trade secrets, intellectual property rights, reputation rights, portrait rights, and property rights.
 You should comply with the applicable laws, regulations, policies, ethical standards, and other requirements in the place of use during use.
 4. Disclaimer
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
 WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
 COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
 OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 5. Limitation of Liability
 EXCEPT TO THE EXTENT PROHIBITED BY APPLICABLE LAW, IN NO EVENT AND UNDER NO LEGAL THEORY, WHETHER BASED IN TORT,
 NEGLIGENCE, CONTRACT, LIABILITY, OR OTHERWISE WILL ANY LICENSOR BE LIABLE TO YOU FOR ANY DIRECT, INDIRECT, SPECIAL,
 INCIDENTAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES, OR ANY OTHER COMMERCIAL LOSSES, EVEN IF THE LICENSOR HAS BEEN ADVISED
 OF THE POSSIBILITY OF SUCH DAMAGES.
 6. Dispute Resolution
 This license shall be governed and construed in accordance with the laws of People’s Republic of China. Any dispute
 arising from or in connection with this License shall be submitted to Haidian District People's Court in Beijing.
 Note that the license is subject to update to a more comprehensive version. For any questions related to the license and
 copyright, please contact us at license@zhipuai.cn.
--- a/README.md
+++ b/README.md
@ -0,0 +1,86 @@
 ![](resources/logo.jpeg)
 <p align="center">
    🏠 <a href="https://codegeex.cn" target="_blank">Homepage</a>｜🛠 Extensions <a href="https://marketplace.visualstudio.com/items?itemName=aminer.codegeex" target="_blank">VS Code</a>, <a href="https://plugins.jetbrains.com/plugin/20587-codegeex" target="_blank">Jetbrains</a>｜🤗 <a href="https://huggingface.co/THUDM/codegeex4-all-9b" target="_blank">HF Repo</a> | 🪧 <a href="https://huggingface.co/spaces/THUDM/CodeGeeX" target="_blank">HF DEMO</a>
 </p>
 [English](./README.md) | [中文](./README_zh.md)
 # CodeGeeX4: Open Multilingual Code Generation Model
 We introduce CodeGeeX4-ALL-9B, the open-source version of the latest CodeGeeX4 model series. It is a multilingual code generation model continually trained on [GLM-4-9B](https://github.com/THUDM/GLM-4), which significantly enhances its code generation capabilities. A single CodeGeeX4-ALL-9B model supports comprehensive functions such as code completion and generation, code interpreter, web search, function call, and repository-level code Q&A, covering various scenarios of software development. CodeGeeX4-ALL-9B achieves highly competitive performance on public benchmarks such as [BigCodeBench](https://huggingface.co/datasets/bigcode/bigcodebench) and [NaturalCodeBench](https://github.com/THUDM/NaturalCodeBench). On these benchmarks it is currently the most powerful code generation model with fewer than 10B parameters, even surpassing much larger general-purpose models, and it achieves the best balance between inference speed and model performance.
 ## Model List
 | Model             | Type | Seq Length | Download                                                                                                                                                                                                    |
 |-------------------|------|------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | codegeex4-all-9b  | Chat | 128K       | [🤗 Huggingface](https://huggingface.co/THUDM/codegeex4-all-9b) [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/codegeex4-all-9b) [🟣 WiseModel](https://wisemodel.cn/models/ZhipuAI/codegeex4-all-9b)    |
 ## Get Started
 Use `4.39.0<=transformers<=4.40.2` to quickly launch [codegeex4-all-9b](https://huggingface.co/THUDM/codegeex4-all-9b):
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
 # Run on the GPU when one is available, otherwise fall back to the CPU.
 device = "cuda" if torch.cuda.is_available() else "cpu"
 tokenizer = AutoTokenizer.from_pretrained("THUDM/codegeex4-all-9b", trust_remote_code=True)
 model = AutoModelForCausalLM.from_pretrained(
    "THUDM/codegeex4-all-9b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True
 ).to(device).eval()
 # Wrap the user message in the chat template and tokenize it.
 inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "write a quick sort"}],
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
    return_dict=True,
 ).to(device)
 with torch.no_grad():
    outputs = model.generate(**inputs)
    # Drop the prompt tokens so that only the newly generated tokens are decoded.
    outputs = outputs[:, inputs['input_ids'].shape[1]:]
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
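Because the snippet above depends on a specific `transformers` version range, it can help to check the installed version before loading the model. The following is a minimal sketch, not part of the official repository: the helper names `parse_version` and `is_supported` are hypothetical, and the parsing is deliberately simple (pre-release suffixes such as `rc1` or `.dev0` are ignored).

```python
import re

# Bounds taken from the `4.39.0<=transformers<=4.40.2` requirement stated above.
MIN_VERSION = (4, 39, 0)
MAX_VERSION = (4, 40, 2)


def parse_version(text):
    """Convert a dotted version string such as '4.40.2' into a 3-tuple of ints."""
    numbers = []
    for piece in text.split(".")[:3]:
        # Keep only the leading digits of each component (drops suffixes like 'rc1').
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    # Pad short versions such as '4.40' to three components.
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def is_supported(version_text):
    """Return True when the version lies inside the supported range (inclusive)."""
    return MIN_VERSION <= parse_version(version_text) <= MAX_VERSION


print(is_supported("4.40.2"))  # True
print(is_supported("4.41.0"))  # False
```

In practice you would pass `transformers.__version__` to `is_supported` and stop early with a clear message when it returns `False`.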
 ## Tutorials
 CodeGeeX4-ALL-9B provides three user guides to help users quickly understand and use the model:
 1. **[System Prompt Guideline](./guides/System_prompt_guideline.md)**: This guide introduces how to use system prompts in CodeGeeX4-ALL-9B, including the VSCode extension official system prompt, customized system prompts, and some tips for maintaining multi-turn dialogue history.
 2. **[Infilling Guideline](./guides/Infilling_guideline.md)**: This guide explains the VSCode extension official infilling format, covering general infilling, cross-file infilling, and generating a new file in a repository.
 3. **[Repository Tasks Guideline](./guides/Repository_tasks_guideline.md)**: This guide demonstrates how to use repository tasks in CodeGeeX4-ALL-9B, including repository-level QA tasks and how to trigger the aicommiter capability of CodeGeeX4-ALL-9B to delete, add, and change files at the repository level.
 These guides aim to provide a comprehensive understanding and facilitate efficient use of the model.
 ## Evaluation
 On the benchmarks below, CodeGeeX4-ALL-9B ranks as the most powerful model under 10 billion parameters, even surpassing general-purpose models several times its size, and achieves the best balance between inference performance and model effectiveness.
 CodeGeeX4-ALL-9B scored `48.9` and `40.4` on the `complete` and `instruct` tasks of BigCodeBench, the highest scores among models with fewer than 20 billion parameters.
 ![BigCodeBench Test Results](./metric/pics/Bigcodebench.png)
 In CRUXEval, a benchmark for testing code reasoning, understanding, and execution capabilities, CodeGeeX4-ALL-9B achieved remarkable results thanks to its CoT (chain-of-thought) abilities. From easy code generation tasks in HumanEval and MBPP to very challenging tasks in NaturalCodeBench, CodeGeeX4-ALL-9B also achieved outstanding performance at its scale. It is currently the only code model that supports Function Call capabilities, and it even achieves a better execution success rate than GPT-4.
 ![Function Call Evaluation](./metric/pics/FunctionCall.png)
 Furthermore, in the "Code Needle In A Haystack" (NIAH) evaluation, the CodeGeeX4-ALL-9B model demonstrated its ability to retrieve code within contexts of up to 128K, achieving 100% retrieval accuracy on all Python scripts.
 <p align="center">
  <img src="./metric/pics/NIAH_PYTHON.png" alt="Image 1 description" width="45%">
  <img src="./metric/pics/NIAH_ALL.png" alt="Image 2 description" width="45%">
 </p>
 Details of the evaluation results can be found in **[Evaluation](./metric/README.md)**.
 ## License
 The code in this repository is open source under the [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) license. The model weights are licensed under the [Model License](MODEL_LICENSE). CodeGeeX4-9B weights are open for academic research. For users who wish to use the models for commercial purposes, please fill in the [registration form](https://open.bigmodel.cn/mla/form).
 ## Citation
 If you find our work helpful, please feel free to cite the following paper:
 ```
@inproceedings{zheng2023codegeex,
      title={CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X},
      author={Qinkai Zheng and Xiao Xia and Xu Zou and Yuxiao Dong and Shan Wang and Yufei Xue and Zihan Wang and Lei Shen and Andi Wang and Yang Li and Teng Su and Zhilin Yang and Jie Tang},
      booktitle={KDD},
      year={2023}
 }
 ```
--- a/README_zh.md
+++ b/README_zh.md
@ -0,0 +1,87 @@
 ![](resources/logo.jpeg)
 <p align="center">
    🏠 <a href="https://codegeex.cn" target="_blank">Homepage</a>｜🛠 Extensions <a href="https://marketplace.visualstudio.com/items?itemName=aminer.codegeex" target="_blank">VS Code</a>, <a href="https://plugins.jetbrains.com/plugin/20587-codegeex" target="_blank">Jetbrains</a>｜🤗 <a href="https://huggingface.co/THUDM/codegeex4-all-9b" target="_blank">HF Repo</a> | 🪧 <a href="https://huggingface.co/spaces/THUDM/CodeGeeX" target="_blank">HF DEMO</a>
 </p>
 [English](./README.md) | [中文](./README_zh.md)
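 # CodeGeeX4: Open Multilingual Code Generation Model
 We introduce CodeGeeX4-ALL-9B, the open-source version of the latest CodeGeeX4 model series. It is a multilingual code generation model continually trained on [GLM-4-9B](https://github.com/THUDM/GLM-4), which significantly enhances its code generation capabilities. A single CodeGeeX4-ALL-9B model supports many functions, including code completion and generation, code interpreter, web search, function call, and repository-level code Q&A, covering various scenarios of software development. CodeGeeX4-ALL-9B achieves highly competitive performance on public benchmarks such as [BigCodeBench](https://huggingface.co/datasets/bigcode/bigcodebench) and [NaturalCodeBench](https://github.com/THUDM/NaturalCodeBench). It is currently the most powerful code generation model with fewer than 10 billion parameters, even surpassing larger general-purpose models, and it achieves the best balance between inference speed and model performance.
 ## Model List
 | Model             | Type | Seq Length | Download                                                                                                                                                                                                    |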
 |-------------------|------|------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | codegeex4-all-9b  | Chat | 128K       | [🤗 Huggingface](https://huggingface.co/THUDM/codegeex4-all-9b) [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/codegeex4-all-9b) [🟣 WiseModel](https://wisemodel.cn/models/ZhipuAI/codegeex4-all-9b)    |
 ## Get Started
 Use `4.39.0<=transformers<=4.40.2` to deploy [codegeex4-all-9b](https://huggingface.co/THUDM/codegeex4-all-9b):
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
 # Run on the GPU when one is available, otherwise fall back to the CPU.
 device = "cuda" if torch.cuda.is_available() else "cpu"
 tokenizer = AutoTokenizer.from_pretrained("THUDM/codegeex4-all-9b", trust_remote_code=True)
 model = AutoModelForCausalLM.from_pretrained(
    "THUDM/codegeex4-all-9b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True
 ).to(device).eval()
 # Wrap the user message in the chat template and tokenize it.
 inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "write a quick sort"}],
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
    return_dict=True,
 ).to(device)
 with torch.no_grad():
    outputs = model.generate(**inputs)
    # Drop the prompt tokens so that only the newly generated tokens are decoded.
    outputs = outputs[:, inputs['input_ids'].shape[1]:]
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
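 ## Tutorials
 We provide user guides for CodeGeeX4-ALL-9B to help users quickly understand and use the model:
 ![All Functions](./resources/all_functions.jpg)
 1. **[System Prompt Guideline](./guides/System_prompt_guideline_zh.md)**: This guide introduces how to use system prompts in CodeGeeX4-ALL-9B, including the official system prompt of the VSCode extension, customized system prompts, and some tips for maintaining multi-turn dialogue history.
 2. **[Infilling Guideline](./guides/Infilling_guideline_zh.md)**: This guide explains the official infilling format of the VSCode extension, covering general infilling, cross-file infilling, and generating a new file in a repository.
 3. **[Repository Tasks Guideline](./guides/Repository_tasks_guideline_zh.md)**: This guide demonstrates how to use repository-level tasks in CodeGeeX4-ALL-9B, including repository-level Q&A tasks and how to trigger the aicommiter capability of CodeGeeX4-ALL-9B to delete, add, and change files at the repository level.
 These guides aim to help everyone fully understand how to use the model and make better use of its capabilities.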
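 ## Evaluation
 CodeGeeX4-ALL-9B is rated as the most powerful model under 10 billion parameters, even surpassing general-purpose models several times its size, and achieves the best balance between inference performance and model capability.
 On the complete and instruct tasks of BigCodeBench, CodeGeeX4-ALL-9B scored `48.9` and `40.4` respectively, the highest scores among models with fewer than 20 billion parameters.
 ![BigCodeBench Test Results](./metric/pics/Bigcodebench.png)
 Crux-Eval is a benchmark for testing code reasoning, understanding, and execution capabilities, and CodeGeeX4-ALL-9B performs remarkably well on it thanks to its strong CoT abilities. On code generation tasks such as HumanEval, MBPP, and NaturalCodeBench, CodeGeeX4-ALL-9B also achieved outstanding results. It is currently the only code model that supports Function Call capabilities, and it even achieves a higher score than GPT-4.
 ![Function Call Evaluation](./metric/pics/FunctionCall.png)
 Furthermore, in the "Code Needle In A Haystack" (NIAH) evaluation, the CodeGeeX4-ALL-9B model demonstrated its ability to retrieve code within contexts of up to 128K, achieving 100% retrieval accuracy for Python, and it also performed well on cross-file completion tasks.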
 <p align="center">
  <img src="./metric/pics/NIAH_PYTHON.png" alt="Image 1 description" width="45%">
  <img src="./metric/pics/NIAH_ALL.png" alt="Image 2 description" width="45%">
 </p>
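 More detailed evaluation results can be found in **[Evaluation Results](./metric/README_zh.md)**.
 ## License
 The code in this repository is open source under the [Apache-2.0](https://www.apache.org/licenses/LICENSE-2.0) license. The model weights are licensed under the [Model License](MODEL_LICENSE). CodeGeeX4-9B weights are open for academic research. For users who wish to use the models for commercial purposes, please fill in the [registration form](https://open.bigmodel.cn/mla/form).
 ## Citation
 If you find our work helpful, please feel free to cite the following paper: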
 ```
@inproceedings{zheng2023codegeex,
      title={CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X},
      author={Qinkai Zheng and Xiao Xia and Xu Zou and Yuxiao Dong and Shan Wang and Yufei Xue and Zihan Wang and Lei Shen and Andi Wang and Yang Li and Teng Su and Zhilin Yang and Jie Tang},
      booktitle={KDD},
      year={2023}
 }
 ```
--- a/guides/Infilling_guideline.md
+++ b/guides/Infilling_guideline.md
@ -0,0 +1,216 @@
 **Code Completion Tutorial: Contextual, Cross-File, and Project-Level Completion**
 This tutorial introduces the code completion capabilities of plugin-enabled models: contextual completion, cross-file completion, and project-level file completion.
 - **Contextual Completion:** Completes code within the same file, based on the cursor's position and the surrounding context.
 - **Cross-File Completion:** Improves completion by incorporating dependencies or related files of the current code file.
 - **Project-Level Completion:** Generates complete new files based on your project information and requirements.
 You can use the CodeGeeX4-ALL-9B-128k model and control memory requirements by setting different `max_length` values. For example, setting `max_length` to 16k or 32k lets the model run on consumer-grade graphics cards.
 <a name="heading_0"></a>**Code Completion Usage Tutorial**
 <a name="heading_1"></a>1. **Contextual Completion**
 1. **File Path:** `"###PATH:"` + relative file path or file name
 2. **Language Tag:** The `"###LANGUAGE:"` line is very important and must always be included. The list of languages is mentioned above; a tag is generally the language name with an initial capital letter, with a few exceptions. If you are unsure about the language, leave the value after the colon blank.
 3. **Two Modes:** Set with `"###MODE:"`. `LINE` generates a single line and `BLOCK` generates multiple lines. The default is `BLOCK` mode.
 4. **Format:**
 ```
 <|user|>
 ###PATH:{path}
 ###LANGUAGE:{code_language}
 ###MODE:{LINE/BLOCK}
 <|code_suffix|>{code}<|code_prefix|>{code}<|code_middle|><|assistant|>\n
 ```
 5. **Example:**
 - When all information, including path, code_language, mode, suffix, and prefix, is provided:
 ```
 <|user|>
 ###PATH:src.py
 ###LANGUAGE:Python
 ###MODE:LINE/BLOCK
 <|code_suffix|>            else:
                depth -= 1
        return max_depth
    return [parse_paren_group(x) for x in paren_string.split(' ') if x]
 <|code_prefix|>from typing import List
 def parse_nested_parens(paren_string: str) -> List[int]:
    """ Input to this function is a string represented multiple groups for nested parentheses separated by spaces.
    For each of the group, output the deepest level of nesting of parentheses.
    E.g. (()()) has maximum two levels of nesting while ((())) has three.
    >>> parse_nested_parens('(()()) ((())) () ((())()())')
    [2, 3, 1, 3]
    """
 <|code_middle|><|assistant|>\n
 ```
 - When no language and no suffix are provided:
 ```
 <|user|>
 ###PATH:src.py
 ###LANGUAGE:
 ###MODE:LINE/BLOCK
 <|code_suffix|><|code_prefix|>from typing import List
 def parse_nested_parens(paren_string: str) -> List[int]:
    """ Input to this function is a string represented multiple groups for nested parentheses separated by spaces.
    For each of the group, output the deepest level of nesting of parentheses.
    E.g. (()()) has maximum two levels of nesting while ((())) has three.
    >>> parse_nested_parens('(()()) ((())) () ((())()())')
    [2, 3, 1, 3]
    """
 <|code_middle|><|assistant|>\n
 ```
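The contextual completion format above can be assembled programmatically. The following is a minimal sketch, assuming the special tokens are passed to the model as plain text exactly as written in the format; the function name `build_infilling_prompt` and its parameters are hypothetical and not part of the official plugin.

```python
def build_infilling_prompt(prefix, suffix="", path="", language="", mode="BLOCK"):
    """Assemble a contextual-completion prompt in the format described above.

    `prefix` is the code before the cursor and `suffix` the code after it.
    `language` may be left empty when the language is unknown.
    """
    if mode not in ("LINE", "BLOCK"):
        raise ValueError("mode must be 'LINE' or 'BLOCK'")
    return (
        "<|user|>\n"
        f"###PATH:{path}\n"
        f"###LANGUAGE:{language}\n"
        f"###MODE:{mode}\n"
        # The suffix comes first, then the prefix, then the marker for the middle.
        f"<|code_suffix|>{suffix}<|code_prefix|>{prefix}<|code_middle|><|assistant|>\n"
    )


prompt = build_infilling_prompt(
    prefix="def add(a, b):\n",
    suffix="\nprint(add(1, 2))\n",
    path="src.py",
    language="Python",
    mode="LINE",
)
print(prompt)
```

Keeping the template in one function makes it easy to switch between `LINE` and `BLOCK` mode without rewriting the prompt by hand.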
 <a name="heading_2"></a>2. **Cross-File Infilling**
 1. **Reference Code:** Format the reference code as follows:
 ```
 ###REFERENCE:
 ###PATH: relative file path or file name
 code snippet
 ###REFERENCE:
 ###PATH: relative file path or file name
 code snippet
 ```
 2. **File Path:** `"###PATH:"` + relative file path or file name
 3. **Language Tag:** The `"###LANGUAGE:"` line is very important and must always be included. The list of languages is mentioned above; a tag is generally the language name with an initial capital letter, with a few exceptions. If you are unsure about the language, leave the value after the colon blank.
 4. **Two Modes:** Set with `"###MODE:"`. `LINE` generates a single line and `BLOCK` generates multiple lines. The default is `BLOCK` mode.
 5. **Format:**
 ```
 <|user|>
 ###REFERENCE:
 ###PATH:{path}
 {code}
 ...
 ...
 ...
 ###REFERENCE:
 ###PATH:{path}
 {code}
 ###PATH:{path}\n\n###LANGUAGE:{code_language}\n###MODE:{LINE/BLOCK}\n<|code_suffix|>{code}<|code_prefix|>{code}<|code_middle|><|assistant|>\n
 ```
 6. Example:
 ```
 <|user|>
 ###PATH:./sort/quick_sort.py
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ###PATH:src.py
 ###LANGUAGE:Python
 ###MODE:LINE/BLOCK
 <|code_suffix|>            else:
                depth -= 1
        return max_depth
    return [parse_paren_group(x) for x in paren_string.split(' ') if x]
 <|code_prefix|>from typing import List
 def parse_nested_parens(paren_string: str) -> List[int]:
    """ Input to this function is a string represented multiple groups for nested parentheses separated by spaces.
    For each of the group, output the deepest level of nesting of parentheses.
    E.g. (()()) has maximum two levels of nesting while ((())) has three.
    >>> parse_nested_parens('(()()) ((())) () ((())()())')
    [2, 3, 1, 3]
    """
 <|code_middle|><|assistant|>\n
 ```
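A cross-file prompt differs from the single-file one only in the reference blocks placed before the current file. The sketch below follows the format template in this section; `build_cross_file_prompt` is a hypothetical name, and the use of single newlines between tags is an assumption based on the example.

```python
from typing import List, Tuple


def build_cross_file_prompt(
    references: List[Tuple[str, str]],
    prefix: str,
    suffix: str = "",
    path: str = "src.py",
    language: str = "",
    mode: str = "BLOCK",
) -> str:
    """Prepend ###REFERENCE: blocks to the infilling layout (hypothetical helper)."""
    parts = ["<|user|>\n"]
    for ref_path, ref_code in references:
        # End every reference with a newline so the next tag starts on its own line.
        if not ref_code.endswith("\n"):
            ref_code += "\n"
        parts.append(f"###REFERENCE:\n###PATH:{ref_path}\n{ref_code}")
    parts.append(f"###PATH:{path}\n###LANGUAGE:{language}\n###MODE:{mode}\n")
    parts.append(
        f"<|code_suffix|>{suffix}<|code_prefix|>{prefix}<|code_middle|><|assistant|>\n"
    )
    return "".join(parts)


prompt = build_cross_file_prompt(
    references=[("./sort/quick_sort.py", "def quick_sort(arr):\n    return sorted(arr)")],
    prefix="from typing import List\n",
    language="Python",
)
print(prompt)
```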
 <a name="heading_3"></a>3. **Repository Level File Generation**
 1. You can use the project-level add, delete, and modify format to add a new file to a project.
 2. Related files: Example format is as follows:
 ```
 ###REFERENCE:
 ###PATH: relative file path or file name
 code snippet
 ###REFERENCE:
 ###PATH: relative file path or file name
 code snippet
 ```
 3. **File Path:** `"###PATH:"` + relative file path or file name
 4. **Language Tag:** Required. Every language tag starts with `"###LANGUAGE:"`, followed by a name from the language list above; names are generally capitalized, with a few exceptions. If you are unsure of the language, leave the value after the colon empty.
 5. **Two Modes:** `"###MODE:"`, `LINE` generates a single line, and `BLOCK` generates multiple lines. The default is `BLOCK` mode.
 6. **Format:**
 ```
 <|user|>
 ###REFERENCE:
 ###PATH:{path}
 {code}
 ...
 ...
 ...
 ###REFERENCE:
 ###PATH:{path}
 {code}
 ###PATH:{path}
 ###LANGUAGE:{code_language}
 ###MODE:{LINE/BLOCK}
 <|code_suffix|>{code}<|code_prefix|>{code}<|code_middle|><|assistant|>\n
 ```
 7. Example:
 ```
 <|user|>
 ###PATH:./sort/quick_sort.py
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ###PATH:src.py
 ###LANGUAGE:Python
 ###MODE:LINE/BLOCK
 <|code_suffix|>            else:
                depth -= 1
        return max_depth
    return [parse_paren_group(x) for x in paren_string.split(' ') if x]
 <|code_prefix|>from typing import List
 def parse_nested_parens(paren_string: str) -> List[int]:
    """ Input to this function is a string represented multiple groups for nested parentheses separated by spaces.
    For each of the group, output the deepest level of nesting of parentheses.
    E.g. (()()) has maximum two levels of nesting while ((())) has three.
    >>> parse_nested_parens('(()()) ((())) () ((())()())')
    [2, 3, 1, 3]
    """
 <|code_middle|><|assistant|>\n
 ```
--- a/guides/Infilling_guideline_zh.md
+++ b/guides/Infilling_guideline_zh.md
@ -0,0 +1,215 @@
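 **Code Completion Tutorial: Contextual Completion, Cross-File Completion, Project-Level Completion**
 This tutorial introduces the code completion capabilities of the model behind the plugin features: contextual completion, cross-file completion, and project-level file completion.
 - Contextual completion: completes code at the cursor position using the surrounding context within the same file.
 - Cross-file completion: brings in dependencies or related files of the current file to improve completion quality.
 - Project-level completion: the model generates a complete new file based on your project information and your requirements.
 You can use the CodeGeeX4-ALL-9B-128k model and control GPU memory requirements by setting different `max_length` values. For example, you can set `max_length` to 16k or 32k to run the model on consumer-grade graphics cards.
 <a name="heading_0"></a>**Code Completion Tutorial**
 <a name="heading_1"></a>1. **Contextual Infilling**
 1. File path: `"###PATH:"` + relative file path or file name
 2. Language tag: required. Every language tag starts with `"###LANGUAGE:"`, followed by a name from the language list above; names are generally capitalized, with a few exceptions. If you are unsure of the language, leave the value after the colon empty.
 3. Two modes: `"###MODE:"`, where `LINE` generates a single line and `BLOCK` generates multiple lines. The default is `BLOCK` mode.
 4. Format: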
 ```
 <|user|>
 ###PATH:{path}
 ###LANGUAGE:{code_language}
 ###MODE:{LINE/BLOCK}
 <|code_suffix|>{code}<|code_prefix|>{code}<|code_middle|><|assistant|>\n
 ```
 5. Examples:
 - With path, code_language, mode, suffix, and prefix all provided
 ```
 <|user|>
 ###PATH:src.py
 ###LANGUAGE:Python
 ###MODE:LINE/BLOCK
 <|code_suffix|>            else:
                depth -= 1
        return max_depth
    return [parse_paren_group(x) for x in paren_string.split(' ') if x]
 <|code_prefix|>from typing import List
 def parse_nested_parens(paren_string: str) -> List[int]:
    """ Input to this function is a string represented multiple groups for nested parentheses separated by spaces.
    For each of the group, output the deepest level of nesting of parentheses.
    E.g. (()()) has maximum two levels of nesting while ((())) has three.
    >>> parse_nested_parens('(()()) ((())) () ((())()())')
    [2, 3, 1, 3]
    """
 <|code_middle|><|assistant|>\n
 ```
 - With no language and no suffix provided
 ```
 <|user|>
 ###PATH:src.py
 ###LANGUAGE:
 ###MODE:LINE/BLOCK
 <|code_suffix|><|code_prefix|>from typing import List
 def parse_nested_parens(paren_string: str) -> List[int]:
    """ Input to this function is a string represented multiple groups for nested parentheses separated by spaces.
    For each of the group, output the deepest level of nesting of parentheses.
    E.g. (()()) has maximum two levels of nesting while ((())) has three.
    >>> parse_nested_parens('(()()) ((())) () ((())()())')
    [2, 3, 1, 3]
    """
 <|code_middle|><|assistant|>\n
 ```
 <a name="heading_2"></a>2. **Cross-File Completion**
 1. Related files: the format is shown below.
 ```
 ###REFERENCE:
 ###PATH: relative file path or file name
 code snippet
 ###REFERENCE:
 ###PATH: relative file path or file name
 code snippet
 ```
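 2. File path: `"###PATH:"` + relative file path or file name
 3. Language tag: required. Every language tag starts with `"###LANGUAGE:"`, followed by a name from the language list above; names are generally capitalized, with a few exceptions. If you are unsure of the language, leave the value after the colon empty.
 4. Two modes: `"###MODE:"`, where `LINE` generates a single line and `BLOCK` generates multiple lines. The default is `BLOCK` mode.
 5. Format: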
 ```
 <|user|>
 ###REFERENCE:
 ###PATH:{path}
 {code}
 ...
 ...
 ...
 ###REFERENCE:
 ###PATH:{path}
 {code}
 ###PATH:{path}
 ###LANGUAGE:{code_language}
 ###MODE:{LINE/BLOCK}
 <|code_suffix|>{code}<|code_prefix|>{code}<|code_middle|><|assistant|>\n
 ```
 6. Example:
 ```
 <|user|>
 ###PATH:./sort/quick_sort.py
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ###PATH:src.py
 ###LANGUAGE:Python
 ###MODE:LINE/BLOCK
 <|code_suffix|>            else:
                depth -= 1
        return max_depth
    return [parse_paren_group(x) for x in paren_string.split(' ') if x]
 <|code_prefix|>from typing import List
 def parse_nested_parens(paren_string: str) -> List[int]:
    """ Input to this function is a string represented multiple groups for nested parentheses separated by spaces.
    For each of the group, output the deepest level of nesting of parentheses.
    E.g. (()()) has maximum two levels of nesting while ((())) has three.
    >>> parse_nested_parens('(()()) ((())) () ((())()())')
    [2, 3, 1, 3]
    """
 <|code_middle|><|assistant|>\n
 ```
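 <a name="heading_3"></a>3. **Project-Level File Completion**
 1. You can use the project-level add, delete, and modify format to add a new file to a project.
 2. Related files: the format is shown below.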
 ```
 ###REFERENCE:
 ###PATH: relative file path or file name
 code snippet
 ###REFERENCE:
 ###PATH: relative file path or file name
 code snippet
 ```
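 3. File path: `"###PATH:"` + relative file path or file name
 4. Language tag: required. Every language tag starts with `"###LANGUAGE:"`, followed by a name from the language list above; names are generally capitalized, with a few exceptions. If you are unsure of the language, leave the value after the colon empty.
 5. Two modes: `"###MODE:"`, where `LINE` generates a single line and `BLOCK` generates multiple lines. The default is `BLOCK` mode.
 6. Format: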
 ```
 <|user|>
 ###REFERENCE:
 ###PATH:{path}
 {code}
 ...
 ...
 ...
 ###REFERENCE:
 ###PATH:{path}
 {code}
 ###PATH:{path}
 ###LANGUAGE:{code_language}
 ###MODE:{LINE/BLOCK}
 <|code_suffix|>{code}<|code_prefix|>{code}<|code_middle|><|assistant|>\n
 ```
 7. Example:
 ```
 <|user|>
 ###PATH:./sort/quick_sort.py
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ###PATH:src.py
 ###LANGUAGE:Python
 ###MODE:LINE/BLOCK
 <|code_suffix|>            else:
                depth -= 1
        return max_depth
    return [parse_paren_group(x) for x in paren_string.split(' ') if x]
 <|code_prefix|>from typing import List
 def parse_nested_parens(paren_string: str) -> List[int]:
    """ Input to this function is a string represented multiple groups for nested parentheses separated by spaces.
    For each of the group, output the deepest level of nesting of parentheses.
    E.g. (()()) has maximum two levels of nesting while ((())) has three.
    >>> parse_nested_parens('(()()) ((())) () ((())()())')
    [2, 3, 1, 3]
    """
 <|code_middle|><|assistant|>\n
 ```
--- a/guides/Repository_tasks_guideline.md
+++ b/guides/Repository_tasks_guideline.md
@ -0,0 +1,189 @@
 **Project-Level Usage Tutorial: Project File Add, Delete, Modify, Project Q&A**
 For project-level tasks, it is recommended to use the CodeGeeX4-ALL-9B-128k model version. This model supports 128k context, which is approximately 10,000 lines of code.
 Because of inference cost, and because project content can still exceed 128k, we strongly recommend trimming the model input when using the CodeGeeX4-ALL-9B-128k model.
 **Model Input Trimming**
 - BM25 & embedding recall
 - Truncate input exceeding 128k (make sure to retain the special tokens, such as `<|system|>`, `<|user|>`, and `<|assistant|>`)
 - Etc.
 Below is a list of languages supported by CodeGeeX4. You can use the primary languages as an initial way to trim the input.
 **Primary Languages (30 out of 310 languages supported)**
 ```
 ['c', 'c++', 'csharp', 'cuda', 'dart', 'go', 'haskell', 'html', 'java', 'javascript', 'json', 'kotlin', 'lua', 'markdown', 'objective-c++', 'pascal', 'php', 'python', 'r', 'ruby', 'rust', 'scala', 'shell', 'sql', 'swift', 'tex', 'txt', 'typescript', 'vue', 'xml']
 ```
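Filtering by primary language, as suggested above, could look like the following sketch. It is only an illustration under stated assumptions: the extension map `EXT_TO_LANGUAGE` is hypothetical and covers just a few of the 30 primary languages, and the budget is counted in characters as a rough stand-in for tokens, which the real 128k limit is measured in.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Tuple

# Hypothetical extension-to-language map (assumption; not from the CodeGeeX4 docs).
EXT_TO_LANGUAGE: Dict[str, str] = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".go": "go",
    ".rs": "rust",
    ".java": "java",
    ".md": "markdown",
}


def select_files(files: List[Tuple[str, str]], max_chars: int) -> List[Tuple[str, str]]:
    """Keep files in a mapped primary language until the character budget is used up."""
    selected: List[Tuple[str, str]] = []
    used = 0
    for path, code in files:
        if PurePosixPath(path).suffix.lower() not in EXT_TO_LANGUAGE:
            continue  # not a mapped primary language
        if used + len(code) > max_chars:
            continue  # this file alone would push the total over the budget
        selected.append((path, code))
        used += len(code)
    return selected


repo = [("a.py", "x" * 10), ("logo.png", "y" * 5), ("b.go", "z" * 20), ("c.rs", "w" * 5)]
print([path for path, _ in select_files(repo, max_chars=16)])
```

Files are taken in the order given, so ranking them first (for example by BM25 or embedding similarity to the query) decides which ones survive the budget.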
 <a name="heading_0"></a>**Repository-Level Task Tutorial**
 Once the model input has been processed to fit within 128k, a special input format is needed to activate the model's project Q&A and project file add, delete, and modify functions.
 <a name="heading_1"></a>1. **Repository Q&A**
 1. **System Prompt**
   The system prompt here is the same one used for general Q&A. Both Chinese and English system prompts are supported. The language of the system prompt does not directly determine the model's output language; to steer it, append a phrase such as "请用中文回答。" or "Please answer in English." to the end of the system prompt.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.（请用中文回答。/Please answer in English.）
 ```
 2. Repository Q&A Format
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.<|user|>
 ###PATH: <<File1 Path>>
 <<File1 Code Content>>
 ###PATH: <<File2 Path>>
 <<File2 Code Content>>
 ###PATH: <<File3 Path>>
 <<File3 Code Content>>
 ###PATH: <<File4 Path>>
 <<File4 Code Content>>
 ###PATH: <<File5 Path>>
 <<File5 Code Content>>
 <<Query>><|assistant|>\n
 ```
 3. Example:
 - Prompt
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.<|user|>
 ###PATH: sort.py
 def quick_sort(arr):
   if len(arr) <= 1:
      return arr
   needle = "codegeex4-all-9b"
   pivot = arr[len(arr) // 2]
   left = [x for x in arr if x < pivot]
   middle = [x for x in arr if x == pivot]
   right = [x for x in arr if x > pivot]
   return quick_sort(left) + middle + quick_sort(right)
 ###PATH: calculator.py
 def calculate(a, b, op):
   if op == '+':
      return a + b
   elif op == '-':
      return a - b
   elif op == '*':
      return a * b
   elif op == '/':
      if b == 0:
         return 'devider can not be 0'
      else:
         return a / b
   else:
      return 'error'
 What's the needle value in the above code snippets?<|assistant|>\n
 ```
 - Output
 ```
 According to the above code snippet, the needle value is "codegeex4-all-9b".
 ```
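The Repository Q&A input format above can be assembled programmatically. The sketch below is an illustration only: `build_repo_qa_prompt` is a hypothetical helper, and the system prompt constant simply copies the English prompt from this section.

```python
from typing import List, Tuple

# English Q&A system prompt, copied from the section above.
SYSTEM_PROMPT_QA = (
    "You are an intelligent programming assistant named CodeGeeX. You will answer "
    "any questions users have about programming, coding, and computers, and provide "
    "code that is formatted correctly, executable, accurate, and secure, and offer "
    "detailed explanations when necessary."
)


def build_repo_qa_prompt(
    files: List[Tuple[str, str]],
    query: str,
    system_prompt: str = SYSTEM_PROMPT_QA,
) -> str:
    """Concatenate the system prompt, the ###PATH: file blocks, and the user query."""
    parts = [f"<|system|>\n{system_prompt}<|user|>\n"]
    for path, code in files:
        # End every file with a newline so the next tag starts on its own line.
        if not code.endswith("\n"):
            code += "\n"
        parts.append(f"###PATH: {path}\n{code}")
    parts.append(f"{query}<|assistant|>\n")
    return "".join(parts)


prompt = build_repo_qa_prompt(
    files=[("sort.py", 'needle = "codegeex4-all-9b"')],
    query="What's the needle value in the above code snippets?",
)
print(prompt)
```

The same helper would cover the file add, delete, and modify task by passing that task's system prompt as `system_prompt`.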
 <a name="heading_2"></a>2. **Repository File Add, Delete, Modify**
 1. **System Prompt**
   For the task of adding, deleting, and modifying project files, a separate system prompt is needed. Similarly, we support both Chinese and English system prompts here. Since the model's output is code, it is not recommended to include content that induces responses in Chinese or English in the system prompt.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码。请根据用户给出的项目仓库中的代码，以及用户提出的需求，生成新的代码或者更改已有代码。输出格式：\n###PATH:{PATH}\n{CODE}
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly. Based on the code in the project repository provided by the user and the user's requirements, generate new code or modify existing code. The output format is:\n###PATH: {PATH}\n {CODE}
 ```
 2. Repository File Add, Delete, Modify Format
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly. Based on the code in the project repository provided by the user and the user's requirements, generate new code or modify existing code. The output format is:\n###PATH: {PATH}\n {CODE}<|user|>
 ###PATH: <<File1 Path>>
 <<File1 Code Content>>
 ###PATH: <<File2 Path>>
 <<File2 Code Content>>
 ###PATH: <<File3 Path>>
 <<File3 Code Content>>
 ###PATH: <<File4 Path>>
 <<File4 Code Content>>
 ###PATH: <<File5 Path>>
 <<File5 Code Content>>
 <<Query>><|assistant|>\n
 ```
 3. Example:
 - Prompt
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly. Based on the code in the project repository provided by the user and the user's requirements, generate new code or modify existing code. The output format is:\n###PATH: {PATH}\n {CODE}<|user|>
 ###PATH: sort.py
 def quick_sort(arr):
   if len(arr) <= 1:
      return arr
   needle = "codegeex4-all-9b"
   pivot = arr[len(arr) // 2]
   left = [x for x in arr if x < pivot]
   middle = [x for x in arr if x == pivot]
   right = [x for x in arr if x > pivot]
   return quick_sort(left) + middle + quick_sort(right)
 ###PATH: calculator.py
 def calculate(a, b, op):
   if op == '+':
      return a + b
   elif op == '-':
      return a - b
   elif op == '*':
      return a * b
   elif op == '/':
      if b == 0:
         return 'devider can not be 0'
      else:
         return a / b
   else:
      return 'error'
 Write a new script with a simple test for calculator<|assistant|>\n
 ```
 - Output
 ```
 ###PATH: test_calculator.py
 import calculator
 def test_calculator():
   assert calculator.calculate(2, 3, '+') == 5
   assert calculator.calculate(5, 2, '-') == 3
   assert calculator.calculate(3, 4, '*') == 12
   assert calculator.calculate(10, 2, '/') == 5
   assert calculator.calculate(10, 0, '/') == 'devider can not be 0'
   assert calculator.calculate(10, 2, '%') == 'error'
 test_calculator()
 ```
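Because the system prompt asks for output in the form `###PATH: {PATH}` followed by the code, the response can be split back into files. The following is a minimal sketch that assumes the model follows that format exactly; `parse_file_blocks` is a hypothetical helper, not part of CodeGeeX4.

```python
import re
from typing import Dict

# A line that starts with ###PATH: and carries the file path after the colon.
_PATH_LINE = re.compile(r"^###PATH:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def parse_file_blocks(output: str) -> Dict[str, str]:
    """Split model output into {path: code} using the ###PATH: marker lines."""
    matches = list(_PATH_LINE.finditer(output))
    files: Dict[str, str] = {}
    for i, match in enumerate(matches):
        start = match.end()
        # A block runs until the next marker, or to the end of the output.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(output)
        files[match.group(1)] = output[start:end].strip("\n") + "\n"
    return files


model_output = (
    "###PATH: test_calculator.py\n"
    "import calculator\n"
    "def test_calculator():\n"
    "    assert calculator.calculate(2, 3, '+') == 5\n"
    "test_calculator()\n"
)
for path, code in parse_file_blocks(model_output).items():
    print(path)
    print(code)
```

Any text before the first marker is ignored, and a path that appears twice keeps only its last block; both are choices of this sketch rather than documented behavior.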
--- a/guides/Repository_tasks_guideline_zh.md
+++ b/guides/Repository_tasks_guideline_zh.md
@ -0,0 +1,191 @@
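 **Project-Level Usage Tutorial: Project File Add, Delete, Modify, Project Q&A**
 For project-level tasks, we recommend the CodeGeeX4-ALL-9B-128k model version, which supports a 128k context, roughly 10,000 lines of code.
 Because of inference cost, and because project content can still exceed 128k, we strongly recommend trimming the model input when using the CodeGeeX4-ALL-9B-128k model.
 Model Input Trimming
 - BM25 & embedding recall
 - Truncate input exceeding 128k (make sure to retain the special tokens, such as <|system|>, <|user|>, <|assistant|>)
 - etc.
 Below is a list of languages supported by CodeGeeX4. You can use the primary languages as an initial way to trim the input.
 Primary languages (30 out of 310 languages supported)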
 ```
 ['c', 'c++', 'csharp', 'cuda', 'dart', 'go', 'haskell', 'html', 'java', 'javascript', 'json', 'kotlin', 'lua', 'markdown', 'objective-c++', 'pascal', 'php', 'python', 'r', 'ruby', 'rust', 'scala', 'shell', 'sql', 'swift', 'tex', 'txt', 'typescript', 'vue', 'xml']
 ```
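 <a name="heading_0"></a>**Project-Level Usage Tutorial**
 Once the model input has been processed to fit within 128k, a special input format is needed to activate the model's project Q&A and project file add, delete, and modify functions.
 <a name="heading_1"></a>1. **Project Q&A**
 1. System Prompt
   The system prompt here is the same one used for general Q&A. Both Chinese and English system prompts are supported. The language of the system prompt does not directly determine the model's output language; to steer it, append （请用中文回答。/Please answer in English.） to the end of the system prompt.
 - Chinese system prompt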
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.（请用中文回答。/Please answer in English.）
 ```
 2. Project Q&A input format
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。<|user|>
 ###PATH: <<File1 Path>>
 <<File1 Code Content>>
 ###PATH: <<File2 Path>>
 <<File2 Code Content>>
 ###PATH: <<File3 Path>>
 <<File3 Code Content>>
 ###PATH: <<File4 Path>>
 <<File4 Code Content>>
 ###PATH: <<File5 Path>>
 <<File5 Code Content>>
 <<Query>><|assistant|>\n
 ```
 3. Example
 - Input
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。<|user|>
 ###PATH: sort.py
 def quick_sort(arr):
   if len(arr) <= 1:
      return arr
   needle = "codegeex4-all-9b"
   pivot = arr[len(arr) // 2]
   left = [x for x in arr if x < pivot]
   middle = [x for x in arr if x == pivot]
   right = [x for x in arr if x > pivot]
   return quick_sort(left) + middle + quick_sort(right)
 ###PATH: calculator.py
 def calculate(a, b, op):
   if op == '+':
      return a + b
   elif op == '-':
      return a - b
   elif op == '*':
      return a * b
   elif op == '/':
      if b == 0:
         return '除数不能为0'
      else:
         return a / b
   else:
      return '无效的运算符'
 以上代码中，needle的赋值是什么？<|assistant|>\n
 ```
 - Output
 ```
 根据以上代码，needle的值是"codegeex4-all-9b"
 ```
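 <a name="heading_2"></a>2. **Project File Add, Delete, Modify**
 1. System Prompt
   This task requires a separate system prompt. Both Chinese and English system prompts are supported. Because the model outputs code here, we do not recommend adding phrases to the system prompt that steer the answer toward Chinese or English.
 - Chinese system prompt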
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码。请根据用户给出的项目仓库中的代码，以及用户提出的需求，生成新的代码或者更改已有代码。输出格式：\n###PATH:{PATH}\n{CODE}
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly. Based on the code in the project repository provided by the user and the user's requirements, generate new code or modify existing code. The output format is:\n###PATH: {PATH}\n {CODE}
 ```
 2. Input format for project file add, delete, modify
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码。请根据用户给出的项目仓库中的代码，以及用户提出的需求，生成新的代码或者更改已有代码。输出格式：\n###PATH:{PATH}\n{CODE}<|user|>
 ###PATH: <<File1 Path>>
 <<File1 Code Content>>
 ###PATH: <<File2 Path>>
 <<File2 Code Content>>
 ###PATH: <<File3 Path>>
 <<File3 Code Content>>
 ###PATH: <<File4 Path>>
 <<File4 Code Content>>
 ###PATH: <<File5 Path>>
 <<File5 Code Content>>
 <<Query>><|assistant|>\n
 ```
 3. Example
 - Input
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码。请根据用户给出的项目仓库中的代码，以及用户提出的需求，生成新的代码或者更改已有代码。输出格式：\n###PATH:{PATH}\n{CODE}<|user|>
 ###PATH: sort.py
 def quick_sort(arr):
   if len(arr) <= 1:
      return arr
   needle = "codegeex4-all-9b"
   pivot = arr[len(arr) // 2]
   left = [x for x in arr if x < pivot]
   middle = [x for x in arr if x == pivot]
   right = [x for x in arr if x > pivot]
   return quick_sort(left) + middle + quick_sort(right)
 ###PATH: calculator.py
 def calculate(a, b, op):
   if op == '+':
      return a + b
   elif op == '-':
      return a - b
   elif op == '*':
      return a * b
   elif op == '/':
      if b == 0:
         return '除数不能为0'
      else:
         return a / b
   else:
      return '无效的运算符'
 写一个新的脚本，为calculator写一个简单测试<|assistant|>\n
 ```
 - Output
 ```
 ###PATH: test_calculator.py
 import calculator
 def test_calculator():
   assert calculator.calculate(2, 3, '+') == 5
   assert calculator.calculate(5, 2, '-') == 3
   assert calculator.calculate(3, 4, '*') == 12
   assert calculator.calculate(10, 2, '/') == 5
   assert calculator.calculate(10, 0, '/') == '除数不能为0'
   assert calculator.calculate(10, 2, '%') == '无效的运算符'
 test_calculator()
 ```
--- a/guides/System_prompt_guideline.md
+++ b/guides/System_prompt_guideline.md
@ -0,0 +1,372 @@
 **System Prompt Tutorial: Plugin Features, Multi-Turn Conversations, Custom System Prompts**
 This tutorial primarily introduces the system prompts for plugin features, the use of custom system prompts, and the strategy for maintaining historical information in multi-turn conversations.
 Plugin features include: code comments, unit testing, code explanation, code translation, candidate questions, code fixing, code review, and file Q&A.
 You can use the CodeGeeX4-ALL-9B-128k model and control memory requirements by setting different `max_length` values. For example, you can set `max_length` to 16k or 32k to run the model on consumer-grade graphics cards.
 <a name="heading_0"></a>**System Prompt Usage Tutorial**
 <a name="heading_1"></a>1. **Chat & General Mode**
 - We support both Chinese and English system prompts. The system prompt language does not directly affect the model's output language. You can guide the model's output language by using phrases like "请用中文回答。" or "Please answer in English." at the end of the system prompt.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.（请用中文回答。/Please answer in English.）
 ```
 - Prompting Format
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）<|user|>
 <<Query>><|assistant|>\n
 ```
 - Example
 ``````
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Please answer in English.<|user|>
 Write a Quick sort in Python<|assistant|>
 Quick Sort is a commonly used sorting algorithm with an average time complexity of O(nlogn). Below is an example of Quick Sort implemented in Python:
 ```python
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ```
 ``````
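The Chat & General Mode format above can be assembled in a few lines. This is a sketch under assumptions: `build_chat_prompt` and its `answer_language` parameter are hypothetical, and appending the language phrase to the end of the system prompt follows the advice given in this section.

```python
# Phrases from this section that steer the output language.
LANGUAGE_PHRASES = {
    "zh": "请用中文回答。",
    "en": " Please answer in English.",
}


def build_chat_prompt(system_prompt: str, query: str, answer_language: str = "") -> str:
    """Build a single-turn prompt; `answer_language` may be "zh", "en", or empty."""
    # An unknown or empty language code leaves the system prompt unchanged.
    system = system_prompt + LANGUAGE_PHRASES.get(answer_language, "")
    return f"<|system|>\n{system}<|user|>\n{query}<|assistant|>\n"


prompt = build_chat_prompt(
    system_prompt="You are an intelligent programming assistant named CodeGeeX.",
    query="Write a Quick sort in Python",
    answer_language="en",
)
print(prompt)
```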
 <a name="heading_2"></a>2. **Code Comments**
 - We support both Chinese and English system prompts. The system prompt language does not directly affect the model's output language. You can guide the model's output language by using phrases like "请用中文回答。" or "Please answer in English." at the end of the system prompt. In the code comments scenario, Chinese and English correspond to the language used for the comments.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请为输入代码提供格式规范的注释，包含多行注释和单行注释，请注意不要改动原始代码，只需要添加注释。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please provide well-formatted comments for the given code, including both multi-line and single-line comments. Do not modify the original code, only add comments. Output only the code.（请用中文回答。/Please answer in English.）
 ```
 - Prompting Format
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please provide well-formatted comments for the given code, including both multi-line and single-line comments. Do not modify the original code, only add comments. Output only the code.<|user|>
 <<Code Snippet>>
 <<Query>><|assistant|>\n
 ```
 <a name="heading_3"></a>3. **Code Explanation**
 - We support both Chinese and English system prompts. The system prompt language does not directly affect the model's output language. You can guide the model's output language by using phrases like "请用中文回答。" or "Please answer in English." at the end of the system prompt.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请解释输入代码的含义，包括实现原理、用途和注意事项等。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please explain the meaning of the input code, including the implementation principle, purpose, and precautions.（请用中文回答。/Please answer in English.）
 ```
 - Prompting Format
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please explain the meaning of the input code, including the implementation principle, purpose, and precautions.<|user|>
 <<Code Snippet>>
 <<Query>>
 Explain the above code snippet<|assistant|>\n
 ```
 <a name="heading_4"></a>4. **Code Translation**
 - We support both Chinese and English system prompts. Mixed Chinese and English output is not supported because the output consists only of code, and any comments within the code will follow the comments in the input code.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请将输入代码翻译成目标语言，需要满足目标语言的语法规范，并保证功能正确性。目标语言：{目标语言}。
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please translate the input code into the target language, ensuring that it adheres to the syntax rules of the target language and guarantees functional correctness. Target language: {target_language}.
 ```
 - Prompting Format
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please translate the input code into the target language, ensuring that it adheres to the syntax rules of the target language and guarantees functional correctness. Target language: {target_language}.<|user|>
 <<Code Snippet>>
 translate the above code snippet into {target_language}, only output the translation result, without explanation.<|assistant|>\n
 ```
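In the Code Translation format, `{target_language}` appears in both the system prompt and the user turn, so it is convenient to fill both from one argument. The helper below is a hypothetical sketch; only the prompt text is taken from this section.

```python
# English code-translation system prompt from this section, with its placeholder.
TRANSLATION_SYSTEM_PROMPT = (
    "You are an intelligent programming assistant named CodeGeeX. You will answer "
    "any questions users have about programming, coding, and computers, and provide "
    "code that is formatted correctly, executable, accurate, and secure, and offer "
    "detailed explanations when necessary. Task: Please translate the input code "
    "into the target language, ensuring that it adheres to the syntax rules of the "
    "target language and guarantees functional correctness. "
    "Target language: {target_language}."
)


def build_translation_prompt(code: str, target_language: str) -> str:
    """Fill the target language into the system prompt and the user instruction."""
    system = TRANSLATION_SYSTEM_PROMPT.format(target_language=target_language)
    # End the snippet with a newline so the instruction starts on its own line.
    if not code.endswith("\n"):
        code += "\n"
    return (
        f"<|system|>\n{system}<|user|>\n{code}"
        f"translate the above code snippet into {target_language}, "
        "only output the translation result, without explanation.<|assistant|>\n"
    )


prompt = build_translation_prompt("print('hello')", "Go")
print(prompt)
```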
 <a name="heading_5"></a>5. **Code Review**
 - Currently, only Chinese output is supported. Alternatively, you can use the Chat & General Mode to construct your own system prompt to complete this task.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请对输入代码进行仔细的审查，如果有错误请指出，并提出改进意见。
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please carefully review the input of `git diff` and propose improvements for syntax logic, code quality, code performance, and code security.
 ```
 - Prompting Format
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please carefully review the input of `git diff` and propose improvements for syntax logic, code quality, code performance, and code security.<|user|>
 <<Code Snippet>>
 <<Query>><|assistant|>\n
 ```
 <a name="heading_6"></a>6. **Code Fixing**
 - We support both Chinese and English system prompts. Mixed Chinese and English output is not supported because the output consists only of code, and any comments within the code will follow the comments in the input code.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请检查代码中潜在的bug，并对代码进行修改，请注意只针对代码进行修改，没有必要不要修改注释，输出修改后的代码。
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please check for potential bugs in the code and make modifications. Ensure that only the code is modified, and do not change the comments unless necessary. Output only the modified code.
 ```
 - Prompting Format
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please check for potential bugs in the code and make modifications. Ensure that only the code is modified, and do not change the comments unless necessary. Output only the modified code.<|user|>
 <<Code Snippet>>
 <<Query>><|assistant|>\n
 ```
 <a name="heading_7"></a>7. **Unit Testing**
 - We support both Chinese and English system prompts. Mixed Chinese and English output is not supported because the output consists only of code, and any comments within the code will follow the comments in the input code.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请为输入代码生成单元测试，保证测例的正确性，并覆盖尽可能多的情况。
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please generate unit tests for the input code to ensure the correctness and accuracy of the test cases, and cover as many scenarios as possible to ensure better testing of corner cases. Output only the code.
 ```
 - Prompting Format
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please generate unit tests for the input code to ensure the correctness and accuracy of the test cases, and cover as many scenarios as possible to ensure better testing of corner cases. Output only the code.<|user|>
 <<Code Snippet>>
 <<Query>><|assistant|>\n
 ```
 <a name="heading_8"></a>8. **Candidate Questions**
 - We support both Chinese and English system prompts. The system prompt language does not directly affect the model's output language. You can guide the model's output language by using phrases like "请用中文回答。" or "Please answer in English." at the end of the system prompt.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请预测用户接下来会问的问题，注意和历史记录相关。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer questions users have about programming, coding, and computers, and engage in dialogue by asking questions. Task: Based on the previous conversation, predict the next question the user is likely to ask. The question should be concise and align with the previous conversation content.（请用中文回答。/Please answer in English.）
 ```
 - Prompting Format
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer questions users have about programming, coding, and computers, and engage in dialogue by asking questions. Task: Based on the previous conversation, predict the next question the user is likely to ask. The question should be concise and align with the previous conversation content.<|user|>
 <<user history 1>><|assistant|>
 <<model output history 1>><|user|>
 <<user history 2>><|assistant|>
 <<model output history 2>><|user|>
 <<user history 3>><|assistant|>
 <<model output history 3>><|user|>
 Based on the previous conversation, here's a programming-related question you can ask.<|assistant|>\n
 ```
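The history layout above alternates `<|user|>` and `<|assistant|>` turns and ends with one more user turn. A minimal Python sketch of that assembly follows. The helper name `build_multi_turn_prompt` and the `(user_text, model_text)` tuple representation of history are assumptions made for this example.

```python
from typing import List, Tuple


def build_multi_turn_prompt(
    system_prompt: str,
    history: List[Tuple[str, str]],
    next_input: str,
) -> str:
    """Assemble a prompt from past turns plus the next user input.

    Hypothetical helper: `history` holds (user_text, model_text) pairs, oldest first.
    """
    parts = [f"<|system|>\n{system_prompt}"]
    for user_text, model_text in history:
        # Each past turn is a user input followed by the model's reply.
        parts.append(f"<|user|>\n{user_text}<|assistant|>\n{model_text}")
    # The prompt ends with the new user input and an open assistant turn.
    parts.append(f"<|user|>\n{next_input}<|assistant|>\n")
    return "".join(parts)


candidate_prompt = build_multi_turn_prompt(
    "You are an intelligent programming assistant named CodeGeeX.",
    [("write a quick sort using python.", "def quick_sort(arr): ...")],
    "Based on the previous conversation, here's a programming-related question you can ask.",
)
print(candidate_prompt)
```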
 <a name="heading_9"></a>9. **File Q&A**
 - We support both Chinese and English system prompts. The system prompt language does not directly affect the model's output language. You can guide the model's output language by using phrases like "请用中文回答。" or "Please answer in English." at the end of the system prompt. We can use the Chat & General Mode system prompt to complete this task.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.（请用中文回答。/Please answer in English.）
 ```
 - Example
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.<|user|>
 ###PATH: sort.py
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    needle = "codegeex4-all-9b"
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 What is the value of needle?<|assistant|>\n
 ```
 ```
 according to above code, the value of needle is "codegeex4-all-9b"
 ```
 <a name="heading_10"></a>10. **Custom System Prompt**
 - We support custom system prompts to provide a personalized experience for the open-source community and CodeGeeX users. Below is an example.
 - Example
 ```
 <|system|>
 I want you to act as a Python console. I will input commands, and you will reply with what the console should display. I want you to reply only with the terminal output in a single, unique code block and nothing else. Do not write explanations. Do not enter commands unless I instruct you to do so. When I need to tell you something in English, I will put the text in curly brackets.<|user|>
 for i in range(5):
    print(i)
 <|assistant|>\n
 ```
 ```
 0
 1
 2
 3
 4
 ```
 <a name="heading_11"></a>11. **Multi-Turn Conversations**
 - For multi-turn conversations, we can use the Chat & General Mode system prompt. If a turn needs a system prompt other than the Chat & General Mode one, we recommend placing that system prompt's content in the `<|user|>` input of the same turn. An example is given below.
 - Chinese system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.（请用中文回答。/Please answer in English.）
 ```
 - Prompting Format
 ```
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.（请用中文回答。/Please answer in English.）<|user|>
 <<Query>><|assistant|>\n
 ```
 - Example
 `````
 <|system|>
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.<|user|>
 write a quick sort using python.<|assistant|>
 Quick Sort is a commonly used sorting algorithm with an average time complexity of O(nlogn). Below is an example of Quick Sort implemented in Python:
 ```python
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ```
 <|user|>
 ```python
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please provide well-formatted comments for the given code, including both multi-line and single-line comments. Do not modify the original code, only add comments. Output only the code.<|assistant|>
 ```python
 # Quick Sort
 def quick_sort(arr):
    # Check if arr is empty, if so, return arr
    if len(arr) <= 1:
        return arr
    # Choose a pivot value, here we select the middle value of the array
    pivot = arr[len(arr) // 2]
    # Place elements less than the pivot into a new array
    left = [x for x in arr if x < pivot]
    # Place elements equal to the pivot into a new array
    middle = [x for x in arr if x == pivot]
    # Place elements greater than the pivot into a new array
    right = [x for x in arr if x > pivot]
    # Recursively quick sort the left and right arrays and merge the results
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ```
 <|user|>
 who are u<|assistant|>
 I am an AI programming assistant named CodeGeeX, developed jointly by the KEG Lab at Tsinghua University and Zhipu AI. I can perform code generation and completion, automatically add comments, translate code, and provide intelligent Q&A. These features can significantly help developers improve their programming efficiency.<|user|>
 Based on the previous conversation, here's a programming-related question you can ask.<|assistant|>\n
 ``````
--- a/guides/System_prompt_guideline_zh.md
+++ b/guides/System_prompt_guideline_zh.md
@ -0,0 +1,374 @@
 **System Prompt使用教程，插件功能，多轮对话，自定义System Prompt**
 本篇教程中，主要介绍了插件功能的system prompt，自定义system prompt的使用，以及多轮对话的历史信息的维护策略。
 插件功能包括：代码注释，单元测试，代码解释，代码翻译，候选问题，代码修复，代码审查，文件问答。
 您可以使用CodeGeeX4-ALL-9B-128k模型，通过设置不同的max_length控制对显存的需求。例如，您可以将max_length设置在16k或者32k以便将模型跑在消费级显卡上
 <a name="heading_0"></a>**System Prompt 使用教程**
 <a name="heading_1"></a>1. **Chat&通用模式**
 - 我们支持中文和英文两种system prompt。中英文system prompt不直接影响模型输出的语言，您可以在system prompt的最后使用（请用中文回答。/Please answer in English.）来引导模型输出的语言
 - 中文system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.（请用中文回答。/Please answer in English.）
 ```
 - 输入格式
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）<|user|>
 <<用户Query>><|assistant|>\n
 ```
 - 例子
 ``````
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。请用中文回答。<|user|>
 用python写一个快速排序<|assistant|>
 快速排序（Quick Sort）是一种常用的排序算法，其平均时间复杂度为 O(nlogn)。下面是一个使用 Python 实现的快速排序示例：
 ```python
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ```
 ``````
 <a name="heading_2"></a>2. **代码注释**
 - 我们支持中文和英文两种system prompt。中英文system prompt不直接影响模型输出的语言，您可以在system prompt的最后使用（请用中文回答。/Please answer in English.）来引导模型输出的语言。在代码注释场景中，中英文对应的是注释部分的语言
 - 中文system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请为输入代码提供格式规范的注释，包含多行注释和单行注释，请注意不要改动原始代码，只需要添加注释。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please provide well-formatted comments for the given code, including both multi-line and single-line comments. Do not modify the original code, only add comments. Output only the code.（请用中文回答。/Please answer in English.）
 ```
 - 输入格式
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请为输入代码提供格式规范的注释，包含多行注释和单行注释，请注意不要改动原始代码，只需要添加注释。<|user|>
 <<代码块>>
 <<自定义Query>><|assistant|>\n
 ```
 <a name="heading_3"></a>3. **代码解释**
 - 我们支持中文和英文两种system prompt。中英文system prompt不直接影响模型输出的语言，您可以在system prompt的最后使用（请用中文回答。/Please answer in English.）来引导模型输出的语言
 - 中文system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请解释输入代码的含义，包括实现原理、用途和注意事项等。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please explain the meaning of the input code, including the implementation principle, purpose, and precautions.（请用中文回答。/Please answer in English.）
 ```
 - 输入格式
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请解释输入代码的含义，包括实现原理、用途和注意事项等。<|user|>
 <<代码块>>
 <<自定义Query>>
 解释以上代码<|assistant|>\n
 ```
 <a name="heading_4"></a>4. **代码翻译**
 - 我们支持中文和英文两种system prompt。不支持中英文输出，因为输出仅有代码，代码内的注释，会跟随输入代码中的注释
 - 中文system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请将输入代码翻译成目标语言，需要满足目标语言的语法规范，并保证功能正确性。目标语言：{目标语言}。
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please translate the input code into the target language, ensuring that it adheres to the syntax rules of the target language and guarantees functional correctness. Target language: {target_language}.
 ```
 - 输入格式
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请将输入代码翻译成目标语言，需要满足目标语言的语法规范，并保证功能正确性。目标语言：{目标语言}。<|user|>
 <<代码块>>
 把这段代码翻译成{目标语言}，只输出翻译结果，不要解释：<|assistant|>\n
 ```
 <a name="heading_5"></a>5. **代码审查**
 - 暂时只支持中文输出，或者您可以用聊天&通用模式自己构建system prompt完成此任务
 - 中文system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请对输入代码进行仔细的审查，如果有错误请指出，并提出改进意见。
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please carefully review the input of `git diff` and propose improvements for syntax logic, code quality, code performance, and code security.
 ```
 - 输入格式
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请对git diff输入或代码函数进行仔细的审查，并对语法逻辑，代码质量，代码性能和代码安全提出改进意见。<|user|>
 <<代码块>>
 <<自定义Query>><|assistant|>\n
 ```
 <a name="heading_6"></a>6. **代码修复**
 - 我们支持中文和英文两种system prompt。不支持中英文输出，因为输出仅有代码，代码内的注释，会跟随输入代码中的注释
 - 中文system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请检查代码中潜在的bug，并对代码进行修改，请注意只针对代码进行修改，没有必要不要修改注释，输出修改后的代码。
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please check for potential bugs in the code and make modifications. Ensure that only the code is modified, and do not change the comments unless necessary. Output only the modified code.
 ```
 - 输入格式
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请检查代码中潜在的bug，并对代码进行修改，请注意只针对代码进行修改，没有必要不要修改注释，输出修改后的代码。<|user|>
 <<代码块>>
 <<自定义Query>><|assistant|>\n
 ```
 <a name="heading_7"></a>7. **单元测试**
 - 我们支持中文和英文两种system prompt。不支持中英文输出，因为输出仅有代码，代码内的注释，会跟随输入代码中的注释
 - 中文system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请为输入代码生成单元测试，保证测例的正确性，并覆盖尽可能多的情况。
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary. Task: Please generate unit tests for the input code to ensure the correctness and accuracy of the test cases, and cover as many scenarios as possible to ensure better testing of corner cases. Output only the code.
 ```
 - 输入格式
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请为输入代码生成单元测试，保证测例的正确性，并覆盖尽可能多的情况。仅输出代码。<|user|>
 <<代码块>>
 <<自定义Query>><|assistant|>\n
 ```
 <a name="heading_8"></a>8. **候选问题**
 - 我们支持中文和英文两种system prompt。中英文system prompt不直接影响模型输出的语言，您可以在system prompt的最后使用（请用中文回答。/Please answer in English.）来引导模型输出的语言
 - 中文system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请预测用户接下来会问的问题，注意和历史记录相关。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer questions users have about programming, coding, and computers, and engage in dialogue by asking questions. Task: Based on the previous conversation, predict the next question the user is likely to ask. The question should be concise and align with the previous conversation content.（请用中文回答。/Please answer in English.）
 ```
 - 输入格式
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：根据聊天内容，预测用户接下来会问的问题，问题需要简短且有推荐意义。<|user|>
 <<历史用户输入1>><|assistant|>
 <<历史模型输出1>><|user|>
 <<历史用户输入2>><|assistant|>
 <<历史模型输出2>><|user|>
 <<历史用户输入3>><|assistant|>
 <<历史模型输出3>><|user|>
 请根据以上对话，生成一个我可以提问的，且与之前对话和编程相关的问题。<|assistant|>\n
 ```
 <a name="heading_9"></a>9. **文件问答**
 - 我们支持中文和英文两种system prompt。中英文system prompt不直接影响模型输出的语言，您可以在system prompt的最后使用（请用中文回答。/Please answer in English.）来引导模型输出的语言。我们可以使用Chat&通用模式的system prompt来完成此项任务。
 - 中文system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.（请用中文回答。/Please answer in English.）
 ```
 - 示例
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。请用中文回答。<|user|>
 ###PATH: sort.py
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    needle = "codegeex4-all-9b"
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 needle的赋值是什么？<|assistant|>\n
 ```
 ```
 根据以上代码，needle的值是"codegeex4-all-9b"
 ```
 <a name="heading_10"></a>10. **自定义 system prompt**
 - 我们支持自定义的system prompt为开源社区的小伙伴，以及codegeex的用户，提供个性化使用体验。以下是一个示例。
 - 例子
 ```
 <|system|>
 我希望你充当一个 python 控制台。我将输入命令，你将回复控制台应该显示的内容。我希望你只在一个独特的代码块中回复终端输出，除此之外什么也不要写。不要写解释。除非我指示你这么做，否则不要输入命令。当我需要用英文告诉你一些事情时，我会把文本放在大括号里<|user|>
 for i in range(5):
    print(i)
 <|assistant|>\n
 ```
 ```
 0
 1
 2
 3
 4
 ```
 <a name="heading_11"></a>11. **多轮对话**
 - 多轮对话我们可以使用Chat&通用模式作为system prompt。如果多轮对话中使用了除Chat&通用模式外的system prompt，我们建议您将system prompt的信息放入同轮次`<|user|>`输入中。以下我们给出一个示例。
 - 中文system prompt
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）
 ```
 - English system prompt
 ```
 You are an intelligent programming assistant named CodeGeeX. You will answer any questions users have about programming, coding, and computers, and provide code that is formatted correctly, executable, accurate, and secure, and offer detailed explanations when necessary.（请用中文回答。/Please answer in English.）
 ```
 - 输入格式
 ```
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。（请用中文回答。/Please answer in English.）<|user|>
 <<用户Query>><|assistant|>\n
 ```
 - 例子
 `````
 <|system|>
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。<|user|>
 用python写一个快速排序<|assistant|>
 快速排序（Quick Sort）是一种常用的排序算法，其平均时间复杂度为 O(nlogn)。下面是一个使用 Python 实现的快速排序示例：
 ```python
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ```
 <|user|>
 ```python
 def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ```
 你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。任务：请为输入代码提供格式规范的注释，包含多行注释和单行注释，请注意不要改动原始代码，只需要添加注释。<|assistant|>
 ```python
 # 快速排序
 def quick_sort(arr):
    # 判断arr是否为空，如果是空就返回arr
    if len(arr) <= 1:
        return arr
    # 选择一个基准值，这里我们选择数组中间的值
    pivot = arr[len(arr) // 2]
    # 将数组中小于基准值的元素放到一个新数组中
    left = [x for x in arr if x < pivot]
    # 将数组中等于基准值的元素放到一个新数组中
    middle = [x for x in arr if x == pivot]
    # 将数组中大于基准值的元素放到一个新数组中
    right = [x for x in arr if x > pivot]
    # 递归地对左右两个数组进行快速排序，并将结果合并
    return quick_sort(left) + middle + quick_sort(right)
 arr = [3,6,8,10,1,2,1]
 print(quick_sort(arr))
 ```
 <|user|>
 你是谁<|assistant|>
 我是一个名为 CodeGeeX 的AI编程助手，是由清华大学KEG实验室和智谱AI公司共同开发的。我可以实现代码的生成与补全、自动添加注释、代码翻译以及智能问答等功能，能够帮助开发者显著提高编程效率。<|user|>
 请根据以上对话，生成一个我可以提问的，且与之前对话和编程相关的问题。<|assistant|>\n
 ``````
--- a/interpreter_demo/Dockerfile.sandbox
+++ b/interpreter_demo/Dockerfile.sandbox
@ -0,0 +1,62 @@
 FROM python:3.11-slim-bookworm
 WORKDIR /mnt/data
 RUN apt-get update && apt-get install -y \
    gcc \
    libffi-dev \
    zlib1g-dev \
    fonts-arphic-ukai \
    fonts-arphic-uming \
    fonts-ipafont-mincho \
    fonts-ipafont-gothic \
    fonts-unfonts-core \
    libgdal-dev \
    g++ \
    && rm -rf /var/lib/apt/lists/*
 RUN pip install --no-cache-dir \
    pydantic \
    tornado \
    jupyter_client \
    ipython \
    ipykernel \
    numpy \
    pandas \
    scipy \
    matplotlib \
    scikit-learn \
    notebook \
    beautifulsoup4 \
    seaborn \
    pytest \
    ipywidgets \
    sympy \
    statsmodels \
    joblib \
    cython \
    lxml \
    xlrd \
    qrcode \
    nltk \
    opencv-python \
    Pillow \
    geopandas
 ENV HOME=/mnt/data
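 # Strip setuid/setgid bits. GNU find rejects the old `-perm +MODE` form; `/MODE` matches files with any of the bits set.
 RUN find / -perm /6000 -type f -exec chmod a-s {} \; || true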
 RUN echo "set -o history -o vi" >> /etc/profile
 RUN useradd -u 999 -ms /bin/bash appuser
 RUN chown -R appuser:appuser /mnt/data
 USER appuser
 ENV JUPYTER_RUNTIME_DIR=/mnt/data/.local/share/jupyter/runtime
 ENV JUPYTER_DATA_DIR=/mnt/data/.local/share/jupyter
 ENV JUPYTER_CONFIG_DIR=/mnt/data/.jupyter
 COPY sandbox.py /sandbox.py
 VOLUME [ "/mnt/data" ]
 CMD ["python", "/sandbox.py"]
--- a/interpreter_demo/README.md
+++ b/interpreter_demo/README.md
@ -0,0 +1,28 @@
 # CodeGeeX4 Interpreter Gradio
 A fully local Gradio demo of the CodeGeeX4 Interpreter.
 ## Usage
 ### Install Dependencies
 ```bash
 pip install gradio requests
 ```
 ### Build & Launch Sandbox
 ```bash
 docker build -t sandbox -f Dockerfile.sandbox .
 docker run --name sandbox --publish 8080:8080 sandbox
 ```
 ### Launch Demo
 ```bash
 python app.py --tgi-addr <tgi-addr>
 ```
 ## Docs
 Check out the [documentation](./SANDBOX.md) for the sandbox API.
--- a/interpreter_demo/README_zh.md
+++ b/interpreter_demo/README_zh.md
@ -0,0 +1,28 @@
 # CodeGeeX4 代码解释器 DEMO
 完全本地可运行的 CodeGeeX4 代码解释器。
 ## 使用方法
 ### 安装依赖
 ```bash
 pip install gradio requests
 ```
 ### 构建并启动本地沙盒环境
 ```bash
 docker build -t sandbox -f Dockerfile.sandbox .
 docker run --name sandbox --publish 8080:8080 sandbox
 ```
 ### 启动DEMO
 ```bash
 python app.py --tgi-addr <tgi-addr>
 ```
 ## 文档
 参考 [沙盒API文档](./SANDBOX.md)。
--- a/interpreter_demo/SANDBOX.md
+++ b/interpreter_demo/SANDBOX.md
@ -0,0 +1,136 @@
 # Sandbox API
 ### Ping
 **Path:** GET `/`
 Check whether a sandbox is alive and return information about it.
 #### Request
 \-
 #### Response
 **Status:**
 - `200` if alive
 **Example:**
 ```json
 {
    "last_activity": "2006-01-02T15:04:05Z07:00", // RFC 3339
 }
 ```
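A small client-side sketch for the ping endpoint, using only the Python standard library. The base URL `http://127.0.0.1:8080` assumes the sandbox is published on local port 8080; the helper names `parse_last_activity` and `ping` are hypothetical and not part of the sandbox itself.

```python
import json
import urllib.request
from datetime import datetime


def parse_last_activity(payload: dict) -> datetime:
    """Parse the RFC 3339 `last_activity` field into a timezone-aware datetime."""
    raw = payload["last_activity"]
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    if raw.endswith("Z"):
        raw = raw[:-1] + "+00:00"
    return datetime.fromisoformat(raw)


def ping(base_url: str = "http://127.0.0.1:8080", timeout: float = 5.0) -> datetime:
    """GET `/` and return the sandbox's last activity time (hypothetical helper)."""
    with urllib.request.urlopen(f"{base_url}/", timeout=timeout) as response:
        return parse_last_activity(json.load(response))
```

Calling `ping()` raises `urllib.error.URLError` when the sandbox cannot be reached, which can double as a liveness check.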
 ### Execute
 **Path:** POST `/execute`
 #### Request
 **Content-Type:** `application/json`
 **JSON Schema:**
 | Name           | Type              | Description                                                                                            |
 | -------------- | ----------------- | ------------------------------------------------------------------------------------------------------ |
 | `code`         | string            | The code to be executed.                                                                               |
 | `timeout_secs` | number (Optional) | Abort execution after timeout. Does not include environment and runtime creation time. Defaults to 60. |
 #### Response
 **Status:**
 - `200` if successful
 **Content-Type:** `application/json`
 **Example:**
 ```json
 {
    "status": "ok", // Possible values: "ok", "timeout"
    "events": [
        {
            "type": "stream",
            "timestamp": "2006-01-02T15:04:05Z07:00", // RFC 3339
            "data": {
                "name": "stdout", // Possible values: "stdout", "stderr"
                "text": "Hello World!",
            }
        },
        {
            "type": "display_data",
            "timestamp": "2006-01-02T15:04:05Z07:00", // RFC 3339
            "data": {
                "variants": {
                    "text/plain": "<IPython.core.display.Image object>",
                    "image/png": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAIAAACQd1PeAAAADElEQVR4nGP4//8/AAX+Av4N70a4AAAAAElFTkSuQmCC" // Base64 encoded PNG image
                }
            }
        },
        {
            "type": "file", // The program has written a file to disk.
            "timestamp": "2006-01-02T15:04:05Z07:00", // RFC 3339
            "data": {
                "path": "README.md",
                "size": 128, // Size is expressed in bytes
            }
        },
        {
            "type": "error",
            "timestamp": "2006-01-02T15:04:05Z07:00", // RFC 3339
            "data": {
                "ename": "ZeroDivisionError",
                "evalue": "division by zero",
                "traceback": [
                    "\\u001b[0;31m---------------------------------------------------------------------------\\u001b[0m",
                    "\\u001b[0;31mZeroDivisionError\\u001b[0m                         Traceback (most recent call last)",
                    "Cell \\u001b[0;32mIn[1], line 2\\u001b[0m\\n\\u001b[1;32m      1\\u001b[0m \\u001b[38;5;66;03m# \\u8ba1\\u7b97\\u8868\\u8fbe\\u5f0f\\u7684\\u7ed3\\u679c\\u001b[39;00m\\n\\u001b[0;32m----> 2\\u001b[0m result \\u001b[38;5;241m=\\u001b[39m \\u001b[38;5;241;43m361234\\u001b[39;49m\\u001b[43m \\u001b[49m\\u001b[38;5;241;43m/\\u001b[39;49m\\u001b[43m \\u001b[49m\\u001b[38;5;241;43m0\\u001b[39;49m \\u001b[38;5;241m+\\u001b[39m \\u001b[38;5;241m4514\\u001b[39m \\u001b[38;5;241m*\\u001b[39m \\u001b[38;5;241m1234\\u001b[39m \\u001b[38;5;241m-\\u001b[39m \\u001b[38;5;241m27152346\\u001b[39m \\u001b[38;5;241m/\\u001b[39m \\u001b[38;5;241m2023\\u001b[39m\\n\\u001b[1;32m      3\\u001b[0m result\\n",
                    "\\u001b[0;31mZeroDivisionError\\u001b[0m: division by zero"
                ]
            }
        }
    ]
 }
 ```
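The sketch below shows one way a client might call `/execute` and reduce the event list from the example response to a short summary. It uses only the Python standard library. The function names, the default base URL `http://127.0.0.1:8080`, and the extra ten seconds of HTTP timeout are assumptions of this example, not part of the API.

```python
import json
import urllib.request
from typing import Any, Dict


def execute(
    code: str,
    base_url: str = "http://127.0.0.1:8080",
    timeout_secs: int = 60,
) -> Dict[str, Any]:
    """POST code to `/execute` and return the decoded JSON response."""
    body = json.dumps({"code": code, "timeout_secs": timeout_secs}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Give the HTTP call some slack beyond the sandbox's own execution timeout.
    with urllib.request.urlopen(request, timeout=timeout_secs + 10) as response:
        return json.load(response)


def summarize(result: Dict[str, Any]) -> Dict[str, Any]:
    """Collect stream text, error names, and written file paths from a response."""
    summary: Dict[str, Any] = {
        "status": result["status"],
        "stdout": "",
        "stderr": "",
        "errors": [],
        "files": [],
    }
    for event in result.get("events", []):
        data = event["data"]
        if event["type"] == "stream":
            summary[data["name"]] += data["text"]  # name is "stdout" or "stderr"
        elif event["type"] == "error":
            summary["errors"].append(f"{data['ename']}: {data['evalue']}")
        elif event["type"] == "file":
            summary["files"].append(data["path"])
        # "display_data" events carry MIME-keyed variants and are skipped here.
    return summary
```

A typical call would be `summarize(execute("print('Hello World!')"))`, which needs a running sandbox.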
 ### File upload
 **Path:** POST `/files/upload/-/*path`
 Upload a file to the sandbox under `*path`.
 #### Request
 **Content-Length:** The length of the file in bytes.
 **Body:** The raw contents of the file as bytes.
 #### Response
 **Status:**
 - `201` if upload was successful
 - `409` if file already exists
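A hedged client sketch for the upload endpoint, using only the Python standard library. `upload_url` and `upload` are hypothetical helpers. Percent-encoding the path is an assumption about how the server decodes `*path`, and the leading slash is stripped before the path is appended after `/-/`.

```python
import urllib.error
import urllib.parse
import urllib.request


def upload_url(base_url: str, path: str) -> str:
    """Build the upload URL for `path`, percent-encoding unsafe characters."""
    return f"{base_url}/files/upload/-/{urllib.parse.quote(path.lstrip('/'))}"


def upload(base_url: str, path: str, contents: bytes) -> bool:
    """Upload raw bytes; return True on 201 and False if the file exists (409)."""
    request = urllib.request.Request(
        upload_url(base_url, path), data=contents, method="POST"
    )
    try:
        # urllib sets Content-Length automatically for a bytes body.
        with urllib.request.urlopen(request) as response:
            return response.status == 201
    except urllib.error.HTTPError as error:
        if error.code == 409:
            return False
        raise
```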
 ### File download
 **Path:** GET `/files/download/-/*path`
 Download a file from the sandbox from `*path`.
 #### Request
 \-
 #### Response
 **Content-Type:** Automatically detected, depending on the file.
 **Content-Disposition:** `attachment; filename*=UTF-8''<filename>`
 **Body:** The raw contents of the file.
 **Status:**
 - `200` if file exists
 - `404` if file is not found
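The `Content-Disposition` header above carries the filename in percent-encoded UTF-8 form. A minimal sketch of recovering it on the client side follows. `filename_from_content_disposition` is a hypothetical helper that handles only the `filename*=UTF-8''...` form shown in this document.

```python
import urllib.parse


def filename_from_content_disposition(header: str) -> str:
    """Extract and decode the filename from a `filename*=UTF-8''...` parameter."""
    prefix = "filename*=UTF-8''"
    for part in header.split(";"):
        part = part.strip()
        if part.startswith(prefix):
            # The value is percent-encoded UTF-8; unquote() decodes it.
            return urllib.parse.unquote(part[len(prefix):])
    raise ValueError(f"no UTF-8 filename parameter in header: {header!r}")


print(filename_from_content_disposition("attachment; filename*=UTF-8''my%20report.csv"))
```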
--- a/interpreter_demo/app.py
+++ b/interpreter_demo/app.py
@ -0,0 +1,208 @@
 import argparse
 import json
 import os
 import re
 from typing import Any, Dict, List, Tuple
 import gradio as gr
 import requests
 SYSTEM_PROMPT = {
    "zh": "你是一位智能编程助手，你叫CodeGeeX，你连接着一台电脑，但请注意不能联网。在使用Python解决任务时，你可以运行代码并得到结果，如果运行结果有错误，你需要尽可能对代码进行改进。你可以处理用户上传到电脑上的文件，文件默认存储路径是/mnt/data/。",
    "en": "You are an intelligent programming assistant named CodeGeeX, connected to a computer, but please note that you cannot access the internet. When solving tasks using Python, you can run code and obtain results. If there are any errors in the results, you need to improve the code as much as possible. You can also handle files uploaded to the computer, with the default storage path being /mnt/data/.",
 }
 CODEGEEX_SPECIAL_TOKENS = {
    "user": "<|user|>",
    "assistant": "<|assistant|>",
    "system": "<|system|>",
    "observation": "<|observation|>",
    "eos": "<|endoftext|>",
 }
 parser = argparse.ArgumentParser(description="CodeGeeX4 Interpreter")
 parser.add_argument("--tgi-addr", type=str, required=True)
 parser.add_argument("--sandbox-addr", type=str, default="http://127.0.0.1:8080")
 parser.add_argument("--temperature", type=float, default=0.2)
 parser.add_argument("--top-p", type=float, default=0.95)
 args = parser.parse_args()
 code_block_regex = re.compile(r"```(.*?)\n(.*?)```", re.DOTALL)
 def execute_code_block(lang: str, code: str) -> Tuple[List[Dict[str, Any]], str]:
    """Run `code` in the sandbox and return its (events, status) pair."""
    # Only Python code blocks are executed.
    assert lang in ["python"]
    response = requests.post(
        f"{args.sandbox_addr}/execute",
        json={"code": code, "timeout_secs": 60},
    )
    response = response.json()
    print(f"[RESPONSE] {response}")
    return response["events"], response["status"]
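 def upload_file(filepath: str, contents: str):
    """Upload `contents` (UTF-8 encoded) to `filepath` inside the sandbox."""
    data = contents.encode("utf-8")
    # Log the encoded size, which is what is actually sent.
    print(f"[REQUEST] Upload {filepath} ({len(data)} bytes)")
    response = requests.post(
        f"{args.sandbox_addr}/files/upload/-/{filepath.lstrip('/')}",
        data=data,
    )
    print(f"[RESPONSE] {response.text}")
    # The sandbox answers 201 on success (409 if the file already exists).
    assert response.status_code == 201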
 def stream_chat_completion(message, history):
    should_stop = False
    round = 0
    max_rounds = 5
    file_info = ""
    for filepath in message.get("files", []):
        with open(filepath, "r") as f:
            contents = f.read()
        filename = os.path.basename(filepath)
        upload_file(f"/mnt/data/{filename}", contents)
        file_info += f"# File: /mnt/data/{filename}\n"
        file_info += f"# Size: {len(contents)}\n"
        file_info += "# File uploaded\n"
    prompt = f"{CODEGEEX_SPECIAL_TOKENS['system']}\n{SYSTEM_PROMPT['en']}\n"
    for [user_message, bot_message] in history:
        if isinstance(user_message, tuple):
            # It's a file
            pass
        else:
            # Remove any '![image](data:image/png;base64,...)' from the bot message.
            bot_message = re.sub(
                r"!\[image\]\(data:image/png;base64,[^\)]+\)", "", bot_message
            )
            prompt += f"{CODEGEEX_SPECIAL_TOKENS['user']}\n{user_message}\n"
            prompt += f"{CODEGEEX_SPECIAL_TOKENS['assistant']}\n{bot_message}\n"
    prompt += f"{CODEGEEX_SPECIAL_TOKENS['user']}\n{file_info}{message['text']}\n"
    prompt += f"{CODEGEEX_SPECIAL_TOKENS['assistant']}\n"
    stop_sequences = [
        CODEGEEX_SPECIAL_TOKENS["eos"],
        CODEGEEX_SPECIAL_TOKENS["user"],
        CODEGEEX_SPECIAL_TOKENS["observation"],
    ]
    while not should_stop and round < max_rounds:
        round += 1
        request_json_body = {
            "inputs": prompt,
            "parameters": {
                "max_new_tokens": 2048,
                "do_sample": True,
                "top_p": args.top_p,
                "temperature": args.temperature,
                "stop": stop_sequences,
                "details": True,
                "stream": False,
            },
        }
        print(f"[REQUEST] {request_json_body}")
        response = requests.post(
            f"{args.tgi_addr}/generate_stream",
            json=request_json_body,
            stream=True,
        )
        completion = ""
        for line in response.iter_lines():
            if line:
                event = line.decode("utf-8")
                if event.startswith("data:"):
                    event = event[5:].strip()
                    event = json.loads(event)
                    token = event["token"]["text"]
                    completion += token
                    prompt += token
                    # Only display the token if it's not "special".
                    if event["token"]["text"] not in CODEGEEX_SPECIAL_TOKENS.values():
                        yield token
                    # If the model asks for the code to be executed, do it.
                    if event["token"]["text"] == CODEGEEX_SPECIAL_TOKENS["observation"]:
                        match = code_block_regex.search(completion)
                        if match is None:
                            # Hm, it seems the model didn't write any code.
                            # Let's gently warn it.
                            prompt += f"\n```result\nError: no code to execute.\n```\n{CODEGEEX_SPECIAL_TOKENS['assistant']}\n"
                            yield "```\nError: no code to execute.\n```\n"
                            break
                        lang, code = match.groups()
                        events, status = execute_code_block(lang, code)
                        buffer = []
                        for exec_event in events:
                            if exec_event["type"] == "stream":
                                # Stream text is nested under "data" (see ExecutionEvent in sandbox.py).
                                buffer.append(exec_event["data"]["text"])
                            if exec_event["type"] == "display_data":
                                if "text/plain" in exec_event["data"]["variants"]:
                                    buffer.append(
                                        exec_event["data"]["variants"]["text/plain"]
                                    )
                        if status == "timeout":
                            buffer.append("Execution timed out.")
                        if status == "error":
                            buffer.append("Execution failed.")
                        prompt += f"\n```result\n{''.join(buffer)}\n```\n{CODEGEEX_SPECIAL_TOKENS['assistant']}\n"
                        yield f"```\n{''.join(buffer)}\n```\n"
                        for exec_event in events:
                            if exec_event["type"] == "display_data":
                                if "image/png" in exec_event["data"]["variants"]:
                                    yield f"![image](data:image/png;base64,{exec_event['data']['variants']['image/png']})"
                                elif "text/html" in exec_event["data"]["variants"]:
                                    yield exec_event["data"]["variants"]["text/html"]
                        break
                    # If the model otherwise ends the generation, stop here.
                    if event["details"] is not None:
                        should_stop = True
                        break
        print(f"[RESPONSE] {completion}")
 def predict(message: Dict[str, Any], history: List[List[str | None | tuple]]):
    completion = ""
    for delta in stream_chat_completion(message, history):
        completion += delta
        # Replace (sandbox:/ by (<sandbox-address>/
        completion = completion.replace(
            "sandbox:/", f"{args.sandbox_addr}/files/download/-/"
        )
        yield completion
 demo = gr.ChatInterface(
    fn=predict,
    title="CodeGeeX4 Interpreter",
    description="",
    examples=[
        {"text": "Compute factorial of 21 using code", "files": []},
        {
            "text": "Plot the class distribution of this dataset",
            "files": ["./data.csv"],
        },
        {
            "text": 'Reverse the following string and save it to a file: "9738426487936"',
            "files": [],
        },
    ],
    multimodal=True,
 )
 demo.launch()
--- a/interpreter_demo/data.csv
+++ b/interpreter_demo/data.csv
@ -0,0 +1,151 @@
 "sepal.length","sepal.width","petal.length","petal.width","variety"
 5.1,3.5,1.4,.2,"Setosa"
 4.9,3,1.4,.2,"Setosa"
 4.7,3.2,1.3,.2,"Setosa"
 4.6,3.1,1.5,.2,"Setosa"
 5,3.6,1.4,.2,"Setosa"
 5.4,3.9,1.7,.4,"Setosa"
 4.6,3.4,1.4,.3,"Setosa"
 5,3.4,1.5,.2,"Setosa"
 4.4,2.9,1.4,.2,"Setosa"
 4.9,3.1,1.5,.1,"Setosa"
 5.4,3.7,1.5,.2,"Setosa"
 4.8,3.4,1.6,.2,"Setosa"
 4.8,3,1.4,.1,"Setosa"
 4.3,3,1.1,.1,"Setosa"
 5.8,4,1.2,.2,"Setosa"
 5.7,4.4,1.5,.4,"Setosa"
 5.4,3.9,1.3,.4,"Setosa"
 5.1,3.5,1.4,.3,"Setosa"
 5.7,3.8,1.7,.3,"Setosa"
 5.1,3.8,1.5,.3,"Setosa"
 5.4,3.4,1.7,.2,"Setosa"
 5.1,3.7,1.5,.4,"Setosa"
 4.6,3.6,1,.2,"Setosa"
 5.1,3.3,1.7,.5,"Setosa"
 4.8,3.4,1.9,.2,"Setosa"
 5,3,1.6,.2,"Setosa"
 5,3.4,1.6,.4,"Setosa"
 5.2,3.5,1.5,.2,"Setosa"
 5.2,3.4,1.4,.2,"Setosa"
 4.7,3.2,1.6,.2,"Setosa"
 4.8,3.1,1.6,.2,"Setosa"
 5.4,3.4,1.5,.4,"Setosa"
 5.2,4.1,1.5,.1,"Setosa"
 5.5,4.2,1.4,.2,"Setosa"
 4.9,3.1,1.5,.2,"Setosa"
 5,3.2,1.2,.2,"Setosa"
 5.5,3.5,1.3,.2,"Setosa"
 4.9,3.6,1.4,.1,"Setosa"
 4.4,3,1.3,.2,"Setosa"
 5.1,3.4,1.5,.2,"Setosa"
 5,3.5,1.3,.3,"Setosa"
 4.5,2.3,1.3,.3,"Setosa"
 4.4,3.2,1.3,.2,"Setosa"
 5,3.5,1.6,.6,"Setosa"
 5.1,3.8,1.9,.4,"Setosa"
 4.8,3,1.4,.3,"Setosa"
 5.1,3.8,1.6,.2,"Setosa"
 4.6,3.2,1.4,.2,"Setosa"
 5.3,3.7,1.5,.2,"Setosa"
 5,3.3,1.4,.2,"Setosa"
 7,3.2,4.7,1.4,"Versicolor"
 6.4,3.2,4.5,1.5,"Versicolor"
 6.9,3.1,4.9,1.5,"Versicolor"
 5.5,2.3,4,1.3,"Versicolor"
 6.5,2.8,4.6,1.5,"Versicolor"
 5.7,2.8,4.5,1.3,"Versicolor"
 6.3,3.3,4.7,1.6,"Versicolor"
 4.9,2.4,3.3,1,"Versicolor"
 6.6,2.9,4.6,1.3,"Versicolor"
 5.2,2.7,3.9,1.4,"Versicolor"
 5,2,3.5,1,"Versicolor"
 5.9,3,4.2,1.5,"Versicolor"
 6,2.2,4,1,"Versicolor"
 6.1,2.9,4.7,1.4,"Versicolor"
 5.6,2.9,3.6,1.3,"Versicolor"
 6.7,3.1,4.4,1.4,"Versicolor"
 5.6,3,4.5,1.5,"Versicolor"
 5.8,2.7,4.1,1,"Versicolor"
 6.2,2.2,4.5,1.5,"Versicolor"
 5.6,2.5,3.9,1.1,"Versicolor"
 5.9,3.2,4.8,1.8,"Versicolor"
 6.1,2.8,4,1.3,"Versicolor"
 6.3,2.5,4.9,1.5,"Versicolor"
 6.1,2.8,4.7,1.2,"Versicolor"
 6.4,2.9,4.3,1.3,"Versicolor"
 6.6,3,4.4,1.4,"Versicolor"
 6.8,2.8,4.8,1.4,"Versicolor"
 6.7,3,5,1.7,"Versicolor"
 6,2.9,4.5,1.5,"Versicolor"
 5.7,2.6,3.5,1,"Versicolor"
 5.5,2.4,3.8,1.1,"Versicolor"
 5.5,2.4,3.7,1,"Versicolor"
 5.8,2.7,3.9,1.2,"Versicolor"
 6,2.7,5.1,1.6,"Versicolor"
 5.4,3,4.5,1.5,"Versicolor"
 6,3.4,4.5,1.6,"Versicolor"
 6.7,3.1,4.7,1.5,"Versicolor"
 6.3,2.3,4.4,1.3,"Versicolor"
 5.6,3,4.1,1.3,"Versicolor"
 5.5,2.5,4,1.3,"Versicolor"
 5.5,2.6,4.4,1.2,"Versicolor"
 6.1,3,4.6,1.4,"Versicolor"
 5.8,2.6,4,1.2,"Versicolor"
 5,2.3,3.3,1,"Versicolor"
 5.6,2.7,4.2,1.3,"Versicolor"
 5.7,3,4.2,1.2,"Versicolor"
 5.7,2.9,4.2,1.3,"Versicolor"
 6.2,2.9,4.3,1.3,"Versicolor"
 5.1,2.5,3,1.1,"Versicolor"
 5.7,2.8,4.1,1.3,"Versicolor"
 6.3,3.3,6,2.5,"Virginica"
 5.8,2.7,5.1,1.9,"Virginica"
 7.1,3,5.9,2.1,"Virginica"
 6.3,2.9,5.6,1.8,"Virginica"
 6.5,3,5.8,2.2,"Virginica"
 7.6,3,6.6,2.1,"Virginica"
 4.9,2.5,4.5,1.7,"Virginica"
 7.3,2.9,6.3,1.8,"Virginica"
 6.7,2.5,5.8,1.8,"Virginica"
 7.2,3.6,6.1,2.5,"Virginica"
 6.5,3.2,5.1,2,"Virginica"
 6.4,2.7,5.3,1.9,"Virginica"
 6.8,3,5.5,2.1,"Virginica"
 5.7,2.5,5,2,"Virginica"
 5.8,2.8,5.1,2.4,"Virginica"
 6.4,3.2,5.3,2.3,"Virginica"
 6.5,3,5.5,1.8,"Virginica"
 7.7,3.8,6.7,2.2,"Virginica"
 7.7,2.6,6.9,2.3,"Virginica"
 6,2.2,5,1.5,"Virginica"
 6.9,3.2,5.7,2.3,"Virginica"
 5.6,2.8,4.9,2,"Virginica"
 7.7,2.8,6.7,2,"Virginica"
 6.3,2.7,4.9,1.8,"Virginica"
 6.7,3.3,5.7,2.1,"Virginica"
 7.2,3.2,6,1.8,"Virginica"
 6.2,2.8,4.8,1.8,"Virginica"
 6.1,3,4.9,1.8,"Virginica"
 6.4,2.8,5.6,2.1,"Virginica"
 7.2,3,5.8,1.6,"Virginica"
 7.4,2.8,6.1,1.9,"Virginica"
 7.9,3.8,6.4,2,"Virginica"
 6.4,2.8,5.6,2.2,"Virginica"
 6.3,2.8,5.1,1.5,"Virginica"
 6.1,2.6,5.6,1.4,"Virginica"
 7.7,3,6.1,2.3,"Virginica"
 6.3,3.4,5.6,2.4,"Virginica"
 6.4,3.1,5.5,1.8,"Virginica"
 6,3,4.8,1.8,"Virginica"
 6.9,3.1,5.4,2.1,"Virginica"
 6.7,3.1,5.6,2.4,"Virginica"
 6.9,3.1,5.1,2.3,"Virginica"
 5.8,2.7,5.1,1.9,"Virginica"
 6.8,3.2,5.9,2.3,"Virginica"
 6.7,3.3,5.7,2.5,"Virginica"
 6.7,3,5.2,2.3,"Virginica"
 6.3,2.5,5,1.9,"Virginica"
 6.5,3,5.2,2,"Virginica"
 6.2,3.4,5.4,2.3,"Virginica"
 5.9,3,5.1,1.8,"Virginica"
--- a/interpreter_demo/image.png
+++ b/interpreter_demo/image.png
--- a/interpreter_demo/sandbox.py
+++ b/interpreter_demo/sandbox.py
@ -0,0 +1,401 @@
 import argparse
 import asyncio
 import json
 import logging
 import os
 import signal
 import sys
 from asyncio import Queue
 from datetime import datetime, timezone
 from typing import Annotated, List, Union
 import tornado.escape
 import tornado.ioloop
 import tornado.iostream
 import tornado.web
 from annotated_types import Gt
 from jupyter_client.asynchronous.client import AsyncKernelClient
 from jupyter_client.manager import AsyncKernelManager
 from pydantic import BaseModel
 # Shell Jupyter message types
 JupyterMessageTypeExecuteRequest = "execute_request"
 JupyterMessageTypeExecuteReply = "execute_reply"
 # IOPub Jupyter message types
 JupyterMessageTypeStream = "stream"
 JupyterMessageTypeDisplayData = "display_data"
 JupyterMessageTypeExecuteResult = "execute_result"
 JupyterMessageTypeError = "error"
 JupyterMessageTypeStatus = "status"
 # Supported Jupyter message types (IOPub only)
 JupyterSupportedMessageTypes = [
    JupyterMessageTypeStream,
    JupyterMessageTypeDisplayData,
    JupyterMessageTypeExecuteResult,
    JupyterMessageTypeError,
    JupyterMessageTypeStatus,
 ]
 # Kernel execution states
 JupyterExecutionStateBusy = "busy"
 JupyterExecutionStateIdle = "idle"
 JupyterExecutionStateStarting = "starting"
 # Saturn execution event types
 ExecutionEventTypeStream = "stream"
 ExecutionEventTypeDisplayData = "display_data"
 ExecutionEventTypeError = "error"
 # Saturn execution statuses
 ExecutionStatusOK = "ok"
 ExecutionStatusTimeout = "timeout"
 class ExecutionEventStream(BaseModel):
    stream: str
    text: str
 class ExecutionEventDisplayData(BaseModel):
    variants: dict
 class ExecutionEventError(BaseModel):
    ename: str
    evalue: str
    traceback: list[str]
 class ExecutionEvent(BaseModel):
    type: str
    timestamp: str  # RFC3339
    data: Union[
        ExecutionEventStream,
        ExecutionEventDisplayData,
        ExecutionEventError,
    ]
 class ExecuteRequest(BaseModel):
    code: str
    timeout_secs: Annotated[int, Gt(0)]
 class ExecuteResponse(BaseModel):
    status: str
    events: List[ExecutionEvent]
 class PingResponse(BaseModel):
    last_activity: str  # RFC3339
 class Error(BaseModel):
    error: str
 def datetime_to_rfc3339(dt: datetime) -> str:
    """Convert a datetime to an RFC3339 formatted string."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")
 def rfc3339_to_datetime(date_string: str) -> datetime:
    """Convert an RFC3339 formatted string to a datetime."""
    return datetime.strptime(date_string, "%Y-%m-%dT%H:%M:%S.%fZ").replace(
        tzinfo=timezone.utc
    )
 logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
 async def async_create_kernel(kernel_name: str):
    logging.info(f"Starting kernel for spec '{kernel_name}'")
    km = AsyncKernelManager(kernel_name=kernel_name)
    await km.start_kernel()
    client: AsyncKernelClient = km.client()
    client.start_channels()
    await client.wait_for_ready()
    logging.info("Kernel started")
    return km, client
 msg_id_to_queue: dict[str, Queue] = {}
 async def async_msg_producer(km: AsyncKernelManager, kc: AsyncKernelClient):
    try:
        while True:
            logging.info("Waiting for message...")
            msg = await kc.get_iopub_msg()
            log_jupyter_kernel_message(msg)
            parent_msg_id = msg["parent_header"].get("msg_id")
            if parent_msg_id in msg_id_to_queue:
                await msg_id_to_queue[parent_msg_id].put(msg)
    except Exception as e:
        logging.error(f"Error in message producer: {e}")
        await async_shutdown(km)
 async def async_shutdown(km: AsyncKernelManager):
    logging.info("Shutting down kernel...")
    await km.shutdown_kernel()
    logging.info("Kernel shut down")
    sys.exit(0)
 class State:
    def __init__(self, kernel_client: AsyncKernelClient):
        self.last_activity = datetime.now()
        self.kernel_client = kernel_client
    def reset_last_activity(self):
        self.last_activity = datetime.now()
 class MainHandler(tornado.web.RequestHandler):
    def initialize(self, state: State):
        self.state = state
    async def get(self):
        try:
            # Use the client held in the handler state rather than the module-level global.
            is_alive = await self.state.kernel_client.is_alive()
            if not is_alive:
                raise Exception("kernel is not alive")
            self.write(
                PingResponse(
                    last_activity=datetime_to_rfc3339(self.state.last_activity)
                ).model_dump_json()
            )
        except Exception as e:
            self.set_status(500)
            self.write(Error(error=str(e)).model_dump_json())
            return
 def serializer(o):
    if isinstance(o, datetime):
        return o.isoformat()
    raise TypeError("Type not serializable")
 def log_jupyter_kernel_message(msg):
    m = json.dumps(msg, default=serializer)
    logging.info(f"Jupyter: {m}")
 class ExecuteHandler(tornado.web.RequestHandler):
    def initialize(self, state: State):
        self.state = state
    async def post(self):
        parent_msg_id = None
        res: ExecuteResponse = ExecuteResponse(status=ExecutionStatusOK, events=[])
        try:
            logging.info(f"Execute request: {self.request.body}")
            self.state.reset_last_activity()
            req = ExecuteRequest.model_validate_json(self.request.body)
            local_queue = Queue()
            parent_msg_id = self.state.kernel_client.execute(req.code)
            msg_id_to_queue[parent_msg_id] = local_queue
            # Use the timeout logic on message processing
            try:
                await asyncio.wait_for(
                    self.process_messages(parent_msg_id, local_queue, res),
                    timeout=req.timeout_secs,
                )
            except asyncio.TimeoutError:
                logging.info(f"Timeout after {req.timeout_secs}s")
                res.status = ExecutionStatusTimeout
                return self.write(res.model_dump_json())
            self.state.reset_last_activity()
            self.write(res.model_dump_json())
        except Exception as e:
            self.set_status(500)
            self.write(Error(error=str(e)).model_dump_json())
        finally:
            # Cleanup after processing all messages
            if parent_msg_id is not None and parent_msg_id in msg_id_to_queue:
                del msg_id_to_queue[parent_msg_id]
            logging.info(f"Execute response: {res.model_dump_json()}")
    async def process_messages(self, parent_msg_id, queue, res):
        while True:
            msg = await queue.get()
            if msg["msg_type"] not in JupyterSupportedMessageTypes:
                continue
            elif msg["msg_type"] == JupyterMessageTypeStatus:
                if msg["content"]["execution_state"] == JupyterExecutionStateIdle:
                    break
            elif msg["msg_type"] == JupyterMessageTypeStream:
                res.events.append(
                    ExecutionEvent(
                        type=ExecutionEventTypeStream,
                        timestamp=datetime_to_rfc3339(datetime.now()),
                        data=ExecutionEventStream(
                            stream=msg["content"]["name"],
                            text=msg["content"]["text"],
                        ),
                    )
                )
            elif msg["msg_type"] == JupyterMessageTypeDisplayData:
                res.events.append(
                    ExecutionEvent(
                        type=ExecutionEventTypeDisplayData,
                        timestamp=datetime_to_rfc3339(datetime.now()),
                        data=ExecutionEventDisplayData(variants=msg["content"]["data"]),
                    )
                )
            elif msg["msg_type"] == JupyterMessageTypeError:
                res.events.append(
                    ExecutionEvent(
                        type=ExecutionEventTypeError,
                        timestamp=datetime_to_rfc3339(datetime.now()),
                        data=ExecutionEventError(
                            ename=msg["content"]["ename"],
                            evalue=msg["content"]["evalue"],
                            traceback=msg["content"]["traceback"],
                        ),
                    )
                )
            elif msg["msg_type"] == JupyterMessageTypeExecuteResult:
                res.events.append(
                    ExecutionEvent(
                        type=ExecutionEventTypeDisplayData,
                        timestamp=datetime_to_rfc3339(datetime.now()),
                        data=ExecutionEventDisplayData(variants=msg["content"]["data"]),
                    )
                )
 @tornado.web.stream_request_body
 class FileUploadHandler(tornado.web.RequestHandler):
    def initialize(self, state: State):
        self.state = state
        self.file_obj = None
    async def prepare(self):
        if self.request.method != "POST":
            self.set_status(404)
            self.finish()
            return
        path = self.path_args[0]
        full_path = os.path.join("/", path)
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
        self.file_obj = open(full_path, "wb")
        content_length = int(self.request.headers.get("Content-Length", 0))
        logging.info(f"File upload: '{path}' (Content-Length: {content_length})")
    def data_received(self, chunk):
        if self.file_obj:
            self.file_obj.write(chunk)
    async def post(self, path):
        self.state.reset_last_activity()
        if self.file_obj:
            self.file_obj.close()
        self.set_status(201)
 class FileDownloadHandler(tornado.web.RequestHandler):
    def initialize(self, state: State):
        self.state = state
    async def get(self, path):
        self.state.reset_last_activity()
        full_path = os.path.join("/", path)
        if not os.path.exists(full_path):
            self.set_status(404)
            self.write(Error(error="file not found").model_dump_json())
            return
        content_length = os.path.getsize(full_path)
        logging.info(f"File download: '{path}' (Content-Length: {content_length})")
        # Set appropriate headers for file download
        self.set_header("Content-Length", content_length)
        self.set_header("Content-Type", "application/octet-stream")
        self.set_header(
            "Content-Disposition",
            f"attachment; filename*=UTF-8''{tornado.escape.url_escape(os.path.basename(full_path))}",
        )
        # Stream the file to the client
        with open(full_path, "rb") as f:
            while True:
                chunk = f.read(64 * 1024)
                if not chunk:
                    break
                try:
                    self.write(chunk)
                    await self.flush()
                except tornado.iostream.StreamClosedError:
                    return
 def shutdown(ioloop: tornado.ioloop.IOLoop, km):
    logging.info("Shutting down server...")
    ioloop.add_callback_from_signal(lambda: async_shutdown(km))
 if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--port", type=int, default=80)
    p.add_argument("--kernel-name", type=str, default="python3")
    args = p.parse_args()
    km, client = asyncio.run(async_create_kernel(args.kernel_name))
    state = State(client)
    application = tornado.web.Application(
        [
            (r"/", MainHandler, {"state": state}),
            (r"/execute", ExecuteHandler, {"state": state}),
            (r"/files/upload/-/(.*)", FileUploadHandler, {"state": state}),
            (r"/files/download/-/(.*)", FileDownloadHandler, {"state": state}),
        ]
    )
    application.listen(args.port)
    logging.info(f"Server started at http://localhost:{args.port}")
    ioloop = tornado.ioloop.IOLoop.current()
    signal.signal(signal.SIGINT, lambda sig, frame: shutdown(ioloop, km))
    signal.signal(signal.SIGTERM, lambda sig, frame: shutdown(ioloop, km))
    ioloop.add_callback(async_msg_producer, km, client)
    tornado.ioloop.IOLoop.current().start()
--- a/interpreter_demo/sandbox_tests.py
+++ b/interpreter_demo/sandbox_tests.py
@ -0,0 +1,160 @@
 import os
 import shutil
 import tempfile
 import unittest
 import requests
 from sandbox import (
    Error,
    ExecuteResponse,
    ExecutionEventTypeDisplayData,
    ExecutionEventTypeError,
    ExecutionEventTypeStream,
    ExecutionStatusOK,
    ExecutionStatusTimeout,
 )
 # We'll create a temporary directory for the tests to avoid any side effects.
 temp_dir = tempfile.mkdtemp()
 BASE_URL = "http://localhost:8888/"
 def url(path: str) -> str:
    return BASE_URL + path
 class TestExecuteHandler(unittest.TestCase):
    def must_bind_with_execute_response(self, r: requests.Response) -> ExecuteResponse:
        self.assertEqual(r.status_code, 200)
        return ExecuteResponse.model_validate_json(r.content)
    def must_bind_with_error(self, r: requests.Response) -> Error:
        return Error.model_validate_json(r.content)
    def test_execute_hello(self):
        r = requests.post(
            url("execute"), json={"code": "print('hello')", "timeout_secs": 10}
        )
        res = self.must_bind_with_execute_response(r)
        self.assertEqual(len(res.events), 1)
        self.assertEqual(res.events[0].type, ExecutionEventTypeStream)
        self.assertEqual(res.events[0].data.stream, "stdout")  # type: ignore
        self.assertEqual(res.events[0].data.text, "hello\n")  # type: ignore
    def test_execute_timeout(self):
        r = requests.post(
            url("execute"),
            json={"code": "import time\ntime.sleep(5)", "timeout_secs": 1},
        )
        res = self.must_bind_with_execute_response(r)
        self.assertEqual(len(res.events), 0)
        self.assertEqual(res.status, ExecutionStatusTimeout)
    def test_execute_syntax_error(self):
        r = requests.post(
            url("execute"), json={"code": "print('hello'", "timeout_secs": 10}
        )
        err = self.must_bind_with_execute_response(r)
        self.assertEqual(err.status, ExecutionStatusOK)
        self.assertEqual(len(err.events), 1)
        self.assertEqual(err.events[0].type, ExecutionEventTypeError)
        self.assertEqual(err.events[0].data.ename, "SyntaxError")  # type: ignore
        self.assertIsNotNone(err.events[0].data.evalue)  # type: ignore
        self.assertGreater(len(err.events[0].data.traceback), 0)  # type: ignore
    def test_execute_invalid_timeout(self):
        r = requests.post(
            url("execute"),
            json={"code": "print('hello')", "timeout_secs": -1},
        )
        self.must_bind_with_error(r)
    def test_execute_display_data(self):
        code = """import matplotlib.pyplot as plt
 plt.plot([1, 2, 3, 4])
 plt.ylabel('some numbers')
 plt.show()"""
        r = requests.post(url("execute"), json={"code": code, "timeout_secs": 10})
        res = self.must_bind_with_execute_response(r)
        self.assertEqual(res.status, ExecutionStatusOK)
        self.assertEqual(len(res.events), 1)
        self.assertEqual(res.events[0].type, ExecutionEventTypeDisplayData)
        self.assertIsNotNone(res.events[0].data.variants["image/png"])  # type: ignore
        self.assertIsNotNone(res.events[0].data.variants["text/plain"])  # type: ignore
    def test_execute_pil_image(self):
        code = """from PIL import Image
 img = Image.new('RGB', (60, 30), color = 'red')
 # Override the show method of the Image class
 def new_show(self, *args, **kwargs):
    display(self)
 Image.Image.show = new_show
 img.show()"""
        r = requests.post(url("execute"), json={"code": code, "timeout_secs": 10})
        res = self.must_bind_with_execute_response(r)
        self.assertEqual(res.status, ExecutionStatusOK)
        self.assertEqual(len(res.events), 1)
        self.assertEqual(res.events[0].type, ExecutionEventTypeDisplayData)
        self.assertIsNotNone(res.events[0].data.variants["image/png"])  # type: ignore
        self.assertIsNotNone(res.events[0].data.variants["text/plain"])  # type: ignore
 class FileUploadHandlerTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.temp_dir = tempfile.mkdtemp()
        cls.BASE_URL = f"http://localhost:8888/files/upload/-{cls.temp_dir}/"
    def test_upload_file(self):
        file_path = os.path.join(self.temp_dir, "test.txt")
        large_binary_file = os.urandom(1024 * 1024 * 10)  # 10 MB
        r = requests.post(self.BASE_URL + "test.txt", data=large_binary_file)
        self.assertEqual(r.status_code, 201)
        self.assertTrue(os.path.exists(file_path))
        with open(file_path, "rb") as f:
            self.assertEqual(f.read(), large_binary_file)
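    # NOTE: FileUploadHandler in sandbox.py opens the target with "wb" and always
    # responds 201, so the 409 expected below is not produced by the handler as written.
    def test_upload_existing_file(self):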
        file_path = os.path.join(self.temp_dir, "existing.txt")
        with open(file_path, "wb") as f:
            f.write(b"exists")
        with open(file_path, "rb") as f:
            r = requests.post(self.BASE_URL + "existing.txt", data=f.read())
        self.assertEqual(r.status_code, 409)
        error = Error.model_validate_json(r.content)
        self.assertEqual(error.error, "file already exists")
    def test_directory_creation(self):
        file_path = os.path.join(self.temp_dir, "newdir", "test.txt")
        os.makedirs(os.path.dirname(file_path), exist_ok=True)
        r = requests.post(self.BASE_URL + "newdir/test.txt", data=b"test content")
        self.assertEqual(r.status_code, 201)
        self.assertTrue(os.path.exists(file_path))
        with open(file_path, "rb") as f:
            self.assertEqual(f.read(), b"test content")
    @classmethod
    def tearDownClass(cls):
        # Clean up the temp_dir after all tests
        if os.path.exists(cls.temp_dir):
            shutil.rmtree(cls.temp_dir)
 if __name__ == "__main__":
    unittest.main()
--- a/langchain_demo/README.md
+++ b/langchain_demo/README.md
@ -0,0 +1,40 @@
 ![](../resources/logo.jpeg)
 [English](README.md) | [中文](README_zh.md)
 ## RAG Functionality
 CodeGeeX4 supports retrieval-augmented generation (RAG) and is compatible with the Langchain framework, enabling project-level retrieval Q&A.
 ## Tutorial
 ### 1. Install Dependencies
 Navigate to the `langchain_demo` directory and install the required packages.
 ```bash
 cd langchain_demo
 pip install -r requirements.txt
 ```
 ### 2. Configure Embedding API Key
 This project uses the Embedding API from the Zhipu Open Platform for vectorization. Please register and obtain an API Key first.
 Then make the API Key available to `models/embedding.py`, which reads it from the `Zhipu_API_KEY` environment variable.
 For more details, refer to https://open.bigmodel.cn/dev/api#text_embedding.
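In this commit, `models/embedding.py` reads the key with `os.getenv("Zhipu_API_KEY")`. A small pre-flight check along the following lines could catch a missing key before vectorization starts. This is only a sketch: the helper name `require_api_key` is hypothetical and not part of the project.

```python
import os


def require_api_key(env=None, name="Zhipu_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    `require_api_key` is a hypothetical helper for illustration; only the
    variable name `Zhipu_API_KEY` comes from models/embedding.py.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {name} is not set")
    return key


if __name__ == "__main__":
    try:
        require_api_key()
        print("API key found")
    except RuntimeError as exc:
        print(exc)
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.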
 ### 3. Generate Vector Data
 ```bash
 python vectorize.py --workspace . --output_path vectors
 >>> File vectorization completed, saved to vectors
 ```
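The vectors written in this step are what the Q&A script later searches. The project loads them through `load_vector_store` in `utils/vector.py`, and the sketch below does not reproduce that code. It only illustrates, assuming cosine similarity as the ranking measure and made-up three-dimensional vectors, how a query vector can be matched against stored chunk vectors. The names `cosine` and `top_k` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=2):
    """Return the k chunk names whose vectors are most similar to the query."""
    ranked = sorted(store, key=lambda name: cosine(query_vec, store[name]), reverse=True)
    return ranked[:k]


# Toy "vector store": chunk name -> embedding (made-up numbers, not real embeddings).
store = {
    "chat.py": [0.9, 0.1, 0.0],
    "vectorize.py": [0.1, 0.9, 0.0],
    "README.md": [0.0, 0.2, 0.9],
}

print(top_k([1.0, 0.0, 0.0], store, k=2))
```

A real vector store would typically use an index rather than a full sort, but the ranking idea is the same.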
 ### 4. Run the Q&A Script
 ```bash
 python chat.py --vector_path vectors
 >>> Running on local URL:  http://127.0.0.1:8080
 ```
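Before the model is called, `chat.py` wraps each retrieved chunk in a numbered citation marker and a markdown fence (its `format_docs` function). The sketch below mirrors that formatting on stand-in data. The `Doc` class is a hypothetical substitute for the document objects returned by the retriever, and the fence string is built programmatically only to keep this example easy to embed.

```python
from dataclasses import dataclass

FENCE = "`" * 3  # three backticks, built this way to avoid nesting fences here


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document with a page_content field."""
    page_content: str


def format_docs(docs):
    """Join retrieved chunks, each tagged [[citation:N]] and wrapped in a markdown fence."""
    return "\n\n".join(
        f"[[citation:{i + 1}]]\n{FENCE}markdown\n{doc.page_content}\n{FENCE}"
        for i, doc in enumerate(docs)
    )


context = format_docs([Doc("def add(a, b): return a + b"), Doc("# Usage notes")])
print(context)
```

The resulting string is what fills the `context` slot of the RAG prompt, so the model can refer to chunks by their citation numbers.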
 ## Demo
 ![](resources/demo.png)
--- a/langchain_demo/README_zh.md
+++ b/langchain_demo/README_zh.md
@ -0,0 +1,44 @@
 ![](../resources/logo.jpeg)
 [English](README.md) | [中文](README_zh.md)
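 ## RAG Functionality
 CodeGeeX4 supports retrieval-augmented generation (RAG) and is compatible with the Langchain framework, enabling project-level retrieval Q&A.
 ## Tutorial
 ### 1. Install Dependencies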
 ```bash
 cd langchain_demo
 pip install -r requirements.txt
 ```
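 ### 2. Configure Embedding API Key
 This project uses the Embedding API from the Zhipu Open Platform for vectorization. Please register and obtain an API Key first.
 Then, configure the API Key in `models/embedding.py`.
 For more details, refer to https://open.bigmodel.cn/dev/api#text_embedding
 ### 3. Generate Vector Data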
 ```bash
 python vectorize.py --workspace . --output_path vectors
 >>> File vectorization completed, saved to vectors
 ```
 ### 4. Run the Q&A Script
 ```bash
 python chat.py --vector_path vectors
 >>> Running on local URL:  http://127.0.0.1:8080
 ```
 ## Demo
 ![](resources/demo_zh.png)
--- a/langchain_demo/chat.py
+++ b/langchain_demo/chat.py
@ -0,0 +1,54 @@
 """
 References: https://python.langchain.com/v0.2/docs/tutorials/rag/
 """
 import argparse
 import gradio as gr
 from langchain_core.output_parsers import StrOutputParser
 from langchain_core.runnables import RunnablePassthrough
 from models.codegeex import CodegeexChatModel
 from utils.prompts import CUSTOM_RAG_PROMPT
 from utils.vector import load_vector_store
 def parse_arguments():
    parser = argparse.ArgumentParser()
    parser.add_argument('--vector_path', type=str, help="path to load the vectors", default='vectors')
    parser.add_argument('--model_name_or_path', type=str, default='THUDM/codegeex4-all-9b')
    parser.add_argument('--device', type=str, help="cpu or cuda", default="cpu")
    parser.add_argument('--temperature', type=float, help="model's temperature", default=0.2)
    return parser.parse_args()
 def format_docs(docs):
    return "\n\n".join(
        [f"[[citation:{i + 1}]]\n```markdown\n{doc.page_content}\n```" for i, doc in enumerate(docs)]
    )
 def chat(query, history):
    retrieve_chain = ({"context": retriever | format_docs, "question": RunnablePassthrough()} | CUSTOM_RAG_PROMPT)
    retrieve_output = retrieve_chain.invoke(query)
    ans = retrieve_output.text
    yield ans
    ans += "Model response".center(150, '-') + '\n'
    yield ans
    parse_chain = (llm | StrOutputParser())
    ans += parse_chain.invoke(retrieve_output)
    yield ans
 if __name__ == '__main__':
    args = parse_arguments()
    llm = CodegeexChatModel(args)
    try:
        retriever = load_vector_store(args.vector_path).as_retriever()
    except Exception as e:
        print(f"Failed to load vectors, caused by {e}")
        exit()
    demo = gr.ChatInterface(chat).queue()
    demo.launch(server_name="127.0.0.1", server_port=8080)
--- a/langchain_demo/models/codegeex.py
+++ b/langchain_demo/models/codegeex.py
@ -0,0 +1,57 @@
 from typing import Iterator
 import torch
 from langchain_core.language_models.chat_models import BaseChatModel
 from langchain_core.messages import BaseMessage, AIMessageChunk
 from langchain_core.outputs import ChatGenerationChunk, ChatResult, ChatGeneration
 from pydantic import Field
 from transformers import AutoModel, AutoTokenizer
 from utils.prompts import SYS_PROMPT
 class CodegeexChatModel(BaseChatModel):
    device: str = Field(description="device to load the model")
    tokenizer = Field(description="model's tokenizer")
    model = Field(description="Codegeex model")
    temperature: float = Field(description="temperature to use for the model.")
    def __init__(self, args):
        super().__init__()
        self.device = args.device
        self.tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, trust_remote_code=True)
        self.model = AutoModel.from_pretrained(
            args.model_name_or_path,
            trust_remote_code=True
        ).to(args.device).eval()
        self.temperature = args.temperature
        print("Model has been initialized.")
    @property
    def _llm_type(self) -> str:
        # BaseChatModel declares _llm_type as a property, so expose it as one
        return "codegeex"
    @torch.inference_mode()
    def _generate(self, messages, **kwargs):
        try:
            response, _ = self.model.chat(
                self.tokenizer,
                query=messages[0].content,
                history=[{"role": "system", "content": SYS_PROMPT}],
                max_new_tokens=1024,
                temperature=self.temperature
            )
            return ChatResult(generations=[ChatGeneration(message=BaseMessage(content=response, type='ai'))])
        except Exception as e:
            return ChatResult(generations=[ChatGeneration(message=BaseMessage(content=repr(e), type='ai'))])
    def _stream(self, messages: list[BaseMessage], **kwargs) -> Iterator[ChatGenerationChunk]:
        try:
            for response, _ in self.model.stream_chat(
                    self.tokenizer,
                    query=messages[0].content,
                    history=[{"role": "system", "content": SYS_PROMPT}],
                    max_new_tokens=1024,
                    temperature=self.temperature
            ):
                yield ChatGenerationChunk(message=AIMessageChunk(content=response))
        except Exception as e:
            yield ChatGenerationChunk(message=AIMessageChunk(content=f"Failed to generate, caused by {e}"))
--- a/langchain_demo/models/embedding.py
+++ b/langchain_demo/models/embedding.py
@ -0,0 +1,26 @@
 import os
 from langchain.schema.embeddings import Embeddings
 from zhipuai import ZhipuAI
 class GLMEmbeddings(Embeddings):
    def __init__(self):
        self.client = ZhipuAI(api_key=os.getenv("Zhipu_API_KEY"))
        self.embedding_size = 1024
    def embed_query(self, text: str) -> list[float]:
        return self.embed_documents([text])[0]
    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        return self._get_len_safe_embeddings(texts)
    def _get_len_safe_embeddings(self, texts: list[str]) -> list[list[float]]:
        try:
            # Fetch the embedding response
            response = self.client.embeddings.create(model="embedding-2", input=texts)
            data = [item.embedding for item in response.data]
            return data
        except Exception as e:
            print(f"Fail to get embeddings, caused by {e}")
            return []
--- a/langchain_demo/requirements.txt
+++ b/langchain_demo/requirements.txt
@ -0,0 +1,12 @@
 accelerate==0.31.0
 faiss-cpu==1.8
 gradio==4.26.0
 langchain==0.2.3
 langchain-community==0.2.4
 regex==2024.5.15
 requests==2.31.0
 tiktoken==0.7.0
 torch==2.3.1
 tqdm==4.66.4
 transformers==4.39.0
 zhipuai~=2.0
--- a/langchain_demo/resources/demo.png
+++ b/langchain_demo/resources/demo.png
--- a/langchain_demo/resources/demo_zh.png
+++ b/langchain_demo/resources/demo_zh.png
--- a/langchain_demo/utils/data.py
+++ b/langchain_demo/utils/data.py
@ -0,0 +1,63 @@
 import os
 from langchain.text_splitter import (
    Language,
    RecursiveCharacterTextSplitter as TextSplitter,
 )
 from langchain_community.document_loaders import TextLoader
 Languages = {
    'c': Language.CPP,
    'cpp': Language.CPP,
    'go': Language.GO,
    'java': Language.JAVA,
    'js': Language.JS,
    'md': Language.MARKDOWN,
    'py': Language.PYTHON,
    'ts': Language.TS,
 }
 def traverse(repo_path: str) -> list[str]:
    """
    Traverse the directory, fetch all files
    - skip hidden directories
    - only keep the supported files
    :param repo_path: path to this repo
    """
    def helper(root):
        for entry in os.scandir(root):
            if entry.name.startswith('.'):
                continue
            if entry.is_file():
                ext = entry.name.split('.')[-1].lower()
                if ext not in Languages.keys():
                    continue
                file_paths.append(entry.path)
            elif entry.is_dir():
                helper(entry.path)
    file_paths = []
    helper(repo_path)
    return sorted(file_paths)
 def split_into_chunks(file_path, chunk_size, overlap_size) -> list[str]:
    """
    Split file into chunks
    :param file_path: path to the file
    :param chunk_size: size for each chunk
    :param overlap_size: overlap size between 2 chunks
    """
    ext = file_path.split('.')[-1].lower()
    lang = Languages.get(ext, None)
    if not lang:
        return []
    try:
        loader = TextLoader(file_path, encoding='utf-8', autodetect_encoding=True)
        splitter = TextSplitter.from_language(lang, chunk_size=chunk_size, chunk_overlap=overlap_size)
        return loader.load_and_split(splitter)
    except Exception as e:
        print(f'Failed to split `{file_path}`: {e}')
        return []
--- a/langchain_demo/utils/prompts.py
+++ b/langchain_demo/utils/prompts.py
@ -0,0 +1,22 @@
 from langchain_core.prompts import PromptTemplate
 SYS_PROMPT = """
 你将接收到一个用户提出的问题，并请撰写清晰、简洁且准确的答案。
 # Note
 - 您将获得与问题相关的多个上下文片段，每个上下文都以引用编号开头，例如[[citation:x]]，其中x是一个数字。如果适用，请使用上下文并在每个句子的末尾引用上下文。
 - 您的答案必须是正确的、准确的，并且以专家的身份使用无偏见和专业的语调来撰写。
 - 请你的回答限制在2千字以内，不要提供与问题无关的信息，也不要重复。
 - 请以引用编号的格式[[citation:x]]来引用上下文。如果一个句子来自多个上下文，请列出所有适用的引用，例如[[citation:3]][[citation:5]]。
 - 若所有上下文均不相关，请以自己的理解回答用户提出的问题，此时回答中可以不带引用编号。
 - 除了代码和特定的名称和引用外，您的答案必须使用与问题相同的语言来撰写。
 """.lstrip()
 template = """
 [引用]
 {context}
 问：{question}
 """.lstrip()
 CUSTOM_RAG_PROMPT = PromptTemplate.from_template(template)
--- a/langchain_demo/utils/vector.py
+++ b/langchain_demo/utils/vector.py
@ -0,0 +1,42 @@
 import os
 from langchain_community.docstore import InMemoryDocstore
 from langchain_community.vectorstores.faiss import FAISS, dependable_faiss_import
 from models.embedding import GLMEmbeddings
 from tqdm import tqdm
 from utils.data import split_into_chunks
 embed_model = GLMEmbeddings()
 def vectorize(files: list[str], args):
    # split file into chunks
    chunks = []
    for file in tqdm(files, desc="Splitting files"):
        chunks.extend(split_into_chunks(file, args.chunk_size, args.overlap_size))
    # initialize the vector store
    vector_store = FAISS(
        embedding_function=embed_model,
        index=dependable_faiss_import().IndexFlatL2(embed_model.embedding_size),
        docstore=InMemoryDocstore(),
        index_to_docstore_id={},
    )
    # translate to vectors
    batch_size = args.batch_size
    for i in tqdm(range(0, len(chunks), batch_size), desc="Vectorizing"):
        try:
            vector_store.add_documents(chunks[i:i + batch_size])
        except Exception as e:
            print(f"Failed to vectorize files: {e}")
    # save embedded vectors
    output_path = args.output_path
    os.makedirs(output_path, exist_ok=True)
    vector_store.save_local(output_path)
    print(f"File vectorization completed, saved to {output_path}")
 def load_vector_store(vector_path: str):
    return FAISS.load_local(vector_path, embed_model, allow_dangerous_deserialization=True)
--- a/langchain_demo/vectorize.py
+++ b/langchain_demo/vectorize.py
@ -0,0 +1,24 @@
 """
 Vectorize your local project
 """
 import argparse
 from utils.data import traverse
 from utils.vector import vectorize
 def parse_arguments():
    parser = argparse.ArgumentParser()
    parser.add_argument('--workspace', type=str, help="directory of the workspace to be vectorized", default='.')
    parser.add_argument('--chunk_size', type=int, help="chunk size when splitting", default=512)
    parser.add_argument('--overlap_size', type=int, help="chunk overlap when splitting", default=32)
    parser.add_argument('--batch_size', type=int, help="embedding batch size", default=16)
    parser.add_argument('--output_path', type=str, help="path to save the vectors", default='vectors')
    return parser.parse_args()
 if __name__ == '__main__':
    args = parse_arguments()
    files = traverse(args.workspace)
    vectorize(files, args)
--- a/llamaindex_demo/README.md
+++ b/llamaindex_demo/README.md
@ -0,0 +1,43 @@
 ![](../resources/logo.jpeg)
 [English](README.md) | [中文](README_zh.md)
 ## RAG Functionality
 CodeGeeX4 supports retrieval-augmented generation (RAG) and is compatible with the LlamaIndex framework, enabling project-level retrieval Q&A.
 ## Usage Tutorial
 ### 1. Install Dependencies
 ```bash
 cd llamaindex_demo
 pip install -r requirements.txt
 ```
 Note: This project uses tree-sitter-languages, which has compatibility issues with Python 3.10, so please run it with Python 3.8 or Python 3.9.
 ### 2. Configure Embedding API Key
 This project uses the Zhipu Open Platform's Embedding API to implement vectorization. Please register and obtain an API Key first.
 Then configure the API Key for `models/embedding.py`, which reads it from the `Zhipu_API_KEY` environment variable.
 For details, refer to https://open.bigmodel.cn/dev/api#text_embedding
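
The demo's `models/embedding.py` reads the key with `os.getenv("Zhipu_API_KEY")`, so one way to configure it is to export that environment variable before running the scripts. The sketch below is a minimal pre-flight check that fails early when the variable is missing. The `resolve_api_key` helper is hypothetical and not part of this project.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "Zhipu_API_KEY"  # variable name read by models/embedding.py in this demo


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration only; the demo itself calls os.getenv directly.
    """
    env = os.environ if env is None else env
    key = env.get(ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set the {ENV_VAR} environment variable before running the demo.")
    return key


if __name__ == "__main__":
    try:
        resolve_api_key()
        print("API key found.")
    except RuntimeError as err:
        print(err)
```

Running a check like this before `vectorize.py` avoids discovering a missing key only after the first embedding request fails.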
 ### 3. Generate Vector Data
 ```bash
 python vectorize.py --workspace . --output_path vectors
 >>> File vectorization completed, saved to vectors
 ```
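
`vectorize.py` collects files with `traverse` in `utils/data.py`, which skips entries whose names start with a dot and keeps only files whose extension appears in the `Languages` table. The sketch below mirrors that filter on plain path strings, so you can predict which files will be indexed. `is_indexed` is a hypothetical helper, and it assumes paths are given relative to the workspace.

```python
from pathlib import PurePosixPath

# Extensions listed in the Languages table of utils/data.py in this demo.
SUPPORTED_EXTENSIONS = {"c", "cpp", "go", "java", "js", "md", "py", "ts"}


def is_indexed(relative_path: str) -> bool:
    """Hypothetical helper: would the traversal keep this workspace-relative file?"""
    parts = PurePosixPath(relative_path).parts
    if not parts:
        return False
    # Hidden files and anything below a hidden directory are skipped.
    if any(part.startswith(".") for part in parts):
        return False
    # Same rule as the demo: the text after the last dot, lower-cased.
    ext = parts[-1].split(".")[-1].lower()
    return ext in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for path in ["utils/data.py", ".git/config", "resources/demo.png", "README.MD"]:
        print(path, is_indexed(path))
```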
 ### 4. Run the Q&A Script
 ```bash
 python chat.py --vector_path vectors
 >>> Running on local URL: http://127.0.0.1:8080
 ```
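
When a question arrives, the retrieved chunks are wrapped in numbered `[[citation:x]]` markers before they are placed in the prompt, which is what lets the model cite its sources. The sketch below reproduces that formatting on plain strings, following `build_context` in `models/synthesizer.py`. The sample chunks are made up for illustration.

```python
from typing import List

FENCE = "`" * 3  # built programmatically so this example contains no nested literal fence


def build_context(text_chunks: List[str]) -> str:
    """Number each retrieved chunk and wrap it in a markdown fence."""
    return "\n\n".join(
        f"[[citation:{i + 1}]]\n{FENCE}markdown\n{chunk}\n{FENCE}"
        for i, chunk in enumerate(text_chunks)
    )


if __name__ == "__main__":
    # Two made-up chunks standing in for retrieved source nodes.
    print(build_context(["def add(a, b): return a + b", "# Usage notes"]))
```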
 ## Demo
 ![](resources/demo.png)
--- a/llamaindex_demo/README_zh.md
+++ b/llamaindex_demo/README_zh.md
@ -0,0 +1,46 @@
 ![](../resources/logo.jpeg)
 [English](README.md) | [中文](README_zh.md)
 ## RAG功能
 CodeGeeX4支持RAG检索增强，并兼容LlamaIndex框架，实现项目级检索问答。
 ## 使用教程
 ### 1. 安装依赖项
 ```bash
 cd llamaindex_demo
 pip install -r requirements.txt
 ```
 注：此项目使用到tree-sitter-language，其与python3.10兼容的有问题，因此请使用python3.8或python3.9运行该项目。
 ### 2. 配置Embedding API Key
 本项目使用智谱开放平台的Embedding API实现向量化功能，请先注册并获取API Key。
 并在`models/embedding.py`中配置API Key。
 详情可参考 https://open.bigmodel.cn/dev/api#text_embedding
 ### 3. 生成向量数据
 ```bash
 python vectorize.py --workspace . --output_path vectors
 >>> 文件向量化完成,已保存至vectors
 ```
 ### 4. 运行问答脚本
 ```bash
 python chat.py --vector_path vectors
 >>> Running on local URL:  http://127.0.0.1:8080
 ```
 ## Demo
 ![](resources/demo_zh.png)
--- a/llamaindex_demo/chat.py
+++ b/llamaindex_demo/chat.py
@ -0,0 +1,51 @@
 """
 References: https://docs.llamaindex.ai/en/stable/use_cases/q_and_a/
 """
 import argparse
 import gradio as gr
 from llama_index.core import Settings
 from models.embedding import GLMEmbeddings
 from models.synthesizer import CodegeexSynthesizer
 from utils.vector import load_vectors
 def parse_arguments():
    parser = argparse.ArgumentParser()
    parser.add_argument('--vector_path', type=str, help="path to store the vectors", default='vectors')
    parser.add_argument('--model_name_or_path', type=str, default='THUDM/codegeex4-all-9b')
    parser.add_argument('--device', type=str, help="cpu or cuda", default="cpu")
    parser.add_argument('--temperature', type=float, help="model's temperature", default=0.2)
    return parser.parse_args()
 def chat(query, history):
    resp = query_engine.query(query)
    ans = "Relevant Documents".center(150, '-') + '\n'
    yield ans
    for i, node in enumerate(resp.source_nodes):
        file_name = node.metadata['filename']
        ext = node.metadata['extension']
        text = node.text
        ans += f"File{i + 1}: {file_name}\n```{ext}\n{text}\n```\n"
        yield ans
    ans += "Model Response".center(150, '-') + '\n'
    ans += resp.response
    yield ans
 if __name__ == '__main__':
    args = parse_arguments()
    Settings.embed_model = GLMEmbeddings()
    try:
        query_engine = load_vectors(args.vector_path).as_query_engine(
            response_synthesizer=CodegeexSynthesizer(args)
        )
    except Exception as e:
        print(f"Fail to load vectors, caused by {e}")
        exit()
    demo = gr.ChatInterface(chat).queue()
    demo.launch(server_name="127.0.0.1", server_port=8080)
--- a/llamaindex_demo/models/codegeex.py
+++ b/llamaindex_demo/models/codegeex.py
@ -0,0 +1,108 @@
 from llama_index.core.base.llms.types import (
    ChatMessage,
    ChatResponse,
    ChatResponseGen,
    CompletionResponse,
    CompletionResponseGen,
    LLMMetadata,
 )
 from llama_index.core.llms import LLM
 from pydantic import Field
 from transformers import AutoTokenizer, AutoModel
 from utils.prompts import SYS_PROMPT
 class CodegeexChatModel(LLM):
    device: str = Field(description="device to load the model")
    tokenizer = Field(description="model's tokenizer")
    model = Field(description="Codegeex model")
    temperature: float = Field(description="temperature to use for the model.")
    def __init__(self, args):
        super().__init__()
        self.device = args.device
        self.tokenizer = AutoTokenizer.from_pretrained(args.model_name_or_path, trust_remote_code=True)
        self.model = AutoModel.from_pretrained(args.model_name_or_path, trust_remote_code=True).to(args.device).eval()
        self.temperature = args.temperature
        print("Model has been initialized.")
    @classmethod
    def class_name(cls) -> str:
        return "codegeex"
    @property
    def metadata(self) -> LLMMetadata:
        return LLMMetadata(
            context_window=7168,
            num_output=1024,
            is_chat_model=True,
            model_name="codegeex",
        )
    def chat(self, messages: list[ChatMessage], **kwargs) -> ChatResponse:
        try:
            response, _ = self.model.chat(
                self.tokenizer,
                query=messages[0].content,
                history=[{"role": "system", "content": SYS_PROMPT}],
                max_new_tokens=1024,
                temperature=self.temperature
            )
            return ChatResponse(message=ChatMessage(role="assistant", content=response))
        except Exception as e:
            return ChatResponse(message=ChatMessage(role="assistant", content=str(e)))
    def stream_chat(self, messages: list[ChatMessage], **kwargs) -> ChatResponseGen:
        try:
            for response, _ in self.model.stream_chat(
                    self.tokenizer,
                    query=messages[0].content,
                    history=[{"role": "system", "content": SYS_PROMPT}],
                    max_new_tokens=1024,
                    temperature=self.temperature
            ):
                yield ChatResponse(message=ChatMessage(role="assistant", content=response))
        except Exception as e:
            yield ChatResponse(message=ChatMessage(role="assistant", content=str(e)))
    def complete(self, prompt: str, formatted: bool = False, **kwargs) -> CompletionResponse:
        try:
            response, _ = self.model.chat(
                self.tokenizer,
                query=prompt,
                history=[{"role": "system", "content": "你是一个智能编程助手"}],
                max_new_tokens=1024,
                temperature=self.temperature
            )
            return CompletionResponse(text=response)
        except Exception as e:
            return CompletionResponse(text=str(e))
    def stream_complete(self, prompt: str, formatted: bool = False, **kwargs) -> CompletionResponseGen:
        try:
            for response, _ in self.model.stream_chat(
                    self.tokenizer,
                    query=prompt,
                    history=[{"role": "system", "content": "你是一个智能编程助手"}],
                    max_new_tokens=1024,
                    temperature=self.temperature
            ):
                yield CompletionResponse(text=response)
        except Exception as e:
            yield CompletionResponse(text=str(e))
    async def achat(self, messages: list[ChatMessage], **kwargs):
        # chat() is synchronous, so call it directly instead of awaiting it
        return self.chat(messages, **kwargs)
    async def astream_chat(self, messages: list[ChatMessage], **kwargs):
        # stream_chat() is a regular generator, so iterate with a plain for loop
        for resp in self.stream_chat(messages, **kwargs):
            yield resp
    async def acomplete(self, prompt: str, formatted: bool = False, **kwargs):
        # complete() is synchronous, so call it directly instead of awaiting it
        return self.complete(prompt, formatted, **kwargs)
    async def astream_complete(self, prompt: str, formatted: bool = False, **kwargs):
        # stream_complete() is a regular generator, so iterate with a plain for loop
        for resp in self.stream_complete(prompt, formatted, **kwargs):
            yield resp
--- a/llamaindex_demo/models/embedding.py
+++ b/llamaindex_demo/models/embedding.py
@ -0,0 +1,37 @@
 import os
 from llama_index.core.base.embeddings.base import BaseEmbedding
 from pydantic import Field
 from zhipuai import ZhipuAI
 class GLMEmbeddings(BaseEmbedding):
    client = Field(description="embedding model client")
    embedding_size: int = Field(description="embedding size")
    def __init__(self):
        super().__init__(model_name='GLM', embed_batch_size=64)
        self.client = ZhipuAI(api_key=os.getenv("Zhipu_API_KEY"))
        self.embedding_size = 1024
    def _get_query_embedding(self, query: str) -> list[float]:
        return self._get_text_embeddings([query])[0]
    def _get_text_embedding(self, text: str) -> list[float]:
        return self._get_text_embeddings([text])[0]
    def _get_text_embeddings(self, texts: list[str]) -> list[list[float]]:
        return self._get_len_safe_embeddings(texts)
    async def _aget_query_embedding(self, query: str) -> list[float]:
        return self._get_query_embedding(query)
    def _get_len_safe_embeddings(self, texts: list[str]) -> list[list[float]]:
        try:
            # Fetch the embedding response
            response = self.client.embeddings.create(model="embedding-2", input=texts)
            data = [item.embedding for item in response.data]
            return data
        except Exception as e:
            print(f"Fail to get embeddings, caused by {e}")
            return []
--- a/llamaindex_demo/models/synthesizer.py
+++ b/llamaindex_demo/models/synthesizer.py
@ -0,0 +1,39 @@
 from llama_index.core.response_synthesizers import BaseSynthesizer
 from models.codegeex import CodegeexChatModel
 from utils.prompts import CUSTOM_PROMPT_TEMPLATE
 class CodegeexSynthesizer(BaseSynthesizer):
    """Response builder class."""
    def __init__(self, args) -> None:
        super().__init__(llm=CodegeexChatModel(args))
        self.prompt_template = CUSTOM_PROMPT_TEMPLATE
    def get_response(self, query_str: str, text_chunks: list[str], **kwargs) -> str:
        context = self.build_context(text_chunks)
        return self._llm.predict(self.prompt_template, query=query_str, context=context)
    async def aget_response(self, query_str: str, text_chunks: list[str], **kwargs) -> str:
        context = self.build_context(text_chunks)
        return await self._llm.apredict(self.prompt_template, query=query_str, context=context)
    def _get_prompts(self):
        """Get prompts."""
        return {"text_qa_template": self.prompt_template}
    def _update_prompts(self, prompts) -> None:
        """Update prompts."""
        if "text_qa_template" in prompts:
            self.prompt_template = prompts["text_qa_template"]
    @staticmethod
    def build_context(text_chunks):
        """
        merge contexts
        :param text_chunks: recalled texts
        """
        return "\n\n".join(
            [f"[[citation:{i + 1}]]\n```markdown\n{chunk}\n```" for i, chunk in enumerate(text_chunks)]
        )
--- a/llamaindex_demo/requirements.txt
+++ b/llamaindex_demo/requirements.txt
@ -0,0 +1,12 @@
 accelerate==0.31.0
 faiss-cpu==1.8
 gradio==4.26.0
 llama-index==0.10.43
 regex==2024.5.15
 tiktoken==0.7.0
 torch==2.3.1
 tree-sitter<0.22.0
 tree-sitter-languages==1.10.2
 tqdm==4.66.4
 transformers==4.39.0
 zhipuai~=2.0
--- a/llamaindex_demo/resources/demo.png
+++ b/llamaindex_demo/resources/demo.png
--- a/llamaindex_demo/resources/demo_zh.png
+++ b/llamaindex_demo/resources/demo_zh.png
--- a/llamaindex_demo/utils/data.py
+++ b/llamaindex_demo/utils/data.py
@ -0,0 +1,68 @@
 import os
 from pathlib import Path
 from llama_index.core.node_parser import CodeSplitter
 from llama_index.core.schema import BaseNode
 from llama_index.readers.file import FlatReader
 Languages = {
    'c': "c",
    'cpp': "cpp",
    'go': "go",
    'java': "java",
    'js': "javascript",
    'md': "markdown",
    'py': "python",
    'ts': "typescript",
 }
 def traverse(repo_path: str) -> list[str]:
    """
    Traverse the directory, fetch all files
    - skip hidden directories
    - only keep the supported files
    :param repo_path: path to this repo
    """
    def helper(root):
        for entry in os.scandir(root):
            if entry.name.startswith('.'):
                continue
            if entry.is_file():
                ext = entry.name.split('.')[-1].lower()
                if ext not in Languages.keys():
                    continue
                file_paths.append(entry.path)
            elif entry.is_dir():
                helper(entry.path)
    file_paths = []
    helper(repo_path)
    return sorted(file_paths)
 def split_into_chunks(file_path, lines_per_chunk, lines_overlap, max_chars) -> list[BaseNode]:
    """
    Split file into chunks
    :param file_path: path to the file
    :param lines_per_chunk: lines for each chunk
    :param lines_overlap: overlap lines between 2 chunks
    :param max_chars: max characters for each chunk
    """
    ext = file_path.split('.')[-1].lower()
    lang = Languages.get(ext, None)
    if not lang:
        return []
    try:
        documents = FlatReader().load_data(Path(file_path))
        splitter = CodeSplitter(
            language=lang,
            chunk_lines=lines_per_chunk,
            chunk_lines_overlap=lines_overlap,
            max_chars=max_chars,
        )
        return splitter.get_nodes_from_documents(documents)
    except Exception as e:
        print(f'Failed to split `{file_path}`: {e}')
        return []
--- a/llamaindex_demo/utils/prompts.py
+++ b/llamaindex_demo/utils/prompts.py
@ -0,0 +1,22 @@
 from llama_index.core import PromptTemplate
 SYS_PROMPT = """
 你将接收到一个用户提出的问题，并请撰写清晰、简洁且准确的答案。
 # Note
 - 您将获得与问题相关的多个上下文片段，每个上下文都以引用编号开头，例如[[citation:x]]，其中x是一个数字。如果适用，请使用上下文并在每个句子的末尾引用上下文。
 - 您的答案必须是正确的、准确的，并且以专家的身份使用无偏见和专业的语调来撰写。
 - 请你的回答限制在2千字以内，不要提供与问题无关的信息，也不要重复。
 - 请以引用编号的格式[[citation:x]]来引用上下文。如果一个句子来自多个上下文，请列出所有适用的引用，例如[[citation:3]][[citation:5]]。
 - 若所有上下文均不相关，请以自己的理解回答用户提出的问题，此时回答中可以不带引用编号。
 - 除了代码和特定的名称和引用外，您的答案必须使用与问题相同的语言来撰写。
 """.lstrip()
 template = """
 [引用]
 {context}
 问：{query}
 """.lstrip()
 CUSTOM_PROMPT_TEMPLATE = PromptTemplate(template, prompt_type='text_qa')
--- a/llamaindex_demo/utils/vector.py
+++ b/llamaindex_demo/utils/vector.py
@ -0,0 +1,36 @@
 import os
 import faiss
 from llama_index.core import StorageContext, VectorStoreIndex, load_index_from_storage
 from llama_index.legacy.vector_stores import FaissVectorStore
 from models.embedding import GLMEmbeddings
 from tqdm import tqdm
 from utils.data import split_into_chunks
 embed_model = GLMEmbeddings()
 def save_vectors(files: list[str], args):
    # split file into chunks
    nodes = []
    for file in tqdm(files, desc="Splitting files"):
        nodes.extend(split_into_chunks(file, args.lines_per_chunk, args.lines_overlap, args.max_chars))
    # initialize vector store
    vector_store = FaissVectorStore(faiss_index=faiss.IndexFlatL2(embed_model.embedding_size))
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    # translate to vectors
    index = VectorStoreIndex(nodes=nodes, storage_context=storage_context, embed_model=embed_model)
    # save embedded vectors
    output_path = args.output_path
    os.makedirs(output_path, exist_ok=True)
    index.storage_context.persist(persist_dir=output_path)
    print(f"File vectorization completed, saved to {output_path}")
 def load_vectors(vector_path: str):
    vector_store = FaissVectorStore.from_persist_dir(vector_path)
    storage_context = StorageContext.from_defaults(vector_store=vector_store, persist_dir=vector_path)
    return load_index_from_storage(storage_context=storage_context)
--- a/llamaindex_demo/vectorize.py
+++ b/llamaindex_demo/vectorize.py
@ -0,0 +1,20 @@
 import argparse
 from utils.data import traverse
 from utils.vector import save_vectors
 def parse_arguments():
    parser = argparse.ArgumentParser()
    parser.add_argument('--workspace', type=str, help="directory of the workspace to be vectorized", default='.')
    parser.add_argument('--lines_per_chunk', type=int, help="chunk lines when splitting", default=40)
    parser.add_argument('--lines_overlap', type=int, help="chunk lines overlap when splitting", default=15)
    parser.add_argument("--max_chars", type=int, help="maximum number of characters in a chunk", default=1500)
    parser.add_argument('--output_path', type=str, help="path to save the vectors", default='vectors')
    return parser.parse_args()
 if __name__ == '__main__':
    args = parse_arguments()
    files = traverse(args.workspace)
    save_vectors(files, args)
--- a/metric/README.md
+++ b/metric/README.md
@ -0,0 +1,49 @@
 # The Most Powerful Versatile Code Model Under 10 Billion Parameters
 CodeGeeX4-ALL-9B, the open-source version of the latest generation of the CodeGeeX4 series, iterates on the powerful language capabilities of GLM4, significantly enhancing code generation capabilities. Using a single CodeGeeX4-ALL-9B model, it supports comprehensive functionalities such as code completion and generation, code interpreter, online search, tool invocation, repository-level long code Q&A and generation, covering various programming and development scenarios. CodeGeeX4-ALL-9B has achieved highly competitive performance on multiple authoritative code capability evaluation sets, such as NaturalCodeBench and BigCodeBench. It is the most powerful model under 10 billion parameters, even surpassing general models several times its size, achieving the best balance between inference performance and model effectiveness.
 ## 1. BigCodeBench
 BigCodeBench test results show that CodeGeeX4-ALL-9B achieves the best results among models of comparable size:
 ![BigCodeBench Test Results](./pics/Bigcodebench.PNG)
 ## 2. NaturalCodeBench & HumanEval
 NaturalCodeBench test results show that CodeGeeX4-ALL-9B achieves the best results in tasks such as code completion, code interpreter, code Q&A, code translation, and code repair:
 ![NaturalCodeBench Test Results](./pics/NCB&HUMANEVAL.png)
 ## 3. Code Needle In A Haystack
 CodeGeeX4-ALL-9B's context handling capability has reached 128K, an 8-fold increase compared to the previous generation model!
 For code LLMs with fewer than 10B parameters, accurately extracting information from massive amounts of code is a key challenge. CodeGeeX4-ALL-9B's upgraded support for a 128K context enables it to process and use longer code files, and even information from an entire project's code, helping the model understand complex, detail-rich code more deeply. With the longer context, CodeGeeX4-ALL-9B can handle more complex project-level tasks, accurately answering questions about content in different code files and modifying the code even when the input length increases significantly.
 In the "Needle In A Haystack" (NIAH) evaluation, the CodeGeeX4-ALL-9B model demonstrated its ability to embed and retrieve code within contexts up to 128K, achieving a 100% retrieval accuracy.
 ![NIAH_PYTHON Evaluation](./pics/NIAH_PYTHON.png)
 ![NIAH_ALL_FILES Evaluation](./pics/NIAH_ALL.png)
 The figures above show results on a test set composed entirely of Python code. An assignment statement such as `zhipu_codemodel = "codegeex"` (the needle) is inserted into the code, and the model is asked for the value of `zhipu_codemodel`. CodeGeeX4-ALL-9B answered correctly in 100% of cases.
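
To make the setup concrete, here is a minimal sketch of how such a needle test could be assembled: the needle assignment is inserted at a chosen relative depth into a list of filler code lines, and the model's answer is checked for the expected value. The filler lines, `build_haystack`, and `is_correct` are illustrative assumptions, not the evaluation harness behind the figures above.

```python
from typing import List

NEEDLE = 'zhipu_codemodel = "codegeex"'


def build_haystack(filler_lines: List[str], depth: float) -> str:
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = round(len(filler_lines) * depth)
    lines = filler_lines[:position] + [NEEDLE] + filler_lines[position:]
    return "\n".join(lines)


def is_correct(model_answer: str) -> bool:
    """Count an answer as correct if it mentions the needle's value."""
    return "codegeex" in model_answer.lower()


if __name__ == "__main__":
    # Made-up filler code standing in for a long Python context.
    filler = [f"value_{i} = {i}" for i in range(10)]
    print(build_haystack(filler, depth=0.5))
```

Sweeping `depth` and the number of filler lines yields the kind of grid of context length against needle position that such evaluations typically plot.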
 ## 4. Function Call Capabilities
 CodeGeeX4-ALL-9B is currently the only code LLM that implements function calling (Function Call) capabilities.
 The Berkeley Function Calling Leaderboard is the first test set that can comprehensively evaluate the function calling capabilities of large models. The AST dataset evaluates the model's calling capabilities for Java, JavaScript, and Python programs; the Executable dataset evaluates the model's function calling capabilities for real-world API scenarios.
 ![Berkeley Function Calling Leaderboard](./pics/FunctionCall.png)
 CodeGeeX4-ALL-9B underwent comprehensive testing on the Berkeley Function Calling Leaderboard, including various forms of function calls, different function call scenarios, and function call executability tests, achieving the following results: a call success rate of over 90% in both AST and Exec test sets.
 ## 5. Cross-File Completion
 Cross-File Evaluation is a multilingual benchmark built on diverse real-world repositories in Python, Java, TypeScript, and C#. It uses a static-analysis-based method to strictly require cross-file context for accurate code completion.
 | Model            | Python EM | Python ES | Java EM | Java ES | TypeScript EM | TypeScript ES | C# EM | C# ES |
 |------------------|-----------|-----------|---------|---------|---------------|---------------|-------|-------|
 | DeepSeekCoder-7B | 29.9      | 62.9      | 39.8    | 74.8    | 39            | 77            | 52.2  | 78.1  |
 | StarCoder2-7B    | 25.3      | 58        | 31.4    | 67.4    | 33.3          | 73.2          | 43.5  | 69.8  |
 | CodeLlama-7B     | 23.5      | 53.5      | 33.9    | 68.4    | 11.5          | 71.5          | 50.6  | 75.4  |
 | CodeGeeX-9B      | 32.3      | 70.3      | 48.6    | 84.4    | 35.3          | 78.0          | 48.0  | 84.8  |
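
The table reports two scores per language. Reading EM as exact match and ES as edit similarity (an assumption based on how completion benchmarks commonly name these metrics), the sketch below shows one way to compute both for a single prediction. The Levenshtein-based similarity and the whitespace stripping are choices made for this sketch, and the benchmark's own scoring script may differ in detail.

```python
def exact_match(prediction: str, reference: str) -> bool:
    """True when the completion equals the reference after trimming whitespace."""
    return prediction.strip() == reference.strip()


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def edit_similarity(prediction: str, reference: str) -> float:
    """1.0 for identical strings, approaching 0.0 as more edits are needed."""
    prediction, reference = prediction.strip(), reference.strip()
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(prediction, reference) / longest


if __name__ == "__main__":
    print(exact_match("return a + b", "return a + b"))
    print(edit_similarity("return a + b", "return a - b"))
```

Scores like those in the table would then be the mean of these per-example values over the benchmark, expressed as percentages.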
--- a/metric/README_zh.md
+++ b/metric/README_zh.md
@ -0,0 +1,50 @@
 # CodeGeeX4-ALL-9B
 ## CodeGeeX4-ALL-9B：百亿参数以下性能最强的全能代码模型
 CodeGeeX4-ALL-9B作为最新一代CodeGeeX4系列模型的开源版本，在GLM4强大语言能力的基础上继续迭代，大幅增强代码生成能力。使用CodeGeeX4-ALL-9B单一模型，即可支持代码补全和生成、代码解释器、联网搜索、工具调用、仓库级长代码问答及生成等全面功能，覆盖了编程开发的各种场景。CodeGeeX4-ALL-9B在多个权威代码能力评测集，如NaturalCodeBench、BigCodeBench上都取得了极具竞争力的表现，是百亿参数量级以下性能最强的模型，甚至超过数倍规模的通用模型，在推理性能和模型效果上得到最佳平衡。
 ### 1. 性能表现评测
 BigCodeBench测试结果显示，CodeGeeX4-ALL-9B在同等尺寸下效果最好：
 ![BigCodeBench Test Results](./pics/Bigcodebench.PNG)
 NaturalCodeBench测试结果显示，CodeGeeX4-ALL-9B在代码补全、代码解释器、代码问答、代码翻译、代码修复等任务上均取得了最佳效果：
 ![NaturalCodeBench测试结果](./pics/NCB&HUMANEVAL.png)
 ### 2. CodeGeeX4-ALL-9B上下文处理能力
 CodeGeeX4-ALL-9B上下文处理能力达到了128K，相较于上一代模型增长8倍！
 对于参数量10B以下的代码大模型，从海量的代码中准确提取信息是一个关键性的挑战。CodeGeeX4-ALL-9B升级支持128K上下文，使其能够处理和利用更长代码文件、甚至是项目代码中的信息，有助于模型更深入理解复杂和细节丰富的代码。基于更长的上下文，CodeGeeX4-ALL-9B可以处理更复杂的项目级任务，在输入显著变长的情况下，依然能准确回答不同代码文件中的内容，并对代码作出修改。
 在“大海捞针”（Needle In A Haystack, NIAH）评估中，CodeGeeX4-ALL-9B模型展示了其在处理长达128K的上下文中进行代码的嵌入和检索能力，实现了100%的检索准确度。
 ![NIAH_PYTHON评估](./pics/NIAH_PYTHON.png)
 ![NIAH_ALL_FILES评估](./pics/NIAH_ALL.png)
 上图展示的是在一个全部由Python代码组成的测试集中，插入一个赋值语句如：`zhipu_codemodel = "codegeex"`（Needle），测试模型是否可以正确回答出`zhipu_codemodel`的值，CodeGeeX4-ALL-9B 100%完成任务。
 ### 3. CodeGeeX4-ALL-9B 支持 Function Call 能力
 CodeGeeX4-ALL-9B是目前唯一一个实现Function Call的代码大模型。
 Berkeley Function Calling Leaderboard是第一个可全面评估大模型函数调用能力的测试集。其中AST数据集是评估模型对Java、JavaScript、Python程序的调用能力；Executable数据集是评估模型对真实场景API的函数调用能力。
 ![Berkeley Function Calling Leaderboard](./pics/FunctionCall.png)
 CodeGeeX4-ALL-9B在Berkeley Function Calling Leaderboard上进行了全面的测试，包括各种形式的函数调用、不同的函数调用场景以及函数调用可执行性的测试，得到了以下结果：在AST和Exec测试集中调用成功率超过90%。
 ### 4. CodeGeeX4-ALL-9B 跨文件补全
 Cross-File Evaluation是一个多语言的基准，建立在Python、Java、TypeScript和C#的多样化真实仓库之上。它使用基于静态分析的方法，严格要求跨文件上下文以实现准确的代码补全。
 | Model            | PYTHON EM | PYTHON ES | JAVA EM | JAVA ES | TypeScript EM | TypeScript ES | C# EM  | C# ES  |
 |------------------|------------|------------|----------|----------|----------------|----------------|---------|---------|
 | DeepSeekCoder-7B | 29.9       | 62.9       | 39.8     | 74.8     | 39             | 77             | 52.2    | 78.1    |
 | StarCoder2-7B    | 25.3       | 58         | 31.4     | 67.4     | 33.3           | 73.2           | 43.5    | 69.8    |
 | CodeLlama-7B     | 23.5       | 53.5       | 33.9     | 68.4     | 11.5           | 71.5           | 50.6    | 75.4    |
 | CodeGeeX-9B      | 32.3      | 70.3      | 48.6    | 84.4    | 35.3          | 78.0          | 48.0   | 84.8   |
--- a/metric/pics/Bigcodebench.PNG
+++ b/metric/pics/Bigcodebench.PNG
--- a/metric/pics/FunctionCall.png
+++ b/metric/pics/FunctionCall.png
--- a/metric/pics/NCB&HUMANEVAL.png
+++ b/metric/pics/NCB&HUMANEVAL.png
--- a/metric/pics/NIAH_ALL.png
+++ b/metric/pics/NIAH_ALL.png
--- a/metric/pics/NIAH_PYTHON.png
+++ b/metric/pics/NIAH_PYTHON.png
--- a/metric/pics/cce.jpg
+++ b/metric/pics/cce.jpg
--- a/repodemo/.chainlit/config.toml
+++ b/repodemo/.chainlit/config.toml
@ -0,0 +1,118 @@
 [project]
 # Whether to enable telemetry (default: true). No personal data is collected.
 enable_telemetry = false
 # List of environment variables to be provided by each user to use the app.
 user_env = []
 # Duration (in seconds) during which the session is saved when the connection is lost
 session_timeout = 3600
 # Enable third parties caching (e.g LangChain cache)
 cache = false
 # Authorized origins
 allow_origins = ["*"]
 # Follow symlink for asset mount (see https://github.com/Chainlit/chainlit/issues/317)
 # follow_symlink = false
 [features]
 # Process and display HTML in messages. This can be a security risk (see https://stackoverflow.com/questions/19603097/why-is-it-dangerous-to-render-user-generated-html-or-javascript)
 unsafe_allow_html = false
 # Process and display mathematical expressions. This can clash with "$" characters in messages.
 latex = false
 # Automatically tag threads with the current chat profile (if a chat profile is used)
 auto_tag_thread = true
 # Authorize users to spontaneously upload files with messages
 [features.spontaneous_file_upload]
    enabled = false
    accept = ["*/*"]
    max_files = 20
    max_size_mb = 500
 [features.audio]
    # Threshold for audio recording
    min_decibels = -45
    # Delay for the user to start speaking in MS
    initial_silence_timeout = 3000
    # Delay for the user to continue speaking in MS. If the user stops speaking for this duration, the recording will stop.
    silence_timeout = 1500
    # Above this duration (MS), the recording will forcefully stop.
    max_duration = 15000
    # Duration of the audio chunks in MS
    chunk_duration = 1000
    # Sample rate of the audio
    sample_rate = 44100
 [UI]
 # Name of the assistant.
 name = "CodeGeeX4 RepoDemo"
 # Description of the assistant. This is used for HTML tags.
 description = "CodeGeeX4项目级能力展示"
 # Large size content are by default collapsed for a cleaner ui
 default_collapse_content = true
 # Hide the chain of thought details from the user in the UI.
 hide_cot = false
 # Link to your github repo. This will add a github button in the UI's header.
 github = "https://github.com/CodeGeeX"
 # Specify a CSS file that can be used to customize the user interface.
 # The CSS file can be served from the public directory or via an external link.
 # custom_css = "/public/test.css"
 # Specify a Javascript file that can be used to customize the user interface.
 # The Javascript file can be served from the public directory.
 # custom_js = "/public/test.js"
 # Specify a custom font url.
 # custom_font = "https://fonts.googleapis.com/css2?family=Inter:wght@400;500;700&display=swap"
 # Specify a custom meta image url.
 custom_meta_image_url = "/public/logo_dark.png"
 # Specify a custom build directory for the frontend.
 # This can be used to customize the frontend code.
 # Be careful: If this is a relative path, it should not start with a slash.
 # custom_build = "./public/build"
 [UI.theme]
    default = "dark"
    layout = "wide"
    #font_family = "Inter, sans-serif"
 # Override default MUI light theme. (Check theme.ts)
 [UI.theme.light]
    #background = "#FAFAFA"
    #paper = "#FFFFFF"
    [UI.theme.light.primary]
        #main = "#F80061"
        #dark = "#980039"
        #light = "#FFE7EB"
    [UI.theme.light.text]
        #primary = "#212121"
        #secondary = "#616161"
 # Override default MUI dark theme. (Check theme.ts)
 [UI.theme.dark]
    #background = "#FAFAFA"
    #paper = "#FFFFFF"
    [UI.theme.dark.primary]
        #main = "#F80061"
        #dark = "#980039"
        #light = "#FFE7EB"
    [UI.theme.dark.text]
        #primary = "#EEEEEE"
        #secondary = "#BDBDBD"
 [meta]
 generated_by = "1.1.305"
--- a/repodemo/.chainlit/translations/en-US.json
+++ b/repodemo/.chainlit/translations/en-US.json
@ -0,0 +1,229 @@
 {
    "components": {
        "atoms": {
            "buttons": {
                "userButton": {
                    "menu": {
                        "settings": "Settings",
                        "settingsKey": "S",
                        "APIKeys": "API Keys",
                        "logout": "Logout"
                    }
                }
            }
        },
        "molecules": {
            "newChatButton": {
                "newChat": "New Chat"
            },
            "tasklist": {
                "TaskList": {
                    "title": "\ud83d\uddd2\ufe0f Task List",
                    "loading": "Loading...",
                    "error": "An error occurred"
                }
            },
            "attachments": {
                "cancelUpload": "Cancel upload",
                "removeAttachment": "Remove attachment"
            },
            "newChatDialog": {
                "createNewChat": "Create new chat?",
                "clearChat": "This will clear the current messages and start a new chat.",
                "cancel": "Cancel",
                "confirm": "Confirm"
            },
            "settingsModal": {
                "settings": "Settings",
                "expandMessages": "Expand Messages",
                "hideChainOfThought": "Hide Chain of Thought",
                "darkMode": "Dark Mode"
            },
            "detailsButton": {
                "using": "Using",
                "used": "Used"
            },
            "auth": {
                "authLogin": {
                    "title": "Login to access the app.",
                    "form": {
                        "email": "Email address",
                        "password": "Password",
                        "noAccount": "Don't have an account?",
                        "alreadyHaveAccount": "Already have an account?",
                        "signup": "Sign Up",
                        "signin": "Sign In",
                        "or": "OR",
                        "continue": "Continue",
                        "forgotPassword": "Forgot password?",
                        "passwordMustContain": "Your password must contain:",
                        "emailRequired": "email is a required field",
                        "passwordRequired": "password is a required field"
                    },
                    "error": {
                        "default": "Unable to sign in.",
                        "signin": "Try signing in with a different account.",
                        "oauthsignin": "Try signing in with a different account.",
                        "redirect_uri_mismatch": "The redirect URI is not matching the oauth app configuration.",
                        "oauthcallbackerror": "Try signing in with a different account.",
                        "oauthcreateaccount": "Try signing in with a different account.",
                        "emailcreateaccount": "Try signing in with a different account.",
                        "callback": "Try signing in with a different account.",
                        "oauthaccountnotlinked": "To confirm your identity, sign in with the same account you used originally.",
                        "emailsignin": "The e-mail could not be sent.",
                        "emailverify": "Please verify your email, a new email has been sent.",
                        "credentialssignin": "Sign in failed. Check the details you provided are correct.",
                        "sessionrequired": "Please sign in to access this page."
                    }
                },
                "authVerifyEmail": {
                    "almostThere": "You're almost there! We've sent an email to ",
                    "verifyEmailLink": "Please click on the link in that email to complete your signup.",
                    "didNotReceive": "Can't find the email?",
                    "resendEmail": "Resend email",
                    "goBack": "Go Back",
                    "emailSent": "Email sent successfully.",
                    "verifyEmail": "Verify your email address"
                },
                "providerButton": {
                    "continue": "Continue with {{provider}}",
                    "signup": "Sign up with {{provider}}"
                },
                "authResetPassword": {
                    "newPasswordRequired": "New password is a required field",
                    "passwordsMustMatch": "Passwords must match",
                    "confirmPasswordRequired": "Confirm password is a required field",
                    "newPassword": "New password",
                    "confirmPassword": "Confirm password",
                    "resetPassword": "Reset Password"
                },
                "authForgotPassword": {
                    "email": "Email address",
                    "emailRequired": "email is a required field",
                    "emailSent": "Please check the email address {{email}} for instructions to reset your password.",
                    "enterEmail": "Enter your email address and we will send you instructions to reset your password.",
                    "resendEmail": "Resend email",
                    "continue": "Continue",
                    "goBack": "Go Back"
                }
            }
        },
        "organisms": {
            "chat": {
                "history": {
                    "index": {
                        "showHistory": "Show history",
                        "lastInputs": "Last Inputs",
                        "noInputs": "Such empty...",
                        "loading": "Loading..."
                    }
                },
                "inputBox": {
                    "input": {
                        "placeholder": "Type your message here..."
                    },
                    "speechButton": {
                        "start": "Start recording",
                        "stop": "Stop recording"
                    },
                    "SubmitButton": {
                        "sendMessage": "Send message",
                        "stopTask": "Stop Task"
                    },
                    "UploadButton": {
                        "attachFiles": "Attach files"
                    },
                    "waterMark": {
                        "text": "Built with"
                    }
                },
                "Messages": {
                    "index": {
                        "running": "Running",
                        "executedSuccessfully": "executed successfully",
                        "failed": "failed",
                        "feedbackUpdated": "Feedback updated",
                        "updating": "Updating"
                    }
                },
                "dropScreen": {
                    "dropYourFilesHere": "Drop your files here"
                },
                "index": {
                    "failedToUpload": "Failed to upload",
                    "cancelledUploadOf": "Cancelled upload of",
                    "couldNotReachServer": "Could not reach the server",
                    "continuingChat": "Continuing previous chat"
                },
                "settings": {
                    "settingsPanel": "Settings panel",
                    "reset": "Reset",
                    "cancel": "Cancel",
                    "confirm": "Confirm"
                }
            },
            "threadHistory": {
                "sidebar": {
                    "filters": {
                        "FeedbackSelect": {
                            "feedbackAll": "Feedback: All",
                            "feedbackPositive": "Feedback: Positive",
                            "feedbackNegative": "Feedback: Negative"
                        },
                        "SearchBar": {
                            "search": "Search"
                        }
                    },
                    "DeleteThreadButton": {
                        "confirmMessage": "This will delete the thread as well as its messages and elements.",
                        "cancel": "Cancel",
                        "confirm": "Confirm",
                        "deletingChat": "Deleting chat",
                        "chatDeleted": "Chat deleted"
                    },
                    "index": {
                        "pastChats": "Past Chats"
                    },
                    "ThreadList": {
                        "empty": "Empty...",
                        "today": "Today",
                        "yesterday": "Yesterday",
                        "previous7days": "Previous 7 days",
                        "previous30days": "Previous 30 days"
                    },
                    "TriggerButton": {
                        "closeSidebar": "Close sidebar",
                        "openSidebar": "Open sidebar"
                    }
                },
                "Thread": {
                    "backToChat": "Go back to chat",
                    "chatCreatedOn": "This chat was created on"
                }
            },
            "header": {
                "chat": "Chat",
                "readme": "Readme"
            }
        }
    },
    "hooks": {
        "useLLMProviders": {
            "failedToFetchProviders": "Failed to fetch providers:"
        }
    },
    "pages": {
        "Design": {},
        "Env": {
            "savedSuccessfully": "Saved successfully",
            "requiredApiKeys": "Required API Keys",
            "requiredApiKeysInfo": "To use this app, the following API keys are required. The keys are stored on your device's local storage."
        },
        "Page": {
            "notPartOfProject": "You are not part of this project."
        },
        "ResumeButton": {
            "resumeChat": "Resume Chat"
        }
    }
 }
--- a/repodemo/.chainlit/translations/zh-CN.json
+++ b/repodemo/.chainlit/translations/zh-CN.json
@ -0,0 +1,229 @@
 {
    "components": {
        "atoms": {
            "buttons": {
                "userButton": {
                    "menu": {
                        "settings": "设置",
                        "settingsKey": "S",
                        "APIKeys": "API 密钥",
                        "logout": "登出"
                    }
                }
            }
        },
        "molecules": {
            "newChatButton": {
                "newChat": "新聊天"
            },
            "tasklist": {
                "TaskList": {
                    "title": "🗒️ 任务列表",
                    "loading": "加载中...",
                    "error": "发生错误"
                }
            },
            "attachments": {
                "cancelUpload": "取消上传",
                "removeAttachment": "移除附件"
            },
            "newChatDialog": {
                "createNewChat": "创建新聊天？",
                "clearChat": "这将清除当前消息并开始新聊天。",
                "cancel": "取消",
                "confirm": "确认"
            },
            "settingsModal": {
                "settings": "设置",
                "expandMessages": "展开消息",
                "hideChainOfThought": "隐藏思路",
                "darkMode": "深色模式"
            },
            "detailsButton": {
                "using": "正在使用",
                "used": "已使用"
            },
            "auth": {
                "authLogin": {
                    "title": "登录以访问应用程序。",
                    "form": {
                        "email": "电子邮件地址",
                        "password": "密码",
                        "noAccount": "没有账户？",
                        "alreadyHaveAccount": "已有账户？",
                        "signup": "注册",
                        "signin": "登录",
                        "or": "或",
                        "continue": "继续",
                        "forgotPassword": "忘记密码？",
                        "passwordMustContain": "您的密码必须包含：",
                        "emailRequired": "电子邮件是必填字段",
                        "passwordRequired": "密码是必填字段"
                    },
                    "error": {
                        "default": "无法登录。",
                        "signin": "尝试使用其他账户登录。",
                        "oauthsignin": "尝试使用其他账户登录。",
                        "redirect_uri_mismatch": "重定向 URI 与 OAuth 应用程序配置不匹配。",
                        "oauthcallbackerror": "尝试使用其他账户登录。",
                        "oauthcreateaccount": "尝试使用其他账户登录。",
                        "emailcreateaccount": "尝试使用其他账户登录。",
                        "callback": "尝试使用其他账户登录。",
                        "oauthaccountnotlinked": "要确认您的身份，请使用您最初使用的同一账户登录。",
                        "emailsignin": "无法发送电子邮件。",
                        "emailverify": "请验证您的电子邮件，新邮件已发送。",
                        "credentialssignin": "登录失败。请检查您提供的详细信息是否正确。",
                        "sessionrequired": "请登录以访问此页面。"
                    }
                },
                "authVerifyEmail": {
                    "almostThere": "您快完成了！我们已发送电子邮件至",
                    "verifyEmailLink": "请点击该电子邮件中的链接以完成注册。",
                    "didNotReceive": "找不到电子邮件？",
                    "resendEmail": "重新发送电子邮件",
                    "goBack": "返回",
                    "emailSent": "电子邮件发送成功。",
                    "verifyEmail": "验证您的电子邮件地址"
                },
                "providerButton": {
                    "continue": "继续使用{{provider}}",
                    "signup": "使用{{provider}}注册"
                },
                "authResetPassword": {
                    "newPasswordRequired": "新密码是必填字段",
                    "passwordsMustMatch": "密码必须匹配",
                    "confirmPasswordRequired": "确认密码是必填字段",
                    "newPassword": "新密码",
                    "confirmPassword": "确认密码",
                    "resetPassword": "重置密码"
                },
                "authForgotPassword": {
                    "email": "电子邮件地址",
                    "emailRequired": "电子邮件是必填字段",
                    "emailSent": "请检查电子邮件地址 {{email}} 以获取重置密码的说明。",
                    "enterEmail": "输入您的电子邮件地址，我们将发送重置密码的说明。",
                    "resendEmail": "重新发送电子邮件",
                    "continue": "继续",
                    "goBack": "返回"
                }
            }
        },
        "organisms": {
            "chat": {
                "history": {
                    "index": {
                        "showHistory": "显示历史记录",
                        "lastInputs": "上次输入",
                        "noInputs": "空空如也...",
                        "loading": "加载中..."
                    }
                },
                "inputBox": {
                    "input": {
                        "placeholder": "在此输入您的消息..."
                    },
                    "speechButton": {
                        "start": "开始录音",
                        "stop": "停止录音"
                    },
                    "SubmitButton": {
                        "sendMessage": "发送消息",
                        "stopTask": "停止任务"
                    },
                    "UploadButton": {
                        "attachFiles": "附加文件"
                    },
                    "waterMark": {
                        "text": "由codegeex团队提供支持"
                    }
                },
                "Messages": {
                    "index": {
                        "running": "运行中",
                        "executedSuccessfully": "执行成功",
                        "failed": "失败",
                        "feedbackUpdated": "反馈已更新",
                        "updating": "更新中"
                    }
                },
                "dropScreen": {
                    "dropYourFilesHere": "将文件拖放到此处"
                },
                "index": {
                    "failedToUpload": "上传失败",
                    "cancelledUploadOf": "取消上传",
                    "couldNotReachServer": "无法连接服务器",
                    "continuingChat": "继续先前的聊天"
                },
                "settings": {
                    "settingsPanel": "设置面板",
                    "reset": "重置",
                    "cancel": "取消",
                    "confirm": "确认"
                }
            },
            "threadHistory": {
                "sidebar": {
                    "filters": {
                        "FeedbackSelect": {
                            "feedbackAll": "反馈：全部",
                            "feedbackPositive": "反馈：正面",
                            "feedbackNegative": "反馈：负面"
                        },
                        "SearchBar": {
                            "search": "搜索"
                        }
                    },
                    "DeleteThreadButton": {
                        "confirmMessage": "这将删除线程及其消息和元素。",
                        "cancel": "取消",
                        "confirm": "确认",
                        "deletingChat": "删除聊天中",
                        "chatDeleted": "聊天已删除"
                    },
                    "index": {
                        "pastChats": "过去的聊天"
                    },
                    "ThreadList": {
                        "empty": "空的...",
                        "today": "今天",
                        "yesterday": "昨天",
                        "previous7days": "前7天",
                        "previous30days": "前30天"
                    },
                    "TriggerButton": {
                        "closeSidebar": "关闭侧边栏",
                        "openSidebar": "打开侧边栏"
                    }
                },
                "Thread": {
                    "backToChat": "返回聊天",
                    "chatCreatedOn": "此聊天创建于"
                }
            },
            "header": {
                "chat": "聊天",
                "readme": "自述文件"
            }
        }
    },
    "hooks": {
        "useLLMProviders": {
            "failedToFetchProviders": "无法获取提供者："
        }
    },
    "pages": {
        "Design": {},
        "Env": {
            "savedSuccessfully": "保存成功",
            "requiredApiKeys": "所需 API 密钥",
            "requiredApiKeysInfo": "要使用此应用程序，需要以下 API 密钥。这些密钥存储在您设备的本地存储中。"
        },
        "Page": {
            "notPartOfProject": "您不属于此项目。"
        },
        "ResumeButton": {
            "resumeChat": "继续聊天"
        }
    }
 }
--- a/repodemo/chainlit.md
+++ b/repodemo/chainlit.md
@ -0,0 +1,35 @@
 # CodeGeeX
 # Welcome to the chat demo application
 This is a simple demonstration application.
 ## Usage
 1. Enter your question
 2. Wait for the reply
 3. Enjoy the conversation!
 ## Features
 - Supports multi-round dialogue
 - Supports uploading a local project as a zip archive, then answering questions about the project and modifying it
 ## Installation
 1. Clone the repository to your local machine
 2. Install dependencies: `pip install -r requirements.txt`
 3. Run the application: `chainlit run run.py --port 8888`
 ## Note
 Please ensure that your network environment can access the CodeGeeX API.
 ## Disclaimer
 This application is for learning and research purposes only and shall not be used for any commercial purposes. The developer is not responsible for any loss or damage caused by the use of this application.
 ## Thank you
 Thank you for using our application. If you have any questions or suggestions, please feel free to contact us. We look forward to your feedback and are committed to providing you with better services.
--- a/repodemo/chainlit_zh-CN.md
+++ b/repodemo/chainlit_zh-CN.md
@ -0,0 +1,34 @@
 # CodeGeeX
 # 欢迎使用我的chat demo应用
 这是一个简单的演示应用程序。
 ## 使用说明
 1. 输入您的问题
 2. 等待回复
 3. 享受对话！
 ## 功能
 -  支持多轮对话
 -  支持上传本地zip压缩包项目，可以进行项目问答和对项目进行修改
 ## 安装
 1. 克隆仓库到本地
 2. 安装依赖：`pip install -r requirements.txt`
 3. 运行应用：`chainlit run run.py --port 8888`
 ## 注意
 请确保您的网络环境可以访问CodeGeeX的API。
 ##   免责声明
 本应用仅供学习和研究使用，不得用于任何商业用途。开发者不对因使用本应用而导致的任何损失或损害负责。
 ##     感谢
 感谢您使用我们的应用。如果您有任何问题或建议，请随时联系我们。我们期待您的反馈，并致力于为您提供更好的服务。
--- a/repodemo/llm/api/codegeex4.py
+++ b/repodemo/llm/api/codegeex4.py
@ -0,0 +1,41 @@
 import requests
 import json
 URL = ""  # URL of your deployed CodeGeeX service
 def codegeex4(prompt, temperature=0.8, top_p=0.8):
    url = URL
    headers = {
        'Content-Type': 'application/json'
    }
    data = {
        'inputs': prompt,
        'parameters': {
            'best_of':1,
            'do_sample': True,
            'max_new_tokens': 4012,
            'temperature': temperature,
            'top_p': top_p,
            'stop': ["<|endoftext|>", "<|user|>", "<|observation|>", "<|assistant|>"],
        }
    }
    response = requests.post(url, json=data, headers=headers, verify=False, stream=True)
    if response.status_code == 200:
        for line in response.iter_lines():
            if line:
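                # Strip only the leading SSE "data:" prefix so token text that contains "data:" is left intact.
                decoded_line = line.decode('utf-8').strip()
                if decoded_line.startswith('data:'):
                    decoded_line = decoded_line[len('data:'):].strip()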
                if decoded_line:
                    try:
                        content = json.loads(decoded_line)
                        token_text = content.get('token', {}).get('text', '')
                        if '<|endoftext|>' in token_text:
                            break  
                        yield token_text
                    except json.JSONDecodeError:
                        continue
    else:
        print('Request failed:', response.status_code)
--- a/repodemo/prompts/base_prompt.py
+++ b/repodemo/prompts/base_prompt.py
@ -0,0 +1,42 @@
 base_system_prompt = """<|system|>\n你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。"""
 repo_system_prompt = """<|system|>\n你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码。请根据用户给出的项目仓库中的代码，以及用户提出的需求，生成新的代码或者更改已有代码。输出格式：\n\n###PATH:{PATH}\n{CODE}"""
 judge_task_prompt = """<|system|>\n你是一位任务分类专家，请你对用户的输入进行分类（问答/修改/正常），如果用户的输入是对项目进行提问则只需要输出问答两个字，如果用户的输入是对项目进行修改或增加则只需要输出修改两个字，如果用户输入的是一个与项目无关的问题则只需要输出正常两个字。<|user|>\n{user_input}<|assistant|>\n"""
 web_judge_task_prompt ="""<|system|>\n你是一位智能编程助手，你叫CodeGeeX。你会为用户回答关于编程、代码、计算机方面的任何问题，并提供格式规范、可以执行、准确安全的代码，并在必要时提供详细的解释。<|user|>\n{user_input}\n这个问题需要进行联网来回答吗？仅回答“是”或者“否”。<|assistant|>\n"""
 # judge_task_prompt = """<|system|>\n你是一位任务分类专家，请你对用户的输入进行分类（问答/修改），如果用户的输入是对项目进行提问则只需要输出问答两个字，如果用户的输入是对项目进行修改或增加则只需要输出修改两个字。<|user|>\n{user_input}<|assistant|>\n"""
 web_search_prompy = """
 你将接收到一个用户提出的问题，并请撰写清晰、简洁且准确的答案。
 # Note
 - 您将获得与问题相关的多个上下文片段，每个上下文都以引用编号开头，例如[[citation:x]]，其中x是一个数字。如果适用，请使用上下文并在每个句子的末尾引用上下文。
 - 您的答案必须是正确的、准确的，并且以专家的身份使用无偏见和专业的语调来撰写。
 - 请你的回答限制在2千字以内，不要提供与问题无关的信息，也不要重复。
 - 请以引用编号的格式[[citation:x]]来引用上下文。如果一个句子来自多个上下文，请列出所有适用的引用，例如[[citation:3]][[citation:5]]。
 - 若所有上下文均不相关，请以自己的理解回答用户提出的问题，此时回答中可以不带引用编号。
 - 除了代码和特定的名称和引用外，您的答案必须使用与问题相同的语言来撰写。
 """.lstrip()
 def get_cur_base_user_prompt(message_history,index_prompt = None,judge_context = ""):
    user_prompt_tmp = """<|user|>\n{user_input}"""
    assistant_prompt_tmp = """<|assistant|>\n{assistant_input}"""
    history_prompt = ""
    for i,message in enumerate(message_history):
        if message['role'] == 'user':
            if i==0 and index_prompt is not None:
                history_prompt += "<|user|>\n"+index_prompt+message['content']
            else:
                history_prompt += user_prompt_tmp.format(user_input=message['content'])
        elif message['role'] ==  'assistant':
            history_prompt += assistant_prompt_tmp.format(assistant_input=message['content'])
    # print("修改" not in judge_context)
    # print(judge_context)
    if "修改" not in judge_context:
        result = base_system_prompt+history_prompt+"""<|assistant|>\n"""
    else:
        result = repo_system_prompt+history_prompt+"""<|assistant|>\n"""
    print(result)
    return result
--- a/repodemo/public/avatars/user1.png
+++ b/repodemo/public/avatars/user1.png
--- a/repodemo/public/favicon.png
+++ b/repodemo/public/favicon.png
--- a/repodemo/public/logo_dark.png
+++ b/repodemo/public/logo_dark.png
--- a/repodemo/public/logo_light.png
+++ b/repodemo/public/logo_light.png
--- a/repodemo/readme.md
+++ b/repodemo/readme.md
@ -0,0 +1,26 @@
 # CodeGeeX
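 # Welcome to the chat demo application
 This is a simple demonstration application.
 ## Usage
 1. Enter your question
 2. Wait for the reply
 3. Enjoy the conversation!
 ## Features
 - Supports multi-round dialogue
 - Supports uploading a local project as a zip archive, then answering questions about the project and modifying it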
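A minimal sketch of how an uploaded zip project could be turned into the project-index text that precedes the user's question. The `###PATH:{path}\n{code}\n` layout follows the template used in `run.py`; the helpers `index_zip` and `build_index_prompt` and the extension filter are hypothetical and are not the demo's own `utils/tools.py` functions.

```python
import io
import zipfile

# Same layout as the index template in run.py.
INDEX_TEMPLATE = "###PATH:{path}\n{code}\n"


def index_zip(zip_bytes, extensions=(".py", ".md")):
    """Return [{'path': ..., 'content': ...}] for matching text files in a zip archive."""
    files = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.endswith(extensions):
                continue
            content = archive.read(info).decode("utf-8", errors="replace")
            files.append({"path": info.filename, "content": content})
    return files


def build_index_prompt(files):
    """Concatenate every file into one prompt prefix."""
    return "".join(
        INDEX_TEMPLATE.format(path=f["path"], code=f["content"]) for f in files
    )


# Build a tiny in-memory archive to exercise the helpers.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("demo/main.py", "print('hello')\n")
    archive.writestr("demo/logo.png", b"\x89PNG")
index_prompt = build_index_prompt(index_zip(buffer.getvalue()))
```

Filtering by extension keeps binary assets out of the prompt, which matters because the whole index has to fit inside the model's context window.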
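 ## Installation
 1. Clone the repository to your local machine
 2. Install dependencies: `pip install -r requirements.txt`
 3. Run the application: `chainlit run run.py --port 8888`
 ## Note
 Please ensure that your network environment can access the CodeGeeX API.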
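For reference, the demo's client in `llm/api/codegeex4.py` reads the service's streamed reply line by line, expecting each line to carry JSON with the generated text under `token.text`. The sketch below parses such lines offline; the sample payloads are invented for illustration, and `iter_tokens` is a hypothetical helper rather than part of the demo.

```python
import json


def iter_tokens(lines, stop_token="<|endoftext|>"):
    """Yield token text from streamed 'data:{...}' lines until the stop token appears."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if line.startswith("data:"):
            line = line[len("data:"):].strip()
        if not line:
            continue  # skip keep-alive blank lines
        try:
            payload = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        text = payload.get("token", {}).get("text", "")
        if stop_token in text:
            break
        yield text


# Invented sample stream, shaped like the lines the client expects.
sample_lines = [
    b'data:{"token": {"text": "Hello"}}',
    b"",
    b'data:{"token": {"text": ", world"}}',
    b"data:not-json",
    b'data:{"token": {"text": "<|endoftext|>"}}',
    b'data:{"token": {"text": "ignored"}}',
]
answer = "".join(iter_tokens(sample_lines))
```

Feeding the generator recorded lines like this is a cheap way to test the parsing logic without a deployed service.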
--- a/repodemo/requirements.txt
+++ b/repodemo/requirements.txt
@ -0,0 +1,2 @@
 chainlit==1.1.305
 beautifulsoup4
--- a/repodemo/run.py
+++ b/repodemo/run.py
@ -0,0 +1,157 @@
 import chainlit as cl
 from chainlit.input_widget import Slider
 from llm.api.codegeex4 import codegeex4
 from prompts.base_prompt import judge_task_prompt,get_cur_base_user_prompt,web_judge_task_prompt
 from utils.tools import unzip_file,get_project_files_with_content
 from utils.bingsearch import bing_search_prompt
@cl.set_chat_profiles
 async def chat_profile():
    return [
        cl.ChatProfile(
            name="chat聊天",
            markdown_description="聊天demo：支持多轮对话。",
            starters = [
                cl.Starter(
                label="请你用python写一个快速排序。",
                message="请你用python写一个快速排序。",
                ),
            cl.Starter(
                label="请你介绍一下自己。",
                message="请你介绍一下自己。",
                ),
            cl.Starter(
                label="用 Python 编写一个脚本来自动发送每日电子邮件报告，并指导我如何进行设置。",
                message="用 Python 编写一个脚本来自动发送每日电子邮件报告，并指导我如何进行设置。",
                ),
            cl.Starter(
                label="我是一个python初学者，请你告诉我怎么才能学好python。",
                message="我是一个python初学者，请你告诉我怎么才能学好python。",
                )
            ]
        ),
        cl.ChatProfile(
            name="联网问答",
            markdown_description="联网能力demo：支持联网回答用户问题。",
        ),
        cl.ChatProfile(
            name="上传本地项目",
            markdown_description="项目级能力demo：支持上传本地zip压缩包项目，可以进行项目问答和对项目进行修改。",
        )
    ]
@cl.on_chat_start
 async def start():
    settings = await cl.ChatSettings(
        [
            Slider(
                id="temperature",
                label="CodeGeeX4 - Temperature",
                initial=0.2,
                min=0,
                max=1,
                step=0.1,
            ),
            Slider(
                id="top_p",
                label="CodeGeeX4 - top_p",
                initial=0.95,
                min=0,
                max=1,
                step=0.1,
            ),
        ]
    ).send()
    temperature = settings["temperature"]
    top_p = settings["top_p"]
    cl.user_session.set('temperature',temperature)
    cl.user_session.set('top_p',top_p)
    cl.user_session.set(
        "message_history",
        []
    )
    chat_profile = cl.user_session.get("chat_profile")
    extract_dir = 'repodata'
    if chat_profile == "chat聊天":
        pass
    elif chat_profile =="上传本地项目":
        files = None
        while files is None:
            files = await cl.AskFileMessage(
                content="请上传项目zip压缩文件!", accept={"application/zip": [".zip"]},max_size_mb=50
            ).send()
        text_file = files[0]
        extracted_path = unzip_file(text_file.path,extract_dir)
        files_list = get_project_files_with_content(extracted_path)
        cl.user_session.set("project_index",files_list)
        if len(files_list)>0:
            await cl.Message(
                content="已成功上传，您可以开始对项目进行提问！",
            ).send()
@cl.on_message
 async def main(message: cl.Message):
    chat_profile = cl.user_session.get("chat_profile")
    message_history = cl.user_session.get("message_history")
    message_history.append({"role": "user", "content": message.content})
    if chat_profile == "chat聊天":
        prompt_content = get_cur_base_user_prompt(message_history=message_history)
    elif chat_profile=="联网问答":
        judge_tmp = codegeex4(web_judge_task_prompt.format(user_input=message.content),temperature=0.2,top_p = 0.95)
        judge_context = '\n'.join(judge_tmp)
        print(judge_context)
        message_history.pop()
        if '是' in judge_context:
            prompt_tmp = bing_search_prompt(message.content)
            message_history.append({"role": "user", "content": prompt_tmp})
        else:
            message_history.append({"role": "user", "content": message.content})
        prompt_content = get_cur_base_user_prompt(message_history=message_history)
    elif chat_profile =="上传本地项目" :
        judge_tmp = codegeex4(judge_task_prompt.format(user_input=message.content),temperature=0.2,top_p = 0.95)
        judge_context = ''
        for part in judge_tmp:
            judge_context+=part
        project_index = cl.user_session.get("project_index")
        index_prompt = ""
        index_tmp = """###PATH:{path}\n{code}\n"""
        for index in project_index:
            index_prompt+=index_tmp.format(path=index['path'],code=index['content'])
        print(judge_context)
        if '正常' not in judge_context:
            prompt_content = get_cur_base_user_prompt(message_history=message_history,index_prompt=index_prompt,judge_context=judge_context)
        else:
            prompt_content = get_cur_base_user_prompt(message_history=message_history)
    msg = cl.Message(content="")
    await msg.send()
    temperature = cl.user_session.get("temperature")
    top_p = cl.user_session.get('top_p')
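    # Rough size check: assume about 4 characters per token and cap the prompt at 120000 estimated tokens.
    if len(prompt_content)/4<120000: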
        stream =  codegeex4(prompt_content,temperature=temperature,top_p = top_p)
        for part in stream:
            if token := (part or " "):
                await msg.stream_token(token)
    else:
        await msg.stream_token("项目太大了，请换小一点的项目。")
    message_history.append({"role": "assistant", "content": msg.content})
    await msg.update()
--- a/repodemo/utils/keep.txt
+++ b/repodemo/utils/keep.txt
@ -0,0 +1,19 @@
 package-lock.json
 package.json
 config.json
 LICENSE
 yarn.lock
 requirements.txt
 Dockerfile
 build.gradle
 Makefile
 go.mod
 go.sum
 CHANGES.txt
 Cargo.toml
 pubspec.yaml
 root.json
 snapshot.json
 targets.json
 timestamp.json
 Cargo.lock
--- a/repodemo/utils/programming-languages-to-file-extensions.json
+++ b/repodemo/utils/programming-languages-to-file-extensions.json
--- a/repodemo/utils/tools.py
+++ b/repodemo/utils/tools.py
@ -0,0 +1,154 @@
 import zipfile
 import os
 import json
 def unzip_file(zip_path, extract_dir):
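    """
    Extract a zip archive into a new subdirectory (named after the archive) under the given directory.
    Args:
    zip_path (str): path to the zip archive
    extract_dir (str): directory to extract into
    Returns:
    str: path of the extracted directory
    """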
    if not os.path.exists(extract_dir):
        os.makedirs(extract_dir)
    base_name = os.path.basename(zip_path)
    dir_name = os.path.splitext(base_name)[0]
    new_extract_dir = os.path.join(extract_dir, dir_name)
    if not os.path.exists(new_extract_dir):
        os.makedirs(new_extract_dir)
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall(new_extract_dir)
    return new_extract_dir
 def get_project_files_with_content(project_dir):
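    """
    Collect the relative path and content of every accepted file under the project directory.
    Args:
    project_dir (str): path to the project directory
    Returns:
    list: a list of dicts, each holding a file's relative path and content
    """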
    files_list = []
    for root, dirs, files in os.walk(project_dir):
        for file in files:
            if filter_data(file):
                file_path = os.path.join(root, file)
                relative_path = os.path.relpath(file_path, project_dir)
                if "__MACOSX" in relative_path:
                    continue
                with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
                    content = f.read()
                files_list.append({'path': relative_path, 'content': content})
            else:
                continue
    return files_list
 def filter_data(obj):
    LANGUAGE_TAG = {
    "c++"          : "// C++",
    "cpp"          : "// C++",
    "c"            : "// C",
    "c#"           : "// C#",
    "c-sharp"      : "// C#",
    "css"          : "/* CSS */",
    "cuda"         : "// Cuda",
    "fortran"      : "! Fortran",
    "go"           : "// Go",
    "html"         : "<!-- HTML -->",
    "java"         : "// Java",
    "js"           : "// JavaScript",
    "javascript"   : "// JavaScript",
    "kotlin"       : "// Kotlin",
    "lean"         : "-- Lean",
    "lua"          : "-- Lua",
    "objectivec"  : "// Objective-C",
    "objective-c"  : "// Objective-C",
    "objective-c++": "// Objective-C++",
    "pascal"       : "// Pascal",
    "php"          : "// PHP",
    "python"       : "# Python",
    "r"            : "# R",
    "rust"         : "// Rust",
    "ruby"         : "# Ruby",
    "scala"        : "// Scala",
    "shell"        : "# Shell",
    "sql"          : "-- SQL",
    "tex"          : f"% TeX",
    "typescript"   : "// TypeScript",
    "vue"          : "<!-- Vue -->",
    "assembly"     : "; Assembly",
    "dart"         : "// Dart",
    "perl"         : "# Perl",
    "prolog"       : f"% Prolog",
    "swift"        : "// swift",
    "lisp"         : "; Lisp",
    "vb"           : "' Visual Basic",
    "visual basic" : "' Visual Basic",
    "matlab"       : f"% Matlab",
    "delphi"       : "{ Delphi }",
    "scheme"       : "; Scheme",
    "basic"        : "' Basic",
    "groovy"       : "// Groovy",
    "abap"         : "* Abap",
    "gdscript"     : "# GDScript",
    "haskell"      : "-- Haskell",
    "julia"        : "# Julia",
    "elixir"       : "# Elixir",
    "excel"        : "' Excel",
    "clojure"      : "; Clojure",
    "actionscript" : "// ActionScript",
    "solidity"     : "// Solidity",
    "powershell"   : "# PowerShell",
    "erlang"       : f"% Erlang",
    "cobol"        : "// Cobol",
    "batchfile"  : ":: Batch file",
    "makefile"     : "# Makefile",
    "dockerfile"   : "# Dockerfile",
    "markdown"     : "<!-- Markdown -->",
    "cmake"        : "# CMake",
    }
    with open('utils/programming-languages-to-file-extensions.json', 'r', encoding='utf-8') as f:
        programming_languages_to_file_extensions = json.load(f)
    need2del = []
    for key in programming_languages_to_file_extensions.keys():
        if key.lower() not in LANGUAGE_TAG:
            need2del.append(key)
    for key in need2del:
        del programming_languages_to_file_extensions[key]
    ext_to_programming_languages = {}
    want_languages = []
    for key in programming_languages_to_file_extensions:
        for item in programming_languages_to_file_extensions[key]:
            ext_to_programming_languages[item] = key
            want_languages.append(item)
    ext = '.'+obj.split('.')[-1]
    with open('utils/keep.txt', 'r') as f:
        keep_files = f.readlines()
        keep_files = [l.strip() for l in keep_files]
    # Keep files with a recognised source extension, plus whitelisted file names.
    return ext in want_languages or obj in keep_files
--- a/resources/all_functions.jpg
+++ b/resources/all_functions.jpg
--- a/resources/logo.jpeg
+++ b/resources/logo.jpeg
--- a/web_demo/README.md
+++ b/web_demo/README.md
@ -0,0 +1,34 @@
 ![](../resources/logo.jpeg)
 [English](README.md) | [中文](README_zh.md)
 ## Online Functionality
 CodeGeeX4 supports online search and question answering: it calls the Bing API to retrieve search results, giving the model
 access to the latest information.
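As a rough illustration of the flow described above, the sketch below numbers each search snippet as a citation and places the user's question after them. It is a simplified version of `build_model_input` in `backend/utils/chat.py` (the real function also wraps each snippet in a markdown fence); the name `build_prompt` and the sample data in the usage note are assumptions made for this example.

```python
def build_prompt(query: str, search_res: list[dict]) -> str:
    """Number each snippet as [[citation:i]] and append the user's question."""
    citations = "\n\n".join(
        f"[[citation:{i + 1}]]\n{item['snippet']}"
        for i, item in enumerate(search_res)
    )
    # "[引用]" ("references") and "问：" ("question:") are the labels used by the project.
    return f"[引用]\n{citations}\n问：{query}\n"
```

For instance, `build_prompt("q?", [{"snippet": "A"}, {"snippet": "B"}])` yields a prompt whose snippets are tagged `[[citation:1]]` and `[[citation:2]]`, which the system prompt asks the model to cite in its answer.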
 ## Usage Tutorial
 ### 1. Install Dependencies
 ```bash
 pip install -r requirements.txt
 ```
 ### 2. Configure Bing API Key
 Configure `BING_API_KEY` in `backend/apis/bing.py`, which reads it from the `BING_API_KEY` environment variable.
 For more details, refer
 to [Bing Search API](https://learn.microsoft.com/zh-cn/previous-versions/azure/cognitive-services/Bing-Web-Search/bing-api-comparison).
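In this commit `backend/apis/bing.py` obtains the key with `os.getenv('BING_API_KEY')`, so exporting that variable before launching is one way to configure it. The sketch below shows a possible early check; `require_bing_key` is a hypothetical helper written for this example and is not part of the project.

```python
import os


def require_bing_key(env=os.environ) -> str:
    """Return the Bing key from the environment, or raise a clear error."""
    key = env.get("BING_API_KEY", "").strip()
    if not key:
        # Failing early gives a clearer message than an empty search result later.
        raise RuntimeError("BING_API_KEY is not set; export it before running main.py")
    return key
```

Calling such a check once at startup would surface a missing key immediately, rather than as a failed search at query time.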
 ### 3. Run the Project
 ```bash
 python main.py
 >>> Running on local URL:  http://127.0.0.1:8080
 ```
 ## Demo
 ![](resources/demo.png)
--- a/web_demo/README_zh.md
+++ b/web_demo/README_zh.md
@ -0,0 +1,33 @@
 ![](../resources/logo.jpeg)
 [English](README.md) | [中文](README_zh.md)
 ## 联网功能
 CodeGeeX4支持联网搜索问答，通过调用Bing API获取搜索结果，可获取最新资讯。
 ## 使用教程
 ### 1. 安装依赖项
 ```bash
 pip install -r requirements.txt
 ```
 ### 2. 配置Bing API Key
 在`backend/apis/bing.py`中配置`BING_API_KEY`
 详情可参考 [必应搜索API](https://learn.microsoft.com/zh-cn/previous-versions/azure/cognitive-services/Bing-Web-Search/bing-api-comparison)
 ### 3. 运行项目
 ```bash
 python main.py
 >>> Running on local URL:  http://127.0.0.1:8080
 ```
 ## Demo
 ![](resources/demo_zh.png)
--- a/web_demo/backend/apis/api.py
+++ b/web_demo/backend/apis/api.py
@ -0,0 +1,12 @@
 class API:
    def call(self, **kwargs):
        return self.__call__(**kwargs)
    def __call__(self, **kwargs):
        search_kwargs = {'query': kwargs['query'], 'freshness': 'month'}
        search_res = self.search(**search_kwargs)
        return search_res
    @classmethod
    def search(cls, **kwargs) -> list[dict]:
        raise NotImplementedError
--- a/web_demo/backend/apis/bing.py
+++ b/web_demo/backend/apis/bing.py
@ -0,0 +1,43 @@
 """
 Bing Search
 """
 import os
 import requests
 from backend.apis.api import API
 BING_API_KEY = os.getenv('BING_API_KEY')
 class BingSearchAPI(API):
    def __init__(self):
        self.url = "https://api.bing.microsoft.com/v7.0/search"
    def search(self, query, freshness=None):
        """
        Search with bing
        References: https://docs.microsoft.com/en-us/bing/search-apis/bing-web-search/overview
        """
        try:
            # Network errors and timeouts are caught too, so a failed search degrades to "no results".
            response = requests.get(
                url=self.url,
                headers={"Ocp-Apim-Subscription-Key": BING_API_KEY},
                params={
                    "q": query,
                    "mkt": 'zh-CN',
                    "freshness": freshness,
                },
                timeout=10,
            )
            response.raise_for_status()
            json_content = response.json()
            contexts = json_content['webPages']['value'][:4]
            search_res = [{
                "url": item['url'],
                "title": item['name'],
                "snippet": item['snippet']
            } for item in contexts]
            return search_res
        except Exception as e:
            print(f"Searching failed, caused by {e}")
            return []
--- a/web_demo/backend/models/codegeex.py
+++ b/web_demo/backend/models/codegeex.py
@ -0,0 +1,10 @@
 import torch
 from transformers import AutoTokenizer, AutoModel
 device = "cuda" if torch.cuda.is_available() else "cpu"
 model_name_or_path = "THUDM/codegeex4-all-9b"
 tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
 model = AutoModel.from_pretrained(
    model_name_or_path,
    trust_remote_code=True
 ).to(device).eval()
--- a/web_demo/backend/services/chat.py
+++ b/web_demo/backend/services/chat.py
@ -0,0 +1,35 @@
 import json
 from backend.apis.api import API
 from backend.apis.bing import BingSearchAPI
 from backend.models.codegeex import model, tokenizer
 from backend.utils.chat import build_model_input, SYS_PROMPT
 def chat(query: str, history: list[list[str]] = None):
    if not history:
        history = []
    ans = ""
    # Search with bing
    api: API = BingSearchAPI()
    search_res = api.call(query=query, history=history)
    ans += "搜索结果".center(100, "-") + '\n'
    ans += "```json\n" + json.dumps(search_res, indent=4, ensure_ascii=False) + "\n```\n"
    yield ans
    # Build model's input
    inputs: str = build_model_input(query, search_res)
    # Generate response
    ans += "模型回复".center(100, "-") + '\n'
    yield ans
    response, _ = model.chat(
        tokenizer,
        query=inputs,
        history=[{"role": "system", "content": SYS_PROMPT}],
        max_new_tokens=1024,
        temperature=0.2
    )
    yield ans + response
--- a/web_demo/backend/utils/chat.py
+++ b/web_demo/backend/utils/chat.py
@ -0,0 +1,26 @@
 SYS_PROMPT = """
 你将接收到一个用户提出的问题，并请撰写清晰、简洁且准确的答案。
 # Note
 - 您将获得与问题相关的多个上下文片段，每个上下文都以引用编号开头，例如[[citation:x]]，其中x是一个数字。如果适用，请使用上下文并在每个句子的末尾引用上下文。
 - 您的答案必须是正确的、准确的，并且以专家的身份使用无偏见和专业的语调来撰写。
 - 请你的回答限制在2千字以内，不要提供与问题无关的信息，也不要重复。
 - 请以引用编号的格式[[citation:x]]来引用上下文。如果一个句子来自多个上下文，请列出所有适用的引用，例如[[citation:3]][[citation:5]]。
 - 若所有上下文均不相关，请以自己的理解回答用户提出的问题，此时回答中可以不带引用编号。
 - 除了代码和特定的名称和引用外，您的答案必须使用与问题相同的语言来撰写。
 """.lstrip()
 def build_model_input(query, search_res):
    """
    Build model's input
    :param query: user query
    :param search_res: bing's search results
    :return:
    """
    citations = "\n\n".join(
        [f"[[citation:{i + 1}]]\n```markdown\n{item['snippet']}\n```" for i, item in enumerate(search_res)]
    )
    prompt = f"[引用]\n{citations}\n问：{query}\n"
    return prompt
--- a/web_demo/main.py
+++ b/web_demo/main.py
@ -0,0 +1,10 @@
 """
 References: https://github.com/leptonai/search_with_lepton
 """
 import gradio as gr
 from backend.services.chat import chat
 if __name__ == "__main__":
    demo = gr.ChatInterface(chat).queue()
    demo.launch(server_name="127.0.0.1", server_port=8080)
--- a/web_demo/requirements.txt
+++ b/web_demo/requirements.txt
@ -0,0 +1,7 @@
 accelerate==0.31.0
 gradio==4.26.0
 regex==2024.5.15
 requests==2.32.3
 tiktoken==0.7.0
 torch==2.3.1
 transformers==4.39.0
--- a/web_demo/resources/demo.png
+++ b/web_demo/resources/demo.png
--- a/web_demo/resources/demo_zh.png
+++ b/web_demo/resources/demo_zh.png