AutoDev Coder

· 2 min read

TL;DR:

The first barely usable version of AutoDev Coder 6.7B, a coding LLM for AutoDev, is now available.

PS: AutoDev 1.5.1 is still awaiting approval on the JetBrains Marketplace (our overseas colleagues are still on holiday), so note that the model's performance on version 1.5.1 will be slightly better than on 1.5.0.

Additionally, with improved computing power support and better completion testing, we will reintroduce the original Inlay completion mode.

AutoDev Coder 6.7B v1 Experimental Version

The current version is fine-tuned from the DeepSeek Coder 6.7B Instruct model, which is based on the LLaMA architecture.

Note: As an experimental version, its primary purpose is to align the model, data tools, and IDE plugin for better coordination. Generation quality still requires further improvement.

AutoDev Coder 64k Dataset

The instruction composition of AutoDev Coder v1 64k is as follows:

Filename                                  Selected Instructions
java_oss.jsonl                            4000
python_oss.jsonl                          4000
code_bugfix_cleaned_5K.json               4000
codeGPT_CN_cleaned_20K.json               15000
code_summarization_CN_cleaned_10K.json    8000
code_generation_CN_cleaned_5K.json        4000
summary.jsonl                             25000
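For a quick sanity check, the mix above can be tallied in a few lines. The filenames and counts are copied from the table; this is just arithmetic, not part of the data pipeline:

```python
# Instruction mix of the AutoDev Coder v1 64k dataset, from the table above.
dataset_mix = {
    "java_oss.jsonl": 4000,
    "python_oss.jsonl": 4000,
    "code_bugfix_cleaned_5K.json": 4000,
    "codeGPT_CN_cleaned_20K.json": 15000,
    "code_summarization_CN_cleaned_10K.json": 8000,
    "code_generation_CN_cleaned_5K.json": 4000,
    "summary.jsonl": 25000,
}

total = sum(dataset_mix.values())
print(total)  # 64000, hence the "64k" in the dataset name
for name, count in dataset_mix.items():
    print(f"{name}: {count / total:.1%}")
```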

The summary.jsonl is generated by our open-source code fine-tuning data framework UnitGen (https://github.com/unit-mesh/unit-gen).

We selected dozens of Java and Kotlin open-source projects, generating instructions based on AutoDev plugin requirements, mainly categorized into three types:

  • Completion (inline, interline, interblock)
  • Documentation generation
  • Comment generation

Detailed documentation can be found in the UnitGen project: https://github.com/unit-mesh/unit-gen.

FAQ: AutoDev Coder Model Evaluation

Still under design. Since we need to combine AutoDev instructions with languages like Java, Kotlin, TypeScript rather than the Python-centric systems commonly used in open-source models, we need to rethink our evaluation approach.

Initially, we used instruction sets like OSS Instruct to supplement natural language to code generation, but found ~50,000 instructions (about 50%) were Python-related. After filtering, only ~5,000 Java instructions remained, which showed suboptimal results in AutoDev.
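The language filtering described above can be sketched roughly as follows. The `lang` field and the JSONL layout are assumptions made for illustration; the actual OSS Instruct schema may differ:

```python
import json

def filter_by_language(jsonl_lines, language):
    """Keep only instruction records tagged with the given language.
    Assumes each record carries a 'lang' field (illustrative only)."""
    kept = []
    for line in jsonl_lines:
        record = json.loads(line)
        if record.get("lang", "").lower() == language:
            kept.append(record)
    return kept

lines = [
    '{"lang": "python", "instruction": "..."}',
    '{"lang": "java", "instruction": "..."}',
]
print(len(filter_by_language(lines, "java")))  # 1
```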

FAQ: AutoDev Instructions

AutoDev employs contextual strategies that differ from other tools in instruction handling. Details: https://github.com/unit-mesh/auto-dev

AutoDev 1.4: Scaling AI-Assisted Development

· 4 min read

Over the past two months, Thoughtworks has rolled out large-scale AI-assisted software delivery (AI4SoftwareDelivery) internally, involving thousands of Thoughtworkers across different roles and regions globally, along with dozens of internal sharing sessions.

Against that backdrop, we've incorporated more new features into AutoDev to continuously explore how to better assist teams in improving efficiency within IDEs. As the current best open-source AI-assisted programming tool in China, AutoDev 1.4.0 introduces several interesting features to explore scalable, AI-driven development efficiency improvements.

AutoDev GitHub: https://github.com/unit-mesh/auto-dev

Team Prompts: Codified Prompts for Team Dissemination

In response to our colleagues' enthusiasm for TDD (Test-Driven Development), specifically issue #49 requesting "Support TDD mode to generate implementations based on specified tests", we developed the Team Prompts feature. Now you can write prompts directly in your code repository, and AutoDev will read them to enhance its AI-assisted functionality.


This means:

  • Share prompts across teams rather than maintaining personalized configurations
  • Different teams within your organization can share their AI experiences
  • No need for custom IDE requirements - just provide interface capabilities

Team Prompts Example

Let's look at a simple example. First create (or configure) a Prompt directory in your repository, then write your prompts. For TDD scenarios:

  • Tasking.vm: Split requirements into test cases
  • TDD-Red.vm: Write the first failing test based on generated test cases
  • TDD-Green.vm: Implement code to pass the test
  • TDD-Refactor.vm: Refactor the implementation

In these prompt files, simply use AutoDev's configuration to introduce context variables (reference: https://ide.unitmesh.cc/variables). Example:

````
---
priority: 2023
interaction: ChatPanel
---
```user```

You are a senior software engineer skilled in TDD. Improve existing implementations based on new test cases.

Original implementation code: $context.underTestFileCode($methodName)

New test code:

${selection}

Optimize the class under test based on new tests. Return method code using ``` to start your code block:
````

The YAML FrontMatter at the beginning provides simple configurations:

  • priority determines menu ordering
  • interaction controls output behavior:
    • ChatPanel displays in right-side chat window
    • AppendCursorStream streams output in current document with typewriter effect

Context provides built-in system functions for extended capabilities.
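To make the FrontMatter-plus-body layout concrete, here is a minimal sketch of how such a prompt file could be split into configuration and prompt body. It handles only flat `key: value` pairs; AutoDev's real parser lives in the plugin and may behave differently:

```python
def parse_front_matter(text):
    """Split a Team Prompt file into its YAML FrontMatter and body.
    Minimal sketch: flat 'key: value' pairs only, not full YAML."""
    lines = text.splitlines()
    assert lines[0] == "---", "expected a FrontMatter delimiter"
    end = lines.index("---", 1)
    config = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return config, body

prompt_file = """---
priority: 2023
interaction: ChatPanel
---
You are a senior software engineer skilled in TDD."""
config, body = parse_front_matter(prompt_file)
print(config["interaction"])  # ChatPanel
```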

Team Prompts vs Custom Prompt

AutoDev 1.1 introduced Custom Prompt for personal configurations, while Team Prompts offers unified team configurations. This allows creating scenario-specific AI instructions for rapid team sharing.

We will continue evolving Team Prompts for better usability.

Custom Living Documentation: Continuously Supporting Legacy System Refactoring

Compared to conventional documentation generation, we find it more meaningful to generate code annotations that assist system refactoring.

AutoDev Documentation Generation

Inspired by JetBrains AI Assistant's documentation features, we added similar functionality in AutoDev. While initially considered symbolic, it proved valuable when documenting Chocolate Factory - simply select a class/method/variable, right-click or press Alt+Enter to generate documentation. Existing documentation will be updated based on current code (when AI permits).

For SDK development, we recommend adopting the Documentation Engineering approach described in Developer Experience: Exploration and Reshaping, as implemented in Chocolate Factory where tests and comments generate reliable documentation.

Custom Living Documentation Generation

Based on experience with legacy system refactoring tools and large insurance company cases, generating annotation-style documentation directly from code significantly reduces reading costs. Combining existing code with new documentation enables better RAG capabilities for extracting meaningful knowledge from code.

In AutoDev, simply add examples to guide LLM documentation generation:

```json
"documentations": [
  {
    "title": "Living Documentation",
    "prompt": "Generate Living Documentation in the following format:",
    "start": "",
    "end": "",
    "type": "annotated",
    "example": {
      "question": "...",
      "answer": "..."
    }
  }
]
```

Customize annotation formats for different scenarios, including Swagger annotations for API documentation.

Code Review

As discussed in our previous article AIGC Reshaping Software Engineering: Code Review, we combine AutoDev with DevOps platforms for code reviews.

IDE-Side Code Review Best Practices

For IDE-side reviews, we recommend focusing on business context understanding combined with syntax checks. Our design follows common workflows - reviewing multiple commits (e.g., all commits for a requirement) or historical changes of single files.

Requirement System-Integrated Code Review

For teams using AIGC efficiency tools, most already have mature DevOps practices like including requirement IDs in commit messages, e.g., feat(devops): init first review command #8.

AutoDev can retrieve requirement system information using this ID (8 in example) to supplement business context for LLM analysis.
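Extracting the requirement ID from such a commit message can be sketched as follows. The trailing `#<id>` convention is taken from the example above; real commit messages may place the ID elsewhere:

```python
import re

def requirement_id(commit_message):
    """Pull the trailing '#<id>' requirement reference out of a commit
    message, e.g. 'feat(devops): init first review command #8'.
    Returns None when no ID is present."""
    match = re.search(r"#(\d+)\s*$", commit_message)
    return int(match.group(1)) if match else None

print(requirement_id("feat(devops): init first review command #8"))  # 8
print(requirement_id("chore: bump version"))  # None
```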

Conclusion

As an open-source project, we still have many areas for improvement. Please submit issues on GitHub: https://github.com/unit-mesh/auto-dev if you encounter any problems.

AutoDev 1.0

· 4 min read

In April, through the article "AutoDev: AI Breaks Through R&D Efficiency, Exploring New Opportunities in Platform Engineering", we outlined the initial impact of AI on software development. We established several fundamental assumptions:

  • Large and medium-sized enterprises will possess at least one proprietary large language model.
  • Only end-to-end tools can achieve quality and efficiency improvements through AI.

Based on these assumptions, we began building AutoDev and open-sourced it. I've also documented all development insights on my blog, hoping to assist domestic enterprises in establishing their own AI-assisted programming capabilities.

As an open-source project, let's start with the GitHub address: https://github.com/unit-mesh/auto-dev.

Designing Three Assistance Modes Around Developer Experience

Initially, I didn't have a clear development blueprint. As a daily code-writing "expert-level" programmer, I built features based on my immediate needs.

Subsequently, I categorized all features into three assistance modes:

  • Chat Mode
  • Copilot Mode
  • Completion Mode

Auto Mode: Standardized Code Generation

Trigger method: All auto modes are under Context Actions, activated using the universal shortcut: ⌥⏎ (macOS) or Alt+Enter (Windows/Linux).

Design philosophy: Similar to the one-click pattern we designed in ClickPrompt. Code shouldn't be like various flashy demos online - it must consider existing team conventions, otherwise generated code remains unusable. Focusing on configurability and implicit knowledge scenarios, we implemented three Auto scenarios:

  1. Auto CRUD: Reads requirements from issue systems, uses a manually coded agent for continuous interaction to find suitable controllers, modify methods, add new methods, etc. Currently supports Kotlin and JavaScript.
  2. Auto Test Generation: Generates and automatically runs tests for selected classes/methods (when RunConfiguration is appropriate). Supports JavaScript, Kotlin, and Java.
  3. Auto Code Completion: Context-aware code filling. Capabilities vary by language due to limited resources:
    • Java: Incorporates code specifications
    • Kotlin/Java: Adds parameter/return type classes as context
    • Other languages: Uses similarity algorithms (no questions about inspiration sources) comparable to GitHub Copilot and JetBrains AI Assistant

Each auto mode includes automated context preparation. The following image shows visible context for code completion:

[image: visible context for code completion]

This context combines configured specifications with BlogController-related fields, parameters, return types (e.g., BlogService), etc.

Additionally, hidden contexts exist, such as language declarations in AutoDev configurations:

You MUST Use 中文 to return your answer !

Interestingly, even with "中文" repeated twice, the model still ignores the instruction about 50% of the time. We are considering repeating it three times.

Companion Mode: Daily Workflow Integration

When designing companion mode, we referenced existing tools like AI Commit while addressing personal needs.

Since companion modes require waiting for LLM responses, they're grouped under AutoDev Chat.

However, JetBrains AI Assistant has become AutoDev's main competitor (and reference) since its release. Features like "Explain with AI" and "Explain error message with AI" demonstrate excellent UX - areas where we still have room for improvement.

In AutoDev, you can:

  • Generate commit messages
  • Create release notes
  • Explain code
  • Refactor code
  • ...and even generate DDL directly

Chat Mode: A Peripheral Feature

After UI redesign (inspired by JetBrains' approach, given their limited China support), we implemented one-click chat via Context Actions (see Figure 1).

Chat freely here.

Reflections on LLM as Copilot

Currently, LLMs serve as Copilots. They won't replace software engineering specialization but enhance professional capabilities through AI-assisted tools, impacting individual workflows.

They should address "tasks I avoid" and "repetitive tasks" - writing tests, coding, resolving issues, commits, etc. As programmers, we should focus on creative design.

AutoDev focuses on: How can AI better assist human work while keeping engineers within their IDEs?

The LLM as Copilot concept will see increasing tool refinement. As seasoned AI application engineers, we're contemplating how LLM as Co-Integrator can truly boost efficiency.

FAQ

How to Integrate Domestic/Private LLMs?

We provide a Custom LLM Server Python interface example in the source code. Due to limited resources, we've only tested with internally deployed ChatGLM2. For other needs, please discuss via GitHub issues.
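Purely as an illustration of the shape such an adapter takes, here is a hypothetical request/response pair; the actual field names and schema are defined by the example in the AutoDev source, not by this sketch:

```python
import json

def build_chat_request(messages):
    # Hypothetical request body; the real Custom LLM Server schema is
    # defined by the Python example in the AutoDev source.
    return json.dumps({"messages": messages})

def parse_chat_response(raw):
    # Hypothetical response body with a single "response" field.
    return json.loads(raw)["response"]

body = build_chat_request([{"role": "user", "content": "Explain this code"}])
reply = parse_chat_response('{"response": "This code maps a DTO to an entity."}')
print(reply)
```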

Why Only an IntelliJ Version?

As someone experienced in developing new language plugins, contributing to IntelliJ Community/Android Studio source code, and optimizing the Harmony OS IDE architecture, I specialize in JetBrains IDE development.

When Will VS Code Version Arrive?

Short answer: Not soon.

Though I've studied VS Code/X Editor source code:

  1. VS Code lacks critical IDE interfaces
  2. Implementation challenges:
    • TextMate-based tokenization (unreliable Oniguruma regex)
    • Limited LSP implementation references
  3. No quality reference implementations

The ideal approach would be GitHub Copilot-style IDE-agnostic Agent mechanisms with TreeSitter for language processing.

Additional Notes

AutoDev positions LLMs as developer Copilots, providing assistance tools to handle tedious tasks, enabling engineers to focus on creative design and problem-solving.

AutoDev 0.7.0 - Generating Standardized Code, Deep Integration into Developer Daily Work

· 5 min read

Months ago, we embarked on exploring: How to combine AIGC for R&D efficiency improvement? We open-sourced AutoDev, as introduced on GitHub:

AutoDev is an LLM/AI-assisted programming plugin for JetBrains IDEs. AutoDev can directly integrate with your requirement management systems (e.g., Jira, Trello, GitHub Issues, etc.). Within the IDE, with simple clicks, AutoDev automatically generates code based on your requirements. All you need to do is perform quality checks on the generated code.

Through our exploration of LLM capability boundaries, we discovered some more interesting patterns that have been incorporated into AutoDev.

PS: Search for AutoDev in JetBrains plugins and install it. Configure your LLM (e.g., OpenAI and its proxies, open-source LLMs, etc.) to start using.

WHY AutoDev? Understanding the Integration of GenAI + Software Development

Regarding generative AI, we maintain views similar to our previous sharing:

  1. GenAI can improve efficiency in almost every phase of the R&D process.
  2. More effective for standardized processes, with limited benefits for less standardized small teams.
  3. Efficiency gains need tool implementation due to the time cost of prompt writing.

Therefore, when designing AutoDev, our goals were:

  1. End-to-end integration to reduce interaction costs - from prompt writing to LLM interaction, then copying back into tools.
  2. Automatic collection of prompt context for content/code generation
  3. Final human verification and correction of AI-generated code.

Thus, manual specification organization and automatic context collection to improve generation quality became our focus in tool development.

AutoDev 0.7 New Features

From the big demo in April to the new version today, we continuously studied implementations of GitHub Copilot, JetBrains AI Assistant, Cursor, Bloop, etc. Each tool has unique selling points. Combined with my daily development habits, we added a series of exploratory new features.

Details on GitHub: https://github.com/unit-mesh/auto-dev

Feature 1: Built-in Architectural Specifications & Code Standards

LLM's "parrot mode" (generation mechanism) produces code matching current context programming habits. When using AI code generation features like GitHub Copilot, it generates new API code based on how we handle existing APIs. If our code uses Swagger annotations, it will generate similar code in the same Controller.

This implies a problem: If predecessors wrote non-standard code, generated code will also be non-standard. Therefore, we added CRUD template code specification configuration:

```json
{
  "spec": {
    "controller": "- Use BeanUtils.copyProperties for DTO to Entity conversion in Controllers",
    "service": "- Service layer should use constructor injection or setter injection, avoid @Autowired annotation",
    "entity": "- Entity classes should use JPA annotations for database mapping",
    "repository": "- Repository interfaces should extend JpaRepository for basic CRUD operations",
    "ddl": "- Fields should use NOT NULL constraints to ensure data integrity"
  }
}
```

In special scenarios, specifications alone are insufficient - sample code configuration is needed. With this configuration, when generating Controller/Service code, we can directly apply these specifications.
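How such a spec entry might be folded into a generation prompt can be sketched as follows; the assembly logic here is illustrative, and AutoDev's real prompt construction happens inside the plugin:

```python
# Layer-specific specifications, as in the spec configuration above.
spec = {
    "controller": "- Use BeanUtils.copyProperties for DTO to Entity conversion in Controllers",
    "service": "- Service layer should use constructor injection or setter injection, avoid @Autowired annotation",
}

def prompt_with_spec(layer, task):
    """Prepend the layer's specification to a code-generation task.
    Illustrative only: the plugin decides the actual prompt layout."""
    rules = spec.get(layer, "")
    return f"Follow these conventions:\n{rules}\n\nTask: {task}"

print(prompt_with_spec("service", "Create a BlogService with a save method"))
```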

Feature 2: Deep Integration into Developer Daily Activities

In the April release, AutoDev integrated basic programming activities: AI code completion, comment generation, code refactoring, code explanation, etc.

While developing AutoDev itself, we discovered more interesting needs and integrated them into the IDE:

  • One-click commit message generation. When using IDEA's commit UI, generate suggested commit messages.
  • One-click changelog generation. Select multiple commits in history to generate CHANGELOG based on messages.
  • Error message analysis. During debugging, select error messages to automatically analyze with LLM combining error context.
  • Test code generation.

Combined with AutoDev's core strength of automatic CRUD from requirements, the feature set becomes more comprehensive.
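The one-click changelog generation above can be approximated mechanically by grouping conventional-commit messages by type; AutoDev itself hands the selected messages to the LLM instead, so this sketch only shows the grouping step:

```python
from collections import defaultdict

def changelog(commit_messages):
    """Group conventional-commit messages into CHANGELOG sections by
    their type prefix (feat, fix, ...). Simplified sketch only."""
    sections = defaultdict(list)
    for message in commit_messages:
        prefix, _, rest = message.partition(":")
        # Strip an optional scope, e.g. "feat(ui)" -> "feat".
        kind = prefix.split("(")[0].strip() or "other"
        sections[kind].append(rest.strip())
    return dict(sections)

log = changelog([
    "feat(ui): add commit message button",
    "fix: handle empty diff",
    "feat: generate changelog from history",
])
print(log["feat"])  # two 'feat' entries grouped together
```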

Feature 3: Multi-language AI Support

In April, we found LLMs excel at CRUD, so chose Java for initial implementation. However, languages I frequently use like Kotlin/Rust/TypeScript lacked support.

Referencing IntelliJ Rust's modular architecture, we reorganized layers/modules using IntelliJ plugin extension points (XML + Java) to rebuild the foundation.

New extension points in the architecture:

  • Language data structure extensions. Originally designed for UML representation when tokens are limited. Later referenced (copied) JetBrains AI Assistant's language extensions - language-specific data structures implemented in their own modules.
  • Language prompt extensions. Language-specific prompt differences moved to respective modules.
  • Custom CRUD workflows. Existing CRUD implementation was Java-specific. Now each language implements its own approach.

Currently, Java/Kotlin still have the best support.

Feature 4: Broader LLM Support

AutoDev's original design considered our second hypothesis: Every major company will launch its own LLM. Each LLM has unique characteristics, requiring broader LLM support.

  • OpenAI & proxies. Most tested and complete implementation.
  • Azure OpenAI. As a legal OpenAI channel in China, we implemented preliminary support and gradually improved it.
  • Other LLMs. While suitable domestic LLM APIs haven't been found yet, the interface supports such integration.

Welcome to experiment with your own LLMs.

Feature 5: Smarter Prompt Strategies

In our May article Context Engineering: Real-time Capability Analysis Based on GitHub Copilot, we analyzed GitHub Copilot's prompt strategies. Core promptElements include: BeforeCursor, AfterCursor, SimilarFile, ImportedFile, LanguageMarker, PathMarker, RetrievalSnippet, etc.

Discovering JetBrains AI Assistant uses similar approaches, we refined AutoDev's prompt strategies:

  • Code context strategies:
    • Java + CRUD mode: Build context using related code (BeforeCursor), all called methods, called lines, UML-like class diagrams.
    • Other Java modes: Use DtModel to build UML-like comments as reference.
    • Python: Use import-based similar code snippets as LLM reference.
  • Token allocation strategy: Distribute context based on token limits.

As a "smart context" strategy, current implementation still needs optimization.
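The token allocation strategy can be sketched as a greedy pass over prompt elements in priority order; the element names follow the promptElements listed above, while the token numbers are made up for illustration:

```python
def allocate_tokens(elements, budget):
    """Greedily allocate a token budget across prompt elements.
    'elements' is a list of (name, wanted_tokens) in priority order;
    hypothetical numbers, since real limits depend on the model."""
    remaining = budget
    allocation = {}
    for name, wanted in elements:
        granted = min(wanted, remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation

plan = allocate_tokens(
    [("BeforeCursor", 2048), ("SimilarFile", 1024), ("ImportedFile", 1024)],
    budget=3072,
)
print(plan)  # {'BeforeCursor': 2048, 'SimilarFile': 1024, 'ImportedFile': 0}
```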

Others

Feel free to discuss code on GitHub: https://github.com/unit-mesh/auto-dev.

AutoDev: AI Breaks Through R&D Efficiency, Exploring New Opportunities in Platform Engineering

· 10 min read

As part of our exploration of AI's impact on software development, and now that we have LLM fine-tuning engineering capabilities, last weekend we open-sourced another tool for AI-assisted R&D efficiency: AutoDev. With it, we have built a nearly complete AI-powered stack for R&D efficiency improvement.

In this article, based on our explorations with Unit Mesh, DevTi, AutoDev, and related projects, we share AI's impact on R&D efficiency and the new opportunities it brings to platform engineering.

PS: The whole system rests on one basic assumption: large and medium-sized enterprises will possess at least one private large language model.

GitHub: https://github.com/unit-mesh/auto-dev

Prologue 1: DevTi = Engineered Software Development + LLM Fine-Tuning

DevTi (Development + Titanium) is an open-source project for improving R&D efficiency with large language models. It aims to provide a comprehensive intelligent solution built on LLM fine-tuning, helping developers complete development tasks efficiently: automated task decomposition, user story generation, automated code generation, automated test generation, and more.

In short, DevTi is a toolchain for small models in the AI + R&D efficiency space: with DevTi, you can quickly train a fine-tuned model for software development. A simplified workflow is shown below:

[image: simplified DevTi workflow]

To fine-tune or train such models, at every stage we need to prepare data, process data, and generate prompts, such as preparing a set of user stories and code-generation data. As engineers, we therefore need to build a set of programming infrastructure or modules.

The modules included in DevTi are shown below:

[image: DevTi modules]

It contains 4.5 modules:

  • Collector (Python, JavaScript): data collection. This module gathers code snippets, questions, answers, and other data from different sources (such as GitHub, Stack Overflow, and CodePen) for fine-tuning.
  • Processor (Kotlin): data processing. This module cleans, formats, and annotates the collected data to improve its quality and consistency.
  • Prompter (Python): prompt design, tuning, and optimization. This module designs suitable prompts, based on user needs and scenarios, to guide the LLM toward the desired output, such as user stories, code snippets, or test cases.
  • Train (Python): training-related notebooks. This module contains Jupyter Notebook files showing how to fine-tune different large language models (such as ChatGLM and LLaMA) for different R&D tasks, such as code generation, code completion, and code commenting.
  • Chain: to be determined.

From there, a toolchain can be built around DevTi, such as IDE tools and kanban tools.

Prologue 2: AutoDev = IDE Plugin + AI API Calls

AutoDev is a highly automated AI-assisted programming tool. AutoDev integrates directly with your requirement management system (such as Jira, Trello, or GitHub Issues). Within the IDE, a few simple clicks are all it takes for AutoDev to generate code from your requirements. All you need to do is review the quality of the generated code.

In short, AutoDev is positioned as a deeply integrated AI programming assistant for private large language models. AutoDev provides an AutoCRUD mode, whose intended workflow is:

  1. Fetch requirements from the requirement management system and analyze them.
  2. Combine the source code with the requirement system to select the most suitable entry point for the change (such as a Controller in Java).
  3. Hand the requirement and the Controller to the AI for analysis, to implement the code change.
  4. Progressively complete the remaining code based on the Controller (in progress…).

The other mode is an ordinary Copilot mode, which can connect to existing large-model tools to provide a series of AI code-assistance features.

[image]

GitHub: https://github.com/unit-mesh/auto-dev

With an LLM connected, we can generate not only code but also unit tests, improving testing efficiency and coverage.

Let's take a closer look at what new possibilities existing AI capabilities open up.

Changes and New Opportunities in Platform Engineering

Beyond the demos above, we believe AI will bring a series of other changes. For infrastructure or platform teams at medium and large organizations, adopting AI capabilities means more changes, and more opportunities.

Platform engineering is the discipline of building and operating self-service internal developer platforms that support software delivery and lifecycle management. Platform engineering improves developer experience and productivity and provides automated infrastructure operations. It is a new trend in software engineering organizations: it streamlines developer workflows and accelerates product teams' delivery of customer value.

The core idea of platform engineering is to treat the platform as a product, created and maintained by a dedicated platform team, providing reusable services, components, and tools for internal customers (such as developers and data scientists).

Requirements: Automated Convergence, Analysis, and Refinement

In existing scenarios, there have already been a series of attempts to combine AI with requirements management:

  • Automated refinement. By analyzing user feedback and data, automatically identify and fill in missing requirement information, for example converting user-reported problems into requirement descriptions, or auto-completing requirement keywords and tags.
  • Automated analysis. With trained domain knowledge, requirements can be better evaluated and optimized, surfacing potential problems and opportunities and improving the efficiency and effectiveness of requirements work.
  • Automated convergence. Combined with other AI techniques, such as intelligent recommendation, dialogue systems, and multi-party collaboration, AI can help communicate and coordinate requirements, collect and consolidate user feedback and pain points, and improve requirement satisfaction and consistency.
  • Automated iteration. With AI data grounded in human feedback, requirement generation can be continuously updated and improved, adapting to changing environments and user needs and improving the sustainability and innovativeness of requirements.

Although current solutions such as LangChain and llama-index only support OpenAI for now, as more open-source large language models arrive, this will become much easier to put into practice.

Toolchain: An Intelligent IDE

The existing scenarios here are already quite rich, for example:

  • Automated code review
  • Automated testing
  • Automated log analysis
  • AI-assisted programming
  • …

Admittedly, for paid AI tools such as GitHub Copilot, cost may be a secondary concern for most companies; the main concern is code security. And while new domestic models keep emerging, most lack programming-related integrations or have relatively weak coding ability. That said, there are models built specifically for programming, such as the 16B CodeGen model Salesforce provides on Hugging Face. It still needs some light fine-tuning, but as Replit has noted, the results are quite good.

The next step is then a wrapper around the large language model, like AutoDev, to simplify the development process for everyday developers.

Documentation: Beyond Search

With LLMs and various intelligent Q&A capabilities in place, we can also ingest the documentation and code of internal tools to provide more comprehensive, more intelligent documentation services. For example, Q&A-style documentation built with LangChain can semantically understand and answer questions over a company's internal documents, reducing developers' learning costs.

Others

AI is bringing a series of changes, especially for platform engineering teams at medium and large enterprises: adopting AI capabilities means more changes and opportunities. For example, automatically converging, analyzing, and refining requirements; building intelligent IDEs; and providing more comprehensive, more intelligent documentation services.

We are still exploring, and you are welcome to join us.