{"id":54091,"date":"2024-10-16T10:05:00","date_gmt":"2024-10-16T17:05:00","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/dotnet\/?p=54091"},"modified":"2024-10-16T16:29:50","modified_gmt":"2024-10-16T23:29:50","slug":"building-github-copilot-into-visual-studio","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/dotnet\/building-github-copilot-into-visual-studio\/","title":{"rendered":"How we build GitHub Copilot into Visual Studio"},"content":{"rendered":"<p>In 1907, during the heat and humidity of a northern hemisphere summer, Paris newspaper Le Martin and the New York Times announced they would be sponsoring a race so ambitious as to be known simply as, \u201cThe Great Race.\u201d The planned route was from New York to Paris and the astute among you may note there isn\u2019t any actual land passage from one to the other. There was a theory that the Bering strait would ice over in the winter and prove crossable; that \u2018Bering Land Bridge\u2019 had been, of course, gone for ~18k years (just missed it!). This was at a time when there was little infrastructure to support these vehicles and very few paved roads.<\/p>\n<p>This is how it has felt integrating AI into Visual Studio. AI remains so nascent that there aren\u2019t paved paths to success. At times, we were busy building out an experience and realized that there wasn\u2019t a way to accomplish it, so we had to pivot entirely (as did the motorists who ended up having their vehicles shipped across the Pacific). What has our approach been to build out the Copilot support that we have today given that? Well, experimentation, iteration, and leveraging .NET have played large roles, so let\u2019s talk about it.<\/p>\n<p>Visual Studio started our AI journey back when we announced <a href=\"https:\/\/www.microsoft.com\/research\/project\/intellicode-completions\/publications\/\">IntelliCode<\/a> at Build 2018. 
IntelliCode offers several <a href=\"https:\/\/learn.microsoft.com\/visualstudio\/ide\/intellicode-visual-studio?view=vs-2022\">productivity features<\/a>, including starred suggestions, whole-line completions (grey text insertions that can be accepted with tab), repeated edits, and API usage examples.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2024\/10\/building-github-copilot-into-vs-1.png\" alt=\"Screen shot showing a grey text completion of LastIndexOf('-') for the code int lastHyphenIndex = this.Code.\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2024\/10\/building-github-copilot-into-vs-2.png\" alt=\"Repeated edits UI shows the Contains method highlighted in Red and Green text on right showing replacement of StartsWith.\" \/><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2024\/10\/building-github-copilot-into-vs-3.png\" alt=\"Shows a tool tip for the method int.TryParse with a hyper link that links to GitHub Examples and Documentation.\" \/><\/p>\n<p>These are all built on top of local models trained on public repositories. These models are <em>not<\/em> large language models; they are exceptionally small, so they execute well even on lower-powered machines and can load in limited memory. IntelliCode was built before open standards for leveraging AI models had gained traction, so much of the integration is bespoke. However, IntelliCode helped inform UX paradigms that would be generally adopted for future AI interactions, such as \u2018ghost text\u2019 for completions and \u2018tab to accept\u2019 for repeated edits. IntelliCode has both an \u2018in process\u2019 component that is loaded into the devenv.exe process and a \u2018model host\u2019 that is loaded into a ServiceHub process. 
The in-process and ServiceHub components are built on .NET Framework (netfx). The out-of-process service is named ServiceHub.IntellicodeModelService.exe. Communication between the processes is facilitated via <a href=\"https:\/\/github.com\/microsoft\/vs-streamjsonrpc\">vs-streamjsonrpc<\/a>, which implements the JSON-RPC protocol for .NET over various transports. Since all of this was built on .NET, it was easy to iterate rapidly, and everything worked seamlessly within the VS ecosystem. This is due to .NET\u2019s rich ecosystem for profiling, debugging, and performance monitoring, and to its broad usage within Microsoft, which allows for reuse of ideas and code. It also allowed reuse of early tokenizers such as <a href=\"https:\/\/github.com\/microsoft\/BlingFire\">BlingFire<\/a>. Understanding this architecture helps set the stage for what came next.<\/p>\n<p>In 2021, GitHub announced a technical preview of completions (ghost text) functionality in VS Code, and in March 2022 GitHub released a similar extension for Visual Studio. GitHub Copilot was initially built on the OpenAI Codex model, a GPT-3 language model additionally trained on gigabytes of source code in a variety of programming languages. At this time, GitHub developed the extensions for Visual Studio and VS Code, along with other clients. To facilitate rapid development across these clients, a single \u2018agent\u2019 was created that every client uses. That agent is a Node process responsible for making web requests to GitHub, collating the context used to prompt the model, truncating and massaging responses, and providing additional functionality such as <a href=\"https:\/\/docs.github.com\/en\/copilot\/managing-copilot\/managing-github-copilot-in-your-organization\/setting-policies-for-copilot-in-your-organization\/excluding-content-from-github-copilot\">content exclusion<\/a>. 
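<\/p>\n<p>Whether it is vs-streamjsonrpc between devenv.exe and a ServiceHub host or the agent\u2019s own channel, the inter-process traffic boils down to JSON-RPC messages. Here is a hedged sketch of what a request and response pair can look like; the method name and payload shape are invented for illustration and are not the actual wire contract:<\/p>\n<pre><code>{ \"jsonrpc\": \"2.0\", \"id\": 1, \"method\": \"getCompletions\",\n  \"params\": { \"documentUri\": \"file:\/\/\/C:\/src\/Program.cs\", \"position\": { \"line\": 12, \"character\": 8 } } }\n\n{ \"jsonrpc\": \"2.0\", \"id\": 1, \"result\": { \"items\": [ { \"text\": \"LastIndexOf('-')\" } ] } }<\/code><\/pre>\n<p>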
If you are using Copilot completions in VS, you\u2019ll see a separate process named \u2018copilot-language-server.exe\u2019 (its name has changed a few times, but this is the most recent one). If you\u2019ve ever been curious why you may need to set environment variables to enable proxy support for Copilot completions in VS in addition to modifying devenv.exe.config, this is why: the Node process reads environment variables, while .NET uses the information in the config file (an area we\u2019d love to improve in the future, as it\u2019s non-obvious). The nice thing is that .NET makes it relatively trivial to interoperate via IPC, so having these separate processes is straightforward.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2024\/10\/building-github-copilot-into-vs-4-1.svg\" alt=\"A diagram showing the basic completions architecture.\" width=\"800\" \/><\/p>\n<p>The process is now called copilot-language-server because it communicates with its host via <a href=\"https:\/\/microsoft.github.io\/language-server-protocol\/\">the Language Server Protocol (LSP)<\/a>.<\/p>\n<p>GitHub Copilot Completions has evolved since GitHub introduced the feature in 2021. For example, completions no longer use the Codex model. As of Visual Studio 17.10, the feature is bundled with the installer and does not require acquiring a separate extension (it can be unchecked or uninstalled as a recommended component by anyone who does not want it). The high-level architecture is largely unchanged, though, so you\u2019ll still see a Node process created whenever you\u2019re signed into GitHub Copilot and using completions. 
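<\/p>\n<p>Returning to the proxy example above, here is a hedged sketch of the two places such settings can live. The address proxy.example.com:8080 is a placeholder, and the exact variable names and settings that apply can vary by release and environment:<\/p>\n<pre><code>:: Environment variables commonly read by the Node-based copilot-language-server\nset HTTP_PROXY=http:\/\/proxy.example.com:8080\nset HTTPS_PROXY=http:\/\/proxy.example.com:8080\n\n&lt;!-- devenv.exe.config fragment read by the .NET side --&gt;\n&lt;system.net&gt;\n  &lt;defaultProxy&gt;\n    &lt;proxy proxyaddress=\"http:\/\/proxy.example.com:8080\" bypassonlocal=\"true\" \/&gt;\n  &lt;\/defaultProxy&gt;\n&lt;\/system.net&gt;<\/code><\/pre>\n<p>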
This architecture allows substantial sharing among the clients, which was GitHub\u2019s goal when initially developing the extension; however, running the copilot-language-server as a Node process is somewhat foreign to the overall architecture of Visual Studio, so when the GitHub Copilot X investments started, we reconsidered this approach and decided to align more directly with Visual Studio\u2019s natural extensibility model.<\/p>\n<p>In March 2023, GitHub announced Copilot X, which introduced many new features, including chat integration into Visual Studio. This may be clear if you\u2019ve gotten this far, but the separation of \u2018chat\u2019 and \u2018completions\u2019 is arbitrary and, at this point, largely the result of historical evolution rather than any technical need. The Visual Studio team set out to deeply integrate \u2018chat\u2019 functionality into Visual Studio. This started with basic features like a chat tool window and inline chat; however, many other features have subsequently been added on top of this \u2018chat\u2019 functionality, such as rename suggestions, generating commit and PR messages, helping to diagnose exceptions, etc. The architecture is similar to that of IntelliCode and completions; however, instead of a .NET Framework or Node process, there are separate .NET 8 processes:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2024\/10\/building-github-copilot-into-vs-5-1.svg\" alt=\"A diagram showing the basic chat architecture.\" width=\"800\" \/><\/p>\n<p>Unlike the Node process for completions, these are all specific to VS and not shared with VS Code, JetBrains products, or other clients. We try to be very intentional when choosing to build centralized features, and a big part of that calculus has to do with user expectations and client specifics. 
We chose this architecture because although there are many overlapping concerns between the different clients, the prompt crafting, context retrieval, etc. are very client-specific. Additionally, the VS team knew that by building the OOP components on .NET, we could both leverage and inform the new framework types in .NET 8\/9 that speed up the creation of \u2018intelligent apps\u2019 (i.e., apps that are powered by AI). It also forced us to separate our logic so that we could react quickly as the AI landscape evolved. Finally, it aligned nicely with the new <a href=\"https:\/\/learn.microsoft.com\/visualstudio\/extensibility\/visualstudio.extensibility\/visualstudio-extensibility?view=vs-2022\">modern extensibility model for Visual Studio<\/a>.<\/p>\n<p>This architecture has allowed us to leverage and influence many .NET improvements, for example: <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/98105\">server-sent events<\/a>, <a href=\"https:\/\/github.com\/dotnet\/runtime\/issues\/100159\">System.Text.Json to JSON schema mapping<\/a>, <a href=\"https:\/\/github.com\/microsoft\/Tokenizer\/blob\/2c9ba5d343de52eb27521afef7c0c2f0f76c9c52\/Tokenizer_C%23\/TokenizerLib\/TikTokenizer.cs#L20\">tokenizers<\/a>, various aspects of <a href=\"https:\/\/learn.microsoft.com\/semantic-kernel\/overview\/\">semantic kernel<\/a>, and others. This includes the recently <a href=\"https:\/\/devblogs.microsoft.com\/dotnet\/introducing-microsoft-extensions-ai-preview\/\">announced Microsoft.Extensions.AI.Preview<\/a>.<\/p>\n<p>The indexing service is another component upon which much of the Copilot architecture depends. Providing good context as part of interacting with LLMs is critical to quality responses. The context needed differs depending on the scenario, but it\u2019s frequently necessary to quickly answer questions about the source location of definitions or references. 
For example, imagine that VS is providing completions and you have the following:<\/p>\n<p><strong>\/\/ Person.cs<\/strong><\/p>\n<p><code>public record Person(string Name, bool HasEvilLair);<\/code><\/p>\n<p>Then in program.cs you start editing Main. You might see a completion suggestion like this:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2024\/10\/building-github-copilot-into-vs-6.png\" alt=\"Shows IsVillian which takes a Person and the Copilot completion suggesting returning the person.IsVillian property.\" \/><\/p>\n<p>This would happen if the context used to create the completion didn\u2019t include information about the Person record. The LLM does its best to guess: it has seen plenty of .NET Person classes, and it simply assumes that this one probably has an IsVillian property. However, with the semantic index, this would instead look like:<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/devblogs.microsoft.com\/dotnet\/wp-content\/uploads\/sites\/10\/2024\/10\/building-github-copilot-into-vs-7.png\" alt=\"Shows IsVillian which takes a Person and the Copilot completion suggesting returning the person.HasEvilLair property.\" \/><\/p>\n<p>The semantic index is leveraged for completions, chat functionality, and more.<\/p>\n<p>This is a bit of the history and high-level architecture of GitHub Copilot integration in Visual Studio. AI is progressing exceedingly rapidly, and building on .NET is one way the VS team can keep pace. There will be many enhancements, changes, benefits, and surprises in upcoming VS and GitHub Copilot releases, so keep an eye out! If you\u2019re curious, the New York to Paris race was eventually completed by three of the initial six teams, and it led to calls for improved roads to be constructed across the world. This is how I see the partnership between Visual Studio, GitHub Copilot, and all the other AI application development happening with .NET. 
It is trailblazing for improved libraries, better tooling, and faster development. There may not be a clear destination, but we can certainly help pave the way.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Find out how Visual Studio integrates GitHub Copilot, architectural detail, .NET implementation, and the importance of the indexing service for providing context-aware AI code suggestions.<\/p>\n","protected":false},"author":3216,"featured_media":54092,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[685,7781,646],"tags":[568,7869,147],"class_list":["post-54091","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-dotnet","category-ai","category-visual-studio","tag-ai","tag-github-copilot","tag-visual-studio"],"acf":[],"blog_post_summary":"<p>Find out how Visual Studio integrates GitHub Copilot, architectural detail, .NET implementation, and the importance of the indexing service for providing context-aware AI code 
suggestions.<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/54091","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/users\/3216"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/comments?post=54091"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/posts\/54091\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media\/54092"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/media?parent=54091"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/categories?post=54091"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/dotnet\/wp-json\/wp\/v2\/tags?post=54091"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}