Skip to main content

Windows 11 will soon harness your GPU for generative AI

Following the introduction of Copilot, its latest smart assistant for Windows 11, Microsoft is yet again advancing the integration of generative AI with Windows. At the ongoing Ignite 2023 developer conference in Seattle, the company announced a partnership with Nvidia on TensorRT-LLM that promises to elevate user experiences on Windows desktops and laptops with RTX GPUs.

The new release is set to introduce support for new large language models, making demanding AI workloads more accessible. Particularly noteworthy is its compatibility with OpenAI’s Chat API, which enables local execution (rather than the cloud) on PCs and workstations with RTX GPUs starting at 8GB of VRAM.

Nvidia’s TensorRT-LLM library was released just last month and is said to help improve the performance of large language models (LLMs) using the Tensor Cores on RTX graphics cards. It provides developers with a Python API to define LLMs and build TensorRT engines faster without deep knowledge of C++ or CUDA.

With the release of TensorRT-LLM v0.6.0, navigating the complexities of custom generative AI projects will be simplified thanks to the introduction of AI Workbench. This is a unified toolkit facilitating the quick creation, testing, and customization of pretrained generative AI models and LLMs. The platform is also expected to enable developers to streamline collaboration and deployment, ensuring efficient and scalable model development.

A graph showing TensorRT-LLM inference performance on Windows 11.
Nvidia

Recognizing the importance of supporting AI developers, Nvidia and Microsoft are also releasing DirectML enhancements. These optimizations accelerate foundational AI models like Llama 2 and Stable Diffusion, providing developers with increased options for cross-vendor deployment and setting new standards for performance.

The new TensorRT-LLM library update also promises a substantial improvement in inference performance, with speeds up to five times faster. This update also expands support for additional popular LLMs, including Mistral 7B and Nemotron-3 8B, and extends the capabilities of fast and accurate local LLMs to a broader range of portable Windows devices.

The integration of TensorRT-LLM for Windows with OpenAI’s Chat API through a new wrapper will allow hundreds of AI-powered projects and applications to run locally on RTX-equipped PCs. This will potentially eliminate the need to rely on cloud services and ensure the security of private and proprietary data on Windows 11 PCs.

The future of AI on Windows 11 PCs still has a long way to go. With AI models becoming increasingly available and developers continuing to innovate, harnessing the power of Nvidia’s RTX GPUs could be a game-changer. However, it is too early to say whether this will be the final piece of the puzzle that Microsoft desperately needs to fully unlock the capabilities of AI on Windows PCs.

Editors' Recommendations

Kunal Khullar
A PC hardware enthusiast and casual gamer, Kunal has been in the tech industry for almost a decade contributing to names like…
AMD is taking the gloves off in the AI arms race
AMD's CEO presenting the MI300X AI GPU.

AMD looks ready to fight. At its Advancing AI event, the company finally launched its Instinct MI300X AI GPU, which we first heard about first a few months ago. The exciting development is the performance AMD is claiming compared to the green AI elephant in the room: Nvidia.

Spec-for-spec, AMD claims the MI300X beats Nvidia's H100 AI GPU in memory capacity and memory bandwidth, and it's capable of 1.3 times the theoretical performance of H100 in FP8 and FP16 operations. AMD showed this off with two Large Language Models (LLMs) using a medium and large kernel. The MI300X showed between a 1.1x and 1.2x improvement compared to the H100.

Read more
Windows 11 may replace a favorite shortcut with more AI
Windows 10 desktop showing task view.

Microsoft is currently testing removing a popular Windows 11 feature and swapping it out for AI.

The brand recently rolled out the Windows 11 preview build for the Dev Channel. In the build, the shortcut to Copilot is a primary feature of the operating system. The shortcut will be located in the bottom-right corner of the screen and will replace the "Show desktop" button, which has been commonplace on Windows since 2009, according to Neowin.

Read more
OpenAI is on fire — here’s what that means for ChatGPT and Windows
Former OpenAI CEO Sam Altman standing on stage at a product event.

OpenAI kicked off a firestorm over the weekend. The creator of ChatGPT and DALL-E 3 ousted CEO Sam Altman on Friday, kicking off a weekend of shenanigans that led to three CEOs in three days, as well as what some are calling an under-the-table acquisition of OpenAI by Microsoft.

A lot happened at the tech world's hottest commodity in just a few days, and depending on how everything plays out, it could have major implications for the future of products like ChatGPT. We're here to explain how OpenAI got here, what the situation is now, and where the company could be going from here.

Read more