Continuing this morning’s run of GTC-related announcements, NVIDIA is offering yet another update on the state of their Data Processing Unit (DPU) project. The initiative was inherited from Mellanox as part of NVIDIA’s acquisition of the company, and the two have been talking up their BlueField-2 DPUs for the better part of the last year. Now the company is finally nearing a release date, with BlueField-2 DPUs sampling now and set to ship in 2021.

Originally hatched by Mellanox before the NVIDIA acquisition, the DPU was Mellanox’s idea for the next generation of SmartNICs, combining their networking gear with a modestly powerful Arm SoC and dedicated acceleration engines in order to offload various tasks from the host system, such as software-defined networking and storage. Mellanox had been working on the project for some time, and while the original BlueField products saw a relatively low-key release last year, the company has been hard at work on BlueField-2, which NVIDIA has since elevated to a much greater position.

This second generation of DPU-accelerated hardware will go under the BlueField-2 name, and the two companies have been talking about it for most of the past year. Built around a custom SoC, BlueField-2 uses 8 Arm Cortex-A72 cores along with a pair of VLIW acceleration engines. All of this is then paired with a ConnectX-6 Dx NIC for actual network connectivity. At a high level, the DPU is intended to be the next step in the gradual movement towards domain-specific accelerators within the datacenter, offering a more specialized processor that can offload networking, storage, and security workloads from the host CPU.

Coming off of their success in the datacenter market from broadening applications for GPUs, it’s easy to see NVIDIA’s interest in the DPU project: this is another piece of silicon they can sell to server builders and datacenter operators, and one that further undermines the importance of the one thing NVIDIA doesn’t have, a server-class CPU. So although it is not a project started by NVIDIA, it’s a project they’re fully embracing and expanding upon.

As the bulk of today’s DPU announcement is a recap for NVIDIA, the actual product plans for BlueField-2 have not changed. NVIDIA will be releasing two DPU-equipped cards, the BlueField-2 and the BlueField-2X. The former is a more traditional SmartNIC with the DPU and two 100Gb/second Ethernet/InfiniBand ports. This allows it to be used for networking as well as storage tasks like NVMe-over-Fabrics.

Meanwhile the larger BlueField-2X incorporates a DPU as well as one of NVIDIA’s Ampere GPUs for further acceleration via in-network computing, as NVIDIA likes to call it. NVIDIA hasn’t disclosed the GPU used on the BlueField-2X, but if these renders are accurate, then the number of memory chips indicates it’s GA102, the same chip going into NVIDIA’s high-end video cards. That would make BlueField-2X a very potent card with regards to compute performance.

And NVIDIA’s plans don’t stop with the BlueField-2 products. The company has planned out a series of follow-on cards that build on BlueField-2, to be released as the BlueField-3 and BlueField-4 families in successive years. BlueField-3 will be a souped-up version of BlueField-2, with separate DPU and DPU + GPU cards. Meanwhile BlueField-4 will be the first part where NVIDIA’s influence makes it into the core silicon, with the company planning a single high-performance DPU that would be able to significantly outperform the earlier discrete DPU + GPU designs. All told, NVIDIA is expecting BlueField-4 to offer 400 TOPS of AI performance.

All of this, in turn, will come with NVIDIA’s traditional embrace of both hardware and software. The company is looking to mirror its CUDA strategy with DPUs, offering the Data Center Infrastructure-on-a-Chip Architecture (DOCA) as the software stack and programming model for BlueField-2 and later DPUs. This means assembling high-grade SDKs for developers to use, and then extending support for those SDKs and libraries over multiple generations. NVIDIA is clearly just getting DOCA off of the ground, but if history is any indication, software will play a huge role in the growth of the SmartNIC market, just as it did for GPUs a decade prior.

Wrapping things up, the first BlueField-2 cards are now sampling to NVIDIA’s partners. Meanwhile commercial shipments will kick off in 2021, and BlueField-3 shipments may follow as soon as 2022.

Source: NVIDIA

11 Comments

  • enzotiger - Monday, October 5, 2020

    BF2 puzzles me. Why are they still using A-72? Why did they cut CPU cores from 16 in BF1 to 8 in BF2? The embedded ConnectX-6 is a false advertisement. It can't do 200Gb/s in RoCEv2.
  • michael2k - Monday, October 5, 2020

    The BF1 offered both 8 and 16 CPU designs.

    I imagine that the BF2 was just a mild evolution of the BF1, and you can compare the two here:
    https://www.mellanox.com/related-docs/prod_adapter...
    https://www.mellanox.com/files/doc-2020/pb-bluefie...

    It looks like the upgrade is mainly from PCIe 3 -> PCIe 4, 25GbE -> 100GbE, and 100GbE -> 200GbE, so they simplified the product matrix by removing the 16 core CPU option.

    I imagine this was a stop gap product, with most of the work spent on integrating an NVIDIA GPU with the bluefield2; it looks like from the included renders that the bluefield2x is a bluefield + Ampere, which means the bulk of offloaded work should be on the GPU and not 8 additional ARM CPU cores.
  • grrrgrrr - Monday, October 5, 2020

    Looks like rebranded tegra/drive px
  • Yojimbo - Tuesday, October 6, 2020

    No. This is a network, security, and storage offload processor. It was developed by Mellanox before NVIDIA bought Mellanox. The idea is accelerated data packet management and programmable control for networking using Arm cores. It contains special accelerators that deal with the data packets. The point is to free up resources of the CPUs of servers in datacenters where many servers are being lashed together through ethernet. NVIDIA seems to be further adding GPU cores to the board. I don't know enough to know how those would be used. They could use it for their own base functionality such as in security algorithms, but it will also be available for customers to target with their own apps they can write for the device. But I guess they figure there are various uses for the GPU since for Bluefield-4 the GPU seems to be part of the standard package.

    The DrivePX, which is now DriveAGX, I think, is an SoC, meaning that it is designed to run applications like a CPU does. It isn't designed to scale out to systems using many SoCs together, so there is no need for all the networking acceleration in the SoC. But it has image signal processors and other accelerators that help with vision and video processing.

    So they both have both ARM CPU cores and NVIDIA GPU cores, but the purpose, design, and additional components are very different.
  • Yojimbo - Tuesday, October 6, 2020

    oh.. and this Bluefield DPU will sit in a data center whereas the DriveAGX SoC will be in an autonomous car or the like.
  • carcakes - Tuesday, October 6, 2020

    AI chipset vs broadcom chipset, Gigabyte skues %smellsa% SERVER GPU MOBO ;p How is..
  • PeachNCream - Tuesday, October 6, 2020

    I will never understand NV's product branding. They've missed a huge chance to replace the word "Field" with "Waffle."
  • michael2k - Tuesday, October 6, 2020

    Unfortunately it was Mellanox that brought in the name 'BlueField' when NVIDIA bought Mellanox.

    I imagine NVIDIA would have named the product NForce, to revive their old brand, but have the N stand for 'Network'
  • Eliadbu - Tuesday, October 6, 2020

    Give it a couple of years and Nvidia will rebrand it to something else just like the Tesla and Quadro rebrands.
  • Yojimbo - Tuesday, October 6, 2020

    That Bluefield-4 is a veritable beast. A dual-socket Epyc 7742 is listed as about 770 SpecInt, and the Bluefield-4 is promising 1000. Assuming they are talking about INT8 tensor ops, an RTX 3080 has 238 TOPS AI, or 476 TOPS AI if sparsity is taken into account. The Bluefield-4 is promising 400 TOPS AI. The roadmap promises a 14.2X increase in CPU power and a 6.7X increase in AI power in 2 years going from the Bluefield-2X to the Bluefield-4. NVIDIA really is going all-in on this DPU thing.