Tech Portal

Traditional GigEVision vs CXP vs Zero Copy vs RDMA vs GPU Direct vs FPGA cards

GigEVision + GVSP

15+ Years widespread use
Fully ratified and mature standard
Massive adoption
UDP based protocol
True streaming protocol
Multicast support
Has everything you need
Needs properly designed receiver at high-speed

Figure: GVSP frame and packet structure.

We will take a moment to understand the technology behind GigEVision. GVSP is the Ethernet streaming protocol used in the current standard. A stream is made up of multiple frames (or images). Each frame is made up of a leader packet, multiple image (or payload) packets, and a trailer packet. All packets are sent over UDP, which is a connectionless protocol: the camera simply sends the packets and leaves the receiver to its job of placing the data in the destination buffer. Because there is no connection handshake or acknowledgement traffic, the protocol adds no such overhead on the network, which allows maximum network performance. It also means fundamentals like multicasting are supported. The trade-off is that the receiver must be properly designed to avoid data loss. CXP takes the same approach and likewise leaves the receiver to its job of placing the data in the destination buffer. With a quality receiver, this leads to top performance and the lowest latency and jitter. We will note that the inability of some companies to design a quality receiver has led them down alternate paths.
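To make the leader / payload / trailer structure concrete, here is a minimal Python sketch of a camera-side packetizer and a receiver that places payloads into a contiguous buffer. The 8-byte header layout, the `packetize` function, and the `FrameAssembler` class are assumptions made up for illustration; they are not the real GVSP wire format or any vendor API. The sketch also shows why the receiver has to detect loss itself: UDP will not report a dropped datagram.

```python
import struct
from dataclasses import dataclass, field

# Hypothetical 8-byte header used only for this sketch; it is NOT the real
# GVSP wire format. Fields: frame_id (uint16), kind (uint8), pad, packet_id (uint32).
HEADER = struct.Struct(">HBxI")
LEADER, PAYLOAD, TRAILER = 1, 2, 3


def packetize(frame_id, image, chunk):
    """Camera side: split one image into leader, payload and trailer packets."""
    packets = [HEADER.pack(frame_id, LEADER, 0) + struct.pack(">I", len(image))]
    for packet_id, off in enumerate(range(0, len(image), chunk), start=1):
        packets.append(HEADER.pack(frame_id, PAYLOAD, packet_id) + image[off:off + chunk])
    packets.append(HEADER.pack(frame_id, TRAILER, len(packets)))
    return packets


@dataclass
class FrameAssembler:
    """Receiver side for a single stream: one contiguous buffer per frame."""
    chunk: int
    buffer: bytearray = field(default_factory=bytearray)
    received: set = field(default_factory=set)
    expected: int = 0

    def feed(self, packet):
        """Consume one packet; return (image, missing_ids) when the trailer arrives."""
        _frame_id, kind, packet_id = HEADER.unpack_from(packet)
        body = packet[HEADER.size:]
        if kind == LEADER:
            (size,) = struct.unpack(">I", body)
            self.buffer = bytearray(size)
            self.received = set()
            self.expected = -(-size // self.chunk)  # ceiling division
        elif kind == PAYLOAD:
            # The packet id gives the offset, so payloads may arrive out of order.
            off = (packet_id - 1) * self.chunk
            self.buffer[off:off + len(body)] = body
            self.received.add(packet_id)
        elif kind == TRAILER:
            missing = sorted(set(range(1, self.expected + 1)) - self.received)
            return bytes(self.buffer), missing
        return None


image = bytes(range(256)) * 4           # 1024-byte test "image"
packets = packetize(7, image, chunk=300)
del packets[2]                          # simulate a lost datagram (payload packet 2)

assembler = FrameAssembler(chunk=300)
for p in packets:
    result = assembler.feed(p)
frame, missing = result
print(missing)                          # [2]
```

In this sketch the sender never learns about the missing packet; what to do about it (discard the frame, request a resend, or design the receiver so it does not happen) is entirely a receiver-side decision.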

This short animation illustrates the process of splitting GigEVision network packets into images. Headers, leaders, and trailers get consumed by a control process while the image portions end up in a contiguous memory buffer. When software is used for this process, the whole packet is first written to memory; the image portions are then read back out of memory and written to another memory location in a non-fragmented (or contiguous) manner. Done in software, this costs 3x the memory bandwidth; done by the card’s header splitting features, it delivers optimal performance.
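The 3x figure can be checked with a small byte-counting model. The Python sketch below is only an accounting exercise, not a real driver: `HEADER_BYTES`, the packet sizes, and both function names are assumptions chosen for illustration, and header traffic on the hardware path is ignored because it is small compared with the payload.

```python
HEADER_BYTES = 8          # assumed per-packet header size (illustrative)
PAYLOAD_BYTES = 8192      # assumed payload size per packet (illustrative)


def software_split(packets, image_size):
    """Whole packets land in memory, then payloads are copied out by the CPU."""
    traffic = 0
    staging = []
    for p in packets:
        staging.append(bytes(p))          # pass 1: NIC writes the whole packet
        traffic += len(p)
    image = bytearray(image_size)
    off = 0
    for p in staging:
        payload = p[HEADER_BYTES:]        # pass 2: CPU reads the payload back
        traffic += len(payload)
        image[off:off + len(payload)] = payload  # pass 3: CPU writes it again
        traffic += len(payload)
        off += len(payload)
    return bytes(image), traffic


def hardware_split(packets, image_size):
    """Header splitting on the card: each payload is written once, in place."""
    image = bytearray(image_size)
    traffic = 0
    off = 0
    for p in packets:
        payload = memoryview(p)[HEADER_BYTES:]
        image[off:off + len(payload)] = payload  # single write to final location
        traffic += len(payload)
        off += len(payload)
    return bytes(image), traffic


packets = [bytes(HEADER_BYTES) + bytes([i % 256]) * PAYLOAD_BYTES for i in range(100)]
image_size = 100 * PAYLOAD_BYTES

sw_image, sw_traffic = software_split(packets, image_size)
hw_image, hw_traffic = hardware_split(packets, image_size)
ratio = sw_traffic / hw_traffic
print(f"software path moves {ratio:.2f}x the bytes of the header-splitting path")
```

Both paths produce the same image; only the number of bytes crossing the memory bus differs, and with headers that are small relative to the payload the ratio sits just above 3.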

Conventional GigEVision + GVSP

Memory copy required (header splitting in software)
Higher CPU %
3x System Memory bandwidth
3x More Powerful PC
3x PC Quantity
1/3 System Density
Needs well designed receiver at high-speed

Conventional GVSP uses header splitting in software to strip the headers off the GVSP packets and place the image data from the payload packets into a contiguous memory buffer. This process raises CPU usage but, more importantly, consumes 3x the system memory bandwidth of a zero-copy implementation. The system therefore runs at roughly 33% efficiency, which factors into system cost in a number of ways. This is an example of a poorly designed receiver, and many in the market are still doing this even at 10GigE. We even see cases where some companies have trouble running multiple 1GigE cameras in a single server, all related to poor receiver design.

This short animation illustrates the triple memory bandwidth usage of a system that does not utilize a zero-copy (or header splitting) technology. A system like this can lose data once memory bandwidth is exhausted: the buffer in the network card overflows because the CPU and memory cannot accept further transfers. Incidentally, this is the case RDMA proponents use when comparing the pros and cons of traditional GigEVision and RDMA, which is very misleading because it is the worst-case example.
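The overflow mechanism can be sketched as a simple fill-and-drain model in Python. The function name and all numbers are hypothetical and the units are arbitrary (bytes per tick); the point is only that a buffer of any size eventually overflows when the host drains it more slowly than the camera fills it.

```python
def simulate_nic_buffer(arrival, drain, capacity, ticks):
    """Return the number of bytes dropped by a fixed-size NIC buffer.

    arrival:  bytes arriving from the camera per tick
    drain:    bytes the host can move out of the buffer per tick
    capacity: buffer size in bytes
    """
    level = 0
    dropped = 0
    for _ in range(ticks):
        level += arrival
        if level > capacity:              # buffer full: the excess is lost
            dropped += level - capacity
            level = capacity
        level = max(0, level - drain)     # host empties what it can
    return dropped


# Host keeps up: nothing is lost, however long the stream runs.
ok = simulate_nic_buffer(arrival=10, drain=12, capacity=100, ticks=1000)

# Host is starved (for example by memory bandwidth): loss is only delayed.
starved = simulate_nic_buffer(arrival=10, drain=8, capacity=100, ticks=1000)

print(ok, starved)
```

A larger `capacity` postpones the first drop in the starved case but does not prevent it, which is why the drain side (receiver design) matters more than buffer size for sustained streaming.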

Figure: Data path in a conventional GigEVision + GVSP implementation.

Optimized GigEVision + GVSP

True zero copy
Uses header splitting (HS) in OTS NICs
Full kernel bypass
HS in use for SMPTE 2110 in M&E market
Supported by industry processing cards
Lowest latency and jitter
No resends or flow control needed with a quality implementation
Remains GigEVision compliant

ZERO copy with header splitting is indeed possible with modern NICs from Nvidia/Mellanox, Broadcom, Intel, and Marvell. Emergent has implementations deployed with Nvidia/Mellanox and Broadcom NICs, which are also the primary NICs explored by those experimenting with RDMA / RoCE; this eliminates any concerns surrounding interoperability. In fact, Emergent has been using this same method for over 15 years and has the maximum design-in densities of any interface standard, with reliability to match. The same approach is also used for SMPTE ST 2110 in the massive media and entertainment market.

Figure (top): Data path in an optimized implementation of GigEVision.
Figure (bottom): Partners of Emergent Vision Technologies.

ZERO copy does not guarantee zero data loss in ANY interface or protocol implementation. Any high-performance system still needs proper design and margining to achieve the desired results. This goes for CXP, RDMA / RoCE, and even optimized GVSP implementations. But we can guarantee that the optimal GVSP implementation will equal or better RDMA / RoCE without turning GigEVision into a point-to-point protocol and eliminating what has made GigEVision the most flexible and popular interface over the years. It is important to remember that when the retransmission feature of RDMA is engaged, it is a sign of a backup in the system, which in turn often means undesired latency and jitter. It is also important to remember that CXP doesn’t use resends or flow control, yet it is able to sustain high data transfer rates with optimal receiver performance and low latency and jitter. Much of this can be attributed to zero-copy technology and adequate buffering on the purpose-built frame grabbers required for CXP. Low-cost NICs often lack sufficient buffering capability; however, modern NICs with ample physical buffering are readily available at cost-effective price points.
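As a rough aid to margining, the time a card's buffer can absorb a complete host stall at line rate is simply buffer size divided by line rate. The Python sketch below works that out; the buffer sizes are hypothetical values picked for illustration and do not describe any particular NIC or frame grabber.

```python
def stall_tolerance_ms(buffer_bytes, line_rate_gbps):
    """Milliseconds of full-rate traffic a buffer can hold before overflowing."""
    seconds = buffer_bytes * 8 / (line_rate_gbps * 1e9)
    return seconds * 1e3


MIB = 1024 * 1024

# Hypothetical buffer sizes, for illustration only.
for buffer_mib, rate_gbps in [(1, 10), (16, 25), (16, 100)]:
    ms = stall_tolerance_ms(buffer_mib * MIB, rate_gbps)
    print(f"{buffer_mib:>3} MiB at {rate_gbps:>3} Gbps absorbs a {ms:.3f} ms stall")
```

The same buffer buys four times less margin at 100 Gbps than at 25 Gbps, so buffering that was adequate at lower speeds needs to be re-checked as line rates rise.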

It is worth noting that at 25Gbps and higher, the familiar PoE (Power over Ethernet) is dead. Thus, new deployments should be focused on SFP technologies and distributed power systems. It is also noteworthy that even at 10GigE speeds the big NIC providers do not support PoE, which forces camera vendors to sell their proprietary card solutions.

This short animation illustrates the memory bandwidth usage of an optimized GVSP-based system using zero-copy. In the first part of the animation, the system is not optimized and thus even the appropriately sized buffer in the network card overflows. In the second part, data flows freely and reliably thanks to zero-copy and system optimization.

GPU Direct

0 CPU and 0 System Memory bandwidth
Nvidia product requires Rivermax for Windows
Nvidia requires partnership (select few)
Linux is open for GPU Direct on standard GPUs
80% of MV applications are on Windows
Some apps include AOI, drone, VR, sports
Lowers PC requirements
Peer to peer support
Available NOW!

ZERO copy minimizes CPU and memory bandwidth utilization by writing to memory only once, but we can avoid that transfer altogether by writing directly to the GPU; this is called GPU Direct. In many performance applications it makes sense to send data directly to the GPU for processing and then take the lower bandwidth results to the CPU and memory for user or system interaction.

Emergent has been supporting GPU Direct with Nvidia GPUs on Windows and Linux for over 4 years in a variety of applications. Nvidia RTX A6000/A5000/A4000, Orin, and Xavier are used in many applications using Emergent cameras.

Unfortunately for RDMA users, Nvidia / Mellanox allows GPU Direct on Windows only for select partners such as Emergent, and Windows is where 80% of machine vision applications continue to be deployed. Linux, however, does remain an option for RDMA with GPU Direct for all.

This short animation illustrates the zero-TRANSFER process using GPU Direct which completely bypasses the memory and utilizes only the PCIe endpoints of the CPU for 0% memory and 0% CPU utilization.

FPGA

0 CPU and 0 System Memory bandwidth
CPU not involved at all
OTS FPGA Cards with native Emergent provided GVSP core support or with OTS GVSP cores from Xilinx, etc.
MV algorithms in abundance
Windows and Linux support
Lowers PC requirements
Peer to peer support
Available NOW!

ZERO copy is great. GPU Direct improves on this a lot. But the ultimate achievement would be to receive and process the data from the cameras all on one card. In this case, the CPU, memory, and all other server resources are not used at all. Emergent supports AMD/Xilinx Alveo cards for this very purpose and has multiple performance applications leveraging this technology. Emergent is also working closely with Nvidia to bring BlueField NIC support. Think of BlueField as the merging of Nvidia NICs with Nvidia GPUs. In both cases, the computer can be a very low-end PC which primarily supplies power to the chosen card.

This short animation illustrates the FPGA card process which completely bypasses the memory and CPU for 0% memory and 0% CPU utilization.
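The four data paths discussed so far can be compared by how many times the image data crosses system memory: three passes for software header splitting, one for zero copy, and none for GPU Direct or an FPGA card. The Python sketch below turns that into a bandwidth figure; the camera count and line rate are made-up example values, and the function and dictionary names are illustrative only.

```python
# Passes of image data through system memory for each data path,
# as described in the sections above.
SYSTEM_MEMORY_PASSES = {
    "conventional (software header splitting)": 3,
    "zero copy (header splitting on the NIC)": 1,
    "GPU Direct": 0,
    "FPGA card": 0,
}


def system_memory_traffic_gbytes(camera_gbps, cameras, passes):
    """System memory traffic in GB/s for a number of cameras at a given line rate."""
    payload_gbytes = camera_gbps / 8 * cameras   # Gbit/s to GByte/s
    return payload_gbytes * passes


# Hypothetical system: 4 cameras streaming at 25 Gbps each.
traffic = {
    path: system_memory_traffic_gbytes(25, 4, passes)
    for path, passes in SYSTEM_MEMORY_PASSES.items()
}
for path, gbytes in traffic.items():
    print(f"{path:<42} {gbytes:5.1f} GB/s of system memory traffic")
```

For this example the conventional path needs 37.5 GB/s of memory bandwidth for 12.5 GB/s of image data, while the GPU Direct and FPGA paths leave system memory untouched.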

About Emergent Vision Technologies

Here is a recap of what Emergent is all about…

10+ Awards for innovation and pioneering the high speed GigEVision imaging movement
10+ years shipping 10GigE cameras with more than 140 models
5+ years shipping 25GigE cameras with more than 55 models
2+ years shipping 100GigE cameras with more than 16 models
Camera technology performance leader
Focused on high-speed Ethernet/GigEVision
Focused on enabling the processing of high-speed image data
Area scan and Line scan models
UV, NIR, Polarized, Color, Mono models for multispectral applications
Emergent eSDK for full application flexibility
Emergent eCapture Pro for a highly comprehensive software solution
Most comprehensive range of product and support for high-speed imaging applications
Any speed, any resolution, any cable length
Available NOW!

We are a multi-award winning company with a focus on high speed GigEVision product.

We have many years shipping product ranging in speeds from 10GigE up to 100GigE.

We have a strong focus on providing end-to-end technologies and support for our customers' applications.

We can fulfill most application needs.

Lastly, products presented are available now.

Adoption of 10GigEVision and Higher

Here is a quick snapshot of the adoption of GigEVision products ranging in speeds from 10GigE up to 100GigE. Emergent has shown how top performance can be achieved and has opened up many markets, including machine vision, to the use of such technologies. Some companies are just now leveraging our efforts toward releasing 25G and higher speed products, but they still have a way to go before releasing ratified, high-performance products.

Figure: Emergent Vision Technologies is the first provider of cameras based on 10GigE, 25GigE, 50GigE, and 100GigE interfaces.
