Deep Learning Needs an Assist
One startup thought it had a strategy for overcoming those issues. Condense Reality, a volumetric video company, had a plan for capturing images, reconstructing scenes, and streaming MR content at multiple resolutions to end-user devices. From start to finish, processing each frame in a live stream would take only milliseconds.
“Our software calculates the size and shape of objects in the scene,” said Condense Reality CEO Nick Fellingham (Figure 2). “If there are any objects the cameras cannot see, the software uses deep learning to fill in the blanks and throw out what isn’t needed, and then streams 3D motion images to phones, tablets, computers, game consoles, headsets, and smart TVs and glasses.”
But there was a hitch. For the software to work in real-world applications, Fellingham needed a high-resolution, high-frame-rate camera that content creators could set up easily in a sports stadium, concert venue, or remote location. The company tested several cameras, but the models it tried severely limited both data throughput and the cable distance between the cameras and the system’s data processing unit. To move forward, Condense Reality needed a broadcast-quality camera that could handle volumetric data at high speeds.
Figure 2: Condense Reality CEO Nick Fellingham stands inside one of the company’s volumetric capture setups.
High-Speed Cameras Deliver
In 2020, Fellingham learned that Emergent Vision Technologies was releasing several new cameras with high-resolution image sensors. These cameras included models with SFP28-25GigE, QSFP28-50GigE, and QSFP28-100GigE interface options, all of which offer cabling options to cover any length.
“Our cameras deliver quality images at high speeds and high data rates,” said Ilett. “They capitalize on advances in sensor technology and incorporate firmware we developed so the cameras can achieve the sensor’s full frame rate.”
The images in an MR experience should be captured at an extremely high frame rate and resolution. With the new cameras, Fellingham was able to assemble a commercially viable system. “High-speed GigE cameras are what we need to get the data off the cameras quickly and stream it,” he noted.
High-speed capture is particularly important for sports, where exciting action often occurs in the literal blink of an eye. When capturing a golf swing, for example, a camera with a frame rate of 30 fps is likely to “see” only the beginning and end of the swing, which significantly reduces the quality of the volumetric content.
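To put rough numbers on the golf-swing example, the sketch below estimates how far a fast-moving object travels between consecutive frames at different frame rates. The clubhead speed of 45 m/s is an illustrative assumption, not a figure from Condense Reality or Emergent Vision Technologies, and the function name is hypothetical.

```python
def travel_between_frames(speed_m_per_s: float, fps: float) -> float:
    """Return the distance in meters an object moves between two consecutive frames."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return speed_m_per_s / fps


# Assumed clubhead speed near impact (illustrative only, not from the article).
CLUBHEAD_SPEED_M_PER_S = 45.0

for fps in (30, 120, 600):
    gap = travel_between_frames(CLUBHEAD_SPEED_M_PER_S, fps)
    print(f"{fps:>3} fps: {gap:.3f} m of travel between frames")
```

Under that assumption, the clubhead moves about 1.5 m between frames at 30 fps but only about 7.5 cm at 600 fps, which is the kind of gap that separates a swing reduced to a few poses from one captured as continuous motion.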
“We are not using these cameras for inspecting parts in a factory; we are using them to create entertainment experiences,” said Fellingham. “When the speed [fps] increases, the quality increases for fast-moving action, the output we generate is better, and the experience improves overall.”
Bigger Capture Areas Are on the Horizon
Condense Reality serves as a system integrator for customer projects. A standard system uses 32 cameras, one high-speed network switch from Mellanox, and a single graphics processing unit (GPU) from NVIDIA to cover a 7-meter by 7-meter capture area. The company worked with Emergent Vision Technologies to put together the optimal system for volumetric capture.
“We don’t necessarily want to be committed to very specific hardware configurations, but by working with the Emergent team and testing different components, we’ve found that NVIDIA and Mellanox work best for us,” said Fellingham.
Along with implementing its technology, the company is working to increase the capture area for MR while maintaining throughput and quality.
“When you start to get bigger than a 10 by 10 meter area, 4K cameras don’t cut it,” Fellingham said. “When our algorithms improve, we will go bigger.”
The new Emergent Vision Technologies cameras are integral to this work. With models supporting up to 600 fps at 5120 × 4096 resolution and interface options ranging up to 100GigE, Fellingham has not had to worry about caps on camera resolution, data rates, or frame rates. Those advantages mean that Condense Reality is well positioned to deliver even better content and user experiences.
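The resolution and frame-rate figures quoted above translate into raw bandwidth with simple arithmetic. The sketch below assumes uncompressed 8-bit pixels and ignores protocol overhead; both are simplifying assumptions rather than published camera specifications, and the function names are hypothetical.

```python
def raw_data_rate_gbps(width: int, height: int, fps: float, bits_per_pixel: int = 8) -> float:
    """Uncompressed sensor data rate in gigabits per second (no protocol overhead)."""
    return width * height * bits_per_pixel * fps / 1e9


def max_fps_for_link(width: int, height: int, link_gbps: float, bits_per_pixel: int = 8) -> float:
    """Highest frame rate a link of the given nominal speed could carry, uncompressed."""
    return link_gbps * 1e9 / (width * height * bits_per_pixel)


WIDTH, HEIGHT = 5120, 4096  # resolution quoted in the article

rate = raw_data_rate_gbps(WIDTH, HEIGHT, fps=600)
print(f"Raw rate at 600 fps: {rate:.1f} Gbit/s")

# Nominal link speeds of the three interface options named in the article.
for link_gbps in (25, 50, 100):
    limit = max_fps_for_link(WIDTH, HEIGHT, link_gbps)
    print(f"{link_gbps:>3} Gbit/s link: about {limit:.0f} fps uncompressed")
```

Under these assumptions, a single full-resolution stream at 600 fps comes to roughly 100 Gbit/s, which indicates why the fastest interface option matters for this class of sensor. Real-world rates depend on pixel format, region of interest, and network overhead.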