Renesas has announced the development of a video processing circuit block designed for the automotive computing SoCs that will be used in the autonomous vehicles of the future. The circuit block processes vehicle camera video in real time with 70ms latency and 197mW power consumption, handling large volumes of data without imposing any additional load on the CPU and GPU, which are responsible for autonomous vehicle control.
Video codec processing basically consists of parsing, where performance depends on the volume of encoded stream data, and image processing, where performance depends on the image resolution. The newly developed video processing circuit block implements video encoding and decoding by using a stream processor for parsing and a codec processor for image processing. Since the data size of the typical video streams handled by in-vehicle infotainment systems varies greatly from frame to frame, the processing time required by the stream processor, whose performance depends on the volume of encoded stream data, also varies substantially from frame to frame. The processing time required by the codec processor, whose performance depends on the image resolution, is by contrast the same for every frame. Consequently, the stream processor and codec processor must operate asynchronously, and the resulting buffering can introduce large delays.
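The latency cost of this asynchrony can be illustrated with a simple two-stage pipeline model (an assumed sketch for explanation only, not Renesas's implementation; the timings and function names are made up):

```python
# Illustrative model (assumed, not Renesas code): frames pass through a
# parsing stage whose duration varies with encoded-stream size, then a
# decode stage with a constant per-frame duration. A slow-to-parse frame
# delays every frame queued behind it.

def pipeline_latencies(parse_ms, decode_ms, frame_interval_ms=33.3):
    """Return per-frame latency (arrival to decode-complete) in ms."""
    parser_free = 0.0
    decoder_free = 0.0
    latencies = []
    for i, parse in enumerate(parse_ms):
        arrival = i * frame_interval_ms
        # each stage is a single unit: it starts when both the input is
        # ready and the stage itself is free
        parse_done = max(arrival, parser_free) + parse
        parser_free = parse_done
        decode_done = max(parse_done, decoder_free) + decode_ms
        decoder_free = decode_done
        latencies.append(decode_done - arrival)
    return latencies

# A frame whose stream takes 40 ms to parse inflates not only its own
# latency but also that of the frame queued behind it.
print(pipeline_latencies([5.0, 40.0, 5.0], decode_ms=10.0))
```

With constant decode time but variable parse time, one oversized frame stretches end-to-end latency well beyond the steady-state value, which is the delay problem described above.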
The newly developed video processing circuit block has a synchronous operation mode that utilises a FIFO buffer placed between the stream processor and the codec processor, and can handle video streams that are roughly constant in volume from frame to frame, as is expected to be the case in driving safety support systems. It also has a mechanism whereby the codec processor outputs an interrupt to the CPU each time another 16 lines of the frame have been processed, allowing distortion correction in a later stage to start without waiting for frame processing to finish completely. This combination of synchronous operation and incomplete-frame pipeline operation achieves a low latency of only 70ms (a 40% reduction compared with existing Renesas devices using the 28nm process) from the reception of a video stream to the completion of video decoding and distortion correction.
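The benefit of the 16-line interrupt mechanism can be sketched with standard pipeline arithmetic (again an assumed model with made-up per-line times, not figures from the article):

```python
# Assumed model of line-block pipelining: with an interrupt every 16
# decoded lines, distortion correction overlaps decoding instead of
# starting only after the whole frame is done.

def sequential_latency_ms(lines, decode_per_line_ms, correct_per_line_ms):
    """Correction waits for the fully decoded frame."""
    return lines * (decode_per_line_ms + correct_per_line_ms)

def overlapped_latency_ms(lines, decode_per_line_ms, correct_per_line_ms,
                          block_lines=16):
    """Correction of each 16-line block starts as soon as the block is
    decoded (two-stage pipeline: fill + steady state + drain)."""
    blocks = lines // block_lines
    d = block_lines * decode_per_line_ms   # decode time per block
    c = block_lines * correct_per_line_ms  # correction time per block
    return d + (blocks - 1) * max(d, c) + c

# 1088 lines (Full HD padded to a multiple of 16), made-up per-line times
seq = sequential_latency_ms(1088, 0.010, 0.005)
ovl = overlapped_latency_ms(1088, 0.010, 0.005)
```

With these assumed numbers the overlapped latency approaches the decode time alone, because correction of every block except the last is hidden behind the decoding of the block that follows it.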
The newly developed video processing circuit block integrates 17 video processors of six different types in order to achieve real-time and power-efficient video processing without imposing any additional load on the CPU and GPU. Stream processors and codec processors handle video encoding and decoding, rendering processors perform distortion correction, video processors perform general image processing, blending processors handle image composition, and display processors perform processing for displaying images on screens. The video processors are connected to each other via hierarchical buses.
Evaluation of prototypes of the video processing circuit block comprising these video processors, fabricated with a cutting-edge 16nm FinFET process, confirms industry-leading Full-HD 12-channel performance (an approximately threefold improvement over existing Renesas devices using the 28nm process).
When performing the massive video processing required by Full-HD 12-channel video, data accesses to the memory are a major source of performance bottlenecks and power consumption. In addition, in automotive computing systems it is necessary to minimise the memory bandwidth consumed by video processing to avoid interfering with the cognitive processing performed by the CPU and GPU. It is essential not to inhibit the operation of driving safety support systems, which must maintain a high level of safety.
For this reason, image data stored in memory is compressed to reduce usage of memory bandwidth. By using both lossless compression, which does not alter the pixel values but requires a larger silicon area, and lossy compression, which alters the pixel values but requires a smaller silicon area, in a manner appropriate to the characteristics of each image-processing step, memory bandwidth can be reduced by 50% in a typical video processing flow. DDR memory also has a specific weakness: access efficiency drops when small blocks of data are accessed, so that compression alone yields no effective reduction in memory bandwidth. To avoid this, caching is used for video decoding, which involves large numbers of accesses to small blocks of data; caching increases the DDR memory access size and reduces the effective memory bandwidth by 70%. Evaluation of prototypes fabricated with a cutting-edge 16nm FinFET process confirms that this reduction in memory bandwidth yields a 20% drop in power consumption, proportional to the reduction in the volume of data transactions on the bus, resulting in industry-leading Full-HD 12-channel power consumption of 197mW (60% less than that of current Renesas devices using the 28nm process).
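The stated figures are mutually consistent, as the following back-of-envelope arithmetic shows (the "traffic-dependent share of power" is an inference from the article's numbers, not a figure Renesas reports):

```python
# Back-of-envelope check of the article's figures (illustrative only).

bw_reduction = 0.70      # effective memory-bandwidth reduction (article)
power_drop = 0.20        # resulting drop in total power consumption (article)

# If the power saved scales linearly with bus traffic, the implied
# traffic-dependent share of total power is:
traffic_share = power_drop / bw_reduction
print(f"traffic-dependent power share: {traffic_share:.1%}")

# 197 mW is stated to be 60% below the 28nm-generation figure, which
# implies a baseline of roughly:
baseline_mw = 197.0 / (1 - 0.60)
print(f"implied 28nm-generation power: {baseline_mw:.1f} mW")
```

Under the linear-scaling assumption, a little under a third of total power tracks bus traffic, which is why a 70% bandwidth cut shows up as a 20% drop in overall consumption.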
The newly developed video processing circuit block will enable automotive computing systems that integrate vehicle information systems and driving safety support systems, performing massive video processing with real-time performance, low power consumption and low latency, without imposing any additional load on the CPU and GPU. Renesas intends to incorporate the new video processing circuit block into its future automotive computing SoCs to contribute to a safer and more convenient driving experience.