A Brief Discussion on JPEG-XS Light Compression Remote Production

IABM Journal

Fri, 24 Jan 2025

China Media Group: Chenghu Guo, Xin Jin
Beijing Cinctech Technology Co., Ltd.: Bo Wang, Xiaokun Dong

 

Abstract:

Remote production has become an important direction of technological development in the broadcasting industry in recent years, and JPEG-XS is a representative light compression technology. In recent remote production trials, JPEG-XS light compression has been tried over intranet, leased-line and SD-WAN public network environments. This article demonstrates the feasibility of combining JPEG-XS light compression with SD-WAN public network technology, using the General News Channel’s 2024 live broadcast report “New Archaeological Discoveries at Wuwangdun” as an example.

Keywords:

Remote Production, JPEG-XS, SD-WAN

In recent years, the broadcasting industry has increasingly adopted IT technology, IP-based systems have become more mature, and the advantages of IP have gradually become apparent. IP-based technical solutions not only solve the problem of signal loss during long-distance transmission, but also greatly simplify the tedious steps of embedding and de-embedding video and audio. IP-based technology also provides a more convenient platform for system customisation and expansion. By using IP-related technologies, we can quickly adapt and expand the system to meet today’s diverse content production needs.

In the development of IP-based technology, uncompressed technology has been CMG News Channel’s direction of technology development in recent years. Uncompressed technology means high-bandwidth signal transmission and signal processing that can ensure the best picture quality, which is particularly important when broadcasting to large screens in 4K or even 8K. However, ensuring picture quality also means that a huge amount of bandwidth is needed to transmit the signal, which greatly increases the pressure on the transmission network.

To strike a balance between picture quality and transmission bandwidth, light compression (also called shallow compression) has begun to attract widespread attention. Its essence is that it significantly reduces the transmission bandwidth at the cost of a slight loss of picture quality, while keeping the encoding and decoding delay within an acceptable range. This makes it well suited to certain program production scenarios. JPEG-XS is a leading example of light compression technology.

In May 2024, CMG News Channel produced the program “New Archaeological Discoveries at Wuwangdun”, the first time it used JPEG-XS light compression in a live broadcast production. Based on the production of this program, the following briefly introduces how JPEG-XS light compression is combined with SD-WAN technology to achieve remote production.

Introduction to the technical characteristics of remote production

The features of JPEG-XS light compression that appeal to CMG News Channel are high image quality and low latency. JPEG-XS can achieve a latency on the order of milliseconds, which is less than the time it takes to display a single image. At the same time, its algorithm is designed to preserve image quality as far as possible, and the quality holds up over multiple encoding and decoding generations. In addition, JPEG-XS supports the encoding and decoding of high-dynamic-range and wide-colour-gamut images such as HDR and BT.2020, making it well suited to the current 4K/8K technology development trend.

Another highlight of JPEG-XS light compression is that the compression ratio can be defined flexibly. Depending on network bandwidth, codec device and content characteristics, users can set the compression ratio on the encoding device. For example, the latest-generation Grass Valley LDX150 camera used in this archaeological broadcast supports direct output of JPEG-XS IP streams, and its compression ratio can be adjusted between 5:1 and 20:1. This program was produced in high definition, and the bit rate of each HD (1080/50i) JPEG-XS signal ranged from 60 Mbps to 240 Mbps.
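As a rough illustration of how the compression ratio maps to bit rate, the Python sketch below divides an uncompressed reference rate by the ratio. The 1200 Mbps reference for 1080/50i is an assumption inferred from the bit-rate range and the 5:1 to 20:1 ratios quoted above, not a figure stated by the camera vendor, and the function name is hypothetical.

```python
# Assumed uncompressed reference rate for one 1080/50i signal, in Mbps.
# Inferred from the article's quoted range; not a vendor specification.
ASSUMED_HD_50I_MBPS = 1200.0


def compressed_bitrate_mbps(uncompressed_mbps, compression_ratio):
    """Return the approximate stream bit rate for a given compression ratio."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return uncompressed_mbps / compression_ratio


# Print the approximate bit rate at three ratios within the camera's range.
for ratio in (5, 10, 20):
    rate = compressed_bitrate_mbps(ASSUMED_HD_50I_MBPS, ratio)
    print(f"{ratio}:1 -> {rate:.0f} Mbps")
```

Under this assumption, the two ends of the adjustable range correspond to the highest and lowest bit rates quoted for the program.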

Once JPEG-XS light compression was adopted, the new challenge was how to transmit the lightly compressed signal over the network efficiently and economically. In early tests, the signal was mainly transmitted over dark fibre or a leased line. This method essentially connects the front and rear systems directly via fibre to form a relatively closed intranet environment. While it is secure and reliable, it has significant drawbacks. It requires a large investment and is inflexible, as the fibre or leased line must be rented each time production starts, and not all sites can be served by a leased line. Because of its high demand on network resources and its high cost, this solution does not really reduce costs or increase efficiency. Only by solving the transmission problem can the convenience of remote production be truly realised.

In contrast, SD-WAN networks offer a more economical and flexible solution. SD-WAN networks use existing public network resources to achieve efficient and cost-effective transmission of lightly compressed signals through intelligent routing and traffic optimisation technologies. This approach not only reduces transmission costs, but also improves network reliability and flexibility to better support remote production.

Introduction to the JPEG-XS Remote Production System

The JPEG-XS Remote Production System can be divided into two parts: the front and the rear.

Front system

The front system is housed in a compact 14U flight case and can meet basic production needs. It uses Grass Valley LDX150 cameras with an HPE300 power supply that can power three cameras in 2U of space. The LDX150 has a built-in XS codec, so it can output JPEG-XS IP streams directly from the camera head. Audio from the camera head’s microphone, intercom, tally and OCP control signals are all connected to the front core switch via the service optical port on the camera head, and then to the rear system via the SD-WAN public network link, providing full camera-chain functionality. With high-density IP-based cameras, the system can be quickly expanded to meet multi-channel production needs.

The flight case is equipped with a core switch and a management switch. The front end has only a small number of devices, and the core switch, which has dual power supplies, is highly stable in itself. Weighing construction cost against system security, we therefore decided to use a single core switch and split it into separate VLANs to create a logical primary/secondary relationship for service data. The Grass Valley LDX150 camera has two 25G optical ports and supports the SMPTE ST 2022-7 primary/secondary redundancy mechanism. We assigned the camera’s main service port to VLAN20 and its backup service port to VLAN30, isolating the two service networks within one core switch and achieving the effect of 2022-7 redundancy.
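The idea behind 2022-7 style redundancy can be sketched in a few lines of Python: the receiver gets two copies of the same stream (here, one per VLAN) and keeps one copy of each packet, so a loss on one path is covered by the other. This is a simplified illustration under stated assumptions, not the behaviour of any specific device: packets are modelled as hypothetical (sequence number, payload) pairs, sequence-number wraparound is ignored, and the primary copy is preferred rather than whichever copy arrives first.

```python
def merge_redundant_streams(primary, secondary):
    """Merge two copies of one stream into a single ordered packet list.

    Each input is an iterable of (sequence_number, payload) pairs.
    Packets from `primary` are processed first, so the primary copy is
    kept whenever both paths delivered the same sequence number.
    """
    merged = {}
    for seq, payload in list(primary) + list(secondary):
        # setdefault keeps the copy already stored and ignores the duplicate.
        merged.setdefault(seq, payload)
    return [(seq, merged[seq]) for seq in sorted(merged)]


# Hypothetical example: each path loses a different packet.
primary_path = [(1, "a"), (3, "c"), (4, "d")]     # packet 2 lost
secondary_path = [(1, "a"), (2, "b"), (4, "d")]   # packet 3 lost
print(merge_redundant_streams(primary_path, secondary_path))
```

Note that with both VLANs on one physical switch, this protects against path-level packet loss but not against failure of the switch itself, which is the cost trade-off described above.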

The front system is also equipped with a synchroniser to ensure that all front-end equipment runs on a common synchronisation reference. In addition, the flight case is equipped with a talkback panel and a four-wire converter to allow internal communication at the front and two-way communication between the rear system and the front system.

Rear system

The core of the rear system is the HD baseband system in Studio 14 in the Fuxinglu office area. To interface with the lightly compressed JPEG-XS IP streams, a core switch, a network gateway, XS codec cards and other equipment were added to the rear production system to convert the JPEG-XS IP streams into baseband signals. The JPEG-XS signal captured at the front is decoded into an SDI signal by the XS decoder card and then fed into the baseband system for production at the rear. Meanwhile, the PGM signal from the baseband system is encoded into a JPEG-XS signal via the network card and the XS encoder card, and then sent to the front as the return signal for the camera heads.

The OCP control panel and CCS ONE camera control unit of the rear system are connected to the management switch, which in turn is connected to the core switch at the rear to allow remote control of the camera and tally functions.

Some points to consider when using remote production

Synchronisation issues

The live broadcast of ‘New Archaeological Discoveries at Wuwangdun’, produced by CMG News Channel, used the following network architecture: front-end corporate network, front-end SD-WAN device, public network, back-end SD-WAN device, back-end corporate network. In an IP-based system, the first task is to solve PTP (Precision Time Protocol) synchronisation. Our technical tests showed that, under current technical conditions, PTP signals cannot be carried over SD-WAN, which means that the front and rear systems may run on different synchronisation references. The simplest and most effective solution is for the front and rear systems to lock their time to GPS. If GPS locking is not available, the front and rear systems are asynchronous with respect to each other. The rear decoder must then be able to decode asynchronous signals and re-time them to the rear system’s synchronisation reference during decoding.

Service signal problems

Once synchronisation is resolved, the next step is to commission the service signals (that is, the audio and video signals). In an IP system, service signals are usually sent as multicast. For the rear system to receive a signal from the front system, the rear IP device must send an IGMP (Internet Group Management Protocol) request towards the front. This process also involves unicast traffic, so the unicast and multicast addresses of all IP devices in the front and rear systems must be planned in advance and configured correctly in the SD-WAN devices. The software processing power of the SD-WAN is used to enable unicast and multicast communication between the front and rear systems. During configuration, the communication direction (front-to-rear or rear-to-front) must be specified; an incorrect direction results in communication failure.
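For readers less familiar with multicast, the following Python sketch shows what joining a group looks like from a receiving host: setting the standard IP_ADD_MEMBERSHIP socket option causes the operating system to issue the IGMP membership report. This is only a generic illustration using the Python standard library; the group address 239.1.1.10 and port 5004 are hypothetical placeholders, not the addresses used in the broadcast, and professional decoders perform this step internally.

```python
import socket
import struct


def build_membership_request(group_ip, interface_ip="0.0.0.0"):
    """Pack the 8-byte ip_mreq structure: group address, then local interface."""
    return struct.pack("4s4s",
                       socket.inet_aton(group_ip),
                       socket.inet_aton(interface_ip))


def join_multicast_group(group_ip, port, interface_ip="0.0.0.0"):
    """Open a UDP socket and join a multicast group.

    Setting IP_ADD_MEMBERSHIP makes the operating system send the IGMP
    membership report for the group on the chosen interface.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_membership_request(group_ip, interface_ip))
    return sock


# Hypothetical usage (not executed here because it needs a live network):
#   sock = join_multicast_group("239.1.1.10", 5004)
#   data, sender = sock.recvfrom(2048)
```

Whether the resulting IGMP traffic actually reaches the sending side depends on how the SD-WAN devices are configured, which is why the address plan and direction settings described above matter.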

It is also necessary to ensure that the management signals between the front and rear are unobstructed. Management signals mainly include the camera’s iris control and the tally signal. In the Grass Valley camera system, the CCS ONE camera control unit is the core of the management signal routing. Due to the public network environment, a separate public network licence must be purchased for the CCS ONE camera control unit to ensure stable transmission of the iris control and tally signals over the public network.

Communication issues

Intercom is a very important system in studio production. In a remote production, where the director’s team and camera crew are in different locations, smooth communication is particularly important. Relying solely on traditional methods such as mobile phone calls exposes the production to factors such as unstable networks and flat batteries, so the security and reliability of the call cannot be guaranteed. For this live broadcast, we therefore used IP technology to give the remote production team the same talkback experience as in a traditional studio production.

Using a professional talkback panel, the director’s team at the rear can connect seamlessly to the talkback panel at the front, to the camera operators and to the presenter’s wireless earpiece, providing full communication between front and rear. Our studio system uses Telos Infinity Series IP communication panels. The communication panels in the front system are connected via IP to form an internal communication system with the earpieces of the front camera operators and presenters. The front communication panel is connected to the director’s communication panel at the rear via the SD-WAN network, so the director can communicate clearly with everyone at the front in real time, greatly improving the efficiency and quality of remote production collaboration.

Camera position matching

For some programs, shooting with channel cameras alone may not be appropriate. In remote production, channel cameras are therefore often used together with roving (guerrilla) cameras. While the channel cameras use JPEG-XS light compression encoding, a roving camera typically returns its signal via a 5G backpack with SRT encoding. The two encoding technologies inevitably differ in signal delay and picture quality, so the division of labour between the cameras needs careful consideration.

After repeated testing, we found that for large spatial scenes within a program, the roving camera with a 5G backpack is more flexible: it can shoot efficiently in complex environments and greatly extends scene coverage and the immediacy of dynamic capture. Where the program needs to present fine image detail, the channel camera with JPEG-XS light compression has the advantage. With low latency and high-fidelity image processing, JPEG-XS balances image quality against bandwidth requirements, which is critical for capturing and preserving subtle visual elements in the scene. The high resolution and colour reproduction of the channel camera, combined with JPEG-XS, significantly improve the clarity and expressiveness of the program’s image detail. At the same time, the low latency of JPEG-XS lets the director switch between camera positions with confidence.

JPEG-XS Technical Index Test

After the broadcast of “New Archaeological Discoveries at Wuwangdun”, we tested some key indicators of JPEG-XS technology.

Image quality

We tested image quality at different compression ratios according to network bandwidth conditions. At a compression ratio of 20:1, there is a clear loss of image quality on the monitor, as evidenced by an increase in noise.

We asked the Radio and Television Planning Institute of the State Administration of Radio and Television to conduct a comprehensive test of JPEG-XS compression encoding. The test covered scenes in three common video formats: 1920×1080/50i, 1920×1080/50p and 3840×2160/50p. By selecting distinctive scenes of different types, we conducted a detailed comparative test of the loss of picture quality at different compression ratios.

For the 1920×1080/50p and 1920×1080/50i tests, we focused on the technical indicator PQR (Perceptual Quality Rating). PQR converts the perceptual difference between the tested video and the reference video into a score that represents the viewer’s ability to ‘notice’ the difference. The results therefore reflect picture quality as perceived by the viewer more accurately. The relationship between PQR and picture quality is roughly as follows:

PQR=0: the reference and test images are identical.

PQR<=1: the viewer cannot tell the difference between the reference and test images.

2<PQR<4: the viewer can tell the difference between the reference and test videos. This range is typical of high bandwidth, high quality broadcast MPEG encoders and is generally considered to be excellent to high picture quality.

5<PQR<9: The viewer can easily tell the difference between the test video and the reference video. This is often the result of a low bitrate MPEG encoder in consumer video equipment and is often considered good to fair picture quality.

PQR>10: The difference between the tested video and the reference video is large and is considered poor picture quality.
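The bands listed above can be expressed as a small lookup, sketched here in Python. The function name and label strings are hypothetical; the thresholds are taken directly from the list, and scores that fall between the listed bands (for example between 1 and 2) are reported as such rather than assigned to a neighbouring band.

```python
def classify_pqr(pqr):
    """Map a PQR score to the quality band given in the list above."""
    if pqr < 0:
        raise ValueError("PQR cannot be negative")
    if pqr == 0:
        return "identical"
    if pqr <= 1:
        return "difference not noticeable"
    if 2 < pqr < 4:
        return "excellent to high quality"
    if 5 < pqr < 9:
        return "good to fair quality"
    if pqr > 10:
        return "poor quality"
    # The listed bands leave gaps (1 to 2, 4 to 5, 9 to 10).
    return "between listed bands"


# Example scores covering each outcome.
for score in (0, 0.9, 1.9, 3.1, 6.0, 12.0):
    print(score, "->", classify_pqr(score))
```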

In the 1080/50i scene, we compared the material recorded by a commonly used studio video recorder with the material encoded with the JPEG-XS codec. The results are as follows:

 

Item  Image name      Recorder, DNxHD 120 Mbps   JPEG-XS 10:1, 120 Mbps
                      PQR_Y                      PQR_Y
1     Flowerbed       1.9                        3.1
2     Turntable       0.8                        0.9
3     Basketball      1.3                        1.9
4     Leaves          1.5                        1.8
5     Birdcage        1.3                        1.7
6     Studio          1.7                        2.2
7     Beijing opera   1.4                        1.9
8     Volleyball      1.0                        1.4

 

As can be seen from the table above, in the 1920×1080/50i scene, the JPEG-XS codec at a compression ratio of 10:1 scores slightly worse than the recorder (DNxHD at the same 120 Mbps), but remains in the range of high-quality codecs (PQR<4).

In the 1920×1080/50p scene, we tested two different JPEG-XS compression ratios and obtained the following results:

Item  Image name         JPEG-XS 20:1, 115 Mbps   JPEG-XS 10:1, 230 Mbps
                         PQR_Y                    PQR_Y
1     Beijing opera      2.3                      0.7
2     Complexion         0.7                      0.3
3     Night view         1.9                      0.5
4     Bamboo leaves      3.2                      1.3
5     Folk dance         1.5                      0.5
6     Sports             1.4                      0.5
7     Play in the park   4.2                      1.8
8     Blooming flowers   3.7                      1.4

 

As the table above shows, the 10:1 compression ratio performs much better than 20:1. At 10:1, the PQR value is close to or below 1 in most scenes, which is in the visually lossless category. At 20:1, the results are also in the range of high-quality codecs overall, although one scene (Play in the park, 4.2) slightly exceeds 4.

For the 3840×2160/50p test, we used two indicators for evaluation, SSIM (Structural Similarity) and VMAF (Video Multimethod Assessment Fusion):

SSIM is an indicator that measures the structural similarity between images and is often used to assess the similarity between before and after image distortion. The value of the SSIM indicator ranges from 0 to 1, and the closer the value is to 1, the closer the compressed or processed image is to the original image.
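To make the indicator more concrete, here is a minimal Python sketch of the SSIM formula applied once over a whole set of luma samples. It is a simplification for illustration only: practical measurement tools compute SSIM over local windows and average the results, and this is not the test institute's method. The constants K1 = 0.01 and K2 = 0.03 are the conventional defaults, and the sample values are made up.

```python
def global_ssim(reference, test, dynamic_range=255.0):
    """Single-window SSIM over two equal-length sequences of pixel values."""
    n = len(reference)
    if n == 0 or n != len(test):
        raise ValueError("inputs must be non-empty and of equal length")
    c1 = (0.01 * dynamic_range) ** 2   # stabilises the luminance term
    c2 = (0.03 * dynamic_range) ** 2   # stabilises the contrast/structure term
    mean_r = sum(reference) / n
    mean_t = sum(test) / n
    var_r = sum((r - mean_r) * (r - mean_r) for r in reference) / n
    var_t = sum((t - mean_t) * (t - mean_t) for t in test) / n
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(reference, test)) / n
    numerator = (2 * mean_r * mean_t + c1) * (2 * cov + c2)
    denominator = (mean_r ** 2 + mean_t ** 2 + c1) * (var_r + var_t + c2)
    return numerator / denominator


# Made-up 8-bit luma samples: an original and a slightly distorted copy.
original = [10, 50, 90, 130, 170, 210]
distorted = [12, 47, 95, 128, 160, 215]
print(global_ssim(original, original))
print(global_ssim(original, distorted))
```

An identical pair scores 1, and any distortion lowers the score, which matches the interpretation given above.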

VMAF combines three indicators: Visual Information Fidelity (VIF), Detail Loss Metric (DLM) and Temporal Information (TI). It generates a score for each frame and uses an averaging algorithm to calculate the final video score. A score of 95 or above means that the difference is extremely difficult to see with the naked eye; 93-95 means that subtle differences can be seen but are perfectly acceptable; and below 91, the difference is usually more obvious.

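The test results show that image quality is excellent at all compression ratios: VMAF stays at or very close to 100, while SSIM declines gradually as the compression ratio increases.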

 

Item  Image name         JPEG-XS 6:1          JPEG-XS 10:1         JPEG-XS 16:1         JPEG-XS 20:1
                         1480 Mbps            890 Mbps             556 Mbps             445 Mbps
                         SSIM      VMAF       SSIM      VMAF       SSIM      VMAF       SSIM      VMAF
1     Beijing opera      0.988045  100        0.978799  100        0.968025  100        0.96114   99.99895
2     Complexion         0.994263  100        0.991585  100        0.98716   100        0.984024  100
3     Night view         0.989941  100        0.979466  100        0.9677    100        0.961396  100
4     Bamboo leaves      0.991173  100        0.981627  100        0.967106  100        0.958404  100
5     Folk dance         0.968585  100        0.933169  100        0.908078  100        0.897538  100
6     Sports             0.989136  100        0.973964  100        0.956193  100        0.947777  100
7     Play in the park   0.982455  100        0.96611   100        0.942407  100        0.925748  99.99934
8     Blooming flowers   0.976105  100        0.959082  100        0.933238  100        0.917215  100

 

Delay

The Wuwangdun archaeological site in Huainan, Anhui Province, is almost 1,000 kilometres from the CCTV Fuxinglu office, yet the transmission delay is only about 60-80 milliseconds. This means that by the time the camera pictures have been decoded into a baseband signal by the rear system, the delay is no more than about two frames. We also tested the relative delay between the two cameras and found almost no difference between them, so the director could switch smoothly between multiple camera positions using JPEG-XS without alignment problems in the program flow.

The above test results confirm that the JPEG-XS shallow compression technology has the characteristics of ultra-low latency, and the delay between each camera position is basically the same. In contrast, the delay using 5G backpack transmission usually reaches about 3 seconds, and the delay between multiple backpacks is not fixed, which requires the production team to plan and design in advance during commissioning.
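The delay figures above can be put in terms of frames with a one-line conversion, sketched below in Python. The 25 frames per second used here is an assumption based on the 1080/50i production format (50 fields, 25 frames per second); the function name is hypothetical.

```python
def delay_in_frames(delay_ms, frames_per_second=25.0):
    """Convert a delay in milliseconds to a number of video frames."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return delay_ms * frames_per_second / 1000.0


# Delay figures quoted in the text, converted at the assumed frame rate.
for label, delay_ms in (("JPEG-XS, low end", 60),
                        ("JPEG-XS, high end", 80),
                        ("5G backpack", 3000)):
    print(f"{label}: {delay_in_frames(delay_ms):.1f} frames")
```

On this basis the JPEG-XS path sits at roughly one and a half to two frames, while a three-second backpack delay corresponds to dozens of frames, which is why the two sources need to be planned for separately.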

JPEG-XS light compression allows the director at the rear to see the camera pictures in near real time and to cut precisely at every key switching point, providing strong technical support for producing more exciting and tightly paced TV programs.

Transmission Bandwidth

When the compression ratio is set to 10:1, the actual bandwidth requirement of an HD signal is about 120 Mbps. In testing, we found that if the bandwidth provided by the operator is less than 150 Mbps, the picture occasionally drops to black. The public network is sometimes unstable, so to ensure stable signal transmission we still need to reserve more than 40% redundant bandwidth.
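A simple way to apply this rule of thumb is sketched below in Python. It interprets the 40% figure as headroom on top of the stream rate, which is an assumption; the text does not say whether the percentage is relative to the stream rate or to the link capacity. The function name is hypothetical.

```python
def required_link_mbps(stream_mbps, headroom_percent=40):
    """Return the link bandwidth needed for a stream plus percentage headroom."""
    if stream_mbps < 0 or headroom_percent < 0:
        raise ValueError("inputs must be non-negative")
    return stream_mbps * (100 + headroom_percent) / 100


# One HD signal at 10:1, using the figures from the text.
stream = 120
needed = required_link_mbps(stream)
print(f"Stream {stream} Mbps needs about {needed:.0f} Mbps of link bandwidth")
print("150 Mbps link sufficient:", 150 >= needed)
```

Under this reading, the threshold below which black frames were observed leaves less headroom than the recommended margin, which is consistent with the test observation.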

Conclusion

The Wuwangdun archaeological broadcast was an important practical application of remote production technology. The live broadcast used a hybrid of channel cameras with JPEG-XS and portable cameras with 5G backpacks, making full use of the advantages of both. Through the director’s precise scheduling, the flexibility, high image quality and low latency of the system were presented to the audience, who could feel as if they were on site and appreciate the unique charm of the Wuwangdun relics.

Of course, JPEG-XS light compression technology still has many aspects that require further in-depth research, such as how to more effectively reduce the impact of network jitter on signal transmission, and how much redundant bandwidth should be reserved in the public SD-WAN network to ensure transmission stability. These issues require our continued exploration and research in the field.

This live broadcast not only proved the feasibility of this technical solution, but also met broadcasters’ current need to reduce costs and increase efficiency. The combination of JPEG-XS light compression and SD-WAN network technology is a useful exploration in the field of remote production. We believe that in the near future remote production will be more widely used, and that its advantages, such as low cost and high quality, will gradually become apparent. At the same time, it will promote the lightweight transformation of traditional broadcast television systems. We will continue to follow the technological development of remote production, apply it boldly in program production, make full use of the potential of existing production facilities, and contribute to the transformation of content production methods at headquarters.

 
