Signiant – How AI is becoming a vital part of the intelligent transport workflow
Ian Hamilton, Chief Technology Officer, Signiant
The use of AI at all levels in the broadcast chain increased with dizzying speed throughout 2023 and into 2024. But while the likes of ChatGPT and its generative AI cousins have stolen much of the limelight when it comes to assessing the technology’s impact on the industry, in truth there is likely more work being done in other areas. AI has been part of the fabric of the industry for several years now, and at Signiant, it has become a vital component of what can be referred to as an intelligent transport workflow.
Machine learning systems learn and adapt using statistical models to make inferences from data. These systems take a set of inputs and make predictions for corresponding outputs; trained on examples, they can predict outputs for inputs they haven’t seen before. More data drives better predictions and, as a multi-tenant SaaS vendor, Signiant aggregates a lot of data that can be anonymized and analyzed for the benefit of all our customers.
With file transfer systems, many parameters can be adjusted to increase (or decrease) the resulting transfer rate, which ultimately depends on a series of interactions with the operating environment. The operating environment encompasses factors like dataset composition, storage capabilities, computational resources, and network conditions. Given an operating environment and a resulting transfer rate as inputs, an ML system can be trained to predict the transfer parameters that were used to achieve that rate, but this on its own isn’t useful for maximizing the transfer rate.
By adding another input representing transfer rate quality, a form of contrastive learning (using positive and negative examples) can be applied to create a model that predicts the transfer parameters most likely to produce a high-quality transfer rate. This approach relies on labelling the transfer rate quality of the examples used to train the model, and that labelling can be done algorithmically.
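As a rough illustration of how positive and negative examples can steer a parameter choice, the sketch below swaps in a simple nearest-neighbour vote rather than a trained contrastive model. It is not Signiant's implementation. The feature layout, the `Example` record and the `recommend` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from math import dist
from typing import Dict, List, Tuple

Params = Tuple[str, int]  # hypothetical: (transport, number of streams)

@dataclass(frozen=True)
class Example:
    env: Tuple[float, ...]  # hypothetical normalised environment features
    params: Params          # parameters used for that historical transfer
    good: bool              # algorithmically assigned quality label

def recommend(env: Tuple[float, ...], examples: List[Example], k: int = 3) -> Params:
    """Vote among the k most similar past transfers.

    Good examples vote for their parameters, poor examples vote against.
    """
    nearest = sorted(examples, key=lambda e: dist(env, e.env))[:k]
    scores: Dict[Params, int] = {}
    for e in nearest:
        scores[e.params] = scores.get(e.params, 0) + (1 if e.good else -1)
    return max(scores, key=scores.get)

# Made-up history: features are (bandwidth, latency), both scaled to 0..1.
history = [
    Example((0.90, 0.80), ("udp", 8), True),
    Example((0.90, 0.70), ("tcp", 2), False),
    Example((0.85, 0.75), ("udp", 8), True),
    Example((0.10, 0.10), ("tcp", 4), True),
    Example((0.15, 0.10), ("udp", 8), False),
]
choice = recommend((0.88, 0.78), history)
```

The point of the negative vote is that a parameter set seen only in poor transfers under similar conditions is pushed down, which a model trained on positive examples alone could not do.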
Changing parameters
Operating environment variables impacting the speed of file transfer include: networking factors (available bandwidth, latency, loss and jitter); compute factors (number and speed of CPUs and GPUs or other hardware offload capabilities); storage factors (read rate, write rate, optimal block sizes, type of storage and protocol support); and sizes of files in the dataset.
Signiant implements a transport-layer protocol deployed on top of UDP that outperforms TCP on long-distance, high-bandwidth networks. Whether to use Signiant UDP or TCP is one key parameter impacting performance in different environments. On top of transport-layer optimizations, Signiant uses an HTTP-based, parallel-stream application-layer transfer protocol. Multiple streams facilitate scaling out the transfer across multiple servers, among other benefits. Large files can be split across multiple streams, or multiple small files can be sent on a single stream. At the application layer, the version of the HTTP protocol, the number of streams used, and the amount of data sent over each stream are adjustable parameters.
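To make the stream parameters concrete, here is a minimal sketch of one way a dataset could be spread over parallel streams. It assumes a greedy "largest chunk to the least-loaded stream" rule. The `plan_streams` function and its arguments are hypothetical and do not describe Signiant's actual scheduler.

```python
import heapq
from typing import Dict, List, Tuple

Chunk = Tuple[str, int, int]  # (file name, byte offset, length in bytes)

def plan_streams(files: Dict[str, int], streams: int, chunk_size: int) -> List[List[Chunk]]:
    """Split files into chunks of at most chunk_size and balance them over streams."""
    chunks: List[Chunk] = []
    for name, size in files.items():
        # A large file yields several chunks; a small file yields exactly one.
        # Zero-length files produce no chunks in this sketch.
        for offset in range(0, size, chunk_size):
            chunks.append((name, offset, min(chunk_size, size - offset)))

    # Assign the largest chunks first, each to the currently least-loaded stream.
    chunks.sort(key=lambda c: c[2], reverse=True)
    heap = [(0, i) for i in range(streams)]  # (bytes assigned, stream index)
    plan: List[List[Chunk]] = [[] for _ in range(streams)]
    for chunk in chunks:
        load, i = heapq.heappop(heap)
        plan[i].append(chunk)
        heapq.heappush(heap, (load + chunk[2], i))
    return plan

# Made-up dataset: one large media file and three small sidecar files (sizes in bytes).
plan = plan_streams({"big.mxf": 100, "a.xml": 10, "b.xml": 10, "c.xml": 10},
                    streams=2, chunk_size=40)
```

In this example the large file ends up spread across both streams while the small files share streams with it, which mirrors the two cases described above.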
As mentioned previously, defining a “good” transfer is a key part of the system. At the simplest level, a good transfer rate is a rate that is as close as possible to the available end-to-end bandwidth of the environment. The real available bandwidth isn’t necessarily known, but it can be estimated with reasonable accuracy leveraging operating environment instrumentation.
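The labelling step can be sketched in a few lines. Two assumptions here are not from the article: the end-to-end limit is taken to be the slowest of the network, source storage and destination storage, and a transfer counts as "good" when it reaches 80% of that estimate. The function names and the threshold are illustrative only.

```python
def estimate_available_bandwidth(network_bps: float,
                                 source_read_bps: float,
                                 dest_write_bps: float) -> float:
    """Assumption: the slowest stage bounds the end-to-end rate."""
    return min(network_bps, source_read_bps, dest_write_bps)

def label_transfer(rate_bps: float, available_bps: float, threshold: float = 0.8) -> bool:
    """Return True when the achieved rate is close enough to the estimate.

    The 0.8 threshold is a placeholder, not a published figure.
    """
    if available_bps <= 0:
        raise ValueError("available bandwidth must be positive")
    return rate_bps / available_bps >= threshold

# Made-up instrumentation readings, in bits per second.
available = estimate_available_bandwidth(10e9, 6e9, 8e9)
is_good = label_transfer(5.4e9, available)
```

Labels produced this way can serve as the positive and negative examples used for training.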
The advantages of AI
The main benefit of using AI tools lies in the elimination of time-consuming manual testing and tuning. The possible combinations and permutations of transfer configurations are practically unlimited. Running enough test transfers to cover appropriate combinations and eliminate run-to-run variability is time-consuming and wasteful when contrasted with looking at real-world transfer data obtained under similar conditions. ML simply provides an effective and well-understood mechanism for analyzing this data.
To quantify success, we look at the percentage of transfers achieving good transfer rates before and after applying the model. Given that the objective of the system is to maximize this percentage, a significant improvement in this metric (based on a given classification method) shouldn’t be surprising. The metric is also somewhat self-referential, because the model is trained against the same definition of “good” that is used to score it.
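The before-and-after comparison is simple arithmetic. The sketch below uses made-up labels purely to show the calculation. `good_rate_percentage` is a hypothetical helper, and the numbers are not measured results.

```python
from typing import Iterable

def good_rate_percentage(labels: Iterable[bool]) -> float:
    """Percentage of transfers whose rate was labelled good."""
    labels = list(labels)
    if not labels:
        raise ValueError("need at least one transfer")
    return 100.0 * sum(labels) / len(labels)

# Illustrative labels only, not real transfer data.
before = good_rate_percentage([True, False, False, True])
after = good_rate_percentage([True, True, True, False])
improvement_points = after - before
```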
A perhaps better measure of success that can be translated into a business benefit is the portion of transfers that still benefit from manual performance tuning. This has dropped to effectively zero. Prior to introducing our intelligent transport capability, manual adjustment of transfer parameters was frequently required to achieve transfer rates above 5 Gbps. With our ML-based tuning approach, we regularly see rates over 15 Gbps with no manual tuning.
The current and future state of play
Signiant has one issued patent (USPTO No. 16,909,382) and one pending patent on our “Cloud-Based Authority to Enhance Point-to-Point Data Transfer with Machine Learning”. Signiant has fourteen total issued patents, with two pending patents, covering a broad range of the underlying technologies we use to provide efficient global access to media. At the risk of oversimplifying, the claims of patent 16,909,382 cover aspects of how we collect and process the information necessary to train a model and how we apply the results of that model to new transfers.
This capability is in active use in our Jet product. Jet is used for automated system-to-system transfers, which tend to benefit most from this type of optimization.
As for where we are heading, a challenge for continued refinement is the ongoing collection of training data. We need to make sure we aren’t just implementing a positive feedback loop that reinforces already established patterns. Additionally, while the initial configuration determined by this system plays a critical role, there is also the opportunity to adjust transfer parameters over the duration of a transfer. For example, just as our transport-layer protocol adjusts to the operating environment in real time through its flow and congestion control mechanisms, application-layer parameters could also be adjusted during the transfer to respond to variability in operating conditions.