Q&A | Webinar Sessions: How to Leverage AI for Closed Captioning?

Blog from Digital Nirvana

Wed 02, 09 2020

Trance: AI-enabled world of Transcription, Captioning, and Translation; anytime & anywhere

Trance is an enterprise-level, cloud-based web application from Digital Nirvana that integrates cutting-edge speech-to-text (STT) technology, leading to significant efficiency gains. In this webinar, we discussed how AI is transforming subtitling, translation, and transcription, and covered key concepts around:

  • Importing media into Trance
  • Creating, viewing and editing transcripts in a single window
  • Configuring presets and single-click generation of closed captions in Pro Window and Caption Editor Window
  • Automatically translating to a secondary language
  • Exporting caption sidecars in all industry-standard formats

Below are a few key conversations from the Q&A section of the webinar.

1. How do customers really use Trance? What are some of your most common use cases?

A- One application, alluded to earlier, comes from the third-party provider perspective. Netflix was mentioned, and I’m going to reference Quibi in this particular case. We have a client that uses Trance, and they struck up a relationship with Quibi. Very quickly, they needed to turn content around for Quibi, and the requirements for doing so are extremely narrow and exacting: specifically, the turnaround window is roughly 60 minutes, and this happens on a daily rotation.

So: high pressure and short turnaround time, but also exacting standards of accuracy based on the Quibi style guideline. That’s one particular use case, where the overarching requirement for this customer was speed, accuracy, and an enterprise workflow that gets the work done quickly so that short-form assets can be turned around to an outlet like Quibi.

This would apply to Netflix as well, and there are other outlets; Quibi just happens to be one that is top of mind and very prominent right now. Another application would be independent caption providers. These are small companies that have relationships in the broadcast space, or sometimes in other areas such as OTT or VOD markets.

They’re looking to leverage Trance to be competitive in these alternate markets, maybe not so much in the live broadcast segment, for reasons we’ve already touched on. So they come to us looking for a solution, saying, “Hey, I need a better tool than what’s out there in the web- and app-based marketplace for translation and transcription.”

This is again where Trance fills a real niche or gap in the market left by providers that are one-size-fits-all and not agile. With them, what you get is pre-configured, whereas Trance can be wide open: we can export in any number of formats and we have all this enterprise flexibility, yet it’s still approachable and usable for a small shop that has, say, three, four, or five employees.

So we’re seeing applications and adoption of Trance in scenarios like that. There’s another application that came up fairly recently, where another client said, “We’d really love to take long-format assets, typically raw field content or long-form interviews, automatically transcribe them, and take the transcription as metadata and index it back into the asset.”

That gives the operators, the editors, and the people producing this content much greater visibility into, say, an hour-long recording. This is another way in which the product is being used: not just to caption something directly for air or for some OTT delivery, but as a metadata source indexed back to the raw content for greater post-production accuracy and efficiency.

And then I think the last one I would draw your attention to would be CBC, the Canadian Broadcasting Corporation, where they are using us to do conformance. One piece of the product is the ability to conform existing content to a specific style guideline, and in this case that is exactly Netflix’s. Netflix is one of the outlets whose parameters are very rigid, demanding, and exacting.

And so they use Trance to auto-conform any inconsistencies in their existing captioning process, to make sure that what gets delivered to Netflix is fully compliant with those very exacting style guidelines. So those would be three or four examples of mainline, in-the-trenches broadcast media applications where the product is used today.
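
The metadata use case described in this answer (transcribe a long recording, then index the transcript back to the asset) can be illustrated with a minimal sketch. This is not Trance's implementation or API: the segment data, `build_index`, and `to_timecode` are hypothetical names invented for illustration, and the sketch assumes the transcript is available as timestamped segments.

```python
import re
from collections import defaultdict

# Hypothetical transcript segments: (start_seconds, end_seconds, text).
# In practice these would come from a speech-to-text pass over the asset.
segments = [
    (0.0, 4.5, "Thanks for joining us in the field today."),
    (4.5, 9.0, "We are covering the flood response downtown."),
    (1805.0, 1811.0, "The flood waters peaked around midnight."),
]


def build_index(segments):
    """Map each lowercase word to the start times of the segments containing it."""
    index = defaultdict(list)
    for start, _end, text in segments:
        # set() so a word repeated within one segment is recorded only once.
        for word in set(re.findall(r"[a-z']+", text.lower())):
            index[word].append(start)
    return index


def to_timecode(seconds):
    """Format a time offset in seconds as HH:MM:SS."""
    s = int(seconds)
    return f"{s // 3600:02d}:{(s % 3600) // 60:02d}:{s % 60:02d}"


index = build_index(segments)
hits = [to_timecode(t) for t in index["flood"]]
print(hits)  # timecodes where "flood" is spoken
```

With an index like this, an editor could jump straight to the moments in an hour-long recording where a given word is spoken instead of scrubbing through the whole asset.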

2. Would the charge be based on the amount of content, or is it just a license?

A- It’s another hallmark of the way we operate and do business at Digital Nirvana. So here’s the great news: Trance is usage based. It’s based on usage and usage alone. It depends on how much content you push through it and how much content comes out of it, and it’s based purely on that factor.

So there is no site licensing process. There’s no user seat license restriction. It’s purely a consumption model, and you’re charged a rate based on that consumption figure. And that’s it. We want as many people as possible using and adopting it, and we hope this policy makes it as easy as possible, with the lowest barrier to entry, to embrace the product and use it. We also think it’s the fairest approach.

3. Is there a product installed? Is it browser based? Can you give us some background on the SaaS of your product?

A- It is a SaaS product. There is no licensing or deployment of software. This is entirely browser-based HTML access. So any video or audio asset is accessible via any browser, and the tools are accessible through that same browser connectivity to any user, anywhere in the world and at any time.

So again there’s no geographic or physical limitation to its access. If you have a broadband Internet connection, you don’t even have to be on a corporate LAN or wide area network. Based simply on your credentials, you can log into the platform and use the tool just like we demonstrated a few minutes ago in the demo.

And that means there are no installs, and no issues with software versions or version management. There are no plugins. There are no software updates to be managed across a single site or, which is even more difficult, multiple sites. Again, there are no user licensing or site licensing restrictions. It’s purely a remote HTML login process.

4. Why is Trance so fast? How does it produce these fast, incredibly fast turnaround times for these operators? You’d mentioned the Quibi example before we talked about SaaS, but can you give us some highlights without giving away the secret sauce on how you do it? How do you do it so quickly?

A- Some of the inherent advantages of SaaS and cloud-based tool sets enhance that process. This is a great example in the current climate, where many people are working remotely and are not in the physical offices they once were. This type of tool just makes getting access to your work that much easier in the social and cultural climate we find ourselves in. So that’s the first piece. The other piece that’s huge is the speed and the accuracy of the speech-to-text technology. This goes back to what I referenced in one of the earlier questions about how we philosophically deal with speech-to-text: we use multiple engines, we optimize those engines based on their performance, and we’re constantly evaluating that performance.

So not only do you not have to transcribe content manually, which is a colossal saving in time because it’s being done automatically, but the work that’s being done is also highly accurate. With good-quality content, we typically and very uniformly see results from the speech-to-text engines that are 90%, 95%, and sometimes even higher in terms of accuracy.
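
The answer does not say how the accuracy figures are measured. One common way to express speech-to-text accuracy is one minus the word error rate (WER), and the sketch below assumes that metric purely for illustration. The function name and the sample sentences are hypothetical, and nothing here reflects how Trance or its engines are actually evaluated.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between the two texts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Classic dynamic-programming edit distance, computed over words
    # instead of characters, keeping only the previous row in memory.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the hypothesis
                curr[j - 1] + 1,     # extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of ten.
reference = "the quick brown fox jumps over the lazy dog again"
hypothesis = "the quick brown fox jumped over the lazy dog again"

wer = word_error_rate(reference, hypothesis)
accuracy = 1 - wer
print(f"WER: {wer:.0%}, accuracy: {accuracy:.0%}")
```

Under this metric, a 95% accuracy figure would mean roughly one word in twenty needs correcting by a human editor.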

So that’s another giant efficiency-boosting part of this whole equation. The other piece, of course, as we talked about, is that this has been an M&E (media and entertainment) application from the outset. So the tool set and the user interface are all geared towards this specific outcome and the specific people who will use it and embrace it.

So it was designed from the ground up with that type of community in mind. The API that I referenced is another advantage: large enterprise customers do not have to create secondary processes to work with Trance. Working within Trance and getting all the native features is really cool, right? But if you’re in some giant enterprise with all these interconnected media systems, the last thing you want to do is have to create a secondary process.

And so, for those customers, we’ve solved that problem as well. Through this open API, Trance integrates directly into the existing media workflow environment, which is really great. And then the other piece, I would say, is the custom presets: being able to define them up front means that, at the moment captions or transcriptions are created, the content automatically conforms to the requirements of its final destination.

So if I’m publishing to Netflix, I can set those parameters up ahead of time and be reassured and comfortable knowing that the work that I do is automatically formatted for that final destination without having to go back later and tweak it or make changes or fuss around with it. It’s just completely done from the outset. So I would say those are four or five really good reasons to sort of clarify that speed question.
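
The idea of a destination preset can be sketched as a small set of formatting limits applied when captions are created. The preset keys and values below are hypothetical, chosen only for illustration; they are not taken from Trance or quoted from any outlet's style guide, and a real preset would cover far more (timing, reading speed, positioning, and so on).

```python
import textwrap

# Hypothetical preset for illustration only.
PRESET = {"max_chars_per_line": 42, "max_lines": 2}


def conform(text, preset):
    """Split transcript text into caption blocks that respect the preset limits."""
    lines = textwrap.wrap(text, width=preset["max_chars_per_line"])
    n = preset["max_lines"]
    # Group consecutive lines into blocks of at most n lines each.
    return [lines[i:i + n] for i in range(0, len(lines), n)]


def violations(blocks, preset):
    """Return (block_number, problem) pairs for blocks that break the preset."""
    problems = []
    for number, block in enumerate(blocks, start=1):
        if len(block) > preset["max_lines"]:
            problems.append((number, "too many lines"))
        if any(len(line) > preset["max_chars_per_line"] for line in block):
            problems.append((number, "line too long"))
    return problems


text = (
    "So high pressure, short turnaround time, but also with exacting "
    "standards of accuracy based on the style guideline."
)
blocks = conform(text, PRESET)
for block in blocks:
    print("\n".join(block), end="\n\n")
```

The `violations` helper shows the other half of the workflow: checking existing captions against a preset, which is the conformance scenario described in the first answer.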

5. When you’re using SaaS or PaaS, something that’s not hosted on-site, platform and data security is critical, including in the transfer mechanism. You also list FTP as a transfer protocol, so I’m curious what the security posture is around it.

A- One customer that uses this product every single day of the week had that same question. They handle top-tier content with a high degree of sensitivity and, quite frankly, vulnerability from a risk perspective, and they evaluated us on that very topic. We went through a very thorough process of reassuring that customer about the way those assets are transferred and stored in the cloud, and that the privacy and security surrounding those assets remain intact. This was an extremely high mark for us to meet, and it is something the product does natively.

I can only tell you that we met that test with flying colors, and this major, well-known mainline broadcaster has accepted us based on that and uses the product every single day. If there are more detailed technical questions, certainly forward those to us. It may be better to field detailed questions around security, or anything else, offline outside of this session, because I don’t know the specifics of what the questioner had in mind. But we’ve addressed these concerns, and our customers have responded in kind, trusting that our security protocols are sufficient for their requirements.
