608 / 708 Closed captions out Decklink SDI

asperi · December 18, 2021, 6:02am

I am hoping to find a solution for passing 608/708 closed captions out of CasparCG to a Blackmagic Design Decklink board. The source could be MPEG-2, MP4 or MXF files with embedded captions, at 1080i or 720p. The server would hold files of a single type, resolution and frame rate, and would stay with that format once configured. I do not need inputs, graphics or anything fancy, just 24/7 playback with captions passed out the SDI. I am in the States, where 608 or 708 captions are required by law for anything being broadcast, so getting captions to pass is a must for us. I know a lot of other people are interested in this, so I am hoping to see some development on the issue. Thanks!

didikunz · December 18, 2021, 9:49am

There is another thread about this already. Usually the forum is not a good place to post feature requests. Please open an issue on GitHub if that has not already been done.

Maybe you or your organization is able to fund the development of such a feature, or you have a programmer capable of doing it. Understand that for us, based in Europe, CC is nothing we need, so this would better be done by somebody who actually needs this feature.

asperi · December 18, 2021, 4:05pm

I posted on the old thread, posted on GitHub, posted everywhere I can with no success or replies, hoping that adding a new thread would show how much this is needed. Delete if you must. It is just sad that this lack of support for captions cuts off CasparCG from an entire continent.

didikunz · December 18, 2021, 4:15pm

As I said, if you know a guy that can program in C++ he could add it…

Julusian · December 18, 2021, 9:52pm

While I can understand the need for it, I am struggling to think about how it could be done in casparcg properly.

Caspar is all about a channel having multiple layers playing. So while it is unlikely that anyone wanting captions will try to play multiple things on a channel, someone will. What should happen when they do? Should it take the captions from the lowest layer, or the uppermost? Can they be combined? Does it matter if one of those clips has been muted?

From what I have read on GitHub, it might be that the Decklink needs to be fed YUV frames to be able to add captions. There isn’t a flow in place to get YUV frames, only RGBA. So it’ll need a new flow to convert frames, which may become a performance drain (when enabled). Also consider that the only way to feed a Decklink frames with alpha is in RGBA. So if you want an alpha channel out and to be in YUV mode, then we need to feed two Decklink outputs perfectly synchronised frames ourselves, which is also something new.
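To make the conversion step concrete, here is a minimal Python sketch of what an RGBA to 10-bit YUV flow has to compute per pixel. It assumes BT.709 coefficients and limited ("video") range, which is the usual choice for HD SDI; the function name is hypothetical and none of this is taken from the CasparCG code base. Note that the alpha value is simply dropped, which is exactly why a keyed output would need a second, separately synchronised output.

```python
def rgba8_to_ycbcr10_bt709(r, g, b, a=255):
    """Convert one 8-bit full-range R'G'B'A pixel to 10-bit limited-range Y'CbCr.

    Assumes BT.709 luma coefficients. The alpha value `a` is ignored,
    because a YUV 4:2:2 frame has nowhere to carry it.
    """
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0

    # BT.709 luma and scaled colour-difference signals
    y = 0.2126 * rn + 0.7152 * gn + 0.0722 * bn
    cb = (bn - y) / 1.8556
    cr = (rn - y) / 1.5748

    # Map to 10-bit limited range: luma 64..940, chroma 64..960 centred on 512
    y10 = round(64 + 876 * y)
    cb10 = round(512 + 896 * cb)
    cr10 = round(512 + 896 * cr)

    # Keep clear of the reserved SDI code values at both ends of the range
    def clamp(value):
        return max(4, min(1019, value))

    return clamp(y10), clamp(cb10), clamp(cr10)


if __name__ == "__main__":
    print(rgba8_to_ycbcr10_bt709(255, 255, 255))  # white
    print(rgba8_to_ycbcr10_bt709(0, 0, 0))        # black
```

A real implementation would also subsample chroma to 4:2:2 and pack the samples as v210, and would run on whole frames (CPU SIMD or GPU) rather than per pixel in Python.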

So it sounds to me like there are quite a lot of areas inside of casparcg that need changes to handle captions. But it is doable, and if someone wants to do it I am available to discuss plans and to help them get it merged.

asperi · December 19, 2021, 12:37am

Julian, thank you for the thorough explanation of the issues, this helps in understanding the magnitude of the task. I think one of the issues may be the difference between what I will call a “Play Out Server” and a “Production Media Server”.

So with a Play Out Server, they are typically remotely managed, with schedules loaded for a week, sending video into a cable system or TV station switcher, which switches to the server when needed. The Play Out Server will just play file after file for a 24 hour schedule, no layers, no graphics: one file ends, and the next file plays. Some have GPI triggers that start playback when someone in the control room triggers it; they switch to it, put its content on air and switch back to other sources when done, but 90% of these servers play for days or weeks without anyone ever touching the box. Sometimes this is called channel in a box. Not glamorous or very fun, but these work great for unmanned channels, overnights and paid show placement.

With a Production Media Server, you have all the fun stuff: layers, graphics, keying, etc. Basically CasparCG. There are some high end production servers that do all of these things while maintaining captions and can also just play 24/7. I am not totally sure how they maintain captions with layers, graphics and all, but I am pretty sure there is just a base layer with the main video and captions that will pass, and the manual states that if you want captions to pass, they need to be on this layer, where they are sent directly to the output and combined with the final output video. I have never heard of captions being combined.

From what I have read, I believe you are right about the Decklink needing 10 bit YUV frames for captions and I see the need for an alpha channel in a Production server but not in the Play Out model.

So with your info, if this were to be done, it seems we would need a Play Out server version that simply sends “local files” directly to the “consumer”/ Decklink as viewed in the CasparCG_Server.jpg. This would be a stripped down version with no Producers, Mixers, layers or Keying. But then there is still the problem of getting YUV frames. It may be that just making a player in FFMPEG would be better for a simple Play Out server, but I have had problems even getting FFMPEG to pass captions. I wish it would have been easier to do in CasparCG, but at least now the complications are known. Please let me know if there are any other solutions for getting captions to pass out Decklink SDI, and thanks again!

hreinnbeck · December 19, 2021, 2:20am

I wonder if we should look at this from a different angle. Perhaps it would be simpler to parse EIA/CEA 608/708 from the ffmpeg producer to OSC (or another protocol), and that could be picked up by a separate service for insertion. E.g. CCG could output to SHM/NDI/IP and the separate service would handle inserting 608/708 (perhaps even DVB/TXT for us Europeans) and output to SDI. It would likely require fewer modifications to CCG, and the caption/subtitle service could be useful for other projects besides CCG. The developer effort on the service might also be simpler, e.g. it wouldn’t require domain knowledge of CCG.
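As a rough illustration of what "608/708 over OSC" could look like on the wire, here is a minimal Python sketch that packs one frame's caption bytes into an OSC 1.0 message using only the standard library. The address pattern and the argument layout (frame number plus a blob) are assumptions made up for this example; CasparCG does not define such a message today.

```python
import struct


def _osc_pad(data: bytes) -> bytes:
    """Pad with NUL bytes up to the next multiple of 4, as OSC requires."""
    return data + b"\x00" * (-len(data) % 4)


def _osc_string(text: str) -> bytes:
    """OSC strings are ASCII, NUL-terminated, then padded to 4 bytes."""
    return _osc_pad(text.encode("ascii") + b"\x00")


def osc_caption_message(address: str, frame_number: int, cc_bytes: bytes) -> bytes:
    """Build an OSC message carrying (int32 frame_number, blob cc_bytes).

    The address and argument layout are hypothetical.
    """
    type_tags = ",ib"  # one int32 followed by one blob
    return (
        _osc_string(address)
        + _osc_string(type_tags)
        + struct.pack(">i", frame_number)               # big-endian int32
        + struct.pack(">i", len(cc_bytes)) + _osc_pad(cc_bytes)  # blob: size, data, padding
    )


if __name__ == "__main__":
    # Example payload: one cc_data triplet (608 field 1, control code 0x9420)
    message = osc_caption_message("/channel/1/stage/layer/10/captions", 1234, b"\xfc\x94\x20")
    print(len(message), message.hex())
```

A separate inserter service would listen for these packets (typically over UDP) and would still have to solve the hard part, namely lining each payload up with the right video frame.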

asperi · December 19, 2021, 5:22am

hreinnbeck, this is exactly what I was thinking but I did not know how to word it, thank you! So I have successfully extracted captions from a file with embedded 608/708 captions in FFMPEG, here is that line:

ffmpeg -f lavfi -i "movie=RD2838_720p.ts[out0+subcc]" -map s output.srt

The thing with that is that it makes .SRT… In all my experience with 608/708 captions we use .SCC to embed them, so this would need to be looked at. I have not seen anything that uses SRT for 608/708, but I know they can be converted from one to the other. libcaption could possibly be one tool for this, or maybe srt2scc. Another good free tool for captions is ccextractor; the newest version can “listen on UDP”, so maybe there is something that could be used with that, but it is mostly for extracting to file, I believe.

It would be cool if this could work for not just 608/708 but for DVB/TXT too. I have read of success in FFMPEG with a line like this using -teletext_lines all, but I am not sure about it…
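To show why SRT and SCC are not interchangeable as-is: SRT stores plain text with timestamps, while SCC stores the raw CEA-608 byte pairs as hex, each byte carrying an odd parity bit. The Python sketch below covers only that one step, turning text into SCC-style hex words. It assumes the text is limited to characters whose 608 code equals their ASCII code (letters, digits and most punctuation), and it leaves out the timecodes, control codes and positioning a real converter such as the tools mentioned above has to produce. The function names are hypothetical.

```python
def with_odd_parity(value: int) -> int:
    """Return the 7-bit value with bit 7 set so the total number of 1 bits is odd."""
    value &= 0x7F
    ones = bin(value).count("1")
    return value | 0x80 if ones % 2 == 0 else value


def text_to_scc_words(text: str) -> str:
    """Encode text as space-separated 4-digit hex words, as seen in .scc files.

    Assumes every character maps to its ASCII code, which holds for most
    but not all of the CEA-608 basic character set.
    """
    data = [with_odd_parity(ord(ch)) for ch in text]
    if len(data) % 2:
        # 608 data travels in byte pairs; pad with a null byte (0x00 -> 0x80)
        data.append(with_odd_parity(0x00))
    return " ".join(
        f"{data[i]:02x}{data[i + 1]:02x}" for i in range(0, len(data), 2)
    )


if __name__ == "__main__":
    print(text_to_scc_words("Hello"))
```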

ffmpeg -i C:\video\RD2838_720p.mpg -codec:v mpeg2video -f decklink -teletext_lines all -map 0 -vcodec v210 "DeckLink 4K Extreme"

So maybe we could just extract to SRT, bypass Producer/ Mixer, convert to 608/708 or to DVB/TXT and insert on final output…? This way the desired caption output could be set in the config file. This would be awesome if possible!

didikunz · December 19, 2021, 11:30am

A few thoughts and ideas on this topic:

That is seldom true, as most of the stations have at least a bug layer. Sometimes they also need a second video layer for stuff like a push-back effect, where underneath it an advertisement graphic (or video) is revealed, which could also contain text/sound that would need to go into the captions.

We could add something like a MIXER command, that enables CC output on a layer, and disables it on all others. In the scenario where we have a second layer (push-back etc.) the play-out controller would issue another command to switch to the secondary layer.
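The bookkeeping behind such a command is small. Here is a minimal Python sketch of the "enable on one layer, disable on all others" rule, kept per channel. The class and method names are hypothetical and only model the state; they are not an existing CasparCG or AMCP API.

```python
class CaptionRouting:
    """Tracks which single layer per channel is allowed to emit captions."""

    def __init__(self):
        self._source = {}  # channel index -> layer index

    def enable(self, channel: int, layer: int):
        """Make `layer` the caption source; returns the layer it replaced, if any."""
        previous = self._source.get(channel)
        self._source[channel] = layer  # implicitly disables every other layer
        return previous

    def disable(self, channel: int, layer: int) -> None:
        """Turn captions off, but only if `layer` is the current source."""
        if self._source.get(channel) == layer:
            del self._source[channel]

    def is_source(self, channel: int, layer: int) -> bool:
        return self._source.get(channel) == layer


if __name__ == "__main__":
    routing = CaptionRouting()
    routing.enable(1, 10)           # main programme layer carries captions
    routing.enable(1, 20)           # push-back: controller switches to the secondary layer
    print(routing.is_source(1, 10), routing.is_source(1, 20))
```

Storing one layer per channel, rather than a flag per layer, makes it impossible to end up with two caption sources by accident.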

That could be a way to go. That MIXER command I mentioned above would then enable the output to OSC (or another protocol).

Then a completely independent piece of software could do the insertion of the 608/708 CC or Teletext, or maybe even open captions. It would need a separate channel on the Decklink card, because AFAIK only one application at a time is able to get access to a Decklink channel. But that is not a big issue, as one could use a Duo 2 and use a spare channel for this.

On Caspar’s side we would only need to implement that special MIXER command and the extraction and sending out of the CC data, and could leave most of the frame pipeline untouched.

The implementation of the inserter could be left to a separate (GitHub) community.

What do you guys think?

Julusian · December 19, 2021, 11:44am

If we are happy to impose a bunch of limitations on the feature, then I think we can really scope it down and keep it all inside a normal version of caspar.

“We could add something like a MIXER command, that enables CC output on a layer, and disables it on all others.”

I think this is a good approach to solve the problem of which layer it gets taken from. Making the choice manual negates my questions about the source.

For feeding data to a Decklink, what if we impose the limitation that the channel output can’t use an alpha output? That way, we can convert it to YUV (on the CPU to begin with, GPU once someone wants to spend the time to make it optimal), and drive a single Decklink.
If you need both captions and the alpha channel you can probably make something work by adding a second Decklink consumer to the channel, one being the key & fill, the other containing captions.

hreinnbeck · December 19, 2021, 2:11pm

I think the best design would be to have the ffmpeg producer parse 608/708 to events (that could also be expanded to external subtitle files supported by ffmpeg). These events could be exposed so that the HTML producer could be used for open captions/subtitles (burn-in). If the events are in OSC they could also be used by anything external that can listen to OSC. A command would choose which layer(s) is emitting events. Another command could send custom events to allow for live captioning. Following that we could work on final composition to embed as closed captions in SDI or other output modules (if an external service isn’t developed for the insertion).

At least I think it’s important not to confine this to parsing/inserting 608/708 which is of no use to most of us.

didikunz · December 19, 2021, 3:42pm

Why would that be necessary? One could send the OSC commands directly from the live subtitler, or not?

hreinnbeck · December 19, 2021, 4:03pm

It simplifies development and building: once you have the event bus working with the custom events, you have a minimal and proven working solution; then you add the events from ffmpeg. Instead of having to listen for OSC events from two or more sources, you only listen to one source. You might also want to have a burn-in output (rendered in the HTML producer) and an embedded SDI output. And you also have to keep in mind that live captions could be mingled in with captions from ffmpeg, e.g. a newscast with pre-captioned segments and live captions for the presenter.
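A minimal Python sketch of the single-bus idea: captions decoded from a file and custom live captions are both expressed as the same event type and merged into one time-ordered stream, so a consumer (burn-in or SDI embedding) only listens in one place. The event fields and source names are assumptions for illustration, not an existing CasparCG structure.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class CaptionEvent:
    time_ms: int                        # presentation time; the only field used for ordering
    source: str = field(compare=False)  # e.g. "ffmpeg" or "live" (hypothetical labels)
    text: str = field(compare=False)


def merge_caption_events(*streams):
    """Merge several streams, each already sorted by time, into one sorted stream."""
    return heapq.merge(*streams)


if __name__ == "__main__":
    from_file = [
        CaptionEvent(0, "ffmpeg", "Pre-captioned segment starts"),
        CaptionEvent(4000, "ffmpeg", "Pre-captioned segment continues"),
    ]
    from_live = [CaptionEvent(2000, "live", "Presenter speaking live")]

    for event in merge_caption_events(from_file, from_live):
        print(event.time_ms, event.source, event.text)
```

In a running server the streams would be unbounded queues rather than lists, but the ordering rule stays the same.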

asperi · December 19, 2021, 6:40pm

This all sounds awesome, please let me know if there is anything I can do to help. I can test captions on my broadcast monitor (it can be tough to test without one), or if anyone needs access to a Decklink, they could VPN into my test server. I had been in contact with Gijs Peskens on GitHub a while back but have not heard from him recently. He did mention he may have had 608/708 captions “somewhat” working with NDI output; here is that issues thread and his version.
He said: “coded support for reading the CC stream via the ffmpeg module and outputting in a custom format to NDI. The only issue we saw was with framerate conversion done by the CasparCG mixer.” This may be a good place to look for the FFMPEG module reading the CC and as for the framerate conversion in the mixer, if the mixer is bypassed, this could maybe avoid this issue…?

asperi · December 19, 2021, 6:54pm

Here are some short files with embedded 608/708 if anyone needs them…

Also, getting the Decklink to output captions can be a task…
With the Blackmagic DeckLink SDK 12.2, the Win\Samples\bin\ClosedCaptions.exe sample does not have an input file choice or anything; it just puts up a blue screen, and the SDI monitor shows:
“CEA-708 Closed Captions
Second line of text!” - as captions on the monitor.

Another place to maybe look for getting captions out is the Medialooks MFormats SDK. This is a paid SDK, but one can demo it to get ideas, maybe… It works for playing a file with embedded captions and will output to the Decklink SDI, playing audio, video and 708… Medialooks\MFormats SDK\Samples\C#\Sample File Playback\bin\x64\Debug> Sample File Playback.exe …there are also some C++ samples too.

Julusian · December 19, 2021, 7:04pm

Instead of bypassing the mixer (I think doing that will be more work than you expect, and it certainly won’t help when the framerate mismatches), it could be a limitation in the first implementation that the clip framerate has to match that of the output for captions to be used.
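A minimal Python sketch of such a guard, assuming frame rates are available as exact rationals (for example "30000/1001") rather than rounded floats like 29.97. The function name is hypothetical, and how an interlaced clip's rate should be compared with the channel rate is left open here.

```python
from fractions import Fraction


def captions_allowed(clip_fps: str, channel_fps: str) -> bool:
    """Pass captions through only when clip and channel rates match exactly.

    Rates are given as strings such as "25" or "30000/1001" so that
    29.97 fps and 30 fps are never confused by rounding.
    """
    return Fraction(clip_fps) == Fraction(channel_fps)


if __name__ == "__main__":
    print(captions_allowed("30000/1001", "30000/1001"))  # matching rates
    print(captions_allowed("30", "30000/1001"))          # 30 vs 29.97
    print(captions_allowed("25", "30000/1001"))          # PAL clip on an NTSC-rate channel
```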

asperi · December 20, 2021, 4:38pm

Sure, maybe not bypass but parsed out and sent to the proper place. I am not up to speed on the inner workings of CasparCG… What are the next steps here? Thanks all and happy holidays!

didikunz · December 20, 2021, 5:07pm

As this thread is filed under “Paid Work Available”, somebody should post that he/she is able to pay for the development, so that C++ programmers can make an offer. Funding the development of a feature is normally the fastest way to get it done. Posting a feature request on GitHub can be cheaper, but it can take a (long) while until it is done.

asperi · December 20, 2021, 5:14pm

Yes I can try and raise funds if there is a number we need to reach, thanks.

gpeskens · December 22, 2021, 9:23am

I’m the guy who asperi talked about, having offered my services to him.

As he said: I wrote CEA708 passthrough code into a 2.1 fork with a single use case of outputting it to NDI. We had that working fine apart from framerate conversion (this can be solved, though the simplest solution is far from elegant). Since our business requirement for it disappeared, the code hasn’t been tested further than our lab environment.

I’d disagree with outputting the data via OSC, this creates a lot of issues in the chain and isn’t industry standard. Especially for caption data like CEA708 it makes a lot of sense for the caption data to travel with the frames, this way everything remains perfectly in sync. The simplest place to insert ancillary data produced by CasparCG into SDI is at the source: CasparCG.

Instead I’d opt for an ancillary container that travels along on the frames. That way the same framework for CEA708 can simply be extended to support other formats (OP47 etc.) or even other types of ancillary data altogether (for example HDR metadata).
Besides that, it can easily be consumed by outputs other than Decklink this way (i.e. v210 encoded ancillary lines sent as NDI metadata packets).
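A minimal Python sketch of what such a container could look like: each frame carries a list of ancillary packets keyed by their DID/SDID, so a consumer picks out the kinds it understands and ignores the rest. The class names are hypothetical. The DID/SDID pairs shown are the commonly cited values for CEA-708 (SMPTE 334) and OP-47 subtitle packets and should be checked against the specifications before being relied upon.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class AncillaryPacket:
    did: int        # data identifier
    sdid: int       # secondary data identifier
    payload: bytes  # user data words, still as plain 8-bit values


@dataclass
class Frame:
    image: bytes                                   # pixel data (placeholder)
    ancillary: List[AncillaryPacket] = field(default_factory=list)


# (DID, SDID) pairs; verify against SMPTE 334 / OP-47 before use
CEA708_CDP: Tuple[int, int] = (0x61, 0x01)
OP47_SDP: Tuple[int, int] = (0x43, 0x02)


def packets_of_kind(frame: Frame, kind: Tuple[int, int]) -> List[AncillaryPacket]:
    """Return the packets on this frame that match the given (DID, SDID)."""
    return [p for p in frame.ancillary if (p.did, p.sdid) == kind]


if __name__ == "__main__":
    frame = Frame(image=b"", ancillary=[
        AncillaryPacket(*CEA708_CDP, payload=b"\x96\x69"),  # start of a CDP (identifier bytes)
        AncillaryPacket(*OP47_SDP, payload=b"\x51\x15"),
    ])
    # An SDI consumer that only knows CEA-708 ignores everything else
    print(packets_of_kind(frame, CEA708_CDP))
```

Because the packets ride on the frame object itself, anything that drops, repeats or reorders frames automatically does the same to the captions, which is what keeps them in sync.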

As for Decklink specifically: newer hardware dropped the requirement for ancillary data to be in the same pixel format as the video data. This can be checked via the BMDDeckLinkVANCRequires10BitYUVVideoFrames attribute ID; if the requirement is absent, the conversion from RGBA to v210 can be skipped entirely, since only the ancillary data then needs to be encoded into v210. Otherwise, when the user has configured it, CasparCG could fall back to converting to v210 via FFmpeg, or raise an error if that is deemed undesirable. On modern hardware the overhead of conversion to v210 shouldn’t be too high, and we can make sure to only enable it when necessary (BMDDeckLinkVANCRequires10BitYUVVideoFrames evaluates to true AND the user has requested VANC output via configuration).

That leaves the mixer. I think that, to start with, as a proof of concept an ON/OFF mixer command that enables pass-through of caption data for a certain layer would be enough, ideally also checking that other layers do not have caption pass-through enabled, and returning an error if so.

After glancing at the code I believe framerate conversion has mostly moved out of the mixer (wrt the ffmpeg producer). I’m not sure how ffmpeg handles caption data in frames it repeats/drops, so this could be an issue. So anything other than matching framerates would indeed be unsupported territory.
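The decision described above boils down to a small truth table. Here is a Python sketch of it, where `requires_10bit_yuv` stands for the value read from the BMDDeckLinkVANCRequires10BitYUVVideoFrames attribute and the other names (the enum, the function, the `allow_conversion` switch) are hypothetical configuration concepts, not existing CasparCG settings.

```python
from enum import Enum


class OutputPlan(Enum):
    NO_VANC = "keep the existing RGBA path, no ancillary data"
    VANC_ONLY_V210 = "keep RGBA video, encode only the ancillary lines as v210"
    CONVERT_VIDEO_TO_V210 = "convert the video frames to v210 as well"


def plan_decklink_output(vanc_requested: bool,
                         requires_10bit_yuv: bool,
                         allow_conversion: bool = True) -> OutputPlan:
    """Decide how to feed the card, paying for conversion only when unavoidable."""
    if not vanc_requested:
        return OutputPlan.NO_VANC                 # nothing changes for existing users
    if not requires_10bit_yuv:
        return OutputPlan.VANC_ONLY_V210          # newer hardware: video stays RGBA
    if allow_conversion:
        return OutputPlan.CONVERT_VIDEO_TO_V210   # older hardware: accept the overhead
    raise ValueError("VANC output requested, but this device needs 10-bit YUV "
                     "video frames and conversion is disabled")


if __name__ == "__main__":
    print(plan_decklink_output(vanc_requested=True, requires_10bit_yuv=False))
    print(plan_decklink_output(vanc_requested=True, requires_10bit_yuv=True))
```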