A few thoughts and ideas on this topic:
That is seldom true, as most stations have at least a bug layer. Sometimes they also need a second video layer for things like a push-back effect, which reveals an advertisement graphic (or video) underneath. That graphic could also contain text or sound that would need to go into the captions.
We could add something like a MIXER command that enables CC output on one layer and disables it on all others. In the scenario where we have a second layer (push-back etc.), the play-out controller would issue another command to switch CC output to the secondary layer.
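To make the idea a bit more concrete, here is a rough sketch of how a play-out controller might issue such a command over AMCP. The `CC` sub-command does not exist in CasparCG today; its name and the `1` argument are purely my assumption, modelled on the existing `MIXER <channel>-<layer> ...` syntax. The helper names are made up as well.

```python
import socket

AMCP_PORT = 5250  # CasparCG's default AMCP port


def cc_select_command(channel: int, layer: int) -> str:
    """Build the HYPOTHETICAL 'MIXER <channel>-<layer> CC 1' command.

    The idea is that enabling CC on one layer implicitly disables it on
    all other layers of that channel. AMCP commands end with CRLF.
    """
    return f"MIXER {channel}-{layer} CC 1\r\n"


def send_amcp(host: str, command: str, port: int = AMCP_PORT,
              timeout: float = 2.0) -> str:
    """Send one AMCP command over TCP and return the server's raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("utf-8"))
        return sock.recv(4096).decode("utf-8", errors="replace")
```

For the push-back case the controller would first send `cc_select_command(1, 10)` for the main video layer and later `cc_select_command(1, 20)` when the advertisement layer takes over (the layer numbers are just examples).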
That could be a way to go. The MIXER command I mentioned above would then enable the output to OSC (or another protocol).
Then a completely independent piece of software could do the insertion of the 608/708 CC or Teletext, or maybe even open captions. It would need a separate channel on the Decklink card, because AFAIK only one application at a time can get access to a Decklink channel. But that is not a big issue, as one could use a Duo 2 and use a spare channel for this.
On Caspar’s side we would only need to implement that special MIXER command plus the extraction and sending out of the CC data, and could leave most of the frame pipeline untouched.
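As an illustration of the "sending out" part, this is roughly what one OSC message carrying the CC bytes of a frame could look like on the wire. The string and blob padding follows the OSC 1.0 encoding rules; the `/cc` address suffix, the idea of sending the data as a single blob, and the function names are only assumptions on my side, not anything Caspar does today.

```python
import socket
import struct


def _pad4(data: bytes) -> bytes:
    """Zero-pad to a multiple of 4 bytes, as OSC 1.0 requires."""
    return data + b"\x00" * (-len(data) % 4)


def osc_string(text: str) -> bytes:
    """OSC-string: ASCII text, a null terminator, then padding to 4 bytes."""
    return _pad4(text.encode("ascii") + b"\x00")


def osc_blob(payload: bytes) -> bytes:
    """OSC-blob: big-endian int32 byte count, then the padded payload."""
    return struct.pack(">i", len(payload)) + _pad4(payload)


def cc_osc_message(channel: int, layer: int, cc_bytes: bytes) -> bytes:
    """Wrap raw CC bytes in one OSC message with a single blob argument."""
    # HYPOTHETICAL address: the '/cc' suffix is an assumption.
    address = f"/channel/{channel}/stage/layer/{layer}/cc"
    return osc_string(address) + osc_string(",b") + osc_blob(cc_bytes)


def send_osc(host: str, port: int, message: bytes) -> None:
    """Fire-and-forget delivery of one OSC message over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (host, port))
```

The external inserter would listen on the UDP port, unpack the blob and put the bytes into the ancillary data of its own Decklink output. Whether OSC over UDP is reliable enough for frame-accurate captions is something we would have to test.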
The implementation of the inserter could be left to a separate (GitHub) community.
What do you guys think?