I used to work a lot with CasparCG. Recently I was working on a ‘real-time’ people-motion-tracking project (an AR app, nothing to do with TV/CG/graphics) and thought it could be fun to port it for use within CasparCG. I can track hands and faces and return points for every finger/arm/leg/etc. (which you could then ‘attach’ graphics to).
Do you think something like this could be useful? I don’t have a demo yet, but maybe I could put something together in a couple of days.
It’s fully web-based (so it would be a template, or even usable within other templates), without any changes to the server component itself (as long as the Chromium version works with my code).
I’ll make a video today/tomorrow to demonstrate the functionality/tracking.
Sorry for the late response - I was very busy the last couple of days. I had a few minutes to make a video showing the points we could attach things to (the demo video shows a wrist watch, which is the original use case).
In theory it should be possible to simply use the values within a template, or to build the whole library into an HTML template itself.
The next step will be to figure out the best way to implement this (in a way that is easily usable for others). It’s been a while since I last used Caspar (back when HTML became a thing), so any suggestions/tips/wishes are appreciated here.
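To illustrate "using the values within a template": MediaPipe-style trackers return landmarks normalized to [0, 1], so before a graphic can be pinned to one, the point has to be scaled to the template's resolution. A minimal sketch (the function names and the 1920x1080 resolution are just assumptions for the example):

```javascript
// Scale a normalized landmark (x, y in [0, 1]) to pixel coordinates
// for a given template resolution.
function landmarkToPixels(landmark, width, height) {
  return {
    x: Math.round(landmark.x * width),
    y: Math.round(landmark.y * height),
  };
}

// Build a CSS transform that moves a positioned element to that point.
function transformFor(landmark, width, height) {
  const p = landmarkToPixels(landmark, width, height);
  return `translate(${p.x}px, ${p.y}px)`;
}

// In a template this could drive an absolutely positioned element, e.g.:
//   el.style.transform = transformFor(wrist, 1920, 1080);
```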
No, the graph is trained to recognize people (it uses Google MediaPipe).
But there are other ways to do something like that. It depends on the size/speed of the drone, but I think that would be quite hard to do (especially if you have multiple drones in the same shot that change position all the time). In that case I think you would be better off tracking the drones directly (via a WiFi beacon/LoRaWAN or something like that) plus the camera zoom/movement.
On the other hand (depending on the version of Chromium Caspar is using), it shouldn’t be too hard to do something like a virtual set (or at least position virtual objects/graphics in 3D space and track them).
Do you have any info on hand about which Chromium version Caspar is using? As far as I can tell, it makes the most sense (at least from my perspective) to build everything outside Caspar, clone the video feed, and send the position values back to the template. Or is there a better way to do this (from a developer’s point of view)?
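For "send the position values back to the template", one option would be a small wire format over a WebSocket from the external tracker process to the template. The field names and message shape below are assumptions for illustration, not an existing protocol:

```javascript
// Encode one frame's tracked points (normalized coordinates) as a JSON message.
// points: e.g. { wrist: { x: 0.1, y: 0.2 }, nose: { x: 0.5, y: 0.3 } }
function encodePoints(frameId, points) {
  return JSON.stringify({ frame: frameId, points });
}

// Decode and minimally validate a message on the template side.
function decodePoints(message) {
  const data = JSON.parse(message);
  if (typeof data.frame !== "number" || typeof data.points !== "object") {
    throw new Error("malformed tracking message");
  }
  return data;
}

// In the template (browser side), roughly:
//   const ws = new WebSocket("ws://localhost:9091"); // port is an assumption
//   ws.onmessage = (ev) => applyPoints(decodePoints(ev.data).points);
```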
This would be super useful in many broadcast applications. If you can place a demo HTML <div> in a fixed 3D position relative to a person’s position in the surroundings and the camera movement - that’s a whole other level.
For instance, is it possible to fix this “SPECIAL COVERAGE” text in HTML relative to camera movement with your code?
Yes, didi got that right - it’s ‘just’ for people (hands, face, arms, …). But as long as your focal length + real-world camera position are constant, it should be possible without any AI/tracking (edit: ‘just’ with a few sensors on the camera). I got a request from a TV channel to do something like this - I can keep you updated on that one.
I quickly tested ‘packing everything into a template’ today, and it seems there are some issues (not sure what went wrong, but it looks like there is a problem with the initialisation process). I think the next thing to try is to run the recognition outside the template (which would be a better solution anyway, but I had hoped to just try it out the easy way) and just pass the values to Caspar via JavaScript…
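On the "pass the values via JavaScript" side: CasparCG calls a global `update()` function on an HTML template with the template data as a string. A defensive sketch of the receiving end, assuming the external tracker sends its values as JSON (the payload shape and element wiring are assumptions; older setups use the XML template-data format instead, which isn't handled here):

```javascript
// Parse the payload defensively so one malformed frame doesn't break the template.
function parseTrackingData(raw) {
  try {
    return JSON.parse(raw);
  } catch (e) {
    return null; // ignore malformed updates instead of throwing inside the template
  }
}

// In the template itself (browser side), roughly:
//   window.update = (raw) => {
//     const data = parseTrackingData(raw);
//     if (data && data.wrist) {
//       el.style.transform = `translate(${data.wrist.x}px, ${data.wrist.y}px)`;
//     }
//   };
```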