Low performance with DeckLink Quad HDMI Recorder input

My goal is to make a simple 2x2 grid from 4 video inputs on a DeckLink Quad HDMI Recorder card.
I created 4 DeckLink producers with the 1080p3000 format in separate layers and then created the grid with the MIXER 1 GRID 2 command. The output is a DeckLink Mini Monitor 4K with the 2160p3000 format.
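
For anyone trying to reproduce the setup, here is a minimal Python sketch that prints the AMCP commands I would expect for it. Only the 1080p3000 format and the MIXER 1 GRID 2 command come from my description above. The layer numbers (10 to 13), the device indices (1 to 4), the PLAY ... DECKLINK DEVICE ... FORMAT ... syntax as written here, and the helper name grid_setup_commands are assumptions for illustration and may need adjusting for your server version.

```python
def grid_setup_commands(channel=1, devices=(1, 2, 3, 4),
                        video_format="1080p3000", first_layer=10):
    """Build AMCP command strings for a 2x2 grid of DeckLink inputs.

    Layer numbers and device indices are illustrative assumptions.
    """
    commands = []
    for offset, device in enumerate(devices):
        layer = first_layer + offset
        # One DeckLink producer per layer (assumed AMCP syntax).
        commands.append(
            f"PLAY {channel}-{layer} DECKLINK DEVICE {device} FORMAT {video_format}"
        )
    # Arrange the layers of the channel in a 2x2 grid.
    commands.append(f"MIXER {channel} GRID 2")
    return commands


if __name__ == "__main__":
    for line in grid_setup_commands():
        print(line)
```

The generated lines can be pasted into any AMCP client; the sketch itself does not talk to the server.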

The issue is that I can use at most 2 DeckLink inputs. As soon as I add a third DeckLink input, the mix-time rises significantly, the video gets laggy and out-sync warnings appear in the console. A diagnostics screenshot is attached. However, when I use local videos with the same parameters (1080p3000) as inputs, it runs smoothly without any problems.

When playing 2 DeckLink inputs, the GPU utilisation is around 90%. After adding more DeckLink producers, the GPU utilisation tops out at 100% and the PCIe bandwidth utilisation is around 60%. When playing 4 local videos, however, the GPU utilisation stays under 60%. All CPU cores run at 80% at most.

What could cause this issue? How can it be fixed?
I have used CasparCG for a few other projects and I was under the impression that making a 2160p 2x2 grid from 1080p sources shouldn’t be that hard. Or am I wrong?

My system specifications:
AMD Ryzen 5 3600XT
Nvidia GeForce RTX 2060 6GB (connected over PCIe 3.0 8x)
Blackmagic DeckLink Quad HDMI Recorder (connected over PCIe 3.0 8x)
Blackmagic DeckLink Mini Monitor 4K (connected over PCIe 2.0 4x)

CasparCG 2.3.0 N/A Dev
Ubuntu 18.04, kernel 5.4
Nvidia driver 450.80
Blackmagic driver 11.7
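
As a rough sanity check of the PCIe links listed above, here is a small Python sketch comparing the raw video data rates with the theoretical link capacities. The pixel formats are assumptions on my side (8-bit YUV 4:2:2 at 2 bytes per pixel for capture, 8-bit BGRA at 4 bytes per pixel for output). The per-lane figures are the usual theoretical values for PCIe 2.0 and 3.0, so real throughput will be lower.

```python
def video_rate_mb_s(width, height, bytes_per_pixel, fps):
    """Raw uncompressed video data rate in MB/s (1 MB = 10**6 bytes)."""
    return width * height * bytes_per_pixel * fps / 1e6


# Approximate theoretical throughput per lane in MB/s, after encoding overhead.
PCIE_MB_S_PER_LANE = {"2.0": 500.0, "3.0": 985.0}


def link_capacity_mb_s(generation, lanes):
    """Theoretical capacity of a PCIe link in MB/s."""
    return PCIE_MB_S_PER_LANE[generation] * lanes


# Four 1080p30 inputs, assumed 2 bytes per pixel (8-bit YUV 4:2:2).
capture_total = 4 * video_rate_mb_s(1920, 1080, 2, 30)
# One 2160p30 output, assumed 4 bytes per pixel (8-bit BGRA).
playout_total = video_rate_mb_s(3840, 2160, 4, 30)

capture_link = link_capacity_mb_s("3.0", 8)  # Quad HDMI Recorder slot
playout_link = link_capacity_mb_s("2.0", 4)  # Mini Monitor 4K slot

if __name__ == "__main__":
    print(f"capture: {capture_total:.1f} of {capture_link:.0f} MB/s")
    print(f"playout: {playout_total:.1f} of {playout_link:.0f} MB/s")
```

Under these assumptions the four inputs need roughly 500 MB/s and the output roughly 1000 MB/s, both below the theoretical capacity of their links. This says nothing about the transfers to and from the GPU.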

I have done some further testing using ffmpeg. When doing the same task as described in my previous post (making a 2160p grid from 1080p sources), I found that a single ffmpeg process with four 1080p25 inputs from DeckLink and a single 2160p25 or 2160p30 output to DeckLink runs well. With 1080p30 inputs, however, it does not run at real-time speed (28 fps instead of 30 fps) and eventually produces buffer errors. When I tried reading the four inputs in separate ffmpeg processes, it ran at 1080p30 flawlessly. I also noticed that the ffmpeg process making the video grid consumed less CPU than doing the same thing in CasparCG, even though ffmpeg does all the work on the CPU and does not use the GPU at all.
Could the issues be caused by the DeckLink Quad HDMI Recorder card? It is supposed to be able to record at 2160p30 from four inputs simultaneously.
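
To show why processing at 28 fps against a 30 fps input ends in buffer errors, here is a small Python sketch of the backlog arithmetic. The buffer size of 60 frames is a made-up example value, not something I measured, and the function name is only for illustration.

```python
def seconds_until_overflow(input_fps, processing_fps, buffer_frames):
    """Time until an input buffer fills when processing lags behind capture.

    Returns None if processing keeps up and the buffer never fills.
    """
    backlog_per_second = input_fps - processing_fps
    if backlog_per_second <= 0:
        return None
    return buffer_frames / backlog_per_second


if __name__ == "__main__":
    # 30 fps in, 28 fps processed: 2 frames of backlog per second.
    print(seconds_until_overflow(30, 28, 60))
    # 25 fps in, 25 fps processed: no backlog.
    print(seconds_until_overflow(25, 25, 60))
```

With a deficit of 2 frames per second, any fixed-size buffer fills up sooner or later, which matches the errors appearing only after a while.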