Dmabuf screencasting is crazy good. Here's a histogram of the screencasting overhead on my 2560×1600@165 screen: the median is 300 microseconds, and the worst across 12,669 frames was just below 1 ms. Most of that time is spent rendering the frame; perhaps something in Smithay could be optimized even further.
And yeah, if you look at the profiling timeline, I zoomed it in so that almost the entire width is taken up by a single frame, which is 6.05 ms long. Most of it is completely empty!
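
To put those numbers in perspective, here's a quick back-of-the-envelope sketch of how much of a 165 Hz frame the screencasting overhead eats. The function names are made up for illustration, and it uses the nominal refresh period (1000 / 165 ms) rather than the exact frame length from the profile, so treat the result as approximate.

```rust
/// Nominal frame period in milliseconds for a given refresh rate in Hz.
fn frame_period_ms(refresh_hz: f64) -> f64 {
    1000.0 / refresh_hz
}

/// Fraction of one frame period taken up by `overhead_ms`.
fn budget_fraction(overhead_ms: f64, refresh_hz: f64) -> f64 {
    overhead_ms / frame_period_ms(refresh_hz)
}

fn main() {
    let refresh_hz = 165.0; // refresh rate of the screen from the post

    println!("frame period: {:.2} ms", frame_period_ms(refresh_hz));
    // Median overhead from the histogram: 300 microseconds.
    println!(
        "median overhead: {:.1}% of a frame",
        100.0 * budget_fraction(0.3, refresh_hz)
    );
    // Worst case was just below 1 ms, so this is an upper bound.
    println!(
        "worst overhead: under {:.1}% of a frame",
        100.0 * budget_fraction(1.0, refresh_hz)
    );
}
```

So the median overhead is roughly 5% of the frame and even the worst case stays under 17%, which matches the mostly empty timeline.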