Here is a work in progress describing strategies to find out what is wrong with a given video stream. I focus on the streaming part; I don't know much about the video itself beyond what is needed to get ffmpeg to decode it.
We can identify roughly three levels:
- Analysis on video level: Low Latency Viewer.
This is the fastest method because it is accelerated by your built-in vision computer, i.e. your eyes and brain.
- Analysis on frame level: stream_peeker.
This is a bit harder to do and requires more knowledge about the bits and pieces.
- Analysis on packet level, i.e. Wireshark.
This gives the ultimate detail but requires a lot of domain knowledge. For simple problems this is by far the slowest method. Having a lot of detail isn't necessarily a good way to get the big picture. Use it when needed, but not earlier.
Stream analysis on video level
Most quality decoders do a good job of hiding artifacts. Jitter, for example, is masked by adding latency. Decoding errors due to lost packets are typically not shown in the pixels; instead, rendering halts temporarily until a new I-frame is received. Even Low Latency Viewer throws away incomplete frames. It does, however, overlay a number of state indicators when that happens. See that page for more information.
These blinking LEDs tell a lot about the network situation at a high level.
Stream analysis on frame level
A lot can be concluded from observing the time information of frames being delivered. This is what the stream_peeker tool can do using the -M option. Please read that page for download- and basic usage information. The tool outputs a number of columns which have the following meaning:
Frame #      Number of the frame, counted from the start of monitoring
Duration     Duration of the stream in milliseconds, based on capture time
Receive dt   Time (ms) since the start of receiving the previous frame
Capture dt   Time (ms) since the previous capture (RTP timestamp)
ADT          Accumulated delta-t (ms) between receiver time and interpolated camera time
Size         Size of the frame in bytes
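The ADT column can be cross-checked by hand. Judging from the sample listings on this page (an observation, not a documented stream_peeker formula), ADT behaves like a running sum of Receive dt minus Capture dt. Below is a minimal Python sketch of that check. The names `FrameRow`, `parse_row` and `accumulate_adt` are hypothetical helpers, and the row layout is assumed to match the listings shown on this page.

```python
from typing import NamedTuple


class FrameRow(NamedTuple):
    frame: int
    duration_ms: int
    receive_dt: float
    capture_dt: float
    adt: float
    size: int
    tag: str  # e.g. "IDR SPS PPS", "IDR", or the position within the GOP


def parse_row(line: str) -> FrameRow:
    """Split one whitespace-separated output row into typed fields."""
    parts = line.split()
    return FrameRow(int(parts[0]), int(parts[1]), float(parts[2]),
                    float(parts[3]), float(parts[4]), int(parts[5]),
                    " ".join(parts[6:]))


def accumulate_adt(rows, start=0.0):
    """Running sum of (receive dt - capture dt), rounded to 3 decimals."""
    adt = start
    result = []
    for row in rows:
        adt += row.receive_dt - row.capture_dt
        result.append(round(adt, 3))
    return result


rows = [parse_row(line) for line in [
    "0000000 246 0.000 0.000 0.000 170619 IDR SPS PPS",
    "0000001 256 3.586 10.001 -6.415 1158 1",
    "0000002 261 5.178 5.179 -6.416 763 2",
    "0000003 292 31.050 31.052 -6.418 705 3",
    "0000004 325 44.192 33.344 4.430 723 4",
]]
for row, computed in zip(rows, accumulate_adt(rows)):
    print(row.frame, row.adt, computed)
```

A positive and growing ADT means frames are arriving later and later relative to the camera clock; a drop back towards the earlier level means delivery has caught up.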
Example: normal delivery
The following short run shows a handful of normal irregularities. Let's analyse it. First of all, from the capture delta times of frames 4 up to 14 you can conclude that this camera runs at 30 frames per second, because there are 33.333 milliseconds between captures.
At the start of the stream the timing is considerably off, including the capture times. This is normal and should be ignored.
Delivery of those frames, however, does not happen every 33 milliseconds. Most frames arrive 40+ milliseconds after the previous one, and every few frames one arrives almost immediately (frames 8 and 12, about 2 ms after their predecessor) so that delivery catches up with capture. This is typical for TCP streaming and generally not a problem, but a decoder needs to dejitter a bit to render it smoothly. UDP streaming typically shows better figures.
At the end the capture times suddenly increase to nearly 1 second between captures. This is due to the 'smart codec' deciding to lower the framerate as it found no content changes from frame to frame.
Frame # Duration Receive dt Capture dt ADT Size
======= ============ ========== ========== ========== ======
0000000 246 0.000 0.000 0.000 170619 IDR SPS PPS
0000001 256 3.586 10.001 -6.415 1158 1
0000002 261 5.178 5.179 -6.416 763 2
0000003 292 31.050 31.052 -6.418 705 3
0000004 325 44.192 33.344 4.430 723 4
0000005 359 43.157 33.344 14.243 663 5
0000006 392 42.374 33.312 23.305 758 6
0000007 425 43.232 33.344 33.193 793 7
0000008 459 2.245 33.333 2.105 586 8
0000009 492 40.085 33.345 8.845 746 9
0000010 525 43.183 33.322 18.705 757 10
0000011 559 43.842 33.333 29.215 586 11
0000012 592 2.176 33.345 -1.955 717 12
0000013 625 46.053 33.311 10.787 533 13
0000014 659 43.253 33.344 20.696 404 14
0000015 1625 939.397 966.505 -6.412 254 15
0000016 2624 998.657 998.657 -6.412 580 16
0000017 3623 999.223 999.227 -6.416 526 17
0000018 4623 1000.329 1000.022 -6.109 282 18
0000019 5623 1001.649 1000.011 -4.472 714 19
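The frame rate reasoning above can be written down as a small calculation. This is a sketch only: `estimate_fps` is a hypothetical helper, and the median is my own choice to keep start-up outliers from skewing the result.

```python
from statistics import median


def estimate_fps(capture_dts_ms):
    """Frame rate implied by the typical capture interval (in ms)."""
    return 1000.0 / median(capture_dts_ms)


# Capture dt of frames 4..14 from the listing above.
full_rate = [33.344, 33.344, 33.312, 33.344, 33.333, 33.345,
             33.322, 33.333, 33.345, 33.311, 33.344]
# Capture dt of frames 15..19, after the smart codec lowered the rate.
reduced_rate = [966.505, 998.657, 999.227, 1000.022, 1000.011]

print(round(estimate_fps(full_rate)))     # 30
print(round(estimate_fps(reduced_rate)))  # 1
```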
Example: delay around I-frame
Here at frame 0000032 (the 33rd frame) a delivery delay of approx. 80 milliseconds occurs. The frame is an I-frame, which carries a lot more data (270.1 kB) than the P-frames (4..10 kB). In some networks such delays around the larger I-frames are normal. Clearly, a dejitter delay of around 80 milliseconds must be applied to hide this jitter.
As a side note, you can also conclude that the frame rate is 25 fps (from the 40 millisecond interval) and that the GOP length is 32. It is also likely that no smart codec is active (the camera transmits at full frame rate with moderately sized P-frames).
0000024 1075 39.041 40.000 -138.291 4017 24
0000025 1115 57.750 40.000 -120.541 9766 25
0000026 1155 25.097 40.000 -135.444 3165 26
0000027 1195 44.228 40.000 -131.216 3662 27
0000028 1235 36.471 40.000 -134.745 4130 28
0000029 1275 41.057 40.000 -133.687 4614 29
0000030 1315 40.122 40.000 -133.565 5051 30
0000031 1355 39.411 40.000 -134.154 4634 31
0000032 1395 102.189 40.000 -71.965 270102 IDR
0000033 1435 2.797 40.000 -109.168 4792 1
0000034 1475 20.476 40.000 -128.692 3789 2
0000035 1515 39.372 40.011 -129.332 4737 3
0000036 1555 45.012 40.000 -124.319 3955 4
0000037 1595 49.517 40.000 -114.802 9410 5
0000038 1635 29.712 40.000 -125.090 3783 6
0000039 1675 41.281 40.000 -123.809 5282 7
0000040 1715 36.629 40.000 -127.181 4193 8
0000041 1755 36.623 40.000 -130.558 4377 9
0000042 1795 44.513 40.000 -126.045 26862 10
0000043 1835 38.624 40.000 -127.421 2835 11
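A rough lower bound for the dejitter delay can be read from the ADT column: the frame with the highest ADT arrived latest relative to the camera clock, the one with the lowest ADT arrived earliest, and a buffer has to bridge the difference. The sketch below assumes this peak-to-peak interpretation of ADT; `min_dejitter_ms` is a hypothetical helper, not part of stream_peeker.

```python
def min_dejitter_ms(adt_values):
    """Peak-to-peak ADT: the least buffering that hides the observed jitter."""
    return max(adt_values) - min(adt_values)


# ADT column of frames 24..43 from the listing above.
adt = [-138.291, -120.541, -135.444, -131.216, -134.745, -133.687,
       -133.565, -134.154, -71.965, -109.168, -128.692, -129.332,
       -124.319, -114.802, -125.090, -123.809, -127.181, -130.558,
       -126.045, -127.421]

print(round(min_dejitter_ms(adt), 3))  # 66.326
```

This covers only the frames in the window, so a real decoder would add some margin on top, which brings it in line with the delay mentioned in the text.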
Example: camera frame skip
Here is another camera, also streaming over TCP. At frames 0000073 and 0000083 you can see that the capture skips a frame: the capture delta is 80 milliseconds instead of 40. There can be two reasons for this:
- A frame rate is requested that can only be approximated given the fixed capture rate of 25 fps. The camera will drop frames here and there in order to generate more or less the requested rate. As you likely know which frame rate you are requesting, it is often easy to rule this out.
- The camera is generating video at near maximum capacity and occasionally skips a frame for this stream.
So when you see a sequence like this with capture irregularities, your next step should be to inspect the camera for the number of connected clients, system load and so on.
0000070 3381 42.293 39.989 13.295 181 8
0000071 3421 43.954 40.000 17.249 210 9
0000072 3461 44.701 40.000 21.951 220 10
0000073 3541 43.063 80.011 -14.998 219 11
0000074 3581 44.250 39.989 -10.736 305 12
0000075 3621 47.170 40.000 -3.566 228 13
0000076 3661 44.577 40.000 1.011 207 14
0000077 3701 43.532 40.011 4.532 247 15
0000078 3741 42.469 40.000 7.002 276 16
0000079 3781 43.087 39.989 10.100 283 17
0000080 3821 48.992 40.000 19.092 269 18
0000081 3861 43.759 40.011 22.840 266 19
0000082 3901 44.491 39.989 27.342 551 20
0000083 3981 43.088 80.000 -9.570 375 21
0000084 4021 44.426 40.011 -5.155 326 22
0000085 4061 44.044 40.000 -1.111 314 23
0000086 4101 44.029 39.989 2.929 390 24
0000087 4141 43.315 40.000 6.244 294 25
0000088 4181 50.739 40.000 16.984 336 26
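Spotting such skips by eye gets tedious in long runs. Below is a minimal sketch that flags them, assuming you know the nominal capture interval (40 ms here); `find_capture_skips` is a hypothetical helper.

```python
def find_capture_skips(frames, nominal_ms):
    """Return (frame number, skipped frame count) for oversized capture deltas.

    frames: iterable of (frame number, capture dt in ms) pairs.
    """
    skips = []
    for frame, capture_dt in frames:
        skipped = round(capture_dt / nominal_ms) - 1
        if skipped > 0:
            skips.append((frame, skipped))
    return skips


# (Frame #, Capture dt) pairs taken from the listing above.
sample = [(72, 40.000), (73, 80.011), (74, 39.989),
          (82, 39.989), (83, 80.000), (84, 40.011)]

print(find_capture_skips(sample, nominal_ms=40.0))  # [(73, 1), (83, 1)]
```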
Example: packet loss
Here is a short stream received over UDP, so there is no error correction. The real problem in this transmission was out-of-order packets, not packet loss. Packet reordering is not (yet) handled by stream_peeker, and the packet loss statistics are off as a consequence. Otherwise the output resembles packet loss so closely that it is fine as an example.
The striking thing is that the sequence starts with P-frame number 1 instead of an IDR frame. The listing shows 18 frames, while the summary states 19 frames received and one lost. The packet statistics are unfortunately unusable in this particular case.
Sometimes the problem is so persistent that every I-frame transmission is incomplete. You can recognize it by the GOP position number endlessly increasing.
As you can see the tool doesn't point you at the problem, but knowing enough about the video stream you can see it for yourself.
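Both symptoms (a stream that does not start on an IDR frame, and a GOP position that keeps increasing) can be checked mechanically from the last column of the output. The sketch below assumes that column holds either a tag starting with "IDR" or the position within the GOP, as in the listings on this page; the function names are hypothetical.

```python
def starts_on_idr(tags):
    """True when the first received frame is tagged as an IDR frame."""
    return bool(tags) and tags[0].startswith("IDR")


def longest_run_without_idr(tags):
    """Length of the longest stretch of consecutive non-IDR frames."""
    longest = run = 0
    for tag in tags:
        if tag.startswith("IDR"):
            run = 0
        else:
            run += 1
            longest = max(longest, run)
    return longest


healthy = ["IDR SPS PPS"] + [str(i) for i in range(1, 20)]
suspect = [str(i) for i in range(1, 19)]  # starts at P-frame 1, no IDR at all

print(starts_on_idr(healthy), longest_run_without_idr(healthy))  # True 19
print(starts_on_idr(suspect), longest_run_without_idr(suspect))  # False 18
```

Compare the longest run with the GOP length you expect from the camera: a run well beyond it suggests that I-frames are not arriving complete.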
Frame # Duration Receive dt Capture dt ADT Size
======= ============ ========== ========== ========== ======
0000000 242 0.000 0.000 0.000 758 1
0000001 254 11.647 11.648 -0.001 687 2
0000002 287 57.778 33.333 24.444 708 3
0000003 320 10.995 33.344 2.095 502 4
0000004 354 31.630 33.311 0.414 915 5
0000005 387 37.947 33.345 5.016 761 6
0000006 420 31.834 33.344 3.506 561 7
0000007 454 32.983 33.345 3.144 700 8
0000008 487 30.155 33.299 -0.000 797 9
0000009 520 32.966 32.967 -0.001 603 10
0000010 553 33.900 33.344 0.556 856 11
0000011 587 34.661 33.311 1.906 584 12
0000012 620 32.240 33.345 0.801 372 13
0000013 1620 1002.599 1000.011 3.389 228 14
0000014 2587 963.606 966.677 0.318 648 15
0000015 3587 999.738 1000.012 0.044 351 16
0000016 4587 1000.414 1000.022 0.436 520 17
0000017 5587 1000.284 999.978 0.742 380 18
Commencing shutdown of all sessions...
Packets Dropped Pct Frames Dropped Pct FPS Url
======== ======== ======= ======== ======== ======= ===== ===============
65677 65536 99.79% 19 1 5.26% 3.12 rtsp://192.168.0.90/axis-media/media.amp?audio=0&videocodec=h264
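The Dropped figure of 65536 equals 2^16, the size of the 16-bit RTP sequence number space. One plausible explanation is that a loss counter which treats every sequence gap as loss wraps around when a packet arrives late. This is an assumption on my part: it is not stream_peeker's actual code, and `naive_lost_count` is a hypothetical illustration only.

```python
SEQ_MODULO = 65536  # RTP sequence numbers are 16 bits wide


def naive_lost_count(seqs):
    """Count every gap between expected and received sequence number as loss.

    No reordering handling: a late packet looks like a nearly full wrap.
    """
    lost = 0
    expected = None
    for seq in seqs:
        if expected is not None:
            lost += (seq - expected) % SEQ_MODULO
        expected = (seq + 1) % SEQ_MODULO
    return lost


print(naive_lost_count([10, 11, 12, 13]))  # 0: in order, nothing lost
print(naive_lost_count([10, 12, 11, 13]))  # 65536: two packets swapped
```

In this sketch a single swapped pair of packets is enough to produce a count of exactly 65536, even though no packet was lost.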
BTW, this is the same camera and network as in the 'normal delivery' example. It is interesting to compare the delivery delta times with that example, the difference being TCP vs UDP. Delivery is still not really regular because the example was made over Wi-Fi. Wired networks typically show more stable timing.
Stream analysis with Wireshark
This section would be a book by itself, and a lot of ready-made material is available on the internet. I limit myself to my personal tips and tricks.
Packet list display
Display filters
I/O graph
Extract decodeable video from a trace
Unfortunately I never got this to work. It would be nice if this were easier. Wireshark can play most RTP audio out of the box, and as long as the RTSP negotiation is inside the trace it automatically decodes the payload packets correctly. But you can't get an .mp4 out of it.
There is this plugin:
https://github.com/volvet/h264extractor
and:
Maybe you will be more successful than I was.
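For reference, the core of such an extraction is turning RTP payloads back into an H.264 Annex B byte stream, as described in RFC 6184. The sketch below is not a tested extraction tool. It assumes you already have the RTP payloads of one video stream as byte strings in sequence order (getting them out of the trace is not shown), and it handles only single NAL units, STAP-A and FU-A packets. `depacketize_h264` is a hypothetical name.

```python
START_CODE = b"\x00\x00\x00\x01"


def depacketize_h264(payloads):
    """Convert ordered H.264 RTP payloads into an Annex B byte stream."""
    out = bytearray()
    fragment = None  # NAL unit being reassembled from FU-A packets
    for p in payloads:
        if not p:
            continue
        nal_type = p[0] & 0x1F
        if 1 <= nal_type <= 23:
            # Single NAL unit packet: the payload is the NAL unit itself.
            out += START_CODE + p
        elif nal_type == 24:
            # STAP-A: repeated [16-bit big-endian size][NAL unit] entries.
            pos = 1
            while pos + 2 <= len(p):
                size = int.from_bytes(p[pos:pos + 2], "big")
                pos += 2
                out += START_CODE + p[pos:pos + size]
                pos += size
        elif nal_type == 28 and len(p) >= 2:
            # FU-A: one NAL unit fragmented over several packets.
            is_start = p[1] & 0x80
            is_end = p[1] & 0x40
            if is_start:
                # Rebuild the NAL header from F/NRI bits and the real type.
                fragment = bytearray([(p[0] & 0xE0) | (p[1] & 0x1F)])
            if fragment is not None:
                fragment += p[2:]
                if is_end:
                    out += START_CODE + fragment
                    fragment = None
    return bytes(out)


# One IDR slice (type 5) split over two FU-A packets.
print(depacketize_h264([b"\x7c\x85\xaa", b"\x7c\x45\xbb"]).hex())
```

Keep in mind that a decoder also needs the SPS and PPS. Some cameras send these only in the SDP of the RTSP negotiation, in which case they are not present in the RTP payloads and have to be added separately.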