Video Streaming Over 4G
|Example of a matrix view of thumbnails on a VMS|
There is a huge difference between transporting 16 streams at full frame rate and full resolution and sending thumbnail views of camera streams. Let me explain.
A typical H.264 IP camera with a 1.3 megapixel resolution compresses video to an average frame size of about 25.6KB. At 30fps, that is 768KBps, or 6144Kbps, or roughly 6.1Mbps. This assumes some motion in the scene, but it is a general ballpark. By turning up compression, you can get this down to around 2.5Mbps at the cost of image quality (artifacts).
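The arithmetic above is easy to reproduce. The short sketch below is only an illustration; the function name is made up for this example, and it uses the same decimal units as the figures in the text (1 Mbps = 1000 Kbps).

```python
def stream_bitrate_mbps(frame_kb: float, fps: float) -> float:
    """Approximate bitrate in Mbps from average compressed frame size (KB) and frame rate."""
    kilobytes_per_second = frame_kb * fps          # 25.6 KB * 30 fps = 768 KBps
    kilobits_per_second = kilobytes_per_second * 8  # 768 KBps * 8 = 6144 Kbps
    return kilobits_per_second / 1000               # 6144 Kbps = 6.144 Mbps


rate = stream_bitrate_mbps(25.6, 30)
print(f"{rate:.1f} Mbps")
```

Plugging in a smaller frame size or a lower frame rate shows how quickly the required bandwidth drops.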
In a typical security configuration the cameras send a primary stream over a high speed local area network to the VMS server without concern for bandwidth, recording at best quality and frame rate. Sending a 2.5 Mbps, or even a 6Mbps stream, between the camera and the server is not usually an issue. However, sending from the server out to a remote client such as a smartphone over a narrow cellular link can be an issue.
|Example of how HauteSpot MVE Connects|
If you tried to send even one H.264 1.3 megapixel stream at full resolution, full frame rate and full image quality over that link, the video would have little chance of getting through in real time. You would need to buffer for a long time, as YouTube, Hulu and similar services do. Remember that most broadcast streaming is not real time, and those services use multipass encoding techniques to improve the quality and reduce the size of the streams they send.
Further, a 1080p monitor can display 1920x1080 pixels or approximately 2 megapixels at any given time. So sending any more data than this from the server to your client is really useless.
So what most VMS systems do is set up a second stream from the camera at a lower resolution for live viewing. They still record the primary stream at full resolution. When you connect to the VMS from your cell phone, you first get a thumbnail of each camera using the small substream. Then, when you click on a camera to look at it in detail, the VMS shifts from the small secondary stream to the primary full resolution stream. Even this stream may be scaled and transcoded on the server to fit the screen size of your device.
So what you are really getting from the VMS on the smartphone client is a transcoded rendering of your cameras at 2 megapixels or less. When you are looking at 16 cameras, you are not receiving all 16 streams at full resolution. You are receiving either a set of small substreams or a single transcoded stream that represents all of the substreams in 2 megapixels. This may then be compressed further to fit into a 1 Mbps pipe. Think of it as an "image of your streams".
Many VMS products do this, and the technique has existed for years. It is a commonly known method that is well documented in the public domain.
What is missing from most vendors' implementations is a set of critical elements that the HauteSpot MVE system addresses:
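To put numbers on the thumbnail view, here is a small sketch of how much of a 1080p screen each camera gets. The near-square grid layout and the function name are assumptions made for illustration; a real VMS may lay out its matrix differently.

```python
import math


def tile_size(display_w: int, display_h: int, cameras: int) -> tuple[int, int]:
    """Width and height in pixels of each tile when cameras share one display.

    Assumes a near-square grid (for example 4x4 for 16 cameras).
    """
    cols = math.ceil(math.sqrt(cameras))
    rows = math.ceil(cameras / cols)
    return display_w // cols, display_h // rows


w, h = tile_size(1920, 1080, 16)
print(f"Each of 16 cameras gets {w}x{h} pixels ({w * h / 1e6:.2f} megapixels)")
```

With 16 cameras on a 1920x1080 display, each tile is 480x270 pixels, so a full 1.3 megapixel stream per camera would carry about ten times more pixels than the tile can show.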
- Inbound Mobile Ingest of Cameras - It is all well and good to be able to view your cameras remotely over 4G, but connecting your cameras to the VMS over cellular is a different story. This requires uplink speed, which rarely exceeds 500kbps and can be extremely variable. MVE dynamically adjusts the stream rate to fit the available uplink bandwidth. It is not preset to a low "live view" rate and size. If you have good bandwidth, you will get a better image.
- Persistence - Most VMS have no ability to deal with the constant disconnects of cellular. Roaming between cell towers, changing IP addresses, and lost signal all contribute to loss of connection. MVE makes and keeps persistent connections that have been proven to work on cellular networks in the worst conditions (Hurricane Irene, Tropical Storm Sandy, floods and hot weather).
- Remote Record with Transfer - The HauteSpot microNVR allows remote recording at full frame rate and resolution at the camera source. It then provides a persistent connection to the MVE server and the dynamic rate adaptation which adjusts to changing available bandwidth. It also allows remote, chain of evidence transfer of the high resolution video from the remote site in batch.
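The idea of fitting the stream to the available uplink can be sketched in a few lines. This is a generic illustration of rate adaptation, not HauteSpot's actual algorithm; the bitrate ladder, the headroom factor and the function name are all hypothetical.

```python
# Hypothetical bitrate ladder in kbps. These values are illustrative only
# and are not taken from MVE or any other product.
LADDER_KBPS = [150, 300, 500, 1000, 2500, 6000]


def pick_bitrate(measured_uplink_kbps: float, headroom: float = 0.8) -> int:
    """Pick the highest ladder rate that fits within a fraction of the measured uplink.

    Falls back to the lowest rate when even that does not fit.
    """
    budget = measured_uplink_kbps * headroom
    candidates = [rate for rate in LADDER_KBPS if rate <= budget]
    return max(candidates) if candidates else LADDER_KBPS[0]


for uplink in (100, 500, 4000):
    print(f"uplink {uplink} kbps -> stream at {pick_bitrate(uplink)} kbps")
```

Leaving some headroom below the measured rate is a common design choice on variable links, since a stream that exactly fills the pipe stalls as soon as the bandwidth dips.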
VMS IP Video Decode Capacity Vs Record Capacity
As we walked the show we asked a simple question of a number of VMS vendors: "How many 1080p 30fps live streams can you simultaneously display (assuming that they had a multiple monitor video card on their system)?" The answers ranged from 4 to 64. And the answers were coming from field application engineers, not sales people.
A VMS provides a number of functions, including recording video streams to disk, displaying live and recorded streams, providing search capabilities, etc. The answers that we got indicated that the vendors really did not understand the difference between storing video and decoding video for display.
Again, IP VMS systems receive incoming video streams from cameras. These are typically either H.264 or MJPEG streams that have a fixed frame rate, resolution and compression profile as set on the camera. Most VMS systems will receive the video stream and write it to disk. This function does not require any significant CPU or GPU resources, since it is really just network IO. The number of streams the system can record is limited only by the bandwidth of the network. Disk IO is generally faster than network IO.
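As a rough illustration of that network limit, the sketch below estimates how many streams fit on a link. The gigabit link, the 70 percent usable share and the function name are assumptions for this example; the per-stream rates are the 6144Kbps and 2.5Mbps figures quoted earlier.

```python
def max_recordable_streams(link_kbps: int, stream_kbps: int, usable_pct: int = 70) -> int:
    """Number of streams that fit in the usable share of a network link.

    Integer kbps values are used to avoid floating point rounding.
    """
    usable_kbps = link_kbps * usable_pct // 100
    return usable_kbps // stream_kbps


GIGABIT_KBPS = 1_000_000  # assumed server NIC speed, not from the original post

print(max_recordable_streams(GIGABIT_KBPS, 6144))  # full quality streams
print(max_recordable_streams(GIGABIT_KBPS, 2500))  # more heavily compressed streams
```

Under these assumptions a single gigabit interface can carry on the order of a hundred full quality streams for recording, which is far more than the same server could decode for display.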
|Tom's Hardware Benchmark for H.264 encoding|
The chart on the left shows just how much time it takes to encode 1080i video using hardware and software. This is an example of when you are reading a raw source and sending it to a client to view. Performance is similar for decoding. As you can see, using hardware such as the Intel Quick Sync technology built into Ivy Bridge based systems can significantly improve performance. Without Quick Sync, software encoding/decoding consumes approximately 25-50% of a quad core CPU for just one 1080p 30fps stream. Using high end video cards helps some, but again, they are targeted at single or maybe dual display systems, so they really don't try to decode more than what they are capable of displaying.
If you were to use the hardware acceleration of Quick Sync, then you could get 120 fps of 1080p, which is the equivalent of four 30fps streams.
So the vendors that said they are able to simultaneously decode 12 or 64 streams are clearly not correct.
If you are writing a file to disk or reading a file from disk to and from the network, then yes, you can support many cameras. If you are trying to actually display the video, then the capacity is much much lower.
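The decode figures quoted above translate into a simple estimate of how many 1080p 30fps streams one machine can display. The function names are made up for this sketch, and the calculation ignores everything else the CPU has to do, so treat the results as upper bounds.

```python
def hw_decode_streams(decoder_fps: int, stream_fps: int) -> int:
    """Streams a hardware decoder can handle given its total frames per second."""
    return decoder_fps // stream_fps


def sw_decode_streams(cpu_pct_per_stream: int) -> int:
    """Streams a CPU can decode in software if each one costs a fixed CPU percentage."""
    return 100 // cpu_pct_per_stream


print("Quick Sync at 120 fps:", hw_decode_streams(120, 30), "streams")
print("Software at 25% CPU each:", sw_decode_streams(25), "streams")
print("Software at 50% CPU each:", sw_decode_streams(50), "streams")
```

Either way the estimate lands between two and four simultaneous streams per system, nowhere near 64.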
It is interesting that only a couple of products like Network Optix HD Witness or HauteSpot's MVE system take advantage of Intel Quick Sync hardware acceleration in order to increase performance and are designed for high definition video processing.
This is actually a good thing for companies like RGB Spectrum who make display wall processors that allow you to aggregate multiple server video outputs into a single consolidated display. Solutions like their QuadView HDx allow you to consolidate multiple video sources into one screen, so even if you can only get a single 1080p stream out of your VMS, you can combine it with other servers to create faster, full frame rate, full resolution systems. Watch for more on high resolution display, particularly 4K, when I review NAB 2013 in my next blog.