Multimedia technologies and CCTV.
Preface
Currently, more and more CCTV systems are becoming digital. It is not surprising that existing video processing technologies are widely used in them. How justified is this?
Specialists from Geutebrueck, a company that has been producing digital video recording devices for over 15 years, do not share the generally accepted opinion about the applicability of multimedia technologies for solving the problems facing security television systems. Dr. Matthias Döring explains Geutebrueck's position.
«We are frequently asked the following questions:
-What drives us to come up with our own solutions? Why have we been doing what we have been doing for over 10 years?
-Why can't we just use media and IT technologies to compress, store and transmit images like most of our competitors do?
-Why do we need our own MPEG4CCTV algorithm, and why is it not enough to just embed existing MPEG4 and H.264 chips into our products?
-Why don't we use standard databases like Oracle and MS SQL, why do we develop our own database server to store video archives?
-Why don't we use Multicast and WEB technologies for image transmission, but implement our own transmission protocol?
-Why did we «invent» our own GBF export format? Are AVI, FLV and MPEG formats not good enough?
-What makes CCTV systems so different from other applications, such as multimedia, to justify the expensive R&D we have been doing for the last 10-12 years?
-Will there one day be a «digital» standard for CCTV, like PAL/NTSC in the «ancient» analog days?
If you have ever been to England, you will be familiar with the problem of UK sockets not accepting the plugs of European-made appliances. Sometimes I have even managed to force one into the other, albeit slightly damaging both. And it works!
The main problem is how poorly the requirements of CCTV match the capabilities of multimedia technologies. Introducing multimedia technologies into CCTV products leads to unacceptable compromises. CCTV needs special approaches and solutions to meet its requirements.»
Dr. Döring gives nine arguments to prove his position.
Argument 1
CCTV and multimedia have little in common
Multimedia usually means television (broadcast, mobile, Internet), DVD, video conferencing, webcams and other applications that are well known to everyone.
CCTV is primarily surveillance: security television, control and analysis of technological processes, and verification of personnel access to protected areas.
In both Multimedia and CCTV, the source of information is a moving image obtained by a television camera. It can be accompanied by sound and text information. Specialists working in both fields solve the same problem — how to save and transmit this information to users.
But although the data sources are the same, the two technologies differ considerably in their purposes, and as a result the real systems are completely different.
Argument 2
The purposes of image processing are different
In multimedia, images themselves are the purpose of video processing, the final product.
The general and basic requirement of multimedia is that the image must be of the highest quality that can be obtained from the chosen platform. Of course, the concept of «highest quality» differs for different multimedia products: a multimedia image on a mobile phone screen cannot be compared with a home theater image. But in all cases, there remains a desire to improve it as much as technology allows.
Since images are the main product of multimedia, it is also understandable that there is a desire to preserve all images. Losing or removing some images degrades the final multimedia product. Would a film benefit from removing some frames?
In CCTV, the goal is completely different, and this is often forgotten. When we argue about technology, we lose sight of the true value behind it: our goal is security. Our goal is not the images themselves, nor is it to store all the images, but to extract the necessary information from them. We should be proud if we can get rid of unnecessary images, because they take up our resources and waste our money.
The quality of images used for security purposes should be only as high as necessary.
For example, suppose we have a video motion detector that triggers an alarm as soon as a moving object appears in the surveillance zone. There is no point in installing a high-quality TV camera that lets you see fine details, because this information will not be used to obtain the result.
Thus, in CCTV, images are secondary, and the main thing is to extract security information from them.
Moreover, we must avoid unnecessary images. Let's imagine a TV camera standing at the entrance to a parking lot. There is no point in leaving images of an empty road in the archive — they are of no value.
Argument 3
Multimedia and CCTV images differ in their characteristics
Let's consider two examples.
The first is frames from a movie.
The second is an image from a camera located in a parking lot. Nine out of ten cameras installed in CCTV systems show approximately the same image.
Let's look at the main differences:
Multimedia
high dynamics, many moving objects in the frame
frequent change of scenes and angles
rapid changes in illumination
CCTV
no movement or insignificant movement
fixed scene
slow changes (e.g. day-night, seasons)
In CCTV, usually nothing happens. In multimedia, by contrast, movement is the normal state.
In CCTV, movement and change are the exception. Analyzing the changes between frames is precisely the task of CCTV.
Argument 4
How large is the share of useful information?
The answer to this question in the case of multimedia is very simple – 100%. You cannot throw out a single frame from a movie without degrading its quality. You do not want to lose a single frame of the video conference you are conducting.
In CCTV, the situation is completely different. On average, about 90% of images are garbage: frames that are not needed to perform security tasks. The remaining 10% or so are «suspicious» images that require analysis or operator attention. Less than 1% of the total number of images are real «alarm» images.
Thus, the total share of useful information in CCTV video material is very small. The presence of «extra» images reduces its value in terms of security tasks.
Storing and transmitting unnecessary images also leads to unjustified costs. For example, if a 10 TB archive costs about 5,000 euros, storing absolutely all video data received from the CCTV cameras means that 90% of that sum (4,500 euros) is wasted.
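The arithmetic of this example can be checked with a few lines of Python. This is only an illustrative sketch: the function name wasted_storage_cost and its parameters are assumptions made for the example, while the figures are the ones quoted above.

```python
def wasted_storage_cost(archive_cost_eur: float, useless_share: float) -> float:
    """Cost of the archive space occupied by frames that nobody needs."""
    if not 0.0 <= useless_share <= 1.0:
        raise ValueError("useless_share must be between 0 and 1")
    return archive_cost_eur * useless_share


# Figures from the text: a 10 TB archive for about 5,000 euros, 90% unneeded frames.
wasted = wasted_storage_cost(archive_cost_eur=5000, useless_share=0.90)
print(f"Wasted: {wasted:.0f} EUR")
```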
Argument 5
Multimedia compression is close to the theoretical limit. The content filtering methods used in CCTV make it possible to go beyond it.
What are the possibilities for video data compression in multimedia?
A simple example: an image of a car park.
One hour of uncompressed video takes up 73 GB (at a resolution of 704×576 pixels, 8 bits per sample, 4:2:2 color encoding, 25 frames per second). If you compress this video using the M-JPEG algorithm, about 5-6 GB will remain. Using an algorithm with interframe compression, such as MPEG4 or H.264, brings this down to about 1 GB. In the first case, compression is achieved by removing the spatial redundancy within each image; in the second case, the temporal redundancy between frames is removed as well.
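The 73 GB figure follows directly from the stated parameters. The sketch below reproduces it, assuming decimal gigabytes and relying on the fact that 4:2:2 sampling with 8-bit samples averages 2 bytes per pixel (one luma sample per pixel plus two chroma samples shared by each pair of pixels). The function name is illustrative.

```python
def uncompressed_bytes_per_hour(width: int, height: int, fps: int,
                                bytes_per_pixel: int) -> int:
    """Raw size of one hour of video with no compression at all."""
    return width * height * bytes_per_pixel * fps * 3600


# 4:2:2 with 8-bit samples: Y for every pixel, Cb and Cr for every second pixel,
# which averages out to 2 bytes per pixel.
BYTES_PER_PIXEL_422_8BIT = 2

size = uncompressed_bytes_per_hour(704, 576, 25, BYTES_PER_PIXEL_422_8BIT)
print(f"{size / 1e9:.2f} GB per hour")
```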
Algorithms that remove such redundancy are standardized.
The compression ratio obtained in the second case, about 70, is good for multimedia but not enough for CCTV, especially if there are 200-300 cameras in the system. Even the best standard available today, H.264, might give 500 MB of compressed video per hour rather than 1 GB. But this is still too much.
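To see why this is still too much, it helps to scale the per-camera figure up to a whole system. The sketch below assumes continuous recording around the clock and decimal units (1 TB = 1000 GB); both are assumptions made for the illustration and are not stated in the text.

```python
def archive_growth_tb_per_day(cameras: int, gb_per_camera_hour: float) -> float:
    """Daily archive growth, assuming every camera records 24 hours a day."""
    return cameras * gb_per_camera_hour * 24 / 1000


for gb_per_hour in (1.0, 0.5):  # about 1 GB/h, or about 500 MB/h for H.264
    growth = archive_growth_tb_per_day(cameras=300, gb_per_camera_hour=gb_per_hour)
    print(f"{gb_per_hour} GB/h per camera -> {growth:.1f} TB per day")
```

Under these assumptions, 300 cameras would fill a 10 TB archive in roughly one and a half to three days.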
Let's go back to our parking lot example. For CCTV purposes, it is enough to see only which cars entered it. Analysis of the video segment selected for the example showed that if we select only the frames containing movement, 20 MB of useful information will remain. The resulting compression ratio seems exotic — 3500. But if we are only interested in people who entered the parking lot (i.e. suspicious situations), then only 2 MB will remain.
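The ratios in the parking lot example can be recomputed from the sizes given in the text. In this sketch the sizes are in decimal megabytes and the helper name is illustrative. The exact results differ slightly from the rounded figures quoted above, which start from a ratio of about 70 rather than 73.

```python
def compression_ratio(original_mb: float, compressed_mb: float) -> float:
    """How many times smaller the stored data is than the original."""
    return original_mb / compressed_mb


RAW_MB = 73_000  # one hour of uncompressed video, about 73 GB

ratios = {
    "interframe codec (about 1 GB)": compression_ratio(RAW_MB, 1000),
    "frames with movement (20 MB)": compression_ratio(RAW_MB, 20),
    "frames with people (2 MB)": compression_ratio(RAW_MB, 2),
}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:.0f}")
```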
The applied content filtering method allows us to achieve enormous compression, which is impossible in the case of multimedia. We do not simply remove spatial and temporal redundancy in video information, we also remove unimportant frames, and the effect of this is much higher.
Content filtering methods are not standardized, and each manufacturer develops its own.
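Since each manufacturer has its own method, the following is only a generic sketch of the idea and not Geutebrueck's algorithm. It uses simple frame differencing: a frame is kept only if enough pixels changed compared with the previous frame. The function names, the thresholds and the tiny 4×4 test images are all assumptions made for the illustration.

```python
from typing import Iterable, Iterator, List, Tuple

Frame = List[List[int]]  # grayscale image: rows of pixel values 0..255


def changed_fraction(a: Frame, b: Frame, pixel_threshold: int) -> float:
    """Fraction of pixels whose brightness differs by more than pixel_threshold."""
    total = 0
    changed = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += 1
            if abs(pa - pb) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def filter_motion(frames: Iterable[Frame], pixel_threshold: int = 20,
                  min_changed: float = 0.01) -> Iterator[Tuple[int, Frame]]:
    """Yield (index, frame) only for frames that differ from the previous frame."""
    previous = None
    for index, frame in enumerate(frames):
        if previous is not None and \
                changed_fraction(previous, frame, pixel_threshold) >= min_changed:
            yield index, frame
        previous = frame


def make_frame(bright_pixels: List[Tuple[int, int]]) -> Frame:
    """Build a 4x4 dark frame with the given (row, column) pixels lit."""
    frame = [[0] * 4 for _ in range(4)]
    for row, col in bright_pixels:
        frame[row][col] = 200
    return frame


empty = make_frame([])
car_left = make_frame([(1, 0), (1, 1)])
car_right = make_frame([(1, 2), (1, 3)])

sequence = [empty, empty, car_left, car_right, empty, empty]
kept = [index for index, _ in filter_motion(sequence)]
print(kept)  # [2, 3, 4]
```

Frames 0, 1 and 5 show an unchanged empty road and are dropped. Frame 4 is kept because it records the moment the car leaves the scene.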
Argument 6
Can multimedia formats be used in CCTV?
Widely used video formats (MPEG, etc.) have the following characteristics:
They record one channel
They are designed for navigation with accuracy up to a scene (fragment).
They have fixed compression parameters (pixel resolution, quality, frame rate, encoder type). For example, if you need to record video at a rate of 2 frames per second, many multimedia codecs do not support it.
They have limitations on the size of the video file (e.g. 1 GB for DVD).
They have poor capabilities for embedding additional information (metadata). Such data could, for example, be information about alarms.
What is required for CCTV?
Multi-channel recording with the ability to synchronously play back several video channels
Variable speed, resolution, quality, controlled according to the needs of the system
Precise positioning on each frame
Metadata indexing (i.e. the ability to search images by embedded metadata – alarm parameters, bank account number, barcode, etc.)
Authentication and protection of video data from unauthorized interference
Terabyte-range file sizes
Using standard multimedia format files for CCTV purposes is, at the very least, inconvenient.
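Two of the requirements listed above, precise positioning on each frame and metadata indexing, can be illustrated with a small in-memory index. This is a hypothetical sketch and not the GBF format or any real archive database: the names FrameRecord and ArchiveIndex, the millisecond timestamps and the metadata keys are all assumptions made for the example.

```python
import bisect
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class FrameRecord:
    timestamp_ms: int
    channel: int
    metadata: Dict[str, str] = field(default_factory=dict)


class ArchiveIndex:
    """Per-channel time index plus a lookup table for metadata values."""

    def __init__(self) -> None:
        self._stamps: Dict[int, List[int]] = {}
        self._records: Dict[int, List[FrameRecord]] = {}
        self._by_meta: Dict[Tuple[str, str], List[FrameRecord]] = {}

    def add(self, record: FrameRecord) -> None:
        # Assumes that frames of one channel are added in time order.
        self._stamps.setdefault(record.channel, []).append(record.timestamp_ms)
        self._records.setdefault(record.channel, []).append(record)
        for key, value in record.metadata.items():
            self._by_meta.setdefault((key, value), []).append(record)

    def frame_at(self, channel: int, timestamp_ms: int) -> Optional[FrameRecord]:
        """Latest frame of the channel at or before the given time, if any."""
        stamps = self._stamps.get(channel, [])
        pos = bisect.bisect_right(stamps, timestamp_ms)
        return self._records[channel][pos - 1] if pos else None

    def search(self, key: str, value: str) -> List[FrameRecord]:
        """All frames, on any channel, tagged with the given metadata value."""
        return list(self._by_meta.get((key, value), []))


index = ArchiveIndex()
index.add(FrameRecord(0, 1))
index.add(FrameRecord(500, 1, {"alarm": "door_open"}))  # channel 1 at 2 frames per second
index.add(FrameRecord(1000, 1))
index.add(FrameRecord(40, 2))                           # channel 2 at another rate

hit = index.frame_at(channel=1, timestamp_ms=700)
print(hit.timestamp_ms)                         # 500
print(len(index.search("alarm", "door_open")))  # 1
```

Because every channel keeps its own time index, channels may record at different and varying frame rates and can still be positioned on the same moment.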
Argument 7
Are Multimedia Players Suitable for CCTV?
Multimedia players are widely available. Let's imagine what would happen if we used one of these players for CCTV purposes.
A multimedia player typically has the following properties:
Plays only one channel at a time.
When fast-forwarding or rewinding a video, the movement is uneven. Rewinding typically works much worse.
No precise frame-by-frame positioning.
Specialized players for CCTV:
Play many channels simultaneously and synchronously
Have advanced navigation tools within the video archive, with uniform movement forward and backward at any speed and positioning on any frame.
Allow automatic changes to screen parameters, for example, in the event of an alarm
It should be noted that the wide capabilities of specialized players for CCTV are due precisely to the refusal to use standard multimedia formats.
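Synchronous playback of several channels, forward or backward, essentially means merging per-channel frame sequences by timestamp. The sketch below shows the idea with Python's standard heapq.merge. The tuple layout and the function name are assumptions made for the illustration and do not describe any particular player.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

TimedFrame = Tuple[int, int, str]  # (timestamp_ms, channel, payload)


def synchronized_playback(channels: List[Iterable[TimedFrame]],
                          reverse: bool = False) -> Iterator[TimedFrame]:
    """Merge time-ordered channels into one stream ordered by timestamp."""
    if reverse:
        # For backward playback each channel is read from its newest frame.
        streams = [reversed(list(channel)) for channel in channels]
    else:
        streams = [iter(channel) for channel in channels]
    return heapq.merge(*streams, key=lambda frame: frame[0], reverse=reverse)


cam1 = [(0, 1, "a0"), (400, 1, "a1"), (800, 1, "a2")]
cam2 = [(100, 2, "b0"), (500, 2, "b1")]

forward = [frame[0] for frame in synchronized_playback([cam1, cam2])]
backward = [frame[0] for frame in synchronized_playback([cam1, cam2], reverse=True)]
print(forward)   # [0, 100, 400, 500, 800]
print(backward)  # [800, 500, 400, 100, 0]
```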
Argument 8
Multimedia is usually late
I would like to draw attention to the time aspect.
Digital image processing takes time. Optimization of compression algorithms leads to increased delays in encoding and decoding information. These delays cannot be eliminated by simply increasing computing power; they occur due to the specific structure of the transmitted frames.
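Two well-known structural sources of delay in interframe codecs illustrate this point. A B-frame refers to a later frame, so it cannot be encoded until that frame has been captured, and a decoder that joins a stream must wait for the next I-frame before it can show anything. The sketch below estimates both delays. The function names and the example settings (two consecutive B-frames, an I-frame every 50 frames) are assumptions and are not taken from the article.

```python
def reorder_delay_ms(consecutive_b_frames: int, fps: float) -> float:
    """Minimum encoder wait caused by B-frames.

    Each B-frame depends on a later reference frame, so the encoder has to
    wait for that many extra frame intervals, however fast the hardware is.
    """
    return consecutive_b_frames * 1000.0 / fps


def worst_case_switch_delay_ms(gop_length: int, fps: float) -> float:
    """Upper bound on the wait for the next I-frame after switching channels."""
    return gop_length * 1000.0 / fps


print(reorder_delay_ms(consecutive_b_frames=2, fps=25))   # 80.0
print(worst_case_switch_delay_ms(gop_length=50, fps=25))  # 2000.0
```

Neither delay depends on computing power: both are fixed by the frame structure and the frame rate.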
How critical are the delays associated with the use of a particular video processing technology?
Multimedia applications are usually characterized by the following time parameters:
Acceptable delay in transmitting a video stream is several seconds
Acceptable delay in changing images (switching video channels) is several seconds
Rare and slow changes in compression parameters over time.
For CCTV applications, the situation is completely the opposite. Even a delay of a fraction of a second significantly impairs the system's capabilities:
When transmitting «live video», the delay prevents normal control of high-speed PTZ cameras
When alarm frames are delayed, their relevance is lost
Delays in changing scenes and switching channels slow down the operator's work
Compression parameters must be quickly changed depending on the situation (for example, when alarms occur).
It can be said that video information delays in the CCTV system negate its useful properties and deprive it of the right to exist!
Argument 9
Information distribution models in multimedia and CCTV are different
Information distribution models in the multimedia environment and CCTV have an obvious difference.
In the first case, information from one or several sources of information is transmitted to a huge number of its consumers. The term «replication» is appropriate here. Consumers of information cannot, as a rule, influence the source of information in any way.
In the second case, the number of sources of video information is large and far exceeds the number of its consumers. This can be called «information collection». Consumers in the CCTV system not only receive information, but also actively influence the system — change the display configuration, confirm alarms, control PTZ cameras.
The differences between both models of information distribution are presented in the table:
System parameter | Multimedia | CCTV
Number of video channels | Small | Significant
Number of information consumers | Very large | Several
Number of video channels transmitted to one consumer at any given time | One | Many
Interactivity requirements | Low | High
Acceptable delays in information transmission | Large | Very small
Requirement for uninterrupted operation | Absent | Very important
As we can see, these characteristics have nothing in common!
The existing standards were created for multimedia purposes. It is impossible to use them in CCTV without radical reworking. Cosmetic interventions will not help here.
The desire to reduce development costs is not a strong enough argument for agreeing to use inadequate technology. However, many leading manufacturers do so. Is it really possible to imagine a TV set, a remote control, a camcorder and a mobile phone camera as a full-fledged replacement for a professional CCTV system?
The paradox is that most multimedia technologies (compression, transmission and storage) are not merely suboptimal for CCTV but are themselves a source of internal problems, since they are based on principles that do not apply to CCTV. Everything is turned upside down. We have standardized the capabilities of multimedia technology and are trying to adapt them to our needs. Isn't it better to develop a standard that meets CCTV requirements?
To the question raised at the beginning of the article, whether a «digital» standard for CCTV will ever be created, the answer may be as follows:
As long as the standards ignore the basic requirements of CCTV, either there is no visible prospect of creating a comprehensive standard, or the result will be determined by the in-house developments of individual companies.