Video Analytics: Myths and Real Possibilities
The rapid development of the digital video surveillance market is driving the spread of intelligent video analysis systems. At the moment, much of what concerns the real capabilities of video analytics remains obscure, and there are far more questions than answers. Largely thanks to marketing materials, customer expectations exceed the real capabilities of intelligent video analysis systems. In this article I will try to clarify the situation a little.
By now it has become clear to most specialists that video analytics is a very promising direction; it is the future. There are plenty of reasons for developing such systems. Studies show that after 12 minutes of continuous monitoring an operator begins to miss up to 45% of events, and after 22 minutes up to 95% of potentially alarming events are missed. There is nothing surprising in these results, because images from security cameras are extremely tedious for a person to watch. So there is practically no doubt about the value of video analytics. On the other hand, a question arises: what is the real functionality of such systems? What are they actually capable of, and what are they not?
There is a list of classic tasks that video analytics handles successfully, as confirmed by practical results. The most popular of them are:
— license plate recognition;
— face recognition;
— event detection (line crossing, entering or leaving an area, an object left behind or removed, etc.).
I would also like to list several tasks that video analytics cannot handle today:
— alarm detection in extremely poor visibility;
— detection of hidden weapons;
— detection of "suspicious behavior";
— etc.
The effectiveness of solving each of the above tasks depends significantly on many factors.
As for license plate recognition, a great deal of work has been done in this area, and plenty of information is available. I will not dwell on this issue in detail here; I will only note that there are detailed recommendations for installing the camera and choosing the lens, the minimum number of pixels a plate should occupy in the frame is known, and so on. Such systems are quite widespread in practice, and many Russian and foreign developers successfully offer their own license plate recognition solutions.
Face recognition is, of course, a topic for a separate conversation. In my opinion, of all the systems mentioned these are the most temperamental. The requirements for illumination, for the size and position of the face in the frame, and for the quality of the face database are very strict, so using such systems in practice is very difficult. However, this does not mean they are useless or unpromising. Video analytics systems are developing very quickly, and the demand for face recognition is hard to overstate.
Fig. 1. Myths about the possibilities of video analytics
Fig. 2. Typical video analytics tasks: top left – following a route; top right – crossing a line in a given direction; bottom left – tracking the direction of movement in a crowd; bottom right – an intermediate result of processing the image on the left
There is currently a good deal of debate about whether video analytics should be centralized or should run directly in the camera itself. The advantages and disadvantages of each option are obvious. On the one hand, server processors are considerably more powerful than the processors inside IP video cameras, so developing video analytics software that runs centrally on a server is simpler and its capabilities are much wider.
Implementing analytics at the camera level has a number of advantages:
— The ability to work with a “live”, uncompressed image read straight from the sensor.
— No single point of failure (if one camera fails, the others continue to work, whereas if the server processing video from a group of cameras stops, video analysis stops for the entire group).
— The ability to reduce network traffic and build distributed systems.
Server software companies often advocate centralized video analytics, but the advantages of analytics running at the camera level are undeniable. As IP cameras advance technologically, their analytical capabilities keep expanding. Today it is already possible to build an intelligent video surveillance system with no computer at all, in which all the analytical work is done by "smart" cameras.
Fig. 3. Configuring the color of the detected object
In what follows, when speaking about video analytics, I will mean video analysis that runs directly in the camera. One example of such an implementation is IVA (Intelligent Video Analytics), developed by Bosch Security Systems.
Now let us look at event detection in more detail. I would like to emphasize that we are talking specifically about intelligent video analysis, not about a simple motion detector.
Fig. 4. IVA Lens Calculator
The range of events – the tasks of the intelligent detector – is constantly expanding. Until fairly recently the possibilities were limited to motion detection in selected areas of the image, with filtering by object size; today the list of tasks that video analysis can solve is considerably wider.
The events mentioned above and illustrated in Fig. 2 are far from a complete list of what video analytics systems can detect, and the range of practical applications is extremely wide: perimeter security, traffic control, transport security, safe-city systems and so on. Thanks to the ability to build complex alarm events out of combinations of simple ones, the possibilities open to designers and installers are virtually limitless.
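To make the idea of an event detector more concrete, here is a small sketch in Python of a line-crossing detector of the kind mentioned above. It is purely an illustration, not the IVA algorithm: the background-subtraction tracker, the virtual line coordinates, the input file name and the thresholds are my own simplified assumptions.

```python
# Minimal line-crossing detector sketch (not the IVA algorithm):
# background subtraction -> largest moving blob -> centroid ->
# alarm when the centroid crosses a virtual line, with direction.
import cv2
import numpy as np

LINE_A, LINE_B = (100, 400), (540, 400)   # virtual line, image coordinates (assumed)
MIN_AREA = 500                            # ignore blobs smaller than this, in pixels

def side_of_line(p, a=LINE_A, b=LINE_B):
    """+1 on one side of the line, -1 on the other, 0 exactly on it."""
    return np.sign((b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]))

cap = cv2.VideoCapture("traffic.avi")     # hypothetical input file
bg = cv2.createBackgroundSubtractorMOG2()
prev_side = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = bg.apply(frame)
    # OpenCV 4.x: findContours returns (contours, hierarchy)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    blobs = [c for c in contours if cv2.contourArea(c) > MIN_AREA]
    if not blobs:
        continue
    x, y, w, h = cv2.boundingRect(max(blobs, key=cv2.contourArea))
    centroid = (x + w // 2, y + h // 2)
    side = side_of_line(centroid)
    if side != 0:
        # A sign flip means the centroid has crossed the virtual line;
        # the order of signs (+1 -> -1 or -1 -> +1) gives the direction.
        if prev_side is not None and side == -prev_side:
            print("ALARM: line crossed, direction", prev_side, "->", side)
        prev_side = side
```

A real system would, of course, track multiple objects and suppress noise far more carefully; the point here is only how simple the underlying geometric test of an "event" can be.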
It should also be noted that event filtering capabilities have expanded significantly. In addition to the usual filtering by geometric features (size, area, aspect ratio), it is possible to filter by direction, speed of movement, the presence of a head, and the color of the detected object. For example, the IVA intelligent video analysis system can detect an object by color, where the target color may be described by a combination of up to 5 different colors!
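Color filtering can be sketched in a similarly simple way. The fragment below only illustrates the general idea of matching a detected object against a small reference palette in HSV space; the palette values, tolerances and matching rule are assumptions of mine and do not reflect how IVA actually implements its color filter.

```python
# Sketch of color filtering for a detected object: check whether enough of the
# object's pixels fall near one of up to 5 reference colors (assumed values).
import cv2
import numpy as np

# Reference palette in HSV (OpenCV hue range is 0-179), up to 5 entries.
PALETTE_HSV = [(0, 200, 200), (110, 200, 150)]   # e.g. "red" and "blue", assumed
HUE_TOL, SAT_TOL, VAL_TOL = 10, 80, 80           # matching tolerances (assumed)

def object_matches_palette(object_bgr, min_fraction=0.3):
    """True if at least min_fraction of the object's pixels are close to
    one of the palette colors (hue wrap-around near red is ignored here)."""
    hsv = cv2.cvtColor(object_bgr, cv2.COLOR_BGR2HSV)
    total = hsv.shape[0] * hsv.shape[1]
    matched = np.zeros(hsv.shape[:2], dtype=bool)
    for h, s, v in PALETTE_HSV:
        lower = np.array([max(h - HUE_TOL, 0), max(s - SAT_TOL, 0), max(v - VAL_TOL, 0)])
        upper = np.array([min(h + HUE_TOL, 179), min(s + SAT_TOL, 255), min(v + VAL_TOL, 255)])
        matched |= cv2.inRange(hsv, lower, upper).astype(bool)
    return matched.sum() / total >= min_fraction

# Usage: crop the bounding box of a detected object and test it, e.g.
# if object_matches_palette(frame[y:y+h, x:x+w]): raise the alarm.
```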
It is important to understand that both the probability of correctly detecting alarm events and the rate of false alarms depend significantly on a number of factors: the choice of the optimal camera location, the correct selection of optics, the quality of the video signal (if nothing can be seen visually, nothing can be detected either), the relative size of the object in the frame, and so on. Thus, correct selection of equipment (camera and lens), a well-chosen installation location and the use of illumination (IR or visible light) when surveillance is carried out in difficult lighting conditions make it possible to maximize the probability of correct detection of alarm events and minimize the frequency of false alarms. I would like to emphasize, however, that 100% probability of detection with 0% false alarms is impossible to achieve.
As mentioned earlier, the probability of detection depends on a number of factors.
Bosch Security Systems has developed the IVA Lens Calculator – a tool that allows designers and installers to choose the optimal camera installation location, select a lens and estimate the probability of detection (POD).
The final POD value depends on the DCRI parameters – Detection, Classification, Recognition and Identification – a classification from the terminology of the US Army (US Army Night Vision Lab, John Johnson). In turn, the DCRI level depends on VSH (Vertical Screen Height) – the height of the object expressed as a percentage of the frame height. In other words, depending on how tall the object appears in the frame, it can be classified in DCRI terms: it can be determined whether it will merely be detected, or also classified, recognized or identified. POD, the probability of detection, is then estimated on this basis. The level of background clutter also matters: when choosing a camera location, it is advisable to minimize, as far as possible, the area of space that falls into the frame behind the detection object. One of the most important conditions for correct operation of video analytics is accurate calibration of the IP camera.
The video analytics system must be told as accurately as possible the height at which the camera is mounted, the focal length of the lens, the camera's tilt angle, and so on. Since video analytics operates only on pixels, the system has to know whether 100 horizontal pixels correspond to one meter or to five.
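To give a rough feel for what such calibration means in numbers, the short calculation below uses the ordinary pinhole-camera model to estimate how many pixels a person occupies at a given distance, and hence the VSH value discussed above. All numeric parameters are illustrative assumptions and are not taken from the IVA Lens Calculator.

```python
# Pinhole-model sketch: how tall (in pixels) is an object at a given distance,
# and what VSH (vertical screen height, % of frame height) does that give?
# All numeric parameters below are illustrative assumptions.

SENSOR_HEIGHT_MM = 3.6      # roughly the vertical size of a 1/3" sensor
FRAME_HEIGHT_PX  = 1080     # vertical resolution
FOCAL_LENGTH_MM  = 8.0      # lens focal length
OBJECT_HEIGHT_M  = 1.8      # a person
DISTANCE_M       = 30.0     # distance from camera to object

# Image height of the object on the sensor (pinhole / thin-lens approximation).
image_height_mm  = FOCAL_LENGTH_MM * OBJECT_HEIGHT_M / DISTANCE_M
object_height_px = image_height_mm / SENSOR_HEIGHT_MM * FRAME_HEIGHT_PX
vsh_percent      = object_height_px / FRAME_HEIGHT_PX * 100

print(f"Object height: {object_height_px:.0f} px, VSH = {vsh_percent:.1f}% of frame height")
# With these assumed numbers: about 144 px, i.e. VSH of roughly 13%. That is
# clearly enough to detect a person, but whether it is enough to recognize or
# identify one is exactly the question a tool like the IVA Lens Calculator answers.
```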
Using a tool such as the IVA Lens Calculator allows you not only to optimally select equipment and determine the location of the video camera, but also to assess the probability of detection in a given part of the space.
Another important function of video analytics is intelligent archive search. While the video analytics system is running, metadata – special service data describing potentially alarming events in the camera's field of view, motion contours, trajectories and so on – is continuously sent to storage along with the video stream. The metadata is synchronized with the video archive and saves a great deal of time when searching it. A search can be performed by event, with the criteria set in the same way as when configuring the real-time detector.
Let's look at an example:
Suppose a video camera monitors the traffic situation and is configured for real-time alarm detection: an alarm event is any car crossing a virtual line drawn over the real road marking that separates oncoming traffic flows. Information is received that a stolen car drove along this stretch of road; the approximate time of passage and, say, the color of the car are known. Since the car did not violate traffic rules and did not cross the solid marking line, its appearance in the frame was not an alarm event. However, because the video analysis function was active and metadata was recorded in the archive, it is possible to quickly find all appearances of cars of the given color on this stretch of road, even though that color was never specified as a detection criterion in advance.
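A forensic search of this kind boils down to filtering stored metadata records rather than decoding and re-watching the video. The sketch below shows the general idea with a made-up record format (timestamp, bounding box, dominant-color label, speed); it is not the actual metadata schema of IVA or of any particular recorder.

```python
# Sketch of an archive search over recorded metadata (made-up record format).
# Each record describes one detected object in one frame; the video itself
# is never decoded during the search.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class MetaRecord:
    timestamp: datetime
    bbox: tuple           # (x, y, w, h) in pixels
    dominant_color: str   # e.g. "red", "blue" – label assigned at recording time
    speed_px_s: float     # apparent speed in pixels per second

def find_objects(records, color, t_from, t_to):
    """Return all metadata records in [t_from, t_to] whose dominant color matches."""
    return [r for r in records
            if t_from <= r.timestamp <= t_to and r.dominant_color == color]

# Usage: look for red cars around the reported time of the theft.
# hits = find_objects(archive_metadata, "red",
#                     datetime(2010, 5, 14, 13, 0), datetime(2010, 5, 14, 14, 0))
# Each hit points to a timestamp in the synchronized video archive.
```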
Viewing archived video is no less tedious than monitoring live security footage, and the probability that an operator will miss one event or another while viewing is quite high. In Russia today there is no such profession as a qualified video surveillance operator; it is not taught anywhere, and in most cases the role is assigned to security guards. Naturally, different people's ability to monitor the situation on monitor screens effectively varies greatly. In my opinion, building systems around video analytics significantly increases the efficiency of the video surveillance system as a whole.
In conclusion, I would like to say that, in my opinion, video analytics is a powerful tool in the hands of a video surveillance system operator. Although the level of automation in security systems can be very high, video analytics today is neither capable of nor intended for completely replacing the operator. It is hard for a person to continuously monitor hundreds of cameras, but only a person can quickly make the right decision!
Source: Security Algorithm magazine, No. 5, 2010