The future is becoming reality.
The origins of analytics
Video image analysis was first discussed in 1990, long before the tragic events of September 11, 2001. «It was the domain of various government-funded research institutes,» says Dieter Kondek, president and CEO of Agent Video Intelligence, or Agent Vi. «Computer vision, as analytics was originally called, gradually evolved into research aimed at creating artificial intelligence.»
The first analytics products on the market were motion detectors developed to order for the Ministry of Defense. They required significant hardware resources, which made them inefficient. Consumers and manufacturers, in turn, considered such video analytics complex and unreliable because of the large number of false alarms. «Even the most innocent movement, such as the flight of a bird, could trigger an alarm in the system,» notes Kondek. «These outdated CCTV systems could not run the more advanced algorithms that exist today, for example, those that detect abandoned objects, intrusion into a protected area, a stopped car, etc.» The turning point was the events of September 11, which took everyone by surprise. From that moment it became clear how much society needed intelligent video systems to monitor crowded places and airports, to restrict access to protected areas, etc. At that time, no solution on the market could handle a task of this kind, which required simultaneous processing of information from thousands of cameras. A large number of threats and terrorist attacks in schools in the US and around the world also pushed manufacturers to start working in this direction. Fortunately, by that time technology development in the security industry had begun to shift towards digital systems, which gave manufacturers the opportunity to focus their efforts on creating intelligent systems.
Intelligent capabilities of systems today
Today, the ability to analyze video images has become an integral part of virtually every system and solution. The task of an analytical system is to radically reduce false alarms caused, for example, by weather phenomena or changes in illumination, while recognizing human movement, images and events, human faces, license plates, etc. as accurately as possible. For example, Bosch Security Systems uses background analysis that adapts to changes in the background, which helps to avoid false alarms from, say, swaying trees or moving clouds. In this case, an alarm is raised only in a genuinely dangerous situation, the parameters of which are set by the user. Solutions are also used that can measure objects using a three-dimensional coordinate system or by calibrating the image from the camera. This is done through a graphical user interface that lets you superimpose a 3D grid on the image and distinguish objects, taking into account their relative sizes in perspective and their movement vectors. Many companies are developing analysis of human behavior that highlights unusual and dangerous actions. However, experts believe that manufacturers and suppliers should clearly explain to consumers what to expect from the intelligent capabilities of systems and which user wishes cannot be satisfied. «There is a huge gap between what people believe and expect from intelligent systems and their real capabilities,» says Andras Sudy, Marketing Director at Intello. «After reading and hearing a lot of different information about the intelligence of systems, consumers make unreasonably high demands on them, which can be met, if at all, only through complex setup and configuration of the systems.»
This deep understanding of the customer’s needs was part of Intello’s strategy to reduce false alarms. The company has a traffic monitoring solution that includes several analytical algorithms, such as license plate recognition, direction and speed detection. The Hungarian Highway Authority tested the solution and then asked about the system’s ability to detect fog on the road, which was not originally envisaged.
«We didn't have a ready-made solution for this issue, so we had to build a system for detecting and warning about fog on the road from what exists today,» notes Sudy. «But if a task cannot be solved because of the physical limitations of the camera, then we must be honest about it. That is the best possible approach to the problem.»
What to expect?
The emergence of artificial intelligence in the systems of tomorrow is still quite problematic, but video analysis algorithms are nevertheless becoming more complex every day. «More accurate identification and selection of objects is clearly where things are heading,» says Gadi Talmon, co-founder of Agent Vi. «No manufacturer has this technology yet. We are currently working on object identification algorithms that could distinguish, for example, people by their clothing or distinctive marks on it. We are constantly creating new, more robust and reliable algorithms.»
Some experts predict that we will see a large number of new intelligent algorithms developed by small, specialized companies, rather than integrated solutions created jointly by companies such as VistaScape and Siemens Building Technologies or ActivEye and Honeywell. “The big companies have the means to change course depending on the needs of the market, while the small companies are more specialized, but they develop exactly what the customer wants,” says Sudy. “The security market is going to be very interesting in the next couple of years.”
The growth of the security industry is driving demand for ever smarter analytics solutions. And the more likely it is that the technologies will be used outside the security market, the more interesting it will be to watch the intelligent video surveillance sector.
Securing the Future
But the consumer is already asking today: what to choose? DVR, hybrid DVR or NVR? Without losing sight of the core functions, such as compression and storage of video, it is worth paying more attention to video platforms and systems with an open, scalable architecture based on IP technologies, which many manufacturers now offer.
The world of security is steadily moving towards IP technologies. DVR functions have become more complex and intelligent thanks to integration with various cameras and central servers. Today's DVRs capture video in better quality, store more information in less disk space and have flexible recording settings. Moreover, both network video recorders (NVRs) and hybrid DVRs are available on the market. These devices are designed to work over a network, including the Internet. NVRs work exclusively with network IP cameras, while hybrid DVRs work with both IP and analog cameras and offer more options for remote control and surveillance. NVRs let you control the entire system, down to the settings and control of each camera. They are designed specifically for recording and storing large amounts of data, which is only possible on an open architecture platform. Which systems, NVRs or hybrid DVRs, will lead the market in the next 5 years? What should we expect from the market? Which trends and technologies will matter?
Hybrid systems
The British company IMS Research conducted a market study which showed that the transition from analog to network technologies has gathered pace. The market for network video surveillance technologies, including cameras, network video servers and recording devices, grew by 42 percent in 2006. It is expected to continue growing and to reach 2.6 billion US dollars by 2010. Even so, network cameras will make up only one third of all cameras shipped in 2010. According to the same research, the security industry stubbornly resists change, and new technologies will not be adopted quickly. The main task of suppliers and manufacturers is to explain to the market the advantages of network technologies over traditional analog CCTV. What is behind this resistance? Many IT managers (specialists, system administrators) are unhappy with the large volume of data that video surveillance devices generate on the network. «That's why even manufacturers like Bosch will always have a good old DVR in their product line,» notes Jan-Bart Mul, Marketing Director of Digital Systems at Bosch Security Systems. «Since cameras make up the largest part of any system's costs, hybrid systems that work with both analog and network cameras, especially hybrid DVRs, will be in demand. Hybrid capability is a necessary feature.»
Vision Systems, a leading Australian DVR company, agrees. Given the prevalence and number of analogue cameras, the ability to work with both technologies will be needed for several years to come. According to Tim Farrow, Vision Systems' chief designer, «bandwidth will be the biggest bottleneck in the development of scalable IP solutions.»
Along with adapting to network operation, compression and system capacity, market players are trying to find more convenient and faster ways to search and analyze recorded information. In addition, information security, restricting access to video recordings, and methods of protecting and encrypting video data are pressing issues.
Data storage
Higher capacity, reliability and read/write speed of hard drives will be very important in the future. As hard drives have grown in capacity and fallen in cost, DVRs have come to occupy an important niche in the security market. Dieter Dallmeier, founder of Dallmeier Electronic, one of the pioneers of DVR development, explains that the problem of data backup and storage still exists today, despite the significant increase in capacity over the last 15 years (hard drives grew from 100 megabytes in 1992 to 500 gigabytes in 2006) and the constant improvement of compression methods. Saving disk space is a key task. DVRs with large amounts of disk space are being released. At present, conventional disks can easily reach a size of 600 gigabytes to 2 terabytes. In large IP systems, DVRs must be connected to multiple data storage systems. «Support for external network storage arrays, RAID and NAS (network attached storage) drives and additional hot-swappable drives is what will matter in the future,» says Jan-Bart Mul.
Some companies, such as 3VR, offer solutions with self-testing and self-monitoring to guard against, for example, camera disconnection, data storage failure or missing video data. This significantly reduces maintenance time and cost, as well as the risk of data loss.
Data compression technologies
Image compression technologies are also developing. H.264 is the most advanced codec, delivering high-quality images while transmitting video in real time with minimal network load. In systems using H.264, the frame size varies from 1 to 5 kilobytes. Recorders with dual compression methods, such as MPEG-4 and H.264 or MJPEG and MPEG-4, are also becoming popular because of their flexibility in connecting to cameras and central servers. “As communication channels and storage sizes evolve,” says Tim Farrow, chief designer at Vision Systems, “innovative video compression technologies will no longer be as important in video surveillance and security systems. IP devices will have built-in storage and an analytical core that will allow for better information management. These devices will be able to index video and transmit only the important fragments to the central server, instead of transmitting and storing all the information. This will reduce communication channel requirements and the cost of creating the system as a whole.” Recently, GEUTEBRÜCK released a new version of the GscView software, which can control video signal encoders. Such a system not only significantly reduces the amount of data and the channel load, but also allows an ordinary Core 2 Duo processor to display live video from up to 100 cameras simultaneously at 25 fps in MPEG-4 format. This relies on two mechanisms. The first, the so-called speed regulator, constantly monitors the operation of the monitors to ensure that video data is not transmitted faster than it can be processed. The second, the dynamic live stream (DLS), ensures that the hardware codecs always compress and transmit a video image of the size and quality required for the current display window.
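To give the frame sizes quoted above a sense of scale, the short calculation below estimates the bitrate and daily storage for one camera. It is only a rough sketch: the 1 to 5 kilobyte frame size and the 25 fps rate come from the text, while continuous recording and decimal units (1 KB = 1000 bytes) are assumptions the article does not state. The function names are illustrative.

```python
def stream_rate_kbit_s(frame_kb: float, fps: float) -> float:
    """Bitrate in kilobits per second for a given average frame size."""
    return frame_kb * 8 * fps


def daily_storage_gb(frame_kb: float, fps: float, cameras: int = 1) -> float:
    """Storage for 24 hours of continuous recording, in decimal gigabytes."""
    seconds_per_day = 24 * 60 * 60
    total_kb = frame_kb * fps * seconds_per_day * cameras
    return total_kb / 1_000_000  # KB -> GB (assumed decimal units)


# Frame sizes at both ends of the range quoted in the text, at 25 fps.
for frame_kb in (1, 5):
    rate = stream_rate_kbit_s(frame_kb, 25)
    per_day = daily_storage_gb(frame_kb, 25)
    print(f"{frame_kb} KB/frame: {rate:.0f} kbit/s, {per_day:.2f} GB per camera per day")
```

Under these assumptions a single camera needs between about 2 and 11 gigabytes per day, which is why disk capacity and selective transmission both matter once a system grows to hundreds of cameras.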
Data Search
Despite all the advantages of today's DVRs and NVRs, they share one common problem. Because they can store recordings from a large number of cameras, security services are faced with large volumes of information that are very difficult to use. In the event of an incident, such as a theft or an attempted terrorist attack, there is no doubt that all the information will have been recorded. The problem is the time it takes to sift through hundreds or thousands of hours of video recorded by an ordinary camera in search of the necessary data. Many companies are working on non-traditional DVRs with deep intelligent data search systems. The added functionality will also include sophisticated systems, built into the DVR, for analyzing recorded data and quickly finding suspicious events. For example, files with metadata (text strings containing keywords describing individual scenes), which are sent to the recording device along with a video fragment, are much smaller and faster to search than the video file itself, where a frame with motion would have to be found directly. This reduces the time needed to find and access scenes and events of interest to a few seconds, because the search uses a special algorithm similar to those used by Internet search sites.
But in addition to analyzing recorded events, real-time monitoring of the situation also plays an important role. The traditional method of video monitoring can no longer keep up with the growing number of cameras and the architecture of modern systems. An operator is physically unable to watch that many cameras, assess the situation adequately and react. The market is therefore moving towards intelligent analysis systems. As mentioned above, the main idea behind large distributed systems is to use end devices that process the video from a camera and send the operator only suspicious events that require attention. But it turns out that there are pitfalls here too, which call for modern solutions and organizational measures.
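The metadata search described above can be pictured as an inverted index of the kind search engines use. The sketch below is an illustration only, not any vendor's implementation: the `Clip` record, its fields and the keyword vocabulary are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    camera: str       # hypothetical camera identifier
    start_s: int      # offset of the fragment in the recording, seconds
    keywords: tuple   # metadata keywords sent along with the fragment


class MetadataIndex:
    """Inverted index: keyword -> set of clips tagged with it."""

    def __init__(self):
        self._by_keyword = defaultdict(set)

    def add(self, clip: Clip) -> None:
        for word in clip.keywords:
            self._by_keyword[word.lower()].add(clip)

    def search(self, *words: str) -> list:
        """Return clips tagged with all given keywords, ordered by camera and time."""
        if not words:
            return []
        sets = [self._by_keyword.get(w.lower(), set()) for w in words]
        hits = set.intersection(*sets)
        return sorted(hits, key=lambda c: (c.camera, c.start_s))


index = MetadataIndex()
index.add(Clip("cam-01", 120, ("motion", "vehicle")))
index.add(Clip("cam-01", 4810, ("motion", "person")))
index.add(Clip("cam-07", 95, ("abandoned-object",)))
print(index.search("motion", "person"))
```

A lookup touches only the small keyword sets and never the video itself, which is the reason a metadata search can return in seconds where scanning the recording would take hours.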
IPoIP (Image Processing over IP Networks)
Video processing over IP networks
Much effort has been put into developing algorithms that extract information of value for analysis from both streaming and static images. As a result, methods have been created that allow working with both live and recorded images, and these algorithms can work on both software and hardware platforms. However, such platforms can only process data from a small number of cameras at a time (in most cases, according to the original source, no more than one or two channels). There are two main types of system design.
In the first case, the video processing device is located in close proximity to the video camera. It performs primary data processing and then sends the result over the network to the central server or monitoring station. Usually a PC acts as the image processing device for complex problems, but recently the share of standalone devices based on a DSP (digital signal processor) or even an ASIC (application-specific integrated circuit) has been increasing. These devices perform image analysis tasks and send information to the network when a significant event occurs (for example, when motion is detected). For remote monitoring, video digitization devices can also be installed near the camera; they can transmit video of different quality and resolution, adapting to the network bandwidth. The video stream is usually transmitted in MJPEG, MPEG-4 or similar formats. In effect, this is a solution based on IP video servers with built-in motion detectors, sensor inputs and relay outputs. On the one hand, such a solution is effective, since it places a low load on the network and is sufficiently distributed and flexible. On the other hand, it has a number of disadvantages:
- as a rule, each video channel requires its own device, which, in the case of a large system, leads to a significant increase in the cost of the project;
- the processor cannot be upgraded; for more complex analytical tasks the device has to be replaced or a PC used instead;
- when cameras that require complex analytics are installed outdoors, for example, on highways for traffic control, a PC cannot be used because it is unsuited to working «outside» in severe climatic conditions, so expensive solutions for transmitting video over long distances are needed instead;
- DSP-based solutions require more development effort and costs due to limited capabilities and low-level development tools.
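The first design described above, in which the device next to the camera analyzes the image and reports only significant events, can be sketched with simple frame differencing. This is a deliberately naive illustration: frames are plain lists of grayscale rows, and both thresholds are assumed values, not figures from the article or from any product.

```python
def changed_fraction(prev, curr, pixel_threshold=25):
    """Fraction of pixels whose brightness changed by more than pixel_threshold."""
    total = changed = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def edge_events(frames, area_threshold=0.05):
    """Yield (frame_index, changed_fraction) only for frames that show motion.

    Everything else stays on the device, so quiet scenes generate no traffic.
    """
    prev = None
    for i, frame in enumerate(frames):
        if prev is not None:
            frac = changed_fraction(prev, frame)
            if frac >= area_threshold:
                yield i, frac
        prev = frame


# Three 4x4 frames: two identical dark frames, then one with the top half lit.
dark = [[0] * 4 for _ in range(4)]
half_lit = [[200] * 4 for _ in range(2)] + [[0] * 4 for _ in range(2)]
print(list(edge_events([dark, dark, half_lit])))
```

A detector this simple would also fire on swaying trees or passing clouds, which is exactly the false-alarm problem that the more elaborate algorithms discussed earlier are meant to address.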
In the second case, the central server acts as the video processing core. All video analytics tasks for all cameras in the system are assigned to one powerful computing resource. From a hardware point of view, such a solution is more economical and better suited to large projects. However, it is feasible only if the cameras produce relatively few significant events and each requires modest resources, so that one server can handle a large number of cameras. The network is also a bottleneck, since this design places heavy demands on its bandwidth. Because all video analysis takes place on the server, the server must receive high-quality digitized video, that is, uncompressed video. In a network with a small number of cameras this works well, but as the number of cameras grows, it becomes impractical because of the high cost of the network infrastructure. This solution is better suited to cases where there is no need to process every frame of live video, only occasional frames.
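A back-of-the-envelope calculation shows how quickly the network becomes the limiting factor in the centralized design. The sketch assumes about 1 MB per uncompressed frame (the figure the article gives further on for a frame in original quality) and 25 frames per second; it ignores protocol overhead, and the link capacity is an arbitrary example.

```python
def uncompressed_mbit_s(cameras: int, frame_mb: float = 1.0, fps: float = 25.0) -> float:
    """Aggregate network load in Mbit/s when every camera streams raw frames."""
    return cameras * frame_mb * 8 * fps


def max_cameras(link_mbit_s: float, frame_mb: float = 1.0, fps: float = 25.0) -> int:
    """How many raw streams fit into a link of the given capacity."""
    return int(link_mbit_s // (frame_mb * 8 * fps))


for n in (1, 10, 100):
    print(f"{n:>3} cameras: {uncompressed_mbit_s(n):,.0f} Mbit/s")
print("Cameras per 1 Gbit/s link:", max_cameras(1000))
```

With these assumed figures a single gigabit link carries only a handful of full-rate streams, which is consistent with the remark that the centralized design is practical mainly when only occasional frames need to be analyzed.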
Having recognized the limitations of both solutions, Agent Vi set itself the task of creating a solution that meets the following requirements:
- it works equally well with a few cameras and with thousands of them;
- it scales at low cost;
- cameras can be installed at any geographical location, provided an IP network is available there;
- any camera can be viewed from the monitoring site;
- the image from each camera can be processed by several intelligent algorithms, with the analysis results stored in a central database and available for viewing from the monitoring site;
- new algorithms can be added, or existing ones adapted to specific tasks, without significant costs for updating the system;
- events can come from one camera or from several; multi-camera events combine information from several sensors, which makes it possible to identify events of the highest importance;
- it can operate in remote areas (overpasses, borders, highways) with no developed infrastructure, where power supply and the communication channel, especially a wireless one, are the main constraints. Where power consumption is critical, a PC cannot be used.
IPoIP architecture
The IPoIP architecture was created to meet the above requirements and solve the following problems:
- providing a cost-effective way to apply video processing to a large number of cameras without reducing the probability of detection or increasing the number of false alarms;
- the ability to apply any algorithm to any camera, even in the case of their geographical remoteness and limited operating conditions;
- the ability to apply a large number of algorithms simultaneously for any camera without limiting the user to working with only one application.
The uniqueness of IPoIP lies in its distributed image processing architecture. Instead of performing video analysis only near the camera (IP video server, PC) or only on the central server, the new architecture uses both at once. The processing is split into two parts, shared between the hardware digitization devices and the central server. In this way, IPoIP combines the advantages of both solutions while avoiding their shortcomings.
The idea behind this division is that each camera's video encoder already contains a microprocessor for video compression. This inexpensive built-in processor is well suited to performing some of the tasks described above, so that only a small amount of important information needs to be sent to the central server for more detailed analysis. For example, in license plate recognition, a video analyzer located near the camera finds the plate in the frame and transmits to the central server not the entire frame (about 1 MB in original quality) but only the image of the plate. If this initial data is not enough, the central server can request additional information for analysis, for example, the color of the car or its outline for classification.
In this case, the system combines the high quality of the original video with the computing power and flexibility of the central server, without the cost of an expensive network.
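The division of labor described above can be sketched as follows. The article does not document the actual IPoIP protocol or data formats, so every class, method and parameter here is hypothetical; the plate location is passed in by hand where a real system would run a plate locator on the device.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A cropped part of a frame, plus where it was found."""
    camera: str
    x: int
    y: int
    pixels: list  # rows of grayscale values


def crop(frame, x, y, width, height):
    """Cut a rectangle out of a frame given as a list of pixel rows."""
    return [row[x:x + width] for row in frame[y:y + height]]


class EdgeAnalyzer:
    """Runs next to the camera and keeps the full-quality frame locally."""

    def __init__(self, camera):
        self.camera = camera
        self.last_frame = None

    def process(self, frame, plate_box):
        # plate_box stands in for the output of a real plate locator (assumed).
        self.last_frame = frame
        x, y, w, h = plate_box
        return Region(self.camera, x, y, crop(frame, x, y, w, h))

    def more_detail(self, x, y, w, h):
        # Serves a follow-up request from the server, e.g. the car body area.
        return Region(self.camera, x, y, crop(self.last_frame, x, y, w, h))


def payload_size(region):
    """Number of pixel values that would cross the network."""
    return sum(len(row) for row in region.pixels)


frame = [[0] * 1000 for _ in range(1000)]   # about 1 MB at one byte per pixel
edge = EdgeAnalyzer("cam-01")
plate = edge.process(frame, plate_box=(400, 700, 120, 30))
print(payload_size(plate), "of", 1000 * 1000, "pixel values sent")
```

The server receives the small crop at original quality and asks for a larger region through `more_detail` only when the first one is insufficient, so the full frame never has to travel over the network.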
Based on materials from:
http://asmag/asm/common/article_detail.aspx?c=1&module=1&id=4128
A&S International magazine, June 2007, issue 102
articles “Intelligent Video. Solutions Tackle Real-Life Hurdles”, “Image Processing over IP Network (IPoIP)”, “Powering the Next Generation of Actionable Intelligence Solutions”
http://boschsecurity
http://honeywell
http://camerasecuritynow
http://agentvi
http://video-surveillance-guide/