Half in jest, half in earnest: professional solutions to the problems of building very large data archives.
A real-life situation. An old friend of mine has only a 500 GB hard drive in his home computer, though four years ago that was the largest you could buy! He loves photography, and his camera is a modest 8 MP, not 16 or 32! True, two more users add to the load: his sons, both fans of films and videos. On the other hand, there are no games at all, and the elder son now has his own laptop!
And yet the hard drive is full; the free space is gone! Why?
Home PC specialists confirmed my guess: 60 to 90% of the disk space on a home computer is now typically taken up by digital video, photos, and games with their elaborate video sequences. But it is one thing to play in a sandbox and quite another to build a flood protection dam in the Gulf of Finland. What should specialists do whose work requires huge data archives?
You may ask: how big are these archives? Where are they needed? And what data do they hold?
Let's start from the end: what data?
There are no surprises here. Since the digital revolution reached every area of life, video in general and video surveillance in particular, the largest volumes of data most often consist of video surveillance footage, photo archives of scientific and technical documentation, libraries of scanned works from the historical heritage of human thought and fine art, and archives of television programs and films. Perhaps you, the reader, can add to this list yourself.
Where are they needed? The question is almost rhetorical, but we will try to answer it.
In the central storage of video surveillance systems at large sites, from a single enterprise to an entire metropolis. In the libraries of scientific institutes and of the general cultural heritage. In the archives and vaults of film studios, Gosfilmofond, and television channels.
How big are the archives? Let's work it out together, using the video surveillance systems of several well-known sites as examples.
EXAMPLE 1. Video surveillance and monitoring system of the St. Petersburg metro
Why the metro? Because it is a socially important facility that requires video surveillance. Why the St. Petersburg metro? Because it is fairly large and the closest to home.
Let's use some publicly available information and make a rough estimate of the volume of video data for this facility. Back in 2006, the video surveillance system underwent a major expansion. Even under the old requirements, a station with one vestibule was to have at least 17 television cameras. Each station's local video surveillance system (VSS) consisted of one main video server and one hot standby server. The main server handles up to 32 cameras. Of these, 4 cameras send video to the archive at a resolution of 1600×1200 and a rate of 6 to 12 fps, and the remaining 28 cameras at 704×576 and 6 fps. The archive depth is at least 7 days. Looking ahead to the calculations, I suggest assuming that recording is not continuous and that there is activity in the frame for approximately 20 hours per day. The hot standby server is essentially the same, but its archive depth is up to 3 days. In addition to being stored on the stations' local VSS servers, all video is transmitted over fiber-optic links to the Metro Video Monitoring Center at the Technological Institute station. Let's set aside the actual state of affairs and assume that a video data store with a depth of 10 days has been set up there.
Now, for the calculation, there is one more important decision to make: the average frame size. Of course, this value depends on many parameters: resolution, color, the amount of fine detail in the frame, the compression algorithm, the compression ratio, and, for streaming algorithms, the number of moving objects in the frame. At a real site, it is easier and more accurate to determine this value empirically, by dividing the volume of the accumulated archive by the total recording time in seconds and by the recording rate. In this article, we will not complicate things and will settle on a value of 100 KB. Some experts will say that it can be 10 times smaller. But these same experts know very well that it can easily be 10 times larger, especially if we do not want to lose data from megapixel cameras.
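The empirical approach just described can be sketched in a few lines of Python. This is only an illustration: the function name and the sample figures (a 300 GB archive accumulated by one camera over 7 days, 20 active hours per day, 6 fps) are hypothetical and are not measurements from any real site.

```python
def average_frame_size_kb(archive_bytes, recording_seconds, fps):
    """Estimate the average frame size in KB (1 KB = 1024 bytes).

    archive_bytes     -- size of the accumulated archive, in bytes
    recording_seconds -- total time actually recorded, in seconds
    fps               -- recording rate, frames per second
    """
    if recording_seconds <= 0 or fps <= 0:
        raise ValueError("recording time and frame rate must be positive")
    frames = recording_seconds * fps      # total number of stored frames
    return archive_bytes / frames / 1024  # bytes per frame -> KB per frame


# Hypothetical single camera: all numbers below are assumptions.
archive_bytes = 300 * 1024**3             # 300 GB of accumulated video
recording_seconds = 7 * 20 * 3600         # 7 days, 20 active hours per day
size_kb = average_frame_size_kb(archive_bytes, recording_seconds, fps=6)
print(f"average frame size: about {size_kb:.0f} KB")
```

With these made-up inputs the result lands close to the 100 KB figure adopted in the text; a real archive would give its own value.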
Total data volume at the Metro Monitoring Center, assuming 60 stations (in reality, already 63), 32 TV cameras per station, a recording rate of 10 fps, and a guaranteed storage period of 10 days: 60 stations × 32 cameras × 20 hours × 3600 sec × 10 fps × 100 KB × 10 days ≈ 1,287 TB (counting 1 TB as 1024³ KB), or about 1.3 petabytes.
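For readers who want to vary the inputs, the same product can be written as a small Python function. This is a minimal sketch; the function and variable names are chosen here for illustration. Note that the result depends on the unit convention: with binary prefixes (1 TB = 1024³ KB) the product comes to about 1,287 TB, with decimal prefixes (1 KB = 1000 bytes, 1 TB = 10¹² bytes) to about 1,382 TB. Both are in the region of 1.3 to 1.4 petabytes.

```python
def archive_volume_kb(cameras, active_hours_per_day, fps, frame_kb, days):
    """Total archive volume in KB for a set of identical cameras."""
    frames = cameras * active_hours_per_day * 3600 * fps * days
    return frames * frame_kb


# Inputs taken from the metro example in the text.
metro_kb = archive_volume_kb(
    cameras=60 * 32,          # 60 stations, 32 cameras each
    active_hours_per_day=20,  # assumed activity in the frame
    fps=10,
    frame_kb=100,             # assumed average frame size
    days=10,                  # assumed archive depth at the center
)

metro_tb_binary = metro_kb / 1024**3          # 1 TB = 1024^3 KB
metro_tb_decimal = metro_kb * 1000 / 10**12   # 1 KB = 1000 B, 1 TB = 10^12 B

print(f"binary prefixes:  {metro_tb_binary:,.0f} TB")
print(f"decimal prefixes: {metro_tb_decimal:,.0f} TB")
```

Changing a single argument, for example raising `days` or `fps`, shows immediately how the requirement scales, since every factor enters the product linearly.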
And we tried not to exaggerate anywhere. Yet there is still room for growth. Just last week I rode in a metro train equipped with television cameras, so far 2 per car. Muscovites will of course say that they have had this on their ring line for a long time, and I agree. There are 520 cars in the St. Petersburg metro, which means that in the future another 1,040 television cameras will be added to the 1,920 station cameras already counted.
Incidentally, the requirements for video surveillance systems can be higher still: they are constantly growing, both in image quality and in archive depth.
EXAMPLE 2. Requirements for video surveillance systems of a large bank
Until recently, the requirements in the country's major banks were limited to 30 and 60 days of archive storage (for various purposes). However, this year the situation has changed. Recommendations for ensuring bank security now call for mandatory color images and an archive depth of up to 90 days.
It is easy to calculate how the volumes of stored data will change. Because volume grows in direct proportion to archive depth, moving from 30 to 90 days triples the archive and moving from 60 to 90 days increases it by half, even before the larger frame size of color images is taken into account. A similar tightening of video surveillance requirements is now under way in all areas of life.
EXAMPLE 3. Video surveillance system of a megalopolis
In some systems the requirements do not seem so high at first glance. For example, most TV cameras in the Safe City Moscow project are archived at 4 fps, in standard 352×288 resolution, in black and white, with an archive depth of 14 days. However, the system currently has 124,000 TV cameras connected to 10,000 servers. Most of the video data is not stored in a single center but is distributed. Is this good or bad? It is impossible to say for sure, since both a centralized archive and distributed storage have their advantages and disadvantages. But is it possible in principle to create a single archive of such capacity? First, let's calculate the approximate volume of this hypothetical storage:
124,000 cameras × 20 hours × 3600 sec × 4 fps × 14 days × 20 KB ≈ 9,313 TB, or roughly 9 petabytes.
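The total, and a rough per-server share for the distributed case, can be checked with a short Python sketch. The even split across the 10,000 servers is purely an assumption made here for illustration; the text does not say how cameras or data are actually distributed. Binary prefixes are used (1 TB = 1024³ KB).

```python
CAMERAS = 124_000
SERVERS = 10_000
ACTIVE_HOURS_PER_DAY = 20   # same activity assumption as in the metro example
FPS = 4
DAYS = 14
FRAME_KB = 20               # assumed average frame size for 352x288 black and white

# Total number of archived frames and their volume in KB.
frames = CAMERAS * ACTIVE_HOURS_PER_DAY * 3600 * FPS * DAYS
total_kb = frames * FRAME_KB

total_tb = total_kb / 1024**3                  # whole hypothetical archive
per_server_gb = total_kb / 1024**2 / SERVERS   # assumes an even split (hypothetical)
cameras_per_server = CAMERAS / SERVERS

print(f"single archive: about {total_tb:,.0f} TB")
print(f"per server:     about {per_server_gb:,.0f} GB "
      f"for {cameras_per_server:.1f} cameras on average")
```

Under the even-split assumption each server holds a little under 1 TB, which helps explain why distributed storage is practical here even though the combined archive runs to petabytes.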
It’s hard to imagine how much that is, even if you store it in a distributed, piecemeal fashion.
Now let's remember that it is necessary not only to store gigantic volumes of video data competently and reliably, but also to provide fast and convenient access to this data, sometimes for several users at once, along with tools for processing it. And here it becomes clear that a mere collection of storage hardware will not solve the problem. A software system for monitoring, managing, protecting, organizing access to, and processing all this data is also needed. Otherwise, there is no point in storing it.
The final question: are there technologies that can handle gigantic volumes, ensure storage reliability, support concurrent access to the data, and provide tools for processing it? How accessible are these technologies? Have they been tested in practice?
These issues are the subject of the second part of the article, which will be published in the next issue of the magazine. In it, specialists from the VIT Center in St. Petersburg will answer these and other questions. For a number of years, the Center's specialists have been successfully solving similar problems in building scalable systems for storing large video archives. The article will cover a number of promising technologies and methods, along with practical experience in solving non-trivial problems of organizing storage and quick access to archival records. Interested readers, both installers and customers, can take part in this dialogue by sending their questions, specific or general, to the editorial board. We guarantee that the Center's employees will answer all questions, either in the article or individually.
____________________________________________
D. Sadekov
General Director of the Technical Center «ViAiTi»