Not Just Encryption, or an Overview of Cryptotechnologies
Gleb Semenov
Introduction
What usually comes to mind when you hear the word «cryptography»? Intelligence agencies, diplomatic correspondence, «mathematical shamanism»?
And what about the phrase «cryptography in information technology»? Data encryption comes to mind immediately; then electronic signatures; someone knows a thing or two about «public key certificates», or about «cracking» other people's ciphers. Let's try to make sense of all these concepts and approaches, along with a few others that modern cryptography uses.
In the past, cryptography was used only for military purposes. Now, however, as the information society emerges, cryptography is becoming one of the main tools for ensuring confidentiality, trust, authentication, electronic payments, corporate security, and countless other important things.
To be fair, it should be noted that cryptography is not a panacea for all ills. Cryptographic methods can help ensure security, but they should not be relied upon alone. They can be used to address the following security problems:
- confidentiality of transmitted/stored data;
- authentication;
- integrity of stored and transmitted data;
- ensuring the authenticity of documents.
Cryptography offers only a few basic methods of transforming information, including:
- encryption (symmetric and asymmetric);
- calculating hash functions;
- generating an electronic digital signature;
- generating a sequence of pseudo-random numbers.
In this review, we will be interested in cryptosystems and cryptographic algorithms implemented using a computer. The review will not cover encryption methods using cipher pads or special encryption equipment, both due to the lack of public information about these methods and due to the unavailability of such equipment to the mass consumer.
About encryption itself
Encryption is a reversible transformation of data intended to hide it from outsiders. Many encryption methods have been invented, from simple substitution ciphers (the most famous example being Conan Doyle's «The Dancing Men») to the fundamentally unbreakable Vernam cipher (binary addition of the original text with a random sequence that is used only once). Almost all encryption methods use an encryption key: a secret code sequence used in the process of transforming the information. I once even came across the following definition: «Encryption is the process of replacing your big secret (the document) with a small one (the key).» In the case of the Vernam cipher, the encryption key is equal in length to the encrypted message and must be used only once. And although the Vernam cipher, when used correctly, provides «absolute» secrecy, it is inconvenient for most applications. Modern cryptosystems use encryption keys from 64 up to 1024-2048 bits long; the tradition of measuring key length in bits will probably stay with us forever. The reader may well ask: where do these numbers come from, and why is TripleDES considered no less reliable than 1024-bit RSA? And, in general, how does the reliability of encryption (or, as cryptographers say, the strength of the cipher) depend on the length of the key used?
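As a small illustration (not part of the original article), here is how the Vernam scheme looks in a few lines of Python: the key is as long as the message, generated randomly and used once, and encryption and decryption are the same XOR operation.

import os

def vernam_encrypt(plaintext: bytes):
    # One-time pad: the key is random, as long as the message, and never reused.
    key = os.urandom(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext

def vernam_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    # Decryption is the same bitwise addition (XOR) with the same one-time key.
    return bytes(c ^ k for c, k in zip(ciphertext, key))

key, ct = vernam_encrypt(b"attack at dawn")
assert vernam_decrypt(key, ct) == b"attack at dawn"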
To answer these questions, we need to understand which encryption algorithms are currently used in practice. Symmetric block ciphers are usually called «classical ciphers»: they use the same key to encrypt and to decrypt information and process it in blocks. The block length is usually 8 or 16 bytes, and there are algorithms that allow a variable block length. The first block cipher widely used in practice was DES (Data Encryption Standard), developed by IBM specialists in the early 1970s, which for a long time served as the data encryption standard in the USA. Many block algorithms appeared later: IDEA, Blowfish, the Soviet GOST 28147-89 (to this day a Russian national standard).
The original DES, for example, used a 112-bit key and a 64-bit encryption block. But after its «analysis» by NSA1 specialists, the key length was reduced to 64 bits, of which only 56 bits were unique; the remaining 8 were control bits used to check the integrity of the key. It was with the 56-bit key that DES was approved as a national standard. At the then-current level of computing technology, trying 2^56 keys in an acceptable time was either technically impossible or unreasonably expensive2. Today, DES with a 56-bit key can no longer be considered a strong algorithm.
Most modern strong symmetric algorithms use a key length of 64-256 bits (8-32 bytes). The table below lists the main algorithms currently in use, their block lengths and key lengths.
Algorithm | Key length (bits) | Block length (bits)
DES | 64 | 64
Blowfish | variable, up to 448 | 64
IDEA | 128 | 64
GOST 28147–89 | 256 | 64
RC5 | variable | variable
It should be noted that, in addition to block ciphers, stream ciphers exist and are actively used. Like block ciphers, they use a symmetric key, but they encrypt the input stream byte by byte or, sometimes, bit by bit. The idea of a stream cipher is that a key sequence, or gamma (keystream), is generated from the symmetric key and added modulo two (the XOR operation) to the input stream. Stream ciphers are usually faster than block ciphers and are used to encrypt speech, network traffic and other data whose length is not known in advance. Provided the key used to generate the gamma is changed frequently enough, stream ciphers offer sufficient security.
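To make the gamma idea concrete, here is a toy Python sketch: the keystream is derived from the symmetric key and a non-secret nonce, then added modulo two to the data. The SHA-256-in-counter-mode generator used here is an illustrative stand-in, not one of the real stream ciphers mentioned in this article.

import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy gamma generator: hash the key, the nonce and a counter, block by block.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Encryption and decryption are the same operation: data XOR gamma.
    gamma = keystream(key, nonce, len(data))
    return bytes(d ^ g for d, g in zip(data, gamma))

ct = stream_xor(b"16-byte-long-key", b"nonce-01", b"voice or network traffic")
assert stream_xor(b"16-byte-long-key", b"nonce-01", ct) == b"voice or network traffic"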
Figure 1 shows the general scheme of a block encryption algorithm (block cipher), and Figure 2 a stream encryption algorithm. As Figure 1 shows, besides the encryption key, the algorithm takes another input parameter, the initialization vector3. The initialization vector, as its name suggests, is an initializing binary value whose length equals the algorithm's block length. The initialization vector is not a secret element.
Figure 1. Block cipher diagram
Figure 2. Stream cipher diagram
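The role of the non-secret initialization vector can be shown with a short sketch. It assumes the third-party Python cryptography package and uses AES in CBC mode purely as a readily available block cipher; it is not one of the algorithms from the table above.

import os
from cryptography.hazmat.primitives import padding
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)   # the secret symmetric key
iv = os.urandom(16)    # the initialization vector: random, but not secret

# Pad the plaintext to a whole number of 128-bit blocks, then encrypt.
padder = padding.PKCS7(128).padder()
padded = padder.update(b"plaintext to protect") + padder.finalize()
encryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
ciphertext = encryptor.update(padded) + encryptor.finalize()

# The IV travels alongside the ciphertext in the clear and is needed to decrypt.
decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
unpadder = padding.PKCS7(128).unpadder()
recovered = unpadder.update(decryptor.update(ciphertext) + decryptor.finalize())
recovered += unpadder.finalize()
assert recovered == b"plaintext to protect"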
In particular, in mobile communications, the GSM standard provides the ability to encrypt the transmitted voice stream on the section from the telephone set to the base station using the A5 cipher, which is a stream cipher. There is an instructive story associated with the A5 algorithm. Initially, the description of the A5 algorithm was closed. But due to a legal error by the company that owned the algorithm, its description ended up on the Internet and the algorithm was analyzed. Its strength turned out to be even lower than that of DES. Having realized the importance of openness of algorithms to ensure their security4, the developers of the third generation GSM network promised to make the proposed voice encryption algorithms available to the general cryptographic community. The example shows the importance of having an open description of the algorithm even for its developers.
Today, algorithms with a key length of 64 bits or more provide acceptable security. Why, then, use keys of 512 bits and longer? The answer is that these are keys to a different lock! In the seventies of the last century (get used to it, gentlemen: the twentieth century, alas, is over) Diffie and Hellman invented a completely «new» cryptography, public key cryptography [1], also called «open» or «asymmetric» cryptography.
The idea of Diffie and Hellman was as follows. Take some number and call it the «secret key». Perform certain operations on it to obtain another number, the «public key». Moreover, choose the operations so that deriving the «public key» from a known «secret key» requires little computing power and time, while the opposite action, computing the «secret key» from the «public key», is impossible or takes very long5. Without going into mathematical details, note that on a modern personal computer generating a public key from a known secret key takes about a second, whereas the reverse transformation, depending on the key length, can take up to hundreds (and even thousands) of years.
Considering the original text as some large number, we will transform it using the public key. As a result, we will obtain a new number, which we will consider as a cipher. Having a secret key at our disposal and using the fact that the public key was obtained from the secret one in a special way, we can decrypt the text relatively quickly (in computational terms). At the same time, it should be noted that, in general, the encryption-decryption process using a key pair is two to three orders of magnitude slower than encryption-decryption of the same text using a symmetric algorithm.
It would seem: given fast and time-tested symmetric algorithms, why use relatively slow asymmetric cryptography at all? Because asymmetric cryptography radically simplifies the distribution of keys among the participants in an exchange, and it also makes the electronic digital signature possible (more on this below). The best-known asymmetric encryption algorithm today is the one proposed by Rivest, Shamir and Adleman and bearing their names: the RSA algorithm.
In fact, if the exchange participants are few and all close to one another, the problem of distributing symmetric keys among them is solved relatively simply. For N participants in peer-to-peer communication, each must hold N-1 pairwise keys, that is, about N(N-1)/2 keys in total (for a thousand participants, roughly half a million). Moreover, each participant must have a secure channel for exchanging keys with every other participant6. As the number of participants grows, the problem of distributing and replacing compromised symmetric keys may become completely unsolvable, especially when the participants do not trust one another.
The use of asymmetric cryptography radically simplifies key distribution. The public key is called «open» precisely because it is not a secret. It is possible to create a public «key directory» holding the public keys of all participants in the exchange. Each key owner is then free to withdraw his key from the directory or replace it, and this procedure does not affect the other participants. The problem of the authenticity of a key in the directory does arise, but it can be resolved.
But convenience comes at a price. In the case of asymmetric cryptography, the price is time and key length. A typical key length for asymmetric cryptography is 512-1024 bits. Now that high-performance computing systems have become widely available, the use of 2048-bit keys is gaining popularity.
Is it possible to reduce encryption time while keeping the convenience of asymmetric cryptography and adding the speed of block ciphers? It turns out that it is, and this is how it is usually done: generate a random (or pseudo-random) sequence and use it as a one-time (so-called session) key to encrypt the document with a fast symmetric algorithm; then encrypt the session key with an asymmetric algorithm and transmit it, in encrypted form, together with the document. To decrypt the document, first decrypt the session key and then the document itself. Because the session key is short, encrypting it takes little time. The symmetric algorithms currently in use achieve roughly a megabyte per second in software implementations and tens of megabytes per second on specialized cryptographic processors, whereas asymmetric algorithms deliver from units to tens of kilobytes per second, depending on the key length. With a session key of 8-32 bytes, such a hybrid cryptographic scheme turns out to be quite effective.
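A minimal sketch of such a hybrid scheme, assuming the third-party Python cryptography package: the document is encrypted under a short session key with a symmetric construction (Fernet here, standing in for any fast block or stream cipher), and only the session key is encrypted with the recipient's RSA public key.

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The recipient's long-term asymmetric key pair.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# A one-time session key encrypts the (possibly long) document quickly.
session_key = Fernet.generate_key()
ciphertext = Fernet(session_key).encrypt(b"a long document")

# Only the short session key is encrypted with the slow asymmetric algorithm.
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = public_key.encrypt(session_key, oaep)

# Decryption: first recover the session key, then the document itself.
recovered_key = private_key.decrypt(wrapped_key, oaep)
plaintext = Fernet(recovered_key).decrypt(ciphertext)
assert plaintext == b"a long document"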
But let us return briefly to the security estimates for asymmetric algorithms and to the problems that arise when using them. In this review we will try to do without mathematical derivations and formulas7; we only note that the difficulty of recovering a secret key from a public one is determined by the difficulty of factoring a large number into prime factors (the «factorization problem») or by the difficulty of the «discrete logarithm» problem. The discrete logarithm problem can be formulated as follows: take a known number, raise it to an unknown power, and report as the result the remainder of dividing that power of the number by a known large prime (the modulus). All numbers are natural. Find the exponent. The numbers involved in the calculations are typically several hundred decimal digits long (those same 512, 1024 or 2048 bits).
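A tiny numerical illustration of this asymmetry (with deliberately modest numbers, added for this review): the forward direction is a single call to Python's built-in modular exponentiation, while the only inverse shown here is blind search, which is hopeless for realistic exponent sizes.

# Forward direction: fast even for very large numbers, thanks to pow(g, x, p).
p = 2**127 - 1          # a large known prime (the modulus)
g = 3                   # a known base
x = 123456789012345     # the secret exponent
y = pow(g, x, p)        # computed almost instantly

# Inverse direction (the discrete logarithm): nothing better than search is shown here.
def toy_discrete_log(g, y, p, limit):
    acc = 1
    for exponent in range(1, limit):
        acc = (acc * g) % p
        if acc == y:
            return exponent
    return None          # gives up; for realistic exponents this never finishes

print(toy_discrete_log(g, pow(g, 20, p), p, 1000))   # a small case: prints 20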
Today's results in complexity theory state that the computational complexity of the discrete logarithm problem depends exponentially on the size of the modulus, but no one has yet managed to prove rigorously that this dependence cannot be polynomial. If, suddenly, tomorrow some brilliant mathematician finds a fast way to factor large numbers8, it may turn out that we are left without asymmetric cryptography. Mathematicians estimate the probability of this event as close to zero, but when using asymmetric cryptography it is worth remembering this kind of «sword of Damocles».
Hash functions and a little bit about electronic signature
As noted in the introduction, cryptographic methods can ensure not only confidentiality, but also control the integrity of transmitted or stored data. Integrity control is mainly performed by calculating some «checksum» of the data. Mathematicians and engineers working in the field of data transmission and coding theory have developed many algorithms that calculate checksums of transmitted data. For many applications, a simple checksum (for example, the well-known crc32 algorithm or sequential byte-by-byte or word-by-word addition of the source text with a known constant) is sufficient, especially when the data processing speed is important and the amount of data is not known in advance (a typical case is data transmission over communication channels).
The problem with simple checksum calculation algorithms is that it is fairly easy to find several data arrays that have the same checksum. Cryptographically strong checksums are calculated as the result of applying a so-called hash function to the original text.
One of the results of complexity theory and the theory of functions is the hypothesis that one-way functions exist. A one-way function is a function defined (for example) on the set of natural numbers whose value can be computed without large computational resources, but whose inverse (that is, recovering the argument from a known value of the function) is theoretically impossible or, at the very least, computationally infeasible.
The existence of one-way functions has not yet been rigorously proven, so all currently used hash functions are only «candidates» for one-way functions, although they have fairly good properties. The main properties of a cryptographically «good» hash function are diffusion, collision resistance, and irreversibility.
We have already discussed irreversibility. A collision of a hash function H is a situation in which there exist two different texts T1 and T2 such that H(T1) = H(T2). Since the value of a hash function always has a fixed length and no restrictions are imposed on the length of the original text, collisions necessarily exist. The collision-resistance requirement means that for a cryptographically «good» hash function, given a text T1, it is computationally infeasible to find another text T2 that produces a collision.
The diffusion property requires that minimal changes to the text to be hashed cause maximal changes to the hash value.
The main algorithms implementing hash functions today are MD2, MD4, MD5, SHA and its variant SHA-1, and a Russian algorithm defined by the standard GOST R 34.11-94. The most commonly used are MD5, SHA-1 and, in Russia, GOST R 34.11-94. The length of the hash value varies; a typical length is 16-32 bytes. In light of recent cryptanalytic results, MD5 will probably have to be abandoned in the near future since, as has been stated, «its collision resistance has dropped and has probably approached the point where one can no longer speak of resistance at all».
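The diffusion property is easy to observe with Python's standard hashlib module; in this small illustration (not from the original article) a one-character change in the text yields a completely different digest.

import hashlib

t1 = b"Pay 100 dollars to Alice"
t2 = b"Pay 900 dollars to Alice"   # a minimal change in the text

print(hashlib.sha1(t1).hexdigest())
print(hashlib.sha1(t2).hexdigest())
# The two 160-bit digests differ in roughly half of their bits,
# and neither reveals anything about the original texts.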
The section heading contains the words «electronic signature». In this same issue, an article was published entirely devoted to the practical and legal issues of using an electronic signature, so there is no need to dwell on this in detail here. But it would be wrong not to mention the electronic signature at all.
The point is that without asymmetric cryptography there would be no electronic signature at all! The idea of the electronic signature is simple. When the asymmetric encryption process was described above, it was noted that the public key is used to encrypt a message and the secret key to decrypt it. But, as applied to encryption, the keys are interchangeable: you can encrypt a message with your secret key, and then anyone can decrypt it with the public key. This property of asymmetric algorithms is what is used in forming and verifying an electronic digital signature. In essence, the digital signature of a document is its hash sum encrypted with the secret key. Verifying the digital signature comes down to computing the hash sum of the document, decrypting the hash sum contained in the signature, and comparing the two values: if the computed hash sum matches the one stored in the signature, the signature on the document is considered correct.
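A sketch of this sign-and-verify cycle, assuming the third-party Python cryptography package; RSA with PSS padding is used here for illustration, and what matters is the hash-then-transform-with-the-secret-key structure rather than the particular parameters.

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()
document = b"I, the undersigned, agree to the terms above."
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)

# Signing: a hash of the document is computed and transformed with the secret key.
signature = private_key.sign(document, pss, hashes.SHA256())

# Verification: recompute the hash and check it against the signature with the public key.
try:
    public_key.verify(signature, document, pss, hashes.SHA256())
    print("the signature is correct")
except InvalidSignature:
    print("the signature is INVALID")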
Electronic watermarks
The previous section addressed integrity control and authenticity assurance for an electronic document. Is it possible to ensure the authenticity of a paper document (a will, a power of attorney, or another legal document) using cryptographic technologies? The traditional approach is to draw up a legal document on paper with watermarks or other protective features; this requires a special form at the moment the document is drawn up, or a printing house with special equipment. But what should be done when the paper document already exists and needs to be protected from counterfeiting9?
It is known that each sheet of paper is unique in the structure of its fibers. With the advent of inexpensive scanners with high resolution and reliable image recognition technologies, it has become possible to analyze the microstructure of paper and use the information obtained to ensure the uniqueness of the document.
Today there is already a mature technology, implemented as a hardware and software solution, that ensures the uniqueness of paper documents based on the idea above.
The selected document is scanned at high resolution, and several features (micro-inclusions, characteristic bends of the constituent fibers, and so on) are extracted from the scanned image. An analogy with fingerprint analysis suggests itself here, and not by coincidence. The data obtained are transformed into a binary array, for which a hash function is computed.
The resulting hash value serves as an analogue of a «watermark» and ensures the uniqueness of the document. The technology can easily be extended: the hash value can be printed directly on the document form, next to the notary's seal, when the document is notarized10. Such an approach, however, requires appropriate legislation.
Attempts to use «electronic watermarks» for non-paper media have, unfortunately, not yet been successful. The best-known example is the attempt to protect DVD discs from illegal distribution. The developers' idea was, in addition to encrypting the information on the disc11, to place on it some information that is lost, or ceases to be valid, on a copy. Practice has shown that attempts to implement such a technology were unsuccessful.
The given example, by the way, reflects a profound and, unfortunately, often unnoticed difference between traditional documents and electronic ones. The essence of this difference is clearly visible in the example of the use of an electronic signature. A signature verification program, generally speaking, can only establish that the document being verified was signed using a key with the specified identifier and that the signature is correct (or incorrect). But it is impossible to determine from the signature who exactly used this key.
Let's say, for example, that the checksum of a legal DVD disc was calculated using its characteristics, such as the coating material, the barcode data, the manufacturer's code, and the serial number of the disc. Having the algorithm for calculating such a checksum would allow potential pirates to make an unlimited number of copies simply by recalculating the checksum during the manufacturing process for the blanks they have at their disposal. Any DVD player will perceive the disc produced in this way as legal!
Phil Zimmermann, PGP and Public Key Certification Systems
Section 2 noted the problem of preserving the authenticity of public encryption and signature keys in the case of their open storage and distribution. With the mass distribution and use of open cryptography tools, the issue of ensuring the authenticity of keys becomes central and requires special attention.
The traditional method of certifying the authenticity of a document is its notarization. To ensure the authenticity of a public key, it was also proposed to use an electronic analogue of notarization — an electronic signature of a trusted person or a group of persons. So, in this matter, humanity again did not come up with anything fundamentally new. The authenticity of a public key and its attributes is ensured by an electronic signature under it, belonging to a certain person or a group of persons trusted by the exchange participants. In practice, this is implemented through the so-called «key certification centers» or «trust centers» (the English term is Certification Authority, which, unfortunately, also has no Russian literal analogue).
Before talking about public key certification authorities and public key infrastructure in general, a few words should be said about the pioneer of «open» cryptography, Phil Zimmermann. He is worth mentioning not only because he made cryptographic methods available to a wide range of users, but also because he came up with an original idea for ensuring the authenticity of public keys.
Let us recall that Phil Zimmermann wrote and made widely available, in 1992, the PGP (Pretty Good Privacy12) program, designed for encrypting and electronically signing messages. In the life of PGP, and of Zimmermann himself, there was both «the wrath of the intelligence agencies» and the recognition of the «broad masses of users»: for a time Zimmermann was under investigation, allegedly for «disclosing state secrets». The fascinating history of the emergence and development of PGP is not the topic of this article; what interests us are the original, social as much as technological, ideas implemented in PGP.
PGP is a program that allows you to encrypt and sign electronic documents using several cryptographic algorithms. For encryption it uses a «hybrid» scheme: the information is encrypted with a symmetric algorithm under a session key, and the session key is encrypted with an asymmetric algorithm under the public key of the message recipient. Since Phil Zimmermann saw one of PGP's purposes as letting people use cryptographic technologies free of state control, he proposed the idea of mutual certification of public keys.
The idea of mutual certification of public keys is implemented as follows. You, as a PGP user, create your key pair. Then, if you want to exchange encrypted information with other people, you must take care of a secure way of handing your public key to your correspondents; one such way is a personal meeting. You and your correspondent give each other your public keys, sign each other's keys, and place the resulting «certificates» on a freely accessible server. A public key may carry several signatures, and each of them can be assigned a certain «trust level», with the trust level determined by the end user of the key. That is, following the principle «a friend of my friend is my friend», you can put your friends in touch with your correspondent: if your friends trust you completely, your signature under your correspondent's key will be enough for them to use it.
Implicit in the PGP technology is the principle of «mutual trust»13. If I, as a PGP user, am interested in increasing the number of correspondents with whom I can maintain encrypted communication, I will try to find more people who trust me enough to certify my key, and they, in turn, will look for their own trusted signers. As a result, complete strangers who wish to exchange encrypted messages can obtain from the PGP key server public keys carrying a large number of signatures of people who do not know one another14. With high probability, among the signatures in such a certificate there will be either the signature of someone these people trust completely or several signatures of people whom they trust, if not unconditionally, then at least to some degree. And each correspondent remains free to decide how much weight to give to each signature.
And yet Phil Zimmermann did not manage to realize his idea in full: the system of «mutual certification based on trust» did not stand the test of reality. PGP, of course, continues to be used and has its supporters, but at the corporate level (and in some countries, at the state level) the PGP-style key certification system has been completely displaced by trusted centers, or public key certification centers. The public institution of justices of the peace and notaries is nothing new, so the creators of certification centers did not have to invent anything: instead of mutual key certification, a mechanism for certifying public keys with an «electronic notary» was proposed.
To turn a public key into a public key certificate, the key owner's details, the «trust center's» details, the key's validity period and other necessary information are added to it. All this information is encoded in a special way to ensure unambiguity and is signed with the secret key of the trust center. Jet Info has already published an article on the organization of trust centers (see [9]), containing, in particular, definitions of the basic terms, so this review will not dwell on the terminology, general principles and standards for organizing certification centers.
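What «adding the owner's details, the centre's details and a validity period, then signing with the trust centre's secret key» can look like in practice is sketched below with the X.509 interface of the third-party Python cryptography package; all field values are, of course, invented.

import datetime
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

ca_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)    # trust centre
user_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)  # key owner

subject = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, u"Ivan Ivanov")])
issuer = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, u"Example Trust Centre")])

certificate = (
    x509.CertificateBuilder()
    .subject_name(subject)                            # the key owner's details
    .issuer_name(issuer)                              # the trust centre's details
    .public_key(user_key.public_key())                # the public key being certified
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.datetime(2001, 1, 1))  # the key's validity period...
    .not_valid_after(datetime.datetime(2003, 1, 1))   # ...start and end
    .sign(ca_key, hashes.SHA256())                    # signed with the centre's secret key
)
print(certificate.subject)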
The legal status of a trust center can be anything from officially state-run to corporate or private. Many large Western corporations already have their own corporate certification centers for organizing the intra-company exchange of confidential information. Independent certification centers, such as VeriSign, Inc., have been created and operate successfully; Microsoft15 and American Express have set up their own public certification centers. The creation and deployment of Russian certification centers is a matter of the near future.
Currently, the State Duma of the Russian Federation is considering several bills that regulate the use of electronic digital signatures in one way or another. One of the bills is called «On the Use of Electronic Digital Signatures». The others are «On Electronic Commerce» and «On Electronic Trade».
All the bills envisage the deployment of public key certification centers at the federal level. At the same time, none of them touches on the departmental affiliation of these centers, nor describes the circle of persons and/or organizations that will be required to use the services of the certification centers being created. It is too early to say what awaits us, as Russian consumers and developers of information technologies, if one of the bills under consideration is enacted into law.
Identification and authentication systems
In many applications, the task of identifying and authenticating a person or a program accessing a resource is even more important than that of ensuring confidentiality. Almost all multi-user and network operating systems require user authentication, as do ATMs and point-of-sale terminals. With the development of the Internet and paperless technologies, the number of applications requiring user authentication will only grow.
So, first, definitions.
In what follows, by a subject we mean a user or a user agent (a program) accessing a certain resource. By an information system we mean an individual computer, a computer network, or another electronic device whose use is regulated by some system of powers and/or rights. The task of identification and authentication systems is to determine and verify the set of powers of a subject accessing an information system.
Identification of a subject accessing an information system is the process of matching the subject against some characteristic of it stored by the system: the identifier. The identifier is subsequently used to grant the subject a certain level of rights and powers in the information system.
Authentication of a subject is the procedure of verifying that an identifier really belongs to the subject. It is performed on the basis of some secret element (the authenticator) that is possessed by both the subject and the information system. Usually the information system stores not the secret element itself but some information about it, on the basis of which it decides whether the subject corresponds to the identifier.
To make this dry theory more understandable, let's look at a concrete example. Before starting an interactive session, most operating systems ask the user for his name and password. The entered name is the user identifier, and the password is the authenticator16. The operating system usually stores not the password itself but its hash sum, thereby ensuring that the password is practically impossible to recover (see the section on hash functions).
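A sketch of the «store a hash, not the password» idea using only Python's standard library. The salted, iterated PBKDF2 construction used here is a modern refinement rather than what early systems actually stored, but the principle is the same: the stored record allows the password to be checked, not recovered.

import hashlib, hmac, os

def make_password_record(password: str):
    # Store a random salt and an iterated hash; the password itself is never stored.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)   # constant-time comparison

salt, digest = make_password_record("correct horse battery staple")
assert check_password("correct horse battery staple", salt, digest)
assert not check_password("12345", salt, digest)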
Using a username-password pair to authenticate subjects is the most common scheme, but not the only one. There are in fact only a few fundamentally different authentication methods. One class of methods relies on the authenticated subject possessing some secret element (a password, a secret key, or a special authentication token). Another class of methods applies mainly to authenticating people and is based on unique physical properties of the person (fingerprints, hand geometry, voice, iris). Each class has its advantages and disadvantages; we will compare them a little later, but first let us look at the different authentication methods in more detail.
Algorithmically, the authentication procedure is a sequential exchange of one or more information messages between the subject and the information system, with intermediate processing by both sides. As a result of these actions, each party to the exchange must become convinced that the other is who it claims to be.
We have already discussed authentication with a secret element. Another common method is authentication using public key certificates; several such algorithms have been developed and are in use. Typically, key-based authentication is combined with the generation of a shared symmetric key for subsequent message exchange. The best-known procedure for mutual authentication of a pair of subscribers is the Diffie-Hellman method, widely described both in the authors' own articles and in independent works ([4], [8]). The essence of the method is that each participant, by mathematically transforming his own secret key and his correspondent's public key and by exchanging non-secret messages, independently arrives at a secret number. Since the secret and public keys of the subscribers are related in a particular way, it is possible to choose the transformations so that the numbers obtained by both subscribers coincide. The resulting secret number can then be used as a shared secret.
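The whole exchange fits in a few lines of Python. The sketch below (added for this review) uses a deliberately small public prime for illustration; in real use the modulus is hundreds of digits long.

import secrets

# Public parameters known to everyone; in real use p is hundreds of digits long.
p = 0xFFFFFFFFFFFFFFC5   # a small prime modulus, for illustration only
g = 5

# Each subscriber picks a secret key and publishes g**secret mod p.
a_secret = secrets.randbelow(p - 2) + 1
b_secret = secrets.randbelow(p - 2) + 1
a_public = pow(g, a_secret, p)
b_public = pow(g, b_secret, p)

# Each subscriber combines his own secret key with the other's public value...
a_shared = pow(b_public, a_secret, p)
b_shared = pow(a_public, b_secret, p)

# ...and both arrive, independently, at the same shared secret number.
assert a_shared == b_shared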
Another interesting authentication method is the use of an authentication token. An authentication token is a physical device, usually small for ease of carrying: a smart card, or one of the more recently introduced devices that plug into a USB port and are shaped like a key fob. A token typically contains non-volatile memory and a specialized processor «on board»; some devices additionally have a built-in hardware random number generator or a timer (a real-time clock). The token's processor, depending on its power, can perform a wide variety of operations: there are processors capable of encrypting data with the DES algorithm or of computing keyed hash sums (HMAC-MD5). A specialized token thus allows cryptographic transformations to be performed without the key ever leaving the token's memory; only non-secret or encrypted data pass between the token, the computer and the information system, which additionally protects the authentication protocol against key interception. Usually, programmatic access to the token is possible only after entering a PIN code known solely to the token's owner. The additional capabilities of tokens make it possible to implement more reliable authentication protocols.
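A sketch of the kind of challenge-response protocol such a token makes possible, using the keyed hash mentioned above (HMAC-SHA256 from Python's standard library stands in for HMAC-MD5, and the «token» is simply a function here): the key never leaves the device, and only non-secret values cross the wire.

import hashlib, hmac, os, secrets

token_key = os.urandom(32)   # written into the token when it is personalised
server_copy = token_key      # the information system holds the same secret

def token_response(challenge: bytes) -> bytes:
    # Computed inside the token: the key itself never leaves the device.
    return hmac.new(token_key, challenge, hashlib.sha256).digest()

# The server sends a fresh random challenge and checks the keyed hash of it.
challenge = secrets.token_bytes(16)
response = token_response(challenge)
expected = hmac.new(server_copy, challenge, hashlib.sha256).digest()
print("authenticated:", hmac.compare_digest(response, expected))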
An interesting authentication technology based on «one-time passwords» was proposed by Security Dynamics (now a division of RSA Security). The technology is called SecurID. The one-time passwords are pseudo-random numbers, and the generator of the pseudo-random sequence is an authentication token. RSA Security offers several token variants: a smart card, a calculator-like device with PIN entry, and key fobs. Each token has a unique serial number and generates a new pseudo-random number once per minute. The period17 of the pseudo-random number generator is such that a single token can be used for two years.
For authentication using SecurID technology, the information system must include a SecurID authentication server and a database matching the names of authenticated users to token serial numbers. An authentication request from a user consists of his name and the current number read from the token. Based on the number received from the user and the token's serial number, the server decides whether this number belongs to the sequence generated by that particular token.
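The SecurID generator itself is proprietary and is not reproduced here; the sketch below illustrates the same general idea with an openly documented construction of the TOTP/HOTP type, in which a short code is derived from a per-token secret and the current minute.

import hashlib, hmac, struct, time

def one_time_code(token_secret: bytes, step: int = 60) -> str:
    # A six-digit code that changes once per time step, derived from the token secret.
    counter = int(time.time() // step)
    mac = hmac.new(token_secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                                  # dynamic truncation
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{value % 1_000_000:06d}"

secret = b"per-token secret written at manufacture"
print(one_time_code(secret))   # what the user reads from the token
# The server, which stores the same secret, recomputes the code for the current
# (and perhaps the previous) time step and compares it with what the user sent.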
These and many other authentication methods suffer from one drawback — they, in fact, authenticate not a specific subject, but record the fact that the subject's authenticator corresponds to his identifier. That is, all of the listed methods are not protected from compromise of the authenticator. Biometric identification or authentication methods are free from this drawback. As already noted, biometric methods are based on the analysis of the unique characteristics of the person himself.
A biometric characteristic can serve either as an identifier (fingerprinting, in effect, treats the fingerprint as a personal identifier) or as an authenticator (the user enters his name and confirms it by looking into the eyepiece of an iris scanner). For some applications (for example, access control to premises) identification is sufficient; in certain («paranoid») cases, one must enter a user name, present a fingerprint, and also pronounce a code phrase.
The most common (and cheap) method of biometric identification or authentication is fingerprint analysis. When a user registers, a digest — a certain hash sum of the scanned fingerprint — is placed in the authentication server database. Depending on implementation, the digest length is 200-400 bytes.
However, biometric methods of personal authentication have one serious drawback (apart from their relatively high cost). If an authentication token, key or password is compromised, the subject can stop using it and obtain a new authenticator. If the electronic representation of a biometric authenticator is compromised, the person may simply «drop out» of the authentication process, since the characteristic itself cannot be replaced. When a biometric characteristic is used only as a personal identifier, there is no threat of compromise.
A little about cryptanalysis
Cryptanalysis is a field of activity aimed at deciphering other people's encrypted messages with an unknown encryption key and/or algorithm. It is difficult to define cryptanalysis as just a science. It also includes elements of art, luck, and even «shamanism».
The «unconventional» nature of the work of cryptanalysts is illustrated by, for example, a «real-life story» told to the author of this article by a retired employee of the Soviet Cryptanalytic Service. It happened a long time ago, when computers didn't really exist yet. An encrypted note, transferred from places of detention to «freedom,» came into the service. The note landed on the desk of one of the employees, known for his extreme neatness. The man carefully smoothed the note with his hands and immersed himself in the analysis. The cipher turned out to be not very difficult for him. After some thought over the note, the man first got up from the table, went to the sink, washed his hands thoroughly with soap, then wiped his hands and the table with alcohol. The note stated that its author had syphilis.
Before talking about cryptanalysis, it should be said that modern cryptography proceeds from the principle that the strength of a cryptographic transformation must be ensured only by keeping the key secret. When a cryptographic system is designed, it is assumed from the outset that the transformation algorithm is open (or, at a minimum, known to a potential adversary), while the key, on the contrary, is reliably protected. The strength of an encryption algorithm is usually estimated as the number of typical operations required to decrypt an encrypted message and/or recover the encryption key using the best known algorithm.
A typical and most common example of the use of cryptanalysis by people far removed from cryptology is «password guessing». It is difficult to find a person (at least in Russia) related to information technology who has never in his life encountered the problem of «cracking» a password to access a protected archive or a document in Microsoft Word format. Well, and guessing a superuser password in almost all currently used operating systems is generally «the talk of the town»!
The simplest, and slowest, method of password guessing is exhaustive enumeration of all possible variants. It is ineffective and often infeasible for reasons of time. Cryptanalysis aims to find methods of decrypting a document and/or recovering the encryption key in less time than exhaustive enumeration would take.
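A toy illustration of exhaustive search, added for this review: a deliberately weak cipher (XOR with a short repeating key) and a deliberately tiny 16-bit key space, attacked under the known-plaintext assumption discussed a little further on. The same loop over a 56- or 128-bit key space is exactly what cryptanalysis tries to avoid.

import itertools

def xor_repeat(data: bytes, key: bytes) -> bytes:
    # A deliberately weak cipher: XOR the data with the key repeated periodically.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

secret_key = bytes([0x5A, 0xC3])   # a 16-bit key: a toy-sized key space
ciphertext = xor_repeat(b"Dear Sir! The attack begins at dawn.", secret_key)

# Exhaustive search: try every key and keep those that yield the expected opening words.
for candidate in itertools.product(range(256), repeat=2):
    key = bytes(candidate)
    if xor_repeat(ciphertext, key).startswith(b"Dear Sir!"):
        print("key found:", key.hex())
        break
# All 2**16 keys are tried in a moment; 2**56 keys at the same rate would take
# far longer than any attacker can afford.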
In addition to simple enumeration, or, as it is also called, the «brute force method», cryptanalysis has other fairly well-developed methods (see, for example, [3]). There is almost no systematic presentation of cryptanalytic techniques in the open press, because it is no secret that the main cryptanalytic research in almost all countries is carried out by intelligence services, and intelligence services of all countries are similar in their desire to conceal their methods of work.
As a rule, all known methods of cryptanalysis assume that the cryptanalyst has, in addition to the ciphertext, the corresponding plaintext or fragments of it. In laboratory conditions this requirement is satisfied automatically, while in a real attack on a cipher plausible assumptions are made about the original text (for example, that encrypted letters begin with the words «Dear Sir!»), or the structure or content of the plaintext is learned by other means.
Recently, due to the development of the Internet, it has become possible to effectively use the «brute force method» by parallelizing operations. This approach is usually implemented as follows. Somewhere on the Internet, a server is installed, from which anyone can download a program that decrypts a test message by trying keys. The program is usually supplied both as source code and compiled for the most common operating systems. After launching, the program establishes a connection to the server, receives a set of keys from it for trying, and, after finishing the work, reports the result to the server.
The program can work in the background or be activated at night. There are programs for enumerating keys, implemented as screen savers. The programs can not only «break» ciphers, but also, for example, select two texts that have the same value of the hash function calculated by the specified algorithm.
With this approach to «breaking» a cipher, the question of the uniqueness of decryption arises: might there be, for a given ciphertext, two or more keys whose use yields a «meaningful» text upon decryption? It turns out that beyond a certain length of the original message this becomes essentially impossible: for the cryptosystems currently in use, a message 60-100 bytes long is already decrypted unambiguously.
Usually, the «distributed hacking» method is now used either by cryptographic software manufacturers to test their products or by various non-profit organizations to determine the cryptographic strength of the algorithms used (See, for example, http://distributed.net). The most interesting thing is that the described «distributed hacking» method, which relies solely on the enthusiasm of the participants, has already yielded amazing results.
Conclusion
Naturally, this review does not pretend to be complete. Such areas of cryptographic research as quantum cryptography, cryptographic protocols, keyed hash sums, distributed authentication systems, or pseudorandom sequence generation methods are beyond the scope of the review. Some issues have already been covered in our journal, and it is possible that coverage of other topics will be a matter of the near future.
Literature
[1] W. Diffie, M. E. Hellman — New Directions in Cryptography. IEEE Transactions on Information Theory, v. IT-22, pp. 644–654, Nov 1976
[2] Gilles Brassard — Modern Cryptology. M., «POLYMED», 1999
[3] Bruce Schneier — Self-Study Course in Block Cipher Cryptanalysis. Cryptologia, v. 24, n. 1, pp. 18–34, Jan 2000
[4] Bruce Schneier — Applied Cryptography, Second Edition: Protocols, Algorithms, and Source Code in C. John Wiley & Sons, Inc., 1996
[5] Donald R. Richards — Biometric Identification. Handbook of Information Security Management, Auerbach, pp. 5–26, 1999
[6] Beresford N. Parlett — Some Basic Information on Information-Based Complexity Theory. Bulletin of the AMS, v. 26, n. 1, pp. 3–27, January 1992
[7] H. Krawczyk, M. Bellare, R. Canetti — Keyed-Hashing for Message Authentication. RFC 2104, February 1997
[8] E. Rescorla — Diffie-Hellman Key Agreement Method. RFC 2631, June 1999
[9] Victor Gorbatov, Olga Polyanskaya — Trusted Centers as a Link in the System of Ensuring the Security of Corporate Information Resources. Jet Info, N 11(78), pp. 12–206, 1999
[10] Nadezhda Vyukova, Vladimir Galatenko — Kerberos Authentication Server. Jet Info, N 12–13(19–20), 1996
1 The National Security Agency (NSA), the US government's cryptographic and cryptanalytic service.
2 Bruce Schneier in his book «Applied Cryptography» [4] gives the following figures. In 1976-1977, Diffie and Hellman stated that building a specialized parallel computer capable of finding a key in one day would cost $20 million. In 1981, Diffie revised his own estimate: a computer capable of finding a key in two days would cost $50 million.
3 In Russia, the term «initial filling» has been adopted.
4 In the sense of enabling timely cryptanalysis by independent experts and the identification of potential weaknesses in the algorithm.
5 The analysis of, and attempts to solve, computationally hard problems are carried out by mathematicians developing «information-based complexity theory», whose success is largely owed to modern cryptography.
6 The simplest option for distributing keys is a personal meeting. The most expensive and most reliable, at least in Russia, is the State Courier Service.
7 Perhaps our newsletter will carry other, more specialized articles on this topic.
8 Or the «prime number formula» that Leonhard Euler and other great mathematicians sought in vain.
9 With today's level of full-color printing technology, forging notarized documents or making copies of documents is not difficult.
10 It is even possible to use a hash function with a key (see, for example, [7]) or an electronic digital signature.
11 The topic of DVD encryption deserves separate attention as an example of an unsuccessful application of cryptography and a demonstration of how flawed the principle of «security through obscurity» is.
12 Unfortunately, Russian has no exact equivalent of the English term Privacy; depending on the context it is rendered as «secrecy», «the right to private life» or «protection of personal information».
13 How can one not recall the numerous «Mutual Credit Societies» from Ilf and Petrov's «The Golden Calf»?
14 Statistics claim that, on average, any two inhabitants of the Earth are connected through a chain of six acquaintances.
15 Perhaps because of the ability, introduced in Windows 2000, to protect the code of drivers with an electronic signature.
16 Strictly speaking, in operating systems such as UNIX or VMS, the user identifier is a number assigned when the account is created, and the user name is just another unique attribute of the account.
17 The period of a pseudo-random number generator is the maximum length of the output sequence after which the generator begins to produce repeating numbers.
Jet Info, N 3(94), 2001