Expert study of the parameters of the fundamental frequency distribution
Ivanov I.L.
Orel
(Manual for experts of the system of expert institutions of the Ministry of Justice of the Russian Federation (SEI MJ RF); authors: Kaganov A.Sh. et al.; signed for publication in 2004)
Chapter 3. Technical and software tools used in forensic phonographic examination
§ 1. Theoretical foundations for conducting the instrumental part of speaker identification
Identification features associated with the assessment of the parameters of the excitation source of the vocal tract.
Kasym – asymmetry coefficient, %;
+dF0 – average value of the first derivative of the fundamental tone frequency (F0) as it increases, Hz/s;
D(+dF0) – variance of the average value of the first derivative of F0 as it increases;
–dF0 – average value of the first derivative of F0 as it decreases, Hz/s;
D(–dF0) – variance of the average value of the first derivative of F0 as it decreases.
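As a minimal sketch of how the derivative-based parameters listed above might be computed, the Python fragment below takes a fundamental-frequency contour sampled at a fixed frame step, forms frame-to-frame slopes in Hz/s, and averages the rising and falling slopes separately. The function name, the fixed frame step, the finite-difference derivative, and the use of the population variance are all assumptions made for illustration; the manual's exact procedure is not reproduced here, and the asymmetry coefficient is left out because its formula is not stated.

```python
import statistics


def slope_statistics(f0_hz, frame_step_s):
    """Return rising/falling slope statistics for one voiced F0 contour.

    f0_hz        -- F0 values in Hz, one per analysis frame (hypothetical input)
    frame_step_s -- time between frames in seconds (assumed constant)
    """
    # First derivative approximated by finite differences, in Hz/s.
    slopes = [(b - a) / frame_step_s for a, b in zip(f0_hz, f0_hz[1:])]
    rising = [s for s in slopes if s > 0]
    falling = [s for s in slopes if s < 0]
    if not rising or not falling:
        raise ValueError("contour needs both rising and falling frames")
    return {
        "+dF0": statistics.mean(rising),          # average rate of rise
        "D(+dF0)": statistics.pvariance(rising),  # variance of rising slopes
        "-dF0": statistics.mean(falling),         # average rate of fall (negative)
        "D(-dF0)": statistics.pvariance(falling), # variance of falling slopes
    }


# Illustrative contour (made-up numbers), 10 ms frame step.
stats = slope_statistics([180.0, 182.0, 185.0, 183.0, 180.0], 0.01)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

In practice the contour would come from a pitch tracker, and unvoiced gaps would have to be excluded before differencing; both steps are outside this sketch.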
It is important that, according to a number of works [1; 5; Appendix 1 of this manual], the fundamental tone frequency (F0) parameters selected for analysis are stable over time. For example, the data obtained by E.V. Belovol [1] indicate that 12 of the listed parameters remained homogeneous for each subject over the 5 months of the experiment. Differences in the subjects' emotional state did affect the analyzed acoustic parameters, but had no significant impact on their individual stability over time. If, at the final stage of the study of the F0 parameters, we analyze the weighted average relative deviation of the fundamental tone parameters of the original speech material from the corresponding parameters of the sample phonogram, we can see that it lies within the limits of the average statistical intra-speaker variability.
At the same time, the results of the instrumental study of the characteristics of the excitation source of the speech signal are reliable, since they are based on a representative sample (the total number N of measured fundamental tone periods can easily be calculated for both the original and the comparative recordings).
As expert practice shows, repeated attempts to introduce these parameters of the statistical distribution of the fundamental tone into the software product have not led to a significant positive effect. What might this be due to? The stability described above presupposes that the following conditions hold:
1. The person being identified is brought to the same emotional state before voice and speech samples are taken.
2. The same situational environment is created.
3. The same acoustic environment is created for all sample collections spaced in time (over 5 months, as noted in the book).
4. The same recording channel is used.
5. The same recording equipment is used.
Real expert practice, however, shows that these conditions are usually not met:
1. The expert cannot draw on previously accumulated samples of the defendant's voice and speech.
2. The situational environment differs sharply.
3. The emotional state differs sharply.
4. The recording channels differ (for example, a pay-phone conversation versus free speech recorded indoors in a heightened emotional state, when the defendant understands that he is testifying against himself), etc.
5. The frequency responses of the phonograms being compared do not, as a rule, match.
Moreover, while the expert can request additional voice and speech samples, a second take of the original recording will never be provided.
Noise reduction of both the samples and the material under examination, tempo correction, and so on may also be required.
Thus, repeated attempts to build these parameters into the software product have not been successful.
The study of +dF0 (the average value of the first derivative of the fundamental tone frequency, F0, as it increases, Hz/s) and –dF0 (the average value of the first derivative as it decreases, Hz/s), that is, the rates of rise and fall, led to an attempt to combine these two parameters into one: dF0, the average value of the first derivative of F0 (the average rate of change of F0).
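One way to read the combined parameter is as the mean absolute frame-to-frame slope of the fundamental-frequency contour, so that rises and falls both contribute with a positive sign. The sketch below follows that reading; it is an assumption, since the text gives no explicit formula, and the function name and the fixed frame step are hypothetical.

```python
def average_rate_of_change(f0_hz, frame_step_s):
    """Mean absolute frame-to-frame F0 slope in Hz/s (assumed definition).

    f0_hz        -- F0 values in Hz for one voiced segment (hypothetical input)
    frame_step_s -- time between frames in seconds (assumed constant)
    """
    if len(f0_hz) < 2:
        raise ValueError("need at least two F0 values")
    # Absolute change between neighbouring frames, regardless of direction.
    steps = [abs(b - a) for a, b in zip(f0_hz, f0_hz[1:])]
    return sum(steps) / (len(steps) * frame_step_s)


# Illustrative contour (made-up numbers), 10 ms frame step.
rate = average_rate_of_change([180.0, 182.0, 185.0, 183.0, 180.0], 0.01)
print(f"average rate of change: {rate:.1f} Hz/s")
```

Folding both directions into a single number gives up the rise/fall distinction in exchange for a statistic that can be estimated from fewer periods, which matters for very short recordings.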
This parameter turned out to be a stable feature. The following method was used to check it:
1. Voice and speech samples taken from one subject over different recording channels and in different emotional states are examined and compared with each other. As an example, consider the voice of a 13-year-old teenager:
Parameters of the fundamental tone and tempo of speech | Reading over a pay phone on the street | Reading indoors (dictaphone) | Free speech indoors (dictaphone) |
Number of fundamental tone periods | 2775 | 1586 | 325 |
Average, Hz | 181.0 (3.3%) | 193.4 (3.4%) | 187.1 |
Maximum value, Hz | 220.5 (2.0%) | 239.7 (6.5%) | 225.0 |
Minimum value, Hz | 149.0 (8.4%) | 157.5 (3.1%) | 162.6 |
RMS deviation, Hz | 13.3 (7.4%) | 13.1 (5.5%) | 12.4 |
Fundamental tone range (D) | 1.16 (1.5%) | 1.14 (0.3%) | 1.14 |
Average rate of change of the fundamental tone, Hz/s | 295.1 (1.5%) | 282.2 (3.0%) | 290.8 |
Here a very short recording, only 325 fundamental tone periods, is deliberately used as the original material.
As can be seen, the new coefficient fits well into the table of the main statistical parameters of voice and speech, and on some indicators it performs even better than the main ones.
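The percentages in the table above appear to be relative deviations of each value from the free-speech column, which serves as the reference. The sketch below recomputes them for two rows under that assumption; for these rows the result agrees with the printed percentages after rounding to one decimal place. The function name is hypothetical.

```python
def relative_deviation_pct(value, reference):
    """Relative deviation of value from reference, in percent."""
    return abs(value - reference) / reference * 100.0


# Values copied from the table: (pay phone, indoor reading, free-speech reference).
rows = {
    "Average, Hz": (181.0, 193.4, 187.1),
    "Average rate of change, Hz/s": (295.1, 282.2, 290.8),
}

for name, (pay_phone, indoor, reference) in rows.items():
    dev_pay_phone = relative_deviation_pct(pay_phone, reference)
    dev_indoor = relative_deviation_pct(indoor, reference)
    print(f"{name}: pay phone {dev_pay_phone:.1f}%, indoor reading {dev_indoor:.1f}%")
```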
Over two years of real expert practice, this parameter has proven itself on small volumes of research material (anonymous calls, terrorist threats, etc.). For example, consider a real examination in a school bomb-threat case.
Original phonogram No. 1
«They are singing in the second school, it is mined, if you don’t arrive in five seconds, it will explode.»
The total duration of the speech material selected for analysis was 3.5 sec.
Parameters of the fundamental tone and tempo of speech | Sample B__v I.V., phonogram No. 1 | Sample B__v I.V., phonogram No. 2 | Sample I__v V.S., phonogram No. 1 | Sample I__v V.S., phonogram No. 2 | Original phonogram No. 1 |
Number of fundamental tone periods | 10165 | 4553 | 4274 | 4109 | 527 |
Average, Hz | 162.5 | 203.4 | 123.9 | 165.2 | 197.3 |
Maximum value, Hz | 250.0 | 296.3 | 166.7 | 228.6 | 229.6 |
Minimum value, Hz | 80.0 | 60.6 | 60.6 | 62.5 | 127.0 |
RMS deviation, Hz | 29.08 (44.8%) | 38.46 (91.6%) | 12.52 (37.6%) | 25.50 (27.0%) | 20.08 |
Fundamental tone range (D) | 1.44 (17.1%) | 1.47 (19.6%) | 1.22 (0.1%) | 1.37 (11.3%) | 1.23 |
Average rate of change of the fundamental tone, Hz/s | 348.9 (44.6%) | 387.6 (60.6%) | 289.8 (20.1%) | 280.5 (16.2%) | 241.3 |
Comparing the two suspects in the school bomb threat, the table shows that the defendant I__v V.S. is the closest in terms of the coefficients (the auditory analysis gives the same result). This holds even though the emotional state in the original phonogram differs greatly from that in the sample phonograms, different recording channels were used, and so on.
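The comparison can be illustrated by averaging, for each sample phonogram, the relative deviations of the three parameters for which the table gives percentages. The sketch below uses an unweighted mean, which is an assumption: the manual speaks of a weighted average relative deviation, but no weights are given here. The dictionary keys and function names are hypothetical, and the ranking is an illustration, not a decision rule.

```python
def relative_deviation_pct(value, reference):
    """Relative deviation of value from reference, in percent."""
    return abs(value - reference) / reference * 100.0


# Values copied from the table (RMS deviation, range D, average rate of change).
original = {"rms": 20.08, "range": 1.23, "rate": 241.3}
samples = {
    "B__v I.V. No. 1": {"rms": 29.08, "range": 1.44, "rate": 348.9},
    "B__v I.V. No. 2": {"rms": 38.46, "range": 1.47, "rate": 387.6},
    "I__v V.S. No. 1": {"rms": 12.52, "range": 1.22, "rate": 289.8},
    "I__v V.S. No. 2": {"rms": 25.50, "range": 1.37, "rate": 280.5},
}


def mean_deviation_pct(sample):
    """Unweighted mean of the per-parameter relative deviations (assumed)."""
    deviations = [relative_deviation_pct(sample[k], original[k]) for k in original]
    return sum(deviations) / len(deviations)


# Smallest mean deviation first, i.e. closest to the original phonogram.
ranking = sorted(samples, key=lambda name: mean_deviation_pct(samples[name]))
for name in ranking:
    print(f"{name}: {mean_deviation_pct(samples[name]):.1f}%")
```

Under these assumptions both I__v V.S. samples rank ahead of both B__v I.V. samples, in line with the conclusion stated above.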
It is therefore proposed to continue the study of this parameter.
Taking into account two years of expert practice, it can, in principle, already be used by experts.
List of literature
1. Belovol E.V. Manifestation of temperament properties in acoustic characteristics of speech: Abstract of Cand. Sci. dissertation. – Moscow: Soyuz, 1999.
5. Kaganov A.Sh., Mikhailov V.G. Peculiarities of preparing voice and speech samples for conducting identification phonographic examination // Forensic Science of the XXI Century: Proceedings of the All-Russian scientific and practical conference. – Rostov-on-Don: URTSSE MJ RF, 2001. – P. 113–120.