Chromium Code Reviews

Side by Side Diff: webrtc/modules/audio_processing/test/py_conversational_speech/README.md

Issue 2733863002: Conversational Speech generator, main script with shell arguments (Closed)
Patch Set: typo fixed Created 3 years, 9 months ago
OLD NEW
1 #Conversational Speech generator tool 1 #Conversational Speech generator tool
2 2
3 Python tool to generate multiple-end audio tracks to simulate conversational 3 Python tool to generate multiple-end audio tracks to simulate conversational
4 speech with two or more participants. 4 speech with two or more participants.
5 5
6 The input to the tool is a directory containing a number of audio tracks and 6 The input to the tool is a directory containing a number of audio tracks and
7 a text file indicating how to time the sequence of speech turns (see the Example 7 a text file indicating how to time the sequence of speech turns (see the Example
8 section). 8 section).
9 9
10 Since the timing of the speaking turns is specified by the user, the generated 10 Since the timing of the speaking turns is specified by the user, the generated
11 tracks may not be suitable for testing scenarios in which there is unpredictable 11 tracks may not be suitable for testing scenarios in which there is unpredictable
12 network delay (e.g., end-to-end RTC assessment). 12 network delay (e.g., end-to-end RTC assessment).
13 13
14 Instead, the generated pairs can be used when the delay is constant (obviously 14 Instead, the generated pairs can be used when the delay is constant (obviously
15 including the case in which there is no delay). 15 including the case in which there is no delay).
16 For instance, echo cancellation in the APM module can be evaluated using two-end 16 For instance, echo cancellation in the APM module can be evaluated using two-end
17 audio tracks as input and reverse input. 17 audio tracks as input and reverse input.
18 18
19 By indicating negative and positive time offsets, one can reproduce cross-talk 19 By indicating negative and positive time offsets, one can reproduce cross-talk
20 and silence in the conversation. 20 and silence in the conversation.
21 21
22 IMPORTANT: **the whole code has not been landed yet.** 22 IMPORTANT: **the whole code has not been landed yet.**
23 23
24 ###Example 24 ###Example
25 25
26 For each end, there is a set of audio tracks, e.g., a1, a2 and a3 (speaker A) 26 For each end, there is a set of audio tracks, e.g., a1, a2 and a3 (speaker A)
27 and b1, b2 (speaker B). 27 and b1, b2 (speaker B).
28 The text file with the timing information may look like this: 28 The text file with the timing information may look like this:
29 ``` A a1 0 29
30 B b1 0 30 ```
31 A a2 100 31 A a1 0
32 B b2 -200 32 B b1 0
33 A a3 0 33 A a2 100
34 A a4 0``` 34 B b2 -200
35 A a3 0
36 A a4 0
37 ```
38
35 The first column indicates the speaker name, the second contains the audio track 39 The first column indicates the speaker name, the second contains the audio track
36 file names, and the third the offsets (in milliseconds) used to concatenate the 40 file names, and the third the offsets (in milliseconds) used to concatenate the
37 chunks. 41 chunks.
38 42
39 Assume that all the audio tracks in the example above are 1000 ms long. 43 Assume that all the audio tracks in the example above are 1000 ms long.
40 The tool will then generate two tracks (A and B) that look like this: 44 The tool will then generate two tracks (A and B) that look like this:
41 45
42 ```Track A: 46 **Track A**
47 ```
43 a1 (1000 ms) 48 a1 (1000 ms)
44 silence (1100 ms) 49 silence (1100 ms)
45 a2 (1000 ms) 50 a2 (1000 ms)
46 silence (800 ms) 51 silence (800 ms)
47 a3 (1000 ms) 52 a3 (1000 ms)
48 a4 (1000 ms)``` 53 a4 (1000 ms)
54 ```
49 55
50 ```Track B: 56 **Track B**
57 ```
51 silence (1000 ms) 58 silence (1000 ms)
52 b1 (1000 ms) 59 b1 (1000 ms)
53 silence (900 ms) 60 silence (900 ms)
54 b2 (1000 ms) 61 b2 (1000 ms)
55 silence (2000 ms)``` 62 silence (2000 ms)
63 ```
56 64
57 The two tracks can also be visualized as follows (one character represents 65 The two tracks can also be visualized as follows (one character represents
58 100 ms, "." is silence and "*" is speech). 66 100 ms, "." is silence and "*" is speech).
59 67
60 ```t: 0 1 2 3 4 5 6 (s) 68 ```
69 t: 0 1 2 3 4 5 6 (s)
61 A: **********...........**********........******************** 70 A: **********...........**********........********************
62 B: ..........**********.........**********.................... 71 B: ..........**********.........**********....................
63 ^ 200 ms cross-talk 72 ^ 200 ms cross-talk
64 100 ms silence ^``` 73 100 ms silence ^
74 ```
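
The timing rule implied by the README example (each turn starts where the previous turn ended, shifted by its offset) can be sketched in Python, the tool's own language. This is an illustration under that assumption, not the code under review: `Turn`, `schedule_turns`, `render_timeline`, and the `durations_ms` mapping are hypothetical names, and track durations are passed in directly instead of being read from audio files.

```python
from collections import namedtuple

# Hypothetical record for one scheduled speech turn (not from the patch).
Turn = namedtuple("Turn", ["speaker", "track", "start_ms", "end_ms"])


def schedule_turns(timing_text, durations_ms):
    """Compute start/end times for each line of the timing file.

    Assumed rule: a turn starts at the end of the previous turn plus its
    offset, so a negative offset yields cross-talk and a positive one silence.
    """
    turns = []
    previous_end_ms = 0
    for line in timing_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        speaker, track, offset_ms = fields[0], fields[1], int(fields[2])
        start_ms = previous_end_ms + offset_ms
        if start_ms < 0:
            raise ValueError("turn %s would start before t=0" % track)
        end_ms = start_ms + durations_ms[track]
        turns.append(Turn(speaker, track, start_ms, end_ms))
        previous_end_ms = end_ms
    return turns


def render_timeline(turns, speaker, step_ms=100):
    """Draw one speaker's track: '*' is speech, '.' is silence."""
    total_ms = max(turn.end_ms for turn in turns)
    cells = ["."] * (total_ms // step_ms)
    for turn in turns:
        if turn.speaker != speaker:
            continue
        for i in range(turn.start_ms // step_ms, turn.end_ms // step_ms):
            cells[i] = "*"
    return "".join(cells)
```

Fed the six timing lines from the example with every track set to 1000 ms, `schedule_turns` places the turns at 0, 1000, 2100, 2900, 3900 and 4900 ms, and `render_timeline` reproduces the A and B rows of the diagram above.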