| OLD | NEW |
| 1 #Conversational Speech generator tool | 1 # Conversational Speech generator tool |
| 2 | 2 |
| 3 Python tool to generate multiple-end audio tracks to simulate conversational | 3 Tool to generate multiple-end audio tracks to simulate conversational speech |
| 4 speech with two or more participants. | 4 with two or more participants. |
| 5 | 5 |
| 6 The input to the tool is a directory containing a number of audio tracks and | 6 The input to the tool is a directory containing a number of audio tracks and |
| 7 a text file indicating how to time the sequence of speech turns (see the Example | 7 a text file indicating how to time the sequence of speech turns (see the Example |
| 8 section). | 8 section). |
| 9 | 9 |
| 10 Since the timing of the speaking turns is specified by the user, the generated | 10 Since the timing of the speaking turns is specified by the user, the generated |
| 11 tracks may not be suitable for testing scenarios in which there is unpredictable | 11 tracks may not be suitable for testing scenarios in which there is unpredictable |
| 12 network delay (e.g., end-to-end RTC assessment). | 12 network delay (e.g., end-to-end RTC assessment). |
| 13 | 13 |
| 14 Instead, the generated pairs can be used when the delay is constant (obviously | 14 Instead, the generated pairs can be used when the delay is constant (obviously |
| 15 including the case in which there is no delay). | 15 including the case in which there is no delay). |
| 16 For instance, echo cancellation in the APM module can be evaluated using two-end | 16 For instance, echo cancellation in the APM module can be evaluated using two-end |
| 17 audio tracks as input and reverse input. | 17 audio tracks as input and reverse input. |
| 18 | 18 |
| 19 By indicating negative and positive time offsets, one can reproduce cross-talk | 19 By indicating negative and positive time offsets, one can reproduce cross-talk |
| 20 and silence in the conversation. | 20 and silence in the conversation. |
| 21 | 21 |
| 22 IMPORTANT: **the whole code has not been landed yet.** | 22 IMPORTANT: **the whole code has not been landed yet.** |
| 23 | 23 |
| 24 ###Example | 24 ### Example |
| 25 | 25 |
| 26 For each end, there is a set of audio tracks, e.g., a1, a2 and a3 (speaker A) | 26 For each end, there is a set of audio tracks, e.g., a1, a2 and a3 (speaker A) |
| 27 and b1, b2 (speaker B). | 27 and b1, b2 (speaker B). |
| 28 The text file with the timing information may look like this: | 28 The text file with the timing information may look like this: |
| 29 | 29 |
| 30 ``` | 30 ``` |
| 31 A a1 0 | 31 A a1 0 |
| 32 B b1 0 | 32 B b1 0 |
| 33 A a2 100 | 33 A a2 100 |
| 34 B b2 -200 | 34 B b2 -200 |
| (...skipping 30 matching lines...) |
| 65 The two tracks can be also visualized as follows (one character represents | 65 The two tracks can be also visualized as follows (one character represents |
| 66 100 ms, "." is silence and "*" is speech). | 66 100 ms, "." is silence and "*" is speech). |
| 67 | 67 |
| 68 ``` | 68 ``` |
| 69 t: 0 1 2 3 4 5 6 (s) | 69 t: 0 1 2 3 4 5 6 (s) |
| 70 A: **********...........**********........******************** | 70 A: **********...........**********........******************** |
| 71 B: ..........**********.........**********.................... | 71 B: ..........**********.........**********.................... |
| 72 ^ 200 ms cross-talk | 72 ^ 200 ms cross-talk |
| 73 100 ms silence ^ | 73 100 ms silence ^ |
| 74 ``` | 74 ``` |
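The timing file in the example can be read as one turn per line: speaker, track name, offset in milliseconds. The sketch below is not the tool's code (which, per the README, has not fully landed). It assumes, based on the visualization, that each offset is measured from the end of the previous turn, so a positive offset gives silence and a negative one gives cross-talk. The helper names `parse_timing` and `schedule` and the 1000 ms track durations are hypothetical, chosen to match the first four turns of the diagram.

```python
def parse_timing(text):
    """Parse 'SPEAKER TRACK OFFSET_MS' lines into (speaker, track, offset) tuples."""
    turns = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        speaker, track, offset = line.split()
        turns.append((speaker, track, int(offset)))
    return turns


def schedule(turns, durations_ms):
    """Compute (speaker, track, start_ms, end_ms) for each turn.

    Assumption: a turn starts at the end of the previous turn plus its
    offset, so offset > 0 inserts silence and offset < 0 overlaps speech.
    """
    intervals = []
    prev_end = 0
    for speaker, track, offset in turns:
        start = prev_end + offset
        end = start + durations_ms[track]
        intervals.append((speaker, track, start, end))
        prev_end = end
    return intervals


# First four turns from the example timing file.
TIMING = """A a1 0
B b1 0
A a2 100
B b2 -200
"""

# Hypothetical durations: each track is drawn as 10 characters (1 s) above.
DURATIONS_MS = {"a1": 1000, "b1": 1000, "a2": 1000, "b2": 1000}

intervals = schedule(parse_timing(TIMING), DURATIONS_MS)
for speaker, track, start, end in intervals:
    print(f"{speaker} {track}: {start} ms to {end} ms")
```

Under these assumptions, `a2` starts 100 ms after `b1` ends (the silence marked in the diagram) and `b2` starts 200 ms before `a2` ends (the cross-talk).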