Converting Text to English Speech with Balabolka

References

DIY MP3 recordings of spoken English words
Balabolka (cross-plus-a.com): portable version and speech engines

Adding a voice

Open Settings → Time & Language → Speech and install "English (United States)". The David voice that 黑暗大大 likes then becomes available.

Installation and preparation

Download balcon and extract it into the Balabolka directory
Download lame and extract it into the Balabolka directory
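Before running any command it can help to confirm that both tools really ended up in the same directory. The sketch below is only an illustration: the file names balcon.exe and lame.exe are assumptions about what the archives contain, and missing_tools is a hypothetical helper name.

```python
from pathlib import Path

# Assumed executable names; the downloaded archives may ship extra files.
REQUIRED = ("balcon.exe", "lame.exe")


def missing_tools(balabolka_dir, required=REQUIRED):
    """Return the required file names that are not present in balabolka_dir."""
    base = Path(balabolka_dir)
    return [name for name in required if not (base / name).is_file()]


# The directory is the one used by the Python example in this post.
print(missing_tools(r"c:\balabolka_portable\Balabolka"))
```

An empty list means both files were found; anything else names what still needs to be extracted.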

The Balabolka directory then contains balcon.exe and lame alongside the Balabolka files.

lame

Guide to command line options (in SVN)

Set the bit rate (in kbps)

-b 128

Set the sampling frequency of the input. With Balabolka it must be 16, because the source audio is only 16 kHz.

-s 16
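Of these two options, the bit rate is the one that decides how large the MP3 files get. As a rough worked example, assuming constant-bit-rate encoding and ignoring tags and frame padding, the size is simply bit rate times duration. The helper name estimated_mp3_bytes is made up for this sketch.

```python
def estimated_mp3_bytes(seconds, bitrate_kbps):
    """Approximate size of a constant-bit-rate MP3, ignoring tags and padding."""
    return int(seconds * bitrate_kbps * 1000 / 8)


# Compare the 32 kbps used in the commands of this post with 128 kbps.
for kbps in (32, 128):
    print(kbps, "kbps:", estimated_mp3_bytes(1.0, kbps), "bytes per second of audio")
```

For single spoken words, a low bit rate keeps thousands of files small at little cost in intelligibility.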

Using the commands

List the available voices

balcon.exe -l

SAPI 5:
  Microsoft David Desktop
  Microsoft Hanhan Desktop
  Microsoft Hazel Desktop
  Microsoft Zira Desktop
Microsoft Speech Platform:
  Microsoft Server Speech Text to Speech Voice (en-US, ZiraPro)
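If the voice names are needed in a script, the listing can be parsed into a dictionary. This is a minimal sketch that assumes the output always has the shape shown above: engine headings at the start of a line ending in a colon, voice names indented beneath them. parse_voice_list and SAMPLE are hypothetical names.

```python
def parse_voice_list(listing):
    """Group voice names under the engine heading they are listed beneath."""
    voices = {}
    engine = None
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line == line.lstrip() and line.rstrip().endswith(":"):
            # Unindented line ending in ':' starts a new engine section.
            engine = line.rstrip()[:-1]
            voices[engine] = []
        elif engine is not None:
            voices[engine].append(line.strip())
    return voices


# The listing shown above, reproduced as a string for the example.
SAMPLE = (
    "SAPI 5:\n"
    "  Microsoft David Desktop\n"
    "  Microsoft Hanhan Desktop\n"
    "  Microsoft Hazel Desktop\n"
    "  Microsoft Zira Desktop\n"
    "Microsoft Speech Platform:\n"
    "  Microsoft Server Speech Text to Speech Voice (en-US, ZiraPro)\n"
)

for engine, names in parse_voice_list(SAMPLE).items():
    print(engine, "->", names)
```

In practice the string would come from capturing the output of balcon.exe -l, for example with subprocess.run; the console encoding of that output is not covered here.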

Generate an MP3 with a chosen voice

balcon.exe -n "Microsoft David Desktop" -t apple -o --raw | lame -r -b 32 -s 16 -m m -h - d:\apple.mp3
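The command works as a two-stage pipe: balcon writes headerless PCM to stdout (-o --raw), and lame reads raw input (-r) from stdin (the lone -), treating it as 16 kHz mono (-s 16 -m m). The same pipe can be built from Python without going through a shell, which avoids quoting problems. This is a sketch, not the only way to do it; build_pipeline and run_pipeline are hypothetical names, and it assumes balcon.exe and lame can be found from the current directory or PATH.

```python
import subprocess


def build_pipeline(text, out_path, voice="Microsoft David Desktop", bitrate_kbps=32):
    """Return (balcon_argv, lame_argv) mirroring the command above."""
    balcon = ["balcon.exe", "-n", voice, "-t", text, "-o", "--raw"]
    lame = ["lame", "-r", "-b", str(bitrate_kbps), "-s", "16",
            "-m", "m", "-h", "-", out_path]
    return balcon, lame


def run_pipeline(balcon, lame):
    """Connect balcon's stdout to lame's stdin and return both exit codes."""
    producer = subprocess.Popen(balcon, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(lame, stdin=producer.stdout)
    producer.stdout.close()  # so balcon sees a closed pipe if lame exits early
    consumer.wait()
    return producer.wait(), consumer.returncode


balcon_cmd, lame_cmd = build_pipeline("apple", r"d:\apple.mp3")
print(balcon_cmd)
print(lame_cmd)
```

Calling run_pipeline(balcon_cmd, lame_cmd) on a machine with both tools installed should produce the same file as the one-line command.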

Generate from a text file

balcon.exe -f a.txt -o --raw | lame -r -b 32 -s 16 -m m -h - d:\hoyo.mp3

a.txt must be saved in ANSI encoding; UTF-8 is not supported. Inside the file, <voice> tags can be used to switch voices.

<voice required="Name=Microsoft Zira Desktop">This is a book.</voice> <voice required="Name=Microsoft Hanhan Desktop">這是一本書</voice>

To mix Chinese and English in the graphical interface, enter the text the same way.
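Tagged text like the a.txt example can also be generated rather than typed by hand. The sketch below builds the same line from (voice, text) pairs and writes it in an ANSI code page. It assumes a Traditional Chinese Windows system, where ANSI means code page 950; other systems use a different code page. The function names are made up for illustration.

```python
def voice_span(voice_name, text):
    """Wrap text in a <voice> tag that selects voice_name."""
    return '<voice required="Name={}">{}</voice>'.format(voice_name, text)


def build_script(spans):
    """Join (voice_name, text) pairs into one line of tagged text."""
    return " ".join(voice_span(name, text) for name, text in spans)


def write_ansi(path, script, encoding="cp950"):
    """Write the script in an ANSI code page (cp950 is an assumption)."""
    # Default error handling is strict: characters the code page cannot
    # represent raise UnicodeEncodeError instead of being silently dropped.
    with open(path, "w", encoding=encoding) as handle:
        handle.write(script)


SPANS = [
    ("Microsoft Zira Desktop", "This is a book."),
    ("Microsoft Hanhan Desktop", "這是一本書"),
]
print(build_script(SPANS))
```

Writing the result with write_ansi("a.txt", build_script(SPANS)) gives a file that can be passed to balcon with -f.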

Output from different voices

Original voice - Microsoft David Desktop

Microsoft David Desktop

Microsoft Hanhan Desktop

Microsoft Hazel Desktop

Microsoft Zira Desktop

Generating output with Python

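import os

# Example input (assumed; the original does not show how data is built).
data = [{'word': 'apple'}]
tools = 'c:\\balabolka_portable\\Balabolka\\'
for i in data:
    # balcon writes raw PCM to stdout; lame encodes it to d:\2000\<word>.mp3
    command = (tools + 'balcon.exe -n "Microsoft David Desktop" -t ' + i['word']
               + ' -o --raw | ' + tools + 'lame -r -b 32 -s 16 -m m -h - d:\\2000\\'
               + i['word'] + '.mp3')
    os.system(command)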
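Because the loop builds each command by concatenating the word directly, entries containing characters such as / or ? would produce invalid file names, and two words differing only in case would overwrite each other on Windows. One possible way to prepare the list first is sketched below; safe_filename and plan_jobs are hypothetical helpers, and replacing bad characters with an underscore is just one choice.

```python
import ntpath
import re

# Characters Windows does not allow in file names.
_INVALID = re.compile(r'[<>:"/\\|?*]')


def safe_filename(word):
    """Replace characters that are invalid in Windows file names with '_'."""
    return _INVALID.sub("_", word.strip())


def plan_jobs(data, out_dir="d:\\2000"):
    """Return (word, mp3_path) pairs, skipping blanks and repeated words."""
    seen = set()
    jobs = []
    for item in data:
        word = item["word"].strip()
        key = word.lower()  # Windows file names are case-insensitive
        if not word or key in seen:
            continue
        seen.add(key)
        jobs.append((word, ntpath.join(out_dir, safe_filename(word) + ".mp3")))
    return jobs


for word, path in plan_jobs([{"word": "apple"}, {"word": "Apple"}, {"word": "and/or"}]):
    print(word, "->", path)
```

Each pair can then be fed to the command in the loop, using the word for -t and the path as lame's output file.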

balcon parameters

  -l              : print list of voices
  -g              : print list of audio output devices
  -f <file_name>  : set input text file
  -fl <file_name> : set file with list of input file names
  -w <file_name>  : set output file in WAV format
  -n <voice_name> : set voice for speech
  -id <integer>   : set voice by language code (Locale ID)
  -m              : print voice parameters
  -b <integer>    : set audio output device by index
  -r <text>       : set audio output device by name
  -c              : use text from clipboard
  -t <text>       : use text from command line
  -i              : use text from stdin
  -o              : write sound data to stdout
  -s <integer>    : set rate of speech (from -10 to 10)
  -p <integer>    : set pitch of speech (from -10 to 10)
  -v <integer>    : set volume of speech (from 0 to 100)
  -e <integer>    : pause between sentences (in milliseconds)
  -a <integer>    : pause between paragraphs (in milliseconds)
  -d <file_name>  : apply dictionary for pronunciation correction
  -k              : kill other copies of application
  -ka             : kill active copy of application
  -pr             : pause or resume reading by active copy of application
  -q              : add application to queue
  -lrc            : create LRC file to display synchronized text in audio players
  -srt            : create SRT file to display synchronized text in video players
  -vs <file_name> : create text file with synchronized visemes
  -sub            : process input text as subtitles
  -tray           : show icon in system tray
  -ln <integer>   : select line by number (or range, e.g. 12-79)
  -fr <integer>   : set output audio sampling frequency in kHz (from 8 to 48)
  -bt <integer>   : set output audio bit depth (8 or 16)
  -ch <integer>   : set output audio channel mode (1 or 2)
  -enc <encoding> : set input text encoding (ansi, utf8 or unicode)
  -sb <integer>   : silence at the beginning (in milliseconds)
  -se <integer>   : silence at the end (in milliseconds)
  -df             : delete text file when job is done
  -dp             : display progress information
  -isb            : ignore text in square brackets
  -icb            : ignore text in curly brackets
  -iab            : ignore text in angle brackets
  -irb            : ignore text in round brackets
  -iu             : ignore URLs
  -ic             : ignore /*comments*/ in text
  -h              : print usage information

  --lrc-length <integer>  : set max length of text lines for output LRC file
  --lrc-fname <file_name> : set filename for output LRC file
  --lrc-enc <encoding>    : set encoding for output LRC file
  --lrc-offset <integer>  : set time offset for output LRC file (in milliseconds)
  --lrc-artist <text>     : artist (ID tag)
  --lrc-album <text>      : album (ID tag)
  --lrc-title <text>      : title (ID tag)
  --lrc-author <text>     : author (ID tag)
  --lrc-creator <text>    : creator of LRC file (ID tag)
  --lrc-sent              : insert blank lines after sentences in LRC file
  --lrc-para              : insert blank lines after paragraphs in LRC file
  --srt-length <integer>  : set max length of text lines for output SRT file
  --srt-fname <file_name> : set filename for output SRT file
  --srt-enc <encoding>    : set encoding for output SRT file
  --raw                   : output is raw PCM data (headerless)
  --ignore-length         : omit length of audio data in WAV header
  --sub-format <text>     : set format of subtitles (for input text)
  --sub-fit               : increase speech rate to fit time intervals in subtitles
  --sub-max <integer>     : set max rate of speech for subtitles

  --voice1-name <voice_name>    : set voice to read foreign words in text
  --voice1-langid <language_id> : set language ID for foreign text (e.g. en)
  --voice1-rate <integer>       : set rate of speech for foreign text (from -10 to 10)
  --voice1-pitch <integer>      : set pitch of speech for foreign text (from -10 to 10)
  --voice1-volume <integer>     : set volume of speech for foreign text (from 0 to 100)
  --voice1-roman                : use default voice to read Roman numerals
  --voice1-digit                : use default voice to read numbers in foreign text
  --voice1-length <integer>     : set min length of foreign text to change voice
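Note that some letters mean different things in the two tools: in balcon, -s is the speech rate and -b selects an audio device, whereas in lame they set the sampling frequency and the bit rate. To keep the balcon side straight, a small helper can check values against the ranges listed above before they go on the command line. This is a sketch with a hypothetical function name; it only covers -s, -p and -v.

```python
def speech_options(rate=None, pitch=None, volume=None):
    """Return balcon arguments for -s, -p and -v, checking the listed ranges."""
    limits = {
        "-s": (rate, -10, 10),    # rate of speech
        "-p": (pitch, -10, 10),   # pitch of speech
        "-v": (volume, 0, 100),   # volume of speech
    }
    args = []
    for flag, (value, low, high) in limits.items():
        if value is None:
            continue  # option not requested; leave balcon's own setting alone
        if not low <= value <= high:
            raise ValueError(f"{flag} must be between {low} and {high}, got {value}")
        args += [flag, str(value)]
    return args


print(speech_options(rate=-2, volume=80))
```

The returned list can be appended to a balcon argument list built elsewhere.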
