MusicRevU’s guide to UTAU

Introduction
Creating your UTAU
    Installing UTAU
CV (Consonant – Vowel) Voice banks
    Sounds to record
VCV (Vowel-Consonant-Vowel) Voice banks
    Sounds to record
The next steps
OTO.ini
Last steps
UTAU Flags
Using USTs
Mixing
Wiki creation
Character Profile Template
    Changing System Locale
    How to type in Japanese Hiragana
    How to create a UTAU
    Tutorials about Mixing
    Software
    UTAU Voicebank for best reference
    UTAU User Guide
    UTAU wiki
    How to make an UTAU sound better

Introduction
Hello there! I’m Hoshi. I go under the alias of MusicRevU for all my UTAU
work and I am creating this little tutorial to help any future UTAU users. When
I was creating Minuet, I found it quite difficult to understand a lot of the
tutorials out there, so I decided to share my experiences and provide
some details on how to create an UTAU! These are just little things I found out on
my own; I learned how to use UTAU mostly through two of my friends who
use the software. So please read carefully and let me know if this helps you!
Creating your UTAU
Alrighty, so we begin with the planning of your UTAU. It is usually good to
begin with a concept of your character so you have a rough idea of
how they look and, ideally, how you want them to sound. An example would be
when I created Minuet; I gave her a slightly gothic look, as I wanted her
voice to be soft and mature.
Once you have your concept out of the way, start making some little notes
for your UTAU such as their name, their name in Japanese and so on. Don’t
worry though, I will provide a template later on in this tutorial showing
what info is required for your UTAU!
So you’ve now got a rough idea of what you want your UTAU to look like; the
next step is to start recording this bad boy!! Yup, this is where the most
work has to be done!
So to start off I will list what you need to download:
1. UTAU (latest version) - http://utau2008.xrea.jp/index.html
2. UTAU English patch - http://utau.wikia.com/wiki/UTAU_wiki:UTAU_GUI_Translation
3. Audacity - http://audacity.sourceforge.net/download/
4. Lame for Audacity (to export mp3) - http://lame.buanzo.org/
You can use any kind of audio software to record and mix. I personally prefer
Audacity because I learned to use it at University and I know its interface
very well. So you now have everything you need, for now of course! Next we
look at what sounds you need to record for your Voice Banks, and at the two
common Voice Bank types you will find in the UTAU community.
Installing UTAU
Now I know some first-time users might encounter problems installing UTAU.
This is because it is a Japanese non-Unicode program. What does this mean?
It stores its text in a Japanese character set instead of Unicode, so Windows
has to be told which character set to use before the Japanese characters will
display properly, and that is where the system locale comes in! It’s all about
programming and to be honest it can be hard to understand if you’re not very
computer literate. I knew what it meant because my mum was a technician and
taught me how to use computers; everything else was self-taught.
So first of all you will need to change your locale if you’re using
Windows. I don’t know how to do this on a Mac, as I can barely work
UTAU-Synth. This is a guide to changing the locale on Windows 7 but
I’m sure it’s the same kind of method for any Windows computer!
The system locale determines the default character set (letters, symbols, and
numbers) and font used to enter and display information in programs that
don't use Unicode. This allows non-Unicode programs to run on your
computer using the specified language. You might need to change the default
system locale when you install additional display languages on your
computer. Selecting a different language for the system locale doesn't affect
the language in menus and dialog boxes for Windows or other programs that
do use Unicode.
1. Open Region and Language by clicking the Start button, clicking
Control Panel, clicking Clock, Language, and Region, and then clicking
Region and Language.
2. Click the Administrative tab, and then, under Language for non-Unicode
programs, click Change system locale. If you're prompted for an
administrator password or confirmation, type the password or provide
confirmation.
3. Select the language, and then click OK.
To restart your computer, click Restart now.
(Taken from the Microsoft website; Linked in the Useful Links section)
So that’s how you do that, and by hitting shift+alt you can alternate between
Japanese and English input. Be sure to change the locale when you’re using
Japanese non-Unicode programs like UTAU because you want everything to run
smoothly. This will mean Voice Banks like Teto or Defoko (the default with
UTAU) should run, as they are Hiragana based.
You can find lots of useful tutorials on how to type in Japanese Hiragana with
the Locale changed! Find the link in the Useful Links section!
If you click on the A when you’re in Japanese Locale, you will get these
options. The one with the H lets you input Hiragana, so this is handy if you
want to type your UTAU’s name in Japanese, but this can be a little
complicated to understand so only do this if you are confident enough to
input romaji to translate to Hiragana.
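To see why the locale matters, here is a minimal Python sketch. It assumes the program stores its text as Shift-JIS, the usual encoding for Japanese non-Unicode Windows programs, and the file name is a made-up example. The same bytes read back fine under a Japanese locale but turn into unrelated characters under a Western one.

```python
# Illustrative only: the file name is a made-up example, and Shift-JIS is
# assumed as the encoding a Japanese non-Unicode program would use.
name = "か.wav"

# The program stores the name as Shift-JIS bytes.
raw = name.encode("shift_jis")

# With a Japanese system locale the bytes are read back correctly.
as_japanese = raw.decode("shift_jis")

# With a Western European locale (code page 1252) the same bytes
# are interpreted as different characters.
as_western = raw.decode("cp1252")

print(repr(as_japanese))
print(repr(as_western))
```

This is why hiragana file names can show up as strange symbols until the system locale is switched to Japanese.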
CV (Consonant – Vowel) Voice banks
CV stands for "Consonant-Vowel". It is the traditional recording system of
UTAU, designed for the Japanese language. Sounds consist of either a
single vowel (a, i, u, e, o, with n included) or a consonant-vowel pair
(ka, ji, no, etc.). CV is the first and most common voice bank type,
created by Ayame/Ameya.
It's a pretty simple voice bank type: just the Japanese syllables,
each recorded in its own WAV file. You would use a CV voice bank like this:
[a][ri][ga][to]
With these sounds you can create the word "Arigato".
You will find most beginners start out with this sort of Voice bank because it is
the simplest to make. Remember, the sounds alone don't make the UTAU; it’s
the OTO.ini that does, which I will explain later on. I am now going to list the
sounds you must record to make a CV voice bank. If you want an idea of how long
each recording should be, take a look at some UTAU Voice banks like Namine Ritsu
or Utaune Nami, open the sounds to hear them, and use them as a
reference.
Sounds to record
•
Breath (Romaji alias of “Br”) – Breath ↑
•
a–あ
•
i–い
•
ye – いぇ
•
u–う
•
wi – うぃ
•
we – うぇ
•
wo – うぉ
•
e–え
•
o–お
•
ka – か
•
ga – が
•
ki – き
•
kye – きぇ
•
kya – きゃ
•
kyu – きゅ
•
kyo – きょ
•
gi – ぎ
•
gye – ぎぇ
•
gya – ぎゃ
•
gyu – ぎゅ
•
gyo – ぎょ
•
ku – く
•
kui – くぃ
•
kue – くぇ
•
kuo – くぉ
•
kua – くぁ
•
gu – ぐ
•
gui – ぐぃ
•
gue – ぐぇ
•
guo – ぐぉ
•
gua – ぐぁ
•
ke – け
•
ge – げ
•
ko – こ
•
go – ご
•
sa – さ
•
za – ざ
•
shi – し
•
she – しぇ
•
sha – しゃ
•
shu – しゅ
•
sho – しょ
•
ji – じ
•
je – じぇ
•
ja – じゃ
•
ju – じゅ
•
jo – じょ
•
su – す
•
sui – すぃ
•
sue – すぇ
•
suo – すぉ
•
sua – すぁ
•
zu – ず
•
zui – ずぃ
•
zue – ずぇ
•
zuo – ずぉ
•
zua – ずぁ
•
se – せ
•
ze – ぜ
•
so – そ
•
zo – ぞ
•
ta – た
•
da – だ
•
chi – ち
•
che – ちぇ
•
cha – ちゃ
•
chu – ちゅ
•
cho – ちょ
•
ji – ぢ
•
je – ぢぇ
•
ja – ぢゃ
•
ju – ぢゅ
•
jo – ぢょ
•
tsu – つ
•
tsi – つぃ
•
tse – つぇ
•
tso – つぉ
•
tsa – つぁ
•
zu – づ
•
te – て
•
ti – てぃ
•
tyu – てゅ
•
de – で
•
di – でぃ
•
dyu – でゅ
•
to – と
•
tu – とぅ
•
do – ど
•
du – どぅ
•
na – な
•
ni – に
•
nye – にぇ
•
nya – にゃ
•
nyu – にゅ
•
nyo – にょ
•
nu – ぬ
•
nui – ぬぃ
•
nue – ぬぇ
•
nuo – ぬぉ
•
nua – ぬぁ
•
ne – ね
•
no – の
•
ha – は
•
ba – ば
•
pa – ぱ
•
hi – ひ
•
hye – ひぇ
•
hya – ひゃ
•
hyu – ひゅ
•
hyo – ひょ
•
bi – び
•
bye – びぇ
•
bya – びゃ
•
byu – びゅ
•
byo – びょ
•
pi – ぴ
•
pye – ぴぇ
•
pya – ぴゃ
•
pyu – ぴゅ
•
pyo – ぴょ
•
fu – ふ
•
fi – ふぃ
•
fe – ふぇ
•
fo – ふぉ
•
fa – ふぁ
•
bu – ぶ
•
bui – ぶぃ
•
bue – ぶぇ
•
buo – ぶぉ
•
bua – ぶぁ
•
pu – ぷ
•
pui – ぷぃ
•
pue – ぷぇ
•
puo – ぷぉ
•
pua – ぷぁ
•
he – へ
•
be – べ
•
pe – ぺ
•
ho – ほ
•
bo – ぼ
•
po – ぽ
•
ma – ま
•
mi – み
•
mye – みぇ
•
mya – みゃ
•
myu – みゅ
•
myo – みょ
•
mu – む
•
mui – むぃ
•
mue – むぇ
•
muo – むぉ
•
mua – むぁ
•
me – め
•
mo – も
•
ya – や
•
yu – ゆ
•
yo – よ
•
ra – ら
•
ri – り
•
rye – りぇ
•
rya – りゃ
•
ryu – りゅ
•
ryo – りょ
•
ru – る
•
rui – るぃ
•
rue – るぇ
•
ruo – るぉ
•
rua – るぁ
•
re – れ
•
ro – ろ
•
wa – わ
•
o–を
•
n–ん
This includes all the extra sounds you can record. A little tip is to name your
recordings in romaji, as that makes it easier to know which sounds you have
definitely recorded. If you name the files in romaji, be sure to set their
aliases in UTAU to hiragana, and vice versa. The best reference you can use,
as I said, is to study other UTAU voice banks, as that will give you an idea of
how the sounds are meant to sound. (I had never spoken Japanese before
this and had to study the sounds, as Japanese syllables sound different to
English syllables.) I used Utaune Nami’s, Namine Ritsu’s and Kyohakushi
Alice’s Voicebanks to get my sounds right.
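If you name your files in romaji, keeping track of what is still missing can be automated. The following is only a rough sketch: the folder name and the shortened reclist are hypothetical, and it simply compares the .wav file names in a folder against the list of sounds.

```python
from pathlib import Path


def missing_sounds(voicebank_dir, reclist):
    """Return the reclist entries that have no matching .wav file yet."""
    recorded = {p.stem for p in Path(voicebank_dir).glob("*.wav")}
    return [sound for sound in reclist if sound not in recorded]


if __name__ == "__main__":
    # Hypothetical folder name and a shortened reclist, for illustration.
    reclist = ["a", "i", "u", "e", "o", "ka", "ki", "ku", "ke", "ko"]
    for sound in missing_sounds("MyUtau", reclist):
        print("still to record:", sound)
```

One thing to watch: the list above gives the same romaji for じ and ぢ ("ji") and for ず and づ ("zu"), so each of those pairs would need distinct file names if you record them separately.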
VCV (Vowel-Consonant-Vowel) Voice banks
VCV stands for "Vowel-Consonant-Vowel". It is a phoneme technique used to
record UTAU voice banks, also called "triphones" or "triphonics." By
recording strings of syllables and using the oto.ini to split them up, one can
crossfade vowels together before consonants for sound that flows more
naturally. Example: "aRiGaTo" becomes "a ari iga ato", rather than "a ri ga
to". VCV is also a quite common voice bank type, created by Ayame/Ameya.
This voice bank type is a bit more confusing and not recommended for people
who have just started using and making UTAUloids. A VCV voice bank is usually
recorded in syllable strings, mostly of 5 or 7 syllables. Most of the time the
configurations in the oto.ini look like this:
[i あ],[a た],[i そ]
With those VCV strings you can create a chain of sounds that blend into each
other and sound smoother. For example:
[- a][a ri][i ga][a to]
With these strings you can create the word "Arigato", and it will sound
smoother than with a CV voice bank. Thanks to the natural pronunciation
available in a VCV voice bank, the UTAU sounds more "human". (It
doesn't reduce the noise of resamplers.)
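The chain above follows a simple rule: each note is written as the sound the previous syllable ended on, a space, and then the new syllable, with "-" standing in at the start of a phrase. As a minimal sketch (the function names are made up for this example), that rule looks like this in Python:

```python
def tail_sound(syllable):
    """Return the sound a romaji syllable ends on: its final vowel, or "n"."""
    return syllable[-1]


def to_vcv(syllables):
    """Turn a list of CV syllables into VCV aliases such as "a ri"."""
    aliases = []
    previous = "-"  # "-" marks the start of a phrase
    for syllable in syllables:
        aliases.append(f"{previous} {syllable}")
        previous = tail_sound(syllable)
    return aliases


print(to_vcv(["a", "ri", "ga", "to"]))
```

Run on "a ri ga to", this gives the same four aliases as the example above.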
A VCV voice bank is the next step AFTER you have been using UTAU for a
while. It is NOT recommended for beginners as it requires
a lot of work on the oto.ini. There are also several ways to record this type of
Voice bank, from 2 Mora up to 7 Mora. Because I aim to make my UTAU sound
as realistic as possible in the future, I’ll note down the 7 Mora list
and provide links later to the Voicebanks worth researching.
Sounds to record
•
a_a_i_a_u_e_a
•
e_e_u_o_e_o_o
•
o_u_n_a_n_u
•
ke_ke_ku_ko_ke_ko_ko
•
za_za_ji_za_zu_ze_za
•
zi_zi_ju_ja_je_zi_je
•
zu_zu_ji_zo_za_zo_ji
•
chi_chi_tsu_ta_te_chi_te
•
che_che_chu_cho_che_cho_cho
•
cha_cha_ti_cha_chu_che_cha
•
chu_chu_ti_cho_cha_cho_ti
•
cho_chu_n_cha_n_chu
•
ku_ku_ki_ko_ka_ko_ki
•
i_i_u_a_e_i_e
•
i_i_yu_ya_ye_i_ye
•
ye_ye_yu_yo_ye_yo_yo
•
wi_wi_u_wa_we_wi_we
•
u_u_i_o_a_o_i
•
u_wi_wo_wa_wo_wi
•
we_we_u_wo_we_wo_wo
•
wo_u_n_wa_n_u
•
ko_ku_n_ka_n_ku
•
kye_kye_kyu_kyo_kye_kyo_kyo
•
ki_ki_ku_ka_ke_ki_ke
•
ki_ki_kyu_kya_kye_ki_kye
•
kya_kya_ki_kya_kyu_kye_kya
•
kyu_kyu_ki_kyo_kya_kyo_ki
•
kyo_kyu_n_kya_n_kyu
•
so_su_n_sa_n_su
•
be_be_bu_bo_be_bo_bo
•
ka_ka_ki_ka_ku_ke_ka
•
gye_gye_gyu_gyo_gye_gyo_gyo
•
gi_gi_gyu_gya_gye_gi_gye
•
gi_gi_gu_ga_ge_gi_ge
•
gya_gya_gi_gya_gyu_gye_gya
•
gyu_gyu_gi_gyo_gya_gyo_gi
•
gyo_gyu_n_gya_n_gyu
•
gu_gu_gi_go_ga_go_gi
•
ge_ge_gu_go_ge_go_go
•
she_she_shu_sho_she_sho_sho
•
shi_shi_su_sa_se_shi_se
•
sha_sha_si_sha_shu_she_sha
•
shu_shu_si_sho_sha_sho_si
•
sho_shu_n_sha_n_shu
•
je_je_ju_jo_je_jo_jo
•
ji_ji_zu_za_ze_ji_ze
•
ja_ja_zi_ja_ju_je_ja
•
ju_ju_zi_jo_ja_jo_zi
•
jo_ju_n_ja_n_ju
•
si_si_shu_sha_she_si_she
•
su_su_shi_so_sa_so_shi
•
zo_zu_n_za_n_zu
•
ta_ta_chi_ta_tsu_te_ta
•
da_da_ji_da_zu_de_da
•
se_se_su_so_se_so_so
•
go_gu_n_ga_n_gu
•
sa_sa_shi_sa_su_se_sa
•
dye_dye_dyu_dyo_dye_dyo_dyo
•
ti_ti_chu_cha_che_ti_che
•
di_di_dyu_dya_dye_di_dye
•
tsi_tsi_tu_tsa_tse_tsi_tse
•
zi_zi_du_za_ze_zi_ze
•
tse_tse_tu_tso_tse_tso_tso
•
ze_ze_du_zo_ze_zo_zo
•
tso_tu_n_tsa_n_tu
•
zo_du_n_za_n_du
•
tsu_tsu_chi_to_ta_to_chi
•
ya_ya_i_ya_yu_ye_ya
•
yu_yu_i_yo_ya_yo_i
•
ga_ga_gi_ga_gu_ge_ga
•
ji_ji_zu_da_de_ji_de
•
dya_dya_di_dya_dyu_dye_dya
•
dyu_dyu_di_dyo_dya_dyo_di
•
dyo_dyu_n_dya_n_dyu
•
te_te_tsu_to_te_to_to
•
zu_zu_ji_do_da_do_ji
•
de_de_zu_do_de_do_do
•
mo_mu_n_ma_n_mu
•
tsa_tsa_tsi_tsa_tu_tse_tsa
•
za_za_zi_za_du_ze_za
•
tu_tu_tsi_tso_tsa_tso_tsi
•
to_tsu_n_ta_n_tsu
•
yo_yu_n_ya_n_yu
•
du_du_zi_zo_za_zo_zi
•
do_zu_n_da_n_zu
•
ra_ra_ri_ra_ru_re_ra
•
hye_hye_hyu_hyo_hye_hyo_hyo
•
hya_hya_hi_hya_hyu_hye_hya
•
hyu_hyu_hi_hyo_hya_hyo_hi
•
hyo_hyu_n_hya_n_hyu
•
hi_hi_hyu_hya_hye_hi_hye
•
hi_hi_fu_ha_he_hi_he
•
nye_nye_nyu_nyo_nye_nyo_nyo
•
rye_rye_ryu_ryo_rye_ryo_ryo
•
nya_nya_ni_nya_nyu_nye_nya
•
rya_rya_ri_rya_ryu_rye_rya
•
ryu_ryu_ri_ryo_rya_ryo_ri
•
nyu_nyu_ni_nyo_nya_nyo_ni
•
nyo_nyu_n_nya_n_nyu
•
ryo_ryu_n_rya_n_ryu
•
ni_ni_nyu_nya_nye_ni_nye
•
ri_ri_ryu_rya_rye_ri_rye
•
ri_ri_ru_ra_re_ri_re
•
ru_ru_ri_ro_ra_ro_ri
•
ni_ni_nu_na_ne_ni_ne
•
na_na_ni_na_nu_ne_na
•
re_re_ru_ro_re_ro_ro
•
nu_nu_ni_no_na_no_ni
•
ne_ne_nu_no_ne_no_no
•
ro_ru_n_ra_n_ru
•
no_nu_n_na_n_nu
•
wa_wa_wi_wa_u_we_wa
•
ha_ha_hi_ha_fu_he_ha
•
ba_ba_bi_ba_bu_be_ba
•
pa_pa_pi_pa_pu_pe_pa
•
n_zi_n_je_n_jo
•
n_chi_n_te_n_to
•
n_n_i_n_e_n_o_n
•
n_i_n_ye_n_yo
•
n_wi_n_we_n_wo
•
bye_bye_byu_byo_bye_byo_byo
•
n_ki_n_ke_n_ko
•
n_ki_n_kye_n_kyo
•
n_gi_n_gye_n_gyo
•
n_gi_n_ge_n_go
•
n_shi_n_se_n_so
•
n_ji_n_ze_n_zo
•
n_si_n_she_n_sho
•
n_ti_n_che_n_cho
•
n_di_n_dye_n_dyo
•
n_tsi_n_tse_n_tso
•
n_ji_n_de_n_do
•
bya_bya_bi_bya_byu_bye_bya
•
byu_byu_bi_byo_bya_byo_bi
•
byo_byu_n_bya_n_byu
•
n_hi_n_hye_n_hyo
•
n_hi_n_he_n_ho
•
n_ni_n_nye_n_nyo
•
n_ri_n_rye_n_ryo
•
n_ri_n_re_n_ro
•
n_ni_n_ne_n_no
•
n_bi_n_be_n_bo
•
bi_bi_byu_bya_bye_bi_bye
•
n_bi_n_bye_n_byo
•
bi_bi_bu_ba_be_bi_be
•
n_fi_n_fe_n_fo
•
n_pi_n_pye_n_pyo
•
n_pi_n_pe_n_po
•
n_mi_n_me_n_mo
•
n_mi_n_mye_n_myo
•
n_vi_n_ve_n_vo
•
fye_fye_fyu_fyo_fye_fyo_fyo
•
fi_fi_fyu_fya_fye_fi_fye
•
fi_fi_fu_fa_fe_fi_fe
•
pye_pye_pyu_pyo_pye_pyo_pyo
•
fe_fe_fu_fo_fe_fo_fo
•
fo_fu_n_fa_n_fu
•
fyo_fyu_n_fya_n_fyu
•
pyo_pyu_n_pya_n_pyu
•
fya_fya_fi_fya_fyu_fye_fya
•
fyu_fyu_fi_fyo_fya_fyo_fi
•
pya_pya_pi_pya_pyu_pye_pya
•
pyu_pyu_pi_pyo_pya_pyo_pi
•
ze_ze_zu_zo_ze_zo_zo
•
fu_fu_hi_ho_ha_ho_hi
•
bu_bu_bi_bo_ba_bo_bi
•
fu_fu_fi_fo_fa_fo_fi
•
pi_pi_pyu_pya_pye_pi_pye
•
pi_pi_pu_pa_pe_pi_pe
•
pu_pu_pi_po_pa_po_pi
•
he_he_fu_ho_he_ho_ho
•
pe_pe_pu_po_pe_po_po
•
fa_fa_fi_fa_fu_fe_fa
•
me_me_mu_mo_me_mo_mo
•
mu_mu_mi_mo_ma_mo_mi
•
bo_bu_n_ba_n_bu
•
ho_fu_n_ha_n_fu
•
po_pu_n_pa_n_pu
•
ma_ma_mi_ma_mu_me_ma
•
mye_mye_myu_myo_mye_myo_myo
•
myo_myu_n_mya_n_myu
•
mya_mya_mi_mya_myu_mye_mya
•
myu_myu_mi_myo_mya_myo_mi
•
mi_mi_mu_ma_me_mi_me
•
mi_mi_myu_mya_mye_mi_mye
•
vi_vi_vu_va_ve_vi_ve
•
ve_ve_vu_vo_ve_vo_vo
•
vo_vu_n_va_n_vu
•
va_va_vi_va_vu_ve_va
•
vu_vu_vi_vo_va_vo_vi
•
wo_wo_a_wo_i_wo
•
u_wo_e_wo_n_wo
In terms of how to alias these, I would suggest looking at Namine Ritsu’s
or Utaune Nami’s Voice bank for reference.
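As a rough idea of what each recorded string gives you, the sketch below lists the VCV aliases contained in a string and checks which aliases a lyric needs that your recordings do not cover yet. It assumes the "previous sound + space + syllable" naming shown earlier; the reference voice banks may alias things differently, so treat this only as a planning aid.

```python
def aliases_in_string(recording_name):
    """List the VCV aliases contained in one recorded string."""
    aliases = []
    previous = "-"  # the first syllable follows silence
    for syllable in recording_name.split("_"):
        aliases.append(f"{previous} {syllable}")
        previous = syllable[-1]  # final vowel, or "n"
    return aliases


def uncovered(needed, recording_names):
    """Return the needed aliases that none of the recordings contain."""
    covered = set()
    for name in recording_names:
        covered.update(aliases_in_string(name))
    return [alias for alias in needed if alias not in covered]


strings = ["a_a_i_a_u_e_a", "ra_ra_ri_ra_ru_re_ra"]
print(aliases_in_string(strings[1]))
print(uncovered(["- a", "a ri", "i ga", "a to"], strings))
```

With only those two strings recorded, "i ga" and "a to" are still missing, which is why the full list above is so long.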
The next steps
All right, so you have the reclist for your voice bank and are ready to record!!
This is where I suggest having a very good microphone to record with; condenser
microphones are recommended for more professional work as they tend to capture
a cleaner, more detailed sound, but a good USB microphone will also suffice. I
used a simple plug-in microphone to record my UTAU but I have plans to update
her when I get a new microphone.
Be sure to record in a quiet environment to avoid any kind of background
noise leaking into your sounds. This will avoid problems with your UTAU's
singing, as background noise makes it sound very robotic and it is not
pleasant to hear.
Now as I mentioned before it’s always good to reference
other voice banks so you know you are along the right lines.
When you record in Audacity be sure to trim any silence off the start and end
if needed. To make your UTAU sound a tad more realistic, try fading in and
out at the beginning and end of each sound because it makes it smoother. Always
export the audio as wav, as this is the format the UTAU software reads!
Now when you save, you can name the files in either romaji (ka, ki, ko etc.)
or in hiragana (か, き, こ etc.). I recommend saving in romaji if you do not
know Japanese as it makes it easier to track what sounds you have recorded.
I personally prefer romaji because I can barely read Hiragana at the best
of times. I have however reversed this as I work on Minuet’s ACT 2. Make
sure you spend time on the recordings and be sure to check out the tutorials
I’ll link to later on for some really good tips on using UTAU!
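Audacity does the trimming and fading for you, but if you are curious what those two steps actually do to the audio, here is a small sketch on a plain list of sample values. The threshold and fade length are arbitrary numbers picked for the example, not recommended settings.

```python
def trim_silence(samples, threshold=500):
    """Drop leading and trailing samples quieter than the threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]


def fade_edges(samples, fade_len):
    """Apply a linear fade-in and fade-out of up to fade_len samples."""
    out = list(samples)
    n = min(fade_len, len(out) // 2)
    for i in range(n):
        gain = i / n  # ramps from 0 towards 1
        out[i] = int(out[i] * gain)
        out[-1 - i] = int(out[-1 - i] * gain)
    return out


# A tiny made-up recording: near-silence, a loud part, near-silence.
recording = [0, 3, -2, 8000, 9000, -7000, 6000, 1, 0]
trimmed = trim_silence(recording)
print(trimmed)
print(fade_edges(trimmed, 2))
```

A real recording has tens of thousands of samples per second, so the fade would cover far more than two of them, but the idea is the same.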
So wonderful!! You now have all the sounds for your UTAU completed! I bet
you’re thinking you’re done now, huh? WRONG! Now comes the longest and
probably most tedious part if you have no idea what you’re doing. This part is
called the OTO.ini.
OTO.ini
The oto.ini can be VERY confusing at first, especially if you're totally new to
UTAU. The oto.ini is there to melt the notes together and add life to the
UTAU, as well as to do some final editing on the wavs. There are lots of other
great oto.ini tutorials out there, and I'll add links to the ones I think
could help after the tutorial.
To open the UTAU oto.ini interface, do this:
In the menu window, click "Tools" and "Voice Bank Settings...". This will open
up this window:
As you can see, I've selected the sample "a" to work with. The selected
sounds are marked in blue, and the numbers you see on the screen all describe
how the oto is built. To open the "working interface", click "Launch Editor".
Done!
This is the UTAU oto.ini interface. You have the first window and the second
one; the second window opens when you push the large button placed near
the four smaller ones (it's the big one to the left, see?). The second window is
where we edit the oto.ini. The first window is for information on the oto,
duplicating files and adding hiragana/romaji aliases.
As you can see, there's a lot of stuff going on here that I'll try to explain.
BLUE - the blue part indicates the part we want to cut from the file, for
example if you recorded "zu" but forgot to erase the empty space at the
beginning/end of the recording. That's where the blue goes. It's very handy.
PINK - this part makes sure that your samples won't go "nnnnaaa" instead of
"naaa". The pink goes over the consonant and partly over the vowel.
RED LINE - overlap, how much of the note/sound shall overlap the previous
one.
GREEN LINE - this part is "preutter", i.e. when the sound will actually start.
For example, I want my UTAU to say "sama". I add the "sa" and "ma", and the
pre-utter turns it into "sa..ma" with a short pause for me. It's very convenient
for realistic sounds!
NOTE!! NEVER drag the red and green lines TOO FAR or they'll sound OFF
and BAD. Experiment before releasing a voice bank with a newly configured
oto.ini!!
STEP BY STEP
Here's the opened sound we want to oto. As you can see, there's nothing
more than the sound file at the moment.
We drag the pink over the consonant + some of the vowel.
Then we add the overlap and preutter like this, a bit after the consonant but
not too far into the vowel. Experimenting is the best way to work an oto.ini.
The green should NEVER be too far into the sound or it will sound horrible; it's
better if you leave the green somewhere near the beginning, on the consonant.
Perfect! Now we add the blue over the parts we don't want, like stuff we
should've deleted in Audacity but didn't for some reason, or just sounds we
don't want.
Last but not least we drag the blue stuff over the very end of the sound.
DONE!
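Everything you drag around in the editor ends up as one line of numbers per sound in the oto.ini text file. The sketch below reads such a line in Python. The field order used here (file name, alias, offset, consonant, cutoff, preutterance, overlap) is the layout commonly described for oto.ini and is an assumption on my part, as are the sample values, so compare it against a real oto.ini from a reference voice bank before relying on it.

```python
from dataclasses import dataclass


@dataclass
class OtoEntry:
    filename: str
    alias: str
    offset: float        # trimmed from the start of the file
    consonant: float     # length of the fixed (consonant) region
    cutoff: float        # trimmed from the end of the file
    preutterance: float
    overlap: float


def parse_oto_line(line):
    """Parse one "file.wav=alias,offset,consonant,cutoff,preutter,overlap" line."""
    filename, _, rest = line.strip().partition("=")
    alias, *numbers = rest.split(",")
    # Empty fields are treated as 0.
    offset, consonant, cutoff, preutterance, overlap = (float(n or 0) for n in numbers)
    return OtoEntry(filename, alias, offset, consonant, cutoff, preutterance, overlap)


# Made-up values, for illustration only.
entry = parse_oto_line("ka.wav=か,45,120,80,60,20")
print(entry)
```

Reading the numbers this way makes it easier to spot an entry where, say, the preutterance is far larger than the consonant region.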
Oto'd Voice bank vs. No Oto Voice bank:
The first pic is of a voicebank with an oto.ini singing "arigato", the second of a
voicebank with no oto.ini. See the differences? The oto.ini makes it go
"a..riGaa..too", and the one with no oto goes "arigato". The first one sounds a lot
more realistic and nice. Every voicebank should have a working oto before it's
released.
(Oto information and pics taken from
http://purutau.blogspot.co.uk/2010/12/utau-tutorial-otoini-configuring-cv.html)
If you have any UTAU friends, ask them to test the voice bank for you to see if
it sounds ok. Make sure you do lots of testing before you release your UTAU!
Last steps
So now that your voice bank is complete it’s time to do some of the misc work
before you start using USTs. First of all we need to make a separate txt file
that will hold the character information! So open Notepad (if you’re using
Windows) or TextEdit (if you’re using a Mac) and type the following:
• name = (enter the name of your UTAU; can be romaji or hiragana)
• image = (the image must be 100x100 and saved as a bmp; you put the file name here, e.g. “Profile_image.bmp”)
• author = (your name here!)
• web = (your website here, so this can be deviantArt, tumblr etc.)
• sample = (if you have a sample of your UTAU then you put that here. Same way as the profile image, you just put the file name here, e.g. “sample.wav”)
Once you’ve done that be sure to save it as character.txt. If you use any kind
of hiragana in this be sure to switch your settings to Unicode so that it reads in
UTAU ok.
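If you would rather generate the file than type it by hand, here is a minimal sketch. The five field names come from the list above; writing them as key=value without spaces, and the character encoding you pass in, are assumptions you should check against a working voice bank. All the values in the example call are placeholders.

```python
from pathlib import Path


def write_character_txt(folder, name, image, author, web, sample, encoding):
    """Write character.txt into the voice bank folder and return its path."""
    lines = [
        f"name={name}",      # key=value without spaces is an assumption
        f"image={image}",
        f"author={author}",
        f"web={web}",
        f"sample={sample}",
    ]
    path = Path(folder) / "character.txt"
    path.write_text("\n".join(lines) + "\n", encoding=encoding)
    return path


if __name__ == "__main__":
    # Placeholder values; the encoding is whichever one your copy of UTAU
    # displays correctly.
    write_character_txt(".", "Example", "Profile_image.bmp", "Your name",
                        "http://example.com", "sample.wav", encoding="utf-8")
```

The encoding is left as an explicit argument because it only matters once the name contains hiragana; if the name shows up garbled in UTAU, try saving with a different one.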
The next txt file needs to be created now and this will contain information
about your UTAU. It can be plain and simple or it can be detailed like I did
with mine, so I will provide the layout that I put in mine. Create your file and
input this:
• Terms and conditions – This is just some simple information such as: is it CV or VCV? Who created it? Is there more information available, and if so, where is the page? Do you allow edits? Copying? Be sure to list everything here if you want to be specific about your rights.
• Make a solid line to separate the info, since underneath this you want to put character traits.
• Name: (Your UTAU’s name)
• Age: (Your UTAU’s age)
• Height: x’x”ft (xxxcm) (this is to cover the metric system as well)
• Weight: xxxlbs (x’x”stone)
• Gender: (This can be anything obviously, it’s not restricted to male or female)
• Voice Range: (You don’t have to enter this but it can help users if your UTAU works better at a certain range. You can find this information out in UTAU when you first run it.)
• Flags: (Do you use flags? Be sure to list them here. Don’t worry; a list of available flags will be given later.)
• Resampler: (Does your UTAU sound better with different resamplers? List them here)
• Character Item: (Every UTAU and Vocaloid has an item! Be sure it matches your UTAU’s personality and/or design)
• Personality: (List your UTAU’s personality here.)
• Likes: (What does your UTAU like?)
• Dislikes: (What does your UTAU hate?)
• Related characters: (Only list those you have either created or have gained permission for.)
Be sure to then save it as readme.txt so that UTAU will read it, and the results
are as follows! With that out of the way, your UTAU’s voice bank is now
completed! Celebrate and rejoice as there is no more work required here!
Just remember these steps whenever you make a new voice bank, and if you plan on
making appends, ALWAYS DO RESEARCH! It helps in the long run.
UTAU Flags
Flags are probably among the most difficult parts of UTAU to understand, as
they can either help a Voice bank or destroy it. The Flag I use for Minuet is one
I found somewhere online to help her sound a bit clearer, since I
didn’t have a good microphone, but I suggest experimenting a bit with them.
The most commonly used flag is the Gender Factor. I usually use g-2 or g+2 for
Minuet depending on the pitch of the UST. I will explain the flags with a useful
chart I found in the User Guide for UTAU, which I will link in the useful links
section.
In UTAU, you can apply a variety of tonal adjustments by entering flags
like "g-5H30Y0" in the "Flags" text box of "Sound Properties" 「音のプロパティ」.
Note: You must enter half-width uppercase or lowercase letters in Flags.
Also please note that the case is significant (uppercase and lowercase
letters are different Flags).
In addition, there are Flags that are valid with all the resamplers (singing
synthesis engines), and Flags that are valid only for the latest generic
Resampler version and for the development versions (resampler7,
resampler8).
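A flag string like "g-5H30Y0" is just several flags written back to back: a letter, then an optional sign and number. As a small sketch of how to read one (this is my own illustration, not how UTAU itself parses flags), the following Python splits the string into its parts while keeping uppercase and lowercase letters separate:

```python
import re

# One flag = a single letter followed by an optional signed number.
FLAG_PATTERN = re.compile(r"([A-Za-z])([+-]?\d+)?")


def parse_flags(text):
    """Split a flag string such as "g-5H30Y0" into {letter: value} pairs."""
    flags = {}
    for letter, value in FLAG_PATTERN.findall(text):
        # A flag written without a number gets None as its value.
        flags[letter] = int(value) if value else None
    return flags


print(parse_flags("g-5H30Y0"))
```

Because the dictionary keys are case-sensitive, "H20" and "h20" end up as two different entries, matching the note above.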
Flags valid with all the resamplers:
Each entry gives the Flag, its base value and setting range, then the feature and how to set it up effectively.
• g – base value 0, setting range -100 .. +100
A flag to control the effect offered by the formant (the voice quality determined by the structure of the throat or mouth). (Strictly speaking, this is different from VOCALOID2's gender factor, but it has the same effect.)
Make sure to set this flag value with the + or - symbol in front, e.g. g+10, g-10.
When setting positive values, the voice becomes deeper, more mature and masculine (with +20 or more, a female voice can become male).
When setting negative values, the voice becomes thinner, more childlike and feminine (with -20 or less, a male voice can become female).
• t – base value 0, setting range -9 .. +9
Flag to adjust the pitch in units of 10 cents (1/10th of a semitone). Make sure to set values with the + or - symbol in front, e.g. t+5, t-5.
• Y – base value 100, setting range 0 .. 100
The part outside of the fixed range in consonants is called breathiness. (Details are omitted.)
By specifying a small value, e.g. Y0, the breathiness part of consonants becomes relatively stronger, and the articulation is considered to be better. (As a side effect, noise appears that makes high notes sound metallic, so either increase the flag value or adjust the H low-pass filter described below at the same time.)
Note: If you use a continuous sound source, specifying a small value such as Y0 causes noise, so please leave it at the default value Y100.
• H – base value 0, setting range 0 .. 99
A low-pass filter to emphasize the bass and cut the treble. (When used together with the C, D, E low-pass filters described below, they produce the same effect.)
It has the effect of mitigating the metallic noise on high notes, but as a side effect the sound becomes muffled.
• h – base value 0, setting range 0 .. 99
A low-pass filter operating outside of the breath component of consonants (breathiness). As it emphasizes the high frequencies of consonant components, it is unsuitable for sound sources where the consonant component is unstable.
Note: If set too strong, voices become hoarse even with sound sources in which the consonant component is stable, and you need to reduce the value of the Y flag.
• F – base value 3, setting range 0 .. unspecified
This adjusts the strength of the formant filter. The formant filter depends on the frequency defined by "source frequency * specified value".
It is generally better not to touch it, but when noise appears in low tones, you can suppress it by specifying values around F4 .. F7.
This flag is valid for Resampler development versions too, but changes are not as big as in the default generic Resampler version.
• L – base value None, setting range 0 .. unspecified
A fixed frequency for the "F" flag above. The formant filter then depends on the frequency defined by "170Hz * specified value".
When used simultaneously with F, this value takes precedence.
Flags that are valid only for the latest generic and the development
Resampler versions:
Each entry gives the Flag, its base value and setting range, then the feature and how to set it up effectively.
• b – base value 0, setting range 0 .. 100
BRE adjustment after the formant filter.
BRE changes are loosened when its pitch is very different from the pitch of the primary sound, and the voice becomes unpleasantly rough.
In addition, because it is not influenced by the low-pass type filters (C, D, E, H and h), the sound coming from BRE is bad and muffled.
• C – base value 0, setting range 0 .. 100
A low-pass filter especially reducing the high frequencies.
When set to 100, the volume is 100% at 0kHz, 50% at 11kHz, and 0% at 22kHz.
• D – base value 0, setting range 0 .. 100
A low-pass filter cutting the midrange.
When set to 100, the volume is 100% at 0kHz, 0% at 11kHz, and 100% at 22kHz.
• E – base value 0, setting range 0 .. 100
A low-pass filter cutting the bass and treble.
When set to 100, the volume is 100% at 0kHz, 0% at 7.1kHz, 100% at 11kHz, and 0% at 22kHz.
• c – base value 50, setting range 0 .. 100
The value of the C flag before the formant filter adaptation.
• P – base value 86, setting range 0 .. 100
Peak compressor. Aligns the peak volume of the primary sound. (The volume setting and the envelope changes are applied separately.)
There is zero variation with a value of 100. When set to 99 or less, the variation produced is proportional to the volume of the primary sound and to the parameter value.
Because only the peak volume of the primary sound is aligned, a sense of instability in the volume will remain in sound sources with unstable volume changes, even if it is set to 100.
• W – base value unspecified, setting range unspecified
Produces a robotic voice. This flag is highly experimental, and is generally not used.
(Source - http://utau.wikia.com/wiki/UTAU_User_Manual_-_7 )
As I stated before, experiment with flags but just be careful with the values.
Plus, flags don’t make a UTAU; they are there to help, but it is good recordings
and a good OTO that make a good Voice bank.
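Since it is easy to type a value outside what a flag accepts, a quick check against the ranges in the two tables can save some head-scratching. This is only a sketch: the ranges are copied from the tables above, F and L are left out because their upper limits are unspecified, and W is left out because it has no documented range.

```python
# Setting ranges copied from the two flag tables in this section.
FLAG_RANGES = {
    "g": (-100, 100),
    "t": (-9, 9),
    "Y": (0, 100),
    "H": (0, 99),
    "h": (0, 99),
    "b": (0, 100),
    "C": (0, 100),
    "D": (0, 100),
    "E": (0, 100),
    "c": (0, 100),
    "P": (0, 100),
}


def out_of_range(flags):
    """Return the flag letters whose value falls outside the listed range."""
    problems = []
    for letter, value in flags.items():
        if letter in FLAG_RANGES and value is not None:
            low, high = FLAG_RANGES[letter]
            if not low <= value <= high:
                problems.append(letter)
    return problems


# Example values chosen to trip the check on H and t.
print(out_of_range({"g": -2, "H": 120, "t": 12}))
```

Staying inside the range does not mean the result will sound good, of course; it only rules out values the resampler is not documented to accept.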
Using USTs
With your brand new voice bank we can finally make them sing! Yep, this is
the fun part but it can also be frustrating. Remember, if you use someone’s
UST, ALWAYS CREDIT THEM!! They worked hard on it and need to be
recognized. Never plagiarize; you will be found out immediately. So use
Google, YouTube or Nico Nico Douga to find USTs. For this I will use the UST for
Palette as done by HaruVampire on YouTube.
When you open it, make sure you load the UST with your preferred voice bank,
as UTAU will automatically set Defoko as the voice or try to use the voice
bank that originally created the UST. As you can see, HaruVampire’s UTAU comes
up, so you want to be sure to change it. If you have any preferred flags for
your UTAU, be sure to set them here. Don’t be afraid to experiment. Click OK
and load it up.
Your window should look like this! Wonderful, now onto the next steps.
You now want to make sure the UST fits your UTAU, as every UTAU has different
settings that might not mix well with the current settings on a UST. ALWAYS
check whether a UST already has settings, or it’ll cause problems later on. If
you find some of the fields are filled in, then do the following:
1. Select ALL of the UST by hitting ctrl+a. This will allow you to edit the
track. Then right click on it.
2. This drop down menu should appear. You want to go to the very bottom where
it says Region Property and click on that. It should bring up a new pop up
window.
3. You don’t want to touch Intensity or Modulation. NEVER touch these unless
you are making a UST, because you can really mess up the balance of the song.
If Preutterance and/or Overlap is greyed out, this means it won’t fit your
UTAU. You want to hit the clear button for that.
4. There’s a little link that says “Details” on it. Click that to drop down
the rest of this window.
5. You’ll see that it has some more to it: BRE, Flags and STP. BRE adds
breathiness and can make your UTAU sound robotic, so unless that is what you
are aiming for, I suggest clicking on the box and hitting space to clear it.
6. Always clear the flags, since you don’t know what flags the creator has
used and they might make your UTAU sound distorted. You should have set your
own flags if you read my previous steps.
7. ALWAYS CLEAR STP! This feature subtracts milliseconds from the head of the
preutterance. In other words, if there’s too much STP it’ll sound like your
UTAU is not pronouncing things correctly. When you’re done setting everything
up, be sure to hit OK to save your changes.
8. When you’ve finished, hit these three buttons at the top of UTAU in this
particular order:
a. ACPT (Apply automatic parameter adjustment)
b. P2P3 (Set crossfade envelopes by p2 and p3)
c. P1P4 (Set crossfade envelopes by p1 and p4)
d. ACPT (Apply automatic parameter adjustment)
9. That will make the UST fit nicely to your UTAU!
10. Now for the final steps that I personally do, but you don’t have to if you
don’t want to. Make sure you once again select everything with ctrl+a.
11. Click on the Tools menu, navigate down to Built-in Tools and click on
A LA CARTE. Now this bit can be confusing to a beginner, but trust me, it will
help in the long run. I was kindly told about this by my friend felipone.
12. You want to make sure the box next to “Connect vowels smoothly to previous
note!” is checked. Click all the hiragana boxes as this will cover the
vowels, then add into the “Others” box the hiragana for “wo” – を and
then in English “a e i o u n wo”. Now the settings can be anything you
want, but try to make them work for your UST. Leave Rising Note and
Falling Note unchecked. For me, I set the timing to Medium and
Rising+Falling Note Change to Medium. Now I encountered a massive
problem with HaruVampire’s UST for Palette, which was that there was too
much vibrato. I fixed this with A LA CARTE by checking the box next
to “Add Vibrato!” and setting it to Little. Set Frequency to Medium and the
duration to Medium. Once you have all your settings, be sure to click
“All” so it’ll apply to the whole UST. You can also use it on just selected
areas!
13. And for the final step, navigate to the Tools menu once more, go to
Built-in Tools and click on Crossfade. Make sure only the Crossfade box is
checked and put into the target the hiragana for the vowels (あ, え, い, お,
う), n (ん) and wo (を). Then after that type in English the vowels, n and wo.
This will make everything flow into each other nicely and will help the UTAU
pronounce better. Set the Width to 100msec and set Start to -70msec. These are
my settings all the time and are the last step I do for my UST editing.
14. Now sometimes you might get an exclamation box over some notes
even after all the editing. If this happens, right click on the note with
this box and navigate to Envelope on the drop down menu to bring up a new
pop up box. Now you want to make sure the four red boxes aren’t overlapping
each other or doing anything funky, so just click on the Normal button and hit
OK and that should fix it. Be sure to go through the whole UST to fix problems
like these or your UTAU will sing way off key.
15. There you go! All finished! Now all you have to do is render this out! Go
to the Project menu and click on the Render wav File option. Pick your save
location, name it as necessary and save. Sorted! Now you can close UTAU and we
can finally go on to Mixing.
Mixing
All right, this is probably the most fun and also the most FRUSTRATING part of
making a song!! You want to have some mixing software, and there’s a wide
range available: Audacity, Reaper, MAGIX Music Maker and so on. Choose one
you think you can become familiar with. For me it’s Audacity and, as
previously stated, this is due to using it A LOT at university. I can
navigate it fairly well, and I find I can make a lot of nice effects with it
thanks to the number of wonderful tutorials there are on YouTube.
So first things first, make sure you have an off vocal for the song you are
making. These can be found around YouTube and are normally supplied with
the UST. I had to find mine for Palette as it wasn’t supplied in HaruVampire’s
UST. A handy thing to have is the actual song and you’ll see why as we
progress.
So how do we mix? Ok here are the steps:
1. Import all your audio into your mixing software. For this tutorial I’ll be
using Audacity. I normally just click and drag what I need into it. Click OK
to any pop up windows. Don’t change anything; you want it to remain intact.
2. Mute your audio (including any harmonies) and then import your Off Vocal.
3. Be sure to mute the Off Vocal when it’s in and, if you have the actual
song, import that. Why? This is going to help you with your timing!
4. Now that everything is in, let’s make sure we get everything on time. You
want to click this little button and start moving the Off Vocal to match up to
the timing of the actual song. Try your best to match the waveforms and you
should have no problems.
5. Next is to line up your audio. In the case of this UST, I had to find
where the singing started in the actual song and then move the audio
until it was in time with the song. Note that not all USTs
need this done, as some creators are nice enough to time it for you.
Neemiso is one of the best at doing this. Do the same with your
harmonies so they are in alignment with your audio!
6. OK, with that done and everything timed perfectly, you can remove the
actual song from Audacity by clicking on the “x”, and be sure to unmute
everything so you can hear it. Now to turn it into that nice song you have
dreamed of making. Select the Off Vocal, go to the Effect menu and pick
Amplify. Set Amplification to -1.5 and click OK if it is highlighted. If not,
click “Allow Clipping” then OK. This will lower the volume of the Off Vocal so
it’s not too overpowering. Do this as many times as you need to, but always be
sure to balance the Off Vocal with the Vocals.
7. Now select your Vocals! This includes harmonies! This is where we make sure
your singer is optimized. Click on Effect and select Reverse. This will
reverse your Vocals but trust me, this is an essential step.
8. Next click on Effect and click on Compressor. Don’t change any of the
settings, just hit OK.
9. Go to Effect again and this time select Equalization. Again don’t change
any of the settings, just hit OK.
10. Once more, go to Effect and select Normalize. Don’t change the settings
and hit OK.
11. Now we just Reverse it back, so click Effect and go to Reverse. There we
go, your Vocals are now nice and clear!
12. A lot of songs require you to use Effects on them, such as to make a
radio-sounding voice and so on. I can’t help here, so I advise you to look up
tutorials on the kind of effect you are after, but I will tell you the effect
I always use on my Vocals.
13. With the Vocals still selected, go to Effect and select Echo. Play with
the settings until you get what you would ideally like, but I tend to set my
Delay Time to around 0.1 – 0.2 and the Decay Factor between 0.2 – 0.3. If I
use more, it’s usually to make the echo more present in the song if it calls
for it.
14. This is the most time consuming part of making a song, as we need to
make sure the Vocals and Off Vocal are in harmony. You can use these little
bars to change the sound a bit to make them sound good to you. If I, for
example, set my Vocals to -2dB, I will set my Harmonies to around -4 – -7dB.
This means they are not overpowering. I have a cut off for my Vocals as well.
If they exceed this line, then I use Amplify set to -1.5 to start lowering the
volume, and I’ll do this however many times I need to until I am personally
satisfied. Same goes for the Off Vocals. I usually set my Off Vocals to around
2dB but some songs might need less.
15. Once you are done with all the mixing, it is finally time to export it.
The reason I said before to download LAME is so you can export in MP3. This
gives a much smaller file and is more commonly used for songs nowadays.
So go to File and select Export Audio.
16. Save it to your preferred location and be sure to name it as the finished
song. For example, I would save mine as “Palette_feat_Minuet_Aoi” so that I
knew it was done.
17. Once you hit save, make sure to hit OK for the next window and then you
have the Edit Metadata screen. Here is where you add information about the
song if you want. I always do since I’m a bit OCD that way, so this is how
mine would look. Once you’re happy, hit OK and it’ll begin to export to your
save location.
18. And that’s it for Mixing. Always save your projects on the off chance
something happens or if you want to edit them. Trust me, it can help if you
decide to change anything in the vocals. As you can see, it becomes
easier the more you do this because you learn with each new UST you
work on. ALWAYS CREDIT THE CREATOR OF THE UST!!! I can’t
stress this enough but please, credit them and link to the original
download or to the actual creator’s page. So you are now a UTAU user, and I
bet you think you’re done now? Nope! There’s one more thing to do
and that is to create a wiki page for your UTAU on the UTAU wikidot.
This community is better than the previous UTAU wiki and you can
create a page easily enough!
A useful guide on mixing that I found on a UTAU forum! It explains everything
a beginner needs to know so you guys can follow this one too!
Audacity and LAME:
Audacity is open-source, has an excellent wiki, and solid functionality.
We'll be using Audacity 1.3.12 Beta for this guide.
You'll also need LAME, in order to save mp3 on Windows. Follow
these instructions.
Audacity Recording and Editing Basics:
The easy stuff:
• Recording via the recording button (toolbar)
• Importing your BGM (just drag and drop is fine)
• Adjusting microphone sensitivity with the slider with the microphone icon
(toolbar)
• Zooming in and out in time (a.k.a. the x axis) with the +/- magnifying
glasses (toolbar)
• Zooming in and out in the y axis by clicking and shift+clicking on a track's
y axis labels
• Selecting a piece of audio with your mouse, cutting if necessary (ctrl+x)
• Selecting the whole track by clicking on the space where it says "Stereo,
44100 Hz" etc.
• Changing the graphical display of a track to Waveform, Waveform (dB), etc.
(that triangle button beside the name of the track)
• Making the track display area bigger by dragging the bottom edge of the
track. This will make the y axis labels display more information, which will
be useful for mixing.
• Adjusting the volume (or gain) of a track (slider on the left side of the
track with -/+ signs)
Before Mixing - Timing:
Nothing says "I didn't practice enough" more than starting to sing half a
second after when you were supposed to, or finishing a phrase with a
syllable or two still unsung. Sure, you can blame it on the music
starting too suddenly or something, but you don't see better singers
doing it =P. It's possible to edit timing, but it's much easier to fix it by
practicing your singing.
• Recognize: Play your vocals and the original song at the same time (import
them both into Audacity). Make sure they start at the exact same time by
zooming in in time and cutting out bits of silence at the beginning. It'll be
obvious during playback which parts you messed up the timing on.
• Recognize: Audacity sometimes has an infuriating habit of adding a bit of
delay in front of your recording as it prepares itself to record. This can
range from 40 to 400 milliseconds. Even 40 milliseconds of delay is
noticeable, so it's up to you to make this right! Zooming in in time and
looking at the waveform helps.
• Prevent: Nothing you can do about the Audacity lag, other than getting a
better computer =P. As for your own singing, suck it up and practice! =D
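To get a feel for how much audio 40 to 400 milliseconds actually is, here is a rough Python sketch. It assumes a 44100 Hz sample rate (the figure shown in the track label mentioned earlier) and a track held as a plain list of samples. The function names are made up for this example and have nothing to do with Audacity itself.

```python
def delay_in_samples(delay_ms, sample_rate=44100):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms * sample_rate / 1000)


def trim_start(samples, delay_ms, sample_rate=44100):
    """Drop the first delay_ms of a recording to pull it back into time."""
    return samples[delay_in_samples(delay_ms, sample_rate):]


if __name__ == "__main__":
    for ms in (40, 400):
        print(f"{ms} ms of lag = {delay_in_samples(ms)} samples at 44100 Hz")
```

In Audacity you would do the same thing by hand: zoom in, select the extra silence at the start and cut it.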
Before Mixing - Pitch:
You're off-tune? And you hope mixing and editing can save you?
You're right, but it's hard. Very hard. Harder than five-year-old cheese.
Audio engineers do it for pop idols that can't sing any better than your
favourite nico singer (actually most nico singers are probably much
better than the likes of Hannah Montana). But you're neither an audio
engineer nor a pop idol (yet), so you'll have to do with good ol'
fashioned practice...
• Recognize: If you've got a good ear, you'll hear it. If you don't have a
good ear, someone else will hear it, so ask. How do you know if you have a
good ear? This test can tell you.
• Prevent: Practice practice practice... it's hard, I know. Pay attention to
the pitch you're producing, try singing a bit more slowly, watch out when you
go high or low, whatever you notice you're weak in, practice it.
Before Mixing - Clipping:
All microphones have a certain level of maximum sound energy they
can convert to electrical energy to send to your computer. Any
difference between the energy received and the energy sent on is
simply lost. As a result, a recording of such strong vocals sounds like
it's missing bits of signal, as if they've been "clipped" out. Clipping
is best fixed by properly setting the sensitivity on your mic, and not by
mixing or editing.
• Recognize Clipping: Change the graphical display of your track to
"Waveform". Do any of those waves touch the ceiling or floor? If so, you might
have a little bit of clipping. If you find that the waves are hugging the
ceiling or floor for seconds at a time, you've clipped, man, and you've
clipped BADLY.
• Prevent Clipping: Turn down the sensitivity of your mic (that slider in the
toolbar, with the microphone icon next to it) and re-record until your vocals
no longer have waveforms that touch the ceiling/floor, especially during the
loud parts. If you think this makes the soft parts too soft, I know already
you're going to like the compression section of this guide =D.
• Prevent Clipping: Also, make sure you're not so close to the mic you're
about to devour it. If you have a regular mic, put it to the side of your
mouth, instead of directly in front, to avoid "boom" sounds caused by
breathing into the mic. If you have a condenser mic, consider a pop filter.
You know, one of these.
• Okay, so maybe you can fix clipping a little bit: Krystal doesn't like it,
but *whispers* Effect -> Clip Fix... Try it out if you only have a little
clipping. But be warned, it takes a REALLY long time.
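The "do the waves touch the ceiling or floor" check can also be written down as a quick Python sketch. This is only an illustration: it assumes the samples are floats between -1.0 and 1.0, with 1.0 being the ceiling, and the function name and tolerance are made up for this example.

```python
def clipped_fraction(samples, ceiling=1.0, tolerance=1e-4):
    """Fraction of samples sitting at (or just under) the ceiling or floor."""
    if not samples:
        return 0.0
    hits = sum(1 for s in samples if abs(s) >= ceiling - tolerance)
    return hits / len(samples)


if __name__ == "__main__":
    quiet_take = [0.0, 0.2, -0.3, 0.1]
    loud_take = [0.0, 0.5, 1.0, -1.0]
    print("quiet take:", clipped_fraction(quiet_take))
    print("loud take:", clipped_fraction(loud_take))
```

A value near zero means the take is probably fine. Anything noticeably above that is a hint to turn the mic down and re-record.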
Mixing - Noise Removal:
There's always a bit of background noise. No, I'm not talking about
your brother's yelling downstairs. I'm talking about the hum of your
computer and the ambient buzz in the air. Your brain might tune it out
for you, but the microphone will not. Unfortunately it's hard for the
computer to distinguish noise from voice, so with any noise removal
process there comes a little distortion in vocals. The skill in mixing here
comes from the right balance between noise removal and voice
preservation.
• Select a few seconds of the noise you want to remove, and go to Effect ->
Noise Removal. Click "Get Noise Profile". This tells Audacity what noise to
remove.
• Now select a portion of your vocals and go to Effect -> Noise Removal again.
  o Noise reduction (dB): How much to reduce the noise by. More reduction
  means less noise, but also more voice distortion.
  o Other settings: Don't worry about them until you're pro enough.
  (Actually, I only know what they do, but not how to use them effectively.
  The default settings work fine though.)
• Use the preview button. You'll notice it only gives you the first few
seconds of whatever piece of audio you selected. This'll help you adjust the
settings until you're satisfied with the effect.
• Once you're satisfied, remember the settings and click cancel. Now select
the whole track and Effect -> Noise Removal again. Enter the settings you
decided on and click OK.
Mixing - Compression:
Dynamic range refers to the difference between the volume of the
loudest sound and the softest sound. Raw vocals have a HUGE
dynamic range, much larger than your BGM, in most cases. That's why
oftentimes if your verse is just right your chorus gets too loud, or if your
chorus is just right you can't hear the verses anymore. Compression
"compresses" the dynamic range so the two are closer together in
terms of volume, thus blending in with the BGM which has a similar
dynamic range. The mixing skill here is to reduce differences in
volume, but not so much that you can't hear the differences between
powerful vocals and "sweet" vocals.
• Change the graphical display of your vocal track to "Waveform (dB)"
(remember that little triangle thing next to the track name?) and drag the
bottom edge of the track until you have lots of informative labels displayed
on the y axis. Don't be afraid to make the track so large as to fill the
screen.
• You'll notice general differences in volume between various parts of your
vocals. Note approximately how loud your soft parts and loud parts are
(Audacity shows the loudest as 0 dB and the softest as -60 dB). Say you found
that they were -25 and -10 dB respectively.
• Now, select a portion of your recording you'd like to preview, preferably
containing a second of soft vocals followed by a second of loud. For example,
the transition to a chorus.
• Effect -> Compressor
  o Threshold: How loud the vocal has to be before compression is applied to
  it. We want to compress the loud vocals while leaving the soft vocals as
  they are. For our values of -25 dB soft and -10 dB loud, we'll set the
  threshold to -20 dB. Thus anything louder than -20 dB (such as our loud
  -10 dB vocals) will be compressed.
  o Ratio: How much the vocals to be compressed will be compressed. 2:1 means
  that the dynamic range of whatever passes the threshold will be cut in half.
  o Other settings: They're fine as they are.
• Preview, and play with the ratio until the volumes are more equivalent
between the soft and loud vocals, but not so much so that you can't hear the
difference in power anymore.
• Remember the settings, cancel, select whole track, and apply the compressor
with the settings you decided on.
• There's a curious trend in the music industry to heavily compress the
dynamic range so as to get the loudness of every part of the song as high as
possible. This makes sense, since it'll be easier to hear high and low
frequencies when it's louder. And if two identical songs, one slightly louder
than the other, were to be played, the louder one generally is regarded as
better. Many people think heavy compression isn't good (I'm one of those
people, since I like classical, and play piano. Dynamics is very important...
Compressing until only 3dB remain, like TV commercials are, is unthinkable to
me.), but that's the way it is right now. Wiki up "loudness war" if you're
interested.
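To put numbers on the Threshold and Ratio settings described above, here is a small Python sketch using the guide's example values (-25 dB soft, -10 dB loud, threshold -20 dB, ratio 2:1). It only models the basic level curve. The attack, decay and gain options of a real compressor are left out, so this is not how Audacity's Compressor is implemented, and the function name is made up for this example.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=2.0):
    """Output level in dB for a given input level in dB.

    Levels at or below the threshold pass through unchanged. For louder
    levels, the part above the threshold is divided by the ratio.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


if __name__ == "__main__":
    soft, loud = -25.0, -10.0
    print("before:", soft, loud, "difference", loud - soft, "dB")
    new_soft, new_loud = compress_db(soft), compress_db(loud)
    print("after: ", new_soft, new_loud, "difference", new_loud - new_soft, "dB")
```

With these values the soft part stays at -25 dB, the loud part drops from -10 dB to -15 dB, and the gap between them shrinks from 15 dB to 10 dB. A higher ratio would shrink it further.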
Mixing - Equalization:
Every pitch is identified via a frequency of the sound waves carrying
the pitch. The higher the frequency, the higher the pitch you hear. The
human voice typically ranges from 80 to 1100 Hz, with low vocals
obviously at the lower end and high vocals at the higher end. Men
typically have vocals centered between 80-500 Hz (not counting
falsetto), and 170-1100 Hz for women (though the women's range
covers more Hz, the relationship between frequency and pitch is not
linear. For each octave you go up in pitch, you'd have to DOUBLE
the frequency; thus more frequency change is needed to go from high
C to high D than going from low C to low D). The purpose of the EQ is
to boost or diminish the volume of sound based on their frequency. For
example, boosting 80 - 200 Hz might make your bass guitar sound
more prominent.
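The doubling rule is easy to check with a few lines of Python. This sketch assumes twelve equal steps per octave (standard equal temperament, which the guide doesn't mention), and the function names are made up for this example.

```python
def octaves_up(freq_hz, octaves):
    """Going up one octave doubles the frequency."""
    return freq_hz * 2 ** octaves


def semitones_up(freq_hz, semitones):
    """Twelve equal semitone steps make one octave (an assumption here)."""
    return freq_hz * 2 ** (semitones / 12)


def whole_tone_step_hz(freq_hz):
    """How many Hz you move when going up one whole tone (e.g. C to D)."""
    return semitones_up(freq_hz, 2) - freq_hz


if __name__ == "__main__":
    print("One octave above 110 Hz:", octaves_up(110.0, 1), "Hz")
    for start in (100.0, 400.0):
        print(f"Whole tone up from {start} Hz: +{whole_tone_step_hz(start):.1f} Hz")
```

The same musical step costs four times as many Hz when you start two octaves higher, which is why the women's range in the paragraph above covers more Hz without covering more notes.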
If your vocals are drowned out by the BGM, you can make the BGM
quieter. But as Ciel pointed out, you don't need to make the entire
BGM quieter - just the frequencies where they interfere with your
vocals' frequencies. In effect creating "space" in the BGM for your
vocals. But which frequencies? Well, Audacity has a neat tool...
• Select a representative part of your vocals. Say, the first verse, bridge,
and chorus together. Now go to Analyze -> Plot Spectrum.
• Whoa, it's a graph. Frequencies on the X axis, volume on the Y axis. So if
you see a peak at 400 Hz that means there are a lot of notes at around 400 Hz.
From this graph you can get a feel for what range of frequencies you're
singing in.
• Don't fuss about being exact. Once you start getting comfortable with the
equalizer, you'll know that ranges, not numbers, are what you'll be working
with.
• Now, think about what parts of the song are interfering with your vocals.
Let's use Campanella as an example. It starts out simple, with hardly any BGM.
But in the chorus it builds up and by the third chorus there's drums and
cymbals and piano and even rocketships. My vocals rang true in the beginning
but were drowned out near the end.
• My strategy was to find out my vocals' frequency range (which we just did
with Plot Spectrum), then make the BGM quieter in that range whenever I feel
like I'm being drowned out, usually the chorus. I ended up doing the EQ with
every chorus, and with harsher settings in the final chorus. But how do you
work the EQ?
• Select the portion of BGM you want to apply the EQ to, and go to Effect ->
Equalization
  o Whoa, it's another graph. Y axis is volume, X axis is frequency. The
  default line is at 0 (no adjustment) for all frequencies. You can manipulate
  the line with the mouse. Try it out. For me, I made a small valley of about
  -5 dB from 150-600 Hz. Your spectrum might be different.
  o Alternatively, you can select the "Graphic EQ" radio button and have
  sliders instead of messing with the line yourself.
  o Other settings: If you're itchy, try them out. Just don't do anything
  permanent. If not, leave them alone lol.
• Preview doesn't do much here, since you need to hear your vocals at the same
time. So to experiment here you'll need to apply the EQ, play it back, and if
you're dissatisfied, you'll have to undo and do it again. Yeah, it's one of
Audacity's weaknesses - but hey, it's free.
• Once you're satisfied, click "save as" and name your curve. Now you can
apply the same EQ to other parts of the BGM that drown out your vocals.
Mixing - Amplification:
Every effect affects volume. Noise removal reduces the volume of
whatever it recognizes as noise. Compression reduces the differences
in volume. Equalization adjusts volume based on frequency. Amplify
affects volume much more simply. It's a pure addition/subtraction in
volume. The skill in mixing here comes from knowing where your
vocals need boosting. Is a specific part too quiet? Is the whole track too
loud?
• Select something that needs boosting (the low notes that you lacked power
in, and can thus barely hear, maybe?) and go to Effect -> Amplify.
  o Amplification (dB): How much louder/quieter you want it to be.
  o New Peak Amplitude (dB): Audacity calculates how many dB it will have to
  amplify your selection to make the peak (loudest part of the selection)
  whatever dB you entered here, then changes the "Amplification" field to
  reflect this calculation. End result is normalization (see next section) to
  whatever dB you entered. By default this is set to 0.0 dB.
  o Allow clipping: If you check this, you can boost volume above 0.0 dB, but
  clipping will result. I recommend not checking it.
• Preview, adjust, and apply. Play back to make sure you haven't made it so
much louder/softer that it stands out too much from the rest of the vocals.
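The relationship between the Amplification and New Peak Amplitude fields can be sketched in Python. This is only an illustration under stated assumptions: samples are floats between -1.0 and 1.0 (so 1.0 is 0 dB), decibels use the usual 20 x log10 rule for amplitude, and all the function names are made up rather than taken from Audacity.

```python
import math


def db_to_gain(db):
    """Turn a change in dB into a multiplier for the sample values."""
    return 10 ** (db / 20)


def peak_db(samples):
    """Level of the loudest sample in dB, where 0 dB means full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("selection is completely silent")
    return 20 * math.log10(peak)


def amplification_for_peak(samples, new_peak_db=0.0):
    """How many dB to amplify by so the peak lands on new_peak_db."""
    return new_peak_db - peak_db(samples)


def amplify(samples, db):
    """Apply the same dB change to every sample."""
    gain = db_to_gain(db)
    return [s * gain for s in samples]


if __name__ == "__main__":
    selection = [0.0, 0.25, -0.5]
    needed = amplification_for_peak(selection)
    print(f"peak is {peak_db(selection):.2f} dB, so amplify by {needed:.2f} dB")
    print("after:", amplify(selection, needed))
```

A selection that peaks at half of full scale sits at about -6 dB, so it takes about +6 dB to bring the peak up to 0 dB. Asking for more than that is where clipping would start.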
Mixing - Normalization:
Ever had an mp3 that was quieter than most others in your collection?
If you were to make the song louder so that the max volume in this
song is the same as the max volume in another song (typically 0 dB),
you'd be normalizing it.
• Normalization is useful, but there's already a way to do it with Amplify,
and it's also included in the Compressor, if you remember.
Mixing - Gain:
Every track has a -/+ slider on the left side of its display. If you can't see
it, drag the bottom of the track to make the display area bigger. It has
the same effect as Amplify, but limited to exactly one track at a time.
So why bother?
• It can be adjusted on the fly. Meaning you can adjust it while the music is
playing. It also displays how much gain you're applying in dB. This is
excellent for finding out exactly how many dB of amplification that quiet part
of your vocals should get.
• Once you know how many dB to amplify by, you can put the gain slider back to
0 dB, and use Effect -> Amplify instead to make the actual changes.
• Why not use the gain slider to make volume changes? It applies to the whole
track whether you like it or not, so if you only want to make one part of the
track louder, you're out of luck. Also, unlike Amplify, there is no "allow
clipping" checkbox to leave unchecked, so it won't warn you if you're
clipping.
Mixing - Reverberation:
Singing in the bathroom obviously sounds different from singing in your
bedroom, which in turn is different from singing in a concert hall. The
reason is echo and reverberation. If you compare your freshly noise-removed,
compressed, and equalized vocals with the vocals from
some songs, you'll notice that despite all your mixing, your vocals still
sound very... naked. Very raw. But that doesn't mean you should make
yourself sound like you're in the Globe Theatre. You just have to match
the reverb of your BGM, or at least the reverb of the original Miku
vocals or whatever. The skill in mixing here comes from being able to
add reverb that is pleasing, but not readily noticeable (it'll distract
listeners from your beautiful singing, you know?). As in, you'll notice if
you compare, but not if you simply listen.
• Select a suitable preview section. Preferably containing a second of a few
words followed by a long drawn-out vowel. (Kaku yuugoro ni saaaaaaaaa~)
• Effect -> GVerb (some people use Echo, but GVerb is more flexible and takes
less processing time).
  o The settings are complicated. You should start out with the presets here.
  I like "The Quick Fix" for most songs.
• Preview. Try to aim for something that sounds pleasantly full, but natural
at the same time. Adjust the amount of reverb by changing these settings:
  o Early reflection level: Loudness of the first echoes. It's once again from
  -60 dB (softest) to 0 dB (loudest).
  o Tail level value: Loudness of the echoes of the echoes, as they "die
  away". This is what makes reverb vocals sound so full and pleasant. Also
  from -60 dB to 0 dB.
• Remember your settings, cancel, select the whole track, and apply the
settings you chose.
• You might want to hear your vocals again with the BGM, since the preview
only plays the track you selected by itself. You might need to undo the GVerb
and do it again with different settings.
• Sometimes a good pair of headphones can be a liability. What you hear as
just the right amount of reverb someone else with just the good ol' iPod
headphones might hear as, well, nothing. Sound quality also differs by sound
card. You'll just have to learn about these the hard way, though! So once you
export your mp3 later, test it out on another computer, or ask your friend to
compare two versions, one with reverb and one without.
Mixing - Panning:
Stereo means different sound signals can be sent to the left or the
right speaker. Biologically speaking, your brain interprets a sound as
coming from the left if it receives the sound from the left ear a few
milliseconds faster than the right ear. Most mixing programs can create
the illusion of your vocals coming from the left or right by panning.
• Underneath the Gain slider on each track is an L/R slider that controls how
close the sound from this track will sound to the left or right side.
• Harmonies and background singing are good targets for panning to the left or
right while your main vocals remain centered.
• If you're mixing a duet or chorus, there are even more possibilities for
panning. Be creative!
Part Three
Effect - Radio:
A.K.A. the "tinny" effect, "walkie-talkie" effect, "buzzy" effect, etc. This
one is actually produced with *gasp* the EQ! Theory is to cut off the
high and low frequency components of your voice, but there's a
convenient preset in Audacity you can make use of.
• Effect -> Equalization. In Select curve:, there's "amradio". This simulates
the sound from AM Radio stations, which are mostly talk, news, etc. Which
makes sense, because if you look at the curve, everything other than the
typical speaking range is getting cut.
• You can modify this curve if you want, make the slopes sharper, move the
peak to a higher frequency if your voice is higher, make the peak cover a
smaller range, etc.
• Use the preview button to experiment! When you're done, you can save your
curve for future use.
Effect - Echo:
You can also use this for reverb, but GVerb is better for that, and Echo
is more intuitive for ... well, echoing. Maybe you have a song that has
the dramatic soft to powerful shifts like Starduster and Last Night, Good
Night, where the choruses have a bit of echo to accentuate their
contrast from the soft parts of the song. Or maybe you wanted to make
the last note of the song echo just to sound cool. Just remember to use
it in moderation!
• Select where you want to preview, and Effect -> Echo...
  o Delay time (s): time between each echo
  o Decay factor: how much the sound decays with each echo. 0 means complete
  decay (no echo), 0.5 means each echo is half as loud as the last one, 1
  means the echo will never die out.
• Fun: to hear what going crazy sounds like, apply a delay time of 1 and a
decay factor of 1 to, say, 20 seconds of your singing. =D
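The two Echo settings described above can be tried out on paper with a short Python sketch. It feeds a delayed, quieter copy of the output back into itself, which gives the "each echo is decay factor times the last one" behaviour. This is a minimal illustration, not Audacity's actual code. The delay is given in samples rather than seconds to keep the example tiny, and the function name is made up.

```python
def add_echo(samples, delay_samples, decay):
    """Return a copy of samples with repeating echoes mixed in.

    Every output sample gets `decay` times the output from `delay_samples`
    earlier added to it, so each echo is `decay` times as loud as the last.
    """
    if delay_samples <= 0:
        raise ValueError("delay_samples must be positive")
    out = list(samples)
    for i in range(delay_samples, len(out)):
        out[i] += decay * out[i - delay_samples]
    return out


if __name__ == "__main__":
    click = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # one loud click, then silence
    for decay in (0.0, 0.5, 1.0):
        print(f"decay {decay}:", add_echo(click, delay_samples=2, decay=decay))
```

With a decay of 0 nothing is added, with 0.5 each repeat of the click is half as loud as the one before, and with 1 the click comes back at full volume forever. To go from a delay time in seconds to samples, multiply by the sample rate (for example 0.1 s at 44100 Hz is 4410 samples).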
Effect - Autotune:
This basically picks a scale, and "rounds your pitch to the nearest
note", to use a mathematical analogy. Human singing, though, is not
that simple, and that's why autotuned voices generally sound very
unrealistic. But maybe that's what you're going for, like in Campanella.
• Audacity doesn't come with autotune, but there's a plugin we can download to
do the same thing. ChoAkkar introduced it in the second page of the topic.
It's called GSnap. Download it here. And extract the contents of the archive
into the plugin folder (located where you installed Audacity)
• Restart Audacity and you should now see GVST: GSnap... in the Effects menu.
• How do you use GSnap? I don't actually know... I installed it because I
thought I might use it for the Campanella audition, but I ended up leaving it
pretty pristine. Maybe someone else will contribute? If not, I'll update this
after I try it out.
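The "rounds your pitch to the nearest note" idea can be shown with a few lines of Python. This is only a sketch of the rounding step, not how GSnap works. It assumes the scale is all twelve semitones tuned so that A is 440 Hz, and the names are made up for this example.

```python
import math

A4_HZ = 440.0  # reference tuning (an assumption for this example)


def snap_to_nearest_semitone(freq_hz):
    """Round a sung frequency to the closest note of the 12-note scale."""
    # Distance from the reference note, measured in semitones.
    semitones_from_a4 = 12 * math.log2(freq_hz / A4_HZ)
    # Round to a whole number of semitones and convert back to Hz.
    return A4_HZ * 2 ** (round(semitones_from_a4) / 12)


if __name__ == "__main__":
    for sung in (445.0, 455.0):
        print(f"sung {sung} Hz -> snapped to {snap_to_nearest_semitone(sung):.2f} Hz")
```

A slightly sharp 445 Hz gets pulled back down to 440 Hz, while 455 Hz is closer to the next note up and gets pushed there instead. Doing this instantly on every wobble in a voice is what gives autotune its unrealistic sound.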
Exporting to mp3:
Make sure to save your work often! And BTW, Audacity projects can
take up a gigabyte of space if you did heavy editing and mixing, so
make sure you move some old anime or something.
The LAME plugin we installed at the beginning was to allow for mp3
exporting. Go to File -> Export.
• You can export any format from the list, but most people will
  choose mp3.
• When you do, click on "options" and change the quality. Unless
  you're stingy about 10 MB of space, use the highest quality setting
  (320 kbps). This number tells you how much sound data is put into
  the file to represent each second of music (kbps = kilobits per
  second).
• After you press save, you can enter some tag information. I used to
  put my name in, but got embarrassed afterwards, and now I just leave
  mine blank.
(Taken from http://ytchorus.forumotion.com/t1687-the-beginner-s-guide-to-mixing-audacity )
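As a rough check on the file sizes involved when exporting to mp3, the bitrate can be turned into an approximate size with a little arithmetic. The sketch below is an illustration only: the function name is made up for this example, the four-minute song length is an assumption, and it ignores tag data and other overhead.

```python
def mp3_size_megabytes(bitrate_kbps, duration_s):
    """Estimate the size of a constant-bitrate mp3 in megabytes.

    Hypothetical helper: kbps means kilobits per second, and there are
    8 bits in a byte, so size = bitrate * duration / 8.
    """
    total_bits = bitrate_kbps * 1000 * duration_s
    total_bytes = total_bits / 8
    return total_bytes / 1_000_000


# An assumed four-minute song (240 seconds) at two quality settings.
for kbps in (128, 320):
    print(kbps, "kbps ->", mp3_size_megabytes(kbps, 240), "MB")
```

For a song of that length, the highest setting comes out a little under 10 MB.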
Wiki creation
We want to make our UTAU official now, so we need to create a profile on
this webpage:
http://utau.wikidot.com/
You don’t need an account to do this, but if you have one, the page will
show that you are the one who made and edited it. Do whatever makes you
comfortable. The page should look like this:
You want to type in your UTAU’s name the western way as instructed and
click on Create. This will then take you to a page where you can enter
your UTAU’s information. I’ll help you guys in filling it out if the
layout confuses you but it’s pretty self-explanatory.
• Title – The Western Name of your UTAU goes here.
• Western Name – Same as before.
• Eastern Name – This is where the Kanji goes, if your UTAU has any.
• Kana Name – This is where the Hiragana for your UTAU’s name goes.
  Always study how it should look, and if you are not sure, don’t ever
  be afraid to ask for help.
• Icon – This is the profile image you use for your UTAU in the
  software, so just use that.
• Image URL for Official Art (PNG, GIF, JPG only) – You can upload your
  picture to an external site like photobucket, sta.sh and so on. Just
  be sure to take the Direct Image Link and paste it in that box.
• Artist Credit for Official Art – If you did not draw the official art,
  please give credit to the actual artist. You will be found out if you
  claim something as yours and it’s not.
• Gender – Obviously. Again, this is not restricted to just Male or
  Female.
• Age – Just enter the age here, but be sure the age matches the voice.
• Release Date – When you released it officially for download.
• Official Site – This is the site where you post your UTAU-related
  stuff. I personally use Tumblr, but you can use anything, or even your
  own website if you can design one.
• UTAU Group or Production Team – This really only applies if your UTAU
  is already part of a group like Vipperloids or something.
• UTAU Voicer – This is where you put the alias of whoever voiced your
  UTAU. If it wasn’t you, again be sure to give credit, and ask the
  Voice Provider what alias they wish to use if they don’t want to use
  their real name.
• UTAU Manager – I’m going to assume this means who manages the UTAU.
  This would be you.
• File Encoding (especially in the case of Japanese voice banks) – This
  is how you named the WAV files when recording your sounds. I recorded
  Minuet in Romaji so she has Romanized filenames. Just pick whatever
  applies to you.
• OTO.ini Aliasing (especially in the case of Japanese voice banks) –
  When you made your OTO you would have had to add aliases to the sounds
  like I stated before. Because Minuet was recorded in Romaji, I had to
  set her aliases to Hiragana. So again, click what applies to you.
• Voicebank configured on – This is what kind of system you used for
  making your UTAU. I alternated between Mac and PC so I put both. Click
  what applies to you.
• Voicebank Details – Provide the download link here. Remember to use a
  host you maintain regularly. I use Google a lot, so that’s why I chose
  Google Drive. Be sure to add extra info here, such as whether it has
  extra sounds, whether it is CV, VCV or both, what resampler you use,
  the gender factor if it applies, and of course the flags you use for
  this UTAU.
• Soundcloud, YouTube or Nico Embed – The name pretty much states it
  all, but most people use Soundcloud for songs and YouTube for videos.
  You just need to find your iframe; Google how to find it if you aren’t
  sure.
• R-18 Content – You need to decide if you allow mature content for your
  UTAU.
• Commercial Use of Voice banks Allowed? – Again your choice, but be
  very careful what you pick, because it means people can make a profit
  from using your UTAU.
• Commercial Use of Character Allowed? – Same as the previous option,
  but again be careful, because this makes it very easy for people to
  steal your character.
• Do these Terms apply to derivative voices/characters? – This is like
  what Akaito is to Kaito. Only enable this if you want to give this
  option, and if you set it to permission required, remember you have a
  right to say no.
• Link to additional Terms of Use – This is for if you have further
  terms, so I can’t really explain this one.
• Height – In x’x”ft (xxxcm)
• Weight – In xxxlbs (xxxkg)
• Character Details – This is just some basic information about your
  UTAU. You can put anything, but I used a template for the info and
  then just typed that info into the box. I’ll provide that in the next
  section.
• Image URL for Reference Sheet/Artist Credit for Reference Sheet x 4 –
  The name explains it all. Same as the Official Image info.
• Click save when you have added everything you want and there you go,
  you have made your UTAU official!!
So that’s all you need to know! Be sure to maintain your UTAU and respect
others. This is a community where music is supposed to bring us together!
Just follow the rules and do this to be happy and everything will work out. The
next couple of pages will have the character template and some helpful links
for UTAU. I hope this tutorial has helped and here’s to a Music Revolution!
Character Profile Template
Western Name
(Japanese: (Hiragana Name) - Eastern Name in English)
NAME INTERPRETATION:
Hiragana First Name (English) – Meaning of the name
Hiragana Second Name (English) – Meaning of the name
TYPE: What kind of Loid is it?
MODEL: Their model number, if applicable.
GENDER          VOICE RANGE       RELATED CHARACTERS
AGE             GENRE             HOMEPAGE
WEIGHT          CHARACTER ITEM    CREATOR
HEIGHT          VOICE SOURCE      PICTURE LINK LIST
BIRTHDAY        LIKES             MEDIA LIST
RELEASE DATE    DISLIKES          SIGNATURE SONGS
Personality:
Supplemental Information
Hair color:
Headgear:
Eye color:
Headphones:
Dress:
Nationality/Race:
Favorite phrase:
Useful Links
Changing System Locale
http://windows.microsoft.com/en-us/windows/change-system-locale#1TC=windows-7
How to type in Japanese Hiragana
http://www.yesjapan.com/video/pages/install-japanese-windows-7-vista.html
How to create a UTAU
https://www.youtube.com/watch?v=b4R73mmrlRs - Part 1
https://www.youtube.com/watch?v=qUUN-gpofbM - Part 2
https://www.youtube.com/watch?v=2pzAhP2tiLA - Part 3
https://www.youtube.com/watch?v=_1jerBrl91g - Part 4
https://www.youtube.com/watch?v=Rxiv7P2JY_Q - Part 5
https://www.youtube.com/watch?v=s_9zKrROTz4 - Part 6
http://www.vocaloidotaku.net/index.php?/topic/46143-reclist-source-and-explanations/ - Fantastic source of different Voicebanks and reclists!
http://purutau.blogspot.co.uk/2011/01/how-to-addedit-flagsbre-etc-in-utau.html
- More information on flags
http://fav.me/d850zwc - CV Basic English Reclist (if you plan to make a
full English voicebank, it’s recommended to record it as CVVC)
http://fav.me/d7wlw40 - Reclist to record a Japanese CV voicebank with some
English sounds
http://visa-to-america.deviantart.com/art/How-to-create-an-UTAUloid-203047455 - Explanation of Voicebanks as well as Oremo
https://sites.google.com/site/cvvcenglishusts/reclists - CVVC reclists
https://utaututorials.wordpress.com/utafaq/ - FAQS on UTAU
http://auraautumnus.deviantart.com/art/UTAU-MULTILINGUAL-CV-VC-GUIDE-AND-RECLIST-435249586 - Multilingual CV VC reclists (experienced
users recommended)
http://fav.me/d7ehw7i - VCV reclist
http://fav.me/d7ehuix - Blank VCV oto
Tutorials about Mixing
https://www.youtube.com/watch?v=HjHh9AE0Sn0
https://www.youtube.com/watch?v=yRFje8SgR_4
Both of these tutorials are excellent when it comes to mixing! These are the
ones I used when learning about Mixing.
http://utau.wiki/tutorials:equalization:a-mixing-tutorial - Very good explanation
of Equalizers
http://ytchorus.forumotion.com/t1687-the-beginner-s-guide-to-mixing-audacity
- Wonderful and detailed tutorial on how to use Audacity for mixing
Software
http://utau2008.xrea.jp/index.html - UTAU software
http://utau.wikia.com/wiki/UTAU_wiki:UTAU_GUI_Translation - English Patch
http://audacity.sourceforge.net/ - Audacity
http://lame.buanzo.org/ - Lame for Audacity
UTAU Voicebank for best reference
http://www.canon-voice.com/index.html - Namine Ritsu
http://utau.wikia.com/wiki/Nami_Utaune - Utaune Nami’s CV Voicebank
https://www.youtube.com/watch?v=3D1rlp-zpgs - Utaune Nami’s VCV
Voicebank
http://ladyogien.wix.com/ogien-utau - Axis, Atlas and Kasai voicebanks;
excellent references of thorough voicebanks
UTAU User Guide
http://utau.wikia.com/wiki/UTAU_User_Manual
UTAU wiki
http://utau.wikia.com/wiki/UTAU_wiki - Version 1
http://utau.wikidot.com/ - Version 2
How to make an UTAU sound better
http://fav.me/d4u2xgr - How to make male UTAUs clearer