Date: 28.11.2017

UDC 81'374
D.A. Karagoishiyeva, M. Zhanabekova, Sh.A. Ospankulova
Al-Farabi Kazakh National University,
Almaty, Kazakhstan
danel.karagoish@mail.ru

EXPERIENCE AND ISSUES OF SPEECH SYNTHESIZER CREATION

The main aim of this article is to characterize and analyze the difficulties we faced in the process of creating a speech synthesizer that produces speech in the Kazakh language. Speech synthesis is a technology that reads text aloud in a natural-sounding human voice. Speech synthesis is a novelty for the Kazakh language: there is hardly any speech synthesizer for Kazakh, and one cannot find even an artificial voice in our mother tongue on the Internet.

Key words: speech synthesizer, reduction, sound, diphthongoid, utterance, synchronize

Speech synthesizers are software libraries (text-to-speech engines) that give other programs the ability to produce speech from text. Speech synthesizers differ in the quality of the speech they produce and in the options they offer for changing speech settings, and they may include several voices, both female and male. On Windows, speech synthesizers are created on the basis of the Speech Application Programming Interface (Speech API, SAPI), a software library that allows applications to recognize and synthesize speech. SAPI 4.0 was created in 1998; modern versions of the operating system come with SAPI 5 installed [1].
The main aim of this article is to characterize and analyze the difficulties we faced in the process of creating a speech synthesizer that produces speech in the Kazakh language.
Speech synthesis is a technology that reads text aloud in a natural-sounding human voice.
Where can speech synthesis be applied?

Telephony: in contact centers and for building IVR menus
The most frequent and effective application of speech synthesis is voicing information in the IVR systems of contact centers. First of all, this means voicing dynamic information that is individual for each client. Voicing audio clips with a synthesized voice is also useful when information changes quickly, for example stock levels in a warehouse, cinema listings, or service outages at an Internet service provider.

Voicing information on a website
Our speech synthesis engine can be connected to any Internet site or web portal to read the necessary information aloud. This is useful when a visitor cannot sit in front of the monitor, or wants to listen to the text while working in another program. Sometimes the text needs to be read to a person who is not near the computer. It is also useful for people with visual impairments.

Voicing documents and office information in corporate systems
Our speech synthesis can voice any information entered into a system, for example educational exercises, medical systems, or warehouse inventory systems.

Voicing books for listening on mobile devices
We developed the Reader mobile application, which reads aloud any book loaded onto the device.

Voicing videos and educational courses
Subtitles for videos can be voiced, so a voice-over narrator is no longer needed [2].
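As a rough illustration of the IVR scenario described above, the client-specific part of an announcement can be assembled as plain text before it is handed to a speech synthesizer. The sketch below is in Python; the `Client` record, the function name `build_balance_prompt` and the wording of the message are hypothetical and are not taken from any system mentioned in this article. The synthesis step itself is left out, because it depends on the engine in use.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """Hypothetical per-client record held by a contact-center system."""
    name: str
    balance: float
    currency: str


def build_balance_prompt(client: Client) -> str:
    """Return the text that an IVR menu would pass to a speech synthesizer."""
    # Fixed wording combined with dynamic, client-specific values.
    return (
        f"Hello, {client.name}. "
        f"Your current balance is {client.balance:.2f} {client.currency}."
    )


if __name__ == "__main__":
    # Example with made-up data; a real system would read this from a database.
    print(build_balance_prompt(Client("Aigerim", 1520.5, "tenge")))
```

Keeping the text assembly separate from the synthesis call means the same prompt builder could be reused with whichever text-to-speech engine is available.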
It is clear that speech synthesis is a novelty for the Kazakh language. There is hardly any speech synthesizer for Kazakh, and one cannot find even an artificial voice in our mother tongue on the Internet. On the 13th of March of the current year, staff of the Rehabilitation Center for children and teenagers with restrictions on mental and physical development, in collaboration with Russian scientists, presented a device for people with visual impairments that can also voice texts in the Kazakh language. Specialists of the Russian IT company “Elita Group”, led by a group of scientists including Olga Yakovleva, worked on producing the speech synthesizer in the Kazakh language together with the Kazakhstani initiators: R.A. Suleimenova, Head of the Rehabilitation Center for children and teenagers with restrictions on mental and physical development, and E.D. Suleimenova, D.A. Karagoishiyeva, S.B. Bektemirova and M.A. Zhanabekova, teachers of the Faculty of Philology, Literary Studies and World Languages of al-Farabi Kazakh National University. The Russian specialists were responsible for the technical side of the computer speech synthesis apparatus, and the Kazakhstani linguists for preparing the necessary material for the device [3], [4].
Speech synthesizers are being used more and more often in everyday life. As the name itself suggests, speech synthesizers carry out speech synthesis, that is, they convert written text into spoken form.
Thanks to speech synthesizers it is possible to learn new foreign words with the correct pronunciation, or to listen to books without being distracted from other tasks, for example while on public transport. Initially, such programs were developed by organizations specializing in equipment for people with visual impairments. Now any user can download one of these programs, install it on a computer or a telephone and synthesize speech. A variety of programs, applications and even entire systems have been developed for this purpose.
Here is a list of speech synthesizers:
Acapela – one of the most widespread speech synthesizers in the world. The program recognizes and voices texts in more than thirty languages. Russian is supported by two voices: a male voice, Nikolay, and a female voice, Alyona. The female voice appeared considerably later than the male one and is more advanced. The voices can be heard on the program's official site: it is enough to choose a language and a voice and to type a short text. A separate stress dictionary, which makes the pronunciation clearer, was developed for the male voice.
Vocalizer – the second in our list, but not in popularity. It includes the voice Milena from Nuance, the company that develops the Vocalizer program. The voice sounds very natural, and the speech is clear. Various dictionaries can be installed, and the volume, speed and stress can be adjusted, which is very important. As with Acapela, the program has versions for mobile, automotive and computer applications. It is perfectly suited to reading books.
RHVoice – a speech synthesizer developed by Olga Yakovleva. The program voices Russian texts in three voices: Elena, Irina and Alexander.
eSpeak – a free speech synthesizer whose first version was launched in 2006. Since then the developer has regularly released more and more advanced versions. The latest version was released at the end of spring 2013.
Festival – a complete speech synthesis system developed at the University of Edinburgh. The programs and all modules are completely free of charge and are distributed as open source [5].
Thus, the speech synthesizers listed above are those that can be found on the Internet. If you examine them, the number of supported languages is limited, and you can hardly find the Kazakh language among them.
Our work involved developing a device that can read text in the Kazakh language, rather than a speech synthesizer on the Internet. The device itself is manufactured in the USA. The synthesis for a particular language is prepared by a local dealer. This is the result of collaborative work between the Rehabilitation Center for children and teenagers with restrictions on mental and physical development and Russian specialists. The aim of the speech synthesizer is to produce Kazakh speech on a device that is helpful for people with disabilities. We have achieved our goal. As developers of the project, we in turn took part in a special demonstration and had the opportunity to test the device.
As philologists, we went through the following stages in creating the speech synthesizer:
