Cantonese Dictionary

(updated on Oct 26, 2009)

This sub-project maintains a Cantonese dictionary that supports both eSpeak and Ekho.

You can download the latest Cantonese dictionary of eSpeak from http://espeak.sourceforge.net/data/zhy_list.zip.

If you use Perl, you can take a look at the module eGuideDog::Dict::Cantonese.

Our Cantonese dictionary is based on the Jyutping system. The copyright of the Jyutping system belongs to the Linguistic Society of Hong Kong. However, to meet the needs of people in Canton (Guangzhou), we extend this system to 7 tones. We separate contour 53 from contour 55 and mark it as the 7th tone. For example, "衫" and "煙" (cigarette) have contour 55, while "三" and "煙" (smoke) have contour 53. People in Hong Kong may not hear the difference between them. This 7-tone system is not recognised in linguistics; it is just my personal choice, made in order to improve Ekho TTS.
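
As a rough illustration of the extended numbering, the sketch below splits a syllable into its letters and its tone digit and reports whether the tone is the extra 7th tone. The function name `split_tone` and the spellings `saam7` and `jin7` are assumptions made for this example (they simply apply tone 7 to the standard Jyutping `saam1` and `jin1`); they are not taken from the eSpeak or Ekho data files.

```python
import re

# Letters followed by a single tone digit: 1-6 are standard Jyutping,
# 7 is this project's extension for contour 53.
SYLLABLE = re.compile(r"([a-z]+)([1-7])")


def split_tone(syllable):
    """Return (letters, tone, is_extended) or raise ValueError."""
    match = SYLLABLE.fullmatch(syllable)
    if match is None:
        raise ValueError(f"not a valid syllable: {syllable!r}")
    letters, tone = match.group(1), int(match.group(2))
    return letters, tone, tone == 7


print(split_tone("saam1"))  # 衫, contour 55
print(split_tone("saam7"))  # 三, contour 53 (hypothetical spelling)
print(split_tone("jin7"))   # 煙 (smoke), contour 53 (hypothetical spelling)
```

A tool that only understands the six standard tones could use the third value to decide how to treat such a syllable; what it should do in that case is not specified on this page.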

References:

  1. http://en.wikipedia.org/wiki/Jyutping
  2. http://www.lshk.org/cantonese.php
  3. http://www.ogcio.gov.hk/ccli/eng/structure/jyutping.html
  4. http://www.iso10646hk.net/jp/document/download.jsp
  5. http://arts.cuhk.edu.hk/Lexis/lexi-can/
  6. http://arts.cuhk.edu.hk/Lexis/Canton2/
  7. 詹伯慧 等. 广州话正音字典. 广东人民出版社. 2004
  8. 饶秉才 等. 广州音字典. 广东人民出版社. 1983

Missions

No. | Content                                                                             | Contributors               | Status
1   | Contribute to http://sourceforge.net/apps/mediawiki/e-guidedog/index.php?title=粤语 | Jade Lau, TT Five          | completed
2   | Make amendments to our Cantonese dictionary                                         | Michael Tang, Cameron Wong | about 500 items finished
3   | Contribute to http://e-guidedog.wiki.sourceforge.net/粤语第七声调                   | Cameron, TT, Jade          | canceled

Details of mission 2: Make amendments to the Cantonese dictionary

These amendments will apply to the dictionaries of both eSpeak and Ekho. Let's look at an example first:

B
棉被: min4 pei5

C
重重一拍: cung2 cung2 jat1 paak3
长绳: coeng4 sing2
长了眼睛: zoeng2 liu5 ngaan5 zeng1
松了口气: sung1 liu5 hau2 hei3
宝藏: bou2 zong6
一切: jat1 cai3

Below is the format of the list:

  1. The colon is an English (half-width, ASCII) colon, not a full-width Chinese colon.
  2. After the colon, there is one English (half-width) space.
  3. Between adjacent Jyutping symbols, there is one English space.
  4. Any line that does not obey the rules above will be ignored, so you can add comments as you like.
  5. The list should be sorted alphabetically, although this is not a must. We would like to keep words that share a character together. Take 长 for example: we use the C in "coeng4" to sort it instead of the Z in "zoeng2".
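
The format rules above can be sketched as a small parser. This is only one reading of the rules, not the code that eSpeak or Ekho actually use: the names `ENTRY` and `parse_entries` are made up for this example, and the pattern assumes lowercase letters plus a tone digit from 1 to 7 for each Jyutping symbol.

```python
import re

# One entry per line: word, ASCII colon, one space, then Jyutping
# symbols separated by single spaces. Tones 1-7 follow the extended system.
ENTRY = re.compile(r"([^:\s]+): ([a-z]+[1-7](?: [a-z]+[1-7])*)")


def parse_entries(text):
    """Return a list of (word, [symbols]); lines that break the rules are skipped."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.fullmatch(line)
        if match:
            entries.append((match.group(1), match.group(2).split(" ")))
    return entries


sample = """C
重重一拍: cung2 cung2 jat1 paak3
长绳: coeng4 sing2
this line is a comment
宝藏：bou2 zong6
"""

for word, symbols in parse_entries(sample):
    print(word, symbols)
```

In the sample, the section letter "C", the comment line, and the last line (which uses a full-width colon) do not match the pattern and are skipped, so only the first two entries are printed.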

Anyone can help with this mission. Just send Cameron your list. Thank you!
