Wednesday, 08 February 2012

Hua-Lian.Net

collaborative Chinese online community
sharing everyday issues in the Tri-State area


Learn frequently used words
Written by Wei-Jing Zhu  
Whether we want our parents to learn English, or our children to learn Chinese, we can have maximum impact by having them learn the most frequently used words first (such as top N=100 or N=1000). Here is a resource that I have prepared just for that purpose.

Given a publicly available (open source) Chinese-English/English-Chinese (CE/EC) dictionary, along with other useful resources, I can use the dictionary itself in two ways:
  1. as a corpus to extract the frequency of the words, and sort the list of words by frequency to get any Top N list I want.
  2. as a dictionary to translate the top N words into the target language
Depending on the direction (EC or CE), I can make two lists:
  1. the most frequent and hence useful English words (and their translation to Chinese) that our parents should learn first
  2. the most frequent Chinese characters and words (with English translation) that our children should learn first
These lists are now available as downloads on this site. Choose Sorted_EC or Sorted_CE; both are text files of around 1 MB.
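The two uses of the dictionary described above can be sketched in a few lines of Python. This is only a minimal illustration, not the script used to build the downloads: the (headword, definition) pairs, the names `top_n_words` and `translate`, and the tiny sample data are all hypothetical, and the real dictionary file format is not reproduced here.

```python
import re
from collections import Counter

def top_n_words(entries, n):
    """Treat the dictionary as a corpus: count every English word that
    appears in the definitions and return the n most frequent ones."""
    counts = Counter()
    for _headword, definition in entries:
        counts.update(re.findall(r"[a-z]+", definition.lower()))
    return counts.most_common(n)

def translate(words, ec_dict):
    """Treat the dictionary as a dictionary: look up each word,
    falling back to '?' when no translation is found."""
    return [(w, ec_dict.get(w, "?")) for w in words]

# Hypothetical sample data standing in for the real CE and EC dictionaries.
ce_entries = [
    ("水", "water"),
    ("喝水", "drink water"),
    ("水果", "fruit"),
    ("喝", "drink"),
    ("冰水", "ice water"),
]
ec_dict = {"water": "水", "drink": "喝"}

top = top_n_words(ce_entries, 2)
for word, chinese in translate([w for w, _ in top], ec_dict):
    print(word, chinese)
```

Swapping the roles of the two dictionaries (and counting characters or words on the Chinese side) would give the CE list in the same way.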

First Chinese characters to teach

To give my son the feeling of maximum impact when learning Chinese characters, I would first teach him "easy characters": those that have the fewest strokes but are also among the most common, so that he can recognize many of them when watching the subtitles of Chinese MTV karaoke videos.

I have made such a list manually, looking through the list of frequent characters and selecting the simple ones. However, one can imagine automatically generating such a list with the following approach:
  • the ordering of characters by stroke count is implicit in the Big5 character encoding
  • the frequency listing has been obtained above
  • so a combination of these two, to favor both simple strokes and frequency, can easily be coded.
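One possible way to code this combination is sketched below in Python. It is an assumption-laden sketch, not the promised list generator: the function names, the `weight` parameter, the rank-averaging formula, and the sample frequency counts are all hypothetical. It also relies on the Big5 code as a rough stand-in for stroke count, which, as far as I know, only holds within each of the two Big5 character levels, and characters that Big5 cannot encode would raise an error.

```python
def big5_code(ch):
    """Return the Big5 code of a character as an integer.
    Used here as a rough proxy for stroke count (an approximation)."""
    return int.from_bytes(ch.encode("big5"), "big")

def easy_and_common(freq, weight=0.5):
    """Order characters so that simple AND frequent ones come first.

    freq   -- mapping of character -> frequency count
    weight -- hypothetical knob: 1.0 favors simplicity only,
              0.0 favors frequency only
    """
    chars = list(freq)
    by_strokes = sorted(chars, key=big5_code)          # simplest first
    by_freq = sorted(chars, key=lambda c: -freq[c])    # most frequent first
    stroke_rank = {c: i for i, c in enumerate(by_strokes)}
    freq_rank = {c: i for i, c in enumerate(by_freq)}

    def score(c):
        # Lower is better: weighted average of the two rank positions.
        return weight * stroke_rank[c] + (1 - weight) * freq_rank[c]

    return sorted(chars, key=score)

# Hypothetical frequency counts, for illustration only.
sample_freq = {"的": 100, "一": 80, "人": 60, "龍": 5}
ranked = easy_and_common(sample_freq)
print(ranked)
```

Averaging rank positions rather than raw values keeps the two measures on the same scale; other weightings would work just as well.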
I will supply such a list in due time.  Here are a few online resources:
