OmegaT Chinese sentence segmentation rules
by Weedy Tan on January 13, 2014

I started learning and using OmegaT, the free Computer-Assisted Translation tool (or CAT tool, for short) for professional translators, only a few weeks ago, and lately I decided to look into the sentence segmentation rules for Chinese source text.

Considering that Chinese has more native speakers than any other language in the world, I was surprised that there was no official sentence segmentation rule for Chinese source text in OmegaT at this point. (As of Jan. 24, 2014, the latest version of OmegaT, 3.08.02, already has its own official sentence segmentation rule for Chinese source text and it is working very well.) Maybe it’s because the translation direction was generally from other languages into Chinese, and it’s only in the past few decades that translation from Chinese into other languages started to become more important.

I searched the OmegaT support group on Yahoo, asked a few questions, and eventually learned a few tricks that enabled me to customize my own Chinese sentence segmentation rules.

In Chinese text, unlike in English, there are no blank spaces after punctuation marks. These Chinese punctuation marks (。?!) indicate the end of a sentence, so you can use them to segment a paragraph into sentences for translation purposes. However, if other punctuation marks follow immediately after one of the above-mentioned marks, the text should be segmented after the last punctuation mark. There are other exceptions as well, and I will not go into detail explaining every other possibility.

Below are the steps I took to make my Chinese sentence segmentation rules:

1. After opening the project and loading the source file, go to Project > Properties, then make sure that your source language is set to ZH-TW, ZH-CN, or ZH-HK.

2. In the options, check “Enable Sentence-level Segmenting” and “Remove Tags”. Note: if “Remove Tags” is not checked, it will affect the segmentation, because the tags will be treated as a kind of punctuation mark.

3. Next, click on the “Segmentation” button and check “Make the segmentation rules project specific”.

4. To the right of “Language Name” and “Language Pattern”, click the “Add” button. It will add a “New Language and Country” entry with the pattern “LN-CO” at the bottom of the “Language Name” and “Language Pattern” window.

5. Scroll to the bottom of this window to see the new entry and use “Move Up” to move it all the way to the top, so that it has the highest priority in terms of sentence segmentation.

6. Change the name “New Language and Country” to something appropriate like “Chinese-TW” or “Chinese-CN”, but make sure that you change “LN-CO” to “ZH.*”.

7. For each segmentation rule, put a check mark in the box to enable the “Break” rule. “Break” means the paragraph will be broken after the “Pattern Before” and before the “Pattern After”.
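The segmentation idea described in this post (break after 。, ? or !, but only after the last punctuation mark when several appear in a row) can be sketched outside OmegaT in a few lines of Python. This is only an illustration and not the rule that OmegaT ships or the exact patterns entered in the dialog. The names `TERMINATORS`, `TRAILING`, `PATTERN_BEFORE` and `segment` are made up for this example, and the list of trailing marks (closing quotes and brackets) is an assumption. OmegaT itself takes Java regular expressions in its “Pattern Before” and “Pattern After” fields, so the syntax there may differ slightly.

```python
import re
from typing import List

# Sentence-ending marks from the post, in ASCII and full-width forms.
TERMINATORS = "。?!?!"
# Assumption: marks that may directly follow a terminator and should stay
# attached to the sentence (closing quotes and brackets).
TRAILING = "”’」』）)"

# Plays the role of "Pattern Before": one terminator, then any run of
# further terminators or trailing marks. The break goes after the whole run.
PATTERN_BEFORE = re.compile(f"[{TERMINATORS}][{TERMINATORS}{TRAILING}]*")


def segment(paragraph: str) -> List[str]:
    """Split a Chinese paragraph into sentences after the last mark of each run."""
    segments = []
    start = 0
    for match in PATTERN_BEFORE.finditer(paragraph):
        end = match.end()
        segments.append(paragraph[start:end])
        start = end
    # Keep any text left over after the final terminator.
    if start < len(paragraph):
        segments.append(paragraph[start:])
    # Drop pieces that contain only whitespace.
    return [s for s in segments if s.strip()]


if __name__ == "__main__":
    for sentence in segment("你好吗?我很好。他说:“走吧!”然后离开了。"):
        print(sentence)
```

Because the pattern consumes the whole run of punctuation before breaking, a closing quotation mark after ! stays with the sentence it closes instead of starting the next segment.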