Orthographic Word Segmentation Task
- AKA: OWST, Orthographic Word Segmentation, Written Word Segmentation Task.
- OWST("I'mcominghome") ⇒ ([I'm] [coming] [home]), an example of PWST.
- OWST("I'm coming home") ⇒ ([I] ['m] [coming] [home]), an example of TTT.
- OWST("I bought a real time operating system") ⇒ ([I] [bought] [a] [real time] [operating system]), an example of VWST.
- OWST("日文章魚怎麼說") ⇒ ([日文] [章魚] [怎麼] [說]) (i.e. ~[Japanese] [octopus] [how] [say]), an example of VWST.
- See: Grapheme Segmentation Task, Phonological Word Segmentation Task, Spoken Word Segmentation Task.
- (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Text_segmentation#Word_segmentation
- Word segmentation is the problem of dividing a string of written language into its component words. In English and many other modern languages using some form of the Latin alphabet, dividing text using the space character is a good approximation to word segmentation. (Some examples where the space character alone may not be sufficient include contractions like can't for can not.) However, the equivalent to this character is not found in all written scripts, and without it word segmentation is a difficult problem. Languages which do not have a trivial word segmentation process include Chinese, Japanese and Thai.
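
The examples above can be approximated with a minimal Python sketch. It is illustrative only: the function names, the clitic pattern, and the toy lexicons are assumptions made for this example, and greedy forward maximum matching is just one simple baseline for unspaced text, not a method prescribed by the sources cited here.

```python
import re

def whitespace_segment(text):
    """Space-based approximation: split on runs of whitespace."""
    return text.split()

def split_clitics(tokens):
    """Separate a few English clitics ('m, 's, 're, 've, 'll, 'd) from their host word.
    The clitic list is a hypothetical, deliberately small sample."""
    out = []
    for tok in tokens:
        m = re.fullmatch(r"(\w+)('(?:m|s|re|ve|ll|d))", tok)
        if m:
            out.extend([m.group(1), m.group(2)])
        else:
            out.append(tok)
    return out

def max_match(text, lexicon):
    """Greedy forward maximum matching over an unspaced string.
    At each position, take the longest lexicon entry that matches;
    fall back to a single character when nothing matches."""
    max_len = max(len(w) for w in lexicon)
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words

# Toy lexicons (assumed for illustration only).
english_lexicon = {"I'm", "coming", "home"}
chinese_lexicon = {"日文", "文章", "章魚", "怎麼", "說"}

spaced = whitespace_segment("I'm coming home")
tokenized = split_clitics(spaced)
unspaced_en = max_match("I'mcominghome", english_lexicon)
unspaced_zh = max_match("日文章魚怎麼說", chinese_lexicon)
```

Greedy matching resolves the overlap between 日文 and 文章 here only because the left-to-right longest match happens to pick the intended reading; other inputs can be segmented incorrectly, which is one reason unspaced scripts make the task difficult.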