Chinese input methods for Emacs

Sunday, 8th February, 2009

From the llaisdy blog archives.

This all feels a bit antiquated now that I’m working on a Mac (I use the Mac input methods instead of the Emacs ones), but it’s useful whenever I have to work on a Windows machine. The notes below are not complete, and I’d appreciate any comments to help fill in the gaps.

Introduction

Emacs provides 25 input methods for Chinese. Although each input method has its own describe-input-method page, these pages can be rather terse. There is also no overview or comparison of the different input methods, and I have not been able to find one on the web.

Here I have gathered together the information I’ve been able to find. I’d be pleased to hear about any errors I’ve made, and where I can find further information to correct my omissions. I’ll keep this page up-to-date.

I’m learning Mandarin Chinese and am interested in simplified script; for the moment I find a pinyin-based approach to the written language easiest. For my current requirements chinese-tonepy fits the bill, but I’m interested in learning a structural input method (i.e., one not based on pronunciation). See the Conclusion for further discussion.


Cupcake: Speech technology on Android

Tuesday, 27th January, 2009

In mid-December 2008 Google announced Cupcake, an update to the Android platform. Here are some brief comments:

  • One of the new features listed under Framework is “Simplified SREC speech recognition API available”.
  • In the project layout srec is listed as an external project.
  • It’s in the source here.
  • The source is copyright Nuance and released under the Apache license. Here is an example copyright header (from AcousticModels.c):
    /*---------------------------------------------------------------------------*
     *  AcousticModels.c  *
     *                                                                           *
     *  Copyright 2007, 2008 Nuance Communciations, Inc.                               *
     *                                                                           *
     *  Licensed under the Apache License, Version 2.0 (the 'License');          *
     *  you may not use this file except in compliance with the License.         *
     *                                                                           *
     *  You may obtain a copy of the License at                                  *
     *      http://www.apache.org/licenses/LICENSE-2.0                           *
     *                                                                           *
     *  Unless required by applicable law or agreed to in writing, software      *
     *  distributed under the License is distributed on an 'AS IS' BASIS,        *
     *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. * 
     *  See the License for the specific language governing permissions and      *
     *  limitations under the License.                                           *
     *                                                                           *
     *---------------------------------------------------------------------------*/
    
  • As far as I can tell, there’s no discussion of srec on android-beginners or android-developers. According to Dave Sparks on android-developers (Jan 20), “We haven’t released a Cupcake SDK yet. Docs will come when the SDK is published.” I’ll keep you posted.
  • n.b.: Text-to-speech is available on Android, using the eyes-free TTS library.

S4C Language Policy

Sunday, 16th November, 2008

According to Golwg [1], S4C “intends to review” its language policy. Here are some relevant links:

The Golwg article mentions a response from Cylch yr Iaith (a group that monitors the use of the Welsh and English languages in the media) to the S4C document, but I was unable to find their response on the web.

References

[1] Golwg vol. 21, no. 11 (13/11/2008). Rhybudd rhag gostwng safonau’r Gymrag (Warning against lowering the standards of Welsh).

Ddaru

Friday, 7th November, 2008

Introduction

Borsley et al. (2007) describe ddaru as "historically a verb meaning ‘happen, finish’, but … now just a marker of past tense" (p. 42). Ddaru is used similarly to gwneud, but without agreeing with the following subject. The following examples (22 & 23 from Borsley et al.) can both be translated as "Elin bought a loaf of bread":


Nuance to team up with Nokia

Wednesday, 15th October, 2008

Press release from Nuance:

Nuance Signs Multi-Year Agreement with Nokia, Spanning Advanced Input Technologies and Open Development Framework

And related piece in SpeechTech magazine:

Nuance, Nokia Strike a Partnership Deal

MocoNews has an interesting piece about it:

Nokia Picks Nuance For Speech Recognition And More; Will Open Up Technologies To Developers

What will this all mean for Nuance’s speech technology input to Android? At the moment Android has no speech support and there are no concrete plans to introduce it. It seems that Nuance is playing the field. Well, why not?

Nice phone, shame about the speech rec.

Wednesday, 24th September, 2008

T-Mobile have just launched the first Android phone, the G1, and Android have announced the “1.0” release of the Android SDK.

Speech technology is conspicuously absent from the package index, and there’s no sign anywhere that it’ll be included anytime soon, or late.

News just in yesterday.

There’s a pie chart in yesterday’s FT showing smartphone operating system market share. Here it is again via Google’s Chart API.
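As a rough sketch of how a chart like this can be requested: the Chart API describes the whole image in URL parameters (cht for chart type, chs for size, chd for data, chl for labels). The helper name pie_chart_url and the share values below are placeholders for illustration only, not the FT’s figures.

```python
from urllib.parse import urlencode

# Chart API endpoint as it was addressed at the time of writing.
CHART_BASE_URL = "http://chart.apis.google.com/chart"


def pie_chart_url(shares, width=400, height=200):
    """Build a Chart API URL for a pie chart from a {label: percentage} dict.

    pie_chart_url is a hypothetical helper name, not part of any Google library.
    """
    labels = list(shares)
    values = [shares[label] for label in labels]
    params = {
        "cht": "p",                                        # chart type: pie
        "chs": f"{width}x{height}",                        # size in pixels
        "chd": "t:" + ",".join(str(v) for v in values),    # text-encoded data
        "chl": "|".join(labels),                           # slice labels
    }
    # Keep ':', ',' and '|' unescaped so the URL stays readable.
    return CHART_BASE_URL + "?" + urlencode(params, safe=":,|")


if __name__ == "__main__":
    # Placeholder shares, NOT the figures from the FT chart.
    print(pie_chart_url({"OS A": 60, "OS B": 30, "Other": 10}))
```

Pasting the printed URL into an image tag or a browser was all that was needed to display the chart; there is no client library involved.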

Nokia intends to open source all of the Symbian code by 2010. This will include S60, UIQ, and MOAP.

Earlier this month the EU approved Nokia’s buying Trolltech (Reuters).

Google have quietly removed the speech.recognition package from the Android API. I say quietly: the removal is noted in the API Diff specification for M3-RC37a to M5-RC14, released 15th Feb, but I haven’t been able to find any more public announcements – for example, it wasn’t mentioned in the m5-rc-14 release announcement. Google have also not responded to a couple of queries about android.speech.recognition on the android-developers mailing list.

Back in November, Nuance claimed to have packaged and embedded speech technology components for open-source distribution. If these components have been removed from Android, maybe they could be released independently.

[update: June 19th]

It’s outrageous that speech technology has basically been pulled from Android on the sly. I should say the official position (from Nuance) is that speech (recognition at least) has not been dropped: http://www.speechtechblog.com/2008/03/25/does-android-still-listen.

In August, Jerry Carter, Director of Speech Architecture & Standards at Nuance, will be speaking at SpeechTek 2008 about Speaking and Listening to Mobile Devices. The blurb says, “Speech technologies will be an integral part of the user interface on future phones using the Android operating system. Learn the basics for implementing Android applications that speak and listen to users.” Maybe Jerry will have news for us.

In the meantime, I see there are various efforts to port FreeTTS to Android. Does anybody know if any are bearing fruit? If there is scope for using C-based software on Android, there is also Flite. FreeTTS was based on Flite, but Flite focusses explicitly on issues like size, memory use, portability, etc.

The Android SDK and documentation came up late yesterday and already this morning there were a thousand messages on the Android Developers discussion group. I’ve had a quick look and here are my first impressions:


Android and the Open Handset Alliance

Wednesday, 7th November, 2007

This story is all over the net and the papers. Most of the stories just regurgitate the official press releases, which are here. By now everyone will know what it is, so I won’t bore you with that here.

The OHA developers’ page says that they “will make available an early look at the Android™ SDK on November 12, 2007.” On the Android FAQ they promise the SDK and “complete documentation”.

Why is it interesting to me?