special characters

Living in a post-ASCII world offers great opportunities, but brings some problems, too. After all, it's nice to be able to write Ångström or Καλημέρα or ☺☎☯, but it's not necessarily easy to enter those characters.

input methods

So, what to do? First, you can set the input method, as explained in the Emacs manual. This is the best solution if you're writing a non-Latin language – Russian, Thai, Japanese, …

If you only occasionally need some accented character, input methods like latin-postfix (e" -> ë), latin-prefix ("e -> ë) or TeX (\"e -> ë) are useful. They also tend to annoy me a bit, as they often assume I need an accented character, when all I want is to put a word in quotation marks…
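If you settle on one of these methods, a minimal sketch for your init file could look like the following. It assumes latin-postfix is the method you want; with a default set, C-\ (toggle-input-method) switches it on and off in the current buffer, which helps when you just want plain quotation marks.

```elisp
;; Assumption: latin-postfix is the preferred method; substitute
;; "latin-prefix" or "TeX" as needed.
(setq default-input-method "latin-postfix")

;; Optional, and only a suggestion: switch the method on automatically
;; in text-mode buffers. Leave this out to toggle by hand with C-\.
(add-hook 'text-mode-hook
          (lambda () (activate-input-method default-input-method)))
```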

compose key

Another method is to use a special compose key; for example, under Gnome 3 it's in the Region and Language applet, under Options…; in Gnome 2 it's in the Keyboard applet, under Layouts/Options…. This works for all programs, not just Emacs (see this Ubuntu help page for some details). I've set my compose key to Right-Alt, so Right-Alt "e -> ë.

Using the compose key works pretty well for me; setting the input method may be more convenient when you need to write a lot of accented characters.

Now, this may be all well and good for accented characters and other variants of Latin characters, such as the German Ess-Zet ligature "ß" (note, you can get that character with latin-prefix "s -> ß, latin-postfix s" -> ß or <compose-key> ss -> ß). But what about Greek characters? Mathematical symbols? Smileys?


One way to add characters like α, ∞ or ☺ is to use ucs-insert (an alias for insert-char in newer Emacs versions), by default bound to C-x 8 RET. If you know the official Unicode name for a character, you can enter it there; note that there's auto-completion and you can use * wild-cards. For the mentioned characters, that would be GREEK SMALL LETTER ALPHA, INFINITY and WHITE SMILING FACE.

You can also use the Unicode code points; so C-x 8 RET 03b1 RET will insert α as well, since its code point is U+03B1. In case you don't know the code points of Unicode characters, a tool like the Character Map (gucharmap) in Gnome may be useful.
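The same code points can be used from Lisp. The sketch below wraps two of them in small commands; the my/ command names and the C-c a binding are purely illustrative assumptions, not something Emacs provides.

```elisp
;; Hypothetical helper commands; the names and the key binding are
;; illustrative only.
(defun my/insert-alpha ()
  "Insert GREEK SMALL LETTER ALPHA (U+03B1) at point."
  (interactive)
  (insert-char #x03b1 1))

(defun my/insert-infinity ()
  "Insert INFINITY (U+221E) at point."
  (interactive)
  (insert-char #x221e 1))

;; C-c followed by a letter is reserved for user bindings.
(global-set-key (kbd "C-c a") #'my/insert-alpha)
```

The explicit count of 1 keeps the call valid on older Emacs versions, where the count argument of insert-char is not optional.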


Since ucs-insert may not be convenient in all cases, you may want to add shortcuts for oft-used special characters to your abbrev table. See the entry on Abbrevs in the Emacs manual. I usually edit the entries by hand with M-x edit-abbrevs, and I have entries like:

"Delta0"       0    "Δ"
"^2"           0    "²"
"^3"           0    "³"
"almost0"      0    "≈"
"alpha0"       0    "α"
"any0"         0    "∀"
"beta0"        0    "β"
"c0"           0    "©"
"deg0"         0    "℃"
"delta0"       0    "δ"
"elm0"         0    "∈"
"epsilon0"     0    "ε"
"eta0"         0    "η"
"heart0"       0    "♥"
"inf0"         0    "∞"
"int0"         0    "∫"
"notis0"       0    "≠"

Now, alpha0 will be auto-replaced with α. I'm using the 0 suffix for most entries so I can easily remember them, without making it hard to use alpha as a normal word. Note, abbrevs are a bit picky when it comes to the characters in the shortcut – for example, setting != -> ≠ won't work.
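If you would rather keep such entries in your init file than edit them with M-x edit-abbrevs, something along these lines should work. It is only a sketch, and it assumes the abbrevs belong in text-mode's table; the three entries are taken from the list above.

```elisp
;; Assumption: these abbrevs should live in text-mode's abbrev table.
(dolist (entry '(("alpha0" . "α")
                 ("inf0"   . "∞")
                 ("notis0" . "≠")))
  (define-abbrev text-mode-abbrev-table (car entry) (cdr entry)))

;; Abbrevs only expand while abbrev-mode is enabled in the buffer.
(add-hook 'text-mode-hook #'abbrev-mode)
```

With this in place, typing alpha0 followed by a space or punctuation in a text-mode buffer should expand to α.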

inheriting abbrevs from other modes

If you have set up a nice set of abbreviations for text-mode, you may want to use them in other modes as well; you can accomplish this by including the text-mode abbreviations into the table for the current one, for example in your ERC setup:

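;; inherit abbrevs from text-mode; deferred until ERC is loaded,
;; since erc-mode-abbrev-table does not exist before then
(eval-after-load 'erc
  '(abbrev-table-put erc-mode-abbrev-table :parents (list text-mode-abbrev-table)))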


Pavel Iosad said...

Input methods are great. I'm writing a thesis in linguistics with lots of phonetic symbols, and IPA-X-SAMPA has saved me unthinkable amounts of time, especially in combination with C-h.

But if you need weird symbols only occasionally, and they are moderately weird, switching to an input method like TeX is indeed overkill; C-x 8 gives lots of useful symbols without going the roundabout with Unicode names. E.g. C-x 8 " a gives ä, and C-x 8 S gives §, etc.

Dave Sailer said...

Thanks. Unfortunately I can't devote my whole life to researching emacs. So it's nice to find that about every third post of yours there is some truly golden nugget that helps me immensely.

I know about EmacsWiki and the emacs manual but somehow can never make sense of either.

I've been using "Spanish Minor Mode for GNU Emacs" [http://www.1729.com/spanish/spanish-emacs.html] to help in learning the language. It's OK but can't do ü. And is outdated. Somehow or other I managed to modify it to defeat the deprecation warning.


After reading your post I looked up the input methods. More complete, look better. Now I know.

It's taken about 15 years to find out (by accident, this week) that there is a routine to save a keyboard macro in the .emacs file and convert the syntax. I had to figure it out by trial and error. Wish I'd known then. Arrgh.

djcb said...

@Dave Sailer: I'm glad it's useful for you!

Shawnessy said...

I find the Agda input method is far better than ucs and latex for mathematical text. The completion of sequences of related symbols, e.g. arrows, is great.



Drew said...

See also ucs-cmds.el. You can easily define commands to insert individual Unicode chars. Bind them to keys to, in effect, add special chars to your keyboard.