
Contents of /manuals/trunk/intro-i18n/spanish.sgml

Revision 864
Thu Oct 14 12:05:10 1999 UTC by kubota
File MIME type: text/x-sgml
File size: 12754 byte(s)
Explanation on mail is expanded.

A section on WWW is added.

LANG variable for Spanish.

Explanation on codesets and input from keyboard is changed a little.

Typo in the section for Japanese.

<sect id="spanish"><heading>Spanish language / used in Spain,
most of America and Equatorial Guinea</heading>

<P>
Section written by
Eusebio C Rufian-Zilbermann <email>eusebio@acm.org</email>.
</P>

<P>
Spanish is one of the official languages of Spain,
the official language of most countries on the American
continent and the official language of Equatorial Guinea.
It is also spoken in many other regions where it is not an
official language. The other official languages of Spain are
Galician, Catalan and Basque. Each of these languages
has its own specific issues with regard to localization.
They are not described in this section of the document.
</P>

<P>
The Spanish language derives from the variety spoken in
the Castile region. The term Castilian is sometimes used
to refer to the Spanish language (particularly when an
author wants to stress the fact that there are other
languages spoken in Spain). Castilian and Spanish
are two names for the same language; they are not
different things.
</P>


<sect1 id="spanish-character"><heading>Characters used in Spanish</heading>

<P>
Spanish uses a Latin alphabet. The numerical characters
used in Spanish are the Arabic numerals.
</P>

<P>
The character that distinguishes Spanish from other Latin alphabets
is the &Ntilde; ('N' with tilde), which exists in uppercase and
lowercase versions. Vowels in Spanish may carry a mark (the acute
accent) on top of them to indicate which syllable is stressed. This
accent is required for orthography (written correctness) on lowercase
vowels. On uppercase vowels it is often omitted in practice, although
the Academy's orthography calls for it there as well. The letter 'u'
may have a dieresis (like the German umlaut), both in uppercase and
lowercase forms.
</P>

<P>
Some punctuation signs are characteristic of the Spanish
language. The opening question mark and the opening
exclamation sign look like the English question mark and
exclamation sign rotated 180 degrees. The English question
mark and exclamation sign are referred to as the closing question
mark and closing exclamation sign. The small raised 'a' and 'o'
(often underlined) are used mainly for ordinal numbers, similar
to the small 'th' in English ordinals.
</P>

<sect1 id="spanish-sets"><heading>Character Sets</heading>

<P>
UNE (Una Norma Espa&ntilde;ola) is the National
Standards Organization in Spain. UNE is a member of ISO, and
standards that correspond one-to-one with an ISO standard are
usually cited by their ISO number rather than their UNE number.
</P>

<P>
ISO 8859-1, also known as ISO Latin-1, contains the characters
required for Spanish.
</P>
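
<P>
As a rough illustration of this claim, the following Python sketch
maps each non-ASCII character that Spanish needs to its ISO 8859-1
byte value. The character list and the function name are assumptions
made for this example, not part of any standard.
</P>

```python
# Non-ASCII characters needed for Spanish text (list chosen for illustration).
SPANISH_EXTRA = "áéíóúÁÉÍÓÚñÑüÜ¿¡ªº"

def latin1_codes(text):
    """Map each character to its single-byte ISO 8859-1 value."""
    return {ch: ch.encode("iso-8859-1")[0] for ch in text}

codes = latin1_codes(SPANISH_EXTRA)
for ch, code in codes.items():
    print(f"{ch} 0x{code:02X}")
```

<P>
The encode call would raise an error for any character outside
ISO 8859-1, so a clean run suggests that the whole list fits in
the character set.
</P>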

<sect1 id="spanish-codesets"><heading>Codesets</heading>

<P>
The codeset most commonly used for Spanish is ISO 8859-1.
The codepage Windows 1252, a.k.a. Windows Latin-1, is a
superset of ISO 8859-1 that adds some characters in the
range 128 to 159. Other codesets are Unicode, Macintosh Roman
(codepage 10000), MS-DOS Latin-1 (codepage 850) and, less
frequently, MS-DOS Latin US (codepage 437), which contains
the accented lowercase characters but not all of the uppercase
ones. Some additional Latin codesets are EBCDIC CP500 and CP 1026
(used in IBM mainframes and terminal emulators),
Adobe Standard (used as default for Postscript documents),
Nextstep Latin, HP Roman 8 (for HPUX and Laserjet resident
printer fonts) and the Latin codepage in OS/2.
They are all stateless, 8-bit codepages, with the
exception of Unicode, which in its UCS-2 form uses 16 bits
per character.
</P>
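
<P>
The difference between these codesets can be checked with a short
Python sketch. It assumes that the Python codec names in the list
correspond to the codesets named above; the sample strings and the
helper name are made up for this example.
</P>

```python
LOWER = "áéíóúñü"
UPPER = "ÁÉÍÓÚÑÜ"

# Python codec names assumed to match the codesets discussed in the text.
CODECS = ["iso-8859-1", "cp1252", "mac_roman", "cp850", "cp437", "cp500"]

def can_encode(text, codec):
    """True if every character of text exists in the given codeset."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

support = {codec: (can_encode(LOWER, codec), can_encode(UPPER, codec))
           for codec in CODECS}

for codec, (lower_ok, upper_ok) in support.items():
    print(f"{codec:12} lowercase={lower_ok} uppercase={upper_ok}")
```

<P>
With codepage 437 the uppercase test fails, because several
accented capital vowels are missing from that codepage.
</P>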

<sect1 id="spanish-how"><heading>How These Codesets Are Used --- Information for Programmers</heading>

<P>
In most cases it is safe to use ISO 8859-1 characters. Some exceptions are
<list>
<item>WWW browsers should recognize all codesets.
<item>Software which communicates with IBM mainframes, Macintosh,
MS-DOS, Nextstep, HPUX or OS/2 should handle the
corresponding encoding.
<item>File names on Joliet-format CD-ROMs used for Windows are
written in Unicode.
<item>Postscript interpreters should handle the Adobe Standard character set.
<item>Printer filters or drivers for HP printers should handle the
Roman-8 character set if using the internal fonts.
</list>
</P>
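
<P>
Software that exchanges text with one of these systems typically
converts between codesets. A minimal Python sketch, assuming the
data arrives in MS-DOS codepage 850 and is wanted in ISO 8859-1
(the function name and the sample text are illustrative only):
</P>

```python
def transcode(data, source, target):
    """Re-encode bytes from one codeset to another by way of Unicode."""
    return data.decode(source).encode(target)

# Sample text as it might arrive from an MS-DOS system (assumed input).
dos_bytes = "¿Dónde está el niño?".encode("cp850")
latin1_bytes = transcode(dos_bytes, "cp850", "iso-8859-1")

print(latin1_bytes.decode("iso-8859-1"))
```

<P>
Both codesets are 8-bit and stateless, so the byte count stays the
same and only the values of the non-ASCII bytes change.
</P>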

<sect1 id="spanish-columns"><heading>Columns</heading>

<P>
On console displays, each character occupies one column. Printed
text can be equally spaced (one column per character) or
proportionally spaced (a character can occupy fractionally
more or less than a column, depending on its shape).
</P>

<P>
Note: Even when using Traditional Sorting, ch and ll each occupy
two columns. See the comment on Traditional sorting in
<ref id="spanish-sort">.
</P>
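
<P>
A column count along these lines could be sketched in Python as
follows. The example assumes Unicode strings as input and uses NFC
normalization so that a letter typed as base letter plus combining
accent still counts as one column; that detail is an assumption of
this sketch and is not discussed in the text above.
</P>

```python
import unicodedata

def columns(text):
    """Columns used on a fixed-width console, one per composed character."""
    # NFC merges a base letter and a combining mark into one character,
    # so "n" followed by U+0303 counts like a precomposed "ñ".
    return len(unicodedata.normalize("NFC", text))

print(columns("chillón"))     # ch and ll count as two columns each
print(columns("nin\u0303o"))  # decomposed ñ still counts as one column
```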

<sect1 id="spanish-direction"><heading>Writing Direction</heading>

<P>
Spanish is normally written in left-to-right lines arranged
from top to bottom of the page. For artistic purposes it might
be written in top-to-bottom columns arranged left to right
within the page. This columnar arrangement would be expected
only in graphic and charting programs (e.g., a drawing program,
a spreadsheet graph or a page layout program for composing
brochures); regular text editors wouldn't be expected to
implement this style.
</P>

<sect1 id="spanish-layout"><heading>Layout of Characters</heading>

<P>
In the Spanish language, words are separated by spaces
and a line can be broken at a space, a punctuation sign or
a hyphenated word.
</P>

<P>
There are several sets of paired characters in Spanish.
Unlike English, question marks and exclamation signs are
also paired. Other paired characters are the same as in English
(parentheses, square brackets, and so forth). Opening
characters shouldn't appear at the end of a line.
Closing characters and punctuation signs such as period and
comma shouldn't appear at the beginning of a line.
</P>
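
<P>
The rule for paired characters can be sketched as a simple check on
candidate break points. This Python example only considers breaks at
spaces; the character sets and the function name are assumptions
chosen for illustration, and a real line breaker would also need to
handle hyphenation.
</P>

```python
# Characters that must not end a line or start a line (illustrative sets).
OPENING = set("¿¡([{")
CLOSING = set("?!)]}.,;:")

def can_break_at_space(text, i):
    """True if the space at index i is an acceptable place to break the line."""
    if text[i:i + 1] != " ":
        return False
    prev_ch = text[i - 1:i]      # empty string when i is 0
    next_ch = text[i + 1:i + 2]  # empty string at the end of the text
    return prev_ch not in OPENING and next_ch not in CLOSING

sample = "¿ Qué tal ?"
print([i for i in range(len(sample)) if can_break_at_space(sample, i)])
```

<P>
In the deliberately spaced-out sample, only the space between the
two words qualifies: the first space follows an opening mark and the
last one precedes a closing mark.
</P>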

<P>
Words can be broken at a syllable boundary and hyphenated. Unlike
English, syllables in Spanish end in a vowel more often than
in a consonant. Syllables that end in a consonant are
typically at the end of a word or followed by a syllable
that starts with another consonant. However, the rules are
not completely consistent and a hyphenation
dictionary has to be used.
</P>
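
<sect1 id="spanish-lang"><heading>LANG variable</heading>

<P>
For Bash, the three <tt>set</tt> lines are Readline settings that
belong in <tt>~/.inputrc</tt>, and the <tt>export</tt> line belongs
in a shell startup file such as <tt>~/.bashrc</tt>. The locale name
shown is system-dependent.
<example>
# keep all 8 bits for keyboard input
set meta-flag on
# keep all 8 bits for terminal output
set output-meta on
# don't convert 8-bit characters into escape sequences
set convert-meta off
export LC_CTYPE=ISO_8859_1
</example>
</P>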

<P>
For Tcsh
<example>
setenv LANG C
setenv LC_CTYPE "iso_8859_1"
</example>
</P>

<sect1 id="spanish-input"><heading>Input from Keyboard</heading>

<P>
For the Spanish keyboard to work correctly, you need the command
<tt>loadkeys /usr/lib/kbd/keytables/es.map</tt> in the corresponding
startup (rc) file.
</P>

<P>
Most of the Spanish characters are input from the keyboard
with a single stroke. A two-key combination is used for accent
and dieresis marks above vowels. Traditional typewriters
used a 'dead key' system, with keys that would
strike the paper without advancing the carriage to the
next character. Typing on a computer keyboard simulates
this behavior: typing the accent or dieresis key does
not produce any visible output until a vowel is typed
afterwards. Usually, if the accent or dieresis key is followed
by a consonant, the accent key is ignored. Characters with an
accent or a dieresis cannot be used as shortcut keys
for selecting options.
</P>
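
<P>
The dead-key behavior described above can be simulated in a few
lines of Python. This is only a sketch: the key names, the table of
dead keys and the function name are assumptions for illustration,
and real keyboard drivers work on key codes rather than characters.
</P>

```python
import unicodedata

# Dead keys mapped to Unicode combining marks (key names are illustrative).
DEAD_KEYS = {"´": "\u0301", "¨": "\u0308"}
VOWELS = set("aeiouAEIOU")

def type_keys(keys):
    """Turn a sequence of key presses into text, applying dead keys."""
    out = []
    pending = None
    for key in keys:
        if key in DEAD_KEYS:
            pending = DEAD_KEYS[key]   # nothing visible yet
        elif pending is not None:
            if key in VOWELS:
                # Compose vowel and mark into one precomposed character.
                out.append(unicodedata.normalize("NFC", key + pending))
            else:
                out.append(key)        # accent ignored before a consonant
            pending = None
        else:
            out.append(key)
    return "".join(out)

print(type_keys(["c", "a", "m", "i", "´", "o", "n"]))
```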

<P>
The words for Yes and No are S&iacute; (the character next to S is 'i'
with acute accent) and No. The S and N keys are commonly used
for a S&iacute;/No choice.
</P>
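
<P>
A prompt that accepts such answers might be handled as in the
following Python sketch. The function name and the set of accepted
spellings are assumptions; the point is only that the accent should
not matter when the answer is read.
</P>

```python
import unicodedata

def parse_si_no(answer):
    """Return True for sí, False for no, None if the answer is not recognised."""
    # Drop accents so that "Sí", "si" and "S" are treated alike.
    decomposed = unicodedata.normalize("NFD", answer.strip().lower())
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    if plain in ("s", "si"):
        return True
    if plain in ("n", "no"):
        return False
    return None

print(parse_si_no("Sí"), parse_si_no("N"), parse_si_no("quizá"))
```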

<P>
Spanish keyboards usually allow for typing not only the Spanish
accent signs, but also the accent signs used in French and other
languages (grave accent, circumflex accent, umlaut on letters
other than the u). Another character that is typically available
is the cedilla C (which looks like a C with a comma underneath,
used for Catalan, Portuguese and French words, for example).
There is a Latin-American keyboard layout that does not contain
the grave accent and the cedilla C.
</P>

<sect1 id="spanish-more"><heading>More Detailed Discussions</heading>

<sect2 id="spanish-sort"><heading>Sorting</heading>

<P>
Traditional Spanish considered the combinations CH and LL to be
individual single letters. On computers, this required additional
effort in sorting and character counting algorithms. It was
decided that the savings from not requiring special algorithms
were significant enough and that it would be acceptable to treat
them as two separate letters. Some software that had already
incorporated the special sorting algorithms now allows
choosing between 'Traditional Spanish Sort' and 'Modern Spanish Sort'.
</P>
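
<P>
The difference between the two sort orders can be sketched in
Python. The alphabet list and function name below are assumptions
for illustration, and the example expects unaccented input so that
it stays focused on CH and LL.
</P>

```python
import re

# Traditional alphabet with "ch" and "ll" as letters of their own.
TRADITIONAL = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
               "l", "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u",
               "v", "w", "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(TRADITIONAL)}

def traditional_key(word):
    """Sort key that treats 'ch' and 'll' as single letters."""
    letters = re.findall("ch|ll|.", word.lower())
    # Anything outside the alphabet is placed after "z" in this sketch.
    return [RANK.get(letter, len(TRADITIONAL)) for letter in letters]

words = ["curva", "chico", "luz", "llave", "cosa"]
print(sorted(words))                       # modern: plain letter-by-letter
print(sorted(words, key=traditional_key))  # traditional: ch after c, ll after l
```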

<P>
Accents and dieresis are ignored for sorting purposes. The
only exception is the rare case where two words are exactly
the same and the accent is the only difference; then the word with
the unaccented character should be sorted first. E.g.,
cami&oacute;n (c-a-m-i-o with acute accent-n), camionero,
este, &eacute;ste (e with acute accent-s-t-e).
</P>

<P>
The &ntilde; (n with tilde) is always sorted after the n and
before the o. It cannot be intermixed with the n.
</P>
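
<P>
These rules can be combined into one sort key. The following Python
sketch is one possible way to do it, not a reference
implementation: the function name is made up, and the trick of
appending a high code point to place the n with tilde between n and
o is an assumption of this example.
</P>

```python
import unicodedata

def modern_key(word):
    """Sort key: accents ignored, ñ after n, unaccented form first on ties."""
    composed = unicodedata.normalize("NFC", word.lower())
    primary = []
    for ch in composed:
        if ch == "ñ":
            primary.append("n\uffff")  # after every plain "n", before "o"
        else:
            # First character of the decomposition is the base letter.
            primary.append(unicodedata.normalize("NFD", ch)[0])
    # Unaccented letters have lower code points, so they win ties.
    return (primary, composed)

words = ["éste", "camionero", "ñu", "este", "nube", "camión", "oca"]
print(sorted(words, key=modern_key))
```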

<sect2 id="spanish-number"><heading>Number format, date and
currency symbols</heading>

<P>
The use of the dot and the comma as the thousands separator and
the decimal separator is usually the opposite of US English.
E.g., 1.000,00 instead of 1,000.00. Some Spanish-speaking countries,
notably Mexico, follow the same conventions as the US. It is
desirable that programs can handle both forms as an independent setting.
</P>
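
<P>
One way to keep the separators configurable is sketched below in
Python. The function name and parameter names are assumptions; the
defaults follow the 1.000,00 form mentioned above.
</P>

```python
def format_number(value, thousands=".", decimal=",", places=2):
    """Format a number with configurable thousands and decimal separators."""
    text = f"{value:,.{places}f}"  # US-style grouping, e.g. "1,000.00"
    # Swap through a placeholder so the two separators do not collide.
    return text.replace(",", "\0").replace(".", decimal).replace("\0", thousands)

print(format_number(1000))                               # Spain-style setting
print(format_number(1000, thousands=",", decimal="."))   # US/Mexico-style setting
```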

<P>
The usual date format is DD-MM-YYYY rather than MM-DD-YYYY, but
again this depends on the specific country. It is desirable to have
the date format as a configurable parameter.
</P>
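
<P>
A configurable date format might look like the following Python
sketch. The table of styles and the function name are assumptions
made for this example.
</P>

```python
from datetime import date

# Style names as used in the text, mapped to strftime patterns.
DATE_FORMATS = {
    "DD-MM-YYYY": "%d-%m-%Y",
    "MM-DD-YYYY": "%m-%d-%Y",
}

def format_date(d, style="DD-MM-YYYY"):
    """Format a date according to a configurable style name."""
    return d.strftime(DATE_FORMATS[style])

sample_day = date(1999, 10, 14)
print(format_date(sample_day))
print(format_date(sample_day, "MM-DD-YYYY"))
```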

<P>
The currency symbol can be prepended or appended to the number
and it can be one or several characters long. E.g., 100 PTA for
Spanish pesetas or N$ 100 for Mexican pesos. It is desirable that
the symbol and its position can be defined individually and that
currency symbols longer than one character are allowed.
</P>
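
<P>
The two settings can be kept together in a small record, as in this
Python sketch. The class and field names are hypothetical; only the
two sample currencies come from the text above.
</P>

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CurrencyStyle:
    symbol: str         # may be longer than one character
    symbol_first: bool  # True gives "N$ 100", False gives "100 PTA"

    def format(self, amount):
        """Place the symbol before or after the amount."""
        if self.symbol_first:
            return f"{self.symbol} {amount}"
        return f"{amount} {self.symbol}"

PESETA = CurrencyStyle("PTA", symbol_first=False)
MEXICAN_PESO = CurrencyStyle("N$", symbol_first=True)

print(PESETA.format(100))
print(MEXICAN_PESO.format(100))
```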

<sect2 id="spanish-varieties"><heading>Varieties of Spanish</heading>

<P>
Spanish is spoken by a tremendous variety of people. Academics
throughout the different Spanish-speaking countries realized
that this could lead to a dismemberment of the language
and founded the Academy of the Spanish Language. This
academy has branches in most of the Spanish-speaking
countries: there is a Royal Academy of the Spanish Language
of Spain, an Academy of the Spanish Language of Mexico,
et cetera. The members of this Academy study the local
evolution of the language in each country. They meet
to maintain a body of knowledge of what should
be considered the Standard Spanish Language and what should
be considered local or regional terms and slang terms.
</P>

<P>
In most cases, software can use terms that are within the
Standard set by the Academy. When new terms appear (e.g.,
when a new product is created that has no previous name in
the Spanish language) each region typically starts using a
new word. When one or two terms become the
de facto standard, the Academy incorporates the new
term into the Standard. This is a very slow process and
there will be temporary usages in different regions of
the Spanish-speaking world that conflict with each other.
Some people speak about Spain-Spanish and American-Spanish
but most of the time it doesn't really make sense to make
this distinction. First of all, even within America, there
are differences between the local varieties that may be greater
than the differences with Spain itself. E.g., Spanish as
spoken in Mexico, Colombia and Argentina may differ
as much among these countries as each of them differs from
Spanish as spoken in Spain. A computer user in Ecuador may
feel more comfortable overall with the terms used in Spain
than with the terms used in Mexico (and of course, most
comfortable with the terms used in Ecuador itself!). The
options are either to produce one Spanish version of a
software product that is an acceptable compromise (maybe
not perfect) for all Spanish-speaking countries or to
produce multiple versions to account for all the regional
variations.
</P>

<P>
A plea to all the people who are localizing software into
Spanish: Let's use our efforts judiciously and create one
Spanish version and not many. Let's strive for a version
that conforms to the Standards and that can be as widely
accepted as possible for the areas not covered by the
Standards. Wouldn't you rather have a new product translated
than two versions of one product, where one matches your
local variety of the language?
</P>
