Contents of /utils/debiandoc-to-docbook/dd2xml.html

Revision 567
Tue Dec 17 22:10:31 2002 UTC (10 years, 5 months ago) by osamu
File MIME type: text/html
File size: 12544 byte(s)
Updated documentation to match new script
1 <html>
2 <head>
3 <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
4 <meta name="Author" content="Osamu Aoki">
5 <meta name="GENERATOR" content="VIM">
6 <title>debiandoc-sgml to docbook-xml conversion</title>
7 </head>
8 <body>
9 <h1>debiandoc-sgml to docbook-xml conversion</h1>
10 <pre>
11 ==========================================================================
12 (original concept) Philippe Batailler &lt;pbatailler@teaser.fr&gt;
13 (original concept) Adam DiCarlo &lt;aph@debian.org&gt;
14 (ghost writer) Osamu Aoki &lt;osamu@debian.org&gt;
15 Sat Dec 14 00:54:21 2002
16 ==========================================================================
17 </pre>
18
19
20 <h2><a name="toc">Table of contents</a></h2>
21 <p>
22 <ul>
23 <li><a href="#why">Why convert?</a></li>
24 <li><a href="#how">How to read this?</a></li>
25 <li><a href="#step">Step by step guide.</a></li>
26 <li><a href="#tags">How tags are converted?</a></li>
27 </ul>
28 </p>
29 <hr>
30
31 <h2><a name="why">Why convert?</a></h2>
32 <p>
33 Because it is cool to be XML :)
34 </p>
35 <hr>
36
37 <h2><a name="how">How to read this?</a></h2>
38 <p>
39 Use a table-capable web browser if you are reading the HTML version.
40 </p>
41 <ul>
42 <li>Mozilla</li>
43 <li>Galeon</li>
44 <li>links</li>
45 <li>w3m</li>
46 <li>...</li>
47 </ul>
48 <hr>
49
50 <h2><a name="step">Step by step guide.</a></h2>
51 <p>
52 This is a rehash of a tutorial that Philippe Batailler gave to Osamu Aoki
53 in private e-mails in 2002.
54 </p>
55 <p>
56 To convert debiandoc-sgml into docbook-xml, take the following
57 steps:
58 </p>
59
60 <p>
61 <ol>
62 <li>Install debiandoc2docbookxml
63 <!--
64 <p>
65 Get a file <tt>Debiandoc2docbookxml.tar.gz</tt> from
66 <a href="http://www.teaser.fr/~pbatailler/">Philippe Batailler's web site</a>.
67 Then untar it and copy contents to the root of SGML file source.
68 </p>
69 <p>
70 Note: Something like following should work once this is packaged into deb.
71 <pre>
72 $ su -c "apt-get update && apt-get install ???debian2docbookxml???"
73 </pre>
74 </p>
75 -->
76 <p>
77 Get the scripts from the DDP CVS server:
78 <pre>
79 $ cd $HOME
80 $ echo 'export PATH="$HOME/Debiandoc-to-docbook:${PATH}"' >> ~/.bash_profile
81 $ . ~/.bash_profile
82 $ export CVSROOT=:pserver:anonymous@cvs.debian.org:/cvs/debian-doc
83 $ cvs login
84 $ cvs co -d Debiandoc-to-docbook utils/debiandoc-to-docbook # ***
85 $ cd Debiandoc-to-docbook
86 $ make
87 $ cd /some/debiandoc/sgml/source-directory/ # use of mc is easy way :)
88 </pre>
89 ***) If you checked out from the old CVS location, please commit all
90 changes and check out fresh trees in a different location.
91 I will remove the old CVS location soon.
92 </p>
93
94 <li>Manually make the source file compatible with the script
95 <p>
96 Because of limitations in the conversion script, if you have problems
97 converting files, consider applying the source touch-up rules
98 below. The script may already handle some of these cases,
99 but applying the rules does no harm.
100 </p>
101 <ul>
102 <li>Adjust SGML header lines (first few lines of the file)
103 <p>If foo.sgml includes many files (DTD subset), the first line must end
104 with [ as in:
105 <pre>
106 &lt;!DOCTYPE debiandoc PUBLIC "-//DebianDoc//DTD DebianDoc//EN" [
107 ... content
108 ]&gt;
109 </pre>
110 Splitting the start or the end of this section across several lines makes the conversion fail.
111 </p>
112 <p>
113 If foo.sgml is a single file, the header is:
114 <pre>
115 &lt;!DOCTYPE debiandoc PUBLIC "-//DebianDoc//DTD DebianDoc//EN"&gt;
116 </pre>
117 </p>
118
119 <li>Keep some conditionals within one line.
120 <p>
121 <pre>
122 &lt;[%bar;[
123 </pre>
124 </p>
125
126 <li>Keep some tags within one line.
127 <p>
128 <pre>
129 &lt;chapt&gt;...&lt;/chapt&gt;
130 &lt;appendix&gt;...&lt;/appendix&gt;
131 &lt;sect&gt;...&lt;/sect&gt;
132 &lt;sect1&gt;...&lt;/sect1&gt;
133 &lt;sect2&gt;...&lt;/sect2&gt;
134 </pre>
135 </p>
136
137 <li>Remove some comments
138 <p>
139 <pre>
140 &lt;!-- ... --&gt;
141 </pre>
142 </p>
143 <p>
144 This keeps the script from malfunctioning on a "<tt> ]&gt; </tt>" inside a comment.
145 </p>
146
147 <li>Make attribute values such as "id" a single token.
148 <p>
149 <pre>
150 wrong: &lt;book id="foo bar"&gt;
151 correct: &lt;book id="foo_bar"&gt;
152 </pre>
153 </p>
154
155 </ul>
156 </p>
157
158 <li>Normalize SGML to an XML-compatible format (debiandoc-tidy)
159 <p>
160 <pre>
161 $ debiandoc-tidy foo.sgml
162 $ debiandoc-tidy -e bar.ent
163 </pre>
164 </p>
165
166 <li>Convert SGML tags into XML tags (debiandoc2docbookxml)
167 <p>
168 If foo.sgml is a smaller article in a single file without a DTD subset:
169 <pre>
170 $ debiandoc2docbookxml -a foo.sgml
171 </pre>
172 </p>
173 <p>
174 If foo.sgml is a larger book in a single file without a DTD subset:
175 <pre>
176 $ debiandoc2docbookxml -b foo.sgml
177 </pre>
178 </p>
179 <p>
180 If foo.sgml is a larger book with many included files (DTD subset):
181 <pre>
182 $ debiandoc2docbookxml -s -b foo.sgml
183 </pre>
184 </p>
185 <p>
186 This produces a single large foo.xml.
187 </p>
188 <p>
189 If foo.sgml is a larger book with many included files (DTD subset) and
190 you want split-file output in <tt>bar/</tt>:
191 <pre>
192 $ mkdir bar ; cd bar
193 $ debiandoc2docbookxml -S -s -b ../foo.sgml
194 $ cd ..
195 </pre>
196 </p>
197 <p>
198 This produces a foo.xml together with many chunk files under <tt>bar/</tt>.
199 </p>
200 <p>
201 For debugging, use "-k" to keep intermediate files and use
202 "-t" to trace shell activities.
203 </p>
204
205 <li>Test it with emacs and psgml, or nsgmls:
206 <p>
207 <pre>
208 $ nsgmls -s /usr/share/sgml/declaration/xml.decl foo.xml
209 </pre>
210 <li>Format source for readability
211 <p>
212 To make the source more readable, some reformatting may be a good idea.
213 For example, to add a newline after each &lt;/listitem&gt;:
214 <pre>
215 $ perl -i -p -e's,&lt;/listitem&gt;,&lt;/listitem&gt;\n,g' foo.xml
216 </pre>
217 </p>
218
219 <li>Building output
220 <p>
221 There are a few strategies for building output.
222 <table border="1">
223 <tr>
224 <td>
225 Stylesheet
226 </td>
227 <td>
228 Back end
229 </td>
230 </tr>
231 <tr>
232 <td>DSSSL</td>
233 <td>jade and jadetex</td>
234 </tr>
235 <tr>
236 <td>CSS</td>
237 <td>mozilla?</td>
238 </tr>
239 <tr>
240 <td>XSL</td>
241 <td>passivetex?</td>
242 </tr>
243 </table>
244 <p>
245 Needs more documentation for creating files (plain text, multi-file,
246 HTML, PS, PDF).
247 </p>
248 </ol>
249 </p>
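<p>
Three of the source touch-up rules from step 2 are mechanical enough to check
before running the converter. The following is a minimal sketch, not part of
the DDP scripts: <tt>check_source</tt> and its messages are hypothetical names,
and the checks only approximate what the conversion script actually trips over.
</p>

```python
import re


def check_source(text):
    """Return sorted (line_number, message) warnings for a debiandoc-sgml source.

    Hypothetical helper: the checks mirror three of the touch-up rules,
    not the conversion script's real behaviour.
    """
    warnings = []

    # Rule: the DOCTYPE line ends with "[" (DTD subset follows) or ">" (single file).
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("<!DOCTYPE"):
            if not stripped.endswith(("[", ">")):
                warnings.append((number, "DOCTYPE line should end with '[' or '>'"))
            break

    # Rule: comments must not contain "]>".
    for match in re.finditer(r"<!--(.*?)-->", text, re.DOTALL):
        if "]>" in match.group(1):
            number = text.count("\n", 0, match.start()) + 1
            warnings.append((number, "comment contains ']>'"))

    # Rule: id attribute values are a single token (no whitespace).
    for match in re.finditer(r'\bid="([^"]*)"', text):
        if re.search(r"\s", match.group(1)):
            number = text.count("\n", 0, match.start()) + 1
            warnings.append((number, 'id="%s" is not a single token' % match.group(1)))

    return sorted(warnings)
```

<p>
Calling <tt>check_source(open("foo.sgml").read())</tt> returns an empty list
when none of the three rules is violated. The one-line rules for conditionals
and section tags are left out because they are harder to test reliably.
</p>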
250 <hr>
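<p>
The option combinations for <tt>debiandoc2docbookxml</tt> in step 4 can be
summarised as a small lookup table. This is only a sketch for scripting the
call: <tt>build_command</tt> and <tt>DOCUMENTED_FLAGS</tt> are hypothetical
names, and only the four combinations shown in the guide are accepted, since
the guide does not describe how the script behaves for any other combination.
</p>

```python
# Flag combinations exactly as listed in the step-by-step guide.
# Key: (kind, has_subset, split).
DOCUMENTED_FLAGS = {
    ("article", False, False): ["-a"],
    ("book", False, False): ["-b"],
    ("book", True, False): ["-s", "-b"],
    ("book", True, True): ["-S", "-s", "-b"],
}


def build_command(source, kind="book", has_subset=False, split=False):
    """Return the argument list for one documented debiandoc2docbookxml call."""
    key = (kind, has_subset, split)
    if key not in DOCUMENTED_FLAGS:
        # Reject anything the guide does not cover instead of guessing.
        raise ValueError("combination not covered by the guide: %r" % (key,))
    return ["debiandoc2docbookxml"] + DOCUMENTED_FLAGS[key] + [source]
```

<p>
The returned list can be passed to <tt>subprocess.run</tt>. For split output,
run it from inside the target directory with <tt>../foo.sgml</tt> as the
source, as in the guide.
</p>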
251 <h2><a name="tags">How tags are converted?</a></h2>
252 <p>
253 Here is the list of tag conversions from debiandoc-sgml to docbook-xml.
254 The columns mean the following:
255
256 <ul>
257 <li>"debiandoc-sgml tag" is the tag used in the original document.</li>
258 <li>"converted docbook-xml tag" is the tag produced programmatically
259 by XSLT.</li>
260 <li>"alternative docbook-xml tag" is an alternative tag that a human may
261 substitute in places by editing the docbook-xml source later.</li>
262 </ul>
263 </p>
264
265 <table border="1">
266 <tr>
267 <td><p>original debiandoc-sgml tag</p></td>
268 <td><p>converted docbook-xml tag using XSLT</p></td>
269 <td><p>alternative docbook-xml tag</p></td>
270 </tr>
271 <tr>
272 <td><p>book</p></td>
273 <td>
274 <p>book (-b option)</p>
275 <p>article (-a option)</p>
276 </td>
277 <td><p></p></td>
278 </tr>
279 <tr>
280 <td><p>title</p></td>
281 <td><p>title</p></td>
282 <td><p></p></td>
283 </tr>
284 <tr>
285 <td><p>author</p></td>
286 <td><p>author</p></td>
287 <td><p></p></td>
288 </tr>
289 <tr>
290 <td><p>name</p></td>
291 <td><p>firstname + surname</p></td>
292 <td><p></p></td>
293 </tr>
294 <tr>
295 <td><p>email</p></td>
296 <td>
297 <p>affiliation + address + email (in author element)</p>
298 <p>email (other places)</p>
299 </td>
300 <td><p></p></td>
301 </tr>
302 <tr>
303 <td><p>version</p></td>
304 <td><p>releaseinfo</p></td>
305 <td><p></p></td>
306 </tr>
307 <tr>
308 <td><p>abstract</p></td>
309 <td><p>abstract + para</p></td>
310 <td><p></p></td>
311 </tr>
312 <tr>
313 <td><p>copyright</p></td>
314 <td><p>copyright</p></td>
315 <td><p></p></td>
316 </tr>
317 <tr>
318 <td><p>toc</p></td>
319 <td>
320 <p>(presentation tool takes care)</p>
321 <p>(stylesheet is needed?) (oa)</p>
322 </td>
323 <td><p></p></td>
324 </tr>
325 <tr>
326 <td><p>chapt</p></td>
327 <td>
328 <p>chapter (-b option)</p>
329 <p>section (-a option)</p>
330 </td>
331 <td><p></p></td>
332 </tr>
333 <tr>
334 <td><p>appendix</p></td>
335 <td><p>appendix</p></td>
336 <td><p></p></td>
337 </tr>
338 <tr>
339 <td><p>sect</p></td>
340 <td><p>section</p></td>
341 <td><p></p></td>
342 </tr>
343 <tr>
344 <td><p>sect1</p></td>
345 <td><p>section</p></td>
346 <td><p></p></td>
347 </tr>
348 <tr>
349 <td><p>sect2</p></td>
350 <td><p>section</p></td>
351 <td><p></p></td>
352 </tr>
353 <tr>
354 <td><p>sect3</p></td>
355 <td><p>section</p></td>
356 <td><p></p></td>
357 </tr>
358 <tr>
359 <td><p>sect4</p></td>
360 <td><p>section</p></td>
361 <td><p></p></td>
362 </tr>
363 <tr>
364 <td><p>p</p></td>
365 <td><p>para</p></td>
366 <td><p></p></td>
367 </tr>
368 <tr>
369 <td><p>em</p></td>
370 <td><p>emphasis</p></td>
371 <td><p></p></td>
372 </tr>
373 <tr>
374 <td><p>strong</p></td>
375 <td>
376 <p>emphasis role="strong" (aph)</p>
377 <p>emphasis role="bold" (pb)</p>
378 </td>
379 <td><p>emphasis role="important"</p></td>
380 </tr>
381 <tr>
382 <td><p>var</p></td>
383 <td><p>replaceable</p></td>
384 <td><p></p></td>
385 </tr>
386 <tr>
387 <td><p>package</p></td>
388 <td><p>systemitem role="package"</p></td>
389 <td><p></p></td>
390 </tr>
391 <tr>
392 <td><p>prgn</p></td>
393 <td><p>command</p></td>
394 <td><p>??? (what to use for well known file w/o path)</p></td>
395 </tr>
396 <tr>
397 <td><p>file</p></td>
398 <td><p>filename</p><p>filename class="directory" (if it ends with /)</p></td>
399 <td><p>filename class="directory"</p></td>
400 </tr>
401 <tr>
402 <td><p>tt</p></td>
403 <td><p>literal</p></td>
404 <td>
405 <p>command (this should have been prgn but many documents do this)</p>
406 <p>constant</p>
407 <p>computeroutput</p>
408 <p>envar</p>
409 <p>function</p>
410 <p>keycap</p>
411 <p>keycode</p>
412 <p>keycombo</p>
413 <p>keysym</p>
414 <p>markup</p>
415 <p>option</p>
416 <p>parameter</p>
417 <p>prompt</p>
418 <p>property</p>
419 <p>returnvalue</p>
420 <p>sgmltag</p>
421 <p>symbol</p>
422 <p>token</p>
423 <p>userinput</p>
424 <p>varname</p>
425 <p>wordasword</p>
426 <p>(do we need all these? are all in docbook-simple?)</p></td>
427 </tr>
428 <tr>
429 <td><p>qref</p></td>
430 <td><p>link</p></td>
431 <td><p>citation ?</p></td>
432 </tr>
433 <tr>
434 <td><p>ref</p></td>
435 <td>
436 <p>xref (empty element)</p>
437 </td>
438 <td><p></p></td>
439 </tr>
440 <tr>
441 <td><p>manref</p></td>
442 <td><p>citerefentry + refentrytitle + manvolnum</p></td>
443 <td><p></p></td>
444 </tr>
445 <tr>
446 <td><p>ftpsite (old)</p></td>
447 <td><p>(convert original tag to url in debiandoc source)</p></td>
448 <td><p></p></td>
449 </tr>
450 <tr>
451 <td><p>ftppath (old)</p></td>
452 <td><p>(convert original tag to url in debiandoc source)</p></td>
453 <td><p></p></td>
454 </tr>
455 <tr>
456 <td><p>httpsite (old)</p></td>
457 <td><p>(convert original tag to url in debiandoc source)</p></td>
458 <td><p></p></td>
459 </tr>
460 <tr>
461 <td><p>httppath (old)</p></td>
462 <td><p>(convert original tag to url in debiandoc source)</p></td>
463 <td><p></p></td>
464 </tr>
465 <tr>
466 <td><p>url</p></td>
467 <td><p>ulink</p></td>
468 <td><p></p></td>
469 </tr>
470 <tr>
471 <td><p>footnote</p></td>
472 <td><p>footnote</p></td>
473 <td><p></p></td>
474 </tr>
475 <tr>
476 <td><p>list</p></td>
477 <td><p>itemizedlist</p></td>
478 <td><p></p></td>
479 </tr>
480 <tr>
481 <td><p>list compact</p></td>
482 <td><p>itemizedlist spacing="compact"</p></td>
483 <td><p></p></td>
484 </tr>
485 <tr>
486 <td><p>enumlist</p></td>
487 <td><p>orderedlist</p></td>
488 <td><p></p></td>
489 </tr>
490 <tr>
491 <td><p>enumlist compact</p></td>
492 <td><p>orderedlist spacing="compact"</p></td>
493 <td><p></p></td>
494 </tr>
495 <tr>
496 <td><p>taglist</p></td>
497 <td><p>variablelist</p></td>
498 <td><p></p></td>
499 </tr>
500 <tr>
501 <td><p>taglist compact</p></td>
502 <td><p>variablelist (there is no "spacing" attribute)</p></td>
503 <td><p>(possibly converting to table)</p></td>
504 </tr>
505 <tr>
506 <td><p>item</p></td>
507 <td><p>listitem + para</p></td>
508 <td><p></p></td>
509 </tr>
510 <tr>
511 <td><p>tag</p></td>
512 <td><p>varlistentry + term</p></td>
513 <td><p></p></td>
514 </tr>
515 <tr>
516 <td><p>example</p></td>
517 <td><p>screen</p></td>
518 <td>
519 <p>literallayout class="monospaced"</p>
520 <p></p>
521 </td>
522 </tr>
523 <tr>
524 <td><p>heading</p></td>
525 <td><p>title</p></td>
526 <td><p></p></td>
527 </tr>
528 <tr>
529 <td><p>comment</p></td>
530 <td><p>remark</p></td>
531 <td>
532 <p>caution</p>
533 <p>tip</p>
534 <p>warning</p>
535 <p>note</p>
536 </td>
537 </tr>
538 <tr>
539 <td><p>comment/p</p></td>
540 <td><p>phrase</p></td>
541 <td><p></p></td>
542 </tr>
543
544 <tr>
545 <td><p>*HTML* (table)</p></td>
546 <td><p></p></td>
547 <td><p></p></td>
548 </tr>
549
550 <tr>
551 <td><p>*HTML* (tr)</p></td>
552 <td><p></p></td>
553 <td><p></p></td>
554 </tr>
555
556 <tr>
557 <td><p>*HTML* (th)</p></td>
558 <td><p></p></td>
559 <td><p></p></td>
560 </tr>
561
562 <tr>
563 <td><p>*HTML* (td)</p></td>
564 <td><p></p></td>
565 <td><p></p></td>
566 </tr>
567
568 <tr>
569 <td><p>*HTML* (img src)</p></td>
570 <td><p></p></td>
571 <td><p></p></td>
572 </tr>
573
574 <tr>
575 <td><p>?</p></td>
576 <td><p></p></td>
577 <td><p></p></td>
578 </tr>
579
580 </table>
581
582 <p>
583 The <strong>*HTML*</strong> entries above are not real tags in
584 debiandoc-sgml; they stand for the missing features needed to create
585 the corresponding HTML tags.
586 </p>
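<p>
To make the one-to-one rows of the table concrete, here is a small sketch that
renames a few tags with a regular expression. It is an illustration only and
not the XSLT used by the script: <tt>SIMPLE_RENAMES</tt> and
<tt>rename_simple_tags</tt> are hypothetical names, only tags written without
attributes are handled, and rows that expand into several elements or depend
on context (name, email, item, tag, manref, comment, the directory case of
file) are left out on purpose.
</p>

```python
import re

# One-to-one renames copied from the conversion table.
SIMPLE_RENAMES = {
    "p": "para",
    "em": "emphasis",
    "var": "replaceable",
    "prgn": "command",
    "file": "filename",
    "tt": "literal",
    "heading": "title",
    "version": "releaseinfo",
}

# Matches start and end tags that carry no attributes, e.g. <em> or </em>.
_PLAIN_TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")


def rename_simple_tags(text):
    """Rename attribute-less debiandoc-sgml tags listed in SIMPLE_RENAMES."""
    def swap(match):
        slash, name = match.groups()
        # Unknown tags are passed through unchanged.
        return "<%s%s>" % (slash, SIMPLE_RENAMES.get(name.lower(), name))
    return _PLAIN_TAG.sub(swap, text)
```

<p>
A regular expression is enough here only because these rows are pure renames;
anything structural needs a real parser or the XSLT route.
</p>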
587
588 <p>
589 File splitting has a funny bug which creates titletoc.xml in the scripts
590 directory. Also, multi-file XML requires entries like:
591 <pre>
592 &lt;!ENTITY titletoc SYSTEM "en/titletoc.sgml"&gt;
593 </pre>
594 Currently, this is a manual process.
595 </p>
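<p>
Until the script writes these entries itself, they could be generated from a
list of chunk names. A minimal sketch, assuming every chunk lives in one
directory and shares one extension: <tt>entity_declarations</tt> is a
hypothetical helper, and the <tt>en</tt> and <tt>sgml</tt> defaults are taken
from the single example above.
</p>

```python
def entity_declarations(chunk_names, directory="en", extension="sgml"):
    """Return one <!ENTITY ... SYSTEM "..."> line per chunk name."""
    return [
        '<!ENTITY %s SYSTEM "%s/%s.%s">' % (name, directory, name, extension)
        for name in chunk_names
    ]
```

<p>
For example, <tt>print("\n".join(entity_declarations(["titletoc"])))</tt>
reproduces the entry shown above.
</p>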
596
597 </body>
598 </html>
