
@** Release history.

\font\df=cmbx12
\def\date#1{{\medskip\noindent\df #1\medskip}}
\parskip=1ex
\parindent=0pt

\date{Release 1: August 1993}

Initial release, which accompanied {\it De la Terre \`a la Lune}.
Supported \LaTeX\ output only.  The program was called
\.{ETLATEX} in this release.

\date{Release 2: December 1996}

Added support for HTML, generating a document tree consisting
of an index and one document per chapter with navigation links.
Added support for French-style punctuation.  This release
accompanied the Etext of {\it Le Tour du Monde en 80 Jours}.
The program was renamed \.{ETSET} as of this release.

\date{Release 2.1: December 1996}

Added the ability to ``flatten'' 8-bit ISO characters to their
closest 7-bit ASCII representation for dumb terminals which
cannot display accented characters and an option to warn if
the Etext contains any non-ISO characters.

\date{Release 3: September 2001}

Essentially rewritten as a \CPP//STL application in Knuth's
``Literate Programming'' paradigm.

Added support for Palm Markup Language (PML) output.

HTML output may now be written as a single file containing
all chapters as well as a document tree with one chapter
per file.

\LaTeX\ output may now invoke the \.{babel} package
for language-specific formatting and use ISO 8859-1
characters in the \LaTeX\ input.

Each output format now permits ``special'' format-specific
commands to be embedded in a master Etext.  The interpretation
of these commands is up to the code which generates each
output format; common applications are passing commands
transparently through to the output to include images, etc.,
and substituting strings in the Etext for format-specific encodings
which better represent the original document.

Tools for preparers of Etexts are provided including
facilities to syntax check documents, strip format-specific
special commands, prologues, and epilogues, convert text
to canonical form by expanding tab stops and deleting trailing
white space, and more.

This release was coincident with, but not embedded in, the
Etext of {\it Autour de la Lune}.

\date{Release 3.1: May 2005}

Revised HTML generation to be compliant with XHTML 1.0
and replace GIF navigation images with PNG files.

Migrated the WIN32 version from DJGpp to Microsoft Visual
\CPP/.NET.


@** Development log.

\date{2001 August 30}

What with commencing the cleanup and debug phase, particularly
in the output format specific parts where it's easy to fix
something in one format and forget to make corresponding
changes in the other, time has come to start a development
log.  So here goes.

Renamed the flag which controls French-style spacing around
punctuation as the more appropriate |frenchPunct| (this is
enabled by the \.{-f} command line option) and fixed numerous bugs
in handling it on both the \LaTeX\ and HTML sides.  \LaTeX\ now
uses a smaller font for the nonbreaking space before punctuation.
Note that with the new guillemets there is no need for a special
case for spacing after an open guillemet in \LaTeX, but HTML
must still test for this condition.

Added support for chapter numbers in HTML single file
output.  Chapter separators may contain a chapter number,
chapter name, or neither.  If the separator contains only
a number or a name, no rule is generated.  Otherwise,
a rule separates the chapter number and name or suffices
as the entire chapter separator.

Added code to handle multi-line title, author, chapter
number, and chapter name specifications for HTML format
output.  This was already implicitly handled for
\LaTeX\ since all of these items wrap the body in a
declaration which may span multiple lines.

We don't permit nested \.{[}{\it footnotes}\.{]} and, although they actually
happen to work in \LaTeX, they don't in HTML\null.  Added a check for
this in HTML output which issues a warning for nested footnotes or
end footnote brackets when no footnote is open.

Added a check for close footnote bracket with no footnote
open in \LaTeX\ output as well.

Installed a very nice rendering of footnotes in single file
HTML as right aligned tables around which the text containing the
footnote flowed, then immediately ripped it right out again (actually,
disabled on ``\.{\#ifdef{ }NETSCRAPE\_SUCKS}'') because of the old bugeroo
which bit us way back in ``{\it Guns in Space}'': any floating object
causes Netscrape to forget about a running style, with the result
that the page margins are lost at the end of the floating object
and thereafter.  I know of no work-around for this, so I replaced
the nice code with ugly code which simply renders the footnote
in-line in a smaller sans-serif font with a light yellow
background.  Even that Netscrape can't completely cope with; when
it's adding space to justify a line of text, it forgets about
the background-color of the font and adds the justification space
in the page background colour instead.

\date{2001 August 31}

Moved code which creates the GIF navigation buttons into static
methods of |HTMLGenerationSink|.  |createNavButton| creates a single
button, and |createNavButtons| writes the lot.  The embedded
definitions of the buttons are now static variables local to
|createNavButtons|.

Changed definition of button embedded file arrays to
``\.{const}'', occasioning the usual \CPP/ hoo-rah all around the
program.

Added \.{buttons.h}, \.{config.h}, and \.{getopt.h} to the dependencies
of \.{etset.c} in \.{Makefile.in}.

Added a |createNavigationPanel| method to |HTMLGenerationSink| to
generate the navigation panel for a chapter.  It's called with
the numbers of the previous and next chapters (zero if none
exist and a greyed-out button should be generated).  Note
that the look-ahead required to grey out the next button in
the last chapter is not yet implemented---at the moment it
blindly makes a bad link to a nonexistent chapter.

Integrated the GNU Getopt (\.{getopt.c}, \.{getopt.h}, \.{getopt1.c}) from
the \.{fileutils-4.1} distribution (\.{lib} directory).  This version
supports long options with automatic disambiguation.  Since it's
covered by the GPL, I included a copy of the GPL as \.{COPYING.GNU}
and added a mention of the status of these files to the main
\.{COPYING} statement.  Updated the \.{Makefile.in} to add the new files
to the build and release targets.  (The release target is almost
certainly out of date in other ways and will need to be reviewed
when we're closer to releasing the thing.)

Replaced explicit 8-bit ISO 8859-1 characters in string literals with
references to defined constants based on their Unicode name which
express them as escaped hexadecimal characters or strings.  While most
compilers have no problems with such characters in character or
string constants, getting Plain \TeX\ to display them is quite a
challenge and won't necessarily work on an old or very basic
\TeX\ installation.  Since the case only comes up a few times, the path
of least resistance is to quote them and be safe.

Cleaned up some unreferenced variables and signed/unsigned natters
identified by a \.{-Wall} build.

In multi-file HTML documents, the HTML \.{<title>} of the chapter documents
is now the title of the work with the chapter number (if any) appended
following a colon.  The earlier practice of using the chapter title
looked dumb.

Modified chapter title generation for multi-file HTML to use the chapter
number, rule, and chapter name logic as used for single file documents.  I
kept the code separate because it uses a different level of heading and we'll
probably want to tweak it further when aesthetical fine tuning is in order.

Output of the document title, author, and chapter numbers and names in
HTML documents was not applying HTML translation, only quoting.  This
caused markup such as italics not to be honoured.  I added |translateHTMLString|
calls where these items are output to the documents as text.  Note that
we don't want to store these items in memory in translated form as
there are circumstances in which we need them free of HTML mark-up,
for example in the \.{<title>} section of the \.{<head>} and as
arguments in \.{<meta>} items.

Got rid of the last old-style \CEE/ I/O in the command line option handler; it's
all \CPP/ streams now.

Generation of the document preamble in \LaTeX\ was getting a tad long to be
embedded in the main case statement.  I broke it out into its own section.

\date{2001 September 1}

Implemented prejustified tables.  A prejustified table begins with a line
which begins in column 3 and contains at least one sequence of three or
more spaces between two nonblank substrings.  This, and subsequent lines
until the next blank line are to be rendered as-is, in a fixed width font
to preserve alignment.  Added documentation for this to \.{etsetfmt.w} and the
\.{adlune.txt} test Etext, and supported it with a \.{verbatim} environment in
\LaTeX\ and \.{<pre>} in HTML.  Interpretation of control codes within the text
is to be suppressed in tables.  I think this is correct now in \LaTeX, but
it needs more testing in HTML.
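
The test for the first line of a prejustified table might be sketched as
follows.  This is an illustration of the rule stated above, not the code
in the body parser; |looksLikeTableStart| is a hypothetical name, and the
sketch assumes tabs have already been expanded to spaces.

```cpp
#include <string>

// Hypothetical helper: true if the line could open a prejustified table,
// i.e. its text begins in column 3 and it contains a run of three or more
// spaces between two nonblank substrings.
bool looksLikeTableStart(const std::string &line)
{
    // Text must begin in column 3: two leading blanks, then a nonblank.
    if (line.size() < 3 || line[0] != ' ' || line[1] != ' ' || line[2] == ' ') {
        return false;
    }
    // Stop at the last nonblank so trailing spaces cannot count as a gap.
    std::string::size_type end = line.find_last_not_of(' ');
    std::string::size_type run = 0;
    for (std::string::size_type i = 2; i <= end; i++) {
        if (line[i] == ' ') {
            run++;
        } else {
            if (run >= 3) {
                return true;    // nonblank text follows a wide gap
            }
            run = 0;
        }
    }
    return false;
}
```

Under these assumptions a line such as two blanks, a word, three blanks,
and a number qualifies, while an ordinary indented sentence does not.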

Fixed a nasty bug in alignment classification where |isspace()| was considering
ISO characters as blank.  We need to eliminate all |isspace()| calls and
replace with explicit tests for blank (once we know white space has been
expanded) or for ASCII white space, being careful not to be fooled by
sign-extended characters on machines where |char| is signed.
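
A minimal sketch of an explicit ASCII white space test of the kind
described above, offered as one possible replacement for |isspace()| and
not necessarily the one adopted.  The name |isAsciiSpace| is
hypothetical.  Converting to |unsigned char| first means an ISO
character can never be mistaken for white space, whatever the signedness
of |char|.

```cpp
// Hypothetical replacement for isspace(): recognises only ASCII white
// space (blank, tab, line feed, vertical tab, form feed, carriage return).
inline bool isAsciiSpace(char c)
{
    // Work on the unsigned value so 8-bit ISO characters, which are
    // negative where char is signed, compare as 0x80..0xFF.
    unsigned char u = static_cast<unsigned char>(c);
    return u == ' ' || (u >= '\t' && u <= '\r');
}
```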

Changed \.{Makefile.in} build commands for ``\.{configurator}'' to assume a standard
installation of \.{autoconf} as opposed to the weird environment on Jura.
There's no need to specify an explicit path for the macro
library unless it's somewhere other than where \.{autoconf} expects
it to be (usually \.{/usr/share/autoconf}).

Deleted a bunch of excess baggage from \.{configure.in}.

The \.{Makefile.in} did not previously distinguish between \CEE/ and \CPP/ source
files.  This was no problem until we integrated the GNU |getopt|, which
will not build with a \CPP/ compiler as it contains K\&R style function
declarations.  I added logic to \.{Makefile.in} to consider files with an
extension of \.{.cc} as \CPP/ and \.{.c} as C, using the appropriate compiler,
and a target to produce a \.{.cc} file from a \.{CWEB} \.{.w} file by processing
it with \.{CTANGLE} then renaming the resulting \.{.c} file as \.{.cc}.  (Any
additional files produced from the \.{.w} can specify their extensions
explicitly, but this allows using the default output of \.{CTANGLE}.)

Added logic to \.{configure.in} to set \.{PAGER} to \.{less}, \.{more}, or \.{cat}, in
that order, depending on which are present.

Converted command line option processing to use |getopt_long| and
provided long option alternatives for all existing single letter
options.  Updated |usage| to document the new long options.

\date{2001 September 2}

Added logic to |auditFilter| to detect when lines belong to a
preformatted table and permit embedded spaces in them.

Added a check to |auditFilter| which warns when a line judged
to be centred by |etextBodyParserFilter::|\breakOK|classifyLine| has a
discrepancy of more than 2 between the number of spaces at the
left and the right (considering the line to extend to
|maxLineLength| characters).  In addition to ugly centring
in the Etext, this will catch many errors in aligning block
quotes and ragged left and right text.
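
The centring check can be illustrated with the sketch below.  It assumes
the line has already been classified as centred and has had tabs
expanded; |centringIsSuspect| and its |tolerance| parameter are
hypothetical names, and the default of 2 is taken from the description
above.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical check: does a supposedly centred line have left and right
// margins differing by more than the tolerance, treating the line as if
// it extended to maxLineLength characters?
bool centringIsSuspect(const std::string &line, std::size_t maxLineLength,
                       long tolerance = 2)
{
    std::string::size_type first = line.find_first_not_of(' ');
    if (first == std::string::npos) {
        return false;               // blank line: nothing to judge
    }
    std::string::size_type last = line.find_last_not_of(' ');
    long left = static_cast<long>(first);
    long right = static_cast<long>(maxLineLength) - static_cast<long>(last + 1);
    return std::labs(left - right) > tolerance;
}
```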

Installed logic to dynamically assemble the processing pipeline
with dynamically allocated components where required.  A |Plumb()|
macro in the main program now attaches each component to the end
of the pipe, advancing a |pipeEnd| pointer as it goes.  You can,
of course, still create static pipelines but this approach provides
the flexibility we'll need to support all the various task options
soon to be provided.  For the first time it's actually possible
to choose \LaTeX\ or HTML format output via the command line
options!

Added a \.{--single-file} option to enable single file HTML output.
The default remains a document tree with one file per chapter.

Added a \.{--debug-parser}~{\it file} option to enable parser debugging
without any compile-time conditionals.  If set, a tee is inserted
in the pipeline after the body parser with the secondary output
directed to a parser diagnostic filter whose output in turn
goes to a |streamSink| which writes the parser debug log to the
specified {\it file}.

Added the ability to direct the prologue and epilogue of the input
file to designated output files.  The \.{--save-prologue} and
\.{--save-epilogue} options both take a file argument designating where that
section of the input is written.  These may be the same file name,
in which case the prologue and epilogue are concatenated to the
same file.

Added the ability to configure |auditFilter| for which tests should
be performed on text it processes.  A new optional argument to the
constructor (default is all tests enabled) accepts a bit map
of |enum| type |audit_criteria|, defined public in the class
definition.  You may also set the criteria with |setAuditCriteria|
and return them with |getAuditCriteria|.

Implemented footnote support in multi-file HTML output.  When the
first footnote is encountered, a footnote document named
{\it basename}\\{\_foot.html} is created.  Each footnote in the document
is given an ascending number, which is used as a link in the
main document (as a superscript number) which points to a
fragment tag in the footnote document.  Footnotes are
separated by blank lines in a \.{<pre>} tag so the selected footnote
will appear at the top of the page.  The footnote document is
opened with a target window of {\it basename}\\{\_foot}, so the main
document continues to be displayed in browsers which support
targeted links.

\date{2001 September 3}

Added a ``\.{--clean}'' option to perform final cleanup and canonicalisation
of a complete Etext prior to publication.  This processes the
entire input, ignoring section breaks, and expands tabs to
spaces, removes trailing white space, and audits the result for
trailing blanks and embedded tabs (as an internal sanity check),
lines which exceed the maximum length, and invalid characters.
The higher-level body-only auditing is not done since the prologue
and epilogue need not and usually will not conform to the specifications
for the body.  Output is to the second argument on the command line
or standard output if it is omitted (just like \LaTeX\ output), and
error messages are directed to standard error.

Added a ``\.{--check}'' option which assumes the input is in the
canonical ready-to-publish form produced and verified by the
``\.{--clean}'' option; any text produced by \.{--clean} with no warnings
should produce no warnings when run with \.{--check}.  The text is
examined for trailing space, embedded tabs, lines which exceed
the maximum length, and invalid characters.  There is no output
other than warning messages, if any, which are written to standard
output.  Before publishing a text, it's always a good idea to
verify it with \.{--check}, since it's only too easy when making
last minute edits to embed a tab which may mess up formatting
on a reader's system with different assumptions about tab stops.

Modified |auditFilter| so lines containing embedded tabs diagnosed
when |embedded_tabs| is set in the auditing criteria are not double
reported as having invalid characters.  Also, all embedded tabs are
now reported, not just the first one in the line.

Added a |quoteArbitraryString| function to |auditFilter|, used when
printing lines which merit diagnostic messages.  This function
expands all characters which fail the |isCharacterPermissible|
test to their C $\backslash${\tt x}{\it NN} escape form so non-graphic characters are
rendered visible and nondestructive.  In addition, blanks appearing
at the end of lines are quoted as $\backslash${\tt x20} so they're apparent.
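
A sketch of this kind of quoting follows.  It is not the
|quoteArbitraryString| in the program: |quoteForDiagnostic| and
|isPermissible| are hypothetical stand-ins, and the set of permissible
characters (printable ASCII plus the ISO 8859-1 graphics from 0xA0 up)
is an assumption made for the example.

```cpp
#include <cstdio>
#include <string>

// Assumed permissibility test: printable ASCII and ISO 8859-1 graphics.
bool isPermissible(unsigned char c)
{
    return (c >= 0x20 && c <= 0x7E) || c >= 0xA0;
}

// Render a line safe for a diagnostic message: impermissible characters
// and trailing blanks become C hexadecimal escapes.
std::string quoteForDiagnostic(const std::string &line)
{
    std::string::size_type lastNonBlank = line.find_last_not_of(' ');
    std::string::size_type trailingStart =
        (lastNonBlank == std::string::npos) ? 0 : lastNonBlank + 1;
    std::string out;
    for (std::string::size_type i = 0; i < line.size(); i++) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        bool trailingBlank = (c == ' ' && i >= trailingStart);
        if (!isPermissible(c) || trailingBlank) {
            char buf[8];
            std::snprintf(buf, sizeof buf, "\\x%02X", static_cast<unsigned>(c));
            out += buf;
        } else {
            out += line[i];
        }
    }
    return out;
}
```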

Added logic to |HTMLGenerationSink| to cache lines of a chapter in
a queue until either the start of the next chapter is encountered
or the end of document is reached.  Only after we know whether a
chapter actually follows the current chapter do we generate the
chapter's HTML file.  This permits disabling the ``next'' button for
the last chapter, as opposed to allowing the user to click it only to
receive a broken link as a reward.
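
The look-ahead can be pictured with the following sketch, which defers
writing a chapter until it is known whether another follows.  It is an
illustration only: |ChapterBuffer|, |ChapterFile|, and their members are
hypothetical names, and the ``files'' are simply collected in memory
rather than written as HTML.

```cpp
#include <deque>
#include <string>
#include <vector>

// What would be written for one chapter, including whether the "next"
// button should be live.
struct ChapterFile {
    int number;
    bool hasNext;
    std::vector<std::string> lines;
};

class ChapterBuffer {
public:
    ChapterBuffer() : current(0) {}
    // A new chapter starts: the previous one (if any) has a successor.
    void beginChapter() { flush(true); current++; }
    // Queue a line of the current chapter; text before chapter 1 is ignored here.
    void addLine(const std::string &line) { if (current > 0) pending.push_back(line); }
    // End of document: the chapter in progress is the last one.
    void endDocument() { flush(false); current = 0; }
    const std::vector<ChapterFile> &written() const { return files; }
private:
    void flush(bool nextFollows) {
        if (current == 0) {
            return;                 // no chapter in progress
        }
        ChapterFile f;
        f.number = current;
        f.hasNext = nextFollows;
        f.lines.assign(pending.begin(), pending.end());
        files.push_back(f);
        pending.clear();
    }
    int current;
    std::deque<std::string> pending;
    std::vector<ChapterFile> files;
};
```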

\date{2001 September 4}

Added ``|const|'' to declarations of several read-only tables.

Disabled generation of the footnote navigation button on
``\.{\#ifdef{ }FOOTNOTE\_BUTTON\_NEEDED}''.  At the moment we use a
superscript footnote number instead of this button.

Implemented support for ``special commands''.  A special command
appears in the document body like a regular line of text and
obeys the same justification rules.  Only one special command may
appear on a line, and each has the format:

\verbatim
!    <><><>Special:FORMAT ...Format-specific text...<><><>
!endgroup
    
where FORMAT identifies which output format this special command
pertains to, for example ``\.{HTML}'' or ``\.{LaTeX}''.  By default,
|etextBodyParserFilter| strips all special commands from
text before passing it downstream.  To receive special
commands for a given format {\it FMT}, the |setSpecialFilter|
method of the body parser should be used to specify the
format desired; special commands for other formats will
continue to be elided.  To receive all special commands,
set the special filter to ``\.{*}''.
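
Recognising a special command and applying the format filter might look
like the sketch below.  It is not the parser in |etextBodyParserFilter|:
the names |SpecialCommand|, |parseSpecial|, and |passesSpecialFilter| are
hypothetical, and the sketch assumes the format name ends at the first
blank after the colon and that an empty filter means ``strip
everything''.

```cpp
#include <string>

struct SpecialCommand {
    std::string format;     // e.g. "HTML" or "LaTeX"
    std::string body;       // format-specific text
};

// Parse a line of the form  <><><>Special:FORMAT text<><><>  (with any
// leading indentation).  Returns false if the line is not a special.
bool parseSpecial(const std::string &line, SpecialCommand &cmd)
{
    static const std::string opener = "<><><>Special:";
    static const std::string closer = "<><><>";
    std::string::size_type start = line.find_first_not_of(' ');
    if (start == std::string::npos ||
        line.compare(start, opener.size(), opener) != 0) {
        return false;
    }
    std::string::size_type end = line.find_last_not_of(' ');
    if (end + 1 < start + opener.size() + closer.size()) {
        return false;               // no room for a separate closer
    }
    std::string::size_type closePos = end + 1 - closer.size();
    if (line.compare(closePos, closer.size(), closer) != 0) {
        return false;
    }
    std::string::size_type innerStart = start + opener.size();
    std::string inner = line.substr(innerStart, closePos - innerStart);
    std::string::size_type sp = inner.find(' ');
    cmd.format = inner.substr(0, sp);
    cmd.body = (sp == std::string::npos) ? "" : inner.substr(sp + 1);
    return true;
}

// Filter rule: "*" passes everything, an empty filter passes nothing,
// otherwise only specials for the named format pass.
bool passesSpecialFilter(const SpecialCommand &cmd, const std::string &filter)
{
    return filter == "*" || (!filter.empty() && cmd.format == filter);
}
```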

Implemented a special command handler in |HTMLGenerationSink|
which simply passes through the body of the special to the
output HTML file unquoted (this will be the way most format
handlers implement specials).  This permits including figures
in a document with a special like:

\verbatim
!  <><><>Special:HTML <img src="fig1.jpg" width=450 height=640 alt=""><><><>
!endgroup

Note that the figure will form part of a paragraph with whatever
alignment is specified by the justification of the special.
You can, of course, override this by including explicit HTML
tags within the body of the special.

Added code to command line parsing to verify that the input
and output files aren't the same, which would disastrously
truncate the input file before it was even read.  If the
output format is not HTML, two names are given, and neither
is standard I/O, we first test if they're lexically equal
and, if not and we're running on a system which supports
Unix |stat()|, we test for identical device and Inode numbers
to protect against aliased files.
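
The two-stage comparison can be sketched as follows for a system with
Unix |stat()|.  The function name |sameFile| is hypothetical, and the
sketch assumes that a name which cannot be examined (typically an output
file which does not exist yet) cannot alias the other.

```cpp
#include <string>
#include <sys/stat.h>

// Hypothetical check: do two path names refer to the same file?
bool sameFile(const std::string &a, const std::string &b)
{
    if (a == b) {
        return true;                // lexically identical names
    }
    struct stat sa, sb;
    if (::stat(a.c_str(), &sa) != 0 || ::stat(b.c_str(), &sb) != 0) {
        return false;               // at least one cannot be examined
    }
    // Different names, same device and inode: an alias (link, ./ prefix, ...)
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```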

Added the concept of a ``Declarations'' section consisting entirely
of special commands which appear before the title (or first chapter,
or whatever---at the top of the document).  Declarations begin
with the first special command encountered prior to the title
and continue until a non-special is encountered.  (Note that
specials not pertaining to the downstream components are already
filtered out before this processing is performed.)  The block
of declarations is emitted between \.{Begin} and \.{End} brackets with
each declaration bearing a \.{Body} bracket in the same manner as
any other text block.  If multiple sequences of consecutive
specials appear, separated by blank lines, they will be
output as multiple declaration blocks, each with its own \.{Begin}
and \.{End} brackets.

Added support for declarations in |LaTeXGenerationFilter|.  Declarations
are output in the document preamble, prior to the \.{$\backslash$begin\{document\}}
statement.  This allows declarations to include packages and
make declarations in the global document context.  For example:

\verbatim
!    <><><>Special:LaTeX \usepackage{graphics}<><><>
!    <><><>Special:LaTeX \newcommand{\fig}[1]{\resizebox{8cm}{!!}<><><>
!    <><><>Special:LaTeX    {\includegraphics{figures/#1.epsf}}}<><><>
!endgroup

Note how long specials may be split onto multiple lines as long as the
target language permits such syntax.

\date{2001 September 5}

Added |enableAuditCriteria| and |disableAuditCriteria| methods to
|auditFilter| to set and clear bit masks in the audit criteria mask.
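
The criteria mask and its accessors might be arranged along these lines.
This is a sketch rather than the class in the program: |AuditSketch| and
|isEnabled| are hypothetical, and the bit values and the particular set
of criteria shown are assumptions; only the method names and the
|audit_criteria| type name are taken from this log.

```cpp
class AuditSketch {
public:
    // Assumed bit assignments, one bit per test.
    enum audit_criteria {
        trailing_space     = 1 << 0,
        embedded_tabs      = 1 << 1,
        line_length        = 1 << 2,
        invalid_characters = 1 << 3,
        everything         = trailing_space | embedded_tabs |
                             line_length | invalid_characters
    };

    // All tests are enabled unless the caller says otherwise.
    explicit AuditSketch(unsigned criteria = everything) : mask(criteria) {}

    void setAuditCriteria(unsigned c) { mask = c; }
    unsigned getAuditCriteria() const { return mask; }
    void enableAuditCriteria(unsigned c) { mask |= c; }
    void disableAuditCriteria(unsigned c) { mask &= ~c; }

    // True only if every bit in c is set.
    bool isEnabled(unsigned c) const { return (mask & c) == c; }

private:
    unsigned mask;
};
```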

Exempted special commands from line length and improper centring
checks in |auditFilter|.

Configured |auditFilter| to permit special commands in texts being
translated to \LaTeX\ and HTML.

Added a \.{--verbose} (or \.{-v}) option to enable gabby chatter about
what's going on to standard output.

Added \.{--verbose} mode output to show the number of lines written to
a \LaTeX\ file and each HTML file generated.

Added a |permit_8_bit_ISO_characters| mode to the audit criteria
of |auditFilter| (so-phrased so it's enabled if you use the
default of ``|everything|''), and a new command line option
``\.{--ascii-only}'' which clears this mode when building the pipeline.
If you require the text to be exclusively 7-bit, you can
verify it using this option.  The option works in conjunction
with the \.{--check} and \.{--clean} options as well as those which
translate the text.

Added a |stripSpecialCommandsFilter| which removes all special
commands from the stream.  If elision of special commands would
result in two consecutive blank lines, the blank line following
the special(s) is deleted as well.  A new ``\.{--special-strip}''
option interposes this filter after the trailing white space
and tab expansion filters (or directly after the input source
if these filters are suppressed by the \.{--check} option).  The
stripping of specials may then be applied when producing a
text for publication with the \.{--clean} option, or when
formatting the text.
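
The blank line rule can be illustrated by a sketch which works on a
vector of lines instead of a pipeline stream.  The names
|stripSpecials|, |isSpecialLine|, and |isBlankLine| are hypothetical, and
a special is recognised here merely by its opening marker.

```cpp
#include <string>
#include <vector>

bool isBlankLine(const std::string &line)
{
    return line.find_first_not_of(' ') == std::string::npos;
}

// Simplified recognition: first nonblank text is the special opener.
bool isSpecialLine(const std::string &line)
{
    std::string::size_type start = line.find_first_not_of(' ');
    return start != std::string::npos &&
           line.compare(start, 14, "<><><>Special:") == 0;
}

// Remove specials; if that would leave two consecutive blank lines, drop
// the blank line which followed the special(s).
std::vector<std::string> stripSpecials(const std::vector<std::string> &in)
{
    std::vector<std::string> out;
    bool justStripped = false;
    for (std::vector<std::string>::size_type i = 0; i < in.size(); i++) {
        if (isSpecialLine(in[i])) {
            justStripped = true;
            continue;
        }
        if (justStripped && isBlankLine(in[i]) &&
            !out.empty() && isBlankLine(out.back())) {
            justStripped = false;
            continue;               // would double the preceding blank line
        }
        justStripped = false;
        out.push_back(in[i]);
    }
    return out;
}
```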

\date{2001 September 6}

Added \.{<link>} tags to the header of chapter documents in multiple
file HTML to indicate the parent/child relationship between
the chapters and the index document and the next/previous
relationships among chapters.

Added \.{<meta>} ``\.{description}'' and ``\.{author}'' tags to header section
of all HTML documents for which a title and author were specified.

Implemented declarations in HTML\null.  Special declarations which
appear before the title are saved in a |dqueue| ``|declarationsQueue|''
and then output prior to the \.{</head>} tag by |writeHTMLDocumentPreamble|.
Declarations are transcribed verbatim to each HTML file generated.
You may include as many lines as you wish in the header simply
by providing as many consecutive declaration lines.

The check for extra embedded spaces in |auditFilter| should have been
suppressed for special commands.  It is now.

\date{2001 September 7}

Added a \.{--flatten-iso} option which interposes a |flattenISOCharactersFilter|
in the pipeline.  This filter translates 8-bit ISO characters to
the nearest 7-bit ASCII equivalent, stripping accents from
letters and representing punctuation as best as possible.  This
filter may be used to extract a flattened Etext with the
\.{--check} option, or to flatten input prior to formatting.
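
A partial sketch of such flattening follows.  It is not the table used
by |flattenISOCharactersFilter|: the names |flattenChar| and |flattenISO|
are hypothetical, only some ISO 8859-1 characters are covered, and the
choice of a question mark for anything not listed is an assumption made
for the example.

```cpp
#include <string>

// Map one ISO 8859-1 character to an ASCII approximation.
std::string flattenChar(unsigned char c)
{
    switch (c) {
        case 0xA0: return " ";      // no-break space
        case 0xAB: return "<<";     // left guillemet
        case 0xBB: return ">>";     // right guillemet
        case 0xC6: return "AE";
        case 0xE6: return "ae";
        case 0xC7: return "C";      // C cedilla
        case 0xE7: return "c";
        case 0xDF: return "ss";     // sharp s
        default:   break;
    }
    if (c >= 0xC0 && c <= 0xC5) return "A";
    if (c >= 0xC8 && c <= 0xCB) return "E";
    if (c >= 0xCC && c <= 0xCF) return "I";
    if (c >= 0xD2 && c <= 0xD6) return "O";
    if (c >= 0xD9 && c <= 0xDC) return "U";
    if (c >= 0xE0 && c <= 0xE5) return "a";
    if (c >= 0xE8 && c <= 0xEB) return "e";
    if (c >= 0xEC && c <= 0xEF) return "i";
    if (c >= 0xF2 && c <= 0xF6) return "o";
    if (c >= 0xF9 && c <= 0xFC) return "u";
    if (c >= 0x80) return "?";      // not covered by this sketch
    return std::string(1, static_cast<char>(c));
}

// Flatten a whole line, character by character.
std::string flattenISO(const std::string &s)
{
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); i++) {
        out += flattenChar(static_cast<unsigned char>(s[i]));
    }
    return out;
}
```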

If a document title is specified, it will now appear centred at the top
of each HTML chapter document, with the navigation buttons at
the right.

If chapter numbers are specified, they will be used to identify
chapters in the index document created for multiple file HTML
output as the terms in a definition list.  Both the terms and the
definitions (chapter titles) are linked to the chapter documents.
If no chapter number is given, its chapter number ($1\cdots n$) is used,
followed by a period.

The incompatibility brigade having set its sights on the
humble \.{<DL{ }COMPACT>} tag (why would you {\it need} a list with
user-supplied items and descriptions appearing on the
same line, after all?), I gave up and made the chapter
table in the multiple file HTML index document a
\.{<TABLE>}.  After sufficient tweaking, it appears to behave
reasonably in Netscrape and Mozilla.  I have yet to subject
it to the tender embrace of Exploder.

When generating multiple file HTML, if a special appeared
between the title/author and the first chapter number/name
item, it would be placed within the table of chapters.  This
isn't what you want---such specials are likely to be figures at
the start of the document or some such, and shouldn't have to
conform to the constraints of being embedded in the chapter
table.  I deferred generating the start of the chapter table
until the first chapter title is encountered (at the same time
the navigation buttons are written), shifting specials to
before the start of the table.

Added a bogus column to the chapter table in the multiple
file HTML index document to separate the chapter number from
the chapter title.

\date{2001 September 8}

Converted this development log into \TeX\ in a \.{log.w} file which
is included in \.{etset.w} so it's automatically formatted when the
program is printed.  I added \.{log.w} and \.{etsetfmt.w} as dependencies
of \.{etset.tex} in \.{Makefile.in}.

\date{2001 September 9}

Added documentation of the command line and options to the top of
\.{etset.w}.

Deleted code conditional on \.{OLD\_GUILLEMETS}---the new ones are
so much prettier.

Implemented a ``\.{Substitute}'' special command for \LaTeX.  A text line
of the form:

\verbatim
!    <><><>Special:LaTeX Substitute /oeil/\oe il/<><><>
!endgroup
    
will substitute the text within the second set of delimiters
(which may be any character) for the text within the first, wherever
it may occur.  Substitution {\it is not} recursive---substituted text
is not re-scanned for occurrences of the same substitution.  Note that
substitutions are applied in |quoteLaTeXString| {\it after} all
transformations from the original text to \LaTeX\ are applied; this
provides the maximum flexibility for overriding the default translation
of text.
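
Parsing and applying a substitution might be sketched as follows.  This
is not the code in |quoteLaTeXString|; |Substitution|,
|parseSubstitute|, and |applySubstitution| are hypothetical names, and
the sketch assumes the body of the special begins with the keyword and
that the first nonblank character after it is the delimiter.  Scanning
resumes after each replacement, so substituted text is never re-scanned.

```cpp
#include <string>

struct Substitution {
    std::string from;       // text to look for
    std::string to;         // replacement text
};

// Parse a body such as:  Substitute /oeil/\oe il/
bool parseSubstitute(const std::string &body, Substitution &sub)
{
    static const std::string keyword = "Substitute";
    if (body.compare(0, keyword.size(), keyword) != 0) {
        return false;
    }
    std::string::size_type d1 = body.find_first_not_of(' ', keyword.size());
    if (d1 == std::string::npos) {
        return false;
    }
    char delim = body[d1];          // any character may delimit
    std::string::size_type d2 = body.find(delim, d1 + 1);
    if (d2 == std::string::npos) {
        return false;
    }
    std::string::size_type d3 = body.find(delim, d2 + 1);
    if (d3 == std::string::npos) {
        return false;
    }
    sub.from = body.substr(d1 + 1, d2 - d1 - 1);
    sub.to = body.substr(d2 + 1, d3 - d2 - 1);
    return !sub.from.empty();
}

// Replace every occurrence of sub.from, without re-scanning the
// replacement text.
std::string applySubstitution(const std::string &text, const Substitution &sub)
{
    if (sub.from.empty()) {
        return text;
    }
    std::string out;
    std::string::size_type pos = 0, hit;
    while ((hit = text.find(sub.from, pos)) != std::string::npos) {
        out.append(text, pos, hit - pos);
        out += sub.to;
        pos = hit + sub.from.size();
    }
    out.append(text, pos, std::string::npos);
    return out;
}
```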

Absent any widely-available rendering engine for mathematics, we
don't translate mathematics into HTML---it is simply output in its
\LaTeX\ representation.  Since editors of HTML documents will usually
wish to replace this gnarl with images of the typeset equation produced
with tools such as

\centerline{\pdfURL{\TeX to{\tt GIF}}{http://www.fourmilab.ch/webtools/textogif/}}

I added code to typeset mathematics in pink \.{<table>} boxes to make
them more apparent when proofreading the document.

\date{2001 September 16}

Added code to |PalmGenerationFilter::quotePalmString| to quote
all non-ASCII characters as $\backslash$\.{a}{\it xxx} escapes; the
documentation doesn't make this clear, but it is required unless
you fancy a warning message for each and every ISO character.

Fixed an error in generating ellipses for Palm Reader documents; a
little left-over code from \LaTeX\ generation was resulting in
the pesky ``\.{Illegal control code: \\}'' messages.

Added code to re-format paragraphs in Palm Markup Language in
``one line per paragraph'' as it expects.  At the moment only
justified paragraphs are so-formatted.

Removed pre-existing indentation before title, author, and
chapter titles.  Multi-line chapter titles are aggregated
onto a single line.

\date{2001 September 17}

Palm output wasn't properly keeping track of math mode, which
resulted in \TeX\ subscript symbols being treated as italic
toggles, completely messing up subsequent text.  Even though
we don't do anything with math mode, we need to keep track of
whether or not we're in it, as it affects quoting of characters
therein.

Added a blank line before chapter breaks in Palm Markup Language
output.  It doesn't affect the generated Palm document, but it
makes it a lot easier to scan through the PML text when looking
for problems.

Pruned existing indentation when generating an aligned paragraph
for PML\null.  PML treats indentation as text characters when justifying
the line, which messes up the intended alignment.  A special case
is required for preformatted tables, since we cannot blindly
strip significant indentation which follows the two character
indentation which marks the table; for tables only the first
two blanks are stripped.  Note that this behaviour is slightly
different from that of \LaTeX\ and HTML output where the
table indentation is preserved and it would be possible to
have a table containing lines with nonblank characters in
the first or second columns of lines subsequent to the first.
While this might be nice, two lost characters are a horrible price
to pay on the cramped Palm screen which has difficulty fitting
aligned tables of any kind, so as a compromise I strip the first
two characters of any line so long as they are blank, but
preserve the leading two characters if either is nonblank.  Yes,
this may misalign a table, but in the vast majority of cases
it will provide the best rendering of the author's intent.  Since
Palm Reader doesn't presently support a monospaced font option,
there's no hope of properly aligning preformatted tables in any case---the
author is going to have to do it by hand with the \.{\\T=} tag
after the PML is generated.
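
The compromise for table lines amounts to the following sketch;
|pruneTableIndent| is a hypothetical name.

```cpp
#include <string>

// Strip the two-character table indentation only when both characters
// are blank; otherwise leave the line exactly as it is.
std::string pruneTableIndent(const std::string &line)
{
    if (line.size() >= 2 && line[0] == ' ' && line[1] == ' ') {
        return line.substr(2);
    }
    return line;
}
```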

Tweaked |generateAlignedParagraph| for Palm output to cope with
an eccentricity of PML paragraph justification tags.  The tags
ending centred and right justified paragraphs must appear at the start
of a line, but the tag ending an indented block quote must be
at the {\it end of the last line}.  If you place it at the start
of a new line, you end up with an extra line after the paragraph.
We now cope with this, buffering text one line ahead in the case of
a block quote so the closing tag may be appended when the
|End| bracket is encountered.

Note: Palm \.{MakeBook} and \.{DropBook} don't like chapter
titles which exceed 80 characters, and issue a warning (or
``error'' in the case of \.{MakeBook}) which suggests that
chapter markers may be unpaired.  The chapter title is, nonetheless,
properly rendered in the document and truncated with an
ellipsis in the go to chapter form.  (Note that chapter titles
much shorter than 80 characters will usually be truncated in
this form as well.)  I leave such titles intact, since it's
better to put up with a warning than lop them off or include some
kind of kludge which would render poorly when the document is
read.

One more little twist with block quotes: naturally they too should
be output one paragraph per line with a \.{\\t} before and
after; I diddled the code to accomplish this, which actually
simplified it.  I may make this a special case of
generating a justified text paragraph rather than
|generateAlignedParagraph|, since the logic now more closely
resembles the former.

Made the |pruneIndent|, |elideNewLines|, and |linesIn| private
helper methods of |HTMLGenerationSink| |static|; they don't need
no steenkin instance variables.

Implemented special commands and declarations for Palm output,
including ``\.{Substitute}'' specials.  Non-substitution specials
emit their text directly into the text being assembled and
hence will be embedded in the middle of a paragraph if they
appear in regular text or a block quote.  Declarations are
output before the start of the text and may be used for
special titling.

Spacing around guillemets in |frenchPunct| mode was broken
due to yet another signed/unsigned |char| problem with
the ISO guillemets which are, of course, negative when
treated as a |signed char|.  I changed the definition to
hexadecimal constants and anded |char| quantities which
may be signed with |0xFF| where appropriate to guarantee
the comparison will be valid.  This required corresponding
changes in the |frenchPunct| handling of \LaTeX\ and
HTML generation as well.  Since we need to quote all ISO
characters for the Palm, spacing around guillemets had to
be handled separately in the
|@<Quote ISO 8859-1 character in Palm@>| handler.  This
actually simplified the code, since the case of
punctuation followed by a right guillemet is more clearly
distinguished from guillemet handling itself.

Palm markup language prescribes a single space after punctuation,
not two.  I modified accrual in |@<Output ASCII text character in Palm@>|
to check whether the last character in the string being
assembled is blank; if so, the space is discarded.  Note
that this occurs when quoting document text, so you can
still insert multiple spaces in a special, should you need
to.  Naturally, we don't do this within a preformatted
table.
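
A minimal sketch of the space-discarding rule, assuming the text is
accumulated one character at a time into a |string|.  The function names
here are hypothetical, and the preformatted table exception is not modelled.

```cpp
#include <string>

// Append c to the text being assembled, dropping a space which would
// directly follow another space.  (Hypothetical helper, not the
// program's own code.)
void appendCollapsingSpaces(std::string &assembled, char c) {
    if (c == ' ' && !assembled.empty() &&
        assembled[assembled.length() - 1] == ' ') {
        return;             // last character is already blank: discard
    }
    assembled += c;
}

// Convenience wrapper: run a whole line through the rule above.
std::string collapseSpaces(const std::string &line) {
    std::string assembled;
    for (char c : line) {
        appendCollapsingSpaces(assembled, c);
    }
    return assembled;
}
```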

Ripped out the |terminator| argument from |generateAlignedParagraph|
for the Palm---it was no longer used.

Replaced the old logic for generating body paragraphs with a new
|generateFilledParagraph| method which takes the same
arguments and works in the same fashion as |generateAlignedParagraph|,
but joins the lines of the paragraph into a single line with
optional markup tags at the start and end.  This is now used
to generate body paragraphs (with null markup tags) and
indented block quotes with \.{\\t} tags.  This allowed me
to eliminate all the special cases for block quotes from
|generateAlignedParagraph|, dramatically simplifying it.
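
The joining step can be sketched as below.  The real arguments of
|generateFilledParagraph| are not shown in this log, so the function
|fillParagraph| and its signature are assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Join the lines of a paragraph into a single line, wrapped in optional
// markup tags.  Hypothetical stand-in for the method described above.
std::string fillParagraph(const std::vector<std::string> &lines,
                          const std::string &startTag = "",
                          const std::string &endTag = "") {
    std::string result = startTag;
    for (std::vector<std::string>::size_type i = 0; i < lines.size(); i++) {
        if (i > 0) {
            result += ' ';  // one blank replaces each line break
        }
        result += lines[i];
    }
    result += endTag;
    return result;
}
```

Under these assumptions a body paragraph would be produced with the
default empty tags, and a block quote by passing the PML tab tag as both
the start and the end tag.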

\date{2001 September 18}

Modified \.{Makefile.in}'s \.{clean} target to delete
\.{*.pml} files left around from testing.

Added a ``\.{reconfigure}'' target to \.{Makefile.in}
to facilitate testing on multiple platforms.  It deletes
\.{config.cache}, re-runs \.{./configure}, then does a
``{\tt make~clean}'' using the newly-generated \.{Makefile}.

Split \LaTeX, HTML, and Palm generation into separate
\.{latex.w}, \.{html.w}, and \.{palm.w} \.{CWEB} files,
all included in the main \.{etset.w} with the
\.{@@i} control code.  This facilitates comparing code
among formats since each can be opened in a separate window
as opposed to forever scrolling back and forth.

Had another go at persuading the \.{Makefile} to comprehend
the fact that while \.{CTANGLE} writes a \.{.c} file,
we want to compile it as a \.{.cc} file.  The last
attempt would fail in a bizarre way if \.{CTANGLE}
detected an error and exited, leaving a \.{.c}
file around, which the next \.{make} would
attempt to compile with the \CEE/ compiler
instead of \CPP/.

Added a check to |auditFilter|, governed by audit
criterion |trailing_hyphen|, which checks for the common
sin in scanned documents of a hyphenated word which the
editor has forgotten to join in the Etext.  (Note that
each occurrence of a hyphen at the end of a line must
be reviewed by the editor to determine whether the hyphen
was inserted between syllables or existed before the
word was broken, as for example in ``Franco-Prussian''.)
A trailing hyphen is reported only if it is preceded by
a letter, including ISO accented letters.  A public
|static| function, |isISOletter|, is provided by
|auditFilter| to other code which may need to determine
if a character is an ISO letter.
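
A rough sketch of such a check follows.  The letter test shown is an
assumption about what an ISO 8859-1 letter test might look like (ASCII
letters plus the accented range, less the multiplication and division
signs); the actual |isISOletter| may differ, and both function names
below are hypothetical.

```cpp
#include <string>

// Hypothetical ISO 8859-1 letter test: ASCII letters plus 0xC0-0xFF,
// excluding the multiplication (0xD7) and division (0xF7) signs.
bool isISOLetterSketch(char ch) {
    int c = ch & 0xFF;      // guard against a signed char
    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
        return true;
    }
    return c >= 0xC0 && c != 0xD7 && c != 0xF7;
}

// Report a line which ends in a hyphen immediately preceded by a letter.
bool hasTrailingHyphen(const std::string &line) {
    std::string::size_type n = line.length();
    return n >= 2 && line[n - 1] == '-' && isISOLetterSketch(line[n - 2]);
}
```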

\date{2001 September 19}

Spent all day implementing footnotes in Palm Markup Language
output.  Well$\ldots$to be more precise, I spent about 15 minutes
designing, implementing, and debugging how I intended
footnotes to work, then the rest of the day psychoanalysing
\.{DropBook} (version 1.1.1) and bugs in Palm Reader (version
1.0.6), both versions being the latest released as of this
writing.

Footnotes in PML documents, like those in HTML, may not
be nested---if nested footnotes appear in the input
text a warning is issued and the nested footnotes are
simply included in the outermost footnote surrounded
by square brackets, just as in the input.  When a
footnote appears, a link is inserted in the output
string being assembled by |quotePalmString|, consisting
of the footnote number enclosed in square brackets, with
link destination ``\.{f}{\it n}'', where {\it n} is the
footnote number.
An anchor named ``\.{b}{\it n}'' is placed before the
footnote link in the text; this permits returning from
the footnote to the text in which it appears.

Footnotes are placed in a pseudo-chapter added to the
end of the document; to avoid language-specific nomenclature,
this chapter is named ``$^1$~$^2$~$^3\ldots$''.  A page
break appears before each footnote in this chapter, and
each footnote begins with its number and a period in
bold type.  At the end of the footnote text is a link
back to the body where the footnote appeared; the
target of this link is the bold string ``${\bf <<<}$'',
once again avoiding language-specificity.  In case
the clever notion of using guillemets for such a
link pops into your head, invite it to pop right back
out---Palm Reader goes into gibber gibber land if it
sees an ISO character in a link target and starts
scribbling all over system and unallocated memory.
Maybe they'll fix that some day.

When the opening bracket of a footnote is encountered,
the current encoded string being assembled by
|quotePalmString| is saved in |footsave| and
the flag |infoot| is set.  The text processing modes
|quoth| and |italics| are also saved and reset to
their defaults at the start of a paragraph.

You'll recall that |generateFilledParagraph| must concatenate
lines into one monster line per paragraph or else Palm Reader
will faithfully start a new paragraph at each end of line,
resulting in horrid looking text.  Consequently, it can't call
|emit| for each line of the body it receives, but must assemble
lines into the single line paragraph.  As a result, it must
also handle strings returned by |quotePalmString| with
|infoot| set, concatenating them itself into the
|footpar| being assembled.  Even though |generateAlignedParagraph|
usually emits quoted lines as they are completed, it still
must assemble footnotes into a single line per footnote
(we assume each footnote is a single paragraph) so that lines
will flow when it is displayed.  Thus, when |infoot| is
set it concatenates footnote text returned by
|quotePalmString| to |footpar|.

When the right bracket delimiting the end of a footnote
(only the outermost, if they are nested) is encountered,
the text assembled so far for this line is concatenated
to that from earlier lines, if any, and the result,
with footnote number, anchor, and link back to the
text where the footnote appeared is output, using
|emit| to append it to the master |footnotes| string.
The partial translated string saved at the start of
the footnote is then restored, along with the text
modes in effect at that time, and processing of
normal text resumes.

\date{2001 September 20}

Bugged out last night without ever describing how footnotes
actually make it into the PML output file.  When the closing
bracket of a footnote is processed, the anchor and text
accumulated in |footpar| are output by calling the class-local
version of |emit| which, with |infoot| set, appends its
argument plus a new line to the string |footnotes| rather
than passing it down the pipeline.  The back link from the
footnote to the text is similarly appended by calling
|emit|.  Finally, when the end of input is reached and we
receive the |EndOfText| item, all that need be done is to
call |emit(footnotes)| with |infoot| |false|, which
passes all the accumulated footnotes down the pipeline.
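
The diversion of footnote text can be sketched as follows.  The class
|FootnoteCollector| is hypothetical, and its |downstream| vector merely
stands in for the rest of the pipeline; only the names |emit|, |infoot|,
and |footnotes| come from the description above.

```cpp
#include <string>
#include <vector>

// Minimal model of an emit which diverts footnote text.
class FootnoteCollector {
public:
    bool infoot = false;                  // true while inside a footnote
    std::string footnotes;                // accumulated footnote chapter
    std::vector<std::string> downstream;  // stand-in for the pipeline

    // Inside a footnote, append the line plus a new line to the
    // footnote string; otherwise pass the line down the pipeline.
    void emit(const std::string &s) {
        if (infoot) {
            footnotes += s + "\n";
        } else {
            downstream.push_back(s);
        }
    }

    // At end of input, flush the accumulated footnotes in one call.
    void endOfText() {
        infoot = false;
        if (!footnotes.empty()) {
            emit(footnotes);
        }
    }
};
```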

Document title and author specifications which spanned
multiple lines did not work in PML\null.  Integrated
code from |HTMLGenerationSink| to collect them into a
single line as required in PML.

Added logic to |HTMLGenerationSink| to push the current italic
text state when a footnote is encountered and pop it at the end
of the footnote (if footnotes are [improperly] nested, this
applies only to the outermost footnote).  Also, nested footnotes
are now output like in-line footnotes in single file HTML, rather
than simply being merged with the text.

A footnote which appeared in a centred, ragged right, or ragged left
paragraph in multiple file HTML output would preserve the line
breaks in the input text in the footnote document.  I modified
|HTMLGenerationSink::|\breakOK|generateAlignedParagraph| to skip
appending the |terminator| when |infoot| is set upon return from
|translateHTMLString|.

\CPP/ ``mountain range'' identifiers like:\hfill\break
\centerline{\.{pneumaticJackHammer::diggaDiggadig}}
\noindent can wreak havoc with \TeX\ line filling, and the \.{CWEB}
macros don't honour either a ``\.{\atsign\vbar}'' optional line break
in the middle of one or even two adjacent ``\.{\vbar}'' constructs separated
by a space.  I defined a \TeX\ macro ``\.{\bslash breakOK}'' to use within
such items (usually after the double colon, between two adjacent \CEE/ items),
to permit them to be broken more \ae sthetically.

Defined \TeX\ macros \.{\\atsign}~``\.{\atsign}'',
\.{\\bslash}~``\.{\bslash}'',
\.{\\caret}~``\.{\caret}'',
\.{\\uline}~``\.{\uline}'',
and \.{\\vbar}~``\.{\vbar}'' to make
it easier to talk about such characters in this document without
clever special-case quoting.

Here are some things to watch out for when creating PML documents with
embedded images.  First of all, the images must be created in PNG
format and placed in a subdirectory of the directory containing the
\.{.pml} file you're compiling.  If your PML file is
\.{/home/elvis/palmdoc/hounddog.pml} then the images must be
placed in a directory named \.{/home/elvis/palmdoc/hounddog\_img}.
The PNG files for these images must be of the
``palette'' type.  As of \.{DropBook} 1.1.1, grey scale and
other types of PNG files do not work.  To determine what
kind of PNG file you have and how big it is, use the following
command, which requires the NETPBM and PNG utilities:

\hskip 1cm \.{pngtopnm }{\it name}\.{.png \vbar\ pnmfile}

This will produce output like:

\verbatim
!    pngtopnm: reading a 112 x 158 image, 8 bits palette
!    pngtopnm: writing a PGM file (maxval=255)
!    stdin:	PGM raw, 112 by 158  maxval 255
!endgroup

If the first line does not indicate ``\.{8 bits palette}'', you're probably
in for trouble; you'll need to load the image with an
image editing program and convert it to an 8-bit palette image.
Take note of the image size as well.  An image of up to $158\times148$
pixels will display in-line in the document, while a larger image
will be represented by an icon the user must tap to display the
actual image.  If the image is larger than $158\times158$ pixels
the user will be required to scroll the screen to see the
complete image.  Finally, the actual PNG file embedded in the
document may be no larger than 65505 bytes, as this is the maximum
size of the Palm database records in which they are stored.  If
your image is larger than this, you'll need to reduce resolution
and/or select compression modes which reduce it to 65505 bytes
or smaller.  None of these constraints have anything to do with
this program proper; I mention them here in the interest of sparing
you some of the frustrations I experienced trying to make illustrated
PML documents while testing it.

\date{2001 September 21}

Palm output didn't show the number of lines of PML generated
when the ``\.{--verbose}'' option was specified; fixed.  One
subtlety is that when the footnotes are appended to the end of the
document, |emit| is called with a line which may contain one
or more embedded end-of-line characters.  This is a little shoddy,
but it works just fine and saves us from all the complexity of
a line queue and separate calls on |emit| whose only benefit
would be purity of essence.  It does mean, however, that in order
to accurately count the lines written to the PML file we need
to count the number of new line characters in the aggregate
footnote string and add that to the number of lines counted
by |emit|; this is easily accomplished.
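
Counting the embedded new lines takes a single standard library call.
This is a sketch; the function name |countEmbeddedNewlines| is hypothetical.

```cpp
#include <algorithm>
#include <string>

// Number of new line characters embedded in an aggregate string.
long countEmbeddedNewlines(const std::string &s) {
    return static_cast<long>(std::count(s.begin(), s.end(), '\n'));
}
```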

Cleaned up formatting of the input syntax documentation in
\.{etsetfmt.w}.

Plain \CPP/ |iterator| was not defined in \.{cweb/c++lib.w} as
a type name; I added it.

\date{2001 September 22}

To make it easier to cope with Project Gutenberg source documents
which are perversely published in MS-DOS Code Page 850 8-bit
characters, I added a |convertForeignCharacterSetToISOFilter|
which, driven by a translation table, converts characters in
the lines it passes through.  I defined a Code Page 850 to
ISO translation table in the file \.{cp850.h}, which is
included in \.{etset.w}.  (Keeping the conversion table in
a separate file allows me to use ISO characters in comments without
gnarly encoding for \.{CWEB} in a table which nobody will ever look
at anyway.)

\date{2001 September 23}

Added a \.{--dos-characters} command line option to place
a |convertForeignCharacterSetToISOFilter| immediately
after the stream source at the head of the pipeline.  The
filter uses the |cp850_to_ISO| translation table to convert
DOS characters to ISO 8859.
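
A table-driven conversion of this kind can be sketched as below.  The
table built here is an identity mapping with just three Code Page 850
letters remapped, for illustration; it is not the contents of
\.{cp850.h}, and the function names are hypothetical.

```cpp
#include <string>

// Translate every character of a line through a 256-entry table.
std::string translateLine(const std::string &line,
                          const unsigned char table[256]) {
    std::string out = line;
    for (char &c : out) {
        // Index with an unsigned value so 8-bit characters are safe.
        c = static_cast<char>(table[static_cast<unsigned char>(c)]);
    }
    return out;
}

// Fill a sample table: identity, plus three CP850 to ISO 8859-1 entries.
void buildSampleTable(unsigned char table[256]) {
    for (int i = 0; i < 256; i++) {
        table[i] = static_cast<unsigned char>(i);
    }
    table[0x82] = 0xE9;     // e acute
    table[0x8A] = 0xE8;     // e grave
    table[0x85] = 0xE0;     // a grave
}
```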

Added the ability to strip DOS carriage returns from the ends
of lines in |streamSource|.  A new |setStripEOL| method, called
with an argument of |true|, enables carriage return stripping. 
Only a single carriage return will be stripped from the end of
lines, and lines which do not contain a trailing carriage return
will not be modified.  This mode is activated by the
\.{--dos-characters} option. There's a |getStripEOL| method to
inquire if stripping is enabled in case you need to know.
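
The stripping rule itself is simple; a sketch, with the hypothetical
function name |stripTrailingCR|, might read:

```cpp
#include <string>

// Remove at most one carriage return from the end of a line; lines
// without a trailing carriage return are left untouched.
void stripTrailingCR(std::string &line) {
    if (!line.empty() && line[line.length() - 1] == '\r') {
        line.erase(line.length() - 1);
    }
}
```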

Added \.{cp850.h} to dependencies of \.{etset.cc} in
\.{Makefile.in}.

Fixed a few more instances of awkward grammar in \.{etsetfmt.w}
in the process of integrating the text into the latest version
of {\it Autour de la Lune}.

Brought the \.{README} up to date.

\date{2001 September 24}

Integrated current description of command line options and
input syntax into the manual page \.{etset.1}.

Placed the Web document for the program in subdirectory
\.{webdoc} and updated.

Added a \.{check} target to \.{Makefile.in} and a
\.{reference} target to rebuild the \.{check\_master.txt.gz}
file which the output of the check run is compared against.

Made a \.{makew32.bat} file to build the Win32 executable
with DJGpp (compiling \.{getopt.c} and \.{getopt1.c} with
\.{gcc}).  Added a \.{testw32.bat} file to semi-automate
testing on Win32.  You still have to do the \.{diff} by
hand, since there aren't stock utilities to perform this
step on Win32.

\date{2001 September 25}

Cleaned up some signed/unsigned natters from a \.{-Wall}
build in code related to special command parsing and
processing.

Added null destructors declared |virtual| to
|LaTeXGenerationFilter| and |PalmGenerationFilter|
to get rid of natters about ``class has virtual
functions but non-virtual destructor''.

Fixed a rather subtle signed/unsigned problem which
has been bedeviling me on the Win32 build for the last
day.  Consider code of the form:

\verbatim
!   string s;
!   unsigned int i;
!   for (i = s.length() - 1; i >= 0; i--) {
!endgroup

where you wish to scan a string in reverse order for
something or other.  If \\{s} is the null string, the expression
\\{s.length()-1} will go negative, which will be treated as
a huge positive value since \\{i} is |unsigned|.  What this
does is architecture- and compiler-dependent; on the Win32
build with DJGpp, it appears to have indexed off the start
of the string, which caused the code in
|@<Check for line with trailing white space@>| to
randomly (actually non-repeatably) report trailing
blanks on lines which were actually empty.  I changed the
iteration variable to a regular |int| and the problem went
away.
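
Note too that the test |i >= 0| can never fail when |i| is |unsigned|,
so such a loop cannot terminate through its own condition.  An
alternative to switching to |int|, shown here only as a sketch with the
hypothetical name |trailingBlanks|, is to keep an unsigned index but run
it from the length down to one and subtract one at the point of use:

```cpp
#include <string>

// Count trailing blanks and tabs, scanning from the end of the string
// without ever forming length() - 1 on an empty string.
std::string::size_type trailingBlanks(const std::string &s) {
    std::string::size_type n = 0;
    for (std::string::size_type i = s.length(); i > 0; i--) {
        if (s[i - 1] != ' ' && s[i - 1] != '\t') {
            break;          // first non-blank character from the end
        }
        n++;
    }
    return n;
}
```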

Modified the \.{dist} target in \.{Makefile.in} to handle
subdirectories (such as \.{cweb} and \.{webdoc}) included
in the distribution.

Added dependencies to the \.{dist} target to ensure the
PDF documentation and \CPP/ source are current.

How embarrassing!  In testing distribution archives, I
discovered that the program did not detect if its input file
did not exist.  To handle this in a thoroughly clean manner,
I broke out the file open code in |streamSource| into a
separate |openFile| method which throws an |invalid_argument|
exception if the file does not exist.  If you wish to catch
this exception, construct the |streamSource| with no
arguments (initially binding it to |cin|), then perform the
|openFile| within a |try| block which handles the exception
that results if the input file does not exist.
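
The pattern can be sketched as follows.  The class |SourceSketch| is a
hypothetical stand-in for |streamSource|, whose real interface is not
reproduced here; only the |openFile| name and the |invalid_argument|
exception come from the description above.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a source with a separate open step.
class SourceSketch {
public:
    // Open the named file, throwing if it cannot be opened.
    void openFile(const std::string &name) {
        file.open(name.c_str());
        if (!file.is_open()) {
            throw std::invalid_argument("Cannot open input file " + name);
        }
    }

private:
    std::ifstream file;
};

// Caller side: perform the open within a try block.
bool tryOpen(SourceSketch &src, const std::string &name) {
    try {
        src.openFile(name);
        return true;
    } catch (const std::invalid_argument &) {
        return false;       // a real caller might print a message and exit
    }
}
```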

\date{2001 September 26}

Added \.{test.txt} to the distribution.  It was its absence
which provoked yesterday's alarums.  Also needed to include
\.{check\_master.txt.gz} in the archive to run \.{check}
tests on builds.

Corrected dependencies in \.{Makefile.in} so
generation of \.{config.h} by \.{configure} won't require
\.{etset.cc} to be regenerated from the \.{.w} files;
\.{etset.o} depends on \.{config.h}, but \.{etset.cc}
doesn't.

\date{2001 September 28}

Corrected an error advancing past the opening bracket of a
footnote in |quotePalmString|; a footnote which followed the
two spaces at the end of the sentence would incorrectly
be reported as nested.

Release 3.0.1.

\date{2005 April 25}

Fixed two places, one in \.{latex.w}, the other in \.{html.w},
where I offended the \.{gcc} priesthood (version 3.2.3) by
declaring a default value for a function argument in both the
class definition and the implementation.  How removing this
documentation from where the argument is actually used is
supposed to improve maintainability of code escapes me, but
I suppose my head is insufficiently pointed to comprehend.

Fixed a compile error in \.{HTMLGenerationSink::createNavButton}
where I committed the mortal sin of passing an |unsigned char|
to |write|, which expects a |char|.  I just hit it over the head
with a |reinterpret_cast|.

Somehow a backslash had crept into a line in |@<Toggle math mode in HTML@>|
in \.{html.w} which didn't bother the \CPP/ compiler but torpedoed
\TeX\ when attempting to typeset the program.  I removed it.

Confirmed that with these ``fixes'', the program compiles without
problems and passes the self-test on \.{gcc} 3.4.3 as well.

The \.{-Wall} option on \.{gcc} now natters if you don't
either handle all possible values of an enumeration in a
switch statement on an enumeration type or else include a
|default:| case to catch them.  I added a default to
|@<BetweenParagraphs state@>| in \.{etset.w} to get rid of
this warning.

Converted the image definitions in \.{buttons.h} to PNG file
format, then modified |HTMLGenerationSink::createNavButtons|
and |HTMLGenerationSink::createNavigationPanel| in
\.{html.w} accordingly.  I also added attribute quotes and a
close slash to the image tags to be XHTML compliant.

Added closing table data and row tags to the table generated
for mathematics in the interest of XHTML compliance.

XHTML doesn't allow one to include a horizontal rule within
a heading.  I modified the chapter heading generation for
single file HTML mode to close the \.{h2} item, emit the
rule, then re-open it with the same modes.  The same change
was required for the \.{h1} tags used in chapter titles in
multiple file HTML output as well.

Added quotes and a closure slash for the \.{<link>} tags included
in multiple file HTML output.

Added closure tags for table markup in the header and table
of contents of multiple file HTML output.

Added self-closure to the blank paragraphs which separate
footnotes in multiple file HTML output.

\date{2005 May 6}

In order to build a WIN32 binary with Microsoft Visual \CPP/.NET,
I added preprocessor logic in |@<System include files@>| which,
conditional on \.{WIN32}, modifies the configuration to
permit a build with the standard Unix \.{config.h} without
modifications.  I also added similar code to {\tt getopt.h}
to avoid a conflict with \.{string.h}.

Deleted the \.{makew32.bat} file used to build with DJGpp.

If you use the \.{ctype.h} macros with the standard \CPP/
|string| data type, any ISO character with the sign bit set
will cause an assertion failure in debug builds.  To
avoid this idiocy, you have to compile with the ``|char|
type is |unsigned|'' option, which I therefore enabled.
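
Where changing the compiler option is not possible, a common
alternative (not what was done here) is to cast each character to
|unsigned char| before handing it to the \.{ctype.h} functions.  A
sketch, with the hypothetical name |isAlphaSafe|:

```cpp
#include <cctype>

// Classify a character without passing a negative value to isalpha,
// whose argument must be representable as unsigned char (or be EOF).
bool isAlphaSafe(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
}
```

Whether an accented ISO character counts as a letter here still
depends on the locale in effect.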

Added the \.{etset.sln} and \.{etset.vcproj} solution and project
files to the source archive.  These files are used to build
with Visual \CPP/.

Added a macro to the \.{testw32.bat} WIN32 regression test script
to allow specification of the directory path to the \.{etset.exe}
executable and documentation on how to verify the result from
running the script.

Release 3.1.

\date{2005 May 22}

When built with \.{gcc} 3.{\it x}, the \LaTeX\ or PML output
file was truncated when an output file name was specified (as
opposed to sending the output to standard output) because the
file was not closed and the last buffer not flushed.  I added
logic to delete the dynamically allocated |streamSink| created
to write these files, and code to the |streamSink| itself to delete
its dynamically allocated |ofstream|, if any.  This serves to
guarantee the output file is flushed and closed normally.  For
some reason this problem never occurred with \.{gcc} 2.96.

Release 3.1.1.

\date{2006 May 16}

Changed working version to 3.2.

\date{2006 May 17}

Completed initial implementation of the \.{--strict} option which,
when specified along with the \.{--html} option, generates XHTML 1.1
Strict compliant documents.  Output should be essentially unchanged if
\.{--strict} is not specified---only a few little cleanups, such as
making the tag names in CSS all lower-case and moving the document
character set declaration to a ``{\tt meta}'' tag so as not to
befuddle simple-minded Exploder.

Removed the ``\.{-s 0}'' option from \.{xdvi} in \.{Makefile.in}:
it befuddles recent versions of the program.

Added the \.{--unicode} option which, when specified along with
the \.{--html} option, emits Unicode text escapes for left and
right double quotes, ellipses, and m-dashes.  This option may be used
in conjunction with any other HTML generation options.

Added a dummy virtual destructor to the |textComponent| class to get
rid of a plethora of idiot ``virtual functions but non-virtual
destructor'' warnings from \.{gcc} 4.0.2 in \.{-Wall} mode.

\date{2006 May 18}

Modified the test case and reference document generation in
\.{Makefile.in} to generate HTML documents with \.{--strict}
and \.{--unicode} modes as well as the default and verify
them.  Modified the \.{testw32.bat} WIN32 regression test script
to create output compatible with the revised reference document.

Fixed a couple of symbolic links to image files for the test
document which were broken when the development directory
structure was re-organised by version.

Changed a hard-coded reference to ``\.{zcat}'' in the
\.{check} target of \.{Makefile.in} to reference a
\.{UNCOMPRESSOR} macro defined at the top instead.

XHTML text entities generated when \.{--unicode} mode was
specified ran afoul of the extra non-breaking spaces interpolated
when \.{--french} was also specified.  I added a test for them
in |HTMLGenerationSink::translateHTMLString| to guarantee that
they are transcribed to the output without modification.

\date{2006 May 19}

Fixed a line-counting bug which resulted in off-by-two estimations of
line counts for HTML files generated in multi-file mode.

\date{2006 May 20}

The specification of a class qualifier in the declaration
of |sectionSeparatorSquid::eof| ran afoul of the latest round
of \CPP/ fanaticism in \.{gcc} 4.1.0.  This ``error'' (which was
not previously even a warning in \.{-Wall} mode) was ``fixed''
by removing the nugatory class name qualifier.

Release 3.2.

%%%%%%%%%    Add new entries before this line    %%%%%%%%%
\parskip=0pt plus1pt
\parindent=20pt
