Subject: htm1-pp: htm1 perl processor
From: Guido Draheim

[htm1-pp:] "{courier:(color=#004040){b:htm1{small:-}{i:pp}} $()}"
[htm1:] "{code:htm{/big:1}}"
[_htm1:] "{code:.htm{/big:1}}"
[_html:] "{code:.html}"
[v:] "{nobr:{code:`$(*)'}}"

h2: {htm1-pp: {/sup:1.1}}
content-table-#:

h3-: Introduction
The {htm1-pp:} processor expands macros found in {_htm1:}-files and produces a corresponding {_html:}-file (it accepts {code:.{small:HTM1}} as well). The possible macros are not restricted to the predefined ones - you can even define your own macros in the actual {_htm1:}-sourcefile.

The syntax is modelled so that plain rfc822 internet-mail messages can be fed directly to {htm1-pp:}. The predefined functionality in {htm1-pp:} tries to arrive at a nice result without many changes to that e-mail on your part.

The {htm1-pp:} perl-based processor has served me very well in the past months (so I hope to have got rid of the major bugs). At the time of writing, I am always using {htm1-pp:} to create html-documents - even this html-document has an
href: htm1-pp.htm1 accompanying {_htm1:}-file ({href: index.htm1})
that it was built from.

h3-: Invocation
Use {v: htm1-pp {i:index.htm1}} to produce a corresponding {v:{i:index.html}} containing html-code. Messages on the screen will tell you about the predefined macro-files and perl-files loaded in advance to process your {v:{i:index.htm1}}. If you often use a Makefile, you may want to include the following:
-: {some-code:
.SUFFIXES: .html {_htm1:} .c .h
{_htm1:}.html:
	htm1-pp $<
all: *.html
}
so you can just call {v:make} to convert your {_htm1:}-files into their corresponding {_html:}-files.

h3-: Syntax
A {htm1:}-tag always ends with a single colon, optionally followed by subtag arguments in round parentheses.
-: The start of a tag is marked by {ul:
*: a linebreak. The text to be enclosed with html-tags in your output-file then stretches up to the end of the line.
(at least for the body of your {htm1:}-text - more on this later).
-:
===:(top) {td-code:
b: this text is in bold face
and this is not
}
---:
b: this text is in bold face
and this is not
*: an opening (curly) brace. The text to be enclosed stretches to the next matching closing brace.
-:
===:(top) {td-code: {b\: all this text will be set in bold face } but not this }
---: {b: all this text will be set in bold face } but not this
}
{-#-:ul}
You are free to embed arbitrary html-tags as long as the first opening angle starts at the beginning of a line. The actual source code for the table above looks like this:
-: {some-code:
===:(top) {td-code\: {b\\: all this text will be set in bold face } but not this }
---: {b\: all this text will be set in bold face } but not this
}
Note how the {v:b:}-tag in the first column has been masked with a backslash.

h3-: Predefined macros
Here should be a list full of pointers to pages telling you about the predefined macros. As you can see, most of these aren't working. Still, you are free to look at the source code that is contained in: {ul:
*href: build/bin/htm1-pp bin/htm1-pp
contains the preprocessor itself along with some basic macro tags.
*href: build/etc/htm1-pp.ht1 etc/htm1-pp.ht1
contains predefined htm1-macro definitions. These are not functions and may just as well be defined in your actual {htm1:}-sources, exactly as seen there. Take this file as a good example of the possibilities for defining (mostly) simple {htm1:}-tags.
*href: build/etc/htm1-pp.pl1 etc/htm1-pp.pl1
contains predefined perl-macro definitions. These functions can do a lot for you. Notably the {v:content-table:}-macro - as used for this document - is often very helpful.
}
Anyway, the files above (along with the documentation) form the distribution. There's nothing more to it, just small and powerful. Check out these documents... {ul: {li: ... }}

h3-: Defining macros
The macro names you can choose must match the regular expressions
-: {code: '[\\w\\+\\-][\\w\\+\\-\\#]+' }
for about any tag or
-: {code: '[\\w\\+\\-\\=\\*][\\w\\+\\-\\#\\=]+' }
for line-only tags. In plain words: you can choose any alphanumeric name plus '+' and '-'. Line-only tags may also contain '=' and they may also start with one '*'. Except as the first character, the names may also contain '#'.

h4: html-tag1 & html-tag2
Usually html-tags must be written with an opening html-tag, the enclosed text, and a closing html-tag (written like ""). You can declare those with {v:html-tag2:} which is followed by the tag name and its arguments. Some html-tags are not used with a closing tag, e.g. the horizontal ruler
, or the -tag. That's what {v:html-tag1:} is for. Most of the simple html-tags should be predefined, but sometimes you get warnings about {v:unknown tag {blockquote\:...}}. Then simply do {v:html-def2: blockquote "blockquote"} and recompile.

If you try to define a tag that has already been declared in a predefinition-file, you will be warned about {v:redefining blockquote}. You can get rid of this warning too by prefixing the defining word with an {v:x-*}. That is, {v:x-html-def2: blockquote "blockquote"} will not warn you about a redefinition, and you can still use the tag.

Note that {v:html-tag2:} has some intelligence, so that a blue-tag like {v:html-tag2: blue "font color=blue"} will be expanded to {v:...} (which you would expect).

h4: html-def
{v:html-def} can do a lot more for you. Unlike with {v:html-tag2:}, you must not leave out the (sharp) angles used for the html-tags. When such a html-def is used, the {htm1:}-tag is replaced directly with the text. Yet, there's some magic in html-def too. When expanding the definition, you can access the values of variables. They should look like {v:\$(name)}. Two special variables are {v:\$(*)} for the enclosed text, and {v:\$(.)} for the current sub-tag arguments on {htm1:}-tag use. ({v:\$()} is a compatibility synonym for {v:\$(*)}).

h4: variables
What are variables? Well actually, the name is a little misleading. It is simply a storage of the values of the last expansions. An explicit variable usage is (currently) called {v:put:}. Just try {v:{put\:h2}} to get the latest section-header as a string. A real benefit comes from the {v:x-*} tags - which would normally just suppress a warning about unknown tags being tried. In the header, unknown tags would normally expand to html meta-tags. But at the same time, their value is stored for later use. So in fact, I am often using {v:{put\:title}} and {v:{put\:From}} in the bottom line. And I use {v:{x-bgcolor\: #FFFFE0}} in the header to get the specified bgcolor.
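
The storage of last expansions and the variable lookup described above can be pictured with a small sketch. It is written in Python purely for illustration and is not the Perl code of htm1-pp; the names last_values, remember, put and expand_definition are made up for this example, and the real processor may well behave differently in the details (for instance for names that were never stored).

```python
import re

# Hypothetical model: the text a tag was last used with is remembered by name.
last_values = {}


def remember(tag, text):
    """Store the text a tag was last expanded with and return it unchanged."""
    last_values[tag] = text
    return text


def put(tag):
    """Model of a put-tag: return the stored value, or '' if nothing is stored."""
    return last_values.get(tag, "")


def expand_definition(body, enclosed, subargs=""):
    """Fill in $(name) references in a definition body at the time of use."""
    def lookup(match):
        name = match.group(1)
        if name in ("*", ""):      # $(*) and its synonym $() give the enclosed text
            return enclosed
        if name == ".":            # $(.) gives the current sub-tag arguments
            return subargs
        return last_values.get(name, "")   # any other name: last stored value
    return re.sub(r"\$\(([^)]*)\)", lookup, body)


# A header line such as "From: Guido Draheim" would be remembered ...
remember("From", "Guido Draheim")
# ... and is looked up only when the definition is actually used.
print(expand_definition("<b>$(*)</b> by $(From)", "hello"))
# prints: <b>hello</b> by Guido Draheim
```

The point of the sketch is the timing: the reference is resolved when the definition is used, so it always sees the most recently stored value.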
Inside {v:html-def:} definitions you can access these variables using a {v:\$(From)} syntax that will be expanded later to the current value of the named variable.
-: {small: using {v:{put\:From}} in a {v:html-def:}-definition would be expanded {i:before} the definition is stored, so you would get the same value on each expansion of {v:html-def}'s right side. Mostly, this is not what you want.}

h4: functions
Functions are not defined in a ht1-file; instead you include them in a perl-inclusion file, a.k.a. pl1-file. The perl argument is simply the currently enclosed text ({v:\$(*)}). The current sub-tags are handily available in the %Args hash. And the last expansions of other tags are accessible in %Vars. You can add or delete other sub-tag arguments and get the whole slew of them with {v:&getArgs()}, such as in {v: return "".@_."";}. Just look into the predefined functions to learn some of the tricks.

h4: load order
On compiling your {_htm1:} source files into {_html:} files you will notice a bunch of files being pre-loaded for you. Here's the order:
html-def: ---get " get$(*)"
html-def: ---and " and$(*)"
===:
---get: {v:/usr/etc/htm1-pp.ht#}
---and: {v:/usr/etc/htm1-pp.pl#}
===:
---get: {v:/usr/html/etc/htm1-pp.ht#}
---and: {v:/usr/html/etc/htm1-pp.pl#}
===:
---get: {v:~/etc/htm1-pp.ht#}
---and: {v:~/etc/htm1-pp.pl#}
===:
---get: {v:~/.htm1-pp.ht#}
---and: {v:~/.htm1-pp.pl#}
===:
---get: {v:./etc/htm1-pp.ht#}
---and: {v:./etc/htm1-pp.pl#}
===:
---get: {v:./.htm1-pp.ht#}
---and: {v:./.htm1-pp.pl#}
and replace the '#'-number sign with "",1,2,3,4,5,6,7,8,9 in this order. So you may copy some files to your project directory (or the project's subdirectory {v:etc}) and have them included in order. No need to combine them into one. If you install {htm1-pp:} in your home-{v:bin} then you may want to put your personal favourites under {v:~/etc/htm1-pp.pl2} or {v:~/.htm1-pp.pl2} instead of using the "official" {v:.pl1}. Good luck.

h3-: Special Features
A distinction is made between the header of a file and its body - just like in an e-mail message, the header extends down to the first empty line, after which the body part starts. In the header, line-tags may actually span multiple lines - as long as the continuation lines {i:don't} start at the first column but have a little indent-space before them. What is more, unknown tags do not really produce a warning - they are transferred into meta-tags. So {v:Organization: Lucky Company} will expand to the corresponding html meta-tag.
x-html-tag2: ol
Other specials: {ul:(square)
*: an empty line by itself is treated as a paragraph end.
*: a line introduced with `> ' goes italic, as this is the usual way for mails to show quotes.
*: lines starting with '<' are treated as verbatim html-text.
*: the expansion order is: {ol:
*: one-line-tags (thereby skipping those with '#')
*: em'brace'd-tags
*: line-tags whose name contains '#' (they are skipped on the 1. pass)
}
and this order is repeated a number of times so that macro-tags that were introduced by expansion will get expanded too.
*: the {v:*:}-linetag is nice for list items ({v:li}).
*: you may want to use {v:content-table-#} that will list all your {v:{name\: named texts}}
*: quite a few of the predefined macros have a notion of fgcolor2 and bgcolor2 (as opposed to fgcolor and bgcolor values). That is a good way to limit the colors used in your documents {i:and} have them always the same in their appropriate positions. (remember, it's an error to have a 'colorful' web-page).
}
Have fun!

h3-: Changes of 1.1 over 1.0
{ul:
*: allow {v:/} in tag names. If a tag starts with {v:/}, just take it implicitly to be a {v:html-tag2}. If a tag ends with a {v:/} just take it to be a {v:html-tag1}. Don't warn about it.
*: there used to be a default to warn about unknown tags, but treat them as {v:html-tag2}. With the above, this is obsolete. And it was error-prone too, esp. if used as a line tag. So, the default for unknown line tags (in the body) will be to return the tag plus colon plus text. That is, the result is just the source code. {small: (unknown tags in the header will still expand to html-meta-tags)}
*: allow {v:=} in all tag names, not just line tags. That has to do with the implicit html-tags above: in those implicit html-tags {htm1-pp:} will replace underscores with blanks. So maybe you want to try {v:{/font_color=red\:some text}} to produce {/font_color=red:some text} in red.
*: introduce bracket-defs: a line starting with something like {v:[name:] } will be treated as a {v:html-def: name } line. Since the text-body is scanned for those lines {i:first}, you have the option to put these defs {i:after} their use. I hold this to be an intended feature.
}
Version 2 of {htm1-pp:} will move the brace'd-text to use bracket-tags {i:after} the enclosed text. This is just the first step in this direction. (and of course to make shifting the {htm1:}-source base easier).

h3-: Installation
The {href: htm1-pp.tgz distribution file} contains the files in their usual subdirectories {v:bin} {v:etc} and {v:doc}.
So installation is simply a matter of going to the desired prefix-directory (or just staying in your home-directory) and expanding the tar-file. In an xterm-shell you would do {v:tar xzvf {href:htm1-pp.tgz}}. It is very likely that you have to edit {v:bin/htm1-pp}'s first line to point to the {v:perl}-interpreter. Again, in a shell you can usually type {v:type -p perl} or {v:whereis perl} to get the path. That should be it. It may get a lot nastier on non-unix systems. I am currently running it under linux, hpux and solaris, installed under my home directory.

h3-: Bugs & Caveats & Things to watch out for
Bugs & caveats most often have to do with functions, since those are quite special. I recommend naming all functions with an additional non-alpha-numerical character in the name, so that you'll be warned. (well, quite a few function-tags {i:don't} follow this rule yet, I'll try hard to rename them... so expect changes) {ul:
*: if you use {v:html-def:} to create a new tag in your header, you can embed {htm1:}-macro-tags in there, so they will expand to their definitions. This is quite handy, but it may not do what you expect, esp. if some of the {htm1:}-tags are functions. The reason is that the embedded tags are expanded {i:before} the definition is stored, i.e. the functions are called, may return something (or just fail), and the later usage of your new tag will {i:not} call the function again. {b:[fixed]}
*: all (curly) braces must be symmetrical. If {htm1-pp:} encounters a brace-pair that is not a tag-call, it will mask this pair assuming it came from some embedded notation, especially C-code, C++-code, Java-code, etc. Sometimes a brace is not part of a pair, so be sure to mask it explicitly by using {v:\\\{} instead of just {v:\{}. {small: (look into the {_htm1:}-source of this file to see how I have written this paragraph...)}
*: writing text does sometimes use a colon after a word to introduce a second part of the sentence.
This may be seen as a line-tag if it happens to be at the left column in the text. Watch for messages saying "undefined tag ^...^" - they often point at this problem.
-: {small: (you could start each simple text line with a space, but actually this is a thing you can recognize easily in your source)}
-: {b:[fixed]} see the Changes section on the new interpretation of such line-tags
*: Sometimes you get strange error messages, or just something in (round) parentheses has gotten lost in the resulting text. This is often the case if you wrote something like {v:\{small:(oooohhh)\}}, where the word in parentheses is held to be a subtag-argument to {v:small:}. The solution is to put an additional space between the introducing {htm1:}-tag and the first one of the parentheses, i.e. {v: small: (oooh)}. Introducing spaces in the vicinity of everything looking quite like a {htm1:}-tag is always recommended.
}
{-#-:}

h3-: License
The package is Free Software in the sense of the {href:(nounder) www.OpenSource.org/licenses/lgpl-license.html Lesser GPL}, (the {href:(nounder) COPYING.LIB {v: GNU Library General Public License}}) - you can inform yourself about the meaning of free software, its benefits/drawbacks/liability and other {href:(nounder) www.OpenSource.org/licenses possible licenses} at {href: www.OpenSource.org}, or the Original - The {href: www.fsf.org Free Software Foundation}.
{table:(b0 cs=0 cp=0 100%)
===:
---:(left colspan=2) - and remember: faithful in the matter, but use at your own risk - the sources tell you all
===:
---:(right colspan=2) {/sup: they are your safety, warranty, compatibility, portability, and much more}
}

h3-: References
{ul:
*href: http://www.andreasen.org/smart/ SmartHTML html-processor
has served as the foundation, and about 15% of that package still remains {small: (see {href: http://freshmeat.net/news/1999/08/28/935892469.html appindex} at {href: http://freshmeat.net/appindex/web/pre-processors.html freshmeat/web} ) }
*href: www.perl.org Perl Scripting Language
is the underlying scripting language - both for the {htm1-pp:} preprocessor and for the user defined extensions
}

h3-: Comments
Comments, bugfixes, patches or extensions - everything's welcome. {ul:
*: {i:Erwin S. Andreasen} (author of SmartHTML) wrote:
-: {code:
Cool, I'm glad SmartHTML proved useful beyond its original intent :)
I like some of the ideas, e.g. using the META GENERATOR tag, the argument abbreviation (i.e. change an argument of "100%" automatically into width=100% since that's where it's used most), definition of global "variables" (e.g. $(fgcolor)) so you can easier change your color scheme.
}
}
hr:
{box-30: {nobr: (C) Aug'99-{file-date:}} {nobr:{put:From}} | {put:h2} | {e-mail:guidod@gmx.de?subject=htm1-pp} }