## html2mallard / mkdocs-mallard
Extremely crude HTML to [mallard help](http://projectmallard.org/)
conversion. Specifically for output from [mkdocs](https://www.mkdocs.org/)
with RTD or Material theme.
It's a very basic regex extraction (*I'm looking forward to your letters!*)
and filtering process. It only retains some structural elements (headlines,
paragraphs, tables, lists, notes). Doesn't even attempt to gather any topic
relation/structure from the navigation list.
* Really just intended for one-time/initial conversion.
* Requires some editing to get pages to validate. (Though they probably
"work" in yelp as is).
* Links and image references certainly require manual cleanup. Nested
lists or tables are likely to cause issues.
* API docs are the least convertible (only tested with mkdocstrings; the
source dump is omitted, and there's obviously no syntax colorization in yelp;
alternatively try [mkgendocs](https://pypi.org/project/mkgendocs/)).
* Primarily designed for mkdocs' HTML output. But also contains some
cleanup rules for [fossil](https://fossil-scm.org/) wiki pages (with
[github](https://fossil.include-once.org/fossil-skins/wiki/GitHub) skin),
and yelp-build's HTML.
* Conversion doesn't work well for sphinx output (not consistent enough),
nor GitHub wiki pages.
## html2mallard
Simple command line tool to convert a single .html file:

    html2mallard site/index.html > help/index.page

Add a `-d`/`--debug` flag after the filename for details on the shortening
process:

    html2mallard in.html --debug | xmllint - --recover > out.page

Here [xmllint](http://xmlsoft.org/xmllint.html) is used to fix some unmatched tags.

It now also supports http:// URLs for conversion:

    html2mallard http://wiki/index.html > index.page

And converting directly from markdown:

    html2mallard index.md > index.page

## API
There's basically just one main function in html2mallard:

    import html2mallard
    page = html2mallard.convert(html_file_content, fn)

The filename parameter `fn` is only used to deduce the page id and/or title.
As a convenience there is also `page = html2mallard.convert_file(fn)`,
which automatically invokes `markdown` conversion for files with such an
extension, and can even resolve a URL passed as parameter.
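
A small batch wrapper around that function could look roughly like this. It is only a sketch: `convert_all` and its parameters are hypothetical names, and the converter is passed in as a callable (for instance `html2mallard.convert`), so nothing beyond the `convert(html, fn)` call described here is assumed about the library. Passing the bare filename rather than the full path is likewise an assumption.

```python
from pathlib import Path


def convert_all(src_dir, out_dir, convert):
    """Convert every *.html file directly in src_dir to a .page file in out_dir.

    `convert` is any callable taking (html_text, filename) and returning the
    page text, e.g. html2mallard.convert. Hypothetical helper, not part of
    the html2mallard package.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for html_path in sorted(Path(src_dir).glob("*.html")):
        html = html_path.read_text(encoding="utf-8")
        # Assumption: the bare filename is enough to derive id/title from.
        page = convert(html, html_path.name)
        target = out / (html_path.stem + ".page")
        target.write_text(page, encoding="utf-8")
        written.append(target.name)
    return written
```

With the package installed, the intended call would be something like `convert_all("html", "mallard", html2mallard.convert)`. Injecting the converter also makes it easy to try the loop with a stub before running it on real pages.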
## mkdocs-mallard
Converts a list of mkdocs output files to `*.page` files.

    mkdocs-mallard

Requires an extra **`mallard_dir`** in the `mkdocs.yml` config:

    site_name: logfmt1
    docs_dir: docs
    site_dir: html
    mallard_dir: mallard
    use_directory_urls: false
    nav:
      - Intro: index.md
    theme:
      name: readthedocs
      highlightjs: false
    repo_url: https://...
    markdown_extensions:
      - admonition
      - codehilite
      - attr_list
      - def_list
      - tables
      - markdown.extensions.codehilite:
          guess_lang: true
    plugins:
      - mkdocstrings

Also depends on `use_directory_urls: false`, since the script only `glob()`s
one level of `*.html` files.
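
To see why the flat layout matters, here is a simplified model of the two mkdocs output layouts and what a one-level glob finds in each. `build_layout` and `one_level_html` are made-up helpers for illustration, not code from mkdocs-mallard; the only thing taken as given is that `use_directory_urls: true` writes `usage.md` to `usage/index.html` instead of `usage.html`.

```python
from pathlib import Path


def build_layout(root, pages, directory_urls):
    """Create placeholder files the way mkdocs lays out pages (simplified)."""
    root = Path(root)
    for name in pages:
        if directory_urls and name != "index":
            # use_directory_urls: true -> <site_dir>/usage/index.html
            target = root / name / "index.html"
        else:
            # use_directory_urls: false -> <site_dir>/usage.html
            target = root / f"{name}.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("<html></html>", encoding="utf-8")
    return root


def one_level_html(site_dir):
    """Return names of *.html files directly inside site_dir (no recursion)."""
    return sorted(p.name for p in Path(site_dir).glob("*.html"))
```

In the nested layout only the top-level `index.html` is visible to a non-recursive glob, so every other page would silently be skipped.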
## Nav links
Ensure the `index.page` contains a section like:

    <section id="nav" style="2column">
        <subtitle>Topics</subtitle>
    </section>

But not the recursive self-reference `<link type="guide" xref="index#nav"/>`.
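
Both conditions can be checked mechanically before opening the result in yelp. The following is a hedged sketch using only the standard library; `check_index_nav` is a hypothetical helper, and it matches elements by local name so it works whether or not the page declares the Mallard namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def check_index_nav(page_xml, page_id="index", section_id="nav"):
    """Return (has_nav_section, has_self_reference) for a page given as text."""
    root = ET.fromstring(page_xml)
    self_ref = f"{page_id}#{section_id}"
    has_section = False
    has_self_link = False
    for el in root.iter():
        name = local_name(el.tag)
        if name == "section" and el.get("id") == section_id:
            has_section = True
        elif (name == "link" and el.get("type") == "guide"
              and el.get("xref") == self_ref):
            # The recursive self-reference that should not be present.
            has_self_link = True
    return has_section, has_self_link
```

A well-formed `index.page` should yield `(True, False)`. Note that the page has to parse as XML first, so running it through `xmllint --recover` beforehand may be necessary.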
## Adaption
The first two `rewrite` rules likely require changes for other HTML sources
or templates. Specifically, `"^.+?</nav>"` is meant to strip the initial
boilerplate, and might need expansion if it doesn't. (Either in the
`GENERAL HTML` collection or in a new rewrite collection.)
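
As an illustration of what such a rule does, the sketch below applies the `"^.+?</nav>"` pattern with Python's `re` module. The rule list format, the `apply_rules` helper, the footer rule, and the use of `re.DOTALL` are all assumptions made for the example; the actual rule collections in html2mallard may be organized differently.

```python
import re

# Hypothetical rule list in (pattern, replacement) form; the real tool's
# rule format and regex flags may differ.
EXAMPLE_RULES = [
    (r"^.+?</nav>", ""),             # drop everything up to the first </nav>
    (r"<footer\b.*?</footer>", ""),  # illustrative extra rule for a page footer
]


def apply_rules(html, rules=EXAMPLE_RULES):
    """Apply each (pattern, replacement) pair in order with DOTALL matching."""
    for pattern, replacement in rules:
        # DOTALL lets '.' span newlines; '^' still only matches at the start.
        html = re.sub(pattern, replacement, html, flags=re.DOTALL)
    return html
```

If a template has no `<nav>` element, the first pattern simply never matches and the header boilerplate stays in, which is the case where a template-specific rule would be needed.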
### from `project` import `meta`
| meta | info |
|:---------------|:----------------------------------------------------------------|
| depends | - |
| compat | Python ≥3.6, mkdocs 1.x |
| compliancy | !pep8, mallard, manpage, !doap, !xdg |
| system usage | - |
| paths | - |
| testing | - |
| docs | - |
| activity | abandoned |
| state | alpha |
| support | - |
| contrib | - |
| announce | - |