Artifact 50cdb787c7cba3e2154dca63f1f7bb997bc925f717ddfc63e0165d76e2ca838c:

  • File html2mallard/README.md — part of check-in [ada19bd287] at 2021-03-26 12:13:35 on branch trunk — html2mallard update: support direct .md conversion, and http:// url params, doc updates. (user: mario size: 4718)

html2mallard / mkdocs-mallard

Extremely crude HTML to mallard help conversion. Specifically for output from mkdocs with RTD or Material theme.

It's a very basic regex extraction (→I'm looking forward to your letters!) and filtering process. It only retains some structural elements (headlines, paragraphs, tables, lists, notes). It doesn't even attempt to gather any topic relations or structure from the navigation list.

  • Really just intended for one-time/initial conversion.
  • Requires some editing to get pages to validate. (Though they probably "work" in yelp as is).
  • Links and image references certainly require manual cleanup. Nested lists or tables are likely to cause issues.
  • API docs are the least convertible (only tested with mkdocstrings; the source dump is omitted, and there's obviously no syntax colorization in yelp; alternatively try mkgendocs).
  • Primarily designed for mkdocs' HTML output. But also contains some cleanup rules for fossil wiki pages (with github skin), and yelp-build's HTML.
  • Conversion doesn't work well for sphinx output (not consistent enough), nor GitHub wiki pages.
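
To illustrate the kind of regex extraction and filtering described above, here is a minimal sketch. The rule list, `rewrite()` and `to_page()` are made up for illustration and are not the actual html2mallard rules, which are more extensive. It only assumes the standard Mallard namespace and the `<title>`, `<p>`, `<list>` and `<item>` elements.

```python
import re

# Hypothetical rewrite rules (pattern, replacement), applied in order.
# These are NOT the rules shipped with html2mallard, just the general idea.
RULES = [
    (r"<h1(?:\s[^>]*)?>(.*?)</h1>", r"<title>\1</title>"),       # headline -> title
    (r"<p(?:\s[^>]*)?>", "<p>"),                                 # drop attributes on <p>
    (r"<[uo]l(?:\s[^>]*)?>", "<list>"),                          # ul/ol -> list (numbering is lost)
    (r"</[uo]l>", "</list>"),
    (r"<li(?:\s[^>]*)?>(.*?)</li>", r"<item><p>\1</p></item>"),  # list entries
    (r"</?(?:div|span)(?:\s[^>]*)?>", ""),                       # remove non-structural wrappers
]

def rewrite(html):
    """Apply every rule in sequence; DOTALL lets '.' span line breaks."""
    for pattern, repl in RULES:
        html = re.sub(pattern, repl, html, flags=re.DOTALL)
    return html

def to_page(html, page_id):
    """Wrap the rewritten body in a Mallard <page> element."""
    body = rewrite(html).strip()
    return ('<page xmlns="http://projectmallard.org/1.0/" type="topic" id="%s">\n%s\n</page>'
            % (page_id, body))
```

Nested lists already defeat the non-greedy `<li>` rule in this sketch, which hints at why nested structures are listed as problematic above.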

html2mallard

Simple command line tool to convert a single .html file:

html2mallard site/index.html > help/index.page

Add a -d/--debug flag after the filename for details on the shortening process.

html2mallard in.html --debug | xmllint - --recover > out.page

Piping the output through xmllint --recover fixes some unmatched tags.
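
As a quick check after conversion, a page can be tested for XML well-formedness from Python. This is only a sketch using the standard library; `wellformed_error()` is a hypothetical helper, not part of html2mallard. It checks that tags match, not that the page validates against the Mallard schema.

```python
import xml.etree.ElementTree as ET

def wellformed_error(page_xml):
    """Return None if the text parses as XML, else the parser's error message."""
    try:
        ET.fromstring(page_xml)
    except ET.ParseError as err:
        return str(err)
    return None
```

A page that returns a message here is a candidate for the xmllint --recover pass or for manual editing.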

Now also supports http:// urls for conversion:

html2mallard http://wiki/index.html > index.page

And directly converting from markdown:

html2mallard index.md > index.page

API

There's basically just one main function in html2mallard:

import html2mallard
page = html2mallard.convert(html_file_content, fn)

The filename parameter is only used to deduce the id and/or title. As a convenience there is also page = html2mallard.convert_file(fn), which automatically invokes markdown conversion when given a file with such an extension, and can also resolve a URL passed as parameter.
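
For converting several files in one go, a small wrapper along these lines might do. `convert_dir()` is a hypothetical helper, not part of html2mallard. The converter is passed in as a callable, and the sketch assumes it returns the page as a string, which the description above suggests but does not state.

```python
from pathlib import Path

def convert_dir(src_dir, out_dir, convert_file):
    """Convert every *.html directly inside src_dir to out_dir/<name>.page.

    convert_file is any callable that takes a filename and returns the page
    text (assumed to be a str), for example html2mallard.convert_file.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(src_dir).glob("*.html")):
        target = out / (src.stem + ".page")
        target.write_text(convert_file(str(src)), encoding="utf-8")
        written.append(target.name)
    return written
```

With html2mallard installed, the call would presumably look like convert_dir("site", "help", html2mallard.convert_file).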

mkdocs-mallard

Converts a list of mkdocs output files to *.page files.

mkdocs-mallard

Requires an extra mallard_dir in the mkdocs.yml config:

site_name: logfmt1
docs_dir: docs
site_dir: html
mallard_dir: mallard
use_directory_urls: false
nav:
  - Intro: index.md
theme:
  name: readthedocs
  highlightjs: false
repo_url: https://...
markdown_extensions:
  - admonition
  - codehilite
  - attr_list
  - def_list
  - tables
  - markdown.extensions.codehilite:
      guess_lang: true
plugins:
  - mkdocstrings

Also depends on use_directory_urls: false, since the script only glob()s one level of *.html files.
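
The effect of a single-level glob can be sketched as follows. Both functions are hypothetical helpers for inspecting a site_dir and are not taken from the mkdocs-mallard script. With use_directory_urls: true, mkdocs writes about.md to about/index.html, which a one-level *.html glob never sees.

```python
from pathlib import Path

def top_level_pages(site_dir):
    """List the *.html files a one-level glob would find."""
    return sorted(p.name for p in Path(site_dir).glob("*.html"))

def nested_pages(site_dir):
    """List *.html files in subdirectories, which a one-level glob misses."""
    root = Path(site_dir)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*.html")
        if p.parent != root
    )
```

If nested_pages() reports anything besides theme assets, the site was probably built with directory URLs enabled.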

Nav links

Ensure the index.page contains a section like:

<section id="nav" style="2column">
 <subtitle>Topics</subtitle>
</section>

But not the recursive self-reference <link type="guide" xref="index#nav"/>.
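
Both conditions can be checked with a few lines of Python. This is a sketch with a hypothetical `check_index_nav()` helper. It compares local tag names, so it works whether or not the page declares the Mallard namespace.

```python
import xml.etree.ElementTree as ET

def _local(tag):
    """Strip an '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]

def check_index_nav(page_xml):
    """Return (has_nav_section, has_self_reference) for an index.page text."""
    root = ET.fromstring(page_xml)
    has_nav = any(
        _local(el.tag) == "section" and el.get("id") == "nav"
        for el in root.iter()
    )
    self_ref = any(
        _local(el.tag) == "link"
        and el.get("type") == "guide"
        and el.get("xref") == "index#nav"
        for el in root.iter()
    )
    return has_nav, self_ref
```

The wanted result for index.page is (True, False).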

Adaptation

The first two rewrite rules likely require changes for other HTML sources or templates. Specifically, "^.+?</nav>" is meant to strip the initial boilerplate; for other templates it might need to be expanded (either in the GENERAL HTML collection or in a new rewrite collection).
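
What the "^.+?</nav>" pattern does can be tried in isolation. The sketch below assumes the pattern is applied with re.DOTALL so that "." also matches line breaks; which flags the tool actually uses is not stated here. `strip_leading_boilerplate()` is a hypothetical name.

```python
import re

def strip_leading_boilerplate(html):
    """Remove everything from the start of the text up to the first </nav>."""
    # The non-greedy ".+?" stops at the first closing </nav>;
    # re.DOTALL (an assumption) lets "." cross line breaks.
    return re.sub(r"^.+?</nav>", "", html, count=1, flags=re.DOTALL)
```

A page without any </nav> comes back unchanged, which is the case where the rule would need adapting for a different template.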

from project import meta

meta info
depends -
compat Python ≥3.6, mkdocs 1.x
compliancy !pep8, mallard, manpage, !doap, !xdg
system usage -
paths -
testing -
docs -
activity abandoned
state alpha
support -
contrib -
announce -