GUI editor to tame mod_security rules

⌈⌋ ⎇ branch:  modseccfg


Check-in [ada19bd287]


Overview
Comment:html2mallard update: support direct .md conversion, and http:// url params, doc updates.
SHA3-256: ada19bd287ce6efb72c815cc299114f72c0cc82b998aeb86d0e4273b09e5f891
User & Date: mario 2021-03-26 12:13:35
Context
2021-03-29
21:23
Performance fix for pyrewrite in range() check. check-in: 950ea0eb0c user: mario tags: trunk
2021-03-26
12:13
html2mallard update: support direct .md conversion, and http:// url params, doc updates. check-in: ada19bd287 user: mario tags: trunk
2021-03-06
21:46
Requires msc_pyparser >= 1.1 (for the CRS 3.2 line continuation issue in 901.conf) check-in: 6d5c87b143 user: mario tags: trunk
Changes

Changes to html2mallard/README.md.

Before (lines 17-46):

 * And API docs are least convertible (only tested mkdocstrings, source
   dump is omitted, and there's obviously no syntax colorization in yelp;
   alternatively try [mkgendocs](https://pypi.org/project/mkgendocs/)).
 * Primarily designed for mkdocs' HTML output, but also contains some
   cleanup rules for [fossil](https://fossil-scm.org/) wiki pages (with
   [github](https://fossil.include-once.org/fossil-skins/wiki/GitHub) skin),
   and yelp-build's HTML.
 * Conversion doesn't work well for sphinx output (not consistent enough).



## html2mallard

Simple command line tool to convert a single .html file:

    html2mallard site/index.html > help/index.page

Add a `-d`/`--debug` flag after the filename for details on the shortening
process.

    html2mallard in.html --debug | xmllint - --recover > out.page

With [xmllint](http://xmlsoft.org/xmllint.html) to fix some unmatched tags.






















## mkdocs-mallard

Converts a list of mkdocs output files to *.page files.

    mkdocs-mallard








After (lines 17-67):

 * And API docs are least convertible (only tested mkdocstrings, source
   dump is omitted, and there's obviously no syntax colorization in yelp;
   alternatively try [mkgendocs](https://pypi.org/project/mkgendocs/)).
 * Primarily designed for mkdocs' HTML output, but also contains some
   cleanup rules for [fossil](https://fossil-scm.org/) wiki pages (with
   [github](https://fossil.include-once.org/fossil-skins/wiki/GitHub) skin),
   and yelp-build's HTML.
 * Conversion doesn't work well for sphinx output (not consistent enough),
   nor GitHub wiki pages.


## html2mallard

Simple command line tool to convert a single .html file:

    html2mallard site/index.html > help/index.page

Add a `-d`/`--debug` flag after the filename for details on the shortening
process.

    html2mallard in.html --debug | xmllint - --recover > out.page

With [xmllint](http://xmlsoft.org/xmllint.html) to fix some unmatched tags.

Now also supports http:// urls for conversion:

    html2mallard http://wiki/index.html > index.page

And directly converting from markdown:

    html2mallard index.md > index.page

## API

There's basically just one main function in html2mallard:

    import html2mallard
    page = html2mallard.convert(html_file_content, fn)

The filename parameter is only used to derive the page id and, as a
fallback, the title. As a convenience there is also
`page = html2mallard.convert_file(fn)`, which reads the file itself, runs
the `markdown` conversion first if the name ends in `.md`, and fetches the
page if given an http:// or https:// URL.
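
As a rough illustration of how the filename feeds into the page, the sketch below repeats the two substitutions that `convert()` applies to `fn` in html2mallard.py. The helper names `page_id` and `fallback_title` are made up for this example and are not part of the module.

```python
import re

def page_id(fn):
    """Derive the mallard page id: drop directory and extension, squash non-word runs."""
    base = re.sub(r"^.+/|\.\w+$", "", fn)
    return re.sub(r"\W+", "_", base).lower()

def fallback_title(fn):
    """Title used only when no title text could be extracted from the input."""
    return re.sub(r"^.+/|\.\w+$", "", fn).title()

print(page_id("site/Getting-Started.html"))   # getting_started
print(fallback_title("site/index.html"))      # Index
```

Since the id is all that survives from the path, two inputs with the same base name in different directories would end up with the same page id.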


## mkdocs-mallard

Converts a list of mkdocs output files to *.page files.

    mkdocs-mallard

Before (lines 67-80):

          guess_lang: true
    plugins:
      - mkdocstrings

Also depends on `use_directory_urls: false`, since the script only `glob()`s
one level of `*.html` files.













## Adaption

The first two `rewrite` rules likely require changes for other HTML sources
or templates. Specifically `"^.+?</nav>"` should strip the initial
boilerplate, else might need expansion. (Either in the `GENERAL HTML` or
a new rewrite collection.)







After (lines 88-112):

          guess_lang: true
    plugins:
      - mkdocstrings

Also depends on `use_directory_urls: false`, since the script only `glob()`s
one level of `*.html` files.
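
To see why nested output would be missed, here is a small sketch of a one-level glob over a throwaway directory. The helper `top_level_html` and the file names are illustrative assumptions; the actual `glob()` call in `mkdocs-mallard` is not shown in this diff and may differ in detail.

```python
import glob
import os
import tempfile

def top_level_html(srcdir):
    """List *.html files directly inside srcdir, without descending into subfolders."""
    return sorted(os.path.basename(p) for p in glob.glob(os.path.join(srcdir, "*.html")))

with tempfile.TemporaryDirectory() as site:
    # usage/index.html is the layout mkdocs writes with directory URLs enabled
    os.makedirs(os.path.join(site, "usage"))
    for rel in ("index.html", "usage.html", os.path.join("usage", "index.html")):
        with open(os.path.join(site, rel), "w", encoding="utf-8") as f:
            f.write("<html></html>")
    found = top_level_html(site)

print(found)  # the nested usage/index.html is not picked up
```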


## Nav links

Ensure the `index.page` contains a section like:

    <section id="nav" style="2column">
     <subtitle>Topics</subtitle>
    </section>

But not the recursive self-reference `<link type="guide" xref="index#nav"/>`.
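
The same rule can be sketched in a few lines, modelled on the checks that `convert()` makes on the page id. The function names `guide_links` and `with_nav_section` are hypothetical and chosen for this illustration only.

```python
NAV_LINK = '<link type="guide" xref="index#nav"/>\n'
NAV_SECTION = '<section id="nav">\n  <!--<title>Topics</title>-->\n</section>\n'

def guide_links(page_id, links=""):
    """Every page except the index links back to the index's nav section."""
    return links if page_id == "index" else NAV_LINK + links

def with_nav_section(page_id, body):
    """Only the index page gets the nav section prepended to its body."""
    return NAV_SECTION + body if page_id == "index" else body

print(guide_links("usage"))
print(with_nav_section("index", "<p>Welcome</p>"))
```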


## Adaption

The first two `rewrite` rules likely require changes for other HTML sources
or templates. Specifically `"^.+?</nav>"` should strip the initial
boilerplate, else might need expansion. (Either in the `GENERAL HTML` or
a new rewrite collection.)
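
A minimal sketch of what that first rule does, assuming the replacement is an empty string (the rule's right-hand side is not shown in this diff) and using the same regex flags that `convert()` passes to `re.sub`. The sample HTML and the helper name `strip_boilerplate` are invented.

```python
import re

FLAGS = re.X | re.M | re.S | re.I

def strip_boilerplate(html):
    """Drop everything from a line start up to and including the first </nav>."""
    return re.sub(r"^.+?</nav>", "", html, flags=FLAGS)

sample = "<html>\n<head><title>T</title></head>\n<body><nav>menu</nav>\n<h1>Page</h1>\n<p>Text</p>"
stripped = strip_boilerplate(sample)
print(repr(stripped))
```

A source without any `</nav>` element passes through unchanged, which is the case where an extra rule would be needed.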

Changes to html2mallard/html2mallard.py.

Before (lines 1-17):

#!/usr/bin/env python3
# api: cli
# encoding: utf-8
# type: transform
# title: HTML to mallard
# description: convert mkdocs´ html output to mallard/yelp xml
# category: documentation
# keywords: mkdocs mallard
# version: 0.2.0
# depends: python (>= 3.6), python:PyYAML (>= 5.0)
# license: Public Domain
# url: https://fossil.include-once.org/modseccfg/wiki/html2mallard
# 
# Poor transformation approach, mostly salvaging some HTML structures
# and reshuffling document body into mallard <page> with allowed
# inline markup.
#








After (lines 1-17):

#!/usr/bin/env python3
# api: cli
# encoding: utf-8
# type: transform
# title: HTML to mallard
# description: convert mkdocs´ html output to mallard/yelp xml
# category: documentation
# keywords: mkdocs mallard
# version: 0.3.0
# depends: python (>= 3.6), python:PyYAML (>= 5.0), python:markdown
# license: Public Domain
# url: https://fossil.include-once.org/modseccfg/wiki/html2mallard
# 
# Poor transformation approach, mostly salvaging some HTML structures
# and reshuffling document body into mallard <page> with allowed
# inline markup.
#
Before (lines 32-63):


# output
template = dedent("""
    <page xmlns="http://projectmallard.org/1.0/"
     type="guide" id="{id}">

    <info>
        <link type="guide" xref="index#nav"/>
    {links}
        <desc>{desc}</desc>
        <?http header="X-Generator: html2mallard" ?>
    </info>

    <title>{title}</title>

    {body}

    </page>
""").lstrip()

# regex all the way
extract = {
    # meta info
    "mkdocs_page_name = \"(.*?)\";": "title",
    "<title>(.+?)</title>": "title",

    '<meta name="description" content="(.+?)"[^>]*>': "desc",
    '<a class="reference internal" href="(\w+).html">.+?</a>': "links",
    '<a class="trail" href="(\w+).html(#.+?)?" title=".+?">': "links",
    # flags
    '(<.+>)': "is_html",
    '(mkdocs)': "is_mkdocs",
    'data-target="[#.]navbar-(collapse)"': "is_material",







After (lines 32-63):


# output
template = dedent("""
    <page xmlns="http://projectmallard.org/1.0/"
     type="guide" id="{id}">

    <info>

        {links}
        <desc>{desc}</desc>
        <?http header="X-Generator: html2mallard" ?>
    </info>

    <title>{title}</title>

    {body}

    </page>
""").strip()

# regex all the way
extract = {
    # meta info
    "mkdocs_page_name = \"(.*?)\";": "title",
    "<title>(?:\w+:\s)?(.+?)</title>": "title",
    "<(?:h1|h2)>([^<]+)</(?:h1|h2)>": "title",
    '<meta name="description" content="(.+?)"[^>]*>': "desc",
    '<a class="reference internal" href="(\w+).html">.+?</a>': "links",
    '<a class="trail" href="(\w+).html(#.+?)?" title=".+?">': "links",
    # flags
    '(<.+>)': "is_html",
    '(mkdocs)': "is_mkdocs",
    'data-target="[#.]navbar-(collapse)"': "is_material",
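
The title patterns can be tried in isolation. The sketch below applies the updated `<title>` pattern and the same clean-up substitution that `convert()` uses on the captured text. The helper `extract_title` and the sample inputs are assumptions for illustration.

```python
import re

TITLE_RX = r"<title>(?:\w+:\s)?(.+?)</title>"

def extract_title(html):
    """Return the <title> text without a leading 'word: ' prefix, entities, or tags."""
    m = re.search(TITLE_RX, html)
    if not m:
        return ""
    return re.sub(r"&\w+;|<.+?>", "", m.group(1))

print(extract_title("<title>modseccfg: Usage</title>"))  # Usage
print(extract_title("<title>Usage</title>"))             # Usage
```

The optional prefix group only covers a single word followed by a colon and one whitespace character, so a multi-word prefix stays in the title.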
Before (lines 123-137):

        "<(?:h1|h2)[^>]*>(.+?)</(?:h1|h2)>": "<title>\\1</title>",
        "<(?:h3|h4)[^>]*>(.+?)</(?:h3|h4)>": "<subtitle>\\1</subtitle>",
        "<(?:h5|h6)[^>]*>(.+?)</(?:h5|h7)>": "<em>\\1</em>",
        "<strong>(.+?)</strong>": "<em style=\"strong\">\\1</em>",
        # lists
        "<ol[^>]*>(.+?)</ol>": "<steps>\\1</steps>",
        "<ul[^>]*>(.+?)</ul>": "<list>\\1</list>",
        "<li[^>]*>(.+?)</li>": "<item><p>\\1</p></item>",
        "<dl[^>]*>(.+?)</dl>": "<terms>\\1</terms>",
        "<dt[^>]*>(.+?)</dt>": "<item><title>\\1</title>",
        "<dd[^>]*>(.+?)</dd>": "<p>\\1</p></item>",
        # fix nested list   \1         \2                 \3                      \4    
        "(<(?:item|steps|terms)>)<p> ([^<]+(?<!\s)) \s* <(list|steps|terms)> \s* (.+?) </\\3>":
            "\\1<p>\\2</p>\n <\\3>\n<item><p>\\4 </\\3>\n</item>",
        # links







After (lines 123-137):

        "<(?:h1|h2)[^>]*>(.+?)</(?:h1|h2)>": "<title>\\1</title>",
        "<(?:h3|h4)[^>]*>(.+?)</(?:h3|h4)>": "<subtitle>\\1</subtitle>",
        "<(?:h5|h6)[^>]*>(.+?)</(?:h5|h7)>": "<em>\\1</em>",
        "<strong>(.+?)</strong>": "<em style=\"strong\">\\1</em>",
        # lists
        "<ol[^>]*>(.+?)</ol>": "<steps>\\1</steps>",
        "<ul[^>]*>(.+?)</ul>": "<list>\\1</list>",
        "<li\\b[^>]*>(.+?)</li>": "<item><p>\\1</p></item>",
        "<dl[^>]*>(.+?)</dl>": "<terms>\\1</terms>",
        "<dt[^>]*>(.+?)</dt>": "<item><title>\\1</title>",
        "<dd[^>]*>(.+?)</dd>": "<p>\\1</p></item>",
        # fix nested list   \1         \2                 \3                      \4    
        "(<(?:item|steps|terms)>)<p> ([^<]+(?<!\s)) \s* <(list|steps|terms)> \s* (.+?) </\\3>":
            "\\1<p>\\2</p>\n <\\3>\n<item><p>\\4 </\\3>\n</item>",
        # links
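
The one change in this hunk is `<li[^>]*>` becoming `<li\b[^>]*>`. The check-in comment does not say why, but a plausible reason is that the `<ul>` rule runs first and produces `<list>`, which the old pattern also accepts as an opening tag. The sketch below checks this on an invented input. It also tries the h5/h6 rule as written, whose closing alternation lists `h7` rather than `h6`.

```python
import re

FLAGS = re.X | re.M | re.S | re.I
ITEM = r"<item><p>\1</p></item>"
html = "<list><li>one</li></list>"   # state after the <ul> rule has already run

old = re.sub(r"<li[^>]*>(.+?)</li>", ITEM, html, flags=FLAGS)
new = re.sub(r"<li\b[^>]*>(.+?)</li>", ITEM, html, flags=FLAGS)
print(old)  # the <list> opener is consumed as if it were an <li>
print(new)  # only the real <li> is rewritten

# h5/h6 rule exactly as it appears in both versions of the file
h6 = re.sub(r"<(?:h5|h6)[^>]*>(.+?)</(?:h5|h7)>", r"<em>\1</em>", "<h6>Note</h6>", flags=FLAGS)
print(h6)   # left untouched, since </h6> never matches </(?:h5|h7)>
```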
Before (lines 171-225):

        # strip lone </section>, empty spans
        "(<section>.+?</section>)|</section>": "\\1",
        "(<span[^>]*></span>)": "",
        "(<p[^>]*><p[^>]*>)(.+?)(</p></p>)": "<p>\\2</p>",
    }
}











def convert(html, fn):






    # prepare snippets for .format kwargs
    kw = {
        "id": re.sub("\W+", "_", re.sub("^.+/|\.\w+$", "", fn)).lower(),
        "desc": "",
        "title": "",
        "body": "",
        "links": "",
    }
    for rx, name in extract.items():
        m = re.search(rx, html)
        if m and (not name in kw or not kw[name]):
            if name == "links":
                kw[name] = ["".join(row) for row in re.findall(rx, html)]
            else:
                kw[name] = re.sub("&\w+;|<.+?>", "", m.group(1))
    if kw["links"]:
        kw["links"] = indent("\n".join(f"<link type=\"guide\" xref=\"{id}\"/>" for id in kw["links"]), prefix="    ")




        
    # simplify/convert html
    for (group, flag), patterns in rewrite.items():
        if not flag in kw: # possibly skip rule group
            continue
        elif debug:
            sys.stderr.write(f"group: {group}\n")
        for rx, repl in patterns.items():
            l = len(html)
            html = re.sub(rx, repl, html, 0, re.X|re.M|re.S|re.I)
            if debug and l != len(html):
                sys.stderr.write(f"rewrite: {len(html) - l} bytes, pattern: ~{rx}~\n")
    kw["body"] = html
    
    # return converted


    return template.format(**kw)

# single file
def convert_file(fn):






    with open(fn, "r", encoding="utf-8") as f:   # → html2mallard "site/index.html"




        return convert(f.read(), fn)

# process directory
def mkdocs():
    import yaml
    src = open("mkdocs.yml", "r")   # → ought to be in current directory
    cfg = yaml.load(src, Loader=yaml.Loader)
    srcdir = cfg["site_dir"]







After (lines 171-255):

        # strip lone </section>, empty spans
        "(<section>.+?</section>)|</section>": "\\1",
        "(<span[^>]*></span>)": "",
        "(<p[^>]*><p[^>]*>)(.+?)(</p></p>)": "<p>\\2</p>",
    }
}

def convert(html, fn):
    """
        Convert HTML to mallard page document.
        
        Parameters
        ----------
        html : str
            HTML page source (`<html>...`)
        fn : str
            Original filename (`index.html`)
            
        Returns
        -------
        str
            Converted mallard xml .page source
    """

    # prepare snippets for .format kwargs
    kw = {
        "id": re.sub("\W+", "_", re.sub("^.+/|\.\w+$", "", fn)).lower(),
        "desc": "",
        "title": "",
        "body": "",
        "links": "",
    }
    for rx, name in extract.items():
        m = re.search(rx, html)
        if m and (not name in kw or not kw[name]):
            if name == "links":
                kw[name] = ["".join(row) for row in re.findall(rx, html)]
            else:
                kw[name] = re.sub("&\w+;|<.+?>", "", m.group(1))
    if kw["links"]:
        kw["links"] = indent("\n".join(f"<link type=\"guide\" xref=\"{id}\"/>" for id in kw["links"]), prefix="    ")
    if kw["id"] != "index":
        kw["links"] = """<link type="guide" xref="index#nav"/>\n""" + kw["links"]
    if not kw["title"]:
        kw["title"] = re.sub("^.+/|\.\w+$", "", fn).title()
        
    # simplify/convert html
    for (group, flag), patterns in rewrite.items():
        if not flag in kw: # possibly skip rule group
            continue
        elif debug:
            sys.stderr.write(f"group: {group}\n")
        for rx, repl in patterns.items():
            l = len(html)
            html = re.sub(rx, repl, html, 0, re.X|re.M|re.S|re.I)
            if debug and l != len(html):
                sys.stderr.write(f"rewrite: {len(html) - l} bytes, pattern: ~{rx}~\n")
    kw["body"] = html
    
    # return converted
    if kw["id"] == "index":
        kw["body"] = """<section id="nav">\n  <!--<title>Topics</title>-->\n</section>\n""" + kw["body"]
    return template.format(**kw)

# single file
def convert_file(fn):
    html = ""
    if re.match("https?://.+", fn):     # → html2mallard http://page.html
        import requests
        html = requests.get(fn).text
        fn = re.sub(".+/", "", fn)
    else:                               # → html2mallard "site/index.html"
        with open(fn, "r", encoding="utf-8") as f:
            html = f.read()
    if re.search("\.md$", fn):          # → html2mallard page.md
        import markdown
        html = markdown.markdown(html)
    return convert(html, fn)

# process directory
def mkdocs():
    import yaml
    src = open("mkdocs.yml", "r")   # → ought to be in current directory
    cfg = yaml.load(src, Loader=yaml.Loader)
    srcdir = cfg["site_dir"]
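
The input handling in the new `convert_file()` boils down to two checks applied in order. The sketch below mirrors them without touching the network or the `markdown` package. `classify_source` is a hypothetical helper, not something html2mallard provides.

```python
import re

def classify_source(fn):
    """Return (is_remote, is_markdown, effective_filename) in convert_file()'s order of checks."""
    is_remote = bool(re.match(r"https?://.+", fn))
    if is_remote:
        fn = re.sub(r".+/", "", fn)      # keep only the last path segment of the URL
    is_markdown = bool(re.search(r"\.md$", fn))
    return is_remote, is_markdown, fn

print(classify_source("http://wiki/index.html"))
print(classify_source("docs/index.md"))
print(classify_source("https://example.com/docs/page.md"))
```

Because the markdown check runs after the URL has been reduced to its last segment, a remote `.md` file is fetched and then converted. A URL ending in `/` leaves an empty filename, and with it an empty page id.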

Changes to html2mallard/man/html2mallard.1.

Before (lines 1-40):

.\" Automatically generated by Pandoc 2.5
.\"
.TH "html2mallard" "1" "" "from modseccfg" "Version 0.2"
.hy
.SH NAME
.PP
\f[B]html2mallard\f[R] \[em] rough help conversion tool
.SH SYNOPSIS
.PP
\f[B]html2mallard\f[R] \f[I]input.html\f[R] \f[B]>\f[R]

\f[I]output.page\f[R]
.PP
\f[B]html2mallard\f[R] \f[I]input.html\f[R]
[\f[B]\-D\f[R]|\f[B]\-d\f[R]|\f[B]\-\-debug\f[R]]
.PP
\f[B]html2mallard\f[R] \f[I]input.html\f[R] | \f[B]xmllint\f[R]
\f[I]\-\f[R] \f[B]\-\-recover\f[R]
.PP
\f[B]mkdocs\-mallard\f[R]
.SH DESCRIPTION
.PP
Provides a rough conversion between mkdocs\[ga] generated HTML and
mallard/yelp files.


.PP
Where \f[B]mkdocs\-mallard\f[R] reads a bunch of files from the
\f[I]site_dir\f[R] defined in \f[I]mkkdocs.yml\f[R], and writes them to
\f[I]mallard_dir\f[R].
.SH PURPOSE
.PP
Poor transformation approach, mostly salvaging some HTML structures and
reshuffling document body into mallard with allowed inline markup.

.IP \[bu] 2
XSLT might have been easier, but doesn\[cq]t work on most HTML.
.IP \[bu] 2
BS/lxml is way overkill for this task (hence zero such tools).
.IP \[bu] 2
Noone\[cq]s doing a markdown to ducktype/mallard converter either.










.SH SEE ALSO
.PP
\f[B]https://pypi.org/project/html2mallard/\f[R], \f[B]xmllint\f[R](1)


After (lines 1-52):

.\" Automatically generated by Pandoc 2.5
.\"
.TH "html2mallard" "1" "" "from modseccfg" "Version 0.3"
.hy
.SH NAME
.PP
\f[B]html2mallard\f[R] \[em] rough help conversion tool
.SH SYNOPSIS
.PP
\f[B]html2mallard\f[R] [ \f[I]input.html\f[R] | \f[I]input.md\f[R] |
\f[I]http://example.com/input.html\f[R] ] \f[B]>\f[R]
\f[I]output.page\f[R]
.PP
\f[B]html2mallard\f[R] \f[I]input.html\f[R]
[\f[B]\-D\f[R]|\f[B]\-d\f[R]|\f[B]\-\-debug\f[R]] | \f[B]xmllint\f[R]


\f[I]\-\f[R] \f[B]\-\-recover\f[R]
.PP
\f[B]mkdocs\-mallard\f[R]
.SH DESCRIPTION
.PP
Provides a rough conversion between mkdocs\[ga] generated HTML and
mallard/yelp files.
Also accepts \f[I]*.md\f[R] input files (converted per
markdown.markdown), or even remote *.html files (per requests).
.PP
Whereas \f[B]mkdocs\-mallard\f[R] reads a bunch of files from the
\f[I]site_dir\f[R] defined in \f[I]mkdocs.yml\f[R], and writes them to
\f[I]mallard_dir\f[R].
.SH PURPOSE
.PP
Poor transformation approach, mostly salvaging some HTML structures and
reshuffling document body into mallard <page> with allowed inline
markup.
.IP \[bu] 2
XSLT might have been easier, but doesn\[cq]t work on most HTML.
.IP \[bu] 2
BS/lxml is way overkill for this task (hence zero such tools).
.IP \[bu] 2
Noone\[cq]s doing a markdown to ducktype/mallard converter either.
.PP
Generated pages often require some post\-editing, such as removing
duplicate \f[B]<title>s\f[R] or empty \f[B]<section>s\f[R], or adding a
\f[B]<desc>\f[R].
Mallard help also requires an \f[I]index.page\f[R], ideally with a
\f[B]<section id=\[lq]nav\[rq]>\f[R], so other pages automatically link
there.
(The index.page itself should not carry the \f[B]<link
type=\[lq]guide\[rq] xref=\[lq]index#nav\[rq]/>\f[R], as that would be
recursive.)
.SH SEE ALSO
.PP
\f[B]https://pypi.org/project/html2mallard/\f[R], \f[B]xmllint\f[R](1)

Changes to html2mallard/man/html2mallard.md.

Before (lines 1-48):

% html2mallard(1) from modseccfg | Version 0.2


NAME
====

**html2mallard** — rough help conversion tool

SYNOPSIS
========

  **html2mallard** *input.html* **>** *output.page*

  **html2mallard** *input.html* \[**-D**|**-d**|**\-\-debug**\]

  **html2mallard** *input.html* \| **xmllint** *\-* **\-\-recover**

  **mkdocs-mallard**


DESCRIPTION
===========

Provides a rough conversion between mkdocs` generated HTML and
mallard/yelp files.


Where **mkdocs-mallard** reads a bunch of files from the *site_dir*
defined in *mkkdocs.yml*, and writes them to *mallard_dir*.


PURPOSE
=======

Poor transformation approach, mostly salvaging some HTML structures
and reshuffling document body into mallard <page> with allowed
inline markup.

 * XSLT might have been easier, but doesn't work on most HTML.
 * BS/lxml is way overkill for this task (hence zero such tools).
 * Noone's doing a markdown to ducktype/mallard converter either.








SEE ALSO
========

**https://pypi.org/project/html2mallard/**, **xmllint**(1)

After (lines 1-52):

% html2mallard(1) from modseccfg | Version 0.3


NAME
====

**html2mallard** — rough help conversion tool

SYNOPSIS
========

  **html2mallard** \[ *input.html* | *input.md* | *http://example.com/input.html* ] **>** *output.page*

  **html2mallard** *input.html* \[**-D**|**-d**|**\-\-debug**\] \| **xmllint** *\-* **\-\-recover**



  **mkdocs-mallard**


DESCRIPTION
===========

Provides a rough conversion between mkdocs` generated HTML and
mallard/yelp files. Also accepts *\*.md* input files (converted
per markdown.markdown), or even remote \*.html files (per requests).

Whereas **mkdocs-mallard** reads a bunch of files from the *site_dir*
defined in *mkdocs.yml*, and writes them to *mallard_dir*.


PURPOSE
=======

Poor transformation approach, mostly salvaging some HTML structures
and reshuffling document body into mallard \<page> with allowed
inline markup.

 * XSLT might have been easier, but doesn't work on most HTML.
 * BS/lxml is way overkill for this task (hence zero such tools).
 * Noone's doing a markdown to ducktype/mallard converter either.

Generated pages often require some post-editing, such as removing duplicate
**\<title>s** or empty **\<section>s**, or adding a **\<desc>**.  Mallard
help also requires an *index.page*, ideally with a **\<section id="nav">**,
so other pages automatically link there.  (The index.page itself should not
carry the **\<link type="guide" xref="index#nav"/>**, as that would be
recursive.)

SEE ALSO
========

**https://pypi.org/project/html2mallard/**, **xmllint**(1)