<!DOCTYPE html>
<!--[if IE 8]><html class="no-js lt-ie9" lang="en" > <![endif]-->
<!--[if gt IE 8]><!--> <html class="no-js" lang="en"> <!--<![endif]-->
<head>
<meta charset="utf-8"/>
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<meta content="width=device-width, initial-scale=1.0" name="viewport"/>
<link href="img/favicon.ico" rel="shortcut icon"/>
<title>regex() - logfmt1</title>
<link href="https://fonts.googleapis.com/css?family=Lato:400,700|Roboto+Slab:400,700|Inconsolata:400,700" rel="stylesheet">
<link href="css/theme.css" rel="stylesheet"/>
<link href="css/theme_extra.css" rel="stylesheet"/>
<link href="custom.css" rel="stylesheet"/>
<link href="syntax.css" rel="stylesheet"/>
<script>
// Current page data
var mkdocs_page_name = "regex()";
var mkdocs_page_input_path = "regex.md";
var mkdocs_page_url = null;
</script>
<script defer="" src="js/jquery-2.1.1.min.js"></script>
<script defer="" src="js/modernizr-2.8.3.min.js"></script>
</link></head>
<body class="wy-body-for-nav" role="document">
<div class="wy-grid-for-nav">
<nav class="wy-nav-side stickynav" data-toggle="wy-nav-shift">
<div class="wy-side-scroll">
<div class="wy-side-nav-search">
<a class="icon icon-home" href="."> logfmt1</a>
</div>
<div aria-label="main navigation" class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation">
<ul>
<li class="toctree-l1"><a class="reference internal" href="index.html">Intro</a>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="log.fmt.html">.log.fmt</a>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="logopen.html">logopen()</a>
</li>
</ul>
<ul class="current">
<li class="toctree-l1 current"><a class="reference internal current" href="regex.html">regex()</a>
<ul class="current">
</ul>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="fmt.html">global .fmt db</a>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="update-logfmt.html">update-logfmt</a>
</li>
</ul>
<ul>
<li class="toctree-l1"><a class="reference internal" href="logex.html">logex</a>
</li>
</ul>
</div>
</div>
</nav>
<section class="wy-nav-content-wrap" data-toggle="wy-nav-shift">
<nav aria-label="top navigation" class="wy-nav-top" role="navigation">
<i class="fa fa-bars" data-toggle="wy-nav-top"></i>
<a href=".">logfmt1</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content">
<div aria-label="breadcrumbs navigation" role="navigation">
<ul class="wy-breadcrumbs">
<li><a href=".">Docs</a> »</li>
<li>regex()</li>
<li class="wy-breadcrumbs-aside">
</li>
</ul>
<hr/>
</div>
<div role="main">
<div class="section">
<p><a href="logopen.html">logopen()</a> builds the regex internally, but you may want to craft
the <code>"regex":</code> entry of a .fmt descriptor by hand. In that case,
use <code>regex()</code> or <code>update()</code> to combine the log.fmt with definitions
from the global fmt database.</p>
<h3 id="regex">regex()</h3>
<p>Takes a .fmt JSON object/dict and generates a regex with named capture groups
from it.</p>
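<p>As a rough illustration of the idea (not the library's implementation), the
sketch below replaces placeholders in a record string with named capture groups.
The <code>FIELDS</code> table, the <code>%\w</code> placeholder pattern and the
<code>build_regex()</code> helper are assumptions made up for this example; the real
definitions come from the global .fmt database.</p>
<div class="highlight">
<pre><span></span><code>import re

# Hypothetical field table: placeholder -&gt; id and pattern.
# (Assumed for illustration; real entries live in the global .fmt database.)
FIELDS = {
    "%h": {"id": "remote_host", "rx": r"[\w\-.:]+"},
    "%l": {"id": "remote_logname", "rx": r"[\w\-.:]+"},
    "%u": {"id": "remote_user", "rx": r"[\-\w@.]+"},
}

def build_regex(record, fields, placeholder=r"%\w"):
    """Replace each placeholder in `record` with a named capture group."""
    def sub_placeholder(m):
        # unknown placeholders fall back to a catch-all \S+ field
        ph = fields.get(m.group(0)) or {
            "id": re.sub(r"\W+", "", m.group(0)),
            "rx": r"\S+",
        }
        return f"(?&lt;{ph['id']}&gt;{ph['rx']})"
    return re.sub(placeholder, sub_placeholder, record)

rx = build_regex("%h %l %u", FIELDS)
print(rx)
</code></pre>
</div>
<p>This prints <code>(?&lt;remote_host&gt;[\w\-.:]+) (?&lt;remote_logname&gt;[\w\-.:]+) (?&lt;remote_user&gt;[\-\w@.]+)</code>,
which has the same shape as the first part of the example in the Returns table.</p>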
<div class="doc doc-object doc-function">
<h2 class="hidden-toc" href="#logfmt1.regex" id="logfmt1.regex" style="visibility: hidden; position: absolute;">
</h2>
<div class="doc doc-contents first">
<p>Create regex for log fmt{}.</p>
<p><strong>Parameters:</strong></p>
<table>
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th>Description</th>
<th>Default</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>fmt</code></td>
<td><code>dict</code></td>
<td>
<p>Should contain record: and class:, but may define
custom fields or aliases.</p>
</td>
<td><em>required</em></td>
</tr>
<tr>
<td><code>update</code></td>
<td><code>bool</code></td>
<td>
<p>Inject fields and other declarations from shared
.fmt database into <code>fmt</code> dict.</p>
</td>
<td><code>False</code></td>
</tr>
</tbody>
</table>
<p><strong>Returns:</strong></p>
<table>
<thead>
<tr>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>str</code></td>
<td>
<p>Combined regex, for example <code>(?&lt;remote_host&gt;[\w\-.:]+)
(?&lt;remote_logname&gt;[\w\-.:]+) (?&lt;remote_user&gt;[\-\w@.]+)
\[?(?&lt;request_time&gt;\d[\d:\w\s:./\-+,;]+)\]?
"(?&lt;request_line&gt;(?&lt;request_method&gt;\w+) …␣…)"…</code></p>
</td>
</tr>
</tbody>
</table>
<details class="quote">
<summary>Source code in <code>logfmt1/logfmt1.py</code></summary>
<div class="highlight">
<pre><span></span><code><span class="k">def</span> <span class="nf">regex</span><span class="p">(</span><span class="n">fmt</span><span class="p">,</span> <span class="n">update</span><span class="o">=</span><span class="kc">False</span><span class="p">):</span>
    <span class="sd">"""</span>
<span class="sd">    Create regex for log fmt{}.</span>
<span class="sd">    Args:</span>
<span class="sd">        fmt (dict): Should contain record: and class:, but may define</span>
<span class="sd">            custom fields or aliases.</span>
<span class="sd">        update (bool): Inject fields and other declarations from shared</span>
<span class="sd">            .fmt database into `fmt` dict.</span>
<span class="sd">    Returns:</span>
<span class="sd">        str: Combined regex, for example `(?&lt;remote_host&gt;[\\w\\-.:]+)</span>
<span class="sd">            (?&lt;remote_logname&gt;[\\w\\-.:]+) (?&lt;remote_user&gt;[\\-\\w@.]+)</span>
<span class="sd">            \\[?(?&lt;request_time&gt;\\d[\\d:\\w\\s:./\\-+,;]+)\\]? </span>
<span class="sd">            "(?&lt;request_line&gt;(?&lt;request_method&gt;\\w+) …␣…)"…`</span>
<span class="sd">    """</span>
    <span class="n">rules</span> <span class="o">=</span> <span class="n">rulesdb</span><span class="o">.</span><span class="n">merge</span><span class="p">(</span>
        <span class="n">fmt</span><span class="p">,</span>
        <span class="n">rulesdb</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">fmt</span><span class="p">[</span><span class="s2">"class"</span><span class="p">])</span>
    <span class="p">)</span>
    <span class="n">fields</span> <span class="o">=</span> <span class="n">rules</span><span class="p">[</span><span class="s2">"fields"</span><span class="p">]</span>
    <span class="n">record</span> <span class="o">=</span> <span class="n">fmt</span><span class="p">[</span><span class="s2">"record"</span><span class="p">]</span>
    <span class="k">if</span> <span class="n">update</span><span class="p">:</span>
        <span class="k">for</span> <span class="n">field</span> <span class="ow">in</span> <span class="p">[</span><span class="s2">"rewrite"</span><span class="p">,</span> <span class="s2">"fields"</span><span class="p">,</span> <span class="s2">"expand"</span><span class="p">,</span> <span class="s2">"alias"</span><span class="p">,</span> <span class="s2">"container"</span><span class="p">]:</span>
            <span class="k">if</span> <span class="ow">not</span> <span class="n">field</span> <span class="ow">in</span> <span class="n">fmt</span><span class="p">:</span>
                <span class="n">fmt</span><span class="p">[</span><span class="n">field</span><span class="p">]</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="c1"># pre-cleanup (for irrelevant format string `%!200,500&lt;s` control prefixes)</span>
    <span class="k">if</span> <span class="s2">"rewrite"</span> <span class="ow">in</span> <span class="n">rules</span><span class="p">:</span>
        <span class="k">for</span> <span class="n">rx</span><span class="p">,</span> <span class="n">repl</span> <span class="ow">in</span> <span class="n">rules</span><span class="p">[</span><span class="s2">"rewrite"</span><span class="p">]</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
            <span class="n">record</span> <span class="o">=</span> <span class="n">rx_sub</span><span class="p">(</span><span class="n">rx</span><span class="p">,</span> <span class="n">repl</span><span class="p">,</span> <span class="n">record</span><span class="p">)</span>
    <span class="c1"># create fields from variant placeholders</span>
    <span class="k">if</span> <span class="s2">"expand"</span> <span class="ow">in</span> <span class="n">rules</span><span class="p">:</span>
        <span class="n">rx_quote_alt</span> <span class="o">=</span> <span class="p">{</span> <span class="c1"># (is_quoted + rx) can map to alternative regex</span>
            <span class="p">(</span><span class="kc">False</span><span class="p">,</span> <span class="s1">'[^"]*'</span><span class="p">):</span> <span class="s2">"\S*"</span><span class="p">,</span>
            <span class="p">(</span><span class="kc">True</span><span class="p">,</span> <span class="s2">"\S+"</span><span class="p">):</span> <span class="s2">"(?:[^</span><span class="se">\"</span><span class="s2">]*|</span><span class="se">\\\\\"</span><span class="s2">)+"</span><span class="p">,</span>
        <span class="p">}</span>
        <span class="k">for</span> <span class="n">rx</span><span class="p">,</span> <span class="n">expand</span> <span class="ow">in</span> <span class="n">rules</span><span class="p">[</span><span class="s2">"expand"</span><span class="p">]</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
            <span class="k">for</span> <span class="n">is_quoted</span><span class="p">,</span> <span class="n">match</span><span class="p">,</span> <span class="o">*</span><span class="n">uu</span> <span class="ow">in</span> <span class="n">re</span><span class="o">.</span><span class="n">findall</span><span class="p">(</span><span class="sa">f</span><span class="s2">"(</span><span class="se">\"</span><span class="s2">?)(</span><span class="si">{</span><span class="n">rx</span><span class="si">}</span><span class="s2">)"</span><span class="p">,</span> <span class="n">record</span><span class="p">):</span>
                <span class="k">if</span> <span class="n">match</span> <span class="ow">in</span> <span class="n">fields</span><span class="p">:</span>
                    <span class="k">continue</span>
                <span class="n">x</span> <span class="o">=</span> <span class="n">copy</span><span class="p">(</span><span class="n">expand</span><span class="p">)</span>
                <span class="c1"># id: is usually "$1", but might be "prefix_$2" or something</span>
                <span class="k">if</span> <span class="n">x</span><span class="p">[</span><span class="s2">"id"</span><span class="p">]</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="s1">'$'</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">:</span>
                    <span class="n">x</span><span class="p">[</span><span class="s2">"id"</span><span class="p">]</span> <span class="o">=</span> <span class="n">rx_sub</span><span class="p">(</span><span class="n">rx</span><span class="p">,</span> <span class="n">x</span><span class="p">[</span><span class="s2">"id"</span><span class="p">],</span> <span class="n">match</span><span class="p">)</span>
                    <span class="n">x</span><span class="p">[</span><span class="s2">"id"</span><span class="p">]</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="s2">"\W+"</span><span class="p">,</span> <span class="s2">""</span><span class="p">,</span> <span class="n">x</span><span class="p">[</span><span class="s2">"id"</span><span class="p">])</span><span class="o">.</span><span class="n">lower</span><span class="p">()</span>
                <span class="c1"># recurse into other pattern types</span>
                <span class="k">if</span> <span class="ow">not</span> <span class="s2">"rx"</span> <span class="ow">in</span> <span class="n">x</span> <span class="ow">and</span> <span class="s2">"class"</span> <span class="ow">in</span> <span class="n">x</span><span class="p">:</span>
                    <span class="n">x</span><span class="p">[</span><span class="s2">"rx"</span><span class="p">]</span> <span class="o">=</span> <span class="n">regex</span><span class="p">({</span>
                        <span class="s2">"class"</span><span class="p">:</span> <span class="n">x</span><span class="p">[</span><span class="s2">"class"</span><span class="p">],</span>
                        <span class="s2">"record"</span><span class="p">:</span> <span class="n">rx_sub</span><span class="p">(</span><span class="n">rx</span><span class="p">,</span> <span class="n">x</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s2">"record"</span><span class="p">)</span> <span class="ow">or</span> <span class="s2">"$1"</span><span class="p">,</span> <span class="n">match</span><span class="p">)</span>
                    <span class="p">})</span>
                <span class="c1"># regex alternatives, depending on quotes preceding the placeholder</span>
                <span class="k">if</span> <span class="n">is_quoted</span> <span class="ow">and</span> <span class="s2">"if_quoted"</span> <span class="ow">in</span> <span class="n">x</span><span class="p">:</span> <span class="c1"># maybe an `if_quoted` definition makes this block redundant</span>
                    <span class="n">x</span><span class="p">[</span><span class="s2">"rx"</span><span class="p">]</span> <span class="o">=</span> <span class="n">x</span><span class="p">[</span><span class="s2">"if_quoted"</span><span class="p">]</span>
                <span class="k">elif</span> <span class="p">(</span><span class="n">is_quoted</span><span class="p">,</span> <span class="n">x</span><span class="p">[</span><span class="s2">"rx"</span><span class="p">])</span> <span class="ow">in</span> <span class="n">rx_quote_alt</span><span class="p">:</span>
                    <span class="n">x</span><span class="p">[</span><span class="s2">"rx"</span><span class="p">]</span> <span class="o">=</span> <span class="n">rx_quote_alt</span><span class="p">[(</span><span class="n">is_quoted</span><span class="p">,</span> <span class="n">x</span><span class="p">[</span><span class="s2">"rx"</span><span class="p">])]</span>
                <span class="n">fields</span><span class="p">[</span><span class="n">match</span><span class="p">]</span> <span class="o">=</span> <span class="n">x</span>
    <span class="c1"># catch-all \S+ for completely unknown placeholders</span>
    <span class="k">if</span> <span class="s2">"placeholder"</span> <span class="ow">in</span> <span class="n">rules</span><span class="p">:</span>
        <span class="k">for</span> <span class="n">ph</span> <span class="ow">in</span> <span class="n">re</span><span class="o">.</span><span class="n">findall</span><span class="p">(</span><span class="n">rules</span><span class="p">[</span><span class="s2">"placeholder"</span><span class="p">],</span> <span class="n">record</span><span class="p">):</span>
            <span class="k">if</span> <span class="ow">not</span> <span class="n">ph</span> <span class="ow">in</span> <span class="n">fields</span><span class="p">:</span>
                <span class="nb">id</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="s2">"\W+"</span><span class="p">,</span> <span class="s2">""</span><span class="p">,</span> <span class="n">ph</span><span class="p">)</span>
                <span class="n">fields</span><span class="p">[</span><span class="n">ph</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span> <span class="s2">"id"</span><span class="p">:</span> <span class="nb">id</span><span class="p">,</span> <span class="s2">"rx"</span><span class="p">:</span> <span class="s2">"\S+"</span> <span class="p">}</span>
    <span class="c1"># do the actual replacement</span>
    <span class="k">def</span> <span class="nf">sub_placeholder</span><span class="p">(</span><span class="n">m</span><span class="p">):</span>
        <span class="n">ph</span> <span class="o">=</span> <span class="n">fields</span><span class="p">[</span><span class="n">m</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="mi">0</span><span class="p">)]</span>
        <span class="k">if</span> <span class="n">update</span><span class="p">:</span>
            <span class="n">fmt</span><span class="p">[</span><span class="s2">"fields"</span><span class="p">][</span><span class="n">m</span><span class="o">.</span><span class="n">group</span><span class="p">(</span><span class="mi">0</span><span class="p">)]</span> <span class="o">=</span> <span class="n">ph</span> <span class="c1"># store used placeholders in fmt</span>
        <span class="n">rx</span> <span class="o">=</span> <span class="n">ph</span><span class="p">[</span><span class="s2">"rx"</span><span class="p">]</span>
        <span class="nb">id</span> <span class="o">=</span> <span class="n">ph</span><span class="p">[</span><span class="s2">"id"</span><span class="p">]</span>
        <span class="c1"># check for existing (…) capture group to mark up</span>
        <span class="k">if</span> <span class="n">re</span><span class="o">.</span><span class="n">search</span><span class="p">(</span><span class="s2">"(?&lt;!</span><span class="se">\\\\</span><span class="s2">)\((?!\?)"</span><span class="p">,</span> <span class="n">rx</span><span class="p">):</span>
            <span class="n">rx</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="s2">"(?&lt;!</span><span class="se">\\\\</span><span class="s2">)\((?!\?)"</span><span class="p">,</span> <span class="sa">f</span><span class="s2">"(?&lt;</span><span class="si">{</span><span class="nb">id</span><span class="si">}</span><span class="s2">&gt;"</span><span class="p">,</span> <span class="n">rx</span><span class="p">,</span> <span class="n">re</span><span class="o">.</span><span class="n">M</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">rx</span> <span class="o">=</span> <span class="sa">f</span><span class="s2">"(?&lt;</span><span class="si">{</span><span class="nb">id</span><span class="si">}</span><span class="s2">&gt;</span><span class="si">{</span><span class="n">rx</span><span class="si">}</span><span class="s2">)"</span>
        <span class="k">return</span> <span class="n">rx</span>
    <span class="n">rx</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="n">rules</span><span class="p">[</span><span class="s2">"placeholder"</span><span class="p">],</span> <span class="n">sub_placeholder</span><span class="p">,</span> <span class="n">record</span><span class="p">)</span>
    <span class="n">rx</span> <span class="o">=</span> <span class="n">rename_duplicates</span><span class="p">(</span><span class="n">rx</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">rx</span>
</code></pre>
</div>
</details>
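<p>The final step of the listing decides where the group name goes: if a field's
pattern already contains a plain <code>(…)</code> group, that group is named;
otherwise the whole pattern is wrapped. The following standalone sketch isolates
that decision. The helper name <code>name_group()</code> and the constant
<code>UNNAMED_GROUP</code> are assumptions for this example, and the limit is
passed as the keyword argument <code>count=1</code>.</p>
<div class="highlight">
<pre><span></span><code>import re

# an opening parenthesis that is neither escaped nor followed by "?"
UNNAMED_GROUP = r"(?&lt;!\\)\((?!\?)"

def name_group(rx, name):
    """Attach `name` to the first plain group in rx, or wrap rx as a whole."""
    if re.search(UNNAMED_GROUP, rx):
        return re.sub(UNNAMED_GROUP, f"(?&lt;{name}&gt;", rx, count=1)
    return f"(?&lt;{name}&gt;{rx})"

print(name_group(r"\[?(\d+)\]?", "request_time"))  # existing group gets the name
print(name_group(r"\S+", "remote_user"))           # no group: wrap everything
print(name_group(r"(?:a|b)+", "choice"))           # (?:…) does not count as plain
</code></pre>
</div>
<p>The three calls print <code>\[?(?&lt;request_time&gt;\d+)\]?</code>,
<code>(?&lt;remote_user&gt;\S+)</code> and <code>(?&lt;choice&gt;(?:a|b)+)</code>.</p>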
</div>
</div>
<hr/>
<h3 id="rx2re">rx2re()</h3>
<div class="doc doc-object doc-function">
<h2 class="hidden-toc" href="#logfmt1.rx2re" id="logfmt1.rx2re" style="visibility: hidden; position: absolute;">
</h2>
<div class="doc doc-contents first">
<p>Convert generic <code>(?&lt;name&gt;…)</code> named capture groups to the Python <code>(?P&lt;name&gt;…)</code> syntax.
(logfmt1 definitions use the standard syntax by default.)</p>
<p><strong>Parameters:</strong></p>
<table>
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th>Description</th>
<th>Default</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>rx</code></td>
<td><code>str</code></td>
<td>
<p>Generic/PCRE regex syntax</p>
</td>
<td><em>required</em></td>
</tr>
</tbody>
</table>
<p><strong>Returns:</strong></p>
<table>
<thead>
<tr>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>str</code></td>
<td>
<p>Python re syntax</p>
</td>
</tr>
</tbody>
</table>
<details class="quote">
<summary>Source code in <code>logfmt1/logfmt1.py</code></summary>
<div class="highlight">
<pre><span></span><code><span class="k">def</span> <span class="nf">rx2re</span><span class="p">(</span><span class="n">rx</span><span class="p">):</span>
    <span class="sd">"""</span>
<span class="sd">    Convert generic `(?&lt;name&gt;…)` to Python `(?P&lt;name&gt;…)` regex capture group.</span>
<span class="sd">    (logfmt1 definitions use standard syntax per default.)</span>
<span class="sd">    Args:</span>
<span class="sd">        rx (str): Generic/PCRE regex syntax</span>
<span class="sd">    Returns:</span>
<span class="sd">        str: Python re syntax</span>
<span class="sd">    """</span>
    <span class="k">return</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="s2">"\(\?&lt;(?=\w+&gt;)"</span><span class="p">,</span> <span class="s2">"(?P&lt;"</span><span class="p">,</span> <span class="n">rx</span><span class="p">)</span>
</code></pre>
</div>
</details>
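<p>A short usage sketch: Python's <code>re</code> module only accepts the
<code>(?P&lt;name&gt;…)</code> spelling, so a generated pattern has to pass through
this conversion before it can be compiled. To keep the example self-contained,
the function is re-declared locally with the same substitution as in the listing;
the sample pattern and log line are made up for illustration.</p>
<div class="highlight">
<pre><span></span><code>import re

def rx2re(rx):
    # same substitution as the one-liner in the source listing
    return re.sub(r"\(\?&lt;(?=\w+&gt;)", "(?P&lt;", rx)

generic = r"(?&lt;remote_host&gt;[\w\-.:]+) (?&lt;remote_user&gt;[\-\w@.]+)"
pattern = re.compile(rx2re(generic))

m = pattern.match("127.0.0.1 alice")
print(m.groupdict())

# lookbehind assertions are left alone, since "!" and "=" are not word characters
print(rx2re(r"(?&lt;!a)b(?&lt;=b)"))
</code></pre>
</div>
<p>The first call prints <code>{'remote_host': '127.0.0.1', 'remote_user': 'alice'}</code>;
the second returns its input unchanged.</p>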
</div>
</div>
<hr/>
<h3 id="update">update()</h3>
<p>Updates the <code>regex:</code> property of a .fmt dict in place. Also injects any used
<code>fields:</code> or <code>alias:</code> lists. (For now this is basically just a wrapper
around the <code>regex()</code> builder.)</p>
<div class="doc doc-object doc-function">
<h2 class="hidden-toc" href="#logfmt1.update" id="logfmt1.update" style="visibility: hidden; position: absolute;">
</h2>
<div class="doc doc-contents first">
<p>Should be the other way round: <code>regex()</code> is meant to be a subset of <code>update()</code>.</p>
<details class="quote">
<summary>Source code in <code>logfmt1/logfmt1.py</code></summary>
<div class="highlight">
<pre><span></span><code><span class="k">def</span> <span class="nf">update</span><span class="p">(</span><span class="n">fmt</span><span class="p">):</span>
    <span class="sd">""" should be the other way round: regex() is meant to be a subset of update() """</span>
    <span class="n">fmt</span><span class="p">[</span><span class="s2">"regex"</span><span class="p">]</span> <span class="o">=</span> <span class="n">regex</span><span class="p">(</span><span class="n">fmt</span><span class="p">,</span> <span class="n">update</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
</code></pre>
</div>
</details>
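<p>Because the listing assigns into <code>fmt</code> and has no <code>return</code>
statement, the call yields <code>None</code> and the result is read from the dict
afterwards. The sketch below shows that calling pattern. It does not use the
library: <code>regex_stub()</code> is a hypothetical stand-in for
<code>regex()</code> that returns a fixed pattern, and the <code>class</code> and
<code>record</code> values are placeholders.</p>
<div class="highlight">
<pre><span></span><code>def regex_stub(fmt, update=False):
    """Hypothetical stand-in for regex(); returns a fixed pattern."""
    if update:
        # mirror the listing: make sure the optional sections exist
        for key in ["rewrite", "fields", "expand", "alias", "container"]:
            fmt.setdefault(key, {})
    return r"(?&lt;remote_host&gt;\S+)"

def update(fmt):
    # same shape as the listing: assign in place, return nothing
    fmt["regex"] = regex_stub(fmt, update=True)

fmt = {"class": "example", "record": "%h"}
result = update(fmt)

print(result)        # None, so do not write fmt = update(fmt)
print(fmt["regex"])  # the generated pattern now lives in the dict
</code></pre>
</div>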
</div>
</div>
</div>
</div>
<footer>
<div aria-label="footer navigation" class="rst-footer-buttons" role="navigation">
<a class="btn btn-neutral float-right" href="fmt.html" title="global .fmt db">Next <span class="icon icon-circle-arrow-right"></span></a>
<a class="btn btn-neutral" href="logopen.html" title="logopen()"><span class="icon icon-circle-arrow-left"></span> Previous</a>
</div>
<hr/>
<div role="contentinfo">
<!-- Copyright etc -->
</div>
Built with <a href="https://www.mkdocs.org/">MkDocs</a> using a <a href="https://github.com/snide/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<div aria-label="versions" class="rst-versions" role="note">
<span class="rst-current-version" data-toggle="rst-current-version">
<span><a href="logopen.html" style="color: #fcfcfc;">« Previous</a></span>
<span style="margin-left: 15px"><a href="fmt.html" style="color: #fcfcfc">Next »</a></span>
</span>
</div>
<script>var base_url = '.';</script>
<script defer="" src="js/theme.js"></script>
<script defer="">
window.onload = function () {
SphinxRtdTheme.Navigation.enable(true);
};
</script>
</body>
</html>