Wiki page
[log] by
mario
2015-01-06 13:40:24.
D 2015-01-06T13:40:24.107
L log
N text/x-markdown
P e5d95d497fd7e0ad9b0f6675fcfb8afc90a549b2
U mario
W 11596
<h2> Structured and hierarchical logging with <kbd>:token</kbd>-parametric API </h2>
* <small> State: ***experimental*** </small>
* <small> Category: logging </small>
* <small> Features: journaling, structured, hierarchical </small>
* <small> Backend: SQLite, <del>JSON</del>, <del>fluentd</del> </small>
* <small> Signature: hybrid, parametric </small>
**logStruck <kbd>`Ε`</kbd>** implements a logging API and SQLite/JSON storage backend.
* It's purpose is storing *structured* log data, and it tries to retain log event hierarchies.
* Implements a hybrid and terse function interface.
* And accepts plain string messages, Ruby-style `:token` categorizers and placeholders, and expects array structures.
* Implicitly captures and maps PHP errors, unhandled exceptions, and most importantly *`assert()`ions*.
Unlike other PHP logging frameworks it's not a primarily text/line-oriented message store.
## Quick example
Basic invocations are along the lines of:
Ε(':warn', ':wikiauth', "User doesn't have permission", $pageObj, ':vars', $_SESSION);
All the fun is in the `:token` literals, and passing arrays or objects.
## Database scheme, primary fields
All columns in the database scheme are *primary fields*. Any extra data/values go into the `context` array.
<style>
table.dbstruct {
width: 85%;
margin-left: 5%;
}
table.dbstruct tr:nth-child(2n) {
background: #efefef;
}
</style>
<table class=dbstruct>
<tr><td> <kbd>i</kbd> </td> <td>PRIM</td> <td rowspan=3>Where `i` is the primary index, `g` the event group, and `p` the parent reference. <img src="raw/aeb869901f9dba7dc83f89f99d8ca00578c25c0c?m=image/png" align=bottom width=514 height=93 alt="log tree"></td></tr>
<tr><td> <kbd>g</kbd> </td> <td>INT</td> </tr>
<tr><td> <kbd>p</kbd> </td> <td>INT</td> </tr>
<tr><td> <kbd>timestamp</kbd> </td> <td>REAL</td> <td>Timestamp with microseconds.</td></tr>
<tr><td> <kbd>timestr</kbd></td> <td>TEXT</td> <td>ISO DateTime string. In GMT/UTC of course.</td></tr>
<tr><td> <kbd>host</kbd> </td> <td>TEXT</td> <td>Hostname.</td></tr>
<tr><td> <kbd>pri</kbd> </td> <td>INT</td> <td>Priority number (0β¦7).</td></tr>
<tr><td> <kbd>**prio**</kbd></td> <td>TEXT</td> <td>Priority string (emergβ¦info)</td></tr>
<tr><td> <kbd>source</kbd> </td> <td>TEXT</td> <td>log|sys|lang|excpt|assert</td></tr>
<tr><td> <kbd>errno</kbd> </td> <td>INT</td> <td>0β¦32767</td></tr>
<tr><td> <kbd>app</kbd> </td> <td>TEXT</td> <td>AppName.php</td></tr>
<tr><td> <kbd>**section**</kbd></td> <td>TEXT</td> <td>Application structure / module / part / section.</td></tr>
<tr><td> <kbd>file</kbd> </td> <td>TEXT</td> <td>path/file.php</td></tr>
<tr><td> <kbd>line</kbd> </td> <td>INT</td> <td>125</td></tr>
<tr><td> <kbd>version</kbd></td> <td>TEXT</td> <td>Meta data from source code.</td></tr>
<tr><td> <kbd>**message**</kbd></td> <td>TEXT</td> <td>Primary log event message string.</td></tr>
<tr><td> <kbd>doc</kbd> </td> <td>TEXT</td> <td>Extra documentation / long message / href.</td></tr>
<tr><td> <kbd>backtrace</kbd></td> <td>JSON</td> <td>Array of `:backtrace`</td></tr>
<tr><td> <kbd>code</kbd> </td> <td>TEXT</td> <td>Extracted code context (3 lines).</td></tr>
<tr><td> <kbd>vars</kbd> </td> <td>JSON</td> <td>Main $vars[] array.</td></tr>
<tr><td> <kbd>context</kbd></td> <td>JSON</td> <td>Additional / user-defined fields.</td></tr>
<table>
## Flexible parametric API
The chief invocation method is <kbd>Ε()</kbd>. On the outset it's a procedural function, thus available globally. Behind the scenes it keeps a primary logger group. Alternatively it can be invoked via <kbd>Ε::<em style=color:#630>option</em>_<em style=color:#226>tokens</em>()</kbd>.
For plain old log messages it's as simple to use as:
Ε("A thing happened.");
Usually you also want to convey a <u>priority</u> however:
Ε("Warnful warning", ':warn');
The <kbd>:token</kbd> attributes are the most important concept in logStruck. They simulate Ruby-style symbols. In PHP they need to be enquoted as strings however.
Besides priority levels, an *important* use is classifying an <u>application section</u>. Anything that isn't a reserved keyword :token will simply be assumed to refer an application module:
Ε(':auth', "Authorization error", ':notice');
Make up memorable designators to categorize your log messages to your application structure.
An obvious benefit of the :token syntax is that it allows <u>freely ordered</u> logging parameter lists.
These are all equivalent:
Ε(':warn', ':db', "Database error");
Ε(':db', ':warn', "Database error");
Ε(':warn', "Database error", ':db');
You'd probably want to settle on one of these. But it occasionally helps readability to be flexible, in particular when passing functional :tokens or arrays/lists.
#### Array data
As mentioned, this logger API isn't meant for just string data. You often want to convey <u>context data</u>, and additional attributes. It's often as simple as just attaching an array to the parameter list:
Ε(':debug', "Front controller state", $RequestVars);
Now those would end up in the `context` database field. To retain them in the `vars` column, you have two options:
Ε(':debug', "Front controller state", ['vars' => $RequestVars]);
Or the more fancy <kbd>:vars</kbd> array designator:
Ε(':debug', "Front controller state", ':vars', β$RequestVars);
You might also use this for other fields like `:message` or `:code` or `:doc`, as long as the following parameter is an array.
#### Data mapping with `"field: value"`
Plain strings usually end up in the `message` field. But the structured logging database has more <u>fields</u>, each with specific purposes. You can easily populate them with the key:value syntax.
Ε("File reading error", "errno: EACCESS");
Here `errno` is also an integer field, which will be converted afterwards.
A more interesting field to take care of is the <u>`doc`</u> column.
Ε("Cache directory locked", "doc: ?wiki=SetCachePerm");
Nowadays logs are often consumed by machines rather than humans. For some projects you may wish to augment the often coarse and technical messsage summary with a human-readable description for non-coders. (This is usually an afterthought, if at all implemented or manageable in log processors/viewers).
While you could use the `doc:` field for a long prosaic documentation, this would of course stuff the database too much. Instead prefer hyperlinks, or references. A `"?tktid=12345"` or `"See setup.txt on chmod cache"` are helpful minimums too.
All of *primary fields* can alternatively and also be set using the key:value scheme:
Ε('section: auth', "Auth warning");
Ε('source: sys', "Exec failure", "errno: $retval", [$cmd]);
Ε("message: Log message");
Ε("Regex failed", "code: $rx", ':error', ':iniparser');
And fields that aren't primary log event columns/fields, will end up in the `context` database array.
Ε("Special needs logging", "var1: $var1", "method: $callback");
This is basically equivalent to using a `["key"=>"value", ...]` list. Again, prefer what's more readable in whatever context.
#### Injectors
Now if there wasn't enough flexibility already, <kbd>:tokens</kbd> can also refer to data source functions.
You can augment log events with a <kbd>:backtrace</kbd> most of the time:
Ε(':warn', "How did we get here?", ':backtrace');
The placeholder token will be substituted by an array, before being pushed to the log store.
Likewise you can interpolate some common vars:
Ε(':debug', "Front Controller startup", ':server');
Or extract pretty much all available meta data:
Ε(':debug', "Debug by logging", ':backtrace', ':file', ':code', ':version');
The built-in error / exception / assert handlers do this automatically for example, to different degrees.
#### Hybrid <kbd>Ε::<em style=color:#630>option</em>_<em style=color:#226>tokens</em>()</kbd>
The :token scheme is pretty neat, but certainly not to everyones liking, and sometimes less readable than plain boring method calls. Therefore the `Ε` function and `Ε` class go hand in hand. Instead of listing tokens as arguments, you can just compact them into a virutal method name:
Ε::debug_auth("Authorization failed", $UserObject);
You can even freely mix in injector callbacks and one array designator there:
Ε::warn_db_backtrace_file_vars("DB error", $stmt);
Again, you're the programmer. Make sound choices on a case-by-case basis. Don't be clingy with same-old same-old.
## All the <kbd>:tokens</kbd>!
You're free to choose any `:section` name for grouping log events. Well, as long as they're not matching one of the reserved token keywords:
<table class=dbstruct style="width:75%">
<tr><th colspan=3>Priority levels</th></tr>
<tr><td><kbd>:debug</kbd></td> <td>`7`</td> <td>Low-level debug events.</td></tr>
<tr><td><kbd>:info</kbd></td> <td>`6`</td> <td>Process flow infos etc.</td></tr>
<tr><td><kbd>:notice</kbd>, <kbd>:note</kbd></td> <td>`5`</td> <td>Lowest priority language notices.</td></tr>
<tr><td><kbd>:warning</kbd>, <kbd>:warn</kbd></td> <td>`4`</td> <td>Warnings.</td></tr>
<tr><td><kbd>:error</kbd>, <kbd>:err</kbd></td> <td>`3`</td> <td>PHP or system error.</td></tr>
<tr><td><kbd>:critical</kbd>, <kbd>:crit</kbd></td> <td>`2`</td> <td>This can't be good.</td></tr>
<tr><td><kbd>:alert</kbd>, <kbd>:alrt</kbd></td> <td>`1`</td> <td>Turn on the bat light.</td></tr>
<tr><td><kbd>:emergency</kbd>, <kbd>:emerg</kbd></td> <td>`0`</td> <td>Someone call the president.</td></tr>
<tr><th colspan=3>Source / generator</th></tr>
<tr><td><kbd>:log</kbd></td> <td></td> <td>Application origin, normal/manual log calls.</td></tr>
<tr><td><kbd>:sys</kbd></td> <td></td> <td>System-level events and errno codes.</td></tr>
<tr><td><kbd>:lang</kbd></td> <td></td> <td>Language errors, warnings, notices, etc.</td></tr>
<tr><td><kbd>:exception</kbd></td> <td></td> <td>Langauge/runtime exceptions.</td></tr>
<tr><td><kbd>:assert</kbd></td> <td></td> <td>`Assert()` warnings.</td></tr>
<tr><th colspan=3>Field names</th></tr>
<tr><td colspan=3>Any database column / primary field name can be represented as `:token`. It's pretty much only useful to use <kbd>:vars</kbd> however to map the following array parameter.</td></tr>
<tr><th colspan=3>Injector calls</th></tr>
<tr><td><kbd>:backtrace</kbd></td> <td></td> <td>Populates backtrace.</td></tr>
<tr><td><kbd>:server</kbd></td> <td></td> <td>Inserts $_SERVER array into `context`.</td></tr>
<tr><td><kbd>:file</kbd></td> <td></td> <td>Uncovers `file` and `line` from backtrace.</td></tr>
<tr><td><kbd>:version</kbd></td> <td></td> <td>Reads out meta data (file/scm version, and section) from script comments.</td></tr>
<tr><td><kbd>:code</kbd></td> <td></td> <td>Inserts 3 lines of `code` context.</td></tr>
<tr><td><kbd>:p</kbd></td> <td></td> <td>Tries to deduce log event hierarchy from prior calls, sections, and backtraces. (Not yet implemented.)</td></tr>
</table>
### Notes / Rationale
* So, this is either genius, or completely bonkers.
* The function name <kbd>Ε</kbd> isn't completely settled on. (Maybe a bit too much novelty strive.)
* Extensibility of the database scheme is not planned for. (But easily done.)
Z 6cfe5f9e6a04d6ba6cab91e89ad3eefd