PHP utility collection with hybrid and fluent APIs.

βŒˆβŒ‹ βŽ‡ branch:  hybrid7 libraries


Artifact [e5d95d497f]

Artifact e5d95d497fd7e0ad9b0f6675fcfb8afc90a549b2:

Wiki page [log] by mario 2015-01-06 13:40:04.
D 2015-01-06T13:40:04.547
L log
N text/x-markdown
P 2ddd84c652f81debb55cf4a1ec3f7e14322dad48
U mario
W 11596
<h2> Structured and hierarchical logging with <kbd>:token</kbd>-parametric API </h2>

 * <small> State: ***experimental*** </small>
 * <small> Category: logging </small>
 * <small> Features: journaling, structured, hierarchical </small>
 * <small> Backend: SQLite, <del>JSON</del>, <del>fluentd</del> </small>
 * <small> Signature: hybrid, parametric </small>

**logStruck <kbd>`Ε‚`</kbd>** implements a logging API and SQLite/JSON storage backend. 

 * It's purpose is storing *structured* log data, and it tries to retain log event hierarchies.
 * Implements a hybrid and terse function interface.
 * And accepts plain string messages, Ruby-style `:token` categorizers and placeholders, and expects array structures.
 * Implicitly captures and maps PHP errors, unhandled exceptions, and most importantly *`assert()`ions*.

Unlike other PHP logging frameworks it's not a primarily text/line-oriented message store.



## Quick example

Basic invocations are along the lines of:

     Ε‚(':warn', ':wikiauth', "User doesn't have permission", $pageObj, ':vars', $_SESSION);

All the fun is in the `:token` literals, and passing arrays or objects.



## Database scheme, primary fields

All columns in the database scheme are *primary fields*. Any extra data/values go into the `context` array.

<style>
 table.dbstruct {
   width: 75%;
   margin-left: 5%;
 }
 table.dbstruct tr:nth-child(2n) {
   background: #efefef;
 }
</style>
<table class=dbstruct>
<tr><td> <kbd>i</kbd>  </td>  <td>PRIM</td>  <td rowspan=3>Where `i` is the primary index, `g` the event group, and `p` the parent reference. <img src="raw/aeb869901f9dba7dc83f89f99d8ca00578c25c0c?m=image/png" align=bottom width=514 height=93 alt="log tree"></td></tr>
<tr><td> <kbd>g</kbd>  </td>  <td>INT</td>            </tr>
<tr><td> <kbd>p</kbd>  </td>  <td>INT</td>            </tr>
<tr><td> <kbd>timestamp</kbd>   </td>  <td>REAL</td> <td>Timestamp with microseconds.</td></tr>
<tr><td> <kbd>timestr</kbd></td>  <td>TEXT</td> <td>ISO DateTime string. In GMT/UTC of course.</td></tr>
<tr><td> <kbd>host</kbd>   </td>  <td>TEXT</td> <td>Hostname.</td></tr>
<tr><td> <kbd>pri</kbd>    </td>  <td>INT</td> <td>Priority number (0…7).</td></tr>
<tr><td> <kbd>**prio**</kbd></td>  <td>TEXT</td> <td>Priority string (emerg…info)</td></tr>
<tr><td> <kbd>source</kbd> </td>  <td>TEXT</td> <td>log|sys|lang|excpt|assert</td></tr>
<tr><td> <kbd>errno</kbd>  </td>  <td>INT</td> <td>0…32767</td></tr>
<tr><td> <kbd>app</kbd>    </td>  <td>TEXT</td> <td>AppName.php</td></tr>
<tr><td> <kbd>**section**</kbd></td> <td>TEXT</td> <td>Application structure / module / part / section.</td></tr>
<tr><td> <kbd>file</kbd>   </td>  <td>TEXT</td> <td>path/file.php</td></tr>
<tr><td> <kbd>line</kbd>   </td>  <td>INT</td> <td>125</td></tr>
<tr><td> <kbd>version</kbd></td>  <td>TEXT</td> <td>Meta data from source code.</td></tr>
<tr><td> <kbd>**message**</kbd></td>  <td>TEXT</td> <td>Primary log event message string.</td></tr>
<tr><td> <kbd>doc</kbd>    </td>  <td>TEXT</td> <td>Extra documentation / long message / href.</td></tr>
<tr><td> <kbd>backtrace</kbd></td>  <td>JSON</td> <td>Array of `:backtrace`</td></tr>
<tr><td> <kbd>code</kbd>   </td>  <td>TEXT</td> <td>Extracted code context (3 lines).</td></tr>
<tr><td> <kbd>vars</kbd>   </td>  <td>JSON</td> <td>Main $vars[] array.</td></tr>
<tr><td> <kbd>context</kbd></td>  <td>JSON</td> <td>Additional / user-defined fields.</td></tr>
<table>


## Flexible parametric API

The chief invocation method is <kbd>Ε‚()</kbd>. On the outset it's a procedural function, thus available globally. Behind the scenes it keeps a primary logger group. Alternatively it can be invoked via <kbd>Ε‚::<em style=color:#630>option</em>_<em style=color:#226>tokens</em>()</kbd>.

For plain old log messages it's as simple to use as:

     Ε‚("A thing happened.");

Usually you also want to convey a <u>priority</u> however:

     Ε‚("Warnful warning", ':warn');

The <kbd>:token</kbd> attributes are the most important concept in logStruck. They simulate Ruby-style symbols. In PHP they need to be enquoted as strings however.

Besides priority levels, an *important* use is classifying an <u>application section</u>. Anything that isn't a reserved keyword :token will simply be assumed to refer an application module:

     Ε‚(':auth', "Authorization error", ':notice');

Make up memorable designators to categorize your log messages to your application structure.

An obvious benefit of the :token syntax is that it allows <u>freely ordered</u> logging parameter lists.  
These are all equivalent:

     Ε‚(':warn', ':db', "Database error");
     Ε‚(':db', ':warn', "Database error");
     Ε‚(':warn', "Database error", ':db');

You'd probably want to settle on one of these. But it occasionally helps readability to be flexible, in particular when passing functional :tokens or arrays/lists.


#### Array data

As mentioned, this logger API isn't meant for just string data. You often want to convey <u>context data</u>, and additional attributes. It's often as simple as just attaching an array to the parameter list:

     Ε‚(':debug', "Front controller state", $RequestVars);

Now those would end up in the `context` database field. To retain them in the `vars` column, you have two options:

     Ε‚(':debug', "Front controller state", ['vars' => $RequestVars]);

Or the more fancy <kbd>:vars</kbd> array designator:

     Ε‚(':debug', "Front controller state", ':vars', ⃕$RequestVars);

You might also use this for other fields like `:message` or `:code` or `:doc`, as long as the following parameter is an array.


#### Data mapping with `"field: value"`

Plain strings usually end up in the `message` field. But the structured logging database has more <u>fields</u>, each with specific purposes. You can easily populate them with the key:value syntax.

     Ε‚("File reading error", "errno: EACCESS");

Here `errno` is also an integer field, which will be converted afterwards.

A more interesting field to take care of is the <u>`doc`</u> column.

     Ε‚("Cache directory locked", "doc: ?wiki=SetCachePerm");

Nowadays logs are often consumed by machines rather than humans. For some projects you may wish to augment the often coarse and technical messsage summary with a human-readable description for non-coders. (This is usually an afterthought, if at all implemented or manageable in log processors/viewers).

While you could use the `doc:` field for a long prosaic documentation, this would of course stuff the database too much. Instead prefer hyperlinks, or references. A `"?tktid=12345"` or `"See setup.txt on chmod cache"` are helpful minimums too.

All of *primary fields* can alternatively and also be set using the key:value scheme:

     Ε‚('section: auth', "Auth warning");
     Ε‚('source: sys', "Exec failure", "errno: $retval", [$cmd]);
     Ε‚("message: Log message");
     Ε‚("Regex failed", "code: $rx", ':error', ':iniparser');

And fields that aren't primary log event columns/fields, will end up in the `context` database array.

     Ε‚("Special needs logging", "var1: $var1", "method: $callback");

This is basically equivalent to using a `["key"=>"value", ...]` list. Again, prefer what's more readable in whatever context.


#### Injectors

Now if there wasn't enough flexibility already, <kbd>:tokens</kbd> can also refer to data source functions.

You can augment log events with a <kbd>:backtrace</kbd> most of the time:

     Ε‚(':warn', "How did we get here?", ':backtrace');

The placeholder token will be substituted by an array, before being pushed to the log store.

Likewise you can interpolate some common vars:

     Ε‚(':debug', "Front Controller startup", ':server');

Or extract pretty much all available meta data:

     Ε‚(':debug', "Debug by logging", ':backtrace', ':file', ':code', ':version');

The built-in error / exception / assert handlers do this automatically for example, to different degrees.


#### Hybrid <kbd>Ε‚::<em style=color:#630>option</em>_<em style=color:#226>tokens</em>()</kbd>

The :token scheme is pretty neat, but certainly not to everyones liking, and sometimes less readable than plain boring method calls. Therefore the `Ε‚` function and `Ε‚` class go hand in hand. Instead of listing tokens as arguments, you can just compact them into a virutal method name:

     Ε‚::debug_auth("Authorization failed", $UserObject);

You can even freely mix in injector callbacks and one array designator there:

     Ε‚::warn_db_backtrace_file_vars("DB error", $stmt);

Again, you're the programmer. Make sound choices on a case-by-case basis. Don't be clingy with same-old same-old.


## All the <kbd>:tokens</kbd>!

You're free to choose any `:section` name for grouping log events. Well, as long as they're not matching one of the reserved token keywords:


<table class=dbstruct style="width:75%">

<tr><th colspan=3>Priority levels</th></tr>
<tr><td><kbd>:debug</kbd></td> <td>`7`</td> <td>Low-level debug events.</td></tr>
<tr><td><kbd>:info</kbd></td> <td>`6`</td> <td>Process flow infos etc.</td></tr>
<tr><td><kbd>:notice</kbd>, <kbd>:note</kbd></td> <td>`5`</td> <td>Lowest priority language notices.</td></tr>
<tr><td><kbd>:warning</kbd>, <kbd>:warn</kbd></td> <td>`4`</td> <td>Warnings.</td></tr>
<tr><td><kbd>:error</kbd>, <kbd>:err</kbd></td> <td>`3`</td> <td>PHP or system error.</td></tr>
<tr><td><kbd>:critical</kbd>, <kbd>:crit</kbd></td> <td>`2`</td> <td>This can't be good.</td></tr>
<tr><td><kbd>:alert</kbd>, <kbd>:alrt</kbd></td> <td>`1`</td> <td>Turn on the bat light.</td></tr>
<tr><td><kbd>:emergency</kbd>, <kbd>:emerg</kbd></td> <td>`0`</td> <td>Someone call the president.</td></tr>

<tr><th colspan=3>Source / generator</th></tr>
<tr><td><kbd>:log</kbd></td> <td></td> <td>Application origin, normal/manual log calls.</td></tr>
<tr><td><kbd>:sys</kbd></td> <td></td> <td>System-level events and errno codes.</td></tr>
<tr><td><kbd>:lang</kbd></td> <td></td> <td>Language errors, warnings, notices, etc.</td></tr>
<tr><td><kbd>:exception</kbd></td> <td></td> <td>Langauge/runtime exceptions.</td></tr>
<tr><td><kbd>:assert</kbd></td> <td></td> <td>`Assert()` warnings.</td></tr>

<tr><th colspan=3>Field names</th></tr>
<tr><td colspan=3>Any database column / primary field name can be represented as `:token`. It's pretty much only useful to use <kbd>:vars</kbd> however to map the following array parameter.</td></tr>

<tr><th colspan=3>Injector calls</th></tr>
<tr><td><kbd>:backtrace</kbd></td> <td></td> <td>Populates backtrace.</td></tr>
<tr><td><kbd>:server</kbd></td> <td></td> <td>Inserts $_SERVER array into `context`.</td></tr>
<tr><td><kbd>:file</kbd></td> <td></td> <td>Uncovers `file` and `line` from backtrace.</td></tr>
<tr><td><kbd>:version</kbd></td> <td></td> <td>Reads out meta data (file/scm version, and section) from script comments.</td></tr>
<tr><td><kbd>:code</kbd></td> <td></td> <td>Inserts 3 lines of `code` context.</td></tr>
<tr><td><kbd>:p</kbd></td> <td></td> <td>Tries to deduce log event hierarchy from prior calls, sections, and backtraces. (Not yet implemented.)</td></tr>

</table>



### Notes / Rationale

 * So, this is either genius, or completely bonkers.
 * The function name <kbd>Ε‚</kbd> isn't completely settled on. (Maybe a bit too much novelty strive.)
 * Extensibility of the database scheme is not planned for. (But easily done.)




Z f444dde6a0fa6228d92af1a7ee55c13c