logsparser.lognormalizer.LogNormalizer : class documentation

Part of logsparser.lognormalizer

Basic normalization flow manager.
Normalizer definitions are loaded from a path and checked against the DTD.
If the definitions are syntactically correct, the normalizers are
instantiated and populate the manager's cache.
Normalization priority is established as follows:

* Maximum priority assigned to normalizers where the "appliedTo" tag is set
  to "raw". They MUST be mutually exclusive.
* Medium priority assigned to normalizers where the "appliedTo" tag is set
  to "body".
* Lowest priority assigned to any remaining normalizers.
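
The ordering above can be pictured as a sort on the "appliedTo" value. The following is only a sketch, not the library's implementation: the `PRIORITY` table, the `order_by_priority` helper and the sample normalizer names are hypothetical.

```python
# Hypothetical sketch: order normalizers by their "appliedTo" tag.
# A lower number means the normalizer is applied earlier.
PRIORITY = {"raw": 0, "body": 1}
LOWEST = 2  # any other "appliedTo" value


def order_by_priority(normalizers):
    """Return normalizers sorted raw first, then body, then the rest.

    `normalizers` is a list of (name, applied_to) tuples. sorted() is
    stable, so normalizers of equal priority keep their original order.
    """
    return sorted(normalizers, key=lambda n: PRIORITY.get(n[1], LOWEST))


# Illustrative names only; they are not taken from the documentation.
sample = [("urlparser", "url"), ("cron", "body"), ("syslog", "raw")]
ordered = order_by_priority(sample)
print([name for name, _ in ordered])  # ['syslog', 'cron', 'urlparser']
```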

Some extra processing is also done before and after the log normalization:

* Assignment of a unique ID, under the tag "uuid"
* Conversion of date tags to UTC, if the "_timezone" was set prior to
  the normalization process.
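
These two extra steps might look roughly like the sketch below. It is an illustration under stated assumptions, not the library's code: `post_process` is a hypothetical helper, the ID is assumed to be a random 128-bit integer, and the standard `zoneinfo` module (Python 3.9+) stands in for pytz.

```python
import uuid
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def post_process(log):
    """Hypothetical helper: add a "uuid" tag and convert dates to UTC."""
    # Assumption: the unique ID is a random 128-bit integer.
    log["uuid"] = uuid.uuid4().int
    tz_name = log.get("_timezone")
    if tz_name is not None:
        local_tz = ZoneInfo(tz_name)  # raises if the name is unknown
        for key, value in list(log.items()):
            if isinstance(value, datetime) and value.tzinfo is None:
                # Interpret the naive date in the given timezone, then
                # express it in UTC as a naive datetime again.
                aware = value.replace(tzinfo=local_tz)
                log[key] = aware.astimezone(timezone.utc).replace(tzinfo=None)
    return log


log = {"date": datetime(2011, 7, 18, 15, 35, 1), "_timezone": "Europe/Paris"}
post_process(log)
print(log["date"])  # 2011-07-18 13:35:01
```

Without a "_timezone" key the sketch leaves dates untouched, which matches the conversion being conditional on that key.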

Methods:

* __init__: Instantiates a flow manager. The default behavior is to activate every available normalizer.
* reload: Refreshes this instance's normalizers pool.
* iter_normalizer: Iterates through normalizers and returns the normalizers' paths.
* __len__: Returns the number of available normalizers.
* update_normalizer: Adds or updates a normalizer.
* get_normalizer_source: Returns the raw XML source of normalizer name.
* activate_normalizers: Activates normalizers according to what was set by calling set_active_normalizers.
* get_active_normalizers: Returns a dictionary of normalizers; keys are normalizers' names and values are True|False according to the normalizer's activation state.
* set_active_normalizers: Sets the active/inactive normalizers. Default behavior is to deactivate every normalizer.
* lognormalize: This method is the entry point to normalize data (a log).
* uuidify: Adds a unique ID to the normalized log.
* normalize: Plain normalization.
* _normalize: Used for testing only; the normalizers' tag prerequisites are deactivated.

def __init__(self, normalizers_path, active_normalizers={}):
Instantiates a flow manager. The default behavior is to activate every available normalizer.
Parameters:
  normalizers_path: absolute path to the normalizer XML definitions to use.
  active_normalizers: a dictionary of active normalizers in the form {name: [True|False]}.
def reload(self):
Refreshes this instance's normalizers pool.
def iter_normalizer(self):
Iterates through normalizers and returns the normalizers' paths.
Returns: a generator of absolute paths.
def __len__(self):
Returns the number of available normalizers.
def update_normalizer(self, raw_xml_contents, name=None):
Adds or updates a normalizer.
Parameters:
  raw_xml_contents: XML description of the normalizer as flat XML. It must comply with the DTD.
  name: if set, the XML description will be saved as name.xml. If left blank, the name will be fetched from the XML description.
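
The naming rule for `name` can be illustrated with a short sketch. It is not the library's code: `target_filename` is a hypothetical helper, and it assumes the normalizer's name is stored in a "name" attribute on the XML root element. It does not validate the XML against the DTD.

```python
import xml.etree.ElementTree as ET


def target_filename(raw_xml_contents, name=None):
    """Hypothetical helper: choose the file name for a normalizer definition."""
    if not name:
        # Assumption: the root element carries a "name" attribute.
        root = ET.fromstring(raw_xml_contents)
        name = root.get("name")
        if not name:
            raise ValueError("no name given and none found in the XML")
    return name + ".xml"


# Illustrative XML only; a real definition must comply with the DTD.
xml_source = '<normalizer name="syslog" version="1.0"></normalizer>'
print(target_filename(xml_source))            # syslog.xml
print(target_filename(xml_source, "custom"))  # custom.xml
```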
def get_normalizer_source(self, name):
Returns the raw XML source of normalizer name.
def activate_normalizers(self):
Activates normalizers according to what was set by calling set_active_normalizers. If no call to the latter function has been made so far, this method activates every normalizer.
def get_active_normalizers(self):
Returns a dictionary of normalizers; keys are normalizers' names and values are True|False according to the normalizer's activation state.
def set_active_normalizers(self, norms={}):
Sets the active/inactive normalizers. Default behavior is to deactivate every normalizer.
Parameters:
  norms: a dictionary, similar to the one returned by get_active_normalizers.
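
The interplay between set_active_normalizers, activate_normalizers and get_active_normalizers can be pictured with a toy stand-in. `ActivationRegistry` is hypothetical and only mimics the behavior described above; in particular, it assumes that a requested state takes effect only once activate_normalizers is called.

```python
class ActivationRegistry:
    """Hypothetical stand-in mimicking the documented activation methods."""

    def __init__(self, names):
        self._names = list(names)
        self._requested = None  # nothing requested yet
        self._active = {}
        self.activate_normalizers()

    def set_active_normalizers(self, norms=None):
        # Names missing from `norms` count as inactive, so an empty
        # argument deactivates everything once activation is applied.
        norms = norms or {}
        self._requested = {n: bool(norms.get(n, False)) for n in self._names}

    def activate_normalizers(self):
        if self._requested is None:
            # No prior call to set_active_normalizers: activate all.
            self._active = {n: True for n in self._names}
        else:
            self._active = dict(self._requested)

    def get_active_normalizers(self):
        # Return a copy so callers cannot alter the internal state.
        return dict(self._active)


registry = ActivationRegistry(["syslog", "cron"])
print(registry.get_active_normalizers())  # {'syslog': True, 'cron': True}

registry.set_active_normalizers({"syslog": True})
registry.activate_normalizers()
print(registry.get_active_normalizers())  # {'syslog': True, 'cron': False}
```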
def lognormalize(self, data):
This method is the entry point to normalize data (a log).

data is passed through every activated normalizer
and extra tagging occurs accordingly.

data also receives an extra uuid tag.

If data contains a key called _timezone, its value is used to convert
any date into UTC. This value must be a valid timezone name; see
the pytz module for more information.

Parameters:
  data: must be a dictionary with at least a key 'raw' or 'body'
        with basestring values (preferably Unicode).

Here is an example:
>>> from logsparser import lognormalizer
>>> from pprint import pprint
>>> ln = lognormalizer.LogNormalizer('/usr/local/share/normalizers/')
>>> mylog = {'raw' : 'Jul 18 15:35:01 zoo /USR/SBIN/CRON[14338]: (root) CMD (/srv/git/redmine-changesets.sh)'}
>>> ln.lognormalize(mylog)
>>> pprint(mylog)
{'body': '(root) CMD (/srv/git/redmine-changesets.sh)',
 'date': datetime.datetime(2011, 7, 18, 15, 35, 1),
 'pid': '14338',
 'program': '/USR/SBIN/CRON',
 'raw': 'Jul 18 15:35:01 zoo /USR/SBIN/CRON[14338]: (root) CMD (/srv/git/redmine-changesets.sh)',
 'source': 'zoo',
 'uuid': 70851882840934161193887647073096992594L}
def uuidify(self, log):
Adds a unique ID to the normalized log, under the tag "uuid".
def normalize(self, log):
Plain normalization.
def _normalize(self, log):
Used for testing only; the normalizers' tag prerequisites are deactivated.
API Documentation for logsparser, generated by pydoctor at 2011-07-19 11:51:07.