Part of logsparser.lognormalizer
Basic normalization flow manager. Normalizer definitions are loaded from a path and checked against the DTD. If the definitions are syntactically correct, the normalizers are instantiated and populate the manager's cache.

Normalization priority is established as follows:
* Maximum priority is assigned to normalizers where the "appliedTo" tag is set to "raw". They MUST be mutually exclusive.
* Medium priority is assigned to normalizers where the "appliedTo" tag is set to "body".
* Lowest priority is assigned to any remaining normalizers.

Some extra processing is also done before and after the log normalization:
* Assignment of a unique ID, under the tag "uuid".
* Conversion of date tags to UTC, if the "_timezone" tag was set prior to the normalization process.
Method | __init__ | Instantiates a flow manager. The default behavior is to activate every |
Method | reload | Refreshes this instance's normalizers pool. |
Method | iter_normalizer | Iterates through normalizers and returns the normalizers' paths. |
Method | __len__ | Returns the number of available normalizers. |
Method | update_normalizer | Used to add or update a normalizer. |
Method | get_normalizer_source | Returns the raw XML source of the normalizer called name. |
Method | activate_normalizers | Activates normalizers according to what was set by calling |
Method | get_active_normalizers | Returns a dictionary of normalizers; keys are normalizers' names and |
Method | set_active_normalizers | Sets the active/inactive normalizers. Default behavior is to |
Method | lognormalize | This method is the entry point to normalize data (a log). |
Method | uuidify | Adds a unique ID to the normalized log. |
Method | normalize | Plain normalization. |
Method | _normalize | Used for testing only, the normalizers' tags prerequisite are |
Parameters | normalizer_path | absolute path to the normalizer XML definitions to use. |
active_normalizers | a dictionary of active normalizers in the form {name: [True|False]}. |
Returns | a generator of absolute paths. |
Parameters | raw_xml_contents | XML description of the normalizer as flat XML. It must comply with the DTD. |
name | if set, the XML description will be saved as name.xml. If left blank, name will be fetched from the XML description. |
Parameters | norms | a dictionary, similar to the one returned by get_active_normalizers. |
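As an illustration of the {name: True|False} form, the sketch below builds such a dictionary and switches one entry off. It is a standalone example under stated assumptions: the normalizer names and the `deactivate` helper are hypothetical, and no logsparser call is made.

```python
# Hypothetical normalizer names; real names depend on the loaded definitions.
active = {"syslog": True, "apache": True, "netfilter": False}


def deactivate(norms, *names):
    """Return a copy of norms with the given normalizer names switched off."""
    updated = dict(norms)
    for name in names:
        if name not in updated:
            raise KeyError("unknown normalizer: %s" % name)
        updated[name] = False
    return updated


norms = deactivate(active, "apache")
enabled = sorted(name for name, on in norms.items() if on)
print(enabled)
```

A dictionary shaped like `norms` is what the norms parameter of set_active_normalizers is described as accepting.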
This method is the entry point to normalize data (a log). data is passed through every activated normalizer and extra tagging occurs accordingly. data also receives an extra uuid tag. If data contains a key called _timezone, its value is used to convert any date into UTC. This value must be a valid timezone name; see the pytz module for more information.

@param data: must be a dictionary with at least a key 'raw' or 'body' with basestring values (preferably Unicode).

Here is an example:

>>> from logsparser import lognormalizer
>>> from pprint import pprint
>>> ln = lognormalizer.LogNormalizer('/usr/local/share/normalizers/')
>>> mylog = {'raw' : 'Jul 18 15:35:01 zoo /USR/SBIN/CRON[14338]: (root) CMD (/srv/git/redmine-changesets.sh)'}
>>> ln.lognormalize(mylog)
>>> pprint(mylog)
{'body': '(root) CMD (/srv/git/redmine-changesets.sh)',
 'date': datetime.datetime(2011, 7, 18, 15, 35, 1),
 'pid': '14338',
 'program': '/USR/SBIN/CRON',
 'raw': 'Jul 18 15:35:01 zoo /USR/SBIN/CRON[14338]: (root) CMD (/srv/git/redmine-changesets.sh)',
 'source': 'zoo',
 'uuid': 70851882840934161193887647073096992594L}
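The two extra steps described above, the uuid tag and the conversion of dates to UTC, can be sketched with the standard library alone. This is not the library's code. The helpers `add_uuid` and `date_to_utc` are hypothetical, the use of a random 128-bit integer is an assumption suggested by the size of the uuid in the example, and a fixed UTC+2 offset stands in for a timezone resolved by name from _timezone.

```python
import uuid
from datetime import datetime, timedelta, timezone


def add_uuid(log):
    """Attach a random 128-bit integer under the 'uuid' key (assumed scheme)."""
    log["uuid"] = uuid.uuid4().int
    return log


def date_to_utc(log, tz):
    """Treat the naive 'date' value as local time in tz and store it as naive UTC."""
    local = log["date"].replace(tzinfo=tz)
    log["date"] = local.astimezone(timezone.utc).replace(tzinfo=None)
    return log


# A fixed UTC+2 offset replaces the named-timezone lookup in this sketch.
utc_plus_2 = timezone(timedelta(hours=2))
log = {"date": datetime(2011, 7, 18, 15, 35, 1)}
date_to_utc(add_uuid(log), utc_plus_2)
print(log["date"])
```

A fixed offset ignores daylight saving rules, which is why the documented interface asks for a timezone name instead.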