python/mach/README.rst

Thu, 15 Jan 2015 15:55:04 +0100

author
Michael Schloh von Bennewitz <michael@schloh.com>
date
Thu, 15 Jan 2015 15:55:04 +0100
branch
TOR_BUG_9701
changeset 9
a63d609f5ebe
permissions
-rw-r--r--

Back out 97036ab72558 which inappropriately compared turds to third parties.

====
mach
====

Mach (German for *do*) is a generic command dispatcher for the command
line.

To use mach, you install the mach core (a Python package), create an
executable *driver* script (named whatever you want), and write mach
commands. When the *driver* is executed, mach dispatches to the
requested command handler automatically.

Features
========

On a high level, mach is similar to using argparse with subparsers (for
command handling). When you dig deeper, mach offers a number of
additional features:

Distributed command definitions
  With optparse/argparse, you have to define your commands on a central
  parser instance. With mach, you annotate your command methods with
  decorators and mach finds and dispatches to them automatically.

Command categories
  Mach commands can be grouped into categories when displayed in help.
  This is currently not possible with argparse.

Logging management
  Mach provides a facility for logging (both classical text and
  structured) that is available to any command handler.

Settings files
  Mach provides a facility for reading settings from an ini-like file
  format.
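
To make the idea of distributed command definitions concrete, here is a
minimal sketch that uses only the standard library. It is not mach's
implementation: the *COMMANDS* registry, the *command* decorator and the
*dispatch* function are hypothetical names invented for this illustration.
Handlers register themselves where they are defined, and the parser is
assembled from the registry afterwards:

```python
import argparse

# Hypothetical registry, filled in as decorated functions are defined.
# This only illustrates the idea; it is not how mach stores its commands.
COMMANDS = {}


def command(name, help=None):
    """Register the decorated function as the handler for ``name``."""
    def decorator(func):
        COMMANDS[name] = (func, help)
        return func
    return decorator


@command("build", help="Build the project.")
def build():
    return "building"


@command("check", help="Run the checks.")
def check():
    return "checking"


def dispatch(argv):
    """Build a parser from the registry, then call the requested handler."""
    parser = argparse.ArgumentParser(prog="driver")
    subparsers = parser.add_subparsers(dest="command")
    subparsers.required = True  # a command name must be given
    for name, (_, help_text) in sorted(COMMANDS.items()):
        subparsers.add_parser(name, help=help_text)
    args = parser.parse_args(argv)
    handler, _ = COMMANDS[args.command]
    return handler()
```

Calling ``dispatch(["build"])`` runs the *build* handler. No central code
had to know about the handler before it registered itself.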

Components
==========

Mach is conceptually composed of the following components:

core
  The mach core is the core code powering mach. This is a Python package
  that contains all the business logic that makes mach work. The mach
  core is common to all mach deployments.

commands
  These are what mach dispatches to. Commands are simply Python methods
  registered as command names. The set of commands is unique to the
  environment mach is deployed in.

driver
  The *driver* is the entry-point to mach. It is simply an executable
  script that loads the mach core, tells it where commands can be found,
  then asks the mach core to handle the current request. The driver is
  unique to the deployed environment, but it is usually based on an
  example from this source tree.

Project State
=============

mach was originally written as a command dispatching framework to aid
Firefox development. While the code is mostly generic, there are still
some pieces that closely tie it to Mozilla/Firefox. The goal is for
these to eventually be removed and replaced with generic features so
mach is suitable for anybody to use. Until then, mach may not be the
best fit for you.

Implementing Commands
---------------------

Mach commands are defined via Python decorators.

All the relevant decorators are defined in the *mach.decorators* module.
The important decorators are as follows:

CommandProvider
  A class decorator that denotes that a class contains mach
  commands. The decorator takes no arguments.

Command
  A method decorator that denotes that the method should be called when
  the specified command is requested. The decorator takes a command name
  as its first argument and a number of additional arguments to
  configure the behavior of the command.

CommandArgument
  A method decorator that defines an argument to the command. Its
  arguments are essentially proxied to ArgumentParser.add_argument().

Classes with the *@CommandProvider* decorator *must* have an *__init__*
method that accepts 1 or 2 arguments (counting *self*). If it accepts 2
arguments, the 2nd argument will be a *MachCommandContext* instance. This
is just a named tuple containing references to objects provided by the
mach driver.

Here is a complete example::

    from mach.decorators import (
        CommandArgument,
        CommandProvider,
        Command,
    )

    @CommandProvider
    class MyClass(object):
        @Command('doit', help='Do ALL OF THE THINGS.')
        @CommandArgument('--force', '-f', action='store_true',
            help='Force doing it.')
        def doit(self, force=False):
            # Do stuff here.
            pass

When the module is loaded, the decorators tell mach about all handlers.
When mach runs, it takes the assembled metadata from these handlers and
hooks it up to the command line driver. Under the hood, arguments passed
to the decorators are being used to help mach parse command arguments,
formulate arguments to the methods, etc. See the documentation in the
*mach.base* module for more.

The Python modules defining mach commands do not need to live inside the
main mach source tree.

Conditionally Filtering Commands
--------------------------------

Sometimes it might only make sense to run a command given a certain
context. For example, running tests only makes sense if the product
they are testing has been built, and said build is available. To make
sure a command is only runnable from within a correct context, you can
define a series of conditions on the *Command* decorator.

A condition is simply a function that takes an instance of the
*CommandProvider* class as an argument, and returns True or False. If
any of the conditions defined on a command return False, the command
will not be runnable. The doc string of a condition function is used in
error messages, to explain why the command cannot currently be run.

Here is an example::

    from mach.decorators import (
        CommandProvider,
        Command,
    )

    def build_available(cls):
        """The build needs to be available."""
        return cls.build_path is not None

    @CommandProvider
    class MyClass(object):
        def __init__(self, build_path=None):
            self.build_path = build_path

        @Command('run_tests', conditions=[build_available])
        def run_tests(self):
            # Do stuff here.
            pass

It is important to make sure that any state needed by the condition is
available to instances of the command provider.
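
The way conditions gate a command can be sketched without mach itself.
The snippet below is an illustration, not mach's actual code: *Provider*
and *failed_conditions* are hypothetical names. It evaluates each
condition against a provider instance and collects the doc strings of
the ones that return False, which is the text a user would be shown:

```python
def build_available(provider):
    """The build needs to be available."""
    return provider.build_path is not None


class Provider(object):
    """Hypothetical stand-in for a class decorated with CommandProvider."""

    def __init__(self, build_path=None):
        self.build_path = build_path


def failed_conditions(provider, conditions):
    """Return the doc strings of every condition that rejects ``provider``."""
    return [cond.__doc__ for cond in conditions if not cond(provider)]
```

An empty result means the command may run. With
``failed_conditions(Provider(), [build_available])`` the list holds the
doc string of *build_available*, because no build path was set.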

By default all commands without any conditions applied will be runnable,
but it is possible to change this behaviour by setting *require_conditions*
to True::

    import os
    import mach.main
    m = mach.main.Mach(os.getcwd())
    m.require_conditions = True

Minimizing Code in Commands
---------------------------

Mach command modules, classes, and methods work best when they are
minimal dispatchers. The reason is import bloat. Currently, the mach
core needs to import every Python file potentially containing mach
commands for every command invocation. If you have dozens of commands or
commands in modules that import a lot of Python code, these imports
could slow mach down and waste memory.

It is thus recommended that mach modules, classes, and methods do as
little work as possible. Ideally the module should only import from
the *mach* package. If you need external modules, you should import them
from within the command method.

To keep code size small, the body of a command method should be limited
to:

1. Obtaining user input (parsing arguments, prompting, etc)
2. Calling into some other Python package
3. Formatting output

Of course, these recommendations can be ignored if you want to risk
slower performance.
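
As a rough illustration of the advice above, a command body can defer
its heavier imports until it actually runs. The function name and the
choice of *json* are made up for this example; in a real deployment the
function would be a decorated method on a command provider class:

```python
def export_report(data):
    """Hypothetical command body that defers its import until it is called."""
    # Imported here rather than at module level, so that merely loading
    # this module (which happens on every invocation) stays cheap.
    import json

    # The real work is delegated to the imported package; the command
    # only passes input along and returns the formatted output.
    return json.dumps(data, sort_keys=True)
```

Loading the module that defines *export_report* does not import *json*.
That cost is paid only when this particular command runs.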

In the future, the mach driver may cache the dispatching information or
have it intelligently loaded to facilitate lazy loading.

Logging
=======

Mach configures a built-in logging facility so commands can easily log
data.

What sets the logging facility apart from most loggers you've seen is
that it encourages structured logging. Instead of conventional logging
where simple strings are logged, the internal logging mechanism logs all
events with the following pieces of information:

* A string *action*
* A dict of log message fields
* A formatting string

Essentially, instead of assembling a human-readable string at
logging-time, you create an object holding all the pieces of data that
will constitute your logged event. For each unique type of logged event,
you assign an *action* name.

Depending on how logging is configured, your logged event could get
written in a couple of different ways.

JSON Logging
------------

Where machines are the intended target of the logging data, a JSON
logger is configured. The JSON logger assembles an array consisting of
the following elements:

* Decimal wall clock time in seconds since UNIX epoch
* String *action* of message
* Object with structured message data

The JSON-serialized array is written to a configured file handle.
Consumers of this logging stream can just perform a readline() then feed
that into a JSON deserializer to reconstruct the original logged
message. They can key off the *action* element to determine how to
process individual events. There is no need to invent a parser.
Convenient, isn't it?
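
A consumer along those lines might look like the following sketch. It
assumes one JSON array per line, as described above. The function names
*read_events* and *count_actions* are hypothetical, not part of mach:

```python
import json


def read_events(handle):
    """Yield (time, action, params) tuples from a JSON log stream."""
    for line in handle:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        when, action, params = json.loads(line)
        yield when, action, params


def count_actions(handle):
    """Key off the *action* element to count events of each type."""
    counts = {}
    for _, action, _ in read_events(handle):
        counts[action] = counts.get(action, 0) + 1
    return counts
```

Any iterable of lines works as *handle*, for example an open log file or
an ``io.StringIO`` holding captured output.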

Logging for Humans
------------------

Where humans are the intended consumer of a log message, the structured
log messages are converted to a more human-friendly form. This is done by
utilizing the *formatting* string provided at log time. The logger
simply calls the *format* method of the formatting string, passing the
dict containing the message's fields.

When *mach* is used in a terminal that supports it, the logging facility
also supports terminal features such as colorization. This is done
automatically in the logging layer - there is no need to control this at
logging time.

In addition, messages intended for humans typically have every line
prefixed with the time elapsed since the application started.

Logging HOWTO
-------------

Structured logging piggybacks on top of Python's built-in logging
infrastructure provided by the *logging* package. We accomplish this by
taking advantage of *logging.Logger.log()*'s *extra* argument. To this
argument, we pass a dict with the fields *action* and *params*. These
are the string *action* and dict of message fields, respectively. The
formatting string is passed as the *msg* argument, like normal.

If you were logging to a logger directly, you would do something like::

    import logging
    logger = logging.getLogger(__name__)
    logger.log(logging.INFO, 'My name is {name}',
        extra={'action': 'my_name', 'params': {'name': 'Gregory'}})

The JSON logging would produce something like::

    [1339985554.306338, "my_name", {"name": "Gregory"}]

Human logging would produce something like::

    0.52 My name is Gregory
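
To show how the *extra* fields can turn into those two outputs, here is
a sketch of two formatters built on the standard *logging* package. The
class names are hypothetical and the code is a simplification, not
mach's own formatters. It relies on the fact that keys passed via
*extra* become attributes of the log record:

```python
import json
import logging
import time


class JSONFormatter(logging.Formatter):
    """Serialize [time, action, params] as one JSON array per record."""

    def format(self, record):
        # ``action`` and ``params`` arrive through the ``extra`` argument.
        return json.dumps([record.created, record.action, record.params])


class HumanFormatter(logging.Formatter):
    """Prefix the formatted message with the seconds elapsed since start."""

    def __init__(self, start_time=None):
        super().__init__()
        self.start_time = time.time() if start_time is None else start_time

    def format(self, record):
        elapsed = record.created - self.start_time
        # The formatting string is filled in from the message fields.
        return "%.2f %s" % (elapsed, record.msg.format(**record.params))
```

Attaching one of these to a handler with ``handler.setFormatter(...)``
selects the output style, while the logging call itself stays the same.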

Since there is a lot of complexity using logger.log directly, it is
recommended to go through a wrapping layer that hides part of the
complexity for you. The easiest way to do this is by utilizing the
LoggingMixin::

    import logging
    from mach.mixin.logging import LoggingMixin

    class MyClass(LoggingMixin):
        def foo(self):
             self.log(logging.INFO, 'foo_start', {'bar': True},
                 'Foo performed. Bar: {bar}')

Entry Points
============

It is possible to use setuptools' entry points to load commands
directly from Python packages. A mach entry point is a function which
returns a list of files or directories containing mach command
providers, e.g.::

    import os

    def list_providers():
        providers = []
        here = os.path.abspath(os.path.dirname(__file__))
        for p in os.listdir(here):
            if p.endswith('.py'):
                providers.append(os.path.join(here, p))
        return providers

See http://pythonhosted.org/setuptools/setuptools.html#dynamic-discovery-of-services-and-plugins
for more information on creating an entry point. To search for entry
point plugins, you can call *load_commands_from_entry_point*. This
takes a single parameter called *group*. This is the name of the entry
point group to load and defaults to ``mach.providers``. e.g.::

    mach.load_commands_from_entry_point("mach.external.providers")
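
For completeness, this is roughly how a package could advertise such a
function to setuptools. Everything except the ``mach.providers`` group
name is a made-up placeholder: the distribution name, the package name
and the entry name are assumptions for the sake of the example:

```python
# Keyword arguments a hypothetical package could pass to setuptools.setup().
SETUP_KWARGS = {
    "name": "my-mach-commands",        # made-up distribution name
    "packages": ["my_mach_commands"],  # made-up package
    "entry_points": {
        # The group name; "mach.providers" is the default group searched.
        "mach.providers": [
            # Format: "<entry name> = <module>:<function>"
            "my_commands = my_mach_commands:list_providers",
        ],
    },
}
```

In a ``setup.py`` this would be used as ``setup(**SETUP_KWARGS)``. A
package that registers under a different group name has to be loaded by
passing that name as *group*.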

Adding Global Arguments
=======================

Arguments to mach commands are usually command-specific. However,
mach ships with a handful of global arguments that apply to all
commands.

It is possible to extend the list of global arguments. In your
*mach driver*, simply call ``add_global_argument()`` on your
``mach.main.Mach`` instance. e.g.::

    import os
    import mach.main

    m = mach.main.Mach(os.getcwd())

    # Will allow --example to be specified on every mach command.
    m.add_global_argument('--example', action='store_true',
        help='Demonstrate an example global argument.')
