deploy: 5ecad4e7a5
				
					
				
			
							parent
							
								
									46f9dffe54
								
							
						
					
					
						commit
						ca15a1d699
					
				|  | @ -194,10 +194,13 @@ activity.</p> | |||
| <p>The <code>synapse.logging.context</code> module provides a facilities for managing | ||||
| the current log context (as well as providing the <code>LoggingContextFilter</code> | ||||
| class).</p> | ||||
| <p>Deferreds make the whole thing complicated, so this document describes | ||||
| <p>Asynchronous functions make the whole thing complicated, so this document describes | ||||
| how it all works, and how to write code which follows the rules.</p> | ||||
| <h2 id="logcontexts-without-deferreds"><a class="header" href="#logcontexts-without-deferreds">Logcontexts without Deferreds</a></h2> | ||||
| <p>In the absence of any Deferred voodoo, things are simple enough. As with | ||||
| <p>In this document, "awaitable" refers to any object which can be <code>await</code>ed. In the context of | ||||
| Synapse, that normally means either a coroutine or a Twisted  | ||||
| <a href="https://twistedmatrix.com/documents/current/api/twisted.internet.defer.Deferred.html"><code>Deferred</code></a>.</p> | ||||
| <h2 id="logcontexts-without-asynchronous-code"><a class="header" href="#logcontexts-without-asynchronous-code">Logcontexts without asynchronous code</a></h2> | ||||
| <p>In the absence of any asynchronous voodoo, things are simple enough. As with | ||||
| any code of this nature, the rule is that our function should leave | ||||
| things as it found them:</p> | ||||
| <pre><code class="language-python">from synapse.logging import context         # omitted from future snippets | ||||
|  | @ -227,39 +230,36 @@ can be written much more succinctly as:</p> | |||
| def do_request_handling(): | ||||
|     logger.debug("phew") | ||||
| </code></pre> | ||||
| <h2 id="using-logcontexts-with-deferreds"><a class="header" href="#using-logcontexts-with-deferreds">Using logcontexts with Deferreds</a></h2> | ||||
| <p>Deferreds --- and in particular, <code>defer.inlineCallbacks</code> --- break the | ||||
| linear flow of code so that there is no longer a single entry point | ||||
| where we should set the logcontext and a single exit point where we | ||||
| should remove it.</p> | ||||
| <h2 id="using-logcontexts-with-awaitables"><a class="header" href="#using-logcontexts-with-awaitables">Using logcontexts with awaitables</a></h2> | ||||
| <p>Awaitables break the linear flow of code so that there is no longer a single entry point | ||||
| where we should set the logcontext and a single exit point where we should remove it.</p> | ||||
| <p>Consider the example above, where <code>do_request_handling</code> needs to do some | ||||
| blocking operation, and returns a deferred:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def handle_request(request_id): | ||||
| blocking operation, and returns an awaitable:</p> | ||||
| <pre><code class="language-python">async def handle_request(request_id): | ||||
|     with context.LoggingContext() as request_context: | ||||
|         request_context.request = request_id | ||||
|         yield do_request_handling() | ||||
|         await do_request_handling() | ||||
|         logger.debug("finished") | ||||
| </code></pre> | ||||
| <p>In the above flow:</p> | ||||
| <ul> | ||||
| <li>The logcontext is set</li> | ||||
| <li><code>do_request_handling</code> is called, and returns a deferred</li> | ||||
| <li><code>handle_request</code> yields the deferred</li> | ||||
| <li>The <code>inlineCallbacks</code> wrapper of <code>handle_request</code> returns a deferred</li> | ||||
| <li><code>do_request_handling</code> is called, and returns an awaitable</li> | ||||
| <li><code>handle_request</code> awaits the awaitable</li> | ||||
| <li>Execution of <code>handle_request</code> is suspended</li> | ||||
| </ul> | ||||
| <p>So we have stopped processing the request (and will probably go on to | ||||
| start processing the next), without clearing the logcontext.</p> | ||||
| <p>To circumvent this problem, synapse code assumes that, wherever you have | ||||
| a deferred, you will want to yield on it. To that end, whereever | ||||
| functions return a deferred, we adopt the following conventions:</p> | ||||
| <p><strong>Rules for functions returning deferreds:</strong></p> | ||||
| an awaitable, you will want to <code>await</code> it. To that end, whereever | ||||
| functions return awaitables, we adopt the following conventions:</p> | ||||
| <p><strong>Rules for functions returning awaitables:</strong></p> | ||||
| <blockquote> | ||||
| <ul> | ||||
| <li>If the deferred is already complete, the function returns with the | ||||
| <li>If the awaitable is already complete, the function returns with the | ||||
| same logcontext it started with.</li> | ||||
| <li>If the deferred is incomplete, the function clears the logcontext | ||||
| before returning; when the deferred completes, it restores the | ||||
| <li>If the awaitable is incomplete, the function clears the logcontext | ||||
| before returning; when the awaitable completes, it restores the | ||||
| logcontext before running any callbacks.</li> | ||||
| </ul> | ||||
| </blockquote> | ||||
|  | @ -267,72 +267,60 @@ logcontext before running any callbacks.</li> | |||
| the example above) "just works". There are two cases:</p> | ||||
| <ul> | ||||
| <li> | ||||
| <p>If <code>do_request_handling</code> returns a completed deferred, then the | ||||
| <p>If <code>do_request_handling</code> returns a completed awaitable, then the | ||||
| logcontext will still be in place. In this case, execution will | ||||
| continue immediately after the <code>yield</code>; the "finished" line will | ||||
| continue immediately after the <code>await</code>; the "finished" line will | ||||
| be logged against the right context, and the <code>with</code> block restores | ||||
| the original context before we return to the caller.</p> | ||||
| </li> | ||||
| <li> | ||||
| <p>If the returned deferred is incomplete, <code>do_request_handling</code> clears | ||||
| <p>If the returned awaitable is incomplete, <code>do_request_handling</code> clears | ||||
| the logcontext before returning. The logcontext is therefore clear | ||||
| when <code>handle_request</code> yields the deferred. At that point, the | ||||
| <code>inlineCallbacks</code> wrapper adds a callback to the deferred, and | ||||
| returns another (incomplete) deferred to the caller, and it is safe | ||||
| to begin processing the next request.</p> | ||||
| <p>Once <code>do_request_handling</code>'s deferred completes, it will reinstate | ||||
| the logcontext, before running the callback added by the | ||||
| <code>inlineCallbacks</code> wrapper. That callback runs the second half of | ||||
| <code>handle_request</code>, so again the "finished" line will be logged | ||||
| against the right context, and the <code>with</code> block restores the | ||||
| original context.</p> | ||||
| when <code>handle_request</code> <code>await</code>s the awaitable.</p> | ||||
| <p>Once <code>do_request_handling</code>'s awaitable completes, it will reinstate | ||||
| the logcontext, before running the second half of <code>handle_request</code>, | ||||
| so again the "finished" line will be logged against the right context, | ||||
| and the <code>with</code> block restores the original context.</p> | ||||
| </li> | ||||
| </ul> | ||||
| <p>As an aside, it's worth noting that <code>handle_request</code> follows our rules | ||||
| -though that only matters if the caller has its own logcontext which it | ||||
| cares about.</p> | ||||
| <p>As an aside, it's worth noting that <code>handle_request</code> follows our rules</p> | ||||
| <ul> | ||||
| <li>though that only matters if the caller has its own logcontext which it | ||||
| cares about.</li> | ||||
| </ul> | ||||
| <p>The following sections describe pitfalls and helpful patterns when | ||||
| implementing these rules.</p> | ||||
| <h2 id="always-yield-your-deferreds"><a class="header" href="#always-yield-your-deferreds">Always yield your deferreds</a></h2> | ||||
| <p>Whenever you get a deferred back from a function, you should <code>yield</code> on | ||||
| it as soon as possible. (Returning it directly to your caller is ok too, | ||||
| if you're not doing <code>inlineCallbacks</code>.) Do not pass go; do not do any | ||||
| logging; do not call any other functions.</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def fun(): | ||||
| <h2 id="always-await-your-awaitables"><a class="header" href="#always-await-your-awaitables">Always await your awaitables</a></h2> | ||||
| <p>Whenever you get an awaitable back from a function, you should <code>await</code> on | ||||
| it as soon as possible. Do not pass go; do not do any logging; do not | ||||
| call any other functions.</p> | ||||
| <pre><code class="language-python">async def fun(): | ||||
|     logger.debug("starting") | ||||
|     yield do_some_stuff()       # just like this | ||||
|     await do_some_stuff()       # just like this | ||||
| 
 | ||||
|     d = more_stuff() | ||||
|     result = yield d            # also fine, of course | ||||
|     coro = more_stuff() | ||||
|     result = await coro         # also fine, of course | ||||
| 
 | ||||
|     return result | ||||
| 
 | ||||
| def nonInlineCallbacksFun(): | ||||
|     logger.debug("just a wrapper really") | ||||
|     return do_some_stuff()      # this is ok too - the caller will yield on | ||||
|                                 # it anyway. | ||||
| </code></pre> | ||||
| <p>Provided this pattern is followed all the way back up to the callchain | ||||
| to where the logcontext was set, this will make things work out ok: | ||||
| provided <code>do_some_stuff</code> and <code>more_stuff</code> follow the rules above, then | ||||
| so will <code>fun</code> (as wrapped by <code>inlineCallbacks</code>) and | ||||
| <code>nonInlineCallbacksFun</code>.</p> | ||||
| <p>It's all too easy to forget to <code>yield</code>: for instance if we forgot that | ||||
| <code>do_some_stuff</code> returned a deferred, we might plough on regardless. This | ||||
| so will <code>fun</code>.</p> | ||||
| <p>It's all too easy to forget to <code>await</code>: for instance if we forgot that | ||||
| <code>do_some_stuff</code> returned an awaitable, we might plough on regardless. This | ||||
| leads to a mess; it will probably work itself out eventually, but not | ||||
| before a load of stuff has been logged against the wrong context. | ||||
| (Normally, other things will break, more obviously, if you forget to | ||||
| <code>yield</code>, so this tends not to be a major problem in practice.)</p> | ||||
| <code>await</code>, so this tends not to be a major problem in practice.)</p> | ||||
| <p>Of course sometimes you need to do something a bit fancier with your | ||||
| Deferreds - not all code follows the linear A-then-B-then-C pattern. | ||||
| awaitable - not all code follows the linear A-then-B-then-C pattern. | ||||
| Notes on implementing more complex patterns are in later sections.</p> | ||||
| <h2 id="where-you-create-a-new-deferred-make-it-follow-the-rules"><a class="header" href="#where-you-create-a-new-deferred-make-it-follow-the-rules">Where you create a new Deferred, make it follow the rules</a></h2> | ||||
| <p>Most of the time, a Deferred comes from another synapse function. | ||||
| Sometimes, though, we need to make up a new Deferred, or we get a | ||||
| Deferred back from external code. We need to make it follow our rules.</p> | ||||
| <p>The easy way to do it is with a combination of <code>defer.inlineCallbacks</code>, | ||||
| and <code>context.PreserveLoggingContext</code>. Suppose we want to implement | ||||
| <h2 id="where-you-create-a-new-awaitable-make-it-follow-the-rules"><a class="header" href="#where-you-create-a-new-awaitable-make-it-follow-the-rules">Where you create a new awaitable, make it follow the rules</a></h2> | ||||
| <p>Most of the time, an awaitable comes from another synapse function. | ||||
| Sometimes, though, we need to make up a new awaitable, or we get an awaitable | ||||
| back from external code. We need to make it follow our rules.</p> | ||||
| <p>The easy way to do it is by using <code>context.make_deferred_yieldable</code>. Suppose we want to implement | ||||
| <code>sleep</code>, which returns a deferred which will run its callbacks after a | ||||
| given number of seconds. That might look like:</p> | ||||
| <pre><code class="language-python"># not a logcontext-rules-compliant function | ||||
|  | @ -341,37 +329,26 @@ def get_sleep_deferred(seconds): | |||
|     reactor.callLater(seconds, d.callback, None) | ||||
|     return d | ||||
| </code></pre> | ||||
| <p>That doesn't follow the rules, but we can fix it by wrapping it with | ||||
| <code>PreserveLoggingContext</code> and <code>yield</code> ing on it:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def sleep(seconds): | ||||
|     with PreserveLoggingContext(): | ||||
|         yield get_sleep_deferred(seconds) | ||||
| </code></pre> | ||||
| <p>This technique works equally for external functions which return | ||||
| deferreds, or deferreds we have made ourselves.</p> | ||||
| <p>You can also use <code>context.make_deferred_yieldable</code>, which just does the | ||||
| boilerplate for you, so the above could be written:</p> | ||||
| <pre><code class="language-python">def sleep(seconds): | ||||
|     return context.make_deferred_yieldable(get_sleep_deferred(seconds)) | ||||
| <p>That doesn't follow the rules, but we can fix it by calling it through | ||||
| <code>context.make_deferred_yieldable</code>:</p> | ||||
| <pre><code class="language-python">async def sleep(seconds): | ||||
|     return await context.make_deferred_yieldable(get_sleep_deferred(seconds)) | ||||
| </code></pre> | ||||
| <h2 id="fire-and-forget"><a class="header" href="#fire-and-forget">Fire-and-forget</a></h2> | ||||
| <p>Sometimes you want to fire off a chain of execution, but not wait for | ||||
| its result. That might look a bit like this:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def do_request_handling(): | ||||
|     yield foreground_operation() | ||||
| <pre><code class="language-python">async def do_request_handling(): | ||||
|     await foreground_operation() | ||||
| 
 | ||||
|     # *don't* do this | ||||
|     background_operation() | ||||
| 
 | ||||
|     logger.debug("Request handling complete") | ||||
| 
 | ||||
| @defer.inlineCallbacks | ||||
| def background_operation(): | ||||
|     yield first_background_step() | ||||
| async def background_operation(): | ||||
|     await first_background_step() | ||||
|     logger.debug("Completed first step") | ||||
|     yield second_background_step() | ||||
|     await second_background_step() | ||||
|     logger.debug("Completed second step") | ||||
| </code></pre> | ||||
| <p>The above code does a couple of steps in the background after | ||||
|  | @ -379,12 +356,12 @@ def background_operation(): | |||
| against the <code>request_context</code> logcontext, which may or may not be | ||||
| desirable. There are two big problems with the above, however. The first | ||||
| problem is that, if <code>background_operation</code> returns an incomplete | ||||
| Deferred, it will expect its caller to <code>yield</code> immediately, so will have | ||||
| awaitable, it will expect its caller to <code>await</code> immediately, so will have | ||||
| cleared the logcontext. In this example, that means that 'Request | ||||
| handling complete' will be logged without any context.</p> | ||||
| <p>The second problem, which is potentially even worse, is that when the | ||||
| Deferred returned by <code>background_operation</code> completes, it will restore | ||||
| the original logcontext. There is nothing waiting on that Deferred, so | ||||
| awaitable returned by <code>background_operation</code> completes, it will restore | ||||
| the original logcontext. There is nothing waiting on that awaitable, so | ||||
| the logcontext will leak into the reactor and possibly get attached to | ||||
| some arbitrary future operation.</p> | ||||
| <p>There are two potential solutions to this.</p> | ||||
|  | @ -393,9 +370,8 @@ some arbitrary future operation.</p> | |||
| starting <code>background_operation</code> (so the context restored when the | ||||
| deferred completes will be the empty logcontext), and will restore the | ||||
| current logcontext before continuing the foreground process:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def do_request_handling(): | ||||
|     yield foreground_operation() | ||||
| <pre><code class="language-python">async def do_request_handling(): | ||||
|     await foreground_operation() | ||||
| 
 | ||||
|     # start background_operation off in the empty logcontext, to | ||||
|     # avoid leaking the current context into the reactor. | ||||
|  | @ -411,14 +387,13 @@ def do_request_handling(): | |||
| <code>with LoggingContext(...)</code> in <code>background_operation</code>).</p> | ||||
| <p>The second option is to use <code>context.run_in_background</code>, which wraps a | ||||
| function so that it doesn't reset the logcontext even when it returns | ||||
| an incomplete deferred, and adds a callback to the returned deferred to | ||||
| an incomplete awaitable, and adds a callback to the returned awaitable to | ||||
| reset the logcontext. In other words, it turns a function that follows | ||||
| the Synapse rules about logcontexts and Deferreds into one which behaves | ||||
| the Synapse rules about logcontexts and awaitables into one which behaves | ||||
| more like an external function --- the opposite operation to that | ||||
| described in the previous section. It can be used like this:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def do_request_handling(): | ||||
|     yield foreground_operation() | ||||
| <pre><code class="language-python">async def do_request_handling(): | ||||
|     await foreground_operation() | ||||
| 
 | ||||
|     context.run_in_background(background_operation) | ||||
| 
 | ||||
|  | @ -427,120 +402,41 @@ def do_request_handling(): | |||
| </code></pre> | ||||
| <h2 id="passing-synapse-deferreds-into-third-party-functions"><a class="header" href="#passing-synapse-deferreds-into-third-party-functions">Passing synapse deferreds into third-party functions</a></h2> | ||||
| <p>A typical example of this is where we want to collect together two or | ||||
| more deferred via <code>defer.gatherResults</code>:</p> | ||||
| <pre><code class="language-python">d1 = operation1() | ||||
| d2 = operation2() | ||||
| d3 = defer.gatherResults([d1, d2]) | ||||
| more awaitables via <code>defer.gatherResults</code>:</p> | ||||
| <pre><code class="language-python">a1 = operation1() | ||||
| a2 = operation2() | ||||
| a3 = defer.gatherResults([a1, a2]) | ||||
| </code></pre> | ||||
| <p>This is really a variation of the fire-and-forget problem above, in that | ||||
| we are firing off <code>d1</code> and <code>d2</code> without yielding on them. The difference | ||||
| we are firing off <code>a1</code> and <code>a2</code> without awaiting on them. The difference | ||||
| is that we now have third-party code attached to their callbacks. Anyway | ||||
| either technique given in the <a href="#fire-and-forget">Fire-and-forget</a> | ||||
| section will work.</p> | ||||
| <p>Of course, the new Deferred returned by <code>gatherResults</code> needs to be | ||||
| <p>Of course, the new awaitable returned by <code>gather</code> needs to be | ||||
| wrapped in order to make it follow the logcontext rules before we can | ||||
| yield it, as described in <a href="#where-you-create-a-new-deferred-make-it-follow-the-rules">Where you create a new Deferred, make it | ||||
| yield it, as described in <a href="#where-you-create-a-new-awaitable-make-it-follow-the-rules">Where you create a new awaitable, make it | ||||
| follow the | ||||
| rules</a>.</p> | ||||
| <p>So, option one: reset the logcontext before starting the operations to | ||||
| be gathered:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def do_request_handling(): | ||||
| <pre><code class="language-python">async def do_request_handling(): | ||||
|     with PreserveLoggingContext(): | ||||
|         d1 = operation1() | ||||
|         d2 = operation2() | ||||
|         result = yield defer.gatherResults([d1, d2]) | ||||
|         a1 = operation1() | ||||
|         a2 = operation2() | ||||
|         result = await defer.gatherResults([a1, a2]) | ||||
| </code></pre> | ||||
| <p>In this case particularly, though, option two, of using | ||||
| <code>context.preserve_fn</code> almost certainly makes more sense, so that | ||||
| <code>context.run_in_background</code> almost certainly makes more sense, so that | ||||
| <code>operation1</code> and <code>operation2</code> are both logged against the original | ||||
| logcontext. This looks like:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def do_request_handling(): | ||||
|     d1 = context.preserve_fn(operation1)() | ||||
|     d2 = context.preserve_fn(operation2)() | ||||
| <pre><code class="language-python">async def do_request_handling(): | ||||
|     a1 = context.run_in_background(operation1) | ||||
|     a2 = context.run_in_background(operation2) | ||||
| 
 | ||||
|     with PreserveLoggingContext(): | ||||
|         result = yield defer.gatherResults([d1, d2]) | ||||
|     result = await make_deferred_yieldable(defer.gatherResults([a1, a2])) | ||||
| </code></pre> | ||||
| <h2 id="was-all-this-really-necessary"><a class="header" href="#was-all-this-really-necessary">Was all this really necessary?</a></h2> | ||||
| <p>The conventions used work fine for a linear flow where everything | ||||
| happens in series via <code>defer.inlineCallbacks</code> and <code>yield</code>, but are | ||||
| certainly tricky to follow for any more exotic flows. It's hard not to | ||||
| wonder if we could have done something else.</p> | ||||
| <p>We're not going to rewrite Synapse now, so the following is entirely of | ||||
| academic interest, but I'd like to record some thoughts on an | ||||
| alternative approach.</p> | ||||
| <p>I briefly prototyped some code following an alternative set of rules. I | ||||
| think it would work, but I certainly didn't get as far as thinking how | ||||
| it would interact with concepts as complicated as the cache descriptors.</p> | ||||
| <p>My alternative rules were:</p> | ||||
| <ul> | ||||
| <li>functions always preserve the logcontext of their caller, whether or | ||||
| not they are returning a Deferred.</li> | ||||
| <li>Deferreds returned by synapse functions run their callbacks in the | ||||
| same context as the function was orignally called in.</li> | ||||
| </ul> | ||||
| <p>The main point of this scheme is that everywhere that sets the | ||||
| logcontext is responsible for clearing it before returning control to | ||||
| the reactor.</p> | ||||
| <p>So, for example, if you were the function which started a | ||||
| <code>with LoggingContext</code> block, you wouldn't <code>yield</code> within it --- instead | ||||
| you'd start off the background process, and then leave the <code>with</code> block | ||||
| to wait for it:</p> | ||||
| <pre><code class="language-python">def handle_request(request_id): | ||||
|     with context.LoggingContext() as request_context: | ||||
|         request_context.request = request_id | ||||
|         d = do_request_handling() | ||||
| 
 | ||||
|     def cb(r): | ||||
|         logger.debug("finished") | ||||
| 
 | ||||
|     d.addCallback(cb) | ||||
|     return d | ||||
| </code></pre> | ||||
| <p>(in general, mixing <code>with LoggingContext</code> blocks and | ||||
| <code>defer.inlineCallbacks</code> in the same function leads to slighly | ||||
| counter-intuitive code, under this scheme).</p> | ||||
| <p>Because we leave the original <code>with</code> block as soon as the Deferred is | ||||
| returned (as opposed to waiting for it to be resolved, as we do today), | ||||
| the logcontext is cleared before control passes back to the reactor; so | ||||
| if there is some code within <code>do_request_handling</code> which needs to wait | ||||
| for a Deferred to complete, there is no need for it to worry about | ||||
| clearing the logcontext before doing so:</p> | ||||
| <pre><code class="language-python">def handle_request(): | ||||
|     r = do_some_stuff() | ||||
|     r.addCallback(do_some_more_stuff) | ||||
|     return r | ||||
| </code></pre> | ||||
| <p>--- and provided <code>do_some_stuff</code> follows the rules of returning a | ||||
| Deferred which runs its callbacks in the original logcontext, all is | ||||
| happy.</p> | ||||
| <p>The business of a Deferred which runs its callbacks in the original | ||||
| logcontext isn't hard to achieve --- we have it today, in the shape of | ||||
| <code>context._PreservingContextDeferred</code>:</p> | ||||
| <pre><code class="language-python">def do_some_stuff(): | ||||
|     deferred = do_some_io() | ||||
|     pcd = _PreservingContextDeferred(LoggingContext.current_context()) | ||||
|     deferred.chainDeferred(pcd) | ||||
|     return pcd | ||||
| </code></pre> | ||||
| <p>It turns out that, thanks to the way that Deferreds chain together, we | ||||
| automatically get the property of a context-preserving deferred with | ||||
| <code>defer.inlineCallbacks</code>, provided the final Defered the function | ||||
| <code>yields</code> on has that property. So we can just write:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def handle_request(): | ||||
|     yield do_some_stuff() | ||||
|     yield do_some_more_stuff() | ||||
| </code></pre> | ||||
| <p>To conclude: I think this scheme would have worked equally well, with | ||||
| less danger of messing it up, and probably made some more esoteric code | ||||
| easier to write. But again --- changing the conventions of the entire | ||||
| Synapse codebase is not a sensible option for the marginal improvement | ||||
| offered.</p> | ||||
| <h2 id="a-note-on-garbage-collection-of-deferred-chains"><a class="header" href="#a-note-on-garbage-collection-of-deferred-chains">A note on garbage-collection of Deferred chains</a></h2> | ||||
| <p>It turns out that our logcontext rules do not play nicely with Deferred | ||||
| <h2 id="a-note-on-garbage-collection-of-awaitable-chains"><a class="header" href="#a-note-on-garbage-collection-of-awaitable-chains">A note on garbage-collection of awaitable chains</a></h2> | ||||
| <p>It turns out that our logcontext rules do not play nicely with awaitable | ||||
| chains which get orphaned and garbage-collected.</p> | ||||
| <p>Imagine we have some code that looks like this:</p> | ||||
| <pre><code class="language-python">listener_queue = [] | ||||
|  | @ -549,13 +445,12 @@ def on_something_interesting(): | |||
|     for d in listener_queue: | ||||
|         d.callback("foo") | ||||
| 
 | ||||
| @defer.inlineCallbacks | ||||
| def await_something_interesting(): | ||||
|     new_deferred = defer.Deferred() | ||||
|     listener_queue.append(new_deferred) | ||||
| async def await_something_interesting(): | ||||
|     new_awaitable = defer.Deferred() | ||||
|     listener_queue.append(new_awaitable) | ||||
| 
 | ||||
|     with PreserveLoggingContext(): | ||||
|         yield new_deferred | ||||
|         await new_awaitable | ||||
| </code></pre> | ||||
| <p>Obviously, the idea here is that we have a bunch of things which are | ||||
| waiting for an event. (It's just an example of the problem here, but a | ||||
|  | @ -568,18 +463,19 @@ never going to happen, and we reset the listener queue:</p> | |||
| <pre><code class="language-python">def reset_listener_queue(): | ||||
|     listener_queue.clear() | ||||
| </code></pre> | ||||
| <p>So, both ends of the deferred chain have now dropped their references, | ||||
| and the deferred chain is now orphaned, and will be garbage-collected at | ||||
| some point. Note that <code>await_something_interesting</code> is a generator | ||||
| function, and when Python garbage-collects generator functions, it gives | ||||
| them a chance to clean up by making the <code>yield</code> raise a <code>GeneratorExit</code> | ||||
| <p>So, both ends of the awaitable chain have now dropped their references, | ||||
| and the awaitable chain is now orphaned, and will be garbage-collected at | ||||
| some point. Note that <code>await_something_interesting</code> is a coroutine,  | ||||
| which Python implements as a generator function.  When Python | ||||
| garbage-collects generator functions, it gives them a chance to  | ||||
| clean up by making the <code>async</code> (or <code>yield</code>) raise a <code>GeneratorExit</code> | ||||
| exception. In our case, that means that the <code>__exit__</code> handler of | ||||
| <code>PreserveLoggingContext</code> will carefully restore the request context, but | ||||
| there is now nothing waiting for its return, so the request context is | ||||
| never cleared.</p> | ||||
| <p>To reiterate, this problem only arises when <em>both</em> ends of a deferred | ||||
| chain are dropped. Dropping the the reference to a deferred you're | ||||
| supposed to be calling is probably bad practice, so this doesn't | ||||
| <p>To reiterate, this problem only arises when <em>both</em> ends of a awaitable | ||||
| chain are dropped. Dropping the the reference to an awaitable you're | ||||
| supposed to be awaiting is bad practice, so this doesn't | ||||
| actually happen too much. Unfortunately, when it does happen, it will | ||||
| lead to leaked logcontexts which are incredibly hard to track down.</p> | ||||
| 
 | ||||
|  |  | |||
|  | @ -11831,10 +11831,13 @@ activity.</p> | |||
| <p>The <code>synapse.logging.context</code> module provides a facilities for managing | ||||
| the current log context (as well as providing the <code>LoggingContextFilter</code> | ||||
| class).</p> | ||||
| <p>Deferreds make the whole thing complicated, so this document describes | ||||
| <p>Asynchronous functions make the whole thing complicated, so this document describes | ||||
| how it all works, and how to write code which follows the rules.</p> | ||||
| <h2 id="logcontexts-without-deferreds"><a class="header" href="#logcontexts-without-deferreds">Logcontexts without Deferreds</a></h2> | ||||
| <p>In the absence of any Deferred voodoo, things are simple enough. As with | ||||
| <p>In this document, "awaitable" refers to any object which can be <code>await</code>ed. In the context of | ||||
| Synapse, that normally means either a coroutine or a Twisted  | ||||
| <a href="https://twistedmatrix.com/documents/current/api/twisted.internet.defer.Deferred.html"><code>Deferred</code></a>.</p> | ||||
| <h2 id="logcontexts-without-asynchronous-code"><a class="header" href="#logcontexts-without-asynchronous-code">Logcontexts without asynchronous code</a></h2> | ||||
| <p>In the absence of any asynchronous voodoo, things are simple enough. As with | ||||
| any code of this nature, the rule is that our function should leave | ||||
| things as it found them:</p> | ||||
| <pre><code class="language-python">from synapse.logging import context         # omitted from future snippets | ||||
|  | @ -11864,39 +11867,36 @@ can be written much more succinctly as:</p> | |||
| def do_request_handling(): | ||||
|     logger.debug("phew") | ||||
| </code></pre> | ||||
| <h2 id="using-logcontexts-with-deferreds"><a class="header" href="#using-logcontexts-with-deferreds">Using logcontexts with Deferreds</a></h2> | ||||
| <p>Deferreds --- and in particular, <code>defer.inlineCallbacks</code> --- break the | ||||
| linear flow of code so that there is no longer a single entry point | ||||
| where we should set the logcontext and a single exit point where we | ||||
| should remove it.</p> | ||||
| <h2 id="using-logcontexts-with-awaitables"><a class="header" href="#using-logcontexts-with-awaitables">Using logcontexts with awaitables</a></h2> | ||||
| <p>Awaitables break the linear flow of code so that there is no longer a single entry point | ||||
| where we should set the logcontext and a single exit point where we should remove it.</p> | ||||
| <p>Consider the example above, where <code>do_request_handling</code> needs to do some | ||||
| blocking operation, and returns a deferred:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def handle_request(request_id): | ||||
| blocking operation, and returns an awaitable:</p> | ||||
| <pre><code class="language-python">async def handle_request(request_id): | ||||
|     with context.LoggingContext() as request_context: | ||||
|         request_context.request = request_id | ||||
|         yield do_request_handling() | ||||
|         await do_request_handling() | ||||
|         logger.debug("finished") | ||||
| </code></pre> | ||||
| <p>In the above flow:</p> | ||||
| <ul> | ||||
| <li>The logcontext is set</li> | ||||
| <li><code>do_request_handling</code> is called, and returns a deferred</li> | ||||
| <li><code>handle_request</code> yields the deferred</li> | ||||
| <li>The <code>inlineCallbacks</code> wrapper of <code>handle_request</code> returns a deferred</li> | ||||
| <li><code>do_request_handling</code> is called, and returns an awaitable</li> | ||||
| <li><code>handle_request</code> awaits the awaitable</li> | ||||
| <li>Execution of <code>handle_request</code> is suspended</li> | ||||
| </ul> | ||||
| <p>So we have stopped processing the request (and will probably go on to | ||||
| start processing the next), without clearing the logcontext.</p> | ||||
| <p>To circumvent this problem, synapse code assumes that, wherever you have | ||||
| a deferred, you will want to yield on it. To that end, whereever | ||||
| functions return a deferred, we adopt the following conventions:</p> | ||||
| <p><strong>Rules for functions returning deferreds:</strong></p> | ||||
| an awaitable, you will want to <code>await</code> it. To that end, whereever | ||||
| functions return awaitables, we adopt the following conventions:</p> | ||||
| <p><strong>Rules for functions returning awaitables:</strong></p> | ||||
| <blockquote> | ||||
| <ul> | ||||
| <li>If the deferred is already complete, the function returns with the | ||||
| <li>If the awaitable is already complete, the function returns with the | ||||
| same logcontext it started with.</li> | ||||
| <li>If the deferred is incomplete, the function clears the logcontext | ||||
| before returning; when the deferred completes, it restores the | ||||
| <li>If the awaitable is incomplete, the function clears the logcontext | ||||
| before returning; when the awaitable completes, it restores the | ||||
| logcontext before running any callbacks.</li> | ||||
| </ul> | ||||
| </blockquote> | ||||
|  | @ -11904,72 +11904,60 @@ logcontext before running any callbacks.</li> | |||
| the example above) "just works". There are two cases:</p> | ||||
| <ul> | ||||
| <li> | ||||
| <p>If <code>do_request_handling</code> returns a completed deferred, then the | ||||
| <p>If <code>do_request_handling</code> returns a completed awaitable, then the | ||||
| logcontext will still be in place. In this case, execution will | ||||
| continue immediately after the <code>yield</code>; the "finished" line will | ||||
| continue immediately after the <code>await</code>; the "finished" line will | ||||
| be logged against the right context, and the <code>with</code> block restores | ||||
| the original context before we return to the caller.</p> | ||||
| </li> | ||||
| <li> | ||||
| <p>If the returned deferred is incomplete, <code>do_request_handling</code> clears | ||||
| <p>If the returned awaitable is incomplete, <code>do_request_handling</code> clears | ||||
| the logcontext before returning. The logcontext is therefore clear | ||||
| when <code>handle_request</code> yields the deferred. At that point, the | ||||
| <code>inlineCallbacks</code> wrapper adds a callback to the deferred, and | ||||
| returns another (incomplete) deferred to the caller, and it is safe | ||||
| to begin processing the next request.</p> | ||||
| <p>Once <code>do_request_handling</code>'s deferred completes, it will reinstate | ||||
| the logcontext, before running the callback added by the | ||||
| <code>inlineCallbacks</code> wrapper. That callback runs the second half of | ||||
| <code>handle_request</code>, so again the "finished" line will be logged | ||||
| against the right context, and the <code>with</code> block restores the | ||||
| original context.</p> | ||||
| when <code>handle_request</code> <code>await</code>s the awaitable.</p> | ||||
| <p>Once <code>do_request_handling</code>'s awaitable completes, it will reinstate | ||||
| the logcontext, before running the second half of <code>handle_request</code>, | ||||
| so again the "finished" line will be logged against the right context, | ||||
| and the <code>with</code> block restores the original context.</p> | ||||
| </li> | ||||
| </ul> | ||||
| <p>As an aside, it's worth noting that <code>handle_request</code> follows our rules | ||||
| -though that only matters if the caller has its own logcontext which it | ||||
| cares about.</p> | ||||
| <p>As an aside, it's worth noting that <code>handle_request</code> follows our rules</p> | ||||
| <ul> | ||||
| <li>though that only matters if the caller has its own logcontext which it | ||||
| cares about.</li> | ||||
| </ul> | ||||
| <p>The following sections describe pitfalls and helpful patterns when | ||||
| implementing these rules.</p> | ||||
| <h2 id="always-yield-your-deferreds"><a class="header" href="#always-yield-your-deferreds">Always yield your deferreds</a></h2> | ||||
| <p>Whenever you get a deferred back from a function, you should <code>yield</code> on | ||||
| it as soon as possible. (Returning it directly to your caller is ok too, | ||||
| if you're not doing <code>inlineCallbacks</code>.) Do not pass go; do not do any | ||||
| logging; do not call any other functions.</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def fun(): | ||||
| <h2 id="always-await-your-awaitables"><a class="header" href="#always-await-your-awaitables">Always await your awaitables</a></h2> | ||||
| <p>Whenever you get an awaitable back from a function, you should <code>await</code> on | ||||
| it as soon as possible. Do not pass go; do not do any logging; do not | ||||
| call any other functions.</p> | ||||
| <pre><code class="language-python">async def fun(): | ||||
|     logger.debug("starting") | ||||
|     yield do_some_stuff()       # just like this | ||||
|     await do_some_stuff()       # just like this | ||||
| 
 | ||||
|     d = more_stuff() | ||||
|     result = yield d            # also fine, of course | ||||
|     coro = more_stuff() | ||||
|     result = await coro         # also fine, of course | ||||
| 
 | ||||
|     return result | ||||
| 
 | ||||
| def nonInlineCallbacksFun(): | ||||
|     logger.debug("just a wrapper really") | ||||
|     return do_some_stuff()      # this is ok too - the caller will yield on | ||||
|                                 # it anyway. | ||||
| </code></pre> | ||||
| <p>Provided this pattern is followed all the way back up to the callchain | ||||
| to where the logcontext was set, this will make things work out ok: | ||||
| provided <code>do_some_stuff</code> and <code>more_stuff</code> follow the rules above, then | ||||
| so will <code>fun</code> (as wrapped by <code>inlineCallbacks</code>) and | ||||
| <code>nonInlineCallbacksFun</code>.</p> | ||||
| <p>It's all too easy to forget to <code>yield</code>: for instance if we forgot that | ||||
| <code>do_some_stuff</code> returned a deferred, we might plough on regardless. This | ||||
| so will <code>fun</code>.</p> | ||||
| <p>It's all too easy to forget to <code>await</code>: for instance if we forgot that | ||||
| <code>do_some_stuff</code> returned an awaitable, we might plough on regardless. This | ||||
| leads to a mess; it will probably work itself out eventually, but not | ||||
| before a load of stuff has been logged against the wrong context. | ||||
| (Normally, other things will break, more obviously, if you forget to | ||||
| <code>yield</code>, so this tends not to be a major problem in practice.)</p> | ||||
| <code>await</code>, so this tends not to be a major problem in practice.)</p> | ||||
| <p>Of course sometimes you need to do something a bit fancier with your | ||||
| Deferreds - not all code follows the linear A-then-B-then-C pattern. | ||||
| awaitable - not all code follows the linear A-then-B-then-C pattern. | ||||
| Notes on implementing more complex patterns are in later sections.</p> | ||||
| <h2 id="where-you-create-a-new-deferred-make-it-follow-the-rules"><a class="header" href="#where-you-create-a-new-deferred-make-it-follow-the-rules">Where you create a new Deferred, make it follow the rules</a></h2> | ||||
| <p>Most of the time, a Deferred comes from another synapse function. | ||||
| Sometimes, though, we need to make up a new Deferred, or we get a | ||||
| Deferred back from external code. We need to make it follow our rules.</p> | ||||
| <p>The easy way to do it is with a combination of <code>defer.inlineCallbacks</code>, | ||||
| and <code>context.PreserveLoggingContext</code>. Suppose we want to implement | ||||
| <h2 id="where-you-create-a-new-awaitable-make-it-follow-the-rules"><a class="header" href="#where-you-create-a-new-awaitable-make-it-follow-the-rules">Where you create a new awaitable, make it follow the rules</a></h2> | ||||
| <p>Most of the time, an awaitable comes from another synapse function. | ||||
| Sometimes, though, we need to make up a new awaitable, or we get an awaitable | ||||
| back from external code. We need to make it follow our rules.</p> | ||||
| <p>The easy way to do it is by using <code>context.make_deferred_yieldable</code>. Suppose we want to implement | ||||
| <code>sleep</code>, which returns a deferred which will run its callbacks after a | ||||
| given number of seconds. That might look like:</p> | ||||
| <pre><code class="language-python"># not a logcontext-rules-compliant function | ||||
|  | @ -11978,37 +11966,26 @@ def get_sleep_deferred(seconds): | |||
|     reactor.callLater(seconds, d.callback, None) | ||||
|     return d | ||||
| </code></pre> | ||||
| <p>That doesn't follow the rules, but we can fix it by wrapping it with | ||||
| <code>PreserveLoggingContext</code> and <code>yield</code> ing on it:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def sleep(seconds): | ||||
|     with PreserveLoggingContext(): | ||||
|         yield get_sleep_deferred(seconds) | ||||
| </code></pre> | ||||
| <p>This technique works equally for external functions which return | ||||
| deferreds, or deferreds we have made ourselves.</p> | ||||
| <p>You can also use <code>context.make_deferred_yieldable</code>, which just does the | ||||
| boilerplate for you, so the above could be written:</p> | ||||
| <pre><code class="language-python">def sleep(seconds): | ||||
|     return context.make_deferred_yieldable(get_sleep_deferred(seconds)) | ||||
| <p>That doesn't follow the rules, but we can fix it by calling it through | ||||
| <code>context.make_deferred_yieldable</code>:</p> | ||||
| <pre><code class="language-python">async def sleep(seconds): | ||||
|     return await context.make_deferred_yieldable(get_sleep_deferred(seconds)) | ||||
| </code></pre> | ||||
| <h2 id="fire-and-forget"><a class="header" href="#fire-and-forget">Fire-and-forget</a></h2> | ||||
| <p>Sometimes you want to fire off a chain of execution, but not wait for | ||||
| its result. That might look a bit like this:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def do_request_handling(): | ||||
|     yield foreground_operation() | ||||
| <pre><code class="language-python">async def do_request_handling(): | ||||
|     await foreground_operation() | ||||
| 
 | ||||
|     # *don't* do this | ||||
|     background_operation() | ||||
| 
 | ||||
|     logger.debug("Request handling complete") | ||||
| 
 | ||||
| @defer.inlineCallbacks | ||||
| def background_operation(): | ||||
|     yield first_background_step() | ||||
| async def background_operation(): | ||||
|     await first_background_step() | ||||
|     logger.debug("Completed first step") | ||||
|     yield second_background_step() | ||||
|     await second_background_step() | ||||
|     logger.debug("Completed second step") | ||||
| </code></pre> | ||||
| <p>The above code does a couple of steps in the background after | ||||
|  | @ -12016,12 +11993,12 @@ def background_operation(): | |||
| against the <code>request_context</code> logcontext, which may or may not be | ||||
| desirable. There are two big problems with the above, however. The first | ||||
| problem is that, if <code>background_operation</code> returns an incomplete | ||||
| Deferred, it will expect its caller to <code>yield</code> immediately, so will have | ||||
| awaitable, it will expect its caller to <code>await</code> immediately, so will have | ||||
| cleared the logcontext. In this example, that means that 'Request | ||||
| handling complete' will be logged without any context.</p> | ||||
| <p>The second problem, which is potentially even worse, is that when the | ||||
| Deferred returned by <code>background_operation</code> completes, it will restore | ||||
| the original logcontext. There is nothing waiting on that Deferred, so | ||||
| awaitable returned by <code>background_operation</code> completes, it will restore | ||||
| the original logcontext. There is nothing waiting on that awaitable, so | ||||
| the logcontext will leak into the reactor and possibly get attached to | ||||
| some arbitrary future operation.</p> | ||||
| <p>There are two potential solutions to this.</p> | ||||
|  | @ -12030,9 +12007,8 @@ some arbitrary future operation.</p> | |||
| starting <code>background_operation</code> (so the context restored when the | ||||
| deferred completes will be the empty logcontext), and will restore the | ||||
| current logcontext before continuing the foreground process:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def do_request_handling(): | ||||
|     yield foreground_operation() | ||||
| <pre><code class="language-python">async def do_request_handling(): | ||||
|     await foreground_operation() | ||||
| 
 | ||||
|     # start background_operation off in the empty logcontext, to | ||||
|     # avoid leaking the current context into the reactor. | ||||
|  | @ -12048,14 +12024,13 @@ def do_request_handling(): | |||
| <code>with LoggingContext(...)</code> in <code>background_operation</code>).</p> | ||||
| <p>The second option is to use <code>context.run_in_background</code>, which wraps a | ||||
| function so that it doesn't reset the logcontext even when it returns | ||||
| an incomplete deferred, and adds a callback to the returned deferred to | ||||
| an incomplete awaitable, and adds a callback to the returned awaitable to | ||||
| reset the logcontext. In other words, it turns a function that follows | ||||
| the Synapse rules about logcontexts and Deferreds into one which behaves | ||||
| the Synapse rules about logcontexts and awaitables into one which behaves | ||||
| more like an external function --- the opposite operation to that | ||||
| described in the previous section. It can be used like this:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def do_request_handling(): | ||||
|     yield foreground_operation() | ||||
| <pre><code class="language-python">async def do_request_handling(): | ||||
|     await foreground_operation() | ||||
| 
 | ||||
|     context.run_in_background(background_operation) | ||||
| 
 | ||||
|  | @ -12064,120 +12039,41 @@ def do_request_handling(): | |||
| </code></pre> | ||||
| <h2 id="passing-synapse-deferreds-into-third-party-functions"><a class="header" href="#passing-synapse-deferreds-into-third-party-functions">Passing synapse deferreds into third-party functions</a></h2> | ||||
| <p>A typical example of this is where we want to collect together two or | ||||
| more deferred via <code>defer.gatherResults</code>:</p> | ||||
| <pre><code class="language-python">d1 = operation1() | ||||
| d2 = operation2() | ||||
| d3 = defer.gatherResults([d1, d2]) | ||||
| more awaitables via <code>defer.gatherResults</code>:</p> | ||||
| <pre><code class="language-python">a1 = operation1() | ||||
| a2 = operation2() | ||||
| a3 = defer.gatherResults([a1, a2]) | ||||
| </code></pre> | ||||
| <p>This is really a variation of the fire-and-forget problem above, in that | ||||
| we are firing off <code>d1</code> and <code>d2</code> without yielding on them. The difference | ||||
| we are firing off <code>a1</code> and <code>a2</code> without awaiting on them. The difference | ||||
| is that we now have third-party code attached to their callbacks. Anyway | ||||
| either technique given in the <a href="log_contexts.html#fire-and-forget">Fire-and-forget</a> | ||||
| section will work.</p> | ||||
| <p>Of course, the new Deferred returned by <code>gatherResults</code> needs to be | ||||
| <p>Of course, the new awaitable returned by <code>gather</code> needs to be | ||||
| wrapped in order to make it follow the logcontext rules before we can | ||||
| yield it, as described in <a href="log_contexts.html#where-you-create-a-new-deferred-make-it-follow-the-rules">Where you create a new Deferred, make it | ||||
| yield it, as described in <a href="log_contexts.html#where-you-create-a-new-awaitable-make-it-follow-the-rules">Where you create a new awaitable, make it | ||||
| follow the | ||||
| rules</a>.</p> | ||||
| <p>So, option one: reset the logcontext before starting the operations to | ||||
| be gathered:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def do_request_handling(): | ||||
| <pre><code class="language-python">async def do_request_handling(): | ||||
|     with PreserveLoggingContext(): | ||||
|         d1 = operation1() | ||||
|         d2 = operation2() | ||||
|         result = yield defer.gatherResults([d1, d2]) | ||||
|         a1 = operation1() | ||||
|         a2 = operation2() | ||||
|         result = await defer.gatherResults([a1, a2]) | ||||
| </code></pre> | ||||
| <p>In this case particularly, though, option two, of using | ||||
| <code>context.preserve_fn</code> almost certainly makes more sense, so that | ||||
| <code>context.run_in_background</code> almost certainly makes more sense, so that | ||||
| <code>operation1</code> and <code>operation2</code> are both logged against the original | ||||
| logcontext. This looks like:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def do_request_handling(): | ||||
|     d1 = context.preserve_fn(operation1)() | ||||
|     d2 = context.preserve_fn(operation2)() | ||||
| <pre><code class="language-python">async def do_request_handling(): | ||||
|     a1 = context.run_in_background(operation1) | ||||
|     a2 = context.run_in_background(operation2) | ||||
| 
 | ||||
|     with PreserveLoggingContext(): | ||||
|         result = yield defer.gatherResults([d1, d2]) | ||||
|     result = await make_deferred_yieldable(defer.gatherResults([a1, a2])) | ||||
| </code></pre> | ||||
| <h2 id="was-all-this-really-necessary"><a class="header" href="#was-all-this-really-necessary">Was all this really necessary?</a></h2> | ||||
| <p>The conventions used work fine for a linear flow where everything | ||||
| happens in series via <code>defer.inlineCallbacks</code> and <code>yield</code>, but are | ||||
| certainly tricky to follow for any more exotic flows. It's hard not to | ||||
| wonder if we could have done something else.</p> | ||||
| <p>We're not going to rewrite Synapse now, so the following is entirely of | ||||
| academic interest, but I'd like to record some thoughts on an | ||||
| alternative approach.</p> | ||||
| <p>I briefly prototyped some code following an alternative set of rules. I | ||||
| think it would work, but I certainly didn't get as far as thinking how | ||||
| it would interact with concepts as complicated as the cache descriptors.</p> | ||||
| <p>My alternative rules were:</p> | ||||
| <ul> | ||||
| <li>functions always preserve the logcontext of their caller, whether or | ||||
| not they are returning a Deferred.</li> | ||||
| <li>Deferreds returned by synapse functions run their callbacks in the | ||||
| same context as the function was orignally called in.</li> | ||||
| </ul> | ||||
| <p>The main point of this scheme is that everywhere that sets the | ||||
| logcontext is responsible for clearing it before returning control to | ||||
| the reactor.</p> | ||||
| <p>So, for example, if you were the function which started a | ||||
| <code>with LoggingContext</code> block, you wouldn't <code>yield</code> within it --- instead | ||||
| you'd start off the background process, and then leave the <code>with</code> block | ||||
| to wait for it:</p> | ||||
| <pre><code class="language-python">def handle_request(request_id): | ||||
|     with context.LoggingContext() as request_context: | ||||
|         request_context.request = request_id | ||||
|         d = do_request_handling() | ||||
| 
 | ||||
|     def cb(r): | ||||
|         logger.debug("finished") | ||||
| 
 | ||||
|     d.addCallback(cb) | ||||
|     return d | ||||
| </code></pre> | ||||
| <p>(in general, mixing <code>with LoggingContext</code> blocks and | ||||
| <code>defer.inlineCallbacks</code> in the same function leads to slighly | ||||
| counter-intuitive code, under this scheme).</p> | ||||
| <p>Because we leave the original <code>with</code> block as soon as the Deferred is | ||||
| returned (as opposed to waiting for it to be resolved, as we do today), | ||||
| the logcontext is cleared before control passes back to the reactor; so | ||||
| if there is some code within <code>do_request_handling</code> which needs to wait | ||||
| for a Deferred to complete, there is no need for it to worry about | ||||
| clearing the logcontext before doing so:</p> | ||||
| <pre><code class="language-python">def handle_request(): | ||||
|     r = do_some_stuff() | ||||
|     r.addCallback(do_some_more_stuff) | ||||
|     return r | ||||
| </code></pre> | ||||
| <p>--- and provided <code>do_some_stuff</code> follows the rules of returning a | ||||
| Deferred which runs its callbacks in the original logcontext, all is | ||||
| happy.</p> | ||||
| <p>The business of a Deferred which runs its callbacks in the original | ||||
| logcontext isn't hard to achieve --- we have it today, in the shape of | ||||
| <code>context._PreservingContextDeferred</code>:</p> | ||||
| <pre><code class="language-python">def do_some_stuff(): | ||||
|     deferred = do_some_io() | ||||
|     pcd = _PreservingContextDeferred(LoggingContext.current_context()) | ||||
|     deferred.chainDeferred(pcd) | ||||
|     return pcd | ||||
| </code></pre> | ||||
| <p>It turns out that, thanks to the way that Deferreds chain together, we | ||||
| automatically get the property of a context-preserving deferred with | ||||
| <code>defer.inlineCallbacks</code>, provided the final Defered the function | ||||
| <code>yields</code> on has that property. So we can just write:</p> | ||||
| <pre><code class="language-python">@defer.inlineCallbacks | ||||
| def handle_request(): | ||||
|     yield do_some_stuff() | ||||
|     yield do_some_more_stuff() | ||||
| </code></pre> | ||||
| <p>To conclude: I think this scheme would have worked equally well, with | ||||
| less danger of messing it up, and probably made some more esoteric code | ||||
| easier to write. But again --- changing the conventions of the entire | ||||
| Synapse codebase is not a sensible option for the marginal improvement | ||||
| offered.</p> | ||||
| <h2 id="a-note-on-garbage-collection-of-deferred-chains"><a class="header" href="#a-note-on-garbage-collection-of-deferred-chains">A note on garbage-collection of Deferred chains</a></h2> | ||||
| <p>It turns out that our logcontext rules do not play nicely with Deferred | ||||
| <h2 id="a-note-on-garbage-collection-of-awaitable-chains"><a class="header" href="#a-note-on-garbage-collection-of-awaitable-chains">A note on garbage-collection of awaitable chains</a></h2> | ||||
| <p>It turns out that our logcontext rules do not play nicely with awaitable | ||||
| chains which get orphaned and garbage-collected.</p> | ||||
| <p>Imagine we have some code that looks like this:</p> | ||||
| <pre><code class="language-python">listener_queue = [] | ||||
|  | @ -12186,13 +12082,12 @@ def on_something_interesting(): | |||
|     for d in listener_queue: | ||||
|         d.callback("foo") | ||||
| 
 | ||||
| @defer.inlineCallbacks | ||||
| def await_something_interesting(): | ||||
|     new_deferred = defer.Deferred() | ||||
|     listener_queue.append(new_deferred) | ||||
| async def await_something_interesting(): | ||||
|     new_awaitable = defer.Deferred() | ||||
|     listener_queue.append(new_awaitable) | ||||
| 
 | ||||
|     with PreserveLoggingContext(): | ||||
|         yield new_deferred | ||||
|         await new_awaitable | ||||
| </code></pre> | ||||
| <p>Obviously, the idea here is that we have a bunch of things which are | ||||
| waiting for an event. (It's just an example of the problem here, but a | ||||
|  | @ -12205,18 +12100,19 @@ never going to happen, and we reset the listener queue:</p> | |||
| <pre><code class="language-python">def reset_listener_queue(): | ||||
|     listener_queue.clear() | ||||
| </code></pre> | ||||
| <p>So, both ends of the deferred chain have now dropped their references, | ||||
| and the deferred chain is now orphaned, and will be garbage-collected at | ||||
| some point. Note that <code>await_something_interesting</code> is a generator | ||||
| function, and when Python garbage-collects generator functions, it gives | ||||
| them a chance to clean up by making the <code>yield</code> raise a <code>GeneratorExit</code> | ||||
| <p>So, both ends of the awaitable chain have now dropped their references, | ||||
| and the awaitable chain is now orphaned, and will be garbage-collected at | ||||
| some point. Note that <code>await_something_interesting</code> is a coroutine,  | ||||
| which Python implements as a generator function.  When Python | ||||
| garbage-collects generator functions, it gives them a chance to  | ||||
| clean up by making the <code>async</code> (or <code>yield</code>) raise a <code>GeneratorExit</code> | ||||
| exception. In our case, that means that the <code>__exit__</code> handler of | ||||
| <code>PreserveLoggingContext</code> will carefully restore the request context, but | ||||
| there is now nothing waiting for its return, so the request context is | ||||
| never cleared.</p> | ||||
| <p>To reiterate, this problem only arises when <em>both</em> ends of a deferred | ||||
| chain are dropped. Dropping the the reference to a deferred you're | ||||
| supposed to be calling is probably bad practice, so this doesn't | ||||
| <p>To reiterate, this problem only arises when <em>both</em> ends of a awaitable | ||||
| chain are dropped. Dropping the the reference to an awaitable you're | ||||
| supposed to be awaiting is bad practice, so this doesn't | ||||
| actually happen too much. Unfortunately, when it does happen, it will | ||||
| lead to leaked logcontexts which are incredibly hard to track down.</p> | ||||
| <div id="chapter_begin" style="break-before: page; page-break-before: always;"></div><h1 id="replication-architecture"><a class="header" href="#replication-architecture">Replication Architecture</a></h1> | ||||
|  |  | |||
										
											
												File diff suppressed because one or more lines are too long
											
										
									
								
							
										
											
												File diff suppressed because one or more lines are too long
											
										
									
								
							
		Loading…
	
		Reference in New Issue
	
	 babolivier
						babolivier