Add developer documentation for the Federation Sender and add a documentation mechanism using Sphinx. (#15265)

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
pull/15319/head
reivilibre 2023-03-24 16:41:10 +00:00 committed by GitHub
parent 5f7c908280
commit d5324ee111
12 changed files with 1279 additions and 508 deletions

View File

@@ -35,9 +35,9 @@ sed -i \
 # compatible (as far the package metadata declares, anyway); pip's package resolver
 # is more lax.
 #
-# Rather than `poetry install --no-dev`, we drop all dev dependencies from the
-# toml file. This means we don't have to ensure compatibility between old deps and
-# dev tools.
+# Rather than `poetry install --no-dev`, we drop all dev dependencies and the dev-docs
+# group from the toml file. This means we don't have to ensure compatibility between
+# old deps and dev tools.
 pip install toml wheel
@@ -47,6 +47,7 @@ with open('pyproject.toml', 'r') as f:
     data = toml.loads(f.read())
 del data['tool']['poetry']['dev-dependencies']
+del data['tool']['poetry']['group']['dev-docs']
 with open('pyproject.toml', 'w') as f:
     toml.dump(data, f)

View File

@@ -13,25 +13,10 @@ on:
   workflow_dispatch:
 
 jobs:
-  pages:
-    name: GitHub Pages
+  pre:
+    name: Calculate variables for GitHub Pages deployment
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v3
-
-      - name: Setup mdbook
-        uses: peaceiris/actions-mdbook@adeb05db28a0c0004681db83893d56c0388ea9ea # v1.2.0
-        with:
-          mdbook-version: '0.4.17'
-
-      - name: Build the documentation
-        # mdbook will only create an index.html if we're including docs/README.md in SUMMARY.md.
-        # However, we're using docs/README.md for other purposes and need to pick a new page
-        # as the default. Let's opt for the welcome page instead.
-        run: |
-          mdbook build
-          cp book/welcome_and_overview.html book/index.html
-
       # Figure out the target directory.
       #
       # The target directory depends on the name of the branch
@@ -55,6 +40,30 @@ jobs:
           # finally, set the 'branch-version' var.
          echo "branch-version=$branch" >> "$GITHUB_OUTPUT"
+    outputs:
+      branch-version: ${{ steps.vars.outputs.branch-version }}
+
+  ################################################################################
+  pages-docs:
+    name: GitHub Pages
+    runs-on: ubuntu-latest
+    needs:
+      - pre
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Setup mdbook
+        uses: peaceiris/actions-mdbook@adeb05db28a0c0004681db83893d56c0388ea9ea # v1.2.0
+        with:
+          mdbook-version: '0.4.17'
+
+      - name: Build the documentation
+        # mdbook will only create an index.html if we're including docs/README.md in SUMMARY.md.
+        # However, we're using docs/README.md for other purposes and need to pick a new page
+        # as the default. Let's opt for the welcome page instead.
+        run: |
+          mdbook build
+          cp book/welcome_and_overview.html book/index.html
 
       # Deploy to the target directory.
       - name: Deploy to gh pages
@@ -62,4 +71,34 @@ jobs:
         with:
           github_token: ${{ secrets.GITHUB_TOKEN }}
           publish_dir: ./book
-          destination_dir: ./${{ steps.vars.outputs.branch-version }}
+          destination_dir: ./${{ needs.pre.outputs.branch-version }}
+
+  ################################################################################
+  pages-devdocs:
+    name: GitHub Pages (developer docs)
+    runs-on: ubuntu-latest
+    needs:
+      - pre
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: "Set up Sphinx"
+        uses: matrix-org/setup-python-poetry@v1
+        with:
+          python-version: "3.x"
+          poetry-version: "1.3.2"
+          groups: "dev-docs"
+          extras: ""
+
+      - name: Build the documentation
+        run: |
+          cd dev-docs
+          poetry run make html
+
+      # Deploy to the target directory.
+      - name: Deploy to gh pages
+        uses: peaceiris/actions-gh-pages@bd8c6b06eba6b3d25d72b7a1767993c0aeee42e7 # v3.9.2
+        with:
+          github_token: ${{ secrets.GITHUB_TOKEN }}
+          publish_dir: ./dev-docs/_build/html
+          destination_dir: ./dev-docs/${{ needs.pre.outputs.branch-version }}

.gitignore

@@ -53,6 +53,7 @@ __pycache__/
 /coverage.*
 /dist/
 /docs/build/
+/dev-docs/_build/
 /htmlcov
 /pip-wheel-metadata/

changelog.d/15265.misc (new file)

@@ -0,0 +1 @@
Add developer documentation for the Federation Sender and add a documentation mechanism using Sphinx.

dev-docs/Makefile (new file)

@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

dev-docs/conf.py (new file)

@@ -0,0 +1,50 @@
# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = "Synapse development"
copyright = "2023, The Matrix.org Foundation C.I.C."
author = "The Synapse Maintainers and Community"

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
    "autodoc2",
    "myst_parser",
]

templates_path = ["_templates"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

# -- Options for Autodoc2 ----------------------------------------------------

autodoc2_docstring_parser_regexes = [
    # this will render all docstrings as 'MyST' Markdown
    (r".*", "myst"),
]

autodoc2_packages = [
    {
        "path": "../synapse",
        # Don't render documentation for everything as a matter of course
        "auto_mode": False,
    },
]

# -- Options for MyST (Markdown) ---------------------------------------------

# myst_heading_anchors = 2

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = "furo"
html_static_path = ["_static"]

dev-docs/index.rst (new file)

@@ -0,0 +1,22 @@
.. Synapse Developer Documentation documentation master file, created by
   sphinx-quickstart on Mon Mar 13 08:59:51 2023.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to the Synapse Developer Documentation!
===========================================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

   modules/federation_sender


Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

View File

@@ -0,0 +1,5 @@
Federation Sender
=================

```{autodoc2-docstring} synapse.federation.sender
```
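
The `autodoc2-docstring` directive above renders the module-level docstring of `synapse.federation.sender`; in other words, the Federation Sender documentation added at the end of this commit is what ends up on this page. (With `auto_mode` disabled in `conf.py`, nothing is documented unless a page requests it explicitly like this.)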

poetry.lock (generated)

File diff suppressed because it is too large (1478 changed lines).

View File

@@ -350,6 +350,18 @@ towncrier = ">=18.6.0rc1"
 # Used for checking the Poetry lockfile
 tomli = ">=1.2.3"
 
+# Dependencies for building the development documentation
+[tool.poetry.group.dev-docs]
+optional = true
+
+[tool.poetry.group.dev-docs.dependencies]
+sphinx = {version = "^6.1", python = "^3.8"}
+sphinx-autodoc2 = {version = "^0.4.2", python = "^3.8"}
+myst-parser = {version = "^1.0.0", python = "^3.8"}
+furo = "^2022.12.7"
+
 [build-system]
 # The upper bounds here are defensive, intended to prevent situations like
 # #13849 and #14079 where we see buildtime or runtime errors caused by build
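
Since the group is marked `optional = true`, these dependencies are not installed by default. A plausible local equivalent of the `pages-devdocs` job above is `poetry install --with dev-docs` (an assumed invocation; the CI job installs the group via `matrix-org/setup-python-poetry` with `groups: "dev-docs"`), followed by `poetry run make html` from the `dev-docs` directory, as the job does.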

View File

@@ -91,6 +91,7 @@ else
       "synapse" "docker" "tests"
       "scripts-dev"
       "contrib" "synmark" "stubs" ".ci"
+      "dev-docs"
     )
   fi
 fi

View File

@@ -11,6 +11,119 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""
The Federation Sender is responsible for sending Persistent Data Units (PDUs)
and Ephemeral Data Units (EDUs) to other homeservers using
the `/send` Federation API.
## How do PDUs get sent?
The Federation Sender is made aware of new PDUs due to `FederationSender.notify_new_events`.
When the sender is notified about a newly-persisted PDU that originates from this homeserver
and is not an out-of-band event, we pass the PDU to the `_PerDestinationQueue` for each
remote homeserver that is in the room at that point in the DAG.
### Per-Destination Queues
There is one `PerDestinationQueue` per 'destination' homeserver.
The `PerDestinationQueue` maintains the following information about the destination:
- whether the destination is currently in [catch-up mode (see below)](#catch-up-mode);
- a queue of PDUs to be sent to the destination; and
- a queue of EDUs to be sent to the destination (not considered in this section).
Upon a new PDU being enqueued, `attempt_new_transaction` is called to start a new
transaction if there is not already one in progress.
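
As a rough illustration of the shape of such a queue (the class and attribute names below
are simplified stand-ins, not the actual Synapse implementation):

```python
# Illustrative only: a simplified stand-in for a per-destination queue.
from typing import Any, List


class SketchPerDestinationQueue:
    def __init__(self, destination: str) -> None:
        self.destination = destination
        self.catching_up = False                # the catch-up mode flag
        self.pending_pdus: List[Any] = []       # PDUs queued for this destination
        self.pending_edus: List[Any] = []       # EDUs queued for this destination
        self.transmission_loop_running = False  # at most one transaction in flight

    def send_pdu(self, pdu: Any) -> None:
        """Queue a PDU and kick off a transaction if none is already in progress."""
        self.pending_pdus.append(pdu)
        self.attempt_new_transaction()

    def attempt_new_transaction(self) -> None:
        if self.transmission_loop_running:
            return  # the loop that is already running will pick up the new PDU
        self.transmission_loop_running = True
        # ... start the transaction transmission loop for this destination ...
```

The real class tracks considerably more state than this sketch shows.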
### Transactions and the Transaction Transmission Loop

Each federation HTTP request to the `/send` endpoint is referred to as a 'transaction'.
The body of the HTTP request contains a list of PDUs and EDUs to send to the destination.

The *Transaction Transmission Loop* (`_transaction_transmission_loop`) is responsible
for emptying the queued PDUs (and EDUs) from a `PerDestinationQueue` by sending
them to the destination.

There can only be one transaction in flight for a given destination at any time.
(Other than preventing us from overloading the destination, this also makes it easier to
reason about, because we process events sequentially for each destination.
This is useful for *Catch-Up Mode*, described later.)

The loop continues so long as there is anything to send. At each iteration of the loop, we:

- dequeue up to 50 PDUs (and up to 100 EDUs).
- make the `/send` request to the destination homeserver with the dequeued PDUs and EDUs.
- if successful, make note of the fact that we succeeded in transmitting PDUs up to
  the given `stream_ordering` of the latest PDU by advancing this destination's
  `last_successful_stream_ordering`.
- if unsuccessful, back off from the remote homeserver for some time.

If we have been unsuccessful for too long (when the back-off interval grows to exceed 1 hour),
the in-memory queues are emptied and we enter [*Catch-Up Mode*, described below](#catch-up-mode).
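
As a sketch of one run of this loop under the limits above (the helper callables here are
hypothetical placeholders, not Synapse's real API):

```python
# Illustrative sketch of the transaction transmission loop described above.
from typing import Any, Callable, List

MAX_PDUS_PER_TRANSACTION = 50
MAX_EDUS_PER_TRANSACTION = 100
CATCH_UP_THRESHOLD_SECONDS = 60 * 60  # back-off beyond 1 hour triggers catch-up


class SendFailed(Exception):
    """Placeholder for a failed `/send` request."""


def transmission_loop(
    queue: Any,  # a per-destination queue, as in the previous sketch
    send_transaction: Callable[[str, List[Any], List[Any]], None],
    current_backoff_seconds: Callable[[str], float],
    record_success: Callable[[str, int], None],  # advances last_successful_stream_ordering
) -> None:
    while queue.pending_pdus or queue.pending_edus:
        pdus = queue.pending_pdus[:MAX_PDUS_PER_TRANSACTION]
        edus = queue.pending_edus[:MAX_EDUS_PER_TRANSACTION]
        try:
            send_transaction(queue.destination, pdus, edus)  # the `/send` request
        except SendFailed:
            # (recording the failure and growing the back-off is assumed to happen elsewhere)
            if current_backoff_seconds(queue.destination) > CATCH_UP_THRESHOLD_SECONDS:
                # Unsuccessful for too long: empty the in-memory queues and
                # fall back to catch-up mode.
                queue.pending_pdus.clear()
                queue.pending_edus.clear()
                queue.catching_up = True
            return
        # Success: drop what was sent and note how far we got for this destination.
        del queue.pending_pdus[: len(pdus)]
        del queue.pending_edus[: len(edus)]
        if pdus:
            record_success(queue.destination, max(p.stream_ordering for p in pdus))
    queue.transmission_loop_running = False
```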
### Catch-Up Mode

When the `PerDestinationQueue` has the catch-up flag set, the *Catch-Up Transmission Loop*
(`_catch_up_transmission_loop`) is used in lieu of the regular `_transaction_transmission_loop`.
(Only once catch-up mode has been exited can the regular transaction transmission behaviour
be resumed.)

*Catch-Up Mode*, entered upon Synapse startup or once a homeserver has fallen behind due to
connection problems, is responsible for sending PDUs that have been missed by the destination
homeserver. (PDUs can be missed because the `PerDestinationQueue` is volatile, i.e. it resets
on startup, and it does not hold PDUs forever if `/send` requests to the destination fail.)

The catch-up mechanism makes use of the `last_successful_stream_ordering` column in the
`destinations` table (which gives the `stream_ordering` of the most recent successfully
sent PDU) and the `stream_ordering` column in the `destination_rooms` table (which gives,
for each room, the `stream_ordering` of the most recent PDU that needs to be sent to this
destination).

Each iteration of the loop pulls out 50 `destination_rooms` entries with the oldest
`stream_ordering`s that are greater than the `last_successful_stream_ordering`.
In other words, from the set of latest PDUs in each room to be sent to the destination,
the 50 oldest such PDUs are pulled out.

These PDUs could, in principle, now be directly sent to the destination. However, as an
optimisation intended to prevent overloading destination homeservers, we instead attempt
to send the latest forward extremities so long as the destination homeserver is still
eligible to receive those.
This reduces load on the destination **in aggregate** because all Synapse homeservers
will behave according to this principle and therefore avoid sending lots of different PDUs
at different points in the DAG to a recovering homeserver.

*This optimisation is not currently valid in rooms which are partial-state on this homeserver,
since we are unable to determine whether the destination homeserver is eligible to receive
the latest forward extremities unless this homeserver sent those PDUs. In that case, we
just send the latest PDUs originating from this server and skip this optimisation.*

Whilst PDUs are sent through this mechanism, the position of `last_successful_stream_ordering`
is advanced as normal.

Once there are no longer any rooms containing outstanding PDUs to be sent to the destination
*that are not already in the `PerDestinationQueue` because they arrived since Catch-Up Mode
was enabled*, Catch-Up Mode is exited and we return to `_transaction_transmission_loop`.
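
A sketch of a single catch-up batch, using the table and column names above (the helper
callables are again hypothetical placeholders, not the real implementation):

```python
# Illustrative sketch of one batch of catch-up mode.
from typing import Callable, List, Tuple

CATCH_UP_BATCH_SIZE = 50


def catch_up_batch(
    destination: str,
    get_last_successful_stream_ordering: Callable[[str], int],
    # rows of (room_id, stream_ordering) from `destination_rooms` whose stream_ordering
    # exceeds the given value, oldest first, limited to the batch size
    get_rooms_needing_catch_up: Callable[[str, int, int], List[Tuple[str, int]]],
    send_latest_events_for_room: Callable[[str, str], None],
) -> bool:
    """Send one batch of missed rooms. Returns False once there is nothing left to do."""
    last_successful = get_last_successful_stream_ordering(destination)
    rows = get_rooms_needing_catch_up(destination, last_successful, CATCH_UP_BATCH_SIZE)
    if not rows:
        return False  # caught up: resume the normal transmission loop
    for room_id, _stream_ordering in rows:
        # Where possible, send the room's latest forward extremities rather than the
        # stale PDU itself (the load-reduction optimisation described above).
        send_latest_events_for_room(destination, room_id)
    return True
```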
#### A note on failures and back-offs

If a remote server is unreachable over federation, we back off from that server,
with an exponentially-increasing retry interval.
We don't automatically retry once the interval has elapsed; rather, the back-off simply
prevents new attempts from being made until it has cleared.
Once the back-off is cleared and a new PDU or EDU arrives for transmission, the transmission
loop resumes and empties the queue by making federation requests.

If the back-off grows too large (> 1 hour), the in-memory queue is emptied (to prevent
unbounded growth) and Catch-Up Mode is entered.

It is worth noting that the back-off for a remote server is cleared once an inbound
request from that remote server is received (see `notify_remote_server_up`).
At this point, the transaction transmission loop is also started up, to proactively
send missed PDUs and EDUs to the destination (i.e. you don't need to wait for a new PDU
or EDU, destined for that destination, to be created in order to send out missed PDUs and
EDUs).
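
A minimal sketch of this back-off bookkeeping, with illustrative names and numbers rather
than the real implementation:

```python
# Illustrative back-off bookkeeping; names and numbers are placeholders.
import time
from typing import Dict

retry_interval: Dict[str, float] = {}    # destination -> current back-off interval (seconds)
retry_not_before: Dict[str, float] = {}  # destination -> earliest time a new attempt is allowed


def note_send_failure(destination: str, base: float = 60.0, multiplier: float = 2.0) -> None:
    # Grow the interval exponentially on each failure.
    interval = retry_interval.get(destination, base) * multiplier
    retry_interval[destination] = interval
    retry_not_before[destination] = time.time() + interval


def may_attempt(destination: str) -> bool:
    # We never retry automatically; a newly queued PDU/EDU (or catch-up) only proceeds
    # once the back-off window has passed.
    return time.time() >= retry_not_before.get(destination, 0.0)


def on_inbound_request(destination: str) -> None:
    # An inbound request from the destination clears its back-off (cf.
    # `notify_remote_server_up`), and the transmission loop is kicked off proactively.
    retry_interval.pop(destination, None)
    retry_not_before.pop(destination, None)
```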
"""
import abc
import logging