Now that we wait for stream positions whenever we do a HTTP replication
hit, we need to be less brutal in the case where we do timeout (as we
have bugs around this).
Currently, we will try to start a new partial state sync every time we
perform a remote join, which is undesirable if there is already one
running for a given room.
We intend to perform remote joins whenever additional local users wish
to join a partial state room, so let's ensure that we do not start more
than one concurrent partial state sync for any given room.
------------------------------------------------------------------------
There is a race condition where the homeserver leaves a room and later
rejoins while the partial state sync from the previous membership is
still running. There is no guarantee that the previous partial state
sync will process the latest join, so we restart it if needed.
Signed-off-by: Sean Quah <seanq@matrix.org>
* Change Documentation to have v10 as default room version
* Change Default Room version to 10
* Add changelog entry for default room version swap
* Add changelog entry for v10 default room version in docs
* Clarify doc changelog entry
Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
* Improve Documentation changes.
Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
* Update Changelog entry to have correct format
Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
* Update Spec Version to 1.5
* Only need 1 changelog.
* Fix test.
* Update "Changed in" line
Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
* Upgrade to new lockfile format
Now requires poetry >= 1.2.2 to read and poetry >= 1.3.0 to write.
Cheat sheet:
```
poetry --version
poetry show > scratch/before
pipx upgrade poetry
poetry --version
poetry show > scratch/after
diff scratch{before,after} && echo "no change!"
```
* Use Poetry 1.3.2 when reading or writing lockfile
* Remove unneeded(?) poetry dep for cibuildwheel
* Update docs
* Remove redundant call to setup-python
* Remove outdated comments related to Poetry 1.x
* Remove outdated docs line
was fixed in #13082
* Minor improvements to poetry cheat sheet
* Invoke setup-python-poetry with explicit version
Not sure about this. It's hardcoding versions everywhere.
* Changelog
* Check the lockfile is version 2.0
Might one day incorporate other checks like #14742
* Typo fixes, thanks Sean
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Serving partial join responses is no longer experimental. They will only be served under the stable identifier if the the undocumented config flag experimental.msc3706_enabled is set to true.
Synapse continues to request a partial join only if the undocumented config flag experimental.faster_joins is set to true; this setting remains present and unaffected.
We were incorrectly checking if the *local* token had been advanced, rather than the token for the remote instance.
In practice, I don't think this has caused any bugs due to where we use `wait_for_stream_position`, as critically we don't use it on instances that also write to the given streams (and so the local token will lag behind all remote tokens).
When the local homeserver is already joined to a room and wants to
perform another remote join, we may find it useful to do a non-partial
state join if we already have the full state for the room.
Signed-off-by: Sean Quah <seanq@matrix.org>
* Also use stable name in SendJoinResponse struct
follow-up to #14832
* Changelog
* Fix a rename I missed
* Run black
* Update synapse/federation/federation_client.py
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
* Use new query param when requesting a partial join
* Read new query param when serving partial join
* Provide new field names when serving partial joins
* Read new field names from partial join response
* Changelog
When there are many synchronous requests waiting on a
`_PerHostRatelimiter`, each request will be started recursively just
after the previous request has completed. Under the right conditions,
this leads to stack exhaustion.
A common way for requests to become synchronous is when the remote
client disconnects early, because the homeserver is overloaded and slow
to respond.
Avoid stack exhaustion under these conditions by deferring subsequent
requests until the next reactor tick.
Fixes#14480.
Signed-off-by: Sean Quah <seanq@matrix.org>