Commit Graph

1976 Commits (a06b7a5d94fa8b1a5c18e563420fe78870c3473e)

Author SHA1 Message Date
Erik Johnston fa8934b175 Reduce serialization errors in MultiWriterIdGen (#8456)
We call `_update_stream_positions_table_txn` a lot, which is an UPSERT
that can conflict in `REPEATABLE READ` isolation level. Instead of doing
a transaction consisting of a single query we may as well run it outside
of a transaction.
2020-10-07 17:08:58 +01:00
Erik Johnston 7941372ec8
Make token serializing/deserializing async (#8427)
The idea is that in future tokens will encode a mapping of instance to position. However, we don't want to include the full instance name in the string representation, so instead we'll have a mapping between instance name and an immutable integer ID in the DB that we can use instead. We'll then do the lookup when we serialize/deserialize the token (we could alternatively pass around an `Instance` type that includes both the name and ID, but that turns out to be a lot more invasive).
2020-09-30 20:29:19 +01:00
Richard van der Hoff a0a1ba6973
Merge pull request #8425 from matrix-org/rav/extremity_metrics
Add an improved "forward extremities" metric
2020-09-30 19:33:27 +01:00
Patrick Cloke 8b40843392
Allow additional SSO properties to be passed to the client (#8413) 2020-09-30 13:02:43 -04:00
Richard van der Hoff 6d2d42f8fb Rewrite BucketCollector
This was a bit unweildy for what I wanted: in particular, I wanted to assign
each measurement straight into a bucket, rather than storing an intermediate
Counter which didn't do any bucketing at all.

I've replaced it with something that is hopefully a bit easier to use.

(I'm not entirely sure what the difference between a HistogramMetricFamily and
a GaugeHistogramMetricFamily is, but given our counters can go down as well as
up the latter *sounds* more accurate?)
2020-09-30 16:49:15 +01:00
Erik Johnston ea70f1c362
Various clean ups to room stream tokens. (#8423) 2020-09-29 21:48:33 +01:00
Erik Johnston b1433bf231
Don't table scan events on worker startup (#8419)
* Fix table scan of events on worker startup.

This happened because we assumed "new" writers had an initial stream
position of 0, so the replication code tried to fetch all events written
by the instance between 0 and the current position.

Instead, set the initial position of new writers to the current
persisted up to position, on the assumption that new writers won't have
written anything before that point.

* Consider old writers coming back as "new".

Otherwise we'd try and fetch entries between the old stale token and the
current position, even though it won't have written any rows.

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2020-09-29 16:42:19 +01:00
Will Hunt 8676d8ab2e
Filter out appservices from mau count (#8404)
This is an attempt to fix #8403.
2020-09-29 13:11:02 +01:00
Andrew Morgan 1c6b8752b8
Only assert valid next_link params when provided (#8417)
Broken in https://github.com/matrix-org/synapse/pull/8275 and has yet to be put in a release. Fixes https://github.com/matrix-org/synapse/issues/8418.

`next_link` is an optional parameter. However, we were checking whether the `next_link` param was valid, even if it wasn't provided. In that case, `next_link` was `None`, which would clearly not be a valid URL.

This would prevent password reset and other operations if `next_link` was not provided, and the `next_link_domain_whitelist` config option was set.
2020-09-29 12:36:44 +01:00
Richard van der Hoff 1c262431f9
Fix handling of connection timeouts in outgoing http requests (#8400)
* Remove `on_timeout_cancel` from `timeout_deferred`

The `on_timeout_cancel` param to `timeout_deferred` wasn't always called on a
timeout (in particular if the canceller raised an exception), so it was
unreliable. It was also only used in one place, and to be honest it's easier to
do what it does a different way.

* Fix handling of connection timeouts in outgoing http requests

Turns out that if we get a timeout during connection, then a different
exception is raised, which wasn't always handled correctly.

To fix it, catch the exception in SimpleHttpClient and turn it into a
RequestTimedOutError (which is already a documented exception).

Also add a description to RequestTimedOutError so that we can see which stage
it failed at.

* Fix incorrect handling of timeouts reading federation responses

This was trapping the wrong sort of TimeoutError, so was never being hit.

The effect was relatively minor, but we should fix this so that it does the
expected thing.

* Fix inconsistent handling of `timeout` param between methods

`get_json`, `put_json` and `delete_json` were applying a different timeout to
the response body to `post_json`; bring them in line and test.

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Erik Johnston <erik@matrix.org>
2020-09-29 10:29:21 +01:00
Erik Johnston bd380d942f
Add checks for postgres sequence consistency (#8402) 2020-09-28 18:00:30 +01:00
Richard van der Hoff 5e3ca12b15
Create a mechanism for marking tests "logcontext clean" (#8399) 2020-09-28 17:58:33 +01:00
Richard van der Hoff fec6f9ac17
Fix occasional "Re-starting finished log context" from keyring (#8398)
* Fix test_verify_json_objects_for_server_awaits_previous_requests

It turns out that this wasn't really testing what it thought it was testing
(in particular, `check_context` was turning failures into success, which was
making the tests pass even though it wasn't clear they should have been.

It was also somewhat overcomplex - we can test what it was trying to test
without mocking out perspectives servers.

* Fix warnings about finished logcontexts in the keyring

We need to make sure that we finish the key fetching magic before we run the
verifying code, to ensure that we don't mess up our logcontexts.
2020-09-25 12:29:54 +01:00
Tdxdxoz abd04b6af0
Allow existing users to login via OpenID Connect. (#8345)
Co-authored-by: Benjamin Koch <bbbsnowball@gmail.com>

This adds configuration flags that will match a user to pre-existing users
when logging in via OpenID Connect. This is useful when switching to
an existing SSO system.
2020-09-25 07:01:45 -04:00
Erik Johnston f112cfe5bb
Fix MultiWriteIdGenerator's handling of restarts. (#8374)
On startup `MultiWriteIdGenerator` fetches the maximum stream ID for
each instance from the table and uses that as its initial "current
position" for each writer. This is problematic as a) it involves either
a scan of events table or an index (neither of which is ideal), and b)
if rows are being persisted out of order elsewhere while the process
restarts then using the maximum stream ID is not correct. This could
theoretically lead to race conditions where e.g. events that are
persisted out of order are not sent down sync streams.

We fix this by creating a new table that tracks the current positions of
each writer to the stream, and update it each time we finish persisting
a new entry. This is a relatively small overhead when persisting events.
However for the cache invalidation stream this is a much bigger relative
overhead, so instead we note that for invalidation we don't actually
care about reliability over restarts (as there's no caches to
invalidate) and simply don't bother reading and writing to the new table
in that particular case.
2020-09-24 16:53:51 +01:00
Erik Johnston ac11fcbbb8
Add EventStreamPosition type (#8388)
The idea is to remove some of the places we pass around `int`, where it can represent one of two things:

1. the position of an event in the stream; or
2. a token that partitions the stream, used as part of the stream tokens.

The valid operations are then:

1. did a position happen before or after a token;
2. get all events that happened before or after a token; and
3. get all events between two tokens.

(Note that we don't want to allow other operations as we want to change the tokens to be vector clocks rather than simple ints)
2020-09-24 13:24:17 +01:00
Erik Johnston cbabb312e0
Use `async with` for ID gens (#8383)
This will allow us to hit the DB after we've finished using the generated stream ID.
2020-09-23 16:11:18 +01:00
Dirk Klimpel 8998217540
Fixed a bug with reactivating users with the admin API (#8362)
Fixes: #8359 

Trying to reactivate a user with the admin API (`PUT /_synapse/admin/v2/users/<user_name>`) causes an internal server error.

Seems to be a regression in #8033.
2020-09-22 18:19:01 +01:00
Dirk Klimpel 4da01f9c61
Admin API for reported events (#8217)
Add an admin API to read entries of table `event_reports`. API: `GET /_synapse/admin/v1/event_reports`
2020-09-22 18:15:04 +01:00
Dionysis Grigoropoulos 37ca5924bd
Create function to check for long names in devices (#8364)
* Create a new function to verify that the length of a device name is
under a certain threshold.
* Refactor old code and tests to use said function.
* Verify device name length during registration of device
* Add a test for the above

Signed-off-by: Dionysis Grigoropoulos <dgrig@erethon.com>
2020-09-22 11:42:55 +01:00
Dirk Klimpel d688b4bafc
Admin API for querying rooms where a user is a member (#8306)
Add a new admin API `GET /_synapse/admin/v1/users/<user_id>/joined_rooms` to
list all rooms where a user is a member.
2020-09-18 15:26:36 +01:00
reivilibre 36efbcaf51
Catch-up after Federation Outage (bonus): Catch-up on Synapse Startup (#8322)
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* Fix _set_destination_retry_timings

This came about because the code assumed that retry_interval
could not be NULL — which has been challenged by catch-up.
2020-09-18 14:59:13 +01:00
Patrick Cloke 8a4a4186de
Simplify super() calls to Python 3 syntax. (#8344)
This converts calls like super(Foo, self) -> super().

Generated with:

    sed -i "" -Ee 's/super\([^\(]+\)/super()/g' **/*.py
2020-09-18 09:56:44 -04:00
Will Hunt 68c7a6936f
Allow appservice users to /login (#8320)
Add ability for ASes to /login using the `uk.half-shot.msc2778.login.application_service` login `type`.

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2020-09-18 14:55:13 +01:00
Jonathan de Jong 7c407efdc8
Update test logging to be able to accept braces (#8335) 2020-09-18 07:56:40 -04:00
reivilibre 576bc37d31
Catch-up after Federation Outage (split, 4): catch-up loop (#8272) 2020-09-15 09:07:19 +01:00
Tulir Asokan b82d68c0bd
Add the topic and avatar to the room details admin API (#8305) 2020-09-14 10:07:04 -04:00
Patrick Cloke a9dbe98ef9 Synapse 1.20.0rc3 (2020-09-11)
==============================
 
 Bugfixes
 --------
 
 - Fix a bug introduced in v1.20.0rc1 where the wrong exception was raised when invalid JSON data is encountered. ([\#8291](https://github.com/matrix-org/synapse/issues/8291))
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEF3tZXk38tRDFVnUIM/xY9qcRMEgFAl9bbA8ACgkQM/xY9qcR
 MEjjJg//ZGcvPLr8y3F5JXptcjWA3AFja3DgztBA3uNFoDrwVhR9m3+F4rL/Fbat
 gnaccWsKIXOjZw5WOThQEUIfYMKmHEsaDl7xtlUPC4ylmSV+31ypu1KiRSxND2kT
 TmOupvqw+4/E4RajpYX6WT3e2oQUpIcBAKsUyZL3ZXECR/yNYrA6w2w7i4wrv3cU
 QDzBFrCcr5aJql7VUc88BlaZUG6xom2/kkPtjmO6imPnPGBHLN22uPU7zQ4n7Vgi
 Nrg45v5WHVCx57WoXqAEap1zdKgoRCna1x0NkqYh0OAXq1l6aVwlf/Pdynt91f2x
 yyaMYjMskjyatHlzONMV4kz0w03dUrGJXiAx2ldEDd32SKBUxLJ/CYtNEdoXZ4Mm
 o0OJfDbaSJwlqK7FhdZJE3kCSkUvlmVasDErXfQRDVHy5sx9Ufr4P0kkaZyXDys1
 UAWFcCJwy9dgFR2679PoXk1s/gHf10wuk6FOs8ESbcJJOvCljxWrOIqKNzSUL3HQ
 Hj5V1zHYXB6HtgiI7tbKxSn/d2uajHI9b8LVOkyV9RKsb1DGGRgNQQ5Kz4Zud7dC
 vuQ2jVw7JIo41W3QIXvcL1KdQeje2nwYwuwNREZaEDZJsGLjxaJ6dNyBVgkmYYFP
 gxhplSOSsFjQRGEsBrc2utWxDnLVGPHEAt7iHVSqjwVUyARvxjo=
 =e2T1
 -----END PGP SIGNATURE-----

Merge tag 'v1.20.0rc3' into develop

Synapse 1.20.0rc3 (2020-09-11)
==============================

Bugfixes
--------

- Fix a bug introduced in v1.20.0rc1 where the wrong exception was raised when invalid JSON data is encountered. ([\#8291](https://github.com/matrix-org/synapse/issues/8291))
2020-09-11 08:30:36 -04:00
Erik Johnston fe8ed1b46f
Make `StreamToken.room_key` be a `RoomStreamToken` instance. (#8281) 2020-09-11 12:22:55 +01:00
Patrick Cloke b86764662b
Fix the exception that is raised when invalid JSON is encountered. (#8291) 2020-09-10 14:55:25 -04:00
Dan Callaghan c312ee3cde
Use TLSv1.2 for fake servers in tests (#8208)
Some Linux distros have begun disabling TLSv1.0 and TLSv1.1 by default
for security reasons, for example in Fedora 33 onwards:

https://fedoraproject.org/wiki/Changes/StrongCryptoSettings2

Use TLSv1.2 for the fake TLS servers created in the test suite, to avoid
failures due to OpenSSL disallowing TLSv1.0:

    <twisted.python.failure.Failure OpenSSL.SSL.Error: [('SSL routines',
    'ssl_choose_client_version', 'unsupported protocol')]>

Signed-off-by: Dan Callaghan <djc@djc.id.au>
2020-09-10 19:49:08 +01:00
Andrew Morgan a3a90ee031
Show a confirmation page during user password reset (#8004)
This PR adds a confirmation step to resetting your user password between clicking the link in your email and your password actually being reset.

This is to better align our password reset flow with the industry standard of requiring a confirmation from the user after email validation.
2020-09-10 11:45:12 +01:00
Patrick Cloke b312769c0e
Do not error when thumbnailing invalid files (#8236)
If a file cannot be thumbnailed for some reason (e.g. the file is empty), then
catch the exception and convert it to a reasonable error message for the client.
2020-09-09 12:59:41 -04:00
Erik Johnston c9dbee50ae
Fixup pusher pool notifications (#8287)
`pusher_pool.on_new_notifications` expected a min and max stream ID, however that was not what we were passing in. Instead, let's just pass it the current max stream ID and have it track the last stream ID it got passed.

I believe that it mostly worked as we called the function for every event. However, it would break for events that got persisted out of order, i.e, that were persisted but the max stream ID wasn't incremented as not all preceding events had finished persisting, and push for that event would be delayed until another event got pushed to the effected users.
2020-09-09 16:56:08 +01:00
Erik Johnston dc9dcdbd59 Revert "Fixup pusher pool notifications"
This reverts commit e7fd336a53.
2020-09-09 16:19:22 +01:00
Erik Johnston e7fd336a53 Fixup pusher pool notifications 2020-09-09 16:17:50 +01:00
reivilibre a5370072b5
Don't remember `enabled` of deleted push rules and properly return 404 for missing push rules in `.../actions` and `.../enabled` (#7796)
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2020-09-09 11:39:39 +01:00
Andrew Morgan 094896a69d
Add a config option for validating 'next_link' parameters against a domain whitelist (#8275)
This is a config option ported over from DINUM's Sydent: https://github.com/matrix-org/sydent/pull/285

They've switched to validating 3PIDs via Synapse rather than Sydent, and would like to retain this functionality.

This original purpose for this change is phishing prevention. This solution could also potentially be replaced by a similar one to https://github.com/matrix-org/synapse/pull/8004, but across all `*/submit_token` endpoint.

This option may still be useful to enterprise even with that safeguard in place though, if they want to be absolutely sure that their employees don't follow links to other domains.
2020-09-08 16:03:09 +01:00
Erik Johnston deedb91732
Fix `MultiWriterIdGenerator.current_position`. (#8257)
It did not correctly handle IDs finishing being persisted out of
order, resulting in the `current_position` lagging until new IDs are
persisted.
2020-09-08 14:26:54 +01:00
Patrick Cloke cef00211c8
Allow for make_awaitable's return value to be re-used. (#8261) 2020-09-08 07:26:55 -04:00
Andrew Morgan 68cdb3708e
Rename 'populate_stats_process_rooms_2' background job back to 'populate_stats_process_rooms' again (#8243)
Fixes https://github.com/matrix-org/synapse/issues/8238

Alongside the delta file, some changes were also necessary to the codebase to remove references to the now defunct `populate_stats_process_rooms_2` background job. Thankfully the latter doesn't seem to have made it into any documentation yet :)
2020-09-08 11:05:59 +01:00
reivilibre 765437df54
Add tests for `last_successful_stream_ordering` (#8258) 2020-09-07 10:11:38 +01:00
reivilibre 58f61f10f7
Catch-up after Federation Outage (split, 1) (#8230)
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
2020-09-04 12:22:23 +01:00
Patrick Cloke c619253db8
Stop sub-classing object (#8249) 2020-09-04 06:54:56 -04:00
Brendan Abolivier 5a1dd297c3
Re-implement unread counts (again) (#8059) 2020-09-02 17:19:37 +01:00
Will Hunt b257c788c0
Add /user/{user_id}/shared_rooms/ api (#7785)
* Add shared_rooms api

* Add changelog

* Add .

* Wrap response in {"rooms": }

* linting

* Add unstable_features key

* Remove options from isort that aren't part of 5.x

`-y` and `-rc` are now default behaviour and no longer exist.

`dont-skip` is no longer required

https://timothycrosley.github.io/isort/CHANGELOG/#500-penny-july-4-2020

* Update imports to make isort happy

* Add changelog

* Update tox.ini file with correct invocation

* fix linting again for isort

* Vendor prefix unstable API

* Fix to match spec

* import Codes

* import Codes

* Use FORBIDDEN

* Update changelog.d/7785.feature

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>

* Implement get_shared_rooms_for_users

* a comma

* trailing whitespace

* Handle the easy feedback

* Switch to using runInteraction

* Add tests

* Feedback

* Seperate unstable endpoint from v2

* Add upgrade node

* a line

* Fix style by adding a blank line at EOF.

* Update synapse/storage/databases/main/user_directory.py

Co-authored-by: Tulir Asokan <tulir@maunium.net>

* Update synapse/storage/databases/main/user_directory.py

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>

* Update UPGRADE.rst

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>

* Fix UPGRADE/CHANGELOG unstable paths

unstable unstable unstable

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Tulir Asokan <tulir@maunium.net>

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Tulir Asokan <tulir@maunium.net>
2020-09-02 13:18:40 +01:00
Patrick Cloke 5bf8e5f55b
Convert the well known resolver to async (#8214) 2020-09-01 09:15:22 -04:00
Patrick Cloke da77520cd1
Convert additional databases to async/await part 2 (#8200) 2020-09-01 08:39:04 -04:00
Erik Johnston bbb3c8641c
Make MultiWriterIDGenerator work for streams that use negative stream IDs (#8203)
This is so that we can use it for the backfill events stream.
2020-09-01 13:36:25 +01:00
Richard van der Hoff 45e8f7726f
Rename `get_e2e_device_keys` to better reflect its purpose (#8205)
... and to show that it does something slightly different to
`_get_e2e_device_keys_txn`.

`include_all_devices` and `include_deleted_devices` were never used (and
`include_deleted_devices` was broken, since that would cause `None`s in the
result which were not handled in the loop below.

Add some typing too.
2020-08-29 00:14:17 +01:00