Commit Graph

11914 Commits (2a8ed93bd42518042d732ff79559b769f52bedeb)

Author SHA1 Message Date
Richard van der Hoff 1bbc9e2df6
Clean up exception handling in SAML2ResponseResource (#7614)
* Expose `return_html_error`, and allow it to take a Jinja2 template instead of a raw string

* Clean up exception handling in SAML2ResponseResource

  * use the existing code in `return_html_error` instead of re-implementing it
    (giving it a jinja2 template rather than inventing a new form of template)

  * do the exception-catching in the REST layer rather than in the handler
    layer, to make sure we catch all exceptions.
2020-06-03 10:41:12 +01:00
Olof Johansson fe434cd3c9
Fix a bug in automatic user creation with m.login.jwt. (#7585) 2020-06-01 12:55:07 -04:00
Brendan Abolivier 33c39ab93c
Process cross-signing keys when resyncing device lists (#7594)
It looks like `user_device_resync` was ignoring cross-signing keys from the results received from the remote server. This patch fixes this, by processing these keys using the same process `_handle_signing_key_updates` does (and effectively factor that part out of that function).
2020-06-01 17:47:30 +02:00
Dirk Klimpel 901b1fa561
Email notifications for new users when creating via the Admin API. (#7267) 2020-06-01 15:34:33 +01:00
Dagfinn Ilmari Mannsåker df8a3cef6b
Improve performance of _get_state_groups_from_groups_txn (#7567)
The query keeps showing up in my slow query log.

This changes the plan under the top-level Sort node from

```
    WindowAgg  (cost=280335.88..292963.15 rows=561212 width=80) (actual time=138.651..160.562 rows=27112 loops=1)
      ->  Sort  (cost=280335.88..281738.91 rows=561212 width=84) (actual time=138.597..140.622 rows=27112 loops=1)
            Sort Key: state_groups_state.type, state_groups_state.state_key, state_groups_state.state_group
            Sort Method: quicksort  Memory: 4581kB
            ->  Nested Loop  (cost=2.83..226745.22 rows=561212 width=84) (actual time=21.548..47.657 rows=27112 loops=1)
                  ->  HashAggregate  (cost=2.27..3.28 rows=101 width=8) (actual time=21.526..21.535 rows=20 loops=1)
                        Group Key: state.state_group
                        ->  CTE Scan on state  (cost=0.00..2.02 rows=101 width=8) (actual time=21.280..21.493 rows=20 loops=1)
                  ->  Index Scan using state_groups_state_type_idx on state_groups_state  (cost=0.56..2189.40 rows=5557 width=84) (actual time=0.005..0.991 rows=1356 loops=20)
                        Index Cond: (state_group = state.state_group)
```

to

```
    Nested Loop  (cost=2.83..226745.22 rows=561212 width=84) (actual time=24.194..52.834 rows=27112 loops=1)
      ->  HashAggregate  (cost=2.27..3.28 rows=101 width=8) (actual time=24.130..24.138 rows=20 loops=1)
            Group Key: state.state_group
            ->  CTE Scan on state  (cost=0.00..2.02 rows=101 width=8) (actual time=23.887..24.113 rows=20 loops=1)
      ->  Index Scan using state_groups_state_type_idx on state_groups_state  (cost=0.56..2189.40 rows=5557 width=84) (actual time=0.016..1.159 rows=1356 loops=20)
            Index Cond: (state_group = state.state_group)
```

This cuts the execution time from ~190ms to ~130ms, i.e. a reduction
of ~30%.

The full plans are visualised at https://explain.depesz.com/s/WpbT and
https://explain.depesz.com/s/KlEk

Signed-off-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
2020-06-01 15:23:43 +01:00
Patrick Cloke 6af9cdca24
Convert groups local and server to async/await. (#7600) 2020-06-01 07:28:43 -04:00
Brendan Abolivier c1bdd4fac7
Don't fail all of an iteration of the device list retry loop on error (#7609)
Without this patch, if an error happens which isn't caught by `user_device_resync`, then `_maybe_retry_device_resync` would fail, without retrying the next users in the iteration. This patch fixes this so that it now only logs an error in this case.
2020-06-01 12:55:14 +02:00
Dagfinn Ilmari Mannsåker 2dc430d36e
Use upsert when inserting read receipts (#7607)
Fixes #7469

Signed-off-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
2020-06-01 10:53:06 +01:00
Erik Johnston cb495f526d
Fix 'FederationGroupsRoomsServlet' API when group has room server is not in. (#7599) 2020-05-29 17:49:47 +01:00
Erik Johnston f5353eff21
Make inflight background metrics more efficient. (#7597)
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2020-05-29 13:25:32 +01:00
Brendan Abolivier 5cb470b495 Merge branch 'master' into develop 2020-05-28 12:50:26 +02:00
Brendan Abolivier 61469308df
1.14.0 2020-05-28 12:36:00 +02:00
Erik Johnston 8c5f88fa4d
Merge pull request #7584 from matrix-org/erikj/save_and_send_fed_token_in_bg
Speed up processing of federation stream RDATA rows.
2020-05-27 20:06:29 +01:00
Erik Johnston ef3934ec8f Ensure we persist and ack the same token 2020-05-27 19:45:42 +01:00
Erik Johnston 3d7f1b53d9 Remove spurious change 2020-05-27 19:41:44 +01:00
Erik Johnston 35c308731d Speed up processing of federation stream RDATA rows.
Instead of storing and sending an ACK for every single row we send
synchronously, we instead do it asynchronously while batching up
updates.
2020-05-27 19:34:07 +01:00
Christopher Cooper c4a820b32a
allow emails to be passed through SAML (#7385)
Signed-off-by: Christopher Cooper <cooperc@ocf.berkeley.edu>
2020-05-27 17:40:08 +01:00
Brendan Abolivier 5af572ada0 Synapse 1.14.0rc2 (2020-05-27)
==============================
 
 Bugfixes
 --------
 
 - Fix cache config to not apply cache factor to event cache. Regression in v1.14.0rc1. ([\#7578](https://github.com/matrix-org/synapse/issues/7578))
 - Fix bug where `ReplicationStreamer` was not always started when replication was enabled. Bug introduced in v1.14.0rc1. ([\#7579](https://github.com/matrix-org/synapse/issues/7579))
 - Fix specifying individual cache factors for caches with special characters in their name. Regression in v1.14.0rc1. ([\#7580](https://github.com/matrix-org/synapse/issues/7580))
 
 Improved Documentation
 ----------------------
 
 - Fix the OIDC `client_auth_method` value in the sample config. ([\#7581](https://github.com/matrix-org/synapse/issues/7581))
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdVkXOgzrGzds0jtrHgFcFF8ZFs0FAl7Oh9sACgkQHgFcFF8Z
 Fs2OmQ//SlksllrJ7egG0JeprfCCz9TJt7cYJJhCprqfBVNs8lIc+UT/NX0/duGe
 g2KC4kPGEyusRmGSV43Yt/m/qqhcdM4tXq4iErgHG/xOTqNN/GVvrE3RaUvo9ydS
 9IdeIkDON5ylEe8sSigbBGUnpCS20Stch1z2edoUSQHNnMMSgmTNpFaZnCEXc+Us
 EYv+HfCeShdLXbzMioYj6B5qNiRnG+hJmz+h40Bmp/HAuZRyp4kx5EawAgMXIDfy
 DGWn3H7TksqC+7zETAlQgMwFzggwQx64Dpa9w2RFAchUCG6bi8pM2U3doVDGlu1i
 4HcltfBE+fE5Sy2tT2zz8qpaGGFwAp6K/c+4h29PwVBKSHRN4nepSqNHHb3fA1Ea
 GI8DWiHGyZWzlMmKI4x85lMFyAGvGJiO1Jo8icietJHZ3+7tr6SUlnIPwlLFdkUv
 xCKmlrYzwDO5enzoF6HuddnMLbtE304Ckr+vj5gWpBwYf3FIXpUS101YBV8zXTsB
 NJBMu2fjYEkPQ/d9YjWRZwL312Cb68Kytlp2ETiTuuyfLCA5Df2wdA6yReemyg//
 ZFfN1Z/Dc6p6MVRF3sf38jcWKX3r9ErNQ0p5q//uo4JbpnMW7FLXiSv/xonj4JTv
 VLSFy6F0YIIVTol4H9Fj7f3iq8/zsJe3kBE/Kycd4uhplmfvK2w=
 =BVTL
 -----END PGP SIGNATURE-----

Merge tag 'v1.14.0rc2' into develop

Synapse 1.14.0rc2 (2020-05-27)
==============================

Bugfixes
--------

- Fix cache config to not apply cache factor to event cache. Regression in v1.14.0rc1. ([\#7578](https://github.com/matrix-org/synapse/issues/7578))
- Fix bug where `ReplicationStreamer` was not always started when replication was enabled. Bug introduced in v1.14.0rc1. ([\#7579](https://github.com/matrix-org/synapse/issues/7579))
- Fix specifying individual cache factors for caches with special characters in their name. Regression in v1.14.0rc1. ([\#7580](https://github.com/matrix-org/synapse/issues/7580))

Improved Documentation
----------------------

- Fix the OIDC `client_auth_method` value in the sample config. ([\#7581](https://github.com/matrix-org/synapse/issues/7581))
2020-05-27 17:35:29 +02:00
Andrew Morgan 0a6e837aaa
Fix incorrect placeholder syntax in database prepartion code (#7575)
We were using `logger` syntax which isn't supported by `Exception`s.
2020-05-27 16:26:59 +01:00
Brendan Abolivier b4109499b4 1.14.0rc2 2020-05-27 17:22:28 +02:00
Jason Robinson 4be968d05d
Fix sample config docs error (#7581)
'client_auth_method' commented out value was erronously 'client_auth_basic',
when code and docstring says it should be 'client_secret_basic'.

Signed-off-by: Jason Robinson <jasonr@matrix.org>
2020-05-27 13:52:18 +01:00
Erik Johnston d7d8a2e7ee Fix up comments 2020-05-27 13:34:46 +01:00
Erik Johnston 4ba55559ac
Fix specifying cache factors via env vars with * in name. (#7580)
This mostly applise to `*stateGroupCache*` and co.

Broke in #6391.
2020-05-27 13:17:01 +01:00
Erik Johnston eefc6b3a0d
Don't apply cache factor to event cache. (#7578)
This is already correctly done when we instansiate the cache, but wasn't
when it got reloaded (which always happens at least once on startup).
2020-05-27 12:04:37 +01:00
Erik Johnston 9bac5d62b3
Ensure ReplicationStreamer is always started when replication enabled. (#7579)
Fixes #7566.
2020-05-27 11:44:19 +01:00
Brendan Abolivier 98483890ee Merge branch 'develop' of github.com:matrix-org/synapse into develop 2020-05-26 20:30:41 +02:00
Patrick Cloke ef884f6d04
Convert identity handler to async/await. (#7561) 2020-05-26 13:46:22 -04:00
Brendan Abolivier 87e417c5cb
Not full release yet, this is rc1 2020-05-26 17:20:43 +02:00
Brendan Abolivier 3b19c17247 1.14.0 2020-05-26 16:45:37 +02:00
Richard van der Hoff edd9a7214c
Replace device_27_unique_idx bg update with a fg one (#7562)
The bg update never managed to complete, because it kept being interrupted by
transactions which want to take a lock.

Just doing it in the foreground isn't that bad, and is a good deal simpler.
2020-05-26 11:43:17 +01:00
Richard van der Hoff 04729b86f8
Fix incorrect exception handling in KeyUploadServlet.on_POST (#7563)
Introduced in #7556
2020-05-26 11:42:22 +01:00
Richard van der Hoff 00db90f409
Fix recording of federation stream token (#7564)
A couple of changes of significance:

 * remove the `_last_ack < federation_position` condition, so that
   updates will still be correctly processed after restart

 * Correctly wire up send_federation_ack to the right class.
2020-05-26 11:41:38 +01:00
Richard van der Hoff d14c4d6b6d
Simplify reap_monthly_active_users (#7558)
we can use `make_in_list_sql_clause` rather than doing our own half-baked
equivalent, which has the benefit of working just fine with empty lists.

(This has quite a lot of tests, so I think it's pretty safe)
2020-05-23 01:20:10 +01:00
Richard van der Hoff f4269694ce
Optimise some references to hs.config (#7546)
These are surprisingly expensive, and we only really need to do them at startup.
2020-05-22 21:47:07 +01:00
Erik Johnston 2901f54359
Fix missing CORS headers on OPTION responses (#7560)
Broke in #7534.
2020-05-22 17:42:39 +01:00
Erik Johnston e5c67d04db
Add option to move event persistence off master (#7517) 2020-05-22 16:11:35 +01:00
Patrick Cloke 4429764c9f
Return 200 OK for all OPTIONS requests (#7534) 2020-05-22 09:30:07 -04:00
Erik Johnston 1531b214fc
Add ability to wait for replication streams (#7542)
The idea here is that if an instance persists an event via the replication HTTP API it can return before we receive that event over replication, which can lead to races where code assumes that persisting an event immediately updates various caches (e.g. current state of the room).

Most of Synapse doesn't hit such races, so we don't do the waiting automagically, instead we do so where necessary to avoid unnecessary delays. We may decide to change our minds here if it turns out there are a lot of subtle races going on.

People probably want to look at this commit by commit.
2020-05-22 14:21:54 +01:00
Erik Johnston 06a02bc1ce
Convert sending mail to async/await. (#7557)
Mainly because sometimes the email push code raises exceptions where the
stack traces have gotten lost, which is hopefully fixed by this.
2020-05-22 13:41:11 +01:00
Patrick Cloke 66f2ebc22f
Use a non-empty RelayState for user interactive auth with SAML. (#7552) 2020-05-22 07:17:30 -04:00
Erik Johnston 710d958c64
On upgrade room only send canonical alias once. (#7547)
Instead of doing a complicated dance of deleting and moving aliases one
by one, which sends a canonical alias update into the old room for each
one, lets do it all in one go.

This also changes the function to move *all* local alias events to the new
room, however that happens later on anyway.
2020-05-22 11:41:41 +01:00
Erik Johnston 547e4dd83e
Fix exception reporting due to HTTP request errors. (#7556)
These are business as usual errors, rather than stuff we want to log at
error.
2020-05-22 11:39:20 +01:00
Ivan Shapovalov ac481a738e
synapse.metrics: implement detailed memory usage reporting on PyPy (#7536)
PyPy's gc.get_stats() returns an object containing detailed allocator statistics
which could be beneficial to collect as metrics.

Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
2020-05-22 11:08:41 +01:00
Richard van der Hoff a0f99f81b3
Fix stacktrace mangling in `patch_inline_callbacks` (#7554)
`Failure()` is more cunning than `Failure(e)`.
2020-05-22 10:17:36 +01:00
Richard van der Hoff d84bdfe599
mypy for synapse.http.site (#7553) 2020-05-22 10:12:17 +01:00
Richard van der Hoff 66a564c859
Fix some DETECTED VIOLATIONS in the config file (#7550)
consistency ftw
2020-05-22 10:11:50 +01:00
Brendan Abolivier d1ae1015ec
Retry to sync out of sync device lists (#7453)
When a call to `user_device_resync` fails, we don't currently mark the remote user's device list as out of sync, nor do we retry to sync it.

https://github.com/matrix-org/synapse/pull/6776 introduced some code infrastructure to mark device lists as stale/out of sync.

This commit uses that code infrastructure to mark device lists as out of sync if processing an incoming device list update makes the device handler realise that the device list is out of sync, but we can't resync right now.

It also adds a looping call to retry all failed resync every 30s. This shouldn't cause too much spam in the logs as this commit also removes the "Failed to handle device list update for..." warning logs when catching `NotRetryingDestination`.

Fixes #7418
2020-05-21 17:41:12 +02:00
Richard van der Hoff 0bbbd10513
Stub out GET presence requests in the frontend proxy (#7545)
We don't really make any promises about returning accurate presence data when
presence is disabled, so we may as well just return a static response, rather
than making the master handle a request.
2020-05-21 14:36:46 +01:00
Richard van der Hoff 075375bbc9 add a comment 2020-05-21 13:25:41 +01:00
Erik Johnston f6f92845f8
Fix bug in persist events when dealing with non member types. (#7548)
`_is_server_still_joined` will throw if it is given state updates with non-user ID state keys with local user leaves. This is actually rarely a problem since local leaves almost always get persisted by themselves.

(I discovered this on a branch that was otherwise broken, so I haven't seen this in the wild)
2020-05-21 13:20:10 +01:00