Commit Graph

4112 Commits (3acf85c85f62655077f8c4b466389de4a4183604)

Author SHA1 Message Date
Erik Johnston 3acf85c85f
Reduce likelihood of Postgres table scanning `state_groups_state`. (#10359)
The postgres statistics collector sometimes massively underestimates the
number of distinct state groups are in the `state_groups_state`, which
can cause postgres to use table scans for queries for multiple state
groups.

We fix this by manually setting `n_distinct` on the column.
2021-07-15 16:02:12 +01:00
Patrick Cloke 2d16e69b4b
Show all joinable rooms in the spaces summary. (#10298)
Previously only world-readable rooms were shown. This means that
rooms which are public, knockable, or invite-only with a pending invitation,
are included in a space summary. It also applies the same logic to
the experimental room version from MSC3083 -- if a user has access
to the proper allowed rooms then it is shown in the spaces summary.

This change is made per MSC3173 allowing stripped state of a room to
be shown to any potential room joiner.
2021-07-13 08:59:27 -04:00
Erik Johnston 879d8c1ee1
Fix federation inbound age metric. (#10355)
We should be reporting the age rather than absolute timestamp.
2021-07-13 11:33:15 +01:00
Richard van der Hoff c2c364f27f
Replace `room_depth.min_depth` with a BIGINT (#10289)
while I'm dealing with INTEGERs and BIGINTs, let's replace room_depth.min_depth
with a BIGINT.
2021-07-12 17:22:54 +01:00
reivilibre ca9dface8c
Fix the user directory becoming broken (and noisy errors being logged) when knocking and room statistics are in use. (#10344)
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
2021-07-09 14:12:47 +01:00
Richard van der Hoff 751372fa61
Switch `application_services_txns.txn_id` to BIGINT (#10349) 2021-07-09 13:01:11 +01:00
Erik Johnston 251cfc4e09 Synapse 1.38.0rc2 (2021-07-09)
==============================
 
 Bugfixes
 --------
 
 - Fix bug where inbound federation in a room could be delayed due to not correctly dropping a lock. Introduced in v1.37.1. ([\#10336](https://github.com/matrix-org/synapse/issues/10336))
 
 Improved Documentation
 ----------------------
 
 - Update links to documentation in the sample config. Contributed by @dklimpel. ([\#10287](https://github.com/matrix-org/synapse/issues/10287))
 - Fix broken links in [INSTALL.md](INSTALL.md). Contributed by @dklimpel. ([\#10331](https://github.com/matrix-org/synapse/issues/10331))
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEBTGR3/RnAzBGUif3pULk7RsPrAkFAmDoH+4QHGVyaWtAbWF0
 cml4Lm9yZwAKCRClQuTtGw+sCXxYCACneuRvkdvYqiH+PhPe8tXqhhJIifH1LecY
 FlJqp4OJPR2VFzio1btsgpRPQyLBLHZkJ9pgWsXAETbYOO+hSeOc4nIHsyqlSJhe
 v01sCUE4sle3DBrw15fG4XpercsiM3TFMyR9pV9laq9nIn8j+CY5K6W5t12/mYGy
 asHS0IKilCMhJlFwgE3eBr6P6fywi0JoIrr8EpfIs4eC2qDFpUlsrAQSkbE1JvdP
 O4BGZJKVysg3a6WYSWdJytqLYe942k8qUF4B4h4VmQi0xbuKSsTLiK/cFC8ohRMv
 E+O5O/KgwqwE/XOcukbsjlHxuiiFZTq6154PwLxXUpNnsMNn2/ph
 =6iBw
 -----END PGP SIGNATURE-----

Merge tag 'v1.38.0rc2' into develop

Synapse 1.38.0rc2 (2021-07-09)
==============================

Bugfixes
--------

- Fix bug where inbound federation in a room could be delayed due to not correctly dropping a lock. Introduced in v1.37.1. ([\#10336](https://github.com/matrix-org/synapse/issues/10336))

Improved Documentation
----------------------

- Update links to documentation in the sample config. Contributed by @dklimpel. ([\#10287](https://github.com/matrix-org/synapse/issues/10287))
- Fix broken links in [INSTALL.md](INSTALL.md). Contributed by @dklimpel. ([\#10331](https://github.com/matrix-org/synapse/issues/10331))
2021-07-09 11:26:17 +01:00
Andreas Rammhold e3e73e181b
Upsert redactions in case they already exists (#10343)
* Upsert redactions in case they already exists

Occasionally, in combination with retention, redactions aren't deleted
from the database whenever they are due for deletion. The server will
eventually try to backfill the deleted events and trip over the already
existing redaction events.

Switching to an UPSERT for those events allows us to recover from there
situations. The retention code still needs fixing but that is outside of
my current comfort zone on this code base.

This is related to #8707 where the error was discussed already.

Signed-off-by: Andreas Rammhold <andreas@rammhold.de>

* Also purge redactions when purging events

Previously redacints where left behind leading to backfilling issues
when the server stumbled across the already existing yet to be
backfilled redactions.

This issues has been discussed in #8707.

Signed-off-by: Andreas Rammhold <andreas@rammhold.de>
2021-07-09 11:03:02 +01:00
Erik Johnston 1579fdd54a
Ensure we always drop the federation inbound lock (#10336) 2021-07-09 10:16:54 +01:00
Cristina f6767abc05
Remove functionality associated with unused historical stats tables (#9721)
Fixes #9602
2021-07-08 16:57:13 +01:00
reivilibre aa78064869
Minor changes to `user_daily_visits` (#10324)
* Use fake time in tests in _get_start_of_day.

* Change the inequality of last_seen in user_daily_visits

Co-authored-by: Erik Johnston <erik@matrix.org>
2021-07-08 14:27:12 +01:00
Brendan Abolivier 9ad8455895
ANALYZE new stream ordering column (#10326)
Fixes #10325
2021-07-07 11:56:17 +02:00
Dirk Klimpel bcb0962a72
Fix deactivate a user if he does not have a profile (#10252) 2021-07-06 13:08:53 +01:00
Erik Johnston 6655ea5587
Add script for getting info about recently registered users (#10290) 2021-07-06 13:03:16 +01:00
Erik Johnston c65067d673
Handle old staged inbound events (#10303)
We might have events in the staging area if the service was restarted while there were unhandled events in the staging area.

Fixes #10295
2021-07-06 13:02:37 +01:00
Richard van der Hoff 0aab50c772
fix ordering of bg update (#10291)
this was a typo introduced in #10282. We don't want to end up doing the
`replace_stream_ordering_column` update after anything that comes up in
migration 60/03.
2021-07-01 18:45:55 +01:00
Erik Johnston 76addadd7c
Add some metrics to staging area (#10284) 2021-07-01 10:18:25 +01:00
Richard van der Hoff b6dbf89fae
Change more stream_ordering columns to BIGINT (#10286) 2021-06-30 17:27:20 +01:00
Richard van der Hoff 859dc05b36
Rebuild other indexes using `stream_ordering` (#10282)
We need to rebuild *all* of the indexes that use the current `stream_ordering`
column.
2021-06-30 15:01:24 +01:00
Erik Johnston 329ef5c715
Fix the inbound PDU metric (#10279)
This broke in #10272
2021-06-30 12:07:16 +01:00
Richard van der Hoff 785bceef72 Merge branch 'release-v1.37' into develop 2021-06-29 20:25:47 +01:00
Erik Johnston c54db67d0e
Handle inbound events from federation asynchronously (#10272)
Fixes #9490

This will break a couple of SyTest that are expecting failures to be added to the response of a federation /send, which obviously doesn't happen now that things are asynchronous.

Two drawbacks:

    Currently there is no logic to handle any events left in the staging area after restart, and so they'll only be handled on the next incoming event in that room. That can be fixed separately.
    We now only process one event per room at a time. This can be fixed up further down the line.
2021-06-29 19:55:22 +01:00
Erik Johnston 85d237eba7
Add a distributed lock (#10269)
This adds a simple best effort locking mechanism that works cross workers.
2021-06-29 19:15:47 +01:00
Richard van der Hoff 7647b0337f
Fix `populate_stream_ordering2` background job (#10267)
It was possible for us not to find any rows in a batch, and hence conclude that
we had finished. Let's not do that.
2021-06-29 12:43:36 +01:00
Richard van der Hoff 60efc51a2b
Migrate stream_ordering to a bigint (#10264)
* Move background update names out to a separate class

`EventsBackgroundUpdatesStore` gets inherited and we don't really want to
further pollute the namespace.

* Migrate stream_ordering to a bigint

* changelog
2021-06-29 11:25:34 +01:00
Quentin Gliech bd4919fb72
MSC2918 Refresh tokens implementation (#9450)
This implements refresh tokens, as defined by MSC2918

This MSC has been implemented client side in Hydrogen Web: vector-im/hydrogen-web#235

The basics of the MSC works: requesting refresh tokens on login, having the access tokens expire, and using the refresh token to get a new one.

Signed-off-by: Quentin Gliech <quentingliech@gmail.com>
2021-06-24 14:33:20 +01:00
Erik Johnston 33701dc116
Fix schema delta to not take as long on large servers (#10227)
Introduced in #6739
2021-06-22 12:00:45 +01:00
Eric Eastwood 96f6293de5
Add endpoints for backfilling history (MSC2716) (#9247)
Work on https://github.com/matrix-org/matrix-doc/pull/2716
2021-06-22 10:02:53 +01:00
Erik Johnston a5cd05beee
Fix performance of responding to user key requests over federation (#10221)
We were repeatedly looking up a config option in a loop (using the
unclassed config style), which is expensive enough that it can cause
large CPU usage.
2021-06-21 14:38:59 +01:00
Andrew Morgan 6f1a28de19
Fix incorrect time magnitude on delayed call (#10195)
Fixes https://github.com/matrix-org/synapse/issues/10030.

We were expecting milliseconds where we should have provided a value in seconds.

The impact of this bug isn't too bad. The code is intended to count the number of remote servers that the homeserver can see and report that as a metric. This metric is supposed to run initially 1 second after server startup, and every 60s as well. Instead, it ran 1,000 seconds after server startup, and every 60s after startup.

This fix allows for the correct metrics to be collected immediately, as well as preventing a random collection 1,000s in the future after startup.
2021-06-17 15:04:26 +01:00
Richard van der Hoff 52c60bd0a9
Fix persist_events to stop leaking opentracing contexts (#10193) 2021-06-17 11:21:53 +01:00
Richard van der Hoff 9e405034e5
Make opentracing trace into event persistence (#10134)
* Trace event persistence

When we persist a batch of events, set the parent opentracing span to the that
from the request, so that we can trace all the way in.

* changelog

* When we force tracing, set a baggage item

... so that we can check again later.

* Link in both directions between persist_events spans
2021-06-16 11:41:15 +01:00
Richard van der Hoff 1dfdc87b9b
Refactor `EventPersistenceQueue` (#10145)
some cleanup, pulled out of #10134.
2021-06-14 11:59:27 +01:00
Richard van der Hoff c1b9922498
Support for database schema version ranges (#9933)
This is essentially an implementation of the proposal made at https://hackmd.io/@richvdh/BJYXQMQHO, though the details have ended up looking slightly different.
2021-06-11 14:45:53 +01:00
Erik Johnston d26d15ba3d
Fix bug when running presence off master (#10149)
Hopefully fixes #10027.
2021-06-11 10:27:12 +01:00
Andrew Morgan a7a37437bc
Integrate knock rooms with the public rooms directory (#9359)
This PR implements the ["Changes regarding the Public Rooms Directory"](https://github.com/Sorunome/matrix-doc/blob/soru/knock/proposals/2403-knock.md#changes-regarding-the-public-rooms-directory) section of knocking MSC2403.

Specifically, it:

* Allows rooms with `join_rule` "knock" to be returned by the query behind the public rooms directory
* Adds the field `join_rule` to each room entry returned by a public rooms directory query, so clients can know whether to attempt a join or knock on a room

Based on https://github.com/matrix-org/synapse/issues/6739. Complement tests for this change: https://github.com/matrix-org/complement/pull/72
2021-06-09 20:31:31 +01:00
Sorunome d936371b69
Implement knock feature (#6739)
This PR aims to implement the knock feature as proposed in https://github.com/matrix-org/matrix-doc/pull/2403

Signed-off-by: Sorunome mail@sorunome.de
Signed-off-by: Andrew Morgan andrewm@element.io
2021-06-09 19:39:51 +01:00
Erik Johnston 1092718cac
Fix logging context when opening new DB connection (#10141)
Fixes #10140
2021-06-08 13:49:29 +01:00
Richard van der Hoff 0acb5010ec
More database opentracing (#10136)
Add a couple of extra logs/spans, to give a bit of a better idea.
2021-06-07 18:01:32 +01:00
Richard van der Hoff 9eea4646be
Add OpenTracing for database activity. (#10113)
This adds quite a lot of OpenTracing decoration for database activity. Specifically it adds tracing at four different levels:

 * emit a span for each "interaction" - ie, the top level database function that we tend to call "transaction", but isn't really, because it can end up as multiple transactions.
 * emit a span while we hold a database connection open
 * emit a span for each database transaction - actual actual transaction.
 * emit a span for each database query.

I'm aware this might be quite a lot of overhead, but even just running it on a local Synapse it looks really interesting, and I hope the overhead can be offset just by turning down the sampling frequency and finding other ways of tracing requests of interest (eg, the `force_tracing_for_users` setting).
2021-06-03 16:31:56 +01:00
Dirk Klimpel 0284d2a297
Add new admin APIs to remove media by media ID from quarantine. (#10044)
Related to: #6681, #5956, #10040

Signed-off-by: Dirk Klimpel dirk@klimpel.org
2021-06-02 18:50:35 +01:00
Richard van der Hoff b4b2fd2ece
add a cache to have_seen_event (#9953)
Empirically, this helped my server considerably when handling gaps in Matrix HQ. The problem was that we would repeatedly call have_seen_events for the same set of (50K or so) auth_events, each of which would take many minutes to complete, even though it's only an index scan.
2021-06-01 12:04:47 +01:00
Callum Brown 8fb9af570f
Make reason and score optional for report_event (#10077)
Implements MSC2414: https://github.com/matrix-org/matrix-doc/pull/2414
See #8551 

Signed-off-by: Callum Brown <callum@calcuode.com>
2021-05-27 18:42:23 +01:00
Richard van der Hoff 224f2f949b
Combine `LruCache.invalidate` and `invalidate_many` (#9973)
* Make `invalidate` and `invalidate_many` do the same thing

... so that we can do either over the invalidation replication stream, and also
because they always confused me a bit.

* Kill off `invalidate_many`

* changelog
2021-05-27 10:33:56 +01:00
Dirk Klimpel 65e6c64d83
Add an admin API for unprotecting local media from quarantine (#10040)
Signed-off-by: Dirk Klimpel dirk@klimpel.org
2021-05-26 11:19:47 +01:00
Patrick Cloke 7adcb20fc0
Add missing type hints to synapse.util (#9982) 2021-05-24 15:32:01 -04:00
Richard van der Hoff c0df6bae06
Remove `keylen` from `LruCache`. (#9993)
`keylen` seems to be a thing that is frequently incorrectly set, and we don't really need it.

The only time it was used was to figure out if we had removed a subtree in `del_multi`, which we can do better by changing `TreeCache.pop` to return a different type (`TreeCacheNode`).

Commits should be independently reviewable.
2021-05-24 14:02:01 +01:00
Eric Eastwood 5f1198a67e
Fix `get_state_ids_for_event` return type typo to match what the function actually does (#10050)
It looks like a typo copy/paste from `get_state_for_event` above.
2021-05-24 10:43:33 +01:00
Erik Johnston 3e831f24ff
Don't hammer the database for destination retry timings every ~5mins (#10036) 2021-05-21 17:57:08 +01:00
Marek Matys 6a8643ff3d
Fixed removal of new presence stream states (#10014)
Fixes: https://github.com/matrix-org/synapse/issues/9962

This is a fix for above problem.

I fixed it by swaping the order of insertion of new records and deletion of old ones. This ensures that we don't delete fresh database records as we do deletes before inserts.

Signed-off-by: Marek Matys <themarcq@gmail.com>
2021-05-21 12:02:06 +01:00