Commit Graph

40 Commits (eaf4d11af9da7d6d9ce71cb83f70424bb38e0703)

Author SHA1 Message Date
Erik Johnston cf82338930
Merge pull request #4627 from matrix-org/erikj/user_ips_analyze
Analyze user_ips before running deduplication
2019-02-12 13:05:09 +00:00
Erik Johnston 495ea92350 Fix pep8 2019-02-12 12:40:42 +00:00
Erik Johnston 483ba85c7a Analyze user_ips before running deduplication
Due to the table locks taken out by the naive upsert, the table
statistics may be out of date. During deduplication it is important that
the correct index is used as otherwise a full table scan may be
incorrectly used, which can end up thrashing the database badly.
2019-02-12 11:55:27 +00:00
Erik Johnston 362d80b770 Reduce user_ips bloat during dedupe background update
The background update to remove duplicate rows naively deleted and
reinserted the duplicates. For large tables with a large number of
duplicates this causes a lot of bloat (with postgres), as the inserted
rows are appended to the table, since deleted rows will not be
overwritten until a VACUUM has happened.

This should hopefully also help ensure that the query in the last batch
uses the correct index, as inserting a large number of new rows without
analyzing will upset the query planner.
2019-02-12 11:39:34 +00:00
Amber Brown 58f6c48183
Use native UPSERTs where possible (#4306) 2019-01-24 21:31:54 +11:00
Erik Johnston 90743c9d89 Fixup removal of duplicate `user_ips` rows (#4432)
* Remove unnecessary ORDER BY clause

* Add logging

* Newsfile
2019-01-23 19:45:18 +11:00
Erik Johnston 7f503f83b9 Refactor to rewrite the SQL instead 2019-01-22 16:31:05 +00:00
Erik Johnston 1c9704f8ab Don't shadow params 2019-01-22 16:20:33 +00:00
Erik Johnston 2557531f0f Fix bug when removing duplicate rows from user_ips
This was caused by accidentally overwritting a `last_seen` variable
in a for loop, causing the wrong value to be written to the progress
table. The result of which was that we didn't scan sections of the table
when searching for duplicates, and so some duplicates did not get
deleted.
2019-01-22 13:33:46 +00:00
Amber Brown a35c66a00b
Remove duplicates in the user_ips table and add an index (#4370) 2019-01-12 06:21:50 +11:00
Amber Brown 1f3f5fcf52
Fix client IPs being broken on Python 3 (#3908) 2018-09-20 20:14:34 +10:00
Amber Brown 99dd975dae
Run tests under PostgreSQL (#3423) 2018-08-13 16:47:46 +10:00
Neil Johnson e10830e976 wip commit - tests failing 2018-08-03 17:55:50 +01:00
Neil Johnson 74b1d46ad9 do mau checks based on monthly_active_users table 2018-08-02 16:57:35 +01:00
Neil Johnson 00f99f74b1 insertion into monthly_active_users 2018-08-02 13:47:19 +01:00
Richard van der Hoff 03751a6420 Fix some looping_call calls which were broken in #3604
It turns out that looping_call does check the deferred returned by its
callback, and (at least in the case of client_ips), we were relying on this,
and I broke it in #3604.

Update run_as_background_process to return the deferred, and make sure we
return it to clock.looping_call.
2018-07-26 11:48:08 +01:00
Richard van der Hoff 667fba68f3 Run things as background processes
This fixes #3518, and ensures that we get useful logs and metrics for lots of
things that happen in the background.

(There are certainly more things that happen in the background; these are just
the common ones I've found running a single-process synapse locally).
2018-07-18 20:55:05 +01:00
Amber Brown 49af402019 run isort 2018-07-09 16:09:20 +10:00
Amber Brown 77ac14b960
Pass around the reactor explicitly (#3385) 2018-06-22 09:37:10 +01:00
Adrian Tschira 933bf2dd35 replace some iteritems with six
Signed-off-by: Adrian Tschira <nota@notafile.com>
2018-05-19 17:59:26 +02:00
Neil Johnson 617bf40924 Generate user daily stats 2018-04-25 17:37:29 +01:00
Neil Johnson 788e69098c Add user_ips last seen index 2018-03-28 12:03:13 +01:00
Richard van der Hoff 6cfee09be9 Make __init__ consitstent across Store heirarchy
Add db_conn parameters to the `__init__` methods of the *Store classes, so that
they are all consistent, which makes the multiple inheritance work correctly
(and so that we can later extract mixins which can be used in the slavedstores)
2017-11-13 10:46:07 +00:00
Erik Johnston ed9a7f5436 Merge pull request #2309 from matrix-org/erikj/user_ip_repl
Fix up user_ip replication commands
2017-07-06 14:33:14 +01:00
Erik Johnston b5e8d529e6 Define CACHE_SIZE_FACTOR once 2017-07-04 09:56:44 +01:00
Erik Johnston 8c23221666 Fix up 2017-06-27 15:53:45 +01:00
Erik Johnston a0a561ae85 Fix up client ips to read from pending data 2017-06-27 14:46:12 +01:00
Erik Johnston ed3d0170d9 Batch upsert user ips 2017-06-27 13:37:04 +01:00
Erik Johnston 6f83c4537c Increase size of IP cache 2017-06-07 10:18:44 +01:00
Erik Johnston dcabef952c Increase client_ip cache size 2017-05-08 15:09:19 +01:00
Erik Johnston 85657eedf8 Bail on where clause instead 2017-04-11 16:24:31 +01:00
Erik Johnston b48045a8f5 Don't bother with outer check for now 2017-04-11 16:23:24 +01:00
Erik Johnston 34840cdcef Fix getting latest device IP for user with no devices 2017-04-11 09:56:54 +01:00
Richard van der Hoff 363786845b PEP8 2016-07-22 13:21:07 +01:00
Richard van der Hoff ec5717caf5 Create index on user_ips in the background
user_ips is kinda big, so really we want to add the index in the background
once we're running. Replace the schema delta with one which will do that.

I've done this in a way that's reasonably easy to reuse as there a few other
indexes I need, and I don't suppose they will be the last.
2016-07-22 13:16:39 +01:00
Richard van der Hoff c445f5fec7 storage/client_ips: remove some dead code 2016-07-21 11:58:47 +01:00
Richard van der Hoff 7314bf4682 Merge branch 'develop' into rav/get_devices_api
(pick up PR #938 in the hope of fixing the UTs)
2016-07-20 17:40:00 +01:00
Richard van der Hoff bc8f265f0a GET /devices endpoint
implement a GET /devices endpoint which lists all of the user's devices.

It also returns the last IP where we saw that device, so there is some dancing
to fish that out of the user_ips table.
2016-07-20 16:42:32 +01:00
Richard van der Hoff ec041b335e Record device_id in client_ips
Record the device_id when we add a client ip; it's somewhat redundant as we
could get it via the access_token, but it will make querying rather easier.
2016-07-20 16:41:03 +01:00
Mark Haines eef541a291 Move insert_client_ip to a separate class 2016-06-03 14:42:35 +01:00