devices: use combined ANY clause for faster cleanup (#15861)

Old device entries for the same user were being removed in individual
SQL commands, making the batch take way longer than necessary.

This combines the commands into a single one with a IN/ANY clause.

Example of log entry before the change, regularly observed with
"log_min_duration_statement = 10000" in PostgreSQL's config:

    LOG:  duration: 42538.282 ms  statement:
    DELETE FROM device_lists_stream
    WHERE user_id = '@someone' AND device_id = 'someid1'
    AND stream_id < 123456789
    ;
    DELETE FROM device_lists_stream
    WHERE user_id = '@someone' AND device_id = 'someid2'
    AND stream_id < 123456789
    ;
    [repeated for each device ID of that user, potentially a lot...]

With the patch applied on my instance for the past couple of days, I
no longer notice overly long statements of that particular kind.

Signed-off-by: pacien <pacien.trangirard@pacien.net>
pull/15876/head
pacien 2023-07-03 16:39:38 +02:00 committed by GitHub
parent cd8b73aa97
commit 07d7cbfe69
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 10 additions and 5 deletions

1
changelog.d/15861.misc Normal file
View File

@ -0,0 +1 @@
Optimised cleanup of old entries in device_lists_stream.

View File

@ -1950,12 +1950,16 @@ class DeviceStore(DeviceWorkerStore, DeviceBackgroundUpdateStore):
# Delete older entries in the table, as we really only care about
# when the latest change happened.
txn.execute_batch(
"""
cleanup_obsolete_stmt = """
DELETE FROM device_lists_stream
WHERE user_id = ? AND device_id = ? AND stream_id < ?
""",
[(user_id, device_id, min_stream_id) for device_id in device_ids],
WHERE user_id = ? AND stream_id < ? AND %s
"""
device_ids_clause, device_ids_args = make_in_list_sql_clause(
txn.database_engine, "device_id", device_ids
)
txn.execute(
cleanup_obsolete_stmt % (device_ids_clause,),
[user_id, min_stream_id] + device_ids_args,
)
self.db_pool.simple_insert_many_txn(