Commit Graph

116 Commits (f5ef7e13d7e60bd9b901f90da250c7c6d1ba1f53)

Author SHA1 Message Date
reivilibre 9af2be192a
Remove legacy Prometheus metrics names. They were deprecated in Synapse v1.69.0 and disabled by default in Synapse v1.71.0. (#14538) 2022-11-24 09:09:17 +00:00
reivilibre be4250c7a8
Add experimental configuration option to allow disabling legacy Prometheus metric names. (#13540)
Co-authored-by: David Robertson <davidr@element.io>
2022-08-24 11:35:54 +00:00
David Robertson f30bcbd84a
Fix Synapse git info missing in version strings (#12973) 2022-06-07 15:24:11 +01:00
Richard van der Hoff ae01a7edd3
Update type annotations for compatiblity with prometheus_client 0.14 (#12389)
Principally, `prometheus_client.REGISTRY.register` now requires its argument to
extend `prometheus_client.Collector`.

Additionally, `Gauge.set` is now annotated so that passing `Optional[int]`
causes an error.
2022-04-06 12:59:04 +00:00
David Robertson 4ae956c8bb
Use version string helper from matrix-common (#11979)
* Require latest matrix-common
* Use the common function
2022-02-14 13:12:22 +00:00
reivilibre 41818cda1f
Fix type errors introduced by new annotations in the Prometheus Client library. (#11832)
Co-authored-by: David Robertson <davidr@element.io>
2022-02-02 16:51:00 +00:00
Richard van der Hoff 6a78ede569
Improve `reactor_tick_time` metric (#11724)
The existing implementation of the `python_twisted_reactor_tick_time` metric is pretty useless, because it *only* 
measures the time taken to execute timed calls and callbacks from threads. That neglects everything that 
happens off the back of I/O, which is obviously quite a lot for us.

To improve this, I've hooked into a different place in the reactor - in particular, where it calls `epoll`. That call is 
the only place it should wait for something to happen - the rest of the loop *should* be quick.

I've also removed `python_twisted_reactor_pending_calls`, because I don't believe anyone ever looks at it, and
it's a nuisance to populate.
2022-01-17 12:14:40 +00:00
Richard van der Hoff 20c6d85c6e
Simplify GC prometheus metrics (#11723)
Rather than hooking into the reactor loop, just add a timed task that runs every 100 ms to do the garbage collection.

Part 1 of a quest to simplify the reactor monkey-patching.
2022-01-13 14:35:52 +00:00
Patrick Cloke 10a88ba91c
Use auto_attribs/native type hints for attrs classes. (#11692) 2022-01-13 13:49:28 +00:00
Sean Quah 84fac0f814
Add type annotations to `synapse.metrics` (#10847) 2021-11-17 19:07:02 +00:00
Erik Johnston 82d2168a15
Add metrics to the threadpools (#11178) 2021-11-01 11:21:36 +00:00
David Robertson 1bfd141205
Type hints for the remaining two files in `synapse.http`. (#11164)
* Teach MyPy that the sentinel context is False

This means that if `ctx: LoggingContextOrSentinel`
then `bool(ctx)` narrows us to `ctx:LoggingContext`, which is a really
neat find!

* Annotate RequestMetrics

- Raise errors for sentry if we use the sentinel context
- Ensure we don't raise an error and carry on, but not recording stats
- Include stack trace in the error case to lower Sean's blood pressure

* Make mypy pass for synapse.http.request_metrics

* Make synapse.http.connectproxyclient pass mypy

Co-authored-by: reivilibre <oliverw@matrix.org>
2021-10-28 14:14:42 +01:00
Jonathan de Jong bf72d10dbf
Use inline type hints in various other places (in `synapse/`) (#10380) 2021-07-15 11:02:43 +01:00
Erik Johnston 8771b1337d
Export jemalloc stats to prometheus when used (#9882) 2021-05-06 15:54:07 +01:00
Erik Johnston 1fb9a2d0bf
Limit how often GC happens by time. (#9902)
Synapse can be quite memory intensive, and unless care is taken to tune
the GC thresholds it can end up thrashing, causing noticable performance
problems for large servers. We fix this by limiting how often we GC a
given generation, regardless of current counts/thresholds.

This does not help with the reverse problem where the thresholds are set
too high, but that should only happen in situations where they've been
manually configured.

Adds a `gc_min_seconds_between` config option to override the defaults.

Fixes #9890.
2021-05-05 16:53:45 +01:00
Jonathan de Jong 4b965c862d
Remove redundant "coding: utf-8" lines (#9786)
Part of #9744

Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.

`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
2021-04-14 15:34:27 +01:00
Andrew Morgan 0d87c6bd12
Don't report anything from GaugeBucketCollector metrics until data is present (#8926)
This PR modifies `GaugeBucketCollector` to only report data once it has been updated, rather than initially reporting a value of 0. Fixes zero values being reported for some metrics on startup until a background job to update the metric's value runs later.
2021-04-06 16:32:04 +01:00
Patrick Cloke 33a02f0f52
Fix additional type hints from Twisted upgrade. (#9518) 2021-03-03 15:47:38 -05:00
Eric Eastwood 0a00b7ff14
Update black, and run auto formatting over the codebase (#9381)
- Update black version to the latest
 - Run black auto formatting over the codebase
    - Run autoformatting according to [`docs/code_style.md
`](80d6dc9783/docs/code_style.md)
 - Update `code_style.md` docs around installing black to use the correct version
2021-02-16 22:32:34 +00:00
Erik Johnston 427ede619f
Add metrics for tracking 3PID /requestToken requests. (#8712)
The main use case is to see how many requests are being made, and how
many are second/third/etc attempts. If there are large number of retries
then that likely indicates a delivery problem.
2020-11-13 12:03:51 +00:00
Richard van der Hoff 6d2d42f8fb Rewrite BucketCollector
This was a bit unweildy for what I wanted: in particular, I wanted to assign
each measurement straight into a bucket, rather than storing an intermediate
Counter which didn't do any bucketing at all.

I've replaced it with something that is hopefully a bit easier to use.

(I'm not entirely sure what the difference between a HistogramMetricFamily and
a GaugeHistogramMetricFamily is, but given our counters can go down as well as
up the latter *sounds* more accurate?)
2020-09-30 16:49:15 +01:00
Patrick Cloke aec294ee0d
Use slots in attrs classes where possible (#8296)
slots use less memory (and attribute access is faster) while slightly
limiting the flexibility of the class attributes. This focuses on objects
which are instantiated "often" and for short periods of time.
2020-09-14 12:50:06 -04:00
Patrick Cloke c619253db8
Stop sub-classing object (#8249) 2020-09-04 06:54:56 -04:00
Erik Johnston a99658074d
Add some metrics for inbound and outbound federation processing times (#7755) 2020-06-30 16:58:06 +01:00
Patrick Cloke bd6dc17221
Replace iteritems/itervalues/iterkeys with native versions. (#7692) 2020-06-15 07:03:36 -04:00
Ivan Shapovalov ac481a738e
synapse.metrics: implement detailed memory usage reporting on PyPy (#7536)
PyPy's gc.get_stats() returns an object containing detailed allocator statistics
which could be beneficial to collect as metrics.

Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
2020-05-22 11:08:41 +01:00
Richard van der Hoff 8c75667ad7
Add prometheus metrics for the number of active pushers (#7103) 2020-03-19 10:00:24 +00:00
Patrick Cloke 509e381afa
Clarify list/set/dict/tuple comprehensions and enforce via flake8 (#6957)
Ensure good comprehension hygiene using flake8-comprehensions.
2020-02-21 07:15:07 -05:00
Amber Brown 864f144543
Fix up some typechecking (#6150)
* type checking fixes

* changelog
2019-10-02 05:29:01 -07:00
Amber Brown b617864cd9
Fix for structured logging tests stomping on logs (#6023) 2019-09-13 02:29:55 +10:00
Amber Brown aeb9b2179e
Add a build info metric to Prometheus (#6005) 2019-09-10 00:14:58 +10:00
Amber Brown 7ad1d76356
Support Prometheus_client 0.4.0+ (#5636) 2019-07-18 23:57:15 +10:00
Amber Brown 071150ce19
Don't log GC 0s at INFO (#5557) 2019-06-28 21:45:33 +10:00
Amber Brown 32e7c9e7f2
Run Black. (#5482) 2019-06-20 19:32:02 +10:00
Erik Johnston 3ed595e327 Prometheus histograms are cumalative 2019-06-14 14:07:32 +01:00
Amber H. Brown a10c8dae85 fix prometheus rendering error 2019-06-14 21:09:33 +10:00
Amber Brown 6312d6cc7c
Expose statistics on extrems to prometheus (#5384) 2019-06-13 22:40:52 +10:00
Richard van der Hoff 82ca6d1f9f
Add metrics for number of outgoing EDUs, by type (#4695) 2019-02-20 14:13:14 +00:00
Erik Johnston d0f6c1ce21 Remove spurious comment 2018-09-14 15:12:36 +01:00
Erik Johnston 0a81038ea0 Add in flight real time metrics for Measure blocks 2018-09-14 15:08:37 +01:00
Richard van der Hoff bab94da79c fix metric name 2018-08-07 22:11:45 +01:00
Richard van der Hoff 53bca4690b more metrics for the federation and appservice senders 2018-08-07 19:09:48 +01:00
Amber Brown 49af402019 run isort 2018-07-09 16:09:20 +10:00
Amber Brown 6350bf925e
Attempt to be more performant on PyPy (#3462) 2018-06-28 14:49:57 +01:00
Richard van der Hoff cbbfaa4be8 Fix description of "python_gc_time" metric 2018-06-21 10:02:42 +01:00
Matthew Hodgson ccfdaf68be spell gauge correctly 2018-06-16 07:10:34 +01:00
Amber Brown f116f32ace
add a last seen metric (#3396) 2018-06-14 20:26:59 +10:00
Richard van der Hoff 694968fa81 Hopefully, fix LaterGuage error handling 2018-06-04 15:59:14 +01:00
Amber Brown febe0ec8fd
Run Prometheus on a different port, optionally. (#3274) 2018-05-31 19:04:50 +10:00
Matthew Hodgson ff1bc0a279 pep8 2018-05-29 02:32:15 +01:00