How to monitor Synapse metrics using Prometheus
===============================================

1. Install Prometheus:

   Follow instructions at http://prometheus.io/docs/introduction/install/

2. Enable Synapse metrics:

   There are two methods of enabling metrics in Synapse.

   The first serves the metrics as part of the usual web server and can be
   enabled by adding the "metrics" resource to an existing listener, like so::

     resources:
       - names:
         - client
         - metrics

   This provides a simple way of adding metrics to your Synapse installation,
   and serves them under ``/_synapse/metrics``. If you do not want your metrics
   to be publicly exposed, you will need to either filter this path out at your
   load balancer, or use the second method.

   The second method runs the metrics server on a different port, in a
   different thread to Synapse. This makes it more resilient to heavy load
   that would otherwise prevent metrics from being retrieved, and makes it
   easier to expose the metrics to internal networks only. The served metrics
   are available over HTTP only, and will be available at ``/``.

   Add a new listener to ``homeserver.yaml``::

     listeners:
       - type: metrics
         port: 9000
         bind_addresses:
           - '0.0.0.0'

   For both options, you will need to ensure that ``enable_metrics`` is set to
   ``True``.

   Restart Synapse.

3. Add a Prometheus target for Synapse.

   The target needs its ``metrics_path`` set to a non-default value (under
   ``scrape_configs``)::

    - job_name: "synapse"
      metrics_path: "/_synapse/metrics"
      static_configs:
        - targets: ["my.server.here:port"]

   where ``my.server.here`` is the IP address of Synapse, and ``port`` is the
   listener port configured with the ``metrics`` resource.

   If your Prometheus is older than 1.5.2, you will need to replace
   ``static_configs`` in the above with ``target_groups``.

   Restart Prometheus.

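As a rough sketch of the second method end to end, the two fragments below pair
a metrics listener bound to the loopback interface with a matching Prometheus
scrape job. The loopback address, and the assumption that Prometheus runs on
the same host as Synapse, are illustrative choices rather than requirements;
port ``9000`` and the ``/`` path are taken from the description above.

In ``homeserver.yaml``::

  enable_metrics: true

  # Add this entry to the existing ``listeners`` list.
  listeners:
    - type: metrics
      port: 9000
      bind_addresses:
        # Assumption: Prometheus runs on the same host, so loopback suffices.
        - '127.0.0.1'

In the Prometheus configuration::

  scrape_configs:
    - job_name: "synapse"
      # The dedicated metrics listener serves at "/" rather than
      # "/_synapse/metrics".
      metrics_path: "/"
      static_configs:
        - targets: ["127.0.0.1:9000"]

Binding to ``127.0.0.1`` rather than ``0.0.0.0`` keeps the metrics port off
external interfaces without needing a load balancer rule.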
Renaming of metrics & deprecation of old names in 1.2
-----------------------------------------------------

Synapse 1.2 updates the Prometheus metrics to match the naming convention of the
upstream ``prometheus_client``. The old names are considered deprecated and will
be removed in a future version of Synapse.

+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
|                                  New Name                                   |                               Old Name                                |
+=============================================================================+=======================================================================+
| python_gc_objects_collected_total                                           | python_gc_objects_collected                                           |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| python_gc_objects_uncollectable_total                                       | python_gc_objects_uncollectable                                       |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| python_gc_collections_total                                                 | python_gc_collections                                                 |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| process_cpu_seconds_total                                                   | process_cpu_seconds                                                   |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_federation_client_sent_transactions_total                           | synapse_federation_client_sent_transactions                           |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_federation_client_events_processed_total                            | synapse_federation_client_events_processed                            |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_event_processing_loop_count_total                                   | synapse_event_processing_loop_count                                   |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_event_processing_loop_room_count_total                              | synapse_event_processing_loop_room_count                              |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_util_metrics_block_count_total                                      | synapse_util_metrics_block_count                                      |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_util_metrics_block_time_seconds_total                               | synapse_util_metrics_block_time_seconds                               |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_util_metrics_block_ru_utime_seconds_total                           | synapse_util_metrics_block_ru_utime_seconds                           |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_util_metrics_block_ru_stime_seconds_total                           | synapse_util_metrics_block_ru_stime_seconds                           |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_util_metrics_block_db_txn_count_total                               | synapse_util_metrics_block_db_txn_count                               |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_util_metrics_block_db_txn_duration_seconds_total                    | synapse_util_metrics_block_db_txn_duration_seconds                    |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_util_metrics_block_db_sched_duration_seconds_total                  | synapse_util_metrics_block_db_sched_duration_seconds                  |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_background_process_start_count_total                                | synapse_background_process_start_count                                |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_background_process_ru_utime_seconds_total                           | synapse_background_process_ru_utime_seconds                           |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_background_process_ru_stime_seconds_total                           | synapse_background_process_ru_stime_seconds                           |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_background_process_db_txn_count_total                               | synapse_background_process_db_txn_count                               |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_background_process_db_txn_duration_seconds_total                    | synapse_background_process_db_txn_duration_seconds                    |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_background_process_db_sched_duration_seconds_total                  | synapse_background_process_db_sched_duration_seconds                  |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_storage_events_persisted_events_total                               | synapse_storage_events_persisted_events                               |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_storage_events_persisted_events_sep_total                           | synapse_storage_events_persisted_events_sep                           |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_storage_events_state_delta_total                                    | synapse_storage_events_state_delta                                    |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_storage_events_state_delta_single_event_total                       | synapse_storage_events_state_delta_single_event                       |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_storage_events_state_delta_reuse_delta_total                        | synapse_storage_events_state_delta_reuse_delta                        |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_federation_server_received_pdus_total                               | synapse_federation_server_received_pdus                               |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_federation_server_received_edus_total                               | synapse_federation_server_received_edus                               |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_handler_presence_notified_presence_total                            | synapse_handler_presence_notified_presence                            |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_handler_presence_federation_presence_out_total                      | synapse_handler_presence_federation_presence_out                      |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_handler_presence_presence_updates_total                             | synapse_handler_presence_presence_updates                             |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_handler_presence_timers_fired_total                                 | synapse_handler_presence_timers_fired                                 |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_handler_presence_federation_presence_total                          | synapse_handler_presence_federation_presence                          |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_handler_presence_bump_active_time_total                             | synapse_handler_presence_bump_active_time                             |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_federation_client_sent_edus_total                                   | synapse_federation_client_sent_edus                                   |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_federation_client_sent_pdu_destinations_count_total                 | synapse_federation_client_sent_pdu_destinations:count                 |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_federation_client_sent_pdu_destinations_total                       | synapse_federation_client_sent_pdu_destinations:total                 |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_handlers_appservice_events_processed_total                          | synapse_handlers_appservice_events_processed                          |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_notifier_notified_events_total                                      | synapse_notifier_notified_events                                      |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_push_bulk_push_rule_evaluator_push_rules_invalidation_counter_total | synapse_push_bulk_push_rule_evaluator_push_rules_invalidation_counter |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_push_bulk_push_rule_evaluator_push_rules_state_size_counter_total   | synapse_push_bulk_push_rule_evaluator_push_rules_state_size_counter   |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_http_httppusher_http_pushes_processed_total                         | synapse_http_httppusher_http_pushes_processed                         |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_http_httppusher_http_pushes_failed_total                            | synapse_http_httppusher_http_pushes_failed                            |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_http_httppusher_badge_updates_processed_total                       | synapse_http_httppusher_badge_updates_processed                       |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+
| synapse_http_httppusher_badge_updates_failed_total                          | synapse_http_httppusher_badge_updates_failed                          |
+-----------------------------------------------------------------------------+-----------------------------------------------------------------------+

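Dashboards, alerts and recording rules that still reference an old name need to
be pointed at the corresponding ``_total`` name. As a minimal sketch, a
Prometheus recording rule using one of the new names from the table might look
like the following. It assumes the YAML rule file format of Prometheus 2.x; the
group name and the recorded series name are hypothetical::

  groups:
    - name: synapse_example                # hypothetical group name
      rules:
        # Events persisted per second, averaged over five minutes.
        - record: job:synapse_storage_events_persisted_events:rate5m   # hypothetical
          expr: sum by (job) (rate(synapse_storage_events_persisted_events_total[5m]))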
Removal of deprecated metrics & time based counters becoming histograms in 0.31.0
---------------------------------------------------------------------------------

The duplicated metrics deprecated in Synapse 0.27.0 have been removed.

All duration-based metrics are now reported in seconds rather than
milliseconds. This affects:

+----------------------------------+
| msec -> sec metrics              |
+==================================+
| python_gc_time                   |
+----------------------------------+
| python_twisted_reactor_tick_time |
+----------------------------------+
| synapse_storage_query_time       |
+----------------------------------+
| synapse_storage_schedule_time    |
+----------------------------------+
| synapse_storage_transaction_time |
+----------------------------------+

Several metrics have been changed to be histograms, which sort entries into
buckets and allow better analysis. The following metrics are now histograms:

+-------------------------------------------+
| Altered metrics                           |
+===========================================+
| python_gc_time                            |
+-------------------------------------------+
| python_twisted_reactor_pending_calls      |
+-------------------------------------------+
| python_twisted_reactor_tick_time          |
+-------------------------------------------+
| synapse_http_server_response_time_seconds |
+-------------------------------------------+
| synapse_storage_query_time                |
+-------------------------------------------+
| synapse_storage_schedule_time             |
+-------------------------------------------+
| synapse_storage_transaction_time          |
+-------------------------------------------+

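Because a Prometheus histogram exposes ``_bucket``, ``_sum`` and ``_count``
series, quantiles can be estimated on the Prometheus side. The recording rule
below is an illustrative sketch for one of the metrics listed above. It assumes
the Prometheus 2.x rule file format; the group name, the recorded series name
and the 95th percentile are arbitrary choices for the example::

  groups:
    - name: synapse_histogram_example      # hypothetical group name
      rules:
        # Estimated 95th-percentile HTTP response time over five minutes.
        - record: synapse:http_server_response_time_seconds:p95_5m   # hypothetical
          expr: histogram_quantile(0.95, sum by (le) (rate(synapse_http_server_response_time_seconds_bucket[5m])))

Note that any threshold previously written in milliseconds for the metrics in
the first table has to be divided by 1000 to keep the same meaning.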
Block and response metrics renamed for 0.27.0
---------------------------------------------

Synapse 0.27.0 begins the process of rationalising the duplicate ``*:count``
metrics reported for the resource tracking for code blocks and HTTP requests.

At the same time, the corresponding ``*:total`` metrics are being renamed, as
the ``:total`` suffix no longer makes sense in the absence of a corresponding
``:count`` metric.

To enable a graceful migration path, this release just adds new names for the
metrics being renamed. A future release will remove the old ones.

The following table shows the new metrics, and the old metrics which they are
replacing.

==================================================== ===================================================
New name                                             Old name
==================================================== ===================================================
synapse_util_metrics_block_count                     synapse_util_metrics_block_timer:count
synapse_util_metrics_block_count                     synapse_util_metrics_block_ru_utime:count
synapse_util_metrics_block_count                     synapse_util_metrics_block_ru_stime:count
synapse_util_metrics_block_count                     synapse_util_metrics_block_db_txn_count:count
synapse_util_metrics_block_count                     synapse_util_metrics_block_db_txn_duration:count

synapse_util_metrics_block_time_seconds              synapse_util_metrics_block_timer:total
synapse_util_metrics_block_ru_utime_seconds          synapse_util_metrics_block_ru_utime:total
synapse_util_metrics_block_ru_stime_seconds          synapse_util_metrics_block_ru_stime:total
synapse_util_metrics_block_db_txn_count              synapse_util_metrics_block_db_txn_count:total
synapse_util_metrics_block_db_txn_duration_seconds   synapse_util_metrics_block_db_txn_duration:total

synapse_http_server_response_count                   synapse_http_server_requests
synapse_http_server_response_count                   synapse_http_server_response_time:count
synapse_http_server_response_count                   synapse_http_server_response_ru_utime:count
synapse_http_server_response_count                   synapse_http_server_response_ru_stime:count
synapse_http_server_response_count                   synapse_http_server_response_db_txn_count:count
synapse_http_server_response_count                   synapse_http_server_response_db_txn_duration:count

synapse_http_server_response_time_seconds            synapse_http_server_response_time:total
synapse_http_server_response_ru_utime_seconds        synapse_http_server_response_ru_utime:total
synapse_http_server_response_ru_stime_seconds        synapse_http_server_response_ru_stime:total
synapse_http_server_response_db_txn_count            synapse_http_server_response_db_txn_count:total
synapse_http_server_response_db_txn_duration_seconds synapse_http_server_response_db_txn_duration:total
==================================================== ===================================================

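With a single shared count metric, a per-call average is the rate of the
relevant total divided by the rate of the count. The rule below sketches this
for block timings using the names introduced in 0.27.0 (per the 1.2 table
earlier in this document, both series gain a ``_total`` suffix from Synapse 1.2
onwards). It assumes the Prometheus 2.x rule file format and that both series
carry identical label sets; the group and recorded series names are
hypothetical::

  groups:
    - name: synapse_block_metrics_example  # hypothetical group name
      rules:
        # Average wall-clock seconds spent per block execution.
        - record: synapse:util_metrics_block_time_seconds:avg5m   # hypothetical
          expr: rate(synapse_util_metrics_block_time_seconds[5m]) / rate(synapse_util_metrics_block_count[5m])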
Standard Metric Names
---------------------

As of Synapse version 0.18.2, the format of the process-wide metrics has been
changed to fit Prometheus standard naming conventions. Additionally, the units
have been changed from milliseconds to seconds.

================================== =============================
New name                           Old name
================================== =============================
process_cpu_user_seconds_total     process_resource_utime / 1000
process_cpu_system_seconds_total   process_resource_stime / 1000
process_open_fds (no 'type' label) process_fds
================================== =============================

The Python-specific counts of garbage collector performance have been renamed.

=========================== ======================
New name                    Old name
=========================== ======================
python_gc_time              reactor_gc_time
python_gc_unreachable_total reactor_gc_unreachable
python_gc_counts            reactor_gc_counts
=========================== ======================

The Twisted-specific reactor metrics have been renamed.

==================================== =====================
New name                             Old name
==================================== =====================
python_twisted_reactor_pending_calls reactor_pending_calls
python_twisted_reactor_tick_time     reactor_tick_time
==================================== =====================