5 Commits
0.5 ... 0.6.2

Author SHA1 Message Date
Mitsuo Takaki
eeec3a2220 Bumping the version 2020-01-18 14:02:42 -08:00
mtakaki
5be0217c00 #10 - Creating auxiliary client class to generate configuration class (#78)
* #10 - Creating auxiliary client class to generate configuration class based on cachet's component list.

* Updating the python version in codacy to reduce false positives

* Moving some of the cachet operations to the client class to clean up the configuration class and making better constants

* Refactoring status to have proper classes and adding more tests. Refactoring the requests tests to use requests-mock.

* Removing unused imports from test_scheduler

* Adding more tests and the ability to run the client from command line

* Updating README and client arg parsing

* Fixing broken unit tests
2020-01-18 13:55:07 -08:00
mtakaki
bcafbd64f7 21 2 (#77)
* 21 - Improving exception when an invalid type is used in the config

* Bumping the version
2020-01-06 07:40:11 -08:00
mtakaki
a13a42d51c Multithreading #66 (#76)
* feat(multihreading): each url has it's own thread

* Fixing broken unit tests

* Improving readability when there are multiple URLs registerd and creating new action to upload metrics

* Improving error message when there's no file found

* Bumping the version

Co-authored-by: Alex Berenshtein <aberenshtein@yotpo.com>
2020-01-05 10:25:06 -08:00
Mitsuo Takaki
9a73063a6f Adding a troubleshooting section 2019-10-30 08:03:43 -07:00
18 changed files with 833 additions and 354 deletions

5
.codacy.yml Normal file

@@ -0,0 +1,5 @@
---
engines:
  pylint:
    enabled: true
    python_version: 3

104
README.md

@@ -16,7 +16,8 @@ This project is available at PyPI: [https://pypi.python.org/pypi/cachet-url-moni
## Configuration ## Configuration
```yaml ```yaml
endpoint: endpoints:
- name: Google
url: http://www.google.com url: http://www.google.com
method: GET method: GET
header: header:
@@ -24,47 +25,69 @@ endpoint:
timeout: 1 # seconds timeout: 1 # seconds
expectation: expectation:
- type: HTTP_STATUS - type: HTTP_STATUS
status_range: 200-300 status_range: 200-205
incident: MAJOR
- type: LATENCY - type: LATENCY
threshold: 1 threshold: 1
- type: REGEX - type: REGEX
regex: ".*<body>.*" regex: ".*<body>.*"
allowed_fails: 0 allowed_fails: 0
cachet:
api_url: http://status.cachethq.io/api/v1
token: my_token
component_id: 1 component_id: 1
metric_id: 1 metric_id: 1
action: action:
- CREATE_INCIDENT
- UPDATE_STATUS - UPDATE_STATUS
public_incidents: true public_incidents: true
latency_unit: ms latency_unit: ms
frequency: 30 frequency: 5
- name: Amazon
url: http://www.amazon.com
method: GET
header:
SOME-HEADER: SOME-VALUE
timeout: 1 # seconds
expectation:
- type: HTTP_STATUS
status_range: 200-205
incident: MAJOR
- type: LATENCY
threshold: 1
- type: REGEX
regex: ".*<body>.*"
threshold: 10
allowed_fails: 0
component_id: 2
action:
- CREATE_INCIDENT
public_incidents: true
latency_unit: ms
frequency: 5
cachet:
api_url: http://status.cachethq.io/api/v1
token: mytoken
``` ```
- **endpoint**, the configuration about the URL that will be monitored. - **endpoints**, the configuration about the URL/Urls that will be monitored.
- **url**, the URL that is going to be monitored. - **name**, The name of the component. This is now mandatory (since 0.6.0) so we can distinguish the logs for each URL being monitored.
- **method**, the HTTP method that will be used by the monitor. - **url**, the URL that is going to be monitored. *mandatory*
- **method**, the HTTP method that will be used by the monitor. *mandatory*
- **header**, client header passed to the request. Remove if you do not want to pass a header. - **header**, client header passed to the request. Remove if you do not want to pass a header.
- **timeout**, how long we'll wait to consider the request failed. The unit of it is seconds. - **timeout**, how long we'll wait to consider the request failed. The unit of it is seconds. *mandatory*
- **expectation**, the list of expectations set for the URL. - **expectation**, the list of expectations set for the URL. *mandatory*
- **HTTP_STATUS**, we will verify if the response status code falls into the expected range. Please keep in mind the range is inclusive on the first number and exclusive on the second number. If just one value is specified, it will default to only the given value, for example `200` will be converted to `200-201`. - **HTTP_STATUS**, we will verify if the response status code falls into the expected range. Please keep in mind the range is inclusive on the first number and exclusive on the second number. If just one value is specified, it will default to only the given value, for example `200` will be converted to `200-201`.
- **LATENCY**, we measure how long the request took to get a response and fail if it's above the threshold. The unit is in seconds. - **LATENCY**, we measure how long the request took to get a response and fail if it's above the threshold. The unit is in seconds.
- **REGEX**, we verify if the response body matches the given regex. - **REGEX**, we verify if the response body matches the given regex.
- **allowed_fails**, create incident/update component status only after specified amount of failed connection trials. - **allowed_fails**, create incident/update component status only after specified amount of failed connection trials.
- **cachet**, this is the settings for our cachet server. - **component_id**, the id of the component we're monitoring. This will be used to update the status of the component. *mandatory*
- **api_url**, the cachet API endpoint.
- **token**, the API token.
- **component_id**, the id of the component we're monitoring. This will be used to update the status of the component.
- **metric_id**, this will be used to store the latency of the API. If this is not set, it will be ignored. - **metric_id**, this will be used to store the latency of the API. If this is not set, it will be ignored.
- **action**, the action to be done when one of the expectations fails. This is optional and if left blank, nothing will be done to the component. - **action**, the action to be done when one of the expectations fails. This is optional and if left blank, nothing will be done to the component.
- **CREATE_INCIDENT**, we will create an incident when the expectation fails. - **CREATE_INCIDENT**, we will create an incident when the expectation fails.
- **UPDATE_STATUS**, updates the component status - **UPDATE_STATUS**, updates the component status.
- **PUSH_METRICS**, uploads response latency metrics.
- **public_incidents**, boolean to decide if created incidents should be visible to everyone or only to logged in users. Important only if `CREATE_INCIDENT` or `UPDATE_STATUS` are set. - **public_incidents**, boolean to decide if created incidents should be visible to everyone or only to logged in users. Important only if `CREATE_INCIDENT` or `UPDATE_STATUS` are set.
- **latency_unit**, the latency unit used when reporting the metrics. It will automatically convert to the specified unit. It's not mandatory and it will default to **seconds**. Available units: `ms`, `s`, `m`, `h`. - **latency_unit**, the latency unit used when reporting the metrics. It will automatically convert to the specified unit. It's not mandatory and it will default to **seconds**. Available units: `ms`, `s`, `m`, `h`.
- **frequency**, how often we'll send a request to the given URL. The unit is in seconds. - **frequency**, how often we'll send a request to the given URL. The unit is in seconds.
- **cachet**, this is the settings for our cachet server.
- **api_url**, the cachet API endpoint. *mandatory*
- **token**, the API token. *mandatory*
Each `expectation` has their own default incident status. It can be overridden by setting the `incident` property to any of the following values: Each `expectation` has their own default incident status. It can be overridden by setting the `incident` property to any of the following values:
- `PARTIAL` - `PARTIAL`
@@ -83,7 +106,7 @@ By choosing any of the aforementioned statuses, it will let you control the kind
The application should be installed using **virtualenv**, through the following command: The application should be installed using **virtualenv**, through the following command:
``` ```bash
$ git clone https://github.com/mtakaki/cachet-url-monitor.git $ git clone https://github.com/mtakaki/cachet-url-monitor.git
$ virtualenv cachet-url-monitor $ virtualenv cachet-url-monitor
$ cd cachet-url-monitor $ cd cachet-url-monitor
@@ -94,7 +117,7 @@ $ python3 setup.py install
To start the agent: To start the agent:
``` ```bash
$ python3 cachet_url_monitor/scheduler.py config.yml $ python3 cachet_url_monitor/scheduler.py config.yml
``` ```
@@ -104,13 +127,48 @@ You can run the agent in docker, so you won't need to worry about installing pyt
You have two choices, checking this repo out and building the docker image or it can be pulled directly from [dockerhub](https://hub.docker.com/r/mtakaki/cachet-url-monitor/). You will need to create your own custom `config.yml` file and run (it will pull latest): You have two choices, checking this repo out and building the docker image or it can be pulled directly from [dockerhub](https://hub.docker.com/r/mtakaki/cachet-url-monitor/). You will need to create your own custom `config.yml` file and run (it will pull latest):
``` ```bash
$ docker pull mtakaki/cachet-url-monitor $ docker pull mtakaki/cachet-url-monitor
$ docker run --rm -it -v "$PWD":/usr/src/app/config/ mtakaki/cachet-url-monitor $ docker run --rm -it -v "$PWD":/usr/src/app/config/ mtakaki/cachet-url-monitor
``` ```
If you're going to use a file with a name other than `config.yml`, you will need to map the local file, like this: If you're going to use a file with a name other than `config.yml`, you will need to map the local file, like this:
``` ```bash
$ docker run --rm -it -v "$PWD"/my_config.yml:/usr/src/app/config/config.yml:ro mtakaki/cachet-url-monitor $ docker run --rm -it -v "$PWD"/my_config.yml:/usr/src/app/config/config.yml:ro mtakaki/cachet-url-monitor
``` ```
## Generating configuration from existing CachetHQ instance (since 0.6.2)
In order to expedite the creation of your configuration file, you can use the client to automatically scrape the CachetHQ instance and spit out a YAML file. It can be used like this:
```bash
$ python cachet_url_monitor/client.py http://localhost/api/v1 my-token test.yml
```
Or from docker (you will end up with a `test.yml` in your `$PWD/tmp` folder):
```bash
$ docker run --rm -it -v $PWD/tmp:/home/tmp/ mtakaki/cachet-url-monitor python3.7 ./cachet_url_monitor/client.py http://localhost/api/v1 my-token /home/tmp/test.yml
```
The arguments are:
- **URL**, the CachetHQ API URL, so that means appending `/api/v1` to your hostname.
- **token**, the token that has access to your CachetHQ instance.
- **filename**, the file where it should write the configuration.
### Caveats
Because we can't predict which expectations will be needed, the generated configuration defaults to the following behavior:
- Verify a [200-300[ HTTP status range.
- If the status check fails, make the incident major and public.
- Frequency of 30 seconds.
- `GET` request.
- Timeout of 1s.
- We'll read the `link` field from the components and use it as the URL.
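
The default HTTP status range in the caveats above follows the half-open convention described for `status_range`: the first number is inclusive, the second is exclusive, and a single value such as `200` is treated as `200-201`. Below is a minimal sketch of that rule. The helper names `parse_status_range` and `status_matches` are hypothetical and are not taken from the project's own parser:

```python
from typing import Tuple


def parse_status_range(status_range: str) -> Tuple[int, int]:
    """Parse '200-300' or '200' into a half-open (start, end) pair.

    Hypothetical helper for illustration; not the project's implementation.
    """
    parts = str(status_range).split('-')
    start = int(parts[0])
    # A single value covers only itself, so '200' behaves like '200-201'.
    end = int(parts[1]) if len(parts) > 1 else start + 1
    return start, end


def status_matches(status_code: int, status_range: str) -> bool:
    """Return True when status_code falls inside [start, end)."""
    start, end = parse_status_range(status_range)
    return start <= status_code < end


print(status_matches(299, '200-300'))  # True
print(status_matches(300, '200-300'))  # False
print(status_matches(200, '200'))      # True
```

Under this reading, a `status_range` of `200-205` accepts 200 through 204 and rejects 205.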
## Troubleshooting
### SSLERROR
If it's throwing the following exception:
```python
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='redacted', port=443): Max retries exceeded with url: /api/v1/components/19 (Caused by SSLError(SSLError(1, u'[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:579)'),))
```
It can be resolved by setting the CA bundle environment variable `REQUESTS_CA_BUNDLE` to point at your certificate file. It can either be set in your python environment, before running this tool, or in your docker container.
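
If the variable has to be set from Python rather than from the shell, one possible approach is sketched below. The helper name `use_ca_bundle` and the example path are assumptions made for illustration. The only behavior relied upon is that `requests` reads `REQUESTS_CA_BUNDLE` from the environment when a request is made:

```python
import os


def use_ca_bundle(bundle_path: str) -> None:
    """Point requests at a custom CA bundle via REQUESTS_CA_BUNDLE.

    Hypothetical helper; call it before any HTTPS request is sent.
    """
    if not os.path.isfile(bundle_path):
        # Fail early instead of hitting CERTIFICATE_VERIFY_FAILED later.
        raise FileNotFoundError(f'CA bundle not found: {bundle_path}')
    os.environ['REQUESTS_CA_BUNDLE'] = bundle_path


# Example call with a made-up path; adjust it to your certificate file:
# use_ca_bundle('/etc/ssl/certs/my-ca-bundle.pem')
```

Checking that the file exists up front makes a wrong path show up as a clear error instead of the SSL exception above.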

cachet_url_monitor/client.py

@@ -0,0 +1,143 @@
#!/usr/bin/env python
from typing import Dict

import click
import requests
from yaml import dump

from cachet_url_monitor import latency_unit, status, exceptions


def normalize_url(url: str) -> str:
    """If passed url doesn't include schema return it with default one - http."""
    if not url.lower().startswith('http'):
        return f'http://{url}'
    return url


def save_config(config_map, filename: str):
    with open(filename, 'w') as file:
        dump(config_map, file)


class CachetClient(object):
    """Utility class to interact with CachetHQ server."""
    url: str
    token: str
    headers: Dict[str, str]

    def __init__(self, url: str, token: str):
        self.url = normalize_url(url)
        self.token = token
        self.headers = {'X-Cachet-Token': token}

    def get_components(self):
        """Retrieves all components registered in cachet-hq"""
        return requests.get(f"{self.url}/components", headers=self.headers).json()['data']

    def get_metrics(self):
        """Retrieves all metrics registered in cachet-hq"""
        return requests.get(f"{self.url}/metrics", headers=self.headers).json()['data']

    def generate_config(self):
        components = self.get_components()
        generated_endpoints = [
            {
                'name': component['name'],
                'url': component['link'],
                'method': 'GET',
                'timeout': 1,
                'expectation': [
                    {
                        'type': 'HTTP_STATUS',
                        'status_range': '200-300',
                        'incident': 'MAJOR'
                    }
                ],
                'allowed_fails': 0,
                'frequency': 30,
                'component_id': component['id'],
                'action': [
                    'CREATE_INCIDENT',
                    'UPDATE_STATUS',
                ],
                'public_incidents': True,
            } for component in components if component['enabled']
        ]
        generated_config = {
            'cachet': {
                'api_url': self.url,
                'token': self.token,
            },
            'endpoints': generated_endpoints
        }
        return generated_config

    def get_default_metric_value(self, metric_id):
        """Returns default value for configured metric."""
        get_metric_request = requests.get(f"{self.url}/metrics/{metric_id}", headers=self.headers)
        if get_metric_request.ok:
            return get_metric_request.json()['data']['default_value']
        else:
            raise exceptions.MetricNonexistentError(metric_id)

    def get_component_status(self, component_id):
        """Retrieves the current status of the given component. It will fail if the component does
        not exist or doesn't respond with the expected data.
        :return component status.
        """
        get_status_request = requests.get(f'{self.url}/components/{component_id}', headers=self.headers)
        if get_status_request.ok:
            # The component exists.
            return status.ComponentStatus(int(get_status_request.json()['data']['status']))
        else:
            raise exceptions.ComponentNonexistentError(component_id)

    def push_status(self, component_id, component_status):
        """Pushes the status of the component to the cachet server.
        """
        params = {'id': component_id, 'status': component_status}
        return requests.put(f"{self.url}/components/{component_id}", params=params, headers=self.headers)

    def push_metrics(self, metric_id, latency_time_unit, elapsed_time_in_seconds, timestamp):
        """Pushes the total amount of seconds the request took to get a response from the URL.
        """
        value = latency_unit.convert_to_unit(latency_time_unit, elapsed_time_in_seconds)
        params = {'id': metric_id, 'value': value, 'timestamp': timestamp}
        return requests.post(f"{self.url}/metrics/{metric_id}/points", params=params, headers=self.headers)

    def push_incident(self, status_value: status.ComponentStatus, is_public_incident: bool, component_id: int,
                      previous_incident_id=None, message=None):
        """If the component status has changed, we create a new incident (if this is the first time it becomes unstable)
        or update the existing incident once it becomes healthy again.
        """
        if previous_incident_id and status_value == status.ComponentStatus.OPERATIONAL:
            # If the incident already exists, it means it was unhealthy but now it's healthy again.
            params = {'status': status.IncidentStatus.FIXED.value, 'visible': is_public_incident,
                      'component_id': component_id, 'component_status': status_value.value, 'notify': True}
            return requests.put(f'{self.url}/incidents/{previous_incident_id}', params=params, headers=self.headers)
        elif not previous_incident_id and status_value != status.ComponentStatus.OPERATIONAL:
            # This is the first time the incident is being created.
            params = {'name': 'URL unavailable', 'message': message,
                      'status': status.IncidentStatus.INVESTIGATING.value,
                      'visible': is_public_incident, 'component_id': component_id, 'component_status': status_value,
                      'notify': True}
            return requests.post(f'{self.url}/incidents', params=params, headers=self.headers)


@click.group()
def run_client():
    pass


@click.command()
@click.argument('url')
@click.argument('token')
@click.argument('output')
def run_client(url, token, output):
    client = CachetClient(url, token)
    config = client.generate_config()
    save_config(config, output)


if __name__ == '__main__':
    run_client()
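
The mapping that `generate_config` applies to each component can be tried without a running Cachet server. The sketch below copies the defaults from the diff into a standalone function. The name `build_endpoints` and the sample component data are made up for illustration, and the component fields used (`id`, `name`, `link`, `enabled`) are the ones the diff reads:

```python
from typing import Any, Dict, List


def build_endpoints(components: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Mirror the defaults generate_config uses for every enabled component."""
    return [
        {
            'name': component['name'],
            'url': component['link'],
            'method': 'GET',
            'timeout': 1,
            'expectation': [
                {'type': 'HTTP_STATUS', 'status_range': '200-300', 'incident': 'MAJOR'}
            ],
            'allowed_fails': 0,
            'frequency': 30,
            'component_id': component['id'],
            'action': ['CREATE_INCIDENT', 'UPDATE_STATUS'],
            'public_incidents': True,
        }
        for component in components
        if component['enabled']  # disabled components are skipped
    ]


# Made-up sample standing in for the `data` list returned by GET /components.
sample_components = [
    {'id': 1, 'name': 'Website', 'link': 'http://example.com', 'enabled': True},
    {'id': 2, 'name': 'Legacy API', 'link': 'http://legacy.example.com', 'enabled': False},
]

endpoints = build_endpoints(sample_components)
print([endpoint['name'] for endpoint in endpoints])  # ['Website']
```

Keeping the mapping in a pure function like this would also make it testable without mocking HTTP calls, though the diff keeps it inline in `generate_config`.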

cachet_url_monitor/configuration.py

@@ -8,18 +8,15 @@ import time
import requests import requests
from yaml import dump from yaml import dump
from yaml import load
from yaml import FullLoader
import cachet_url_monitor.latency_unit as latency_unit
import cachet_url_monitor.status as st import cachet_url_monitor.status as st
from cachet_url_monitor.client import CachetClient, normalize_url
from cachet_url_monitor.exceptions import MetricNonexistentError
from cachet_url_monitor.status import ComponentStatus
# This is the mandatory fields that must be in the configuration file in this # This is the mandatory fields that must be in the configuration file in this
# same exact structure. # same exact structure.
configuration_mandatory_fields = { configuration_mandatory_fields = ['url', 'method', 'timeout', 'expectation', 'component_id', 'frequency']
'endpoint': ['url', 'method', 'timeout', 'expectation'],
'cachet': ['api_url', 'token', 'component_id'],
'frequency': []}
class ConfigurationValidationError(Exception): class ConfigurationValidationError(Exception):
@@ -32,58 +29,23 @@ class ConfigurationValidationError(Exception):
return repr(self.value) return repr(self.value)
class ComponentNonexistentError(Exception):
"""Exception raised when the component does not exist."""
def __init__(self, component_id):
self.component_id = component_id
def __str__(self):
return repr(f'Component with id [{self.component_id}] does not exist.')
class MetricNonexistentError(Exception):
"""Exception raised when the component does not exist."""
def __init__(self, metric_id):
self.metric_id = metric_id
def __str__(self):
return repr(f'Metric with id [{self.metric_id}] does not exist.')
def get_current_status(endpoint_url, component_id, headers):
"""Retrieves the current status of the component that is being monitored. It will fail if the component does
not exist or doesn't respond with the expected data.
:return component status.
"""
get_status_request = requests.get(f'{endpoint_url}/components/{component_id}', headers=headers)
if get_status_request.ok:
# The component exists.
return int(get_status_request.json()['data']['status'])
else:
raise ComponentNonexistentError(component_id)
def normalize_url(url):
"""If passed url doesn't include schema return it with default one - http."""
if not url.lower().startswith('http'):
return f'http://{url}'
return url
class Configuration(object): class Configuration(object):
"""Represents a configuration file, but it also includes the functionality """Represents a configuration file, but it also includes the functionality
of assessing the API and pushing the results to cachet. of assessing the API and pushing the results to cachet.
""" """
def __init__(self, config_file): def __init__(self, config, endpoint_index: int):
self.logger = logging.getLogger('cachet_url_monitor.configuration.Configuration') self.endpoint_index: int = endpoint_index
self.config_file = config_file self.data = config
self.data = load(open(self.config_file, 'r'), Loader=FullLoader) self.endpoint = self.data['endpoints'][endpoint_index]
self.current_fails = 0 self.current_fails: int = 0
self.trigger_update = True self.trigger_update: bool = True
if 'name' not in self.endpoint:
# We have to make this mandatory, otherwise the logs are confusing when there are multiple URLs.
raise ConfigurationValidationError('name')
self.logger = logging.getLogger(f'cachet_url_monitor.configuration.Configuration.{self.endpoint["name"]}')
# Exposing the configuration to confirm it's parsed as expected. # Exposing the configuration to confirm it's parsed as expected.
self.print_out() self.print_out()
@@ -92,35 +54,37 @@ class Configuration(object):
self.validate() self.validate()
# We store the main information from the configuration file, so we don't keep reading from the data dictionary. # We store the main information from the configuration file, so we don't keep reading from the data dictionary.
self.headers = {'X-Cachet-Token': os.environ.get('CACHET_TOKEN') or self.data['cachet']['token']} self.token = os.environ.get('CACHET_TOKEN') or self.data['cachet']['token']
self.headers = {'X-Cachet-Token': self.token}
self.endpoint_method = os.environ.get('ENDPOINT_METHOD') or self.data['endpoint']['method'] self.endpoint_method = self.endpoint['method']
self.endpoint_url = os.environ.get('ENDPOINT_URL') or self.data['endpoint']['url'] self.endpoint_url = normalize_url(self.endpoint['url'])
self.endpoint_url = normalize_url(self.endpoint_url) self.endpoint_timeout = self.endpoint.get('timeout') or 1
self.endpoint_timeout = os.environ.get('ENDPOINT_TIMEOUT') or self.data['endpoint'].get('timeout') or 1 self.endpoint_header = self.endpoint.get('header') or None
self.endpoint_header = self.data['endpoint'].get('header') or None self.allowed_fails = self.endpoint.get('allowed_fails') or 0
self.allowed_fails = os.environ.get('ALLOWED_FAILS') or self.data['endpoint'].get('allowed_fails') or 0
self.api_url = os.environ.get('CACHET_API_URL') or self.data['cachet']['api_url'] self.api_url = os.environ.get('CACHET_API_URL') or self.data['cachet']['api_url']
self.component_id = os.environ.get('CACHET_COMPONENT_ID') or self.data['cachet']['component_id'] self.component_id = self.endpoint['component_id']
self.metric_id = os.environ.get('CACHET_METRIC_ID') or self.data['cachet'].get('metric_id') self.metric_id = self.endpoint.get('metric_id')
self.client = CachetClient(self.api_url, self.token)
if self.metric_id is not None: if self.metric_id is not None:
self.default_metric_value = self.get_default_metric_value(self.metric_id) self.default_metric_value = self.get_default_metric_value(self.metric_id)
# The latency_unit configuration is not mandatory and we fallback to seconds, by default. # The latency_unit configuration is not mandatory and we fallback to seconds, by default.
self.latency_unit = os.environ.get('LATENCY_UNIT') or self.data['cachet'].get('latency_unit') or 's' self.latency_unit = self.data['cachet'].get('latency_unit') or 's'
# We need the current status so we monitor the status changes. This is necessary for creating incidents. # We need the current status so we monitor the status changes. This is necessary for creating incidents.
self.status = get_current_status(self.api_url, self.component_id, self.headers) self.status = self.client.get_component_status(self.component_id)
self.previous_status = self.status self.previous_status = self.status
self.logger.info(f'Component current status: {self.status}')
# Get remaining settings # Get remaining settings
self.public_incidents = int( self.public_incidents = int(self.endpoint['public_incidents'])
os.environ.get('CACHET_PUBLIC_INCIDENTS') or self.data['cachet']['public_incidents'])
self.logger.info('Monitoring URL: %s %s' % (self.endpoint_method, self.endpoint_url)) self.logger.info('Monitoring URL: %s %s' % (self.endpoint_method, self.endpoint_url))
self.expectations = [Expectation.create(expectation) for expectation in self.data['endpoint']['expectation']] self.expectations = [Expectation.create(expectation) for expectation in self.endpoint['expectation']]
for expectation in self.expectations: for expectation in self.expectations:
self.logger.info('Registered expectation: %s' % (expectation,)) self.logger.info('Registered expectation: %s' % (expectation,))
@@ -137,10 +101,10 @@ class Configuration(object):
"""Retrieves the action list from the configuration. If it's empty, returns an empty list. """Retrieves the action list from the configuration. If it's empty, returns an empty list.
:return: The list of actions, which can be an empty list. :return: The list of actions, which can be an empty list.
""" """
if self.data['cachet'].get('action') is None: if self.endpoint.get('action') is None:
return [] return []
else: else:
return self.data['cachet']['action'] return self.endpoint['action']
def validate(self): def validate(self):
"""Validates the configuration by verifying the mandatory fields are """Validates the configuration by verifying the mandatory fields are
@@ -148,24 +112,20 @@ class Configuration(object):
ConfigurationValidationError is raised. Otherwise nothing will happen. ConfigurationValidationError is raised. Otherwise nothing will happen.
""" """
configuration_errors = [] configuration_errors = []
for key, sub_entries in configuration_mandatory_fields.items(): for key in configuration_mandatory_fields:
if key not in self.data: if key not in self.endpoint:
configuration_errors.append(key) configuration_errors.append(key)
for sub_key in sub_entries: if 'expectation' in self.endpoint:
if sub_key not in self.data[key]: if (not isinstance(self.endpoint['expectation'], list) or
configuration_errors.append('%s.%s' % (key, sub_key)) (isinstance(self.endpoint['expectation'], list) and
len(self.endpoint['expectation']) == 0)):
if ('endpoint' in self.data and 'expectation' in
self.data['endpoint']):
if (not isinstance(self.data['endpoint']['expectation'], list) or
(isinstance(self.data['endpoint']['expectation'], list) and
len(self.data['endpoint']['expectation']) == 0)):
configuration_errors.append('endpoint.expectation') configuration_errors.append('endpoint.expectation')
if len(configuration_errors) > 0: if len(configuration_errors) > 0:
raise ConfigurationValidationError( raise ConfigurationValidationError(
f"Config file [{self.config_file}] failed validation. Missing keys: {', '.join(configuration_errors)}") 'Endpoint [%s] failed validation. Missing keys: %s' % (self.endpoint,
', '.join(configuration_errors)))
def evaluate(self): def evaluate(self):
"""Sends the request to the URL set in the configuration and executes """Sends the request to the URL set in the configuration and executes
@@ -182,27 +142,27 @@ class Configuration(object):
except requests.ConnectionError: except requests.ConnectionError:
self.message = 'The URL is unreachable: %s %s' % (self.endpoint_method, self.endpoint_url) self.message = 'The URL is unreachable: %s %s' % (self.endpoint_method, self.endpoint_url)
self.logger.warning(self.message) self.logger.warning(self.message)
self.status = st.COMPONENT_STATUS_PARTIAL_OUTAGE self.status = st.ComponentStatus.PARTIAL_OUTAGE
return return
except requests.HTTPError: except requests.HTTPError:
self.message = 'Unexpected HTTP response' self.message = 'Unexpected HTTP response'
self.logger.exception(self.message) self.logger.exception(self.message)
self.status = st.COMPONENT_STATUS_PARTIAL_OUTAGE self.status = st.ComponentStatus.PARTIAL_OUTAGE
return return
except requests.Timeout: except (requests.Timeout, requests.ConnectTimeout):
self.message = 'Request timed out' self.message = 'Request timed out'
self.logger.warning(self.message) self.logger.warning(self.message)
self.status = st.COMPONENT_STATUS_PERFORMANCE_ISSUES self.status = st.ComponentStatus.PERFORMANCE_ISSUES
return return
# We initially assume the API is healthy. # We initially assume the API is healthy.
self.status = st.COMPONENT_STATUS_OPERATIONAL self.status: ComponentStatus = st.ComponentStatus.OPERATIONAL
self.message = '' self.message = ''
for expectation in self.expectations: for expectation in self.expectations:
status = expectation.get_status(self.request) status: ComponentStatus = expectation.get_status(self.request)
# The greater the status is, the worse the state of the API is. # The greater the status is, the worse the state of the API is.
if status > self.status: if status.value > self.status.value:
self.status = status self.status = status
self.message = expectation.get_message(self.request) self.message = expectation.get_message(self.request)
self.logger.info(self.message) self.logger.info(self.message)
@@ -214,6 +174,8 @@ class Configuration(object):
temporary_data = copy.deepcopy(self.data) temporary_data = copy.deepcopy(self.data)
# Removing the token so we don't leak it in the logs. # Removing the token so we don't leak it in the logs.
del temporary_data['cachet']['token'] del temporary_data['cachet']['token']
temporary_data['endpoints'] = temporary_data['endpoints'][self.endpoint_index]
return dump(temporary_data, default_flow_style=False) return dump(temporary_data, default_flow_style=False)
def if_trigger_update(self): def if_trigger_update(self):
@@ -222,7 +184,7 @@ class Configuration(object):
and only for non-operational ones above the configured threshold (allowed_fails). and only for non-operational ones above the configured threshold (allowed_fails).
""" """
if self.status != 1: if self.status != st.ComponentStatus.OPERATIONAL:
self.current_fails = self.current_fails + 1 self.current_fails = self.current_fails + 1
self.logger.warning(f'Failure #{self.current_fails} with threshold set to {self.allowed_fails}') self.logger.warning(f'Failure #{self.current_fails} with threshold set to {self.allowed_fails}')
if self.current_fails <= self.allowed_fails: if self.current_fails <= self.allowed_fails:
@@ -236,27 +198,30 @@ class Configuration(object):
status based on the previous call to evaluate(). status based on the previous call to evaluate().
""" """
if self.previous_status == self.status: if self.previous_status == self.status:
# We don't want to keep spamming if there's no change in status.
self.logger.info(f'No changes to component status.')
self.trigger_update = False
return return
self.previous_status = self.status self.previous_status = self.status
if not self.trigger_update: if not self.trigger_update:
return return
self.api_component_status = get_current_status(self.api_url, self.component_id, self.headers) api_component_status = self.client.get_component_status(self.component_id)
if self.status == self.api_component_status: if self.status == api_component_status:
return return
self.status = api_component_status
params = {'id': self.component_id, 'status': self.status} component_request = self.client.push_status(self.component_id, self.status)
component_request = requests.put('%s/components/%d' % (self.api_url, self.component_id), params=params,
headers=self.headers)
if component_request.ok: if component_request.ok:
# Successful update # Successful update
self.logger.info('Component update: status [%d]' % (self.status,)) self.logger.info(f'Component update: status [{self.status}]')
else: else:
# Failed to update the API status # Failed to update the API status
self.logger.warning('Component update failed with status [%d]: API' self.logger.warning(f'Component update failed with HTTP status: {component_request.status_code}. API'
' status: [%d]' % (component_request.status_code, self.status)) f' status: {self.status}')
def push_metrics(self): def push_metrics(self):
"""Pushes the total amount of seconds the request took to get a response from the URL. """Pushes the total amount of seconds the request took to get a response from the URL.
@@ -265,16 +230,11 @@ class Configuration(object):
""" """
if 'metric_id' in self.data['cachet'] and hasattr(self, 'request'): if 'metric_id' in self.data['cachet'] and hasattr(self, 'request'):
# We convert the elapsed time from the request, in seconds, to the configured unit. # We convert the elapsed time from the request, in seconds, to the configured unit.
value = self.default_metric_value if self.status != 1 else latency_unit.convert_to_unit(self.latency_unit, metrics_request = self.client.push_metrics(self.metric_id, self.latency_unit,
self.request.elapsed.total_seconds()) self.request.elapsed.total_seconds(), self.current_timestamp)
params = {'id': self.metric_id, 'value': value,
'timestamp': self.current_timestamp}
metrics_request = requests.post('%s/metrics/%d/points' % (self.api_url, self.metric_id), params=params,
headers=self.headers)
if metrics_request.ok: if metrics_request.ok:
# Successful metrics upload # Successful metrics upload
self.logger.info('Metric uploaded: %.6f %s' % (value, self.latency_unit)) self.logger.info('Metric uploaded: %.6f %s' % (self.request.elapsed.total_seconds(), self.latency_unit))
else: else:
self.logger.warning(f'Metric upload failed with status [{metrics_request.status_code}]') self.logger.warning(f'Metric upload failed with status [{metrics_request.status_code}]')
@@ -284,14 +244,10 @@ class Configuration(object):
""" """
if not self.trigger_update: if not self.trigger_update:
return return
if hasattr(self, 'incident_id') and self.status == st.COMPONENT_STATUS_OPERATIONAL: if hasattr(self, 'incident_id') and self.status == st.ComponentStatus.OPERATIONAL:
# If the incident already exists, it means it was unhealthy but now it's healthy again. incident_request = self.client.push_incident(self.status, self.public_incidents, self.component_id,
params = {'status': 4, 'visible': self.public_incidents, 'component_id': self.component_id, previous_incident_id=self.incident_id)
'component_status': self.status,
'notify': True}
incident_request = requests.put(f'{self.api_url}/incidents/{self.incident_id}', params=params,
headers=self.headers)
if incident_request.ok: if incident_request.ok:
# Successful incident update # Successful incident update
self.logger.info( self.logger.info(
@@ -300,11 +256,9 @@ class Configuration(object):
else: else:
self.logger.warning( self.logger.warning(
f'Incident update failed with status [{incident_request.status_code}], message: "{self.message}"') f'Incident update failed with status [{incident_request.status_code}], message: "{self.message}"')
elif not hasattr(self, 'incident_id') and self.status != st.COMPONENT_STATUS_OPERATIONAL: elif not hasattr(self, 'incident_id') and self.status != st.ComponentStatus.OPERATIONAL:
# This is the first time the incident is being created. incident_request = self.client.push_incident(self.status, self.public_incidents, self.component_id,
params = {'name': 'URL unavailable', 'message': self.message, 'status': 1, 'visible': self.public_incidents, message=self.message)
'component_id': self.component_id, 'component_status': self.status, 'notify': True}
incident_request = requests.post(f'{self.api_url}/incidents', params=params, headers=self.headers)
if incident_request.ok: if incident_request.ok:
# Successful incident upload. # Successful incident upload.
self.incident_id = incident_request.json()['data']['id'] self.incident_id = incident_request.json()['data']['id']
@@ -331,26 +285,29 @@ class Expectation(object):
'LATENCY': Latency, 'LATENCY': Latency,
'REGEX': Regex 'REGEX': Regex
} }
if configuration['type'] not in expectations:
raise ConfigurationValidationError(f"Invalid type: {configuration['type']}")
return expectations.get(configuration['type'])(configuration) return expectations.get(configuration['type'])(configuration)
def __init__(self, configuration): def __init__(self, configuration):
self.incident_status = self.parse_incident_status(configuration) self.incident_status = self.parse_incident_status(configuration)
@abc.abstractmethod @abc.abstractmethod
def get_status(self, response): def get_status(self, response) -> ComponentStatus:
"""Returns the status of the API, following cachet's component status """Returns the status of the API, following cachet's component status
documentation: https://docs.cachethq.io/docs/component-statuses documentation: https://docs.cachethq.io/docs/component-statuses
""" """
@abc.abstractmethod @abc.abstractmethod
def get_message(self, response): def get_message(self, response) -> str:
"""Gets the error message.""" """Gets the error message."""
@abc.abstractmethod @abc.abstractmethod
def get_default_incident(self): def get_default_incident(self):
"""Returns the default status when this incident happens.""" """Returns the default status when this incident happens."""
def parse_incident_status(self, configuration): def parse_incident_status(self, configuration) -> ComponentStatus:
return st.INCIDENT_MAP.get(configuration.get('incident', None), self.get_default_incident()) return st.INCIDENT_MAP.get(configuration.get('incident', None), self.get_default_incident())
@@ -361,6 +318,10 @@ class HttpStatus(Expectation):
@staticmethod @staticmethod
def parse_range(range_string): def parse_range(range_string):
if isinstance(range_string, int):
# A value without a dash (e.g. 500) is already parsed as an int, so there is nothing to split.
return range_string, range_string + 1
statuses = range_string.split("-") statuses = range_string.split("-")
if len(statuses) == 1: if len(statuses) == 1:
# When there was no range given, we should treat the first number as a single status check. # When there was no range given, we should treat the first number as a single status check.
@@ -369,20 +330,20 @@ class HttpStatus(Expectation):
# We shouldn't look into more than one value, as this is a range value. # We shouldn't look into more than one value, as this is a range value.
return int(statuses[0]), int(statuses[1]) return int(statuses[0]), int(statuses[1])
def get_status(self, response): def get_status(self, response) -> ComponentStatus:
if self.status_range[0] <= response.status_code < self.status_range[1]: if self.status_range[0] <= response.status_code < self.status_range[1]:
return st.COMPONENT_STATUS_OPERATIONAL return st.ComponentStatus.OPERATIONAL
else: else:
return self.incident_status return self.incident_status
def get_default_incident(self): def get_default_incident(self):
return st.COMPONENT_STATUS_PARTIAL_OUTAGE return st.ComponentStatus.PARTIAL_OUTAGE
def get_message(self, response): def get_message(self, response):
return f'Unexpected HTTP status ({response.status_code})' return f'Unexpected HTTP status ({response.status_code})'
def __str__(self): def __str__(self):
return repr(f'HTTP status range: {self.status_range}') return repr(f'HTTP status range: [{self.status_range[0]}, {self.status_range[1]}[')
class Latency(Expectation): class Latency(Expectation):
@@ -390,14 +351,14 @@ class Latency(Expectation):
self.threshold = configuration['threshold'] self.threshold = configuration['threshold']
super(Latency, self).__init__(configuration) super(Latency, self).__init__(configuration)
def get_status(self, response): def get_status(self, response) -> ComponentStatus:
if response.elapsed.total_seconds() <= self.threshold: if response.elapsed.total_seconds() <= self.threshold:
return st.COMPONENT_STATUS_OPERATIONAL return st.ComponentStatus.OPERATIONAL
else: else:
return self.incident_status return self.incident_status
def get_default_incident(self): def get_default_incident(self):
return st.COMPONENT_STATUS_PERFORMANCE_ISSUES return st.ComponentStatus.PERFORMANCE_ISSUES
def get_message(self, response): def get_message(self, response):
return 'Latency above threshold: %.4f seconds' % (response.elapsed.total_seconds(),) return 'Latency above threshold: %.4f seconds' % (response.elapsed.total_seconds(),)
@@ -412,14 +373,14 @@ class Regex(Expectation):
self.regex = re.compile(configuration['regex'], re.UNICODE + re.DOTALL) self.regex = re.compile(configuration['regex'], re.UNICODE + re.DOTALL)
super(Regex, self).__init__(configuration) super(Regex, self).__init__(configuration)
def get_status(self, response): def get_status(self, response) -> ComponentStatus:
if self.regex.match(response.text): if self.regex.match(response.text):
return st.COMPONENT_STATUS_OPERATIONAL return st.ComponentStatus.OPERATIONAL
else: else:
return self.incident_status return self.incident_status
def get_default_incident(self): def get_default_incident(self):
return st.COMPONENT_STATUS_PARTIAL_OUTAGE return st.ComponentStatus.PARTIAL_OUTAGE
def get_message(self, response): def get_message(self, response):
return 'Regex did not match anything in the body' return 'Regex did not match anything in the body'
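
The `parse_range` change above treats a status range as half-open: a bare integer `n` becomes `(n, n + 1)`, and `get_status` checks `low <= code < high`. Below is a minimal standalone sketch of that behaviour. It assumes a single value given as a string is handled like the integer case (the diff elides that branch), and `in_range` is a hypothetical helper, not part of the project.

```python
from typing import Tuple, Union


def parse_range(range_value: Union[int, str]) -> Tuple[int, int]:
    """Return a half-open (low, high) pair of HTTP status codes."""
    if isinstance(range_value, int):
        # YAML parses a value without a dash (e.g. `status_range: 500`) as an int.
        return range_value, range_value + 1
    statuses = range_value.split('-')
    if len(statuses) == 1:
        # Assumption: a single value given as a string covers exactly that status.
        return int(statuses[0]), int(statuses[0]) + 1
    # Only the first two values are used; the upper bound is exclusive.
    return int(statuses[0]), int(statuses[1])


def in_range(status_code: int, status_range: Tuple[int, int]) -> bool:
    # Hypothetical helper mirroring the comparison made in HttpStatus.get_status.
    low, high = status_range
    return low <= status_code < high
```

With this reading, `status_range: 200-300` accepts 299 but rejects 300, which matches the `[200, 300[` notation used by the new `__str__`.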


@@ -0,0 +1,19 @@
#!/usr/bin/env python
class ComponentNonexistentError(Exception):
"""Exception raised when the component does not exist."""
def __init__(self, component_id):
self.component_id = component_id
def __str__(self):
return repr(f'Component with id [{self.component_id}] does not exist.')
class MetricNonexistentError(Exception):
"""Exception raised when the metric does not exist."""
def __init__(self, metric_id):
self.metric_id = metric_id
def __str__(self):
return repr(f'Metric with id [{self.metric_id}] does not exist.')
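
Both exceptions in this new file build their message with `repr(...)` inside `__str__`, so the text a caller logs carries surrounding quotes. The sketch below copies the class to show the effect; `describe` is a hypothetical helper added only for illustration.

```python
class MetricNonexistentError(Exception):
    """Raised when the metric does not exist."""

    def __init__(self, metric_id):
        self.metric_id = metric_id

    def __str__(self):
        # repr() of a str adds quotes around it, so str(error) is quoted too.
        return repr(f'Metric with id [{self.metric_id}] does not exist.')


def describe(metric_id):
    # Hypothetical helper: raise the error and format it the way a caller might log it.
    try:
        raise MetricNonexistentError(metric_id)
    except MetricNonexistentError as error:
        return str(error)
```

If the quotes are unwanted in log output, returning the f-string directly from `__str__` would be one way to drop them; the diff itself keeps `repr`.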


@@ -1,12 +1,16 @@
#!/usr/bin/env python #!/usr/bin/env python
import logging import logging
import sys import sys
import threading
import time import time
import schedule import schedule
from yaml import load, SafeLoader
from cachet_url_monitor.configuration import Configuration from cachet_url_monitor.configuration import Configuration
cachet_mandatory_fields = ['api_url', 'token']
class Agent(object): class Agent(object):
"""Monitor agent that will be constantly verifying if the URL is healthy """Monitor agent that will be constantly verifying if the URL is healthy
@@ -32,49 +36,89 @@ class Agent(object):
def start(self): def start(self):
"""Sets up the schedule based on the configuration file.""" """Sets up the schedule based on the configuration file."""
schedule.every(self.configuration.data['frequency']).seconds.do(self.execute) schedule.every(self.configuration.endpoint['frequency']).seconds.do(self.execute)
class Decorator(object): class Decorator(object):
"""Defines the actions a user can configure to be executed when there's an incident."""
def execute(self, configuration): def execute(self, configuration):
pass pass
class UpdateStatusDecorator(Decorator): class UpdateStatusDecorator(Decorator):
"""Updates the component status when an incident happens."""
def execute(self, configuration): def execute(self, configuration):
configuration.push_status() configuration.push_status()
class CreateIncidentDecorator(Decorator): class CreateIncidentDecorator(Decorator):
"""Creates an incident entry on cachet when an incident happens."""
def execute(self, configuration): def execute(self, configuration):
configuration.push_incident() configuration.push_incident()
class PushMetricsDecorator(Decorator):
"""Updates the URL latency metric."""
def execute(self, configuration):
configuration.push_metrics()
class Scheduler(object): class Scheduler(object):
def __init__(self, config_file): def __init__(self, configuration, agent):
self.logger = logging.getLogger('cachet_url_monitor.scheduler.Scheduler') self.logger = logging.getLogger('cachet_url_monitor.scheduler.Scheduler')
self.configuration = Configuration(config_file) self.configuration = configuration
self.agent = self.get_agent() self.agent = agent
self.stop = False self.stop = False
def get_agent(self):
action_names = {
'CREATE_INCIDENT': CreateIncidentDecorator,
'UPDATE_STATUS': UpdateStatusDecorator,
}
actions = []
for action in self.configuration.get_action():
self.logger.info('Registering action %s' % (action))
actions.append(action_names[action]())
return Agent(self.configuration, decorators=actions)
def start(self): def start(self):
self.agent.start() self.agent.start()
self.logger.info('Starting monitor agent...') self.logger.info('Starting monitor agent...')
while not self.stop: while not self.stop:
schedule.run_pending() schedule.run_pending()
time.sleep(self.configuration.data['frequency']) time.sleep(self.configuration.endpoint['frequency'])
class NewThread(threading.Thread):
def __init__(self, scheduler):
threading.Thread.__init__(self)
self.scheduler = scheduler
def run(self):
self.scheduler.start()
def build_agent(configuration, logger):
action_names = {
'CREATE_INCIDENT': CreateIncidentDecorator,
'UPDATE_STATUS': UpdateStatusDecorator,
'PUSH_METRICS': PushMetricsDecorator,
}
actions = []
for action in configuration.get_action():
logger.info(f'Registering action {action}')
actions.append(action_names[action]())
return Agent(configuration, decorators=actions)
def validate_config():
if 'endpoints' not in config_file.keys():
fatal_error('Endpoints is a mandatory field')
if config_file['endpoints'] is None:
fatal_error('Endpoints array can not be empty')
for key in cachet_mandatory_fields:
if key not in config_file['cachet']:
fatal_error('Missing cachet mandatory fields')
def fatal_error(message):
logging.getLogger('cachet_url_monitor.scheduler').fatal("%s", message)
sys.exit(1)
if __name__ == "__main__": if __name__ == "__main__":
@@ -87,5 +131,15 @@ if __name__ == "__main__":
logging.getLogger('cachet_url_monitor.scheduler').fatal('Missing configuration file argument') logging.getLogger('cachet_url_monitor.scheduler').fatal('Missing configuration file argument')
sys.exit(1) sys.exit(1)
scheduler = Scheduler(sys.argv[1]) try:
scheduler.start() config_file = load(open(sys.argv[1], 'r'), SafeLoader)
except FileNotFoundError:
logging.getLogger('cachet_url_monitor.scheduler').fatal(f'File not found: {sys.argv[1]}')
sys.exit(1)
validate_config()
for endpoint_index in range(len(config_file['endpoints'])):
configuration = Configuration(config_file, endpoint_index)
NewThread(Scheduler(configuration,
build_agent(configuration, logging.getLogger('cachet_url_monitor.scheduler')))).start()
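
The loop above starts one thread per entry in `endpoints`, each running its own scheduler. The following is a minimal sketch of that pattern using only the standard library. `EndpointThread`, `start_all` and the `check` callable are hypothetical stand-ins for `NewThread`, the `__main__` loop and `Scheduler.start`, and unlike the real schedulers the sketch's workers return immediately.

```python
import threading
from typing import Callable, Dict, List


class EndpointThread(threading.Thread):
    """Runs one endpoint's monitoring function in its own thread."""

    def __init__(self, name: str, check: Callable[[str], None]):
        super().__init__(name=name)
        self.check = check

    def run(self):
        # In the project this would be scheduler.start(), which loops until stopped.
        self.check(self.name)


def start_all(endpoints: List[Dict[str, str]], check: Callable[[str], None]) -> List[threading.Thread]:
    # One thread per endpoint, mirroring the loop over config_file['endpoints'].
    threads = [EndpointThread(endpoint['name'], check) for endpoint in endpoints]
    for thread in threads:
        thread.start()
    return threads
```

Returning the thread objects lets a caller `join()` them, which the `__main__` block in the diff does not need because its schedulers run indefinitely.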


@@ -3,22 +3,30 @@
This file defines all the different status values. This file defines all the different status values.
These are all constants and are coupled to cachet's API configuration. These are all constants and are coupled to cachet's API configuration.
""" """
from enum import Enum
COMPONENT_STATUS_OPERATIONAL = 1
COMPONENT_STATUS_PERFORMANCE_ISSUES = 2
COMPONENT_STATUS_PARTIAL_OUTAGE = 3
COMPONENT_STATUS_MAJOR_OUTAGE = 4
COMPONENT_STATUSES = [COMPONENT_STATUS_OPERATIONAL, class ComponentStatus(Enum):
COMPONENT_STATUS_PERFORMANCE_ISSUES, COMPONENT_STATUS_PARTIAL_OUTAGE, OPERATIONAL = 1
COMPONENT_STATUS_MAJOR_OUTAGE] PERFORMANCE_ISSUES = 2
PARTIAL_OUTAGE = 3
MAJOR_OUTAGE = 4
INCIDENT_PARTIAL = 'PARTIAL' INCIDENT_PARTIAL = 'PARTIAL'
INCIDENT_MAJOR = 'MAJOR' INCIDENT_MAJOR = 'MAJOR'
INCIDENT_PERFORMANCE = 'PERFORMANCE' INCIDENT_PERFORMANCE = 'PERFORMANCE'
INCIDENT_MAP = { INCIDENT_MAP = {
INCIDENT_PARTIAL: COMPONENT_STATUS_PARTIAL_OUTAGE, INCIDENT_PARTIAL: ComponentStatus.PARTIAL_OUTAGE,
INCIDENT_MAJOR: COMPONENT_STATUS_MAJOR_OUTAGE, INCIDENT_MAJOR: ComponentStatus.MAJOR_OUTAGE,
INCIDENT_PERFORMANCE: COMPONENT_STATUS_PERFORMANCE_ISSUES, INCIDENT_PERFORMANCE: ComponentStatus.PERFORMANCE_ISSUES,
} }
class IncidentStatus(Enum):
SCHEDULED = 0
INVESTIGATING = 1
IDENTIFIED = 2
WATCHING = 3
FIXED = 4
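
Replacing the integer constants with an `Enum` means a status no longer compares equal to a plain integer, which is why the tests in this change compare against `ComponentStatus` members instead of `1` or `2`. Below is a small sketch of the enum and the incident mapping from the diff. `incident_status` and `to_payload` are hypothetical helpers, and sending `.value` to the API is an assumption based on the integer statuses used in the tests.

```python
from enum import Enum


class ComponentStatus(Enum):
    OPERATIONAL = 1
    PERFORMANCE_ISSUES = 2
    PARTIAL_OUTAGE = 3
    MAJOR_OUTAGE = 4


INCIDENT_MAP = {
    'PARTIAL': ComponentStatus.PARTIAL_OUTAGE,
    'MAJOR': ComponentStatus.MAJOR_OUTAGE,
    'PERFORMANCE': ComponentStatus.PERFORMANCE_ISSUES,
}


def incident_status(name, default):
    # Mirrors parse_incident_status: unknown or missing names fall back to the default.
    return INCIDENT_MAP.get(name, default)


def to_payload(status: ComponentStatus) -> int:
    # Assumption: the API expects the integer, so the member is unwrapped with .value.
    return status.value
```

Going the other way, `ComponentStatus(1)` looks a member up by value, which is a convenient way to turn an integer read from an API response back into the enum.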


@@ -1,4 +1,5 @@
endpoint: endpoints:
- name: swagger
url: http://localhost:8080/swagger url: http://localhost:8080/swagger
method: GET method: GET
header: header:
@@ -13,14 +14,14 @@ endpoint:
- type: REGEX - type: REGEX
regex: '.*(<body).*' regex: '.*(<body).*'
allowed_fails: 0 allowed_fails: 0
cachet: frequency: 30
api_url: https://demo.cachethq.io/api/v1
token: my_token
component_id: 1 component_id: 1
#metric_id: 1 metric_id: 1
action: action:
- CREATE_INCIDENT - CREATE_INCIDENT
- UPDATE_STATUS - UPDATE_STATUS
public_incidents: true public_incidents: true
latency_unit: ms latency_unit: ms
frequency: 30 cachet:
api_url: https://demo.cachethq.io/api/v1
token: my_token


@@ -5,3 +5,4 @@ pudb==2016.1
pytest==5.2.2 pytest==5.2.2
pytest-cov==2.8.1 pytest-cov==2.8.1
coverage==4.5.2 coverage==4.5.2
requests-mock==1.7.0


@@ -1,3 +1,4 @@
PyYAML==5.1.2 PyYAML==5.1.2
requests==2.22.0 requests==2.22.0
schedule==0.6.0 schedule==0.6.0
Click==7.0


@@ -3,7 +3,7 @@
from setuptools import setup from setuptools import setup
setup(name='cachet-url-monitor', setup(name='cachet-url-monitor',
version='0.5', version='0.6.2',
description='Cachet URL monitor plugin', description='Cachet URL monitor plugin',
author='Mitsuo Takaki', author='Mitsuo Takaki',
author_email='mitsuotakaki@gmail.com', author_email='mitsuotakaki@gmail.com',

26
tests/configs/config.yml Normal file

@@ -0,0 +1,26 @@
endpoints:
- name: foo
url: http://localhost:8080/swagger
method: GET
header:
SOME-HEADER: SOME-VALUE
timeout: 0.01
expectation:
- type: HTTP_STATUS
status_range: 200-300
incident: MAJOR
- type: LATENCY
threshold: 1
- type: REGEX
regex: '.*(<body).*'
allowed_fails: 0
component_id: 1
action:
- CREATE_INCIDENT
- UPDATE_STATUS
public_incidents: true
latency_unit: ms
frequency: 30
cachet:
api_url: https://demo.cachethq.io/api/v1
token: my_token


@@ -0,0 +1,22 @@
endpoints:
- name: foo
url: http://localhost:8080/swagger
method: GET
header:
SOME-HEADER: SOME-VALUE
timeout: 0.01
expectation:
- type: HTTP
status_range: 200-300
incident: MAJOR
allowed_fails: 0
component_id: 1
action:
- CREATE_INCIDENT
- UPDATE_STATUS
public_incidents: true
latency_unit: ms
frequency: 30
cachet:
api_url: https://demo.cachethq.io/api/v1
token: my_token


@@ -0,0 +1,28 @@
endpoints:
- name: foo
url: http://localhost:8080/swagger
method: GET
expectation:
- type: HTTP_STATUS
status_range: 200-300
allowed_fails: 0
component_id: 1
latency_unit: ms
frequency: 30
timeout: 1
public_incidents: true
- name: bar
url: http://localhost:8080/bar
method: POST
expectation:
- type: HTTP_STATUS
status_range: 500
allowed_fails: 0
component_id: 2
latency_unit: ms
frequency: 30
timeout: 1
public_incidents: true
cachet:
api_url: https://demo.cachethq.io/api/v1
token: my_token
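
A config shaped like the one above has two root keys, `endpoints` and `cachet`, and the scheduler's `validate_config` exits if either is incomplete. The sketch below is a variant that takes the parsed config as an argument and returns the problems it finds instead of exiting. `find_config_errors` is a hypothetical name, and unlike the code in the diff it also treats an empty list and a missing `cachet` section as errors.

```python
from typing import Any, Dict, List

CACHET_MANDATORY_FIELDS = ['api_url', 'token']


def find_config_errors(config: Dict[str, Any]) -> List[str]:
    """Collect validation problems instead of exiting on the first one."""
    errors = []
    if 'endpoints' not in config:
        errors.append('Endpoints is a mandatory field')
    elif not config['endpoints']:
        # Covers both `endpoints:` with no value (None) and an empty list.
        errors.append('Endpoints array can not be empty')
    # A missing `cachet` section is treated as having no fields at all.
    cachet = config.get('cachet') or {}
    for key in CACHET_MANDATORY_FIELDS:
        if key not in cachet:
            errors.append(f'Missing cachet mandatory field: {key}')
    return errors
```

Passing the config in explicitly avoids relying on a module-level `config_file` variable, so the check can be exercised from a unit test without running the module as a script.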

147
tests/test_client.py Normal file

@@ -0,0 +1,147 @@
#!/usr/bin/env python
import unittest
from typing import Dict, List
import requests_mock
from cachet_url_monitor.client import CachetClient
from cachet_url_monitor.exceptions import MetricNonexistentError
from cachet_url_monitor.status import ComponentStatus
TOKEN: str = 'token_123'
CACHET_URL: str = 'http://foo.localhost'
JSON: Dict[str, List[Dict[str, int]]] = {'data': [{'id': 1}]}
class ClientTest(unittest.TestCase):
def setUp(self):
self.client = CachetClient('foo.localhost', TOKEN)
def test_init(self):
self.assertEqual(self.client.headers, {'X-Cachet-Token': TOKEN}, 'Header was not set correctly')
self.assertEqual(self.client.url, CACHET_URL, 'Cachet API URL was set incorrectly')
@requests_mock.mock()
def test_get_components(self, m):
m.get(f'{CACHET_URL}/components', json=JSON, headers={'X-Cachet-Token': TOKEN})
components = self.client.get_components()
self.assertEqual(components, [{'id': 1}],
'Getting components list is incorrect.')
@requests_mock.mock()
def test_get_metrics(self, m):
m.get(f'{CACHET_URL}/metrics', json=JSON)
metrics = self.client.get_metrics()
self.assertEqual(metrics, [{'id': 1}],
'Getting metrics list is incorrect.')
@requests_mock.mock()
def test_generate_config(self, m):
def components():
return {
'data': [
{
'id': '1',
'name': 'apache',
'link': 'http://abc.def',
'enabled': True
},
{
'id': '2',
'name': 'haproxy',
'link': 'http://ghi.jkl',
'enabled': False
},
{
'id': '3',
'name': 'nginx',
'link': 'http://mno.pqr',
'enabled': True
}
]
}
m.get(f'{CACHET_URL}/components', json=components(), headers={'X-Cachet-Token': TOKEN})
config = self.client.generate_config()
self.assertEqual(config, {
'cachet': {
'api_url': CACHET_URL,
'token': TOKEN
},
'endpoints': [
{
'name': 'apache',
'url': 'http://abc.def',
'method': 'GET',
'timeout': 1,
'expectation': [
{
'type': 'HTTP_STATUS',
'status_range': '200-300',
'incident': 'MAJOR'
}
],
'allowed_fails': 0,
'frequency': 30,
'component_id': '1',
'action': [
'CREATE_INCIDENT',
'UPDATE_STATUS',
],
'public_incidents': True,
},
{
'name': 'nginx',
'url': 'http://mno.pqr',
'method': 'GET',
'timeout': 1,
'expectation': [
{
'type': 'HTTP_STATUS',
'status_range': '200-300',
'incident': 'MAJOR'
}
],
'allowed_fails': 0,
'frequency': 30,
'component_id': '3',
'action': [
'CREATE_INCIDENT',
'UPDATE_STATUS',
],
'public_incidents': True,
}
]
}, 'Generated config is incorrect.')
@requests_mock.mock()
def test_get_default_metric_value(self, m):
m.get(f'{CACHET_URL}/metrics/123', json={'data': {'default_value': 0.456}}, headers={'X-Cachet-Token': TOKEN})
default_metric_value = self.client.get_default_metric_value(123)
self.assertEqual(default_metric_value, 0.456,
'Getting default metric value is incorrect.')
@requests_mock.mock()
def test_get_default_metric_value_invalid_id(self, m):
m.get(f'{CACHET_URL}/metrics/123', headers={'X-Cachet-Token': TOKEN}, status_code=400)
with self.assertRaises(MetricNonexistentError):
self.client.get_default_metric_value(123)
@requests_mock.mock()
def test_get_component_status(self, m):
def json():
return {
'data': {
'status': ComponentStatus.OPERATIONAL.value
}
}
m.get(f'{CACHET_URL}/components/123', json=json(), headers={'X-Cachet-Token': TOKEN})
status = self.client.get_component_status(123)
self.assertEqual(status, ComponentStatus.OPERATIONAL,
'Getting component status value is incorrect.')


@@ -3,11 +3,13 @@ import sys
import unittest import unittest
import mock import mock
from requests import ConnectionError, HTTPError, Timeout import pytest
import requests
import requests_mock
from yaml import load, SafeLoader
import cachet_url_monitor.status import cachet_url_monitor.status
sys.modules['requests'] = mock.Mock()
sys.modules['logging'] = mock.Mock() sys.modules['logging'] = mock.Mock()
from cachet_url_monitor.configuration import Configuration from cachet_url_monitor.configuration import Configuration
import os import os
@@ -22,154 +24,126 @@ class ConfigurationTest(unittest.TestCase):
sys.modules['logging'].getLogger = getLogger sys.modules['logging'].getLogger = getLogger
def get(url, headers): # def get(url, headers):
get_return = mock.Mock() # get_return = mock.Mock()
get_return.ok = True # get_return.ok = True
get_return.json = mock.Mock() # get_return.json = mock.Mock()
get_return.json.return_value = {'data': {'status': 1, 'default_value': 0.5}} # get_return.json.return_value = {'data': {'status': 1, 'default_value': 0.5}}
return get_return # return get_return
#
# sys.modules['requests'].get = get
sys.modules['requests'].get = get self.configuration = Configuration(
load(open(os.path.join(os.path.dirname(__file__), 'configs/config.yml'), 'rt'), SafeLoader), 0)
self.configuration = Configuration('config.yml') # sys.modules['requests'].Timeout = Timeout
sys.modules['requests'].Timeout = Timeout # sys.modules['requests'].ConnectionError = ConnectionError
sys.modules['requests'].ConnectionError = ConnectionError # sys.modules['requests'].HTTPError = HTTPError
sys.modules['requests'].HTTPError = HTTPError
def test_init(self): def test_init(self):
self.assertEqual(len(self.configuration.data), 3, 'Number of root elements in config.yml is incorrect') self.assertEqual(len(self.configuration.data), 2, 'Number of root elements in config.yml is incorrect')
self.assertEqual(len(self.configuration.expectations), 3, 'Number of expectations read from file is incorrect') self.assertEqual(len(self.configuration.expectations), 3, 'Number of expectations read from file is incorrect')
self.assertDictEqual(self.configuration.headers, {'X-Cachet-Token': 'token2'}, 'Header was not set correctly') self.assertDictEqual(self.configuration.headers, {'X-Cachet-Token': 'token2'}, 'Header was not set correctly')
self.assertEqual(self.configuration.api_url, 'https://demo.cachethq.io/api/v1', self.assertEqual(self.configuration.api_url, 'https://demo.cachethq.io/api/v1',
'Cachet API URL was set incorrectly') 'Cachet API URL was set incorrectly')
self.assertDictEqual(self.configuration.endpoint_header, {'SOME-HEADER': 'SOME-VALUE'}, 'Header is incorrect') self.assertDictEqual(self.configuration.endpoint_header, {'SOME-HEADER': 'SOME-VALUE'}, 'Header is incorrect')
def test_evaluate(self): @requests_mock.mock()
def total_seconds(): def test_evaluate(self, m):
return 0.1 m.get('http://localhost:8080/swagger', text='<body>')
def request(method, url, headers, timeout=None):
response = mock.Mock()
response.status_code = 200
response.elapsed = mock.Mock()
response.elapsed.total_seconds = total_seconds
response.text = '<body>'
return response
sys.modules['requests'].request = request
self.configuration.evaluate() self.configuration.evaluate()
self.assertEqual(self.configuration.status, cachet_url_monitor.status.COMPONENT_STATUS_OPERATIONAL, self.assertEqual(self.configuration.status, cachet_url_monitor.status.ComponentStatus.OPERATIONAL,
'Component status set incorrectly') 'Component status set incorrectly')
def test_evaluate_without_header(self): @requests_mock.mock()
def total_seconds(): def test_evaluate_without_header(self, m):
return 0.1 m.get('http://localhost:8080/swagger', text='<body>')
def request(method, url, headers=None, timeout=None):
response = mock.Mock()
response.status_code = 200
response.elapsed = mock.Mock()
response.elapsed.total_seconds = total_seconds
response.text = '<body>'
return response
sys.modules['requests'].request = request
self.configuration.evaluate() self.configuration.evaluate()
self.assertEqual(self.configuration.status, cachet_url_monitor.status.COMPONENT_STATUS_OPERATIONAL, self.assertEqual(self.configuration.status, cachet_url_monitor.status.ComponentStatus.OPERATIONAL,
'Component status set incorrectly') 'Component status set incorrectly')
def test_evaluate_with_failure(self): @requests_mock.mock()
def total_seconds(): def test_evaluate_with_failure(self, m):
return 0.1 m.get('http://localhost:8080/swagger', text='<body>', status_code=400)
def request(method, url, headers, timeout=None):
response = mock.Mock()
# We are expecting a 200 response, so this will fail the expectation.
response.status_code = 400
response.elapsed = mock.Mock()
response.elapsed.total_seconds = total_seconds
response.text = '<body>'
return response
sys.modules['requests'].request = request
self.configuration.evaluate() self.configuration.evaluate()
self.assertEqual(self.configuration.status, cachet_url_monitor.status.COMPONENT_STATUS_MAJOR_OUTAGE, self.assertEqual(self.configuration.status, cachet_url_monitor.status.ComponentStatus.MAJOR_OUTAGE,
'Component status set incorrectly or custom incident status is incorrectly parsed') 'Component status set incorrectly or custom incident status is incorrectly parsed')
def test_evaluate_with_timeout(self): @requests_mock.mock()
def request(method, url, headers, timeout=None): def test_evaluate_with_timeout(self, m):
self.assertEqual(method, 'GET', 'Incorrect HTTP method') m.get('http://localhost:8080/swagger', exc=requests.Timeout)
self.assertEqual(url, 'http://localhost:8080/swagger', 'Monitored URL is incorrect')
self.assertEqual(timeout, 0.010)
raise Timeout()
sys.modules['requests'].request = request
self.configuration.evaluate() self.configuration.evaluate()
self.assertEqual(self.configuration.status, cachet_url_monitor.status.COMPONENT_STATUS_PERFORMANCE_ISSUES, self.assertEqual(self.configuration.status, cachet_url_monitor.status.ComponentStatus.PERFORMANCE_ISSUES,
'Component status set incorrectly') 'Component status set incorrectly')
self.mock_logger.warning.assert_called_with('Request timed out') self.mock_logger.warning.assert_called_with('Request timed out')
def test_evaluate_with_connection_error(self): @requests_mock.mock()
def request(method, url, headers, timeout=None): def test_evaluate_with_connection_error(self, m):
self.assertEqual(method, 'GET', 'Incorrect HTTP method') m.get('http://localhost:8080/swagger', exc=requests.ConnectionError)
self.assertEqual(url, 'http://localhost:8080/swagger', 'Monitored URL is incorrect')
self.assertEqual(timeout, 0.010)
raise ConnectionError()
sys.modules['requests'].request = request
self.configuration.evaluate() self.configuration.evaluate()
self.assertEqual(self.configuration.status, cachet_url_monitor.status.COMPONENT_STATUS_PARTIAL_OUTAGE, self.assertEqual(self.configuration.status, cachet_url_monitor.status.ComponentStatus.PARTIAL_OUTAGE,
'Component status set incorrectly') 'Component status set incorrectly')
self.mock_logger.warning.assert_called_with('The URL is unreachable: GET http://localhost:8080/swagger') self.mock_logger.warning.assert_called_with('The URL is unreachable: GET http://localhost:8080/swagger')
def test_evaluate_with_http_error(self): @requests_mock.mock()
def request(method, url, headers, timeout=None): def test_evaluate_with_http_error(self, m):
self.assertEqual(method, 'GET', 'Incorrect HTTP method') m.get('http://localhost:8080/swagger', exc=requests.HTTPError)
self.assertEqual(url, 'http://localhost:8080/swagger', 'Monitored URL is incorrect')
self.assertEqual(timeout, 0.010)
raise HTTPError()
sys.modules['requests'].request = request
self.configuration.evaluate() self.configuration.evaluate()
self.assertEqual(self.configuration.status, cachet_url_monitor.status.COMPONENT_STATUS_PARTIAL_OUTAGE, self.assertEqual(self.configuration.status, cachet_url_monitor.status.ComponentStatus.PARTIAL_OUTAGE,
'Component status set incorrectly') 'Component status set incorrectly')
self.mock_logger.exception.assert_called_with('Unexpected HTTP response') self.mock_logger.exception.assert_called_with('Unexpected HTTP response')
def test_push_status(self): @requests_mock.mock()
def put(url, params=None, headers=None): def test_push_status(self, m):
self.assertEqual(url, 'https://demo.cachethq.io/api/v1/components/1', 'Incorrect cachet API URL') m.put('https://demo.cachethq.io/api/v1/components/1?id=1&status=1', headers={'X-Cachet-Token': 'token2'})
self.assertDictEqual(params, {'id': 1, 'status': 1}, 'Incorrect component update parameters') self.assertEqual(self.configuration.status, cachet_url_monitor.status.ComponentStatus.OPERATIONAL,
self.assertDictEqual(headers, {'X-Cachet-Token': 'token2'}, 'Incorrect component update parameters')
response = mock.Mock()
response.status_code = 200
return response
sys.modules['requests'].put = put
self.assertEqual(self.configuration.status, cachet_url_monitor.status.COMPONENT_STATUS_OPERATIONAL,
'Incorrect component update parameters') 'Incorrect component update parameters')
self.configuration.push_status() self.configuration.push_status()
def test_push_status_with_failure(self): @requests_mock.mock()
def put(url, params=None, headers=None): def test_push_status_with_failure(self, m):
self.assertEqual(url, 'https://demo.cachethq.io/api/v1/components/1', 'Incorrect cachet API URL') m.put('https://demo.cachethq.io/api/v1/components/1?id=1&status=1', headers={'X-Cachet-Token': 'token2'},
self.assertDictEqual(params, {'id': 1, 'status': 1}, 'Incorrect component update parameters') status_code=400)
self.assertDictEqual(headers, {'X-Cachet-Token': 'token2'}, 'Incorrect component update parameters') self.assertEqual(self.configuration.status, cachet_url_monitor.status.ComponentStatus.OPERATIONAL,
response = mock.Mock()
response.status_code = 400
return response
sys.modules['requests'].put = put
self.assertEqual(self.configuration.status, cachet_url_monitor.status.COMPONENT_STATUS_OPERATIONAL,
'Incorrect component update parameters') 'Incorrect component update parameters')
self.configuration.push_status() self.configuration.push_status()
class ConfigurationMultipleUrlTest(unittest.TestCase):
@mock.patch.dict(os.environ, {'CACHET_TOKEN': 'token2'})
def setUp(self):
config_yaml = load(open(os.path.join(os.path.dirname(__file__), 'configs/config_multiple_urls.yml'), 'rt'),
SafeLoader)
self.configuration = []
for index in range(len(config_yaml['endpoints'])):
self.configuration.append(Configuration(config_yaml, index))
def test_init(self):
expected_method = ['GET', 'POST']
expected_url = ['http://localhost:8080/swagger', 'http://localhost:8080/bar']
for index in range(len(self.configuration)):
config = self.configuration[index]
self.assertEqual(len(config.data), 2, 'Number of root elements in config.yml is incorrect')
self.assertEqual(len(config.expectations), 1, 'Number of expectations read from file is incorrect')
self.assertDictEqual(config.headers, {'X-Cachet-Token': 'token2'}, 'Header was not set correctly')
self.assertEqual(config.api_url, 'https://demo.cachethq.io/api/v1',
'Cachet API URL was set incorrectly')
self.assertEqual(expected_method[index], config.endpoint_method)
self.assertEqual(expected_url[index], config.endpoint_url)
class ConfigurationNegativeTest(unittest.TestCase):
@mock.patch.dict(os.environ, {'CACHET_TOKEN': 'token2'})
def test_init(self):
with pytest.raises(cachet_url_monitor.configuration.ConfigurationValidationError):
self.configuration = Configuration(
load(open(os.path.join(os.path.dirname(__file__), 'configs/config_invalid_type.yml'), 'rt'),
SafeLoader), 0)


@@ -7,6 +7,7 @@ import pytest
from cachet_url_monitor.configuration import HttpStatus, Regex from cachet_url_monitor.configuration import HttpStatus, Regex
from cachet_url_monitor.configuration import Latency from cachet_url_monitor.configuration import Latency
from cachet_url_monitor.status import ComponentStatus
class LatencyTest(unittest.TestCase): class LatencyTest(unittest.TestCase):
@@ -25,7 +26,7 @@ class LatencyTest(unittest.TestCase):
request.elapsed = elapsed request.elapsed = elapsed
elapsed.total_seconds = total_seconds elapsed.total_seconds = total_seconds
assert self.expectation.get_status(request) == 1 assert self.expectation.get_status(request) == ComponentStatus.OPERATIONAL
def test_get_status_unhealthy(self): def test_get_status_unhealthy(self):
def total_seconds(): def total_seconds():
@@ -36,7 +37,7 @@ class LatencyTest(unittest.TestCase):
request.elapsed = elapsed request.elapsed = elapsed
elapsed.total_seconds = total_seconds elapsed.total_seconds = total_seconds
assert self.expectation.get_status(request) == 2 assert self.expectation.get_status(request) == ComponentStatus.PERFORMANCE_ISSUES
def test_get_message(self): def test_get_message(self):
def total_seconds(): def total_seconds():
@@ -73,13 +74,13 @@ class HttpStatusTest(unittest.TestCase):
request = mock.Mock() request = mock.Mock()
request.status_code = 200 request.status_code = 200
assert self.expectation.get_status(request) == 1 assert self.expectation.get_status(request) == ComponentStatus.OPERATIONAL
def test_get_status_unhealthy(self): def test_get_status_unhealthy(self):
request = mock.Mock() request = mock.Mock()
request.status_code = 400 request.status_code = 400
assert self.expectation.get_status(request) == 3 assert self.expectation.get_status(request) == ComponentStatus.PARTIAL_OUTAGE
def test_get_message(self): def test_get_message(self):
request = mock.Mock() request = mock.Mock()
@@ -100,13 +101,13 @@ class RegexTest(unittest.TestCase):
request = mock.Mock() request = mock.Mock()
request.text = 'We could find stuff\n in this body.' request.text = 'We could find stuff\n in this body.'
assert self.expectation.get_status(request) == 1 assert self.expectation.get_status(request) == ComponentStatus.OPERATIONAL
def test_get_status_unhealthy(self): def test_get_status_unhealthy(self):
request = mock.Mock() request = mock.Mock()
request.text = 'We will not find it here' request.text = 'We will not find it here'
assert self.expectation.get_status(request) == 3 assert self.expectation.get_status(request) == ComponentStatus.PARTIAL_OUTAGE
def test_get_message(self): def test_get_message(self):
request = mock.Mock() request = mock.Mock()
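The assertions in this file swap the bare integers 1, 2 and 3 for members of `ComponentStatus`. The definition in `cachet_url_monitor.status` is not part of this diff, so the following is only a minimal sketch of what such an enum might look like. The first three values are taken from the literals being replaced; the class body, the `MAJOR_OUTAGE` member and the `worst` helper are assumptions for illustration.

```python
from enum import Enum


class ComponentStatus(Enum):
    """Hypothetical stand-in for cachet_url_monitor.status.ComponentStatus."""
    OPERATIONAL = 1
    PERFORMANCE_ISSUES = 2
    PARTIAL_OUTAGE = 3
    MAJOR_OUTAGE = 4  # assumed member; not exercised by the tests in this diff


def worst(statuses):
    """Return the most severe status from an iterable (hypothetical helper)."""
    return max(statuses, key=lambda status: status.value)
```

If the enum is a plain `Enum` as assumed here, `ComponentStatus.OPERATIONAL == 1` evaluates to `False`, so the updated assertions would only pass when `get_status` returns enum members and not raw integers.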


@@ -26,7 +26,7 @@ class AgentTest(unittest.TestCase):
def test_start(self): def test_start(self):
every = sys.modules['schedule'].every every = sys.modules['schedule'].every
self.configuration.data = {'frequency': 5} self.configuration.endpoint = {'frequency': 5}
self.agent.start() self.agent.start()
@@ -45,13 +45,43 @@ class SchedulerTest(unittest.TestCase):
mock_requests.get = get mock_requests.get = get
self.scheduler = Scheduler('config.yml') self.agent = mock.MagicMock()
self.scheduler = Scheduler(
{
'endpoints': [
{
'name': 'foo',
'url': 'http://localhost:8080/swagger',
'method': 'GET',
'expectation': [
{
'type': 'HTTP_STATUS',
'status_range': '200 - 300',
'incident': 'MAJOR',
}
],
'allowed_fails': 0,
'component_id': 1,
'action': ['CREATE_INCIDENT', 'UPDATE_STATUS'],
'public_incidents': True,
'latency_unit': 'ms',
'frequency': 30
}
],
'cachet': {
'api_url': 'https://demo.cachethq.io/api/v1',
'token': 'my_token'
}
}, self.agent)
def test_init(self): def test_init(self):
assert self.scheduler.stop == False self.assertFalse(self.scheduler.stop)
def test_start(self): def test_start(self):
# TODO(mtakaki|2016-05-01): We need a better way of testing this method. # TODO(mtakaki|2016-05-01): We need a better way of testing this method.
# Leaving it as a placeholder. # Leaving it as a placeholder.
self.scheduler.stop = True self.scheduler.stop = True
self.scheduler.start() self.scheduler.start()
self.agent.start.assert_called()