Prometheus monitoring metrics
Following up on the previous article, this is the next part of the “Pessimistic Programming” series: monitoring and alerting. Prometheus is a great tool for monitoring your application; for me it works extremely well. As I mentioned in the previous article, a mistake is a mistake: a tiny one may cause a big issue, and we cannot avoid mistakes completely. It is terrible to suddenly realize that your system is suffering from a serious performance issue while you have no metrics dashboard to investigate with. Then it takes 12 hours, a ton of effort, and 14 pizzas to fix just a small bug. Keeping in mind that monitoring is an important part of a system will help you react to a problem quickly and effectively. For example, it helps to know the exact cache hit rate of a module in your application, or how much RAM it currently uses for caching.
Here I just want to point out some aspects of Prometheus. You can find all the details in the links at the end of this article.
Pull or Push
The Prometheus server is a standalone application that periodically scrapes metrics from your application according to its configuration. Your application does not need to know where the Prometheus server is or how to send metrics to it. The only thing your application needs to do is store its metrics somewhere (a Redis cache, for example; the Prometheus client handles this) and expose them via an HTTP endpoint named “metrics”, from which the Prometheus server pulls them. In practice, I use AOP to calculate all metrics in an isolated module, because I only need the inputs and outputs of functions. That way I do not need to touch the rest of my application. If you build your own common platform components for your services, such as cache, service connection, …, you can just inject the monitoring module into those components and you do not need to change anything in your business application.
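To make the pull model and the AOP idea more concrete, here is a minimal sketch in Python using only the standard library. It is not my production code and it is not the official Prometheus client (which manages storage and rendering for you). The names `monitored`, `render_metrics`, `MetricsHandler`, `serve`, and the metric names `app_calls_total`, `app_errors_total`, and `app_call_seconds_total` are all hypothetical, chosen for illustration. A decorator stands in for the AOP aspect: it records metrics around a function without touching its body.

```python
import functools
import time
from collections import defaultdict
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-process metric storage (an assumption for this sketch; a real client
# library manages storage for you).
_calls = defaultdict(int)      # function name -> number of calls
_errors = defaultdict(int)     # function name -> number of raised exceptions
_seconds = defaultdict(float)  # function name -> total seconds spent


def monitored(func):
    """Record call count, error count, and duration around func."""
    name = func.__name__

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            _errors[name] += 1
            raise
        finally:
            _calls[name] += 1
            _seconds[name] += time.perf_counter() - start

    return wrapper


def render_metrics():
    """Render all recorded values in the Prometheus text exposition format."""
    lines = []
    for metric, values in (
        ("app_calls_total", _calls),
        ("app_errors_total", _errors),
        ("app_call_seconds_total", _seconds),
    ):
        lines.append(f"# TYPE {metric} counter")
        for name in sorted(_calls):
            lines.append(f'{metric}{{function="{name}"}} {values[name]}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the current metrics at /metrics so a scraper can pull them."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrape requests on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Decorating a function with `@monitored` and calling `serve()` in a background thread is enough for a scraper to pull the numbers. The business code never learns where the Prometheus server lives, which is the point of the pull model.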
Security
- Prometheus → your application: Prometheus can be configured to send basic auth credentials or a bearer token with each scrape request. It also supports TLS for the connection if you want to maximize security.
- Grafana → Prometheus: If you use Grafana to visualize your metrics, you can secure the connection from Grafana to Prometheus with basic auth, a bearer token, and TLS by putting your Prometheus server behind a reverse proxy such as nginx.
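For the first case, your application still has to verify what Prometheus sends. Below is a small sketch of how a metrics endpoint might check the bearer token from the `Authorization` header of a scrape request. The function name `is_authorized` and the way the expected token is passed in are assumptions for illustration; in a real service the token would come from your own configuration or secret store.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True only for a header of the form 'Bearer <expected_token>'."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest takes the same time whether or not the tokens match,
    # so the comparison does not leak information through timing.
    return hmac.compare_digest(
        token.strip().encode("utf-8"), expected_token.encode("utf-8")
    )
```

A handler would call this before returning any metrics and answer with 401 when it returns False.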
Performance
Prometheus stores metrics in a time series database, a specialized database that supports very fast querying and aggregation of metrics with PromQL.
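To give a feel for the kind of aggregation done over stored samples, here is a rough Python sketch of a per-second rate computed from counter samples, and a cache hit ratio built on top of it. This is a deliberate simplification and not how Prometheus implements it: PromQL's `rate()` also extrapolates to the edges of the time window, which this sketch ignores. The names `simple_rate` and `hit_ratio` are hypothetical.

```python
def simple_rate(samples):
    """Average per-second increase of a counter across the given samples.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    """
    if len(samples) < 2:
        return 0.0
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A counter only goes up. A drop means the process restarted, so the
        # new value is counted as the increase since the reset.
        increase += cur if cur < prev else cur - prev
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else 0.0


def hit_ratio(hit_samples, miss_samples):
    """Share of cache lookups that were hits over the sampled window."""
    hits = simple_rate(hit_samples)
    misses = simple_rate(miss_samples)
    total = hits + misses
    return hits / total if total else 0.0
```

With hit and miss counters scraped at the same timestamps, `hit_ratio` gives the cache hit rate mentioned in the introduction.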
Reference
The official documentation covers everything about Prometheus, including PromQL:
https://prometheus.io/docs/introduction/overview/
This guide shows how to set up Prometheus and is a good place to start:
https://prometheus.io/docs/introduction/first_steps/
Here is an interesting article about time series databases and metrics storage