Monitoring as part of the development cycle

third-party services

  • ohdearapp


  • heartbeat / status page
  • process count
    • there is a limit to the number of processes we can run
  • long-running processes
  • too many processes: there is a limit to the number of processes a distribution can have
  • scheduler: process is running
  • queue monitoring:
    • process is running
    • when a job is taking too long
    • when the queue is backed up / too many jobs
  • database
    • slow queries
    • connections count
  • CPU: when above 67%
  • memory: when above 67%
  • session
    • alarm when going viral, i.e. alert when active users >= 50
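
The numeric thresholds in the list (CPU and memory above 67%, active users >= 50) could be checked with something like the following. This is a minimal Python sketch, not a prescribed implementation: the `Snapshot` type and the `evaluate` function are hypothetical names, and how the readings are collected (agent, third-party service, database query) is left open.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the notes above.
CPU_ALERT_PERCENT = 67.0
MEMORY_ALERT_PERCENT = 67.0
ACTIVE_USERS_ALERT = 50


@dataclass
class Snapshot:
    """One reading of the monitored values (hypothetical structure)."""
    cpu_percent: float
    memory_percent: float
    active_users: int


def evaluate(snapshot: Snapshot) -> List[str]:
    """Return one alert message per threshold that the snapshot crosses."""
    alerts = []
    # CPU and memory alert only when strictly above the threshold.
    if snapshot.cpu_percent > CPU_ALERT_PERCENT:
        alerts.append(
            f"CPU at {snapshot.cpu_percent:.0f}%, above {CPU_ALERT_PERCENT:.0f}%"
        )
    if snapshot.memory_percent > MEMORY_ALERT_PERCENT:
        alerts.append(
            f"memory at {snapshot.memory_percent:.0f}%, above {MEMORY_ALERT_PERCENT:.0f}%"
        )
    # Active users alert at the threshold itself (>= 50).
    if snapshot.active_users >= ACTIVE_USERS_ALERT:
        alerts.append(
            f"active users at {snapshot.active_users}, at or above {ACTIVE_USERS_ALERT}"
        )
    return alerts


if __name__ == "__main__":
    # Example reading with made-up values, for illustration only.
    for message in evaluate(Snapshot(cpu_percent=72.0, memory_percent=40.0, active_users=55)):
        print(message)
```

The remaining items (process count, queue backlog, job duration, slow queries, connection count) would fit the same pattern once concrete limits are chosen; the notes do not give numbers for them, so none are assumed here.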