Console View

suma1379
fix for auth_api_nrpe failure

[chef #6285](https://github.com/rax-maas/chef/pull/6285) introduced a change to prefer the internal identity URL for auth requirements.
However, this impacts the e2e servers, which run in the Rackspace public cloud and cannot reach internal identity.
The nagios check_auth_api check authenticates using the default setting from local_settings.json, so it fails because it cannot reach internal identity.
This fixes the check to use a proxy URL for staging and the public identity URL for prod.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
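
The environment-based choice described in this commit message could look roughly like the sketch below. It is only an illustration: the settings keys, the placeholder URLs, the `select_identity_url` helper, and the fallback to the internal URL for other environments are all assumptions, not names taken from the actual check or from local_settings.json.

```python
# Hypothetical settings; the real keys in local_settings.json are not shown
# in the commit message, and these URLs are placeholders.
SETTINGS = {
    "identity_internal_url": "https://identity-internal.example.invalid/v2.0",
    "identity_proxy_url": "https://identity-proxy.example.invalid/v2.0",
    "identity_public_url": "https://identity-public.example.invalid/v2.0",
}


def select_identity_url(environment, settings):
    """Return the identity URL the auth check should use (illustrative helper)."""
    if environment == "staging":
        # Staging e2e servers cannot reach internal identity, so use the proxy.
        return settings["identity_proxy_url"]
    if environment == "prod":
        # Prod uses public identity.
        return settings["identity_public_url"]
    # Assumption: every other environment keeps the default internal URL.
    return settings["identity_internal_url"]


for env in ("staging", "prod", "dev"):
    print(env, "->", select_identity_url(env, SETTINGS))
```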
Suman Jakkula
Merge pull request #4903 from rax-maas/workaround-for-intelligence-graph-failure

workaround for metric_list handler for tenants with large volume of checks
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
Aditya Bhatia
Fix aep-to-kafka Kafka test

Since #4901, agent metrics are no longer sent to Flume.
The tests were modified accordingly.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
suma1379
prevent agent metrics to flume

We are switching back to scribe for aep metrics. The current state sends to both scribe and flume, so this removes the flume call to prevent duplicates.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
Shawn Ashlee
Merge pull request #4902 from rax-maas/fix-agent-connection-grafana-query

agent metrics query to include both flume and scribe
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
Suman Jakkula
Merge pull request #4904 from rax-maas/fix_aep_to_kafka_test

Fix aep-to-kafka Kafka test
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
suma1379
workaround for metric_list handler for tenants with large volume of checks

When a customer browses monitoring graphs in Intelligence (Sage app), the `v1.0/{tenantId}/views/metric_list?limit=250` API is called.
The handler for this API calls Blueflood to fetch metric data for the given tenant. However, the implementation iterates over every entityId+checkId combination and makes a separate Blueflood call for each one.
For customers with a large volume of checks, this easily exceeds the 2K/minute limit on Query API calls enforced by Blueflood repose. It also takes a very long time (45+ seconds) to respond.

We could change the Blueflood query pattern to be generic for the tenant, fetching all metrics in 1 call using `query=rackspace.monitoring.entities.en*.checks.*`.
Blueflood is able to return data in under 7 seconds for such customers.

This could be a quick workaround for customers with large lists. We could actually replace the existing implementation to make Blueflood calls at the "entity" level instead of per entityId+checkId combination.
That would also reduce Blueflood calls by a large fraction.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
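
To make the call-count argument in this commit message concrete, here is a small sketch comparing the three query strategies it mentions. Only the tenant-wide pattern `rackspace.monitoring.entities.en*.checks.*` and the 2K/minute limit come from the message; the per-check and per-entity pattern shapes, the function names, and the sample tenant (300 entities with 10 checks each) are assumptions made for illustration, and no real Blueflood calls are made.

```python
RATE_LIMIT_PER_MINUTE = 2000  # the "2K/minute" Query API limit from the message


def per_check_queries(checks_by_entity):
    """One query per entityId+checkId combination (assumed pattern shape)."""
    return [
        f"rackspace.monitoring.entities.{entity_id}.checks.{check_id}.*"
        for entity_id, check_ids in checks_by_entity.items()
        for check_id in check_ids
    ]


def per_entity_queries(checks_by_entity):
    """One query per entity (assumed pattern shape)."""
    return [
        f"rackspace.monitoring.entities.{entity_id}.checks.*"
        for entity_id in checks_by_entity
    ]


def tenant_wide_queries():
    """A single query for the whole tenant, using the pattern from the message."""
    return ["rackspace.monitoring.entities.en*.checks.*"]


# Hypothetical tenant: 300 entities with 10 checks each.
tenant = {f"en{i}": [f"ch{j}" for j in range(10)] for i in range(300)}

for name, queries in [
    ("per check", per_check_queries(tenant)),
    ("per entity", per_entity_queries(tenant)),
    ("tenant wide", tenant_wide_queries()),
]:
    over = len(queries) > RATE_LIMIT_PER_MINUTE
    print(f"{name}: {len(queries)} calls, exceeds limit: {over}")
```

For this sample tenant the per-check strategy needs 3000 calls, which is over the limit on its own, while the per-entity strategy needs 300 and the tenant-wide query needs 1.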
suma1379
prevent agent metrics to flume

We are switching back to scribe for aep metrics. The current state sends to both scribe and flume, so this removes the flume call to prevent duplicates.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
Ryan Stewart
Merge pull request #4901 from rax-maas/stop-aep-flume-metrics

prevent agent metrics to flume
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
suma1379
agent metrics query to include both flume and scribe
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
Suman Jakkula
Revert "prevent agent metrics to flume"
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
suma1379
prevent agent metrics to flume

We are switching back to scribe for aep metrics. The current state sends to both scribe and flume, so this removes the flume call to prevent duplicates.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
suma1379
prevent agent metrics to flume

We are switching back to scribe for aep metrics. The current state sends to both scribe and flume, so this removes the flume call to prevent duplicates.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
suma1379
workaround for metric_list handler for tenants with large volume of checks

When a customer browses monitoring graphs in Intelligence (Sage app), the `v1.0/{tenantId}/views/metric_list?limit=250` API is called.
The handler for this API calls Blueflood to fetch metric data for the given tenant. However, the implementation iterates over every entityId+checkId combination and makes a separate Blueflood call for each one.
For customers with a large volume of checks, this easily exceeds the 2K/minute limit on Query API calls enforced by Blueflood repose. It also takes a very long time (45+ seconds) to respond.

We could change the Blueflood query pattern to be generic for the tenant, fetching all metrics in 1 call using `query=rackspace.monitoring.entities.en*.checks.*`.
Blueflood is able to return data in under 7 seconds for such customers.

This could be a quick workaround for customers with large lists. We could actually replace the existing implementation to make Blueflood calls at the "entity" level instead of per entityId+checkId combination.
That would also reduce Blueflood calls by a large fraction.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
Suman Jakkula
Merge pull request #4905 from rax-maas/fix_auth_api_nrpe

fix for auth_api_nrpe failure
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
suma1379
prevent agent metrics to flume

We are switching back to scribe for aep metrics. The current state sends to both scribe and flume, so this removes the flume call to prevent duplicates.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
suma1379
workaround for metric_list handler for tenants with large volume of checks

When a customer browses monitoring graphs in Intelligence (Sage app), the `v1.0/{tenantId}/views/metric_list?limit=250` API is called.
The handler for this API calls Blueflood to fetch metric data for the given tenant. However, the implementation iterates over every entityId+checkId combination and makes a separate Blueflood call for each one.
For customers with a large volume of checks, this easily exceeds the 2K/minute limit on Query API calls enforced by Blueflood repose. It also takes a very long time (45+ seconds) to respond.

We could change the Blueflood query pattern to be generic for the tenant, fetching all metrics in 1 call using `query=rackspace.monitoring.entities.en*.checks.*`.
Blueflood is able to return data in under 7 seconds for such customers.

This could be a quick workaround for customers with large lists. We could actually replace the existing implementation to make Blueflood calls at the "entity" level instead of per entityId+checkId combination.
That would also reduce Blueflood calls by a large fraction.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
Ryan Stewart
Merge pull request #4900 from rax-maas/fix-integration-tests-for-webhook-change

fixing gaps in managed webhook url change
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
suma1379
prevent agent metrics to flume

We are switching back to scribe for aep metrics. The current state sends to both scribe and flume, so this removes the flume call to prevent duplicates.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio
suma1379
fixing gaps in managed webhook url change

[4899](https://github.com/rax-maas/ele/pull/4899) implemented the managed webhook URL change.
I did not realize that the caller of the webhook helper also had to be updated to include the new settings key.
Also fixed the integration tests for this change.
  • ele-bundle-ubuntu16.04_x86_64-1: 'rack files ...' failed -  stdio