Creating and Alerting on Logs-based Metrics
Initial Setup
- Set the zone in gcloud:
gcloud config set compute/zone us-west1-b
- Authorize Cloud Shell when prompted.
- Set the project ID:
export PROJECT_ID=$(gcloud info --format='value(config.project)')
- Deploy a standard GKE cluster, which will prompt you to authorize and enable the GKE API.
gcloud container clusters create gmp-cluster --num-nodes=1 --zone us-west1-b
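Cluster creation can take several minutes. As a rough sketch that is not part of the original lab steps, the status check could be wrapped in a small polling helper. The function names, the 10-second interval, and the 60-attempt cap are assumptions made for illustration; the cluster name and zone are the ones used above.

```shell
# Print the current status of the lab cluster (for example PROVISIONING or RUNNING).
cluster_status() {
  gcloud container clusters describe gmp-cluster \
    --zone us-west1-b --format='value(status)'
}

# Poll until the cluster reports RUNNING.
# The interval and attempt cap are arbitrary values chosen for this sketch.
wait_for_cluster() {
  attempts=0
  while [ "$attempts" -lt 60 ]; do
    if [ "$(cluster_status)" = "RUNNING" ]; then
      echo "gmp-cluster is RUNNING"
      return 0
    fi
    attempts=$((attempts + 1))
    sleep 10
  done
  echo "gmp-cluster did not reach RUNNING in time" >&2
  return 1
}
```

Running wait_for_cluster in Cloud Shell should return once the cluster is usable, or exit with a non-zero status after roughly ten minutes.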
Log-based alert
- From Cloud Console, in the Search bar, type in “logs explorer”, then click on the Logs Explorer result.
- Click the Show Query slide bar.
- Enter the following query to create a log-based alert:
resource.type="gce_instance" protoPayload.methodName="v1.compute.instances.stop"
- Click the Create alert link.
- Add the following parameters, clicking Next to move from one to the next:
Alert name: stopped vm
Choose logs to include in the alert: will auto-fill with the query you entered
Set notification frequency and autoclose duration: Time between notifications is 5 min and Incident autoclose duration is 1 hr.
Click Next.
- Who should be notified (optional):
Click on dropdown arrow next to Notification Channels, then click on Manage Notification Channels.
A Notification channels page will open in new tab.
Scroll down the page and click on ADD NEW for Email.
Enter your personal email in the Email Address field and a Display name.
Click Save.
When done, return to the Logs Explorer tab you were in previously.
Refresh the Notification Channels, then select the channel you just created. Click OK.
- Go to the second Cloud Console tab, and navigate to Navigation menu > Compute Engine > VM instances.
- Check the box next to instance1, then click Stop at the top of the page, then click Stop again in the pop-up window. The green check mark will turn to a gray circle when the instance has been stopped.
- In the Search bar, type “monitoring”, then choose the Monitoring option.
- Click on the Alerting tab. You’ll see that your alert has registered. Under Alert Policies click the View all link and you’ll see the log-based alert you created listed.
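To double-check from Cloud Shell that the stop event actually reached Cloud Logging, the same query can be passed to gcloud logging read. This is a minimal sketch: the function name, the one-hour freshness window, the limit of five entries, and the table columns are assumptions chosen for illustration. The filter is the one entered in Logs Explorer, with the implicit AND written out.

```shell
# Same filter as the log-based alert, with the implicit AND made explicit.
STOP_FILTER='resource.type="gce_instance" AND protoPayload.methodName="v1.compute.instances.stop"'

# Print up to five matching audit log entries from the last hour.
# The freshness window, limit, and columns are arbitrary choices for this sketch.
show_stop_events() {
  gcloud logging read "$STOP_FILTER" --freshness=1h --limit=5 \
    --format='table(timestamp, protoPayload.authenticationInfo.principalEmail, resource.labels.instance_id)'
}
```

Calling show_stop_events after stopping instance1 should list at least one entry if the alert query matches.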
Log-based metric
- At the beginning of the lab you deployed a standard GKE cluster. Run the following command to ensure that the cluster named gmp-cluster has been created:
gcloud container clusters list
If your cluster status says PROVISIONING, wait a moment and run the command above again. Repeat until the status is RUNNING.
- Authenticate the cluster:
gcloud container clusters get-credentials gmp-cluster
- Create a namespace to work in:
kubectl create ns gmp-test
- Now run the following to deploy a simple application that emits metrics at the /metrics endpoint:
kubectl -n gmp-test apply -f https://storage.googleapis.com/spls/gsp091/gmp_flask_deployment.yaml
kubectl -n gmp-test apply -f https://storage.googleapis.com/spls/gsp091/gmp_flask_service.yaml
- Verify that the service in the namespace is ready:
kubectl get services -n gmp-test
- Re-run the command until you see the External-IP address populated.
- Check that the Python Flask app is serving metrics with the following command:
curl $(kubectl get services -n gmp-test -o jsonpath='{.items[*].status.loadBalancer.ingress[0].ip}')/metrics
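Instead of re-running the command by hand, the wait for the External-IP could be scripted. The helper below is only a sketch: the function name, the 5-second interval, and the 60-attempt cap are assumptions, while the kubectl query is the same one used in the curl command above.

```shell
# Poll the gmp-test services until a load balancer IP appears, then print it.
# The interval and attempt cap are arbitrary values chosen for this sketch.
wait_for_external_ip() {
  attempts=0
  while [ "$attempts" -lt 60 ]; do
    ip=$(kubectl get services -n gmp-test \
      -o jsonpath='{.items[*].status.loadBalancer.ingress[0].ip}')
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    attempts=$((attempts + 1))
    sleep 5
  done
  echo "timed out waiting for an external IP" >&2
  return 1
}
```

With this in place, curl "$(wait_for_external_ip)/metrics" would fetch the metrics page as soon as the address is assigned.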
Create a log-based metric
- Return to Logs Explorer.
- Click the Create metric link.
- On the Create metric page, input the following:
Metric type: leave the default setting, Counter
Log metric name: hello-app-error
Filter selection: enter the following in the field:
severity=ERROR resource.labels.container_name="hello-app" textPayload: "ERROR: 404 Error page not found"
- Click Create metric.
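The same counter metric could also be created from Cloud Shell with gcloud logging metrics create rather than through the console. This is a hedged sketch, not the lab's prescribed method: the function name and the description text are assumptions, and the filter is the one from the step above with the implicit AND written out.

```shell
# Create a counter-type log-based metric named hello-app-error.
# The description string is a placeholder invented for this sketch.
create_hello_app_error_metric() {
  gcloud logging metrics create hello-app-error \
    --description="Count of 404 errors logged by hello-app" \
    --log-filter='severity=ERROR AND resource.labels.container_name="hello-app" AND textPayload:"ERROR: 404 Error page not found"'
}
```

If the metric was already created in the console, running create_hello_app_error_metric is expected to fail because the name is taken, so use one approach or the other.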
Create a metrics-based alert
- Under Create a metrics-based alert, click Create Alert.
- Under Select a Metric, the metric parameters will automatically fill in.
Update the Rolling window to 2 min.
Accept the other default settings.
Click Next.
- You will need to set Notifications. Feel free to re-use the channel you created earlier in the lab.
- Name the alert log based metric alert.
- Click Create Policy.
Generate some errors
- In Cloud Shell, run the following to generate some errors:
timeout 120 bash -c -- 'while true; do curl $(kubectl get services -n gmp-test -o jsonpath="{.items[*].status.loadBalancer.ingress[0].ip}")/error; sleep $((RANDOM % 4)); done'
- Return to the Logs Explorer page, and go to the Severity section on the lower left side. Click on the Error severity. Now you can search for the 404 Error page not found error. View more information by expanding one of the 404 Error messages.
- Return to the Monitoring page, and click on Alerting. You will see the two policies you created.
- Click on the Alert policies link, and you should see both alerts in the Incidents section. Click on an incident to see details.
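If no incident shows up, it can help to confirm that the error entries were actually logged. The sketch below counts recent matching entries from Cloud Shell; the function name and the 10-minute freshness window are assumptions, and the filter is the one used for the hello-app-error metric.

```shell
# Count hello-app 404 error entries logged in the last 10 minutes.
# The freshness window is an arbitrary value chosen for this sketch.
count_recent_404_errors() {
  gcloud logging read \
    'severity=ERROR AND resource.labels.container_name="hello-app" AND textPayload:"ERROR: 404 Error page not found"' \
    --freshness=10m --format='value(timestamp)' | wc -l
}
```

A count of zero shortly after the error loop finishes suggests the filter does not match the application's log output, which would also keep the metric-based alert from firing.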