AWS X-Ray Configuration with EKS Cluster
Getting Started with AWS X-Ray
AWS X-Ray is a service that collects data about requests that your application serves, and provides tools that you can use to view, filter, and gain insights into that data to identify issues and opportunities for optimization.
For any traced request to your application, you can see detailed information not only about the request and response but also about calls that your application makes to downstream AWS resources, microservices, databases, and web APIs.
Benefits of AWS X-Ray
Simple Setup: AWS X-Ray can be used with applications running on Amazon Elastic Compute Cloud (EC2), Amazon Elastic Container Service (Amazon ECS), AWS Lambda, and AWS Elastic Beanstalk. It’s easy to get started with X-Ray. You just integrate the X-Ray SDK with your application and install the X-Ray agent.
With AWS Elastic Beanstalk, you only have to integrate the X-Ray SDK with your application since the X-Ray agent is pre-installed on Elastic Beanstalk.
End-to-end tracing: AWS X-Ray provides an end-to-end, cross-service view of requests made to your application. It gives you an application-centric view of requests flowing through your application by aggregating the data gathered from individual services into a single unit called a trace.
Service map: AWS X-Ray uses trace data to create a map of the services used by your application, which you can use to drill into specific services or issues.
Data annotation and filtering: AWS X-Ray lets you add annotations to data emitted from specific components or services in your application. You can use this to append business-specific metadata that helps you better diagnose issues.
Console and programmatic access: You can use AWS X-Ray with the AWS Management Console, AWS CLI, and AWS SDKs. The X-Ray API lets you programmatically access the service so you can easily export trace data or ingest the data into your own tools and custom analytics dashboards.
Security: AWS X-Ray is integrated with AWS Identity and Access Management (IAM) so that you can control which users and resources have permission to access your traces and how.
Problem Statement
We were not able to monitor latency in our EKS cluster. We had set up ELK for logging and Grafana and Prometheus for monitoring, but we still could not see the latency or the trace of every request.
We also struggled to locate the slow part of the application. For example, with Kafka, EKS, and RDS in the stack, it was very difficult to tell which service was performing slowly when the application slowed down.
Solution Approach
To address these problems with monitoring the EKS cluster, you can implement AWS X-Ray in your application. Below are the prerequisites and the implementation plan.
Prerequisites: Integrating AWS X-Ray with AWS Distro for OpenTelemetry (ADOT) requires the following:
- kubectl should be installed, at a version that matches the cluster version.
- eksctl should be installed.
- Meet the TLS certificate requirement to ensure end-to-end encryption.
- An IAM role with the AmazonPrometheusRemoteWriteAccess, AWSXrayWriteOnlyAccess, and CloudWatchAgentServerPolicy policies attached.
- If installing an add-on version that is v0.62.1 or earlier, grant permissions to Amazon EKS add-ons to install ADOT.
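As a quick check of the command-line prerequisites above, the following is a minimal sketch that reports which tools are missing from PATH. The helper name missing_tools is hypothetical, and including the aws CLI in the list is an assumption based on the aws eks command used later in this article.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every tool passed as an argument
# that cannot be found on PATH. Prints nothing when all tools are present.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the tools this walkthrough relies on.
# (aws is assumed here because a later step calls "aws eks".)
missing=$(missing_tools kubectl eksctl aws)
if [ -n "$missing" ]; then
  echo "Missing prerequisites:" $missing
else
  echo "All command-line prerequisites found."
fi
```

This only confirms that the tools exist; it does not compare the kubectl version against the cluster version, which still needs to be checked separately.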
Implementation: Once the prerequisites are met, implement AWS X-Ray with the following steps:
- If installing an add-on version that is v0.62.1 or earlier, grant permissions to Amazon EKS add-ons to install ADOT.
kubectl apply -f https://amazon-eks.s3.amazonaws.com/docs/addons-otel-permissions.yaml
- Create an IAM OIDC provider to connect the service account to AWS IAM.
eksctl utils associate-iam-oidc-provider --region=<AWS-Region> --cluster=<ClusterName>
- Create your service account and IAM role by using the command below. Adjust the values for your environment, such as the cluster name (my-cluster) and the namespace.
eksctl create iamserviceaccount \
    --name adot-collector \
    --namespace default \
    --cluster my-cluster \
    --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
    --attach-policy-arn arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess \
    --attach-policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy \
    --approve \
    --override-existing-serviceaccounts
- Install the ADOT Amazon EKS add-on on your EKS cluster by following these steps:
- Open the Amazon EKS console at https://console.aws.amazon.com/eks/home#/clusters.
- In the left pane, select Clusters, and then select the name of your cluster on the Clusters page.
- Choose the Add-ons tab.
- Choose Get more add-ons.
- On the Select add-ons page, do the following:
- In the Amazon EKS-addons section, select the AWS Distro for OpenTelemetry checkbox.
- Choose Next.
- On the Configure selected add-ons settings page, do the following:
- The default version will be selected in the Version drop-down. Select the Version you’d like to use.
- (Optional) If deploying an ADOT Collector, expand the Optional configuration settings and provide the Configuration values that match your use case for Collector deployment. The Add-on configuration schema provides the available options for your configuration values.
- Expand the Optional configuration settings and select Override for the Conflict resolution method if a service account is already created in the cluster without an IAM role.
- Choose Next.
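On the Review and Add page, choose Create. After the add-on installation is complete, you see your installed add-on. To set the ADOT configuration values from the AWS CLI, use the command below. If the add-on is already installed, run aws eks update-addon with the same options instead of create-addon.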
aws eks create-addon \
    --cluster-name <Clustername> \
    --addon-name adot \
    --configuration-values "{\"manager\":{\"resources\":{\"limits\":{\"cpu\":\"200m\"}}}}" \
    --resolve-conflicts=OVERWRITE
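The escaped JSON passed to --configuration-values is easy to get wrong. As a sketch, the same value can be kept in a single-quoted shell variable, which needs no backslashes. The variable names below are illustrative only.

```shell
#!/bin/sh
# The value exactly as written in the command above, with escaped quotes.
ESCAPED="{\"manager\":{\"resources\":{\"limits\":{\"cpu\":\"200m\"}}}}"

# The same JSON in single quotes: no escaping needed, easier to read and edit.
CONFIG_VALUES='{"manager":{"resources":{"limits":{"cpu":"200m"}}}}'

# Both forms expand to the identical string.
if [ "$ESCAPED" = "$CONFIG_VALUES" ]; then
  echo "configuration values match: $CONFIG_VALUES"
fi
```

The variable can then be passed as --configuration-values "$CONFIG_VALUES" in the command above.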
- Next, install cert-manager, which provides the TLS certificates that the ADOT operator's webhooks need. Run the command below to install it.
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.8.2/cert-manager.yaml
Now, we need to deploy the ADOT Collector by following the below steps:
1. Create a file named “collector-config-xray.yaml” with the content below. Replace <YOUR_AWS_REGION> with the AWS Region for your environment.
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: my-collector-xray
spec:
  mode: deployment
  serviceAccount: adot-collector
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    processors:

    exporters:
      awsxray:
        region: <YOUR_AWS_REGION>

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: []
          exporters: [awsxray]
2. Apply the YAML file to your cluster by using the command below.
kubectl apply -f collector-config-xray.yaml
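After the manifest is applied, it can help to confirm that the collector is actually running before sending traffic to it. The sketch below wraps two standard kubectl commands in a hypothetical helper named check_collector. It assumes the operator names both the Deployment and the Service "<collector-name>-collector", which matches the my-collector-xray-collector endpoint used later in this article, and that the collector was created in the default namespace.

```shell
#!/bin/sh
# Hypothetical helper: wait for the collector Deployment to roll out, then
# show its Service. Assumes both are named "<collector-name>-collector".
check_collector() {
  name="$1"                  # metadata.name from collector-config-xray.yaml
  namespace="${2:-default}"  # namespace the collector was applied to
  kubectl rollout status "deployment/${name}-collector" \
    --namespace "$namespace" --timeout=120s &&
  kubectl get service "${name}-collector" --namespace "$namespace"
}

# Usage (requires access to the cluster):
#   check_collector my-collector-xray
```

If the rollout times out, the collector logs and the service account are the first places to look.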
- Use a sample application to generate trace data by following these steps:
1. Download the traffic-generator.yaml file to your computer by using the below command.
curl -O https://raw.githubusercontent.com/aws-observability/aws-otel-community/master/sample-configs/traffic-generator.yaml
2. In traffic-generator.yaml, ensure the second kind value reflects your collector mode, which in this setup is “kind: Deployment”.
3. Now, you need to apply traffic-generator.yaml to your cluster.
kubectl apply -f traffic-generator.yaml
4. Download the sample-app.yaml file to your computer by using the below command.
curl -O https://raw.githubusercontent.com/aws-observability/aws-otel-community/master/sample-configs/sample-app.yaml
5. In sample-app.yaml, replace “<YOUR_AWS_REGION>” with your own AWS Region.
6. In sample-app.yaml, update the value for OTEL_EXPORTER_OTLP_ENDPOINT if it doesn’t match your collector service name. For example, for the X-Ray collector created above, replace http://my-collector-collector:4317 with http://my-collector-xray-collector:4317.
7. Now apply sample-app.yaml to your cluster by using the command below.
kubectl apply -f sample-app.yaml
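The two manual edits to sample-app.yaml (steps 5 and 6) can also be scripted. The following is a minimal sketch under the assumption that the file contains the literal strings named in those steps. The helper name patch_sample_app and the region us-east-1 in the usage comment are illustrative, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: apply the edits from steps 5 and 6 to a manifest.
#   $1 = path to sample-app.yaml
#   $2 = AWS Region to substitute for <YOUR_AWS_REGION>
#   $3 = collector name (metadata.name of the OpenTelemetryCollector)
patch_sample_app() {
  file="$1"
  region="$2"
  collector="$3"
  tmp="${file}.tmp"
  # Write to a temporary file and move it into place; this avoids "sed -i",
  # whose syntax differs between GNU and BSD sed.
  sed -e "s|<YOUR_AWS_REGION>|${region}|g" \
      -e "s|http://my-collector-collector:4317|http://${collector}-collector:4317|g" \
      "$file" > "$tmp" && mv "$tmp" "$file"
}

# Usage, run before "kubectl apply -f sample-app.yaml":
#   patch_sample_app sample-app.yaml us-east-1 my-collector-xray
```

Reviewing the patched file before applying it is still worthwhile, since the upstream sample may change over time.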
- The implementation is now complete. We can validate it by looking for the sample application's traces in the AWS Console.
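Besides the console, the presence of traces can be checked from the command line. The sketch below uses the aws xray get-trace-summaries command inside a hypothetical helper named recent_trace_ids. It assumes the AWS CLI is configured with credentials and with the same default Region the collector exports to.

```shell
#!/bin/sh
# Hypothetical helper: list the IDs of traces recorded in the last N minutes
# (default 10). Uses the Region from the current AWS CLI configuration.
recent_trace_ids() {
  minutes="${1:-10}"
  end=$(date +%s)                  # now, as Unix epoch seconds
  start=$((end - minutes * 60))    # beginning of the time window
  aws xray get-trace-summaries \
    --start-time "$start" --end-time "$end" \
    --query 'TraceSummaries[].Id' --output text
}

# Usage (requires AWS credentials):
#   recent_trace_ids 5
```

An empty result shortly after deployment does not necessarily mean the setup is broken; it may simply take a moment for the first traces to arrive.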
Debugging
Errors can occur while setting up AWS X-Ray. Here are some areas to check when debugging the setup:
Configuration Errors: Verify that the setup and configuration of X-Ray components, such as SDK integration, permissions, and sampling rules, are correctly implemented.
Instrumentation Problems: Identify any issues with instrumenting services or applications to send trace data to X-Ray. This involves examining the SDK integration within each service to ensure it captures the relevant information.
Permission and Access Issues: Check authorization and IAM roles to ensure X-Ray has the necessary permissions to collect and access tracing information across AWS services and applications.
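For the ADOT-based setup described in this article, two quick checks cover many permission and configuration problems: whether the service account carries an IAM role annotation, and what the collector itself is logging. The sketch below bundles both in a hypothetical helper named collector_debug. The resource names are the ones used earlier in this article, and the Deployment name my-collector-xray-collector is an assumption based on the collector service name.

```shell
#!/bin/sh
# Hypothetical helper: gather the basics for debugging the collector.
collector_debug() {
  namespace="${1:-default}"
  # 1. The IAM role attached to the service account. An empty line here
  #    means the role annotation is missing.
  kubectl get serviceaccount adot-collector --namespace "$namespace" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
  echo
  # 2. Recent collector logs; exporter and permission errors show up here.
  kubectl logs deployment/my-collector-xray-collector \
    --namespace "$namespace" --tail=50
}

# Usage (requires access to the cluster):
#   collector_debug
```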
Conclusion
In conclusion, the implementation of AWS X-Ray has provided us with invaluable insights into our system’s performance and behavior. This tracing tool has given us a comprehensive understanding of our distributed architecture, allowing us to trace requests, identify bottlenecks, and optimize resource usage. As a result, we are well-positioned to enhance our system’s efficiency and reliability.