How to troubleshoot pod evictions in Sourcegraph Kubernetes deployments
This document explains how to troubleshoot pod evictions, which can cause the loss of data held in ephemeral storage.
It walks you step by step through finding out why an eviction occurred and how to resolve it.
Prerequisites
This document assumes that you have deployed Sourcegraph on Kubernetes and that you are a site admin for your organization.
Steps to troubleshoot
- Run kubectl describe pod $EVICTEDPOD
- Check the Message object.
- If the message reads "Pod ephemeral local storage usage exceeds the total limit of containers xGi.", continue with the steps below.
- Check the ephemeral-storage Limits and Requests, for example ephemeral-storage: xGi. Also, check the cache size for the pod where $PODNAME_CACHE_SIZE_MB>:x0000 (x is an integer).
- In the $PODNAME.Deployment.yaml, raise the ephemeral-storage figures to a preferred storage size for your node and set the CACHE_SIZE_MB to a size lower than the ephemeral storage limit.
- If preferred, also scale out by increasing the number of replicas.
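
The checks above can be sketched as a few small shell helpers. This is a minimal sketch, not an official Sourcegraph script: the function names eviction_message, show_storage_settings, and cache_fits are hypothetical, the namespace defaults to "default" as an assumption, and the size comparison assumes that 1 Gi can be counted as 1024 MB, which errs on the cautious side.

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical helpers,
# not part of kubectl or Sourcegraph.

# Print the reason and message recorded in the status of an evicted pod.
# Usage: eviction_message <pod> [namespace]
eviction_message() {
  kubectl get pod "$1" -n "${2:-default}" \
    -o jsonpath='{.status.reason}{": "}{.status.message}{"\n"}'
}

# Show each container's resource requests/limits and any *CACHE_SIZE_MB
# environment variables currently set on a deployment.
# Usage: show_storage_settings <deployment> [namespace]
show_storage_settings() {
  kubectl get deployment "$1" -n "${2:-default}" \
    -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{": "}{.resources}{"\n"}{end}'
  kubectl set env "deployment/$1" -n "${2:-default}" --list | grep CACHE_SIZE_MB
}

# Succeed (exit status 0) when the cache size in MB is below the limit in Gi.
# Assumption: 1 Gi is counted as 1024 MB.
# Usage: cache_fits <cache_size_mb> <limit_gi>
cache_fits() {
  [ "$1" -lt "$(( $2 * 1024 ))" ]
}
```

After sourcing the file, eviction_message "$EVICTEDPOD" prints the same message that kubectl describe pod reports, and cache_fits 100000 200 succeeds because 100000 MB is below 200 Gi. The helpers only read cluster state; the changes themselves are still made in the $PODNAME.Deployment.yaml as described in the steps.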