Troubleshoot Cassandra Setup
The C3 Agentic AI Platform offers Cassandra as a database solution. The platform uses the K8ssandra Operator to create resources and manage Cassandra pods.
To learn more about the K8ssandra Operator, see the K8ssandra Operator documentation.
To learn more about Cassandra as a database solution for the platform, see Cassandra Database.
Cassandra resources and pods are created during the platform installation and deployment process. The c3aiops Helm chart defines k8ssandra-operator resources. See k8ssandra-operator.
This topic provides troubleshooting guidance if you encounter an issue with Cassandra for your platform.
Prerequisites
Make sure the following are complete before you perform Cassandra troubleshooting steps:
- You, C3 AI Operations, C3 AI CoE, or an administrator has installed and deployed the platform with Cassandra as a database.
- You have access to your platform Kubernetes deployment. The kubectl commands in this topic require Kubernetes access to run.
Troubleshoot Cassandra setup
If you run a Cassandra query in the C3 AI Console, for example NormalizedTimeseries.fetch();, you might encounter the following error:
```
NoHostAvailableException: All host(s) tried for query failed
```

If you encounter this error, check whether your cluster has Cassandra configured as a database. Run the following command in the C3 AI Console:
```javascript
let cass = CassandraDB.list()[0];
CassandraDB.forId(cass.id).status;
```

This code returns an "available" response if Cassandra is configured as a database for your cluster.
To check if all queries are hitting this same issue, run the following command in the C3 AI Console:
```javascript
TargetStage.fetch({limit: 1})
```

This command should return 0 or 1 records if Cassandra is healthy. If this command returns a similar error, the issue likely impacts all environments and applications in your cluster. If this command runs successfully, work with your C3 AI project team to determine when the Cassandra query fails.
If your cluster's Cassandra configuration is set improperly, the platform uses the default PostgreSQL or relational database instead. In this case, the `TargetStage.fetch({limit: 1})` command still succeeds because the platform queries the default PostgreSQL or relational database. Use the previous code snippet to check whether your cluster has Cassandra properly configured.
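The triage flow above can be sketched as a small shell function. The two arguments are hypothetical stand-ins for results you gather manually from the C3 AI Console (the `CassandraDB` status check and the `TargetStage.fetch` query); this is a minimal sketch of the decision logic, not part of the platform:

```shell
#!/bin/sh
# Sketch of the triage flow above. The two arguments are hypothetical
# stand-ins for results gathered manually from the C3 AI Console:
#   $1 - the CassandraDB status check result, e.g. "available"
#   $2 - "ok" if TargetStage.fetch({limit: 1}) succeeded, anything else if not
triage() {
  cassandra_status="$1"
  targetstage_result="$2"
  if [ "$cassandra_status" != "available" ]; then
    echo "Cassandra is not configured as a database for this cluster"
  elif [ "$targetstage_result" = "ok" ]; then
    echo "Only specific queries fail: work with your C3 AI project team"
  else
    echo "All Cassandra queries fail: the issue impacts the whole cluster"
  fi
}

triage available error   # prints: All Cassandra queries fail: the issue impacts the whole cluster
```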
If the TargetStage.fetch({limit: 1}) command returns an error, run the following code in the command line of your host machine to investigate further:
Check if `default-sts` pods are running:

```shell
kubectl get pods --all-namespaces \
  -l c3__func-0=0k8sc3cassv20 \
  -l app.kubernetes.io/managed-by=cass-operator
```

Check if all Medusa and Reaper pods in the Cassandra ring are running:

```shell
kubectl get pods --all-namespaces -l c3__func-0=0k8sc3cassv20
kubectl get pods --all-namespaces -l c3__func-0=0k8sc3cass0
```

To learn more about Medusa and Reaper, see the K8ssandra documentation.
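To spot unhealthy pods quickly in the output of the commands above, you can filter for pods that are not in the `Running` state. A minimal sketch, assuming the default `kubectl get pods --all-namespaces` table layout (STATUS is the fourth column); the pod names here are illustrative:

```shell
#!/bin/sh
# Filter `kubectl get pods --all-namespaces` output for pods that are
# not Running. Column 4 is STATUS in the default --all-namespaces layout.
not_running() {
  awk 'NR > 1 && $4 != "Running" { print $2 " (" $4 ")" }'
}

# Example with captured output (illustrative pod names):
sample="NAMESPACE  NAME           READY  STATUS   RESTARTS  AGE
c3cluster  default-sts-0  2/2    Running  0         4d
c3cluster  default-sts-1  1/2    Pending  0         4d"

echo "$sample" | not_running   # prints: default-sts-1 (Pending)
```

In live use, pipe the real command into the filter, for example `kubectl get pods --all-namespaces -l c3__func-0=0k8sc3cassv20 | not_running`.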
If `default-sts` or Cassandra pods are not running, perform a restart:

```shell
kubectl rollout restart -n <cluster_id> statefulset \
  -l app.kubernetes.io/managed-by=cass-operator
```

If the pods are still not running after the restart, check the Cassandra logs to investigate:

```shell
kubectl logs -n <cluster_id> \
  -l app.kubernetes.io/managed-by=cass-operator
```

Retrieve the Cassandra secret name so you can run the nodetool command in a subsequent step:
```shell
kubectl get secret -n <c3namespace> -o name | grep "superuser"
```

This command returns the Cassandra secret name, such as `secret/cs-abcd012345-superuser`.

Retrieve the Cassandra username stored in the secret:

```shell
kubectl get secret <secret-name> -n <c3namespace> \
  -o jsonpath='{.data.username}' | base64 -d
```

Retrieve the Cassandra password stored in the secret:

```shell
kubectl get secret <secret-name> -n <c3namespace> \
  -o jsonpath='{.data.password}' | base64 -d
```

Navigate to the pod and check whether nodetool status indicates that nodes are down. Pass the credentials from the previous steps:
```shell
nodetool -u <cassandra_username> -pw <cassandra_password> status
```

This command should show all nodes with status `UN` (Up/Normal) if the nodes are healthy. If the nodes look healthy but the issue persists, perform a restart:

```shell
kubectl rollout restart -n <cluster_name> statefulset \
  -l app.kubernetes.io/managed-by=cass-operator
```

If none of the previous steps resolve the issue, use the following K8ssandra and Kubernetes resources to further troubleshoot: