Installation Troubleshooting¶
This page summarizes common issues with the installer and provides troubleshooting solutions to help users quickly resolve problems encountered during installation and operation.
Troubleshooting DCE 5.0 Platform Interface Unavailability with diag.sh Script¶
Starting with installer v0.12.0, a diag.sh script is included to help troubleshoot cases where the DCE 5.0 platform interface is unavailable.
Run the command:
Example of execution result:
Failure to Start kubelet Service After Kind Container Restart¶
After restarting the Kind container, the kubelet service fails to start and reports an error:
Solution:
- Solution 1: Restart the container by running podman restart [kind] --time 120, and do not interrupt the command with Ctrl+C while it runs.
- Solution 2: Enter the Kind container using podman exec and run the following command:
Podman fails to create containers after disabling IPv6¶
Error message:
Solution:
Re-enable IPv6, or switch the bootstrap node from Podman to Docker.
Related Podman issue: https://github.com/containers/podman/issues/13388
Redis gets stuck when reinstalling DCE 5.0 in a kind cluster¶
Issue: The Redis Pod stays at 0/4 running for a long time and reports the error primary ClusterIP can not unset.
- Delete the rfs-mcamel-common-redis service in the mcamel-system namespace:
- Then, retry the installation command.
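The deletion step above can be sketched as a small shell function. It assumes kubectl is already configured against the kind cluster; the function name delete_stale_redis_service and the KUBECTL override variable are hypothetical conveniences, not part of the installer.

```shell
# Command used to reach the cluster; override it (for example with a wrapper) if needed.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: remove the leftover Redis service named in the step above.
# --ignore-not-found makes the call safe to repeat if the service is already gone.
delete_stale_redis_service() {
  "$KUBECTL" delete service rfs-mcamel-common-redis -n mcamel-system --ignore-not-found
}
```

After calling delete_stale_redis_service, rerun the installation command as described above.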
When using MetalLB, VIP access is blocked and you cannot log in to DCE¶
- Check whether the VIP is in the same network segment as the host. In MetalLB L2 mode, the two must be in the same network segment.
- If this error occurs after you added a new NIC to the control node in the global cluster, you need to manually declare and configure L2Advertisement. Refer to the related MetalLB issues.
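The same-segment check from the first item can be sketched with plain shell arithmetic. The helper names ip_to_int and same_subnet and the addresses in the trailing comment are hypothetical; the prefix length has to come from the host's actual network configuration. The sketch only handles IPv4 dotted-quad addresses written without leading zeros.

```shell
# Convert a dotted-quad IPv4 address (e.g. 10.6.1.20) to a 32-bit integer.
ip_to_int() {
  printf '%s\n' "$1" | {
    IFS=. read -r a b c d
    echo $(( ($a << 24) | ($b << 16) | ($c << 8) | $d ))
  }
}

# Succeed (exit 0) if two addresses share the same network for a given prefix.
# $1: VIP, $2: host IP, $3: prefix length such as 24
same_subnet() {
  mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
  [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$2") & $mask )) ]
}

# Example with made-up addresses:
#   same_subnet 10.6.1.100 10.6.1.20 24 && echo "VIP and host share a segment"
```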
Community package: fluent-bit installation failed¶
Error:
Solutions:
Check if the following key information appears in the Pod log:
[warn] [net] getaddrinfo(host='mcamel-common-es-cluster-masters-es-http.mcamel-system.svc.cluster.local',errt11):Could not contact DNS servers
If so, this is a known fluent-bit bug. Refer to: https://github.com/aws/aws-for-fluent-bit/issues/233
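The log check described above can be sketched as a filter that reads Pod logs on standard input and looks for the key phrase. The function name has_fluent_bit_dns_error is hypothetical, and the Pod name and namespace in the trailing comment are placeholders to fill in from your own cluster.

```shell
# Hypothetical helper: succeed (exit 0) if the log text on stdin contains
# the DNS failure message quoted above.
has_fluent_bit_dns_error() {
  grep -q 'Could not contact DNS servers'
}

# Example (placeholders, not real names):
#   kubectl logs <fluent-bit-pod> -n <namespace> | has_fluent_bit_dns_error \
#     && echo "matches the known fluent-bit DNS issue"
```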
Error reported during CentOS 7.6 installation¶
Error:
Solutions:
Run modprobe br_netfilter on each node where the global service cluster is installed, and wait until br_netfilter is loaded.
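One way to confirm that the module has finished loading is to look for it in /proc/modules. The sketch below is an illustration only; the function name module_loaded is hypothetical, and the optional second argument exists so the check can be pointed at a different module list file.

```shell
# Hypothetical helper: succeed (exit 0) if a kernel module appears in the module list.
# $1: module name; $2: module list file (defaults to /proc/modules)
module_loaded() {
  awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Example, run on each node after loading the module:
#   sudo modprobe br_netfilter
#   module_loaded br_netfilter && echo "br_netfilter is loaded"
```

A module compiled directly into the kernel does not show up in /proc/modules, so a negative result here is a hint rather than proof.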
CentOS environment preparation issues¶
Running yum install docker reports an error:
Failed to set locale, defaulting to C.UTF-8
CentOS Linux 8 - AppStream 93 B/s | 38 B 00:00
Error: Failed to download metadata for repo 'appstream': Cannot prepare internal mirrorlist: No URLs in mirrorlist
You can try the following methods to resolve the issue:
- Install the glibc-langpack-en package:
- If the issue persists, try the following:
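For the "No URLs in mirrorlist" error shown above, a commonly used workaround is to point the CentOS 8 repositories at vault.centos.org, because CentOS Linux 8 has reached end of life and its mirror list is no longer served. The sketch below is an assumption about what such a fix could look like, not a command taken from this page; the function name repoint_to_vault is hypothetical, and the repository file layout is assumed to follow the stock CentOS-*.repo files.

```shell
# Hypothetical helper: comment out mirrorlist entries and enable a vault.centos.org
# baseurl in every CentOS-*.repo file of the given directory.
# $1: repo directory (defaults to /etc/yum.repos.d)
# Each modified file keeps a .bak copy next to it.
repoint_to_vault() {
  repo_dir="${1:-/etc/yum.repos.d}"
  for f in "$repo_dir"/CentOS-*.repo; do
    [ -e "$f" ] || continue
    sed -i.bak \
      -e 's|^mirrorlist=|#mirrorlist=|' \
      -e 's|^#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|' \
      "$f"
  done
}

# Example (needs root for the default directory):
#   repoint_to_vault && yum makecache
```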
Unable to Restart kind Cluster Properly After Bootstrap Node Reboot¶
On openEuler 22.03 LTS SP2, the kind cluster is not configured to start automatically during deployment, so it may fail to come back up after the bootstrap node reboots.
To resolve this issue, execute the following command to restart:
Note
If the same situation occurs in other environments, you can use the same command to restart the cluster.
Missing ip6tables When Deploying Ubuntu 20.04 as the Bootstrap Machine¶
When deploying Ubuntu 20.04 as the bootstrap machine, the absence of ip6tables can cause errors during the deployment process.
Refer to the Podman known issue.
Temporary solution: Manually install iptables. For details, refer to Install and Use iptables on Ubuntu 22.04.
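Before starting the deployment, it can help to confirm that the required commands are present on the bootstrap machine. The following is a minimal sketch; the function name missing_tools is hypothetical.

```shell
# Hypothetical helper: print the name of each given command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example: list whichever of the two commands is absent.
#   missing_tools iptables ip6tables
```

If ip6tables is reported as missing on Ubuntu, installing the iptables package (for example with sudo apt-get install -y iptables) normally provides both commands.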