2.4 Thermal Control and CPU Frequency Management
The following information does not apply to the RDK Ultra development board.
X3 Thermal Control
The following information applies to the RDK X3 and RDK X3 Module development boards.
To avoid chip overheating under heavy load, power management is implemented at the operating system level. The SoC has an internal temperature sensor, and the Thermal subsystem monitors this temperature.
Main Temperature Points
- Boot Temperature: The maximum temperature allowed during system startup. If the temperature exceeds this threshold, the system immediately throttles the CPU and BPU frequencies during boot. The default value is 80000 (80 degrees Celsius). The current value can be read with:
cat /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_temp
- Throttling Temperature: The temperature at which the CPU and BPU frequencies are throttled. When the temperature exceeds this threshold, the CPU and BPU frequencies are reduced to lower the SoC power consumption. The CPU can be throttled down to a minimum of 240 MHz, and the BPU to a minimum of 400 MHz. The default value is 95000 (95 degrees Celsius). The current value can be read with:
cat /sys/devices/virtual/thermal/thermal_zone0/trip_point_1_temp
- Shutdown Temperature: The temperature at which the system shuts down to protect the chip and hardware. Ensure the device is properly cooled to prevent shutdowns. After a shutdown, the device does not restart automatically; the development board must be power cycled manually. The default value is 105000 (105 degrees Celsius). The current value can be read with:
cat /sys/devices/virtual/thermal/thermal_zone0/trip_point_2_temp
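The trip point files store temperatures in thousandths of a degree Celsius. The following is a minimal sketch of reading all three values and printing them in degrees. The function names millideg_to_celsius and show_trip_points are illustrative and not part of the RDK system; the optional directory argument is an assumption added so the function can be pointed at another thermal zone.

```shell
#!/bin/sh
# Convert a millidegree-Celsius value (as stored in sysfs) to degrees Celsius.
millideg_to_celsius() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Print the boot, throttling, and shutdown trip points of a thermal zone.
# $1: zone directory (hypothetical parameter; defaults to thermal_zone0)
show_trip_points() {
    zone="${1:-/sys/devices/virtual/thermal/thermal_zone0}"
    for pair in "0:boot" "1:throttling" "2:shutdown"; do
        idx="${pair%%:*}"
        name="${pair#*:}"
        file="$zone/trip_point_${idx}_temp"
        if [ -r "$file" ]; then
            printf '%s: %s C\n' "$name" "$(millideg_to_celsius "$(cat "$file")")"
        else
            printf '%s: unavailable\n' "$name"
        fi
    done
}
```

Calling show_trip_points with no argument reads the thermal_zone0 path used in the commands above.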
The current chip operating frequency, temperature, and other status information can be viewed with the sudo hrut_somstatus command.
Configuring Temperature Thresholds
The throttling and shutdown temperature thresholds can be temporarily configured using the following commands. The throttling temperature cannot exceed the shutdown temperature, and the shutdown temperature cannot be set above 105 degrees Celsius.
For example, to set the throttling temperature to 85 degrees Celsius:
echo 85000 > /sys/devices/virtual/thermal/thermal_zone0/trip_point_1_temp
For example, to set the shutdown temperature to 105 degrees Celsius:
echo 105000 > /sys/devices/virtual/thermal/thermal_zone0/trip_point_2_temp
Please note that the above configuration will be reset to the default values after a system restart. To ensure persistent configuration, these commands can be added to the startup scripts for automatic configuration.
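The limits described in this section (the throttling temperature must not exceed the shutdown temperature, and the shutdown temperature must not exceed 105 degrees Celsius) can be checked before writing. The sketch below is one possible helper for a startup script. The names valid_thresholds and set_thresholds are hypothetical, and the script is assumed to run as root.

```shell
#!/bin/sh
# Return 0 if the pair (throttle, shutdown), both in millidegrees Celsius,
# satisfies the limits stated above; return non-zero otherwise.
valid_thresholds() {
    throttle_mc="$1"
    shutdown_mc="$2"
    [ "$throttle_mc" -le "$shutdown_mc" ] && [ "$shutdown_mc" -le 105000 ]
}

# Validate, then write both thresholds into a thermal zone directory.
# $1: throttling temperature, $2: shutdown temperature,
# $3: zone directory (hypothetical parameter; defaults to thermal_zone0)
set_thresholds() {
    zone="${3:-/sys/devices/virtual/thermal/thermal_zone0}"
    if ! valid_thresholds "$1" "$2"; then
        echo "refusing invalid thresholds: $1 / $2" >&2
        return 1
    fi
    echo "$1" > "$zone/trip_point_1_temp" &&
    echo "$2" > "$zone/trip_point_2_temp"
}
```

For example, set_thresholds 85000 105000 applies the two values shown above, while set_thresholds 110000 105000 is rejected without writing anything.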
X5 Thermal Control
The following information applies to the RDK X5 development board.
Temperature Sensor
The X5 has three temperature sensors, which report the temperatures of the DDR, BPU, and CPU. The hwmon0 directory under /sys/class/hwmon/ contains the sensor parameters: temp1_input is the DDR temperature, temp2_input is the BPU temperature, and temp3_input is the CPU temperature. Values are reported in units of 0.001 degrees Celsius, so the reading below corresponds to 46.643 degrees Celsius.
cat /sys/class/hwmon/hwmon0/temp1_input
46643
The BPU temperature sensor is located inside the BPU subsystem, which is powered on only while the BPU is running, so the BPU temperature can be read only while the BPU is running.
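As an illustration, the three sensor files can be read in one pass and converted to degrees Celsius. This is a sketch under stated assumptions: the function name show_x5_temps is hypothetical, and it assumes that reading temp2_input fails or returns nothing while the BPU is powered off, which the text above implies but does not spell out.

```shell
#!/bin/sh
# Print the DDR, BPU, and CPU temperatures from a hwmon directory.
# $1: hwmon directory (hypothetical parameter; defaults to hwmon0)
show_x5_temps() {
    dir="${1:-/sys/class/hwmon/hwmon0}"
    i=1
    for name in DDR BPU CPU; do
        file="$dir/temp${i}_input"
        # Assumption: the read fails or is empty when the sensor is unpowered.
        if raw=$(cat "$file" 2>/dev/null) && [ -n "$raw" ]; then
            awk -v n="$name" -v m="$raw" \
                'BEGIN { printf "%s: %.3f C\n", n, m / 1000 }'
        else
            printf '%s: not available\n' "$name"
        fi
        i=$((i + 1))
    done
}
```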
Thermal
Linux Thermal is the temperature control module of the Linux kernel. It limits the heat generated by the chip during operation so that the chip and the device casing stay within a safe and comfortable temperature range.
To control the device temperature effectively, three components of the Thermal framework are involved:
- Devices that measure temperature: abstracted as Thermal Zone Devices. The X5 has two thermal zones, thermal_zone0 and thermal_zone1.
- Devices that require cooling: abstracted as Thermal Cooling Devices, including the CPU, BPU, GPU, and DDR.
- Temperature control strategy: abstracted as the Thermal Governor.
The information and controls for these components are available in the /sys/class/thermal directory.
The X5 has four cooling devices in total:
- cooling_device0: cpu
- cooling_device1: bpu
- cooling_device2: gpu
- cooling_device3: ddr
The DDR cooling device is associated with thermal_zone0, and the CPU, BPU, and GPU cooling devices are associated with thermal_zone1. The default policy is step_wise, which can be confirmed with the following command:
cat /sys/class/thermal/thermal_zone0/policy
The supported policies can be listed with the following command. There are two: user_space and step_wise.
cat /sys/class/thermal/thermal_zone0/available_policies
user_space reports the current temperature of the thermal zone, the trip points, and other information to user space through uevent, and user-space software decides the temperature control strategy.
step_wise is a relatively mild policy that raises the cooling state step by step on each polling cycle.
Which policy to choose depends on the needs of the product. It can be selected at compile time or switched dynamically through sysfs. For example, to switch thermal_zone0 to the user_space policy:
echo user_space > /sys/class/thermal/thermal_zone0/policy
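To check the result of a policy switch across all zones at once, the policy and available_policies files can be read in a loop. The sketch below assumes every zone directory exposes both files, as thermal_zone0 does above. The function name show_zone_policies is illustrative.

```shell
#!/bin/sh
# Print the active and available policies for every thermal zone.
# $1: thermal class directory (hypothetical parameter;
#     defaults to /sys/class/thermal)
show_zone_policies() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        # Skip the unexpanded pattern when no zone directory exists.
        [ -d "$zone" ] || continue
        printf '%s: policy=%s available=%s\n' \
            "$(basename "$zone")" \
            "$(cat "$zone/policy")" \
            "$(cat "$zone/available_policies")"
    done
}
```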
thermal_zone0 has one trip point, which sets the throttling (frequency scaling) temperature of the DDR cooling device.
The DDR throttling temperature can be read through sysfs; the current configuration is 95 degrees Celsius:
cat /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_temp
To change the DDR throttling temperature, for example to 85 degrees Celsius, use the following command:
echo 85000 > /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_temp
thermal_zone1 has three trip points. trip_point_0_temp is reserved. trip_point_1_temp is the throttling temperature of this thermal zone, at which the frequencies of the CPU, BPU, and GPU are scaled down; it is currently set to 95 degrees Celsius. trip_point_2_temp is the shutdown temperature, currently set to 105 degrees Celsius. For example, to make the CPU, BPU, and GPU start throttling when the junction temperature reaches 85 degrees Celsius:
echo 85000 > /sys/devices/virtual/thermal/thermal_zone1/trip_point_1_temp
To set the shutdown temperature to 105 degrees Celsius:
echo 105000 > /sys/devices/virtual/thermal/thermal_zone1/trip_point_2_temp
Please note that the above configuration will be reset to the default values after a system restart. To ensure persistent configuration, these commands can be added to the startup scripts for automatic configuration.
CPU Frequency Management
The Linux kernel includes the cpufreq subsystem for controlling CPU frequencies and frequency control policies.
Navigate to the /sys/devices/system/cpu/cpufreq/policy0 directory and run the ls command. You will see the following files:
affected_cpus // CPUs currently affected by this frequency policy (offline CPUs are not shown)
cpuinfo_cur_freq // Current CPU frequency as read from the hardware (unit: kHz)
cpuinfo_max_freq // The highest operating frequency the CPU can run at (unit: kHz)
cpuinfo_min_freq // The lowest operating frequency the CPU can run at (unit: kHz)
cpuinfo_transition_latency // The time required to switch the processor frequency (unit: ns)
related_cpus // The CPU cores covered by this frequency policy (all online and offline CPUs)
scaling_available_frequencies // List of frequencies supported by the CPU (unit: kHz)
scaling_available_governors // List of all governor (scaling strategy) types available in the current kernel
scaling_boost_frequencies // List of frequencies supported by the CPU in boost mode (unit: kHz)
scaling_cur_freq // The CPU frequency currently cached in the cpufreq module, without checking the CPU hardware registers (unit: kHz)
scaling_disable_freq // A CPU frequency to disable; only one value can be set
scaling_driver // The scaling driver currently in use
scaling_governor // The current governor (scaling strategy)
scaling_max_freq // The highest frequency available to the CPU under the current scaling strategy (read from the cpufreq module cache)
scaling_min_freq // The lowest frequency available to the CPU under the current scaling strategy (read from the cpufreq module cache)
scaling_setspeed // Usable only when the governor is 'userspace'; write a frequency (unit: kHz) to this file to set the CPU frequency
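A few of these files are enough for a quick status summary. The sketch below prints the active governor and the cached current, minimum, and maximum frequencies in MHz. The function name show_cpufreq and its optional directory argument are assumptions made for illustration.

```shell
#!/bin/sh
# Print the governor and key frequencies (in MHz) for a cpufreq policy.
# $1: policy directory (hypothetical parameter; defaults to policy0)
show_cpufreq() {
    pol="${1:-/sys/devices/system/cpu/cpufreq/policy0}"
    printf 'governor: %s\n' "$(cat "$pol/scaling_governor")"
    for f in scaling_cur_freq scaling_min_freq scaling_max_freq; do
        khz=$(cat "$pol/$f")
        # The sysfs values are in kHz; integer division gives MHz.
        printf '%s: %d MHz\n' "$f" "$((khz / 1000))"
    done
}
```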
The Linux kernel used by the RDK system supports the following types of scaling strategies:
- Performance: It always keeps the CPU in the highest power consumption and highest performance state, which is the maximum frequency supported by the hardware.
- Powersave: It always keeps the CPU in the lowest power consumption and lowest performance state, which is the minimum frequency supported by the hardware.
- Ondemand: It periodically checks the load and adjusts the frequency accordingly. When the load is low, it adjusts to the minimum frequency that can meet the current load requirements. When the load is high, it immediately boosts to the highest performance state.
- Conservative: It is similar to the ondemand strategy. It periodically checks the load and adjusts the frequency accordingly. When the load is low, it adjusts to the minimum frequency that can meet the current load requirements. However, when the load is high, it gradually increases the frequency instead of immediately setting it to the highest performance state.
- Userspace: It exposes the control interface through sysfs to allow users to customize their own strategies. Users can manually adjust the frequency in the user space.
- Schedutil: Introduced in Linux 4.7, this strategy adjusts the frequency based on the CPU utilization information provided by the scheduler. Its effect is similar to the ondemand strategy, but it tracks the load more accurately because the scheduler has the most direct knowledge of CPU usage.
Users can control the CPU scaling strategy by modifying the corresponding files in the /sys/devices/system/cpu/cpufreq/policy0 directory.
For example, to set the CPU to performance mode:
sudo bash -c "echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor"
Or to set the CPU to a fixed frequency (1GHz):
sudo bash -c "echo userspace > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor"
sudo bash -c "echo 1000000 > /sys/devices/system/cpu/cpufreq/policy0/scaling_setspeed"
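Before pinning the CPU to a fixed frequency, the requested value can be checked against scaling_available_frequencies. The following is a sketch of one way to do that, not a tool shipped with the RDK system. The names freq_supported and pin_cpu_freq are hypothetical, and the script is assumed to run as root. Because boost frequencies are listed in a separate file, this check would reject them.

```shell
#!/bin/sh
# Return 0 if frequency $1 (kHz) is listed as supported by the policy.
# $2: policy directory (hypothetical parameter; defaults to policy0)
freq_supported() {
    pol="${2:-/sys/devices/system/cpu/cpufreq/policy0}"
    for f in $(cat "$pol/scaling_available_frequencies"); do
        [ "$f" = "$1" ] && return 0
    done
    return 1
}

# Switch to the userspace governor and pin the CPU to frequency $1 (kHz).
pin_cpu_freq() {
    pol="${2:-/sys/devices/system/cpu/cpufreq/policy0}"
    if ! freq_supported "$1" "$pol"; then
        echo "unsupported frequency: $1" >&2
        return 1
    fi
    echo userspace > "$pol/scaling_governor" &&
    echo "$1" > "$pol/scaling_setspeed"
}
```

For example, pin_cpu_freq 1000000 performs the same two writes as the commands above, provided 1000000 appears in the supported list.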
CPU Overclocking
The following content applies to the RDK X3 and RDK X3 Module development boards and does not apply to the RDK Ultra development board.
Video: https://www.youtube.com/watch?v=WqLxbN2qw-k&list=PLSxjn4YS2IuFUWcLGj2_uuCfLYnNYw6Ld&index=5
The development board uses the CPU Freq driver to manage the CPU operating state. The default mode is 'ondemand', in which the CPU frequency is adjusted dynamically based on the load to save power. Users can switch to the 'performance' mode to keep the CPU running at the highest frequency at all times. The command is as follows:
sudo bash -c 'echo performance > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor'
The system on the development board provides an overclocking function, which raises the maximum CPU frequency from 1.2 GHz to 1.5 GHz. The configuration command is as follows:
sudo bash -c 'echo 1 > /sys/devices/system/cpu/cpufreq/boost'
The CPU frequency configured by the above command only takes effect during current operation. If the device is restarted, it will return to the default configuration.
Overclocking the CPU increases the power consumption and heat generation of the chip. If stability issues occur, you can disable the overclocking function with the following command:
sudo bash -c 'echo 0 > /sys/devices/system/cpu/cpufreq/boost'
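The boost file can also be read back to confirm which state is active. A minimal sketch, assuming the file contains 1 when overclocking is enabled and 0 when it is disabled, matching the values written above. The function name boost_status is illustrative.

```shell
#!/bin/sh
# Report whether CPU boost (overclocking) is enabled.
# $1: boost file (hypothetical parameter; defaults to the path used above)
boost_status() {
    file="${1:-/sys/devices/system/cpu/cpufreq/boost}"
    case "$(cat "$file" 2>/dev/null)" in
        1) echo "boost enabled" ;;
        0) echo "boost disabled" ;;
        *) echo "boost state unknown" ;;
    esac
}
```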
You can use the sudo hrut_somstatus command to check the current chip operating frequency, temperature, and other status information.