Ticket #2380 (closed defect: fixed)

Opened 3 years ago

Last modified 3 years ago

pwmconfig doesn't detect correlations properly

Reported by: khali Owned by: khali
Priority: major Milestone: 3.2.0
Component: fancontrol Version: 3.1.2
Keywords: pwmconfig Cc:

Description

Original report from Charles Pillar on the lm-sensors list:
 http://lists.lm-sensors.org/pipermail/lm-sensors/2010-August/029234.html
Unfortunately the post is incomplete in the archive, so I duplicate it below.

I believe that I have found a bug in pwmconfig. I first observed this behavior many many months ago and couldn't find anyone else with the problem so I just assumed it was just me. I've since stumbled on it again so I decided to look into it myself. I don't know if anyone is aware of the behavior I am seeing, but here it is...

Take for example a board with two or more PWM controllable fans both which of which the speed can be measured. Thus I have:

/sys/class/hwmon/hwmon0/pwm1
/sys/class/hwmon/hwmon0/pwm1_enable
/sys/class/hwmon/hwmon0/fan1_input
/sys/class/hwmon/hwmon0/pwm2
/sys/class/hwmon/hwmon0/pwm2_enable
/sys/class/hwmon/hwmon0/fan2_input
(etc...)

pwm1 & pwm1_enable = fan1_input
pwm2 & pwm2_enable = fan2_input
(etc...)

I think this would be a fairly common set up? Indeed I have three machines that are setup this way (1 board has 2 fans, the other 2 have 3 fans each)

From what I can see, pwmconfig does this:

pwm1_enable=0
pwm2_enable=0
wait...
for each PWM:
   this pwm_enable=1
   this pwm=0
   for each fan
      compare this fan before / after
      this pwm_enable=0
      check fan returns to normal
   next fan
next pwm

The problem with this logic is that for each PWM, the pwm_enable is set to 1, then the first fan is tested, after the first fan is tested, the pwm is disabled and never re-enabled (until the next pwm)... This means the pwm1=fan1 correlation is detected, but pwm2=fan2 is not - but only because the pwm_enable is still set to 0 when the second and subsequent fans are tested...

There are several ways to fix this, but after some pondering, I think that this logic might be most appropriate:

pwm1_enable=0
pwm2_enable=0
wait...
for each PWM:
   for each fan
      this pwm_enable=1
      this pwm=0
      compare this fan before / after
      this pwm_enable=0
      check fan returns to normal
   next fan
next pwm

This causes the given pwm/fan to stop/start once for each fan input, which might seem bad, but then each fan is individually checked to make sure that it has begun spinning again after pwm_enable is set to 0... (as compared to stopping and starting once per pwm...) I have a patch which I can supply. It fixes the problem for me, but I'm not 100% sure if it introduces any other problems. Let me know if I should email it or submit somewhere etc...

Change History

Changed 3 years ago by khali

  • owner set to khali
  • priority changed from critical to major
  • status changed from new to assigned

In practice, the bug is hard to trigger, because most hwmon drivers cache the monitored values for a second, so the pwmconfig script gets an old (0) reading for correlated fans even though PWM has been disabled.

Changed 3 years ago by khali

  • status changed from assigned to closed
  • resolution set to fixed

Fixed in r5855.

Note: See TracTickets for help on using tickets.