PCI Power Management
~~~~~~~~~~~~~~~~~~~~

An overview of the concepts and the related functions in the Linux kernel

Patrick Mochel <mochel@transmeta.com>
(and others)

---------------------------------------------------------------------------

1. Overview
2. How The PCI Subsystem Handles Power Management
3. PCI Utility Functions
4. PCI Device Drivers
5. Resources

1. Overview
~~~~~~~~~~~

The PCI Power Management Specification was introduced between the PCI 2.1 and
PCI 2.2 Specifications. It defines a standard interface for controlling various
power management operations.

Implementation of the PCI PM Spec is optional, as are several sub-components of
it. If a device supports the PCI PM Spec, the device will have an 8-byte
capability field in its PCI configuration space. This field is used to describe
and control the standard PCI power management features.

The PCI PM spec defines four operating states for devices (D0 - D3) and for
buses (B0 - B3). The higher the number, the less power the device consumes.
However, the higher the number, the longer the latency is for the device to
return to an operational state (D0).

There are actually two D3 states. When someone talks about D3, they usually
mean D3hot, which corresponds to an ACPI D2 state (power is reduced, the
device may lose some context). But they may also mean D3cold, which is an
ACPI D3 state (power is fully off, all state is discarded); or both.

Bus power management is not covered in this version of this document.

Note that all PCI devices support D0 and D3cold by default, regardless of
whether or not they implement any of the PCI PM spec.

The possible state transitions that a device can undergo are:

+---------------------------+
| Current State | New State |
+---------------------------+
| D0            | D1, D2, D3|
+---------------------------+
| D1            | D2, D3    |
+---------------------------+
| D2            | D3        |
+---------------------------+
| D1, D2, D3    | D0        |
+---------------------------+

Note that when the system is entering a global suspend state, all devices will
be placed into D3 and when resuming, all devices will be placed into D0.
However, when the system is running, other state transitions are possible.
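
The transition table above can be read as a simple rule: a device may always
return to D0, and otherwise may only move to a deeper (higher-numbered) state.
The user-space C sketch below encodes that rule. The enum and the function
d_transition_allowed() are hypothetical names made up for illustration and are
not part of the kernel API. Rejecting a request for the state the device is
already in is an assumption, since the table does not list that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical numbering for this sketch: D0 = 0 ... D3 = 3. */
enum d_state { D0 = 0, D1, D2, D3 };

/*
 * Return true if the transition table permits moving from 'from' to 'to'.
 * Allowed: any low power state back to D0, or a move to a strictly
 * deeper (higher-numbered) state.
 */
static bool d_transition_allowed(enum d_state from, enum d_state to)
{
    assert(from <= D3 && to <= D3);

    if (from == to)
        return false;   /* assumption: not listed in the table */
    if (to == D0)
        return true;    /* D1, D2, D3 -> D0 */
    return to > from;   /* D0 -> D1/D2/D3, D1 -> D2/D3, D2 -> D3 */
}
```

For example, d_transition_allowed(D2, D1) is false, while
d_transition_allowed(D2, D0) and d_transition_allowed(D1, D3) are true.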

2. How The PCI Subsystem Handles Power Management
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The PCI suspend/resume functionality is accessed indirectly via the Power
Management subsystem. At boot, the PCI driver registers a power management
callback with that layer. Upon entering a suspend state, the PM layer iterates
through all of its registered callbacks. This currently takes place only during
APM state transitions.

Upon going to sleep, the PCI subsystem walks its device tree twice. Both times,
it does a depth-first walk of the device tree. The first walk saves each
device's state and checks for devices that will prevent the system from entering
a global power state. The next walk then places the devices in a low power
state.

The first walk allows a graceful recovery in the event of a failure, since none
of the devices have actually been powered down.

In both walks, in particular the second, all children of a bridge are touched
before the actual bridge itself. This allows the bridge to retain power while
its children are being accessed.

Upon resuming from sleep, just the opposite must be true: all bridges must be
powered on and restored before their children are powered on. This is easily
accomplished with a breadth-first walk of the PCI device tree.
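
The two orderings described above can be illustrated on a toy tree. The sketch
below is plain user-space C, not kernel code: struct node, struct trace,
suspend_walk() and resume_walk() are hypothetical names, and the fixed-size
queue is a simplification. It only records the order in which nodes would be
visited; it does not model saving state or changing power.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16

/* Toy device node; this is not the kernel's struct pci_dev. */
struct node {
    int id;
    struct node *child;     /* first device behind this bridge */
    struct node *sibling;   /* next device on the same bus */
};

/* Records the order in which nodes are visited. */
struct trace {
    int ids[MAX_NODES];
    size_t len;
};

static void visit(struct trace *t, const struct node *n)
{
    assert(t->len < MAX_NODES);
    t->ids[t->len++] = n->id;
}

/* Suspend order: depth-first, children before the bridge above them. */
static void suspend_walk(const struct node *n, struct trace *t)
{
    for (; n; n = n->sibling) {
        suspend_walk(n->child, t);
        visit(t, n);
    }
}

/* Resume order: breadth-first, each bus level before the level below. */
static void resume_walk(const struct node *root, struct trace *t)
{
    const struct node *queue[MAX_NODES];
    size_t head = 0, tail = 0;

    for (const struct node *n = root; n; n = n->sibling) {
        assert(tail < MAX_NODES);
        queue[tail++] = n;
    }
    while (head < tail) {
        const struct node *n = queue[head++];

        visit(t, n);
        for (const struct node *c = n->child; c; c = c->sibling) {
            assert(tail < MAX_NODES);
            queue[tail++] = c;
        }
    }
}
```

For a bridge 1 with children 2 and 3, where 3 is itself a bridge to device 4,
suspend_walk() visits 2, 4, 3, 1 and resume_walk() visits 1, 2, 3, 4. In both
cases a bridge is powered whenever one of its children is being touched.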


3. PCI Utility Functions
~~~~~~~~~~~~~~~~~~~~~~~~

These are helper functions designed to be called by individual device drivers.
Assuming that a device behaves as advertised, these should be applicable in most
cases. However, results may vary.

Note that these functions are never implicitly called for the driver. The driver
is always responsible for deciding when and if to call these.


pci_save_state
--------------

Usage:
        pci_save_state(struct pci_dev *dev);

Description:
        Save first 64 bytes of PCI config space, along with any additional
        PCI-Express or PCI-X information.


pci_restore_state
-----------------

Usage:
        pci_restore_state(struct pci_dev *dev);

Description:
        Restore previously saved config space.


pci_set_power_state
-------------------

Usage:
        pci_set_power_state(struct pci_dev *dev, pci_power_t state);

Description:
        Transition device to low power state using PCI PM Capabilities
        registers.

        Will fail under one of the following conditions:
        - If state is less than current state, but not D0 (illegal transition)
        - Device doesn't support PM Capabilities
        - Device does not support requested state
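
The three failure conditions listed above can be modelled in a few lines. The
following is a user-space sketch under stated assumptions, not the kernel's
implementation: struct toy_dev, the result enum and toy_set_power_state() are
hypothetical, the checks simply follow the order of the list, and the result
codes stand in for whatever error values the real function returns. The D1 and
D2 support bits (bits 9 and 10 of the PMC register) are taken from the PCI PM
spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PMC register bits advertising optional D1 and D2 support. */
#define PMC_D1_SUPPORT 0x0200
#define PMC_D2_SUPPORT 0x0400

/* Hypothetical result codes for this sketch only. */
enum set_state_result { SET_OK, SET_ILLEGAL, SET_NO_PM_CAP, SET_UNSUPPORTED };

/* Toy device model; this is not the kernel's struct pci_dev. */
struct toy_dev {
    bool has_pm_cap;    /* PM Capabilities structure present */
    uint16_t pmc;       /* contents of the PMC register */
    int current_state;  /* 0..3 for D0..D3hot */
};

static enum set_state_result toy_set_power_state(struct toy_dev *dev, int state)
{
    assert(state >= 0 && state <= 3);

    /* Moving to a shallower state is only legal when the target is D0. */
    if (state < dev->current_state && state != 0)
        return SET_ILLEGAL;
    if (!dev->has_pm_cap)
        return SET_NO_PM_CAP;
    /* D1 and D2 are optional and must be advertised in the PMC register. */
    if ((state == 1 && !(dev->pmc & PMC_D1_SUPPORT)) ||
        (state == 2 && !(dev->pmc & PMC_D2_SUPPORT)))
        return SET_UNSUPPORTED;

    dev->current_state = state;
    return SET_OK;
}
```

A device that advertises only D1 would accept requests for D1 and D3 from D0,
reject D2 as unsupported, and reject a move from D3 to D1 as illegal.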


pci_enable_wake
---------------

Usage:
        pci_enable_wake(struct pci_dev *dev, pci_power_t state, int enable);

Description:
        Enable device to generate PME# during low power state using PCI PM
        Capabilities.

        Checks whether the device supports generating PME# from the requested
        state and fails if it does not, unless enable == 0 (the request is to
        disable wake events, which is implicit if the device doesn't support
        them in the first place).

        Note that the PMC Register in the device's PM Capabilities has a bitmask
        of the states it supports generating PME# from. D3hot is bit 3 and
        D3cold is bit 4. So, while a value of 4 as the state may not seem
        semantically correct, it is.


4. PCI Device Drivers
~~~~~~~~~~~~~~~~~~~~~

These functions are intended for use by individual drivers, and are defined in
struct pci_driver:

        int  (*suspend) (struct pci_dev *dev, pm_message_t state);
        int  (*resume) (struct pci_dev *dev);
        int  (*enable_wake) (struct pci_dev *dev, pci_power_t state, int enable);


suspend
-------

Usage:

if (dev->driver && dev->driver->suspend)
        dev->driver->suspend(dev, state);

A driver uses this function to actually transition the device into a low power
state. This should include disabling I/O, IRQs, and bus-mastering, as well as
physically transitioning the device to a lower power state; it may also include
calls to pci_enable_wake().

Bus mastering may be disabled by doing:

pci_disable_device(dev);

For devices that support the PCI PM Spec, this may be used to set the device's
power state to match the suspend() parameter:

pci_set_power_state(dev, state);

The driver is also responsible for disabling any other device-specific features
(e.g. blanking the screen, turning off on-card memory, etc.).

The driver should be sure to track the current state of the device, as it may
obviate the need for some operations.

The driver should update the current_state field in its pci_dev structure in
this function, except for PM-capable devices when pci_set_power_state is used.

resume
------

Usage:

if (dev->driver && dev->driver->resume)
        dev->driver->resume(dev);

The resume callback may be called from any power state, and is always meant to
transition the device to the D0 state.

The driver is responsible for reenabling any features of the device that had
been disabled during previous suspend calls, such as IRQs and bus mastering,
as well as calling pci_restore_state().

If the device is currently in D3, it may need to be reinitialized in resume().

  * Some types of devices, like bus controllers, will preserve context in D3hot
    (using Vcc power). Their drivers will often want to avoid re-initializing
    them after re-entering D0 (perhaps to avoid resetting downstream devices).

  * Other kinds of devices in D3hot will discard device context as part of a
    soft reset when re-entering the D0 state.

  * Devices resuming from D3cold always go through a power-on reset. Some
    device context can also be preserved using Vaux power.

  * Some systems hide D3cold resume paths from drivers. For example, on PCs
    the resume path for suspend-to-disk often runs BIOS powerup code, which
    will sometimes re-initialize the device.

To handle resets during D3 to D0 transitions, it may be convenient to share
device initialization code between probe() and resume(). Device parameters
can also be saved before the driver suspends into D3, avoiding re-probe.

If the device supports the PCI PM Spec, it can use this to physically transition
the device to D0:

pci_set_power_state(dev, 0);

Note that if the entire system is transitioning out of a global sleep state, all
devices will be placed in the D0 state, so this is not necessary. However, in
the event that the device is placed in the D3 state during normal operation,
this call is necessary. It is impossible to determine which of the two events is
taking place in the driver, so it is always a good idea to make that call.

The driver should take note of the state that it is resuming from in order to
ensure correct (and speedy) operation.

The driver should update the current_state field in its pci_dev structure in
this function, except for PM-capable devices when pci_set_power_state is used.


enable_wake
-----------

Usage:

if (dev->driver && dev->driver->enable_wake)
        dev->driver->enable_wake(dev, state, enable);

This callback is generally only relevant for devices that support the PCI PM
spec and have the ability to generate a PME# (Power Management Event Signal)
to wake the system up. (However, it is possible that a device may support
some non-standard way of generating a wake event on sleep.)

Bits 15:11 of the PMC (Power Mgmt Capabilities) Register in a device's
PM Capabilities describe what power states the device supports generating a
wake event from:

+------------------+
| Bit |  State     |
+------------------+
|  11 |   D0       |
|  12 |   D1       |
|  13 |   D2       |
|  14 |   D3hot    |
|  15 |   D3cold   |
+------------------+

A device can use this to enable wake events:

        pci_enable_wake(dev, state, enable);

Note that to enable PME# from D3cold, a value of 4 should be passed to
pci_enable_wake (since it uses an index into a bitmask). If a driver gets
a request to enable wake events from D3, two calls should be made to
pci_enable_wake (one for both D3hot and D3cold).
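
The "index into a bitmask" remark can be made concrete with a small user-space
sketch. The macro names and pme_supported_from() below are hypothetical and
chosen for illustration; only the bit positions come from the table above. The
state argument is the index into the PME support field, so 3 means D3hot and 4
means D3cold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PMC_PME_SHIFT 11        /* PME support field starts at bit 11 */
#define PMC_PME_MASK  0xF800    /* bits 15:11 of the PMC register */

/*
 * 'state' indexes the PME support field: 0..2 for D0..D2, 3 for D3hot,
 * 4 for D3cold. Returns true if the device advertises PME# from it.
 */
static bool pme_supported_from(uint16_t pmc, int state)
{
    assert(state >= 0 && state <= 4);
    return ((pmc & PMC_PME_MASK) >> PMC_PME_SHIFT) & (1u << state);
}
```

With a PMC value of 0xC800 (bits 15, 14 and 11 set), the helper reports PME#
support from D0, D3hot and D3cold but not from D1 or D2. A driver asked to
wake from "D3" would check indices 3 and 4 separately, matching the two
pci_enable_wake calls described above.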


A reference implementation
--------------------------
.suspend()
{
        /* driver specific operations */

        /* Disable IRQ */
        free_irq();
        /* If using MSI */
        pci_disable_msi();

        pci_save_state();
        pci_enable_wake();
        /* Disable IO/bus master/irq router */
        pci_disable_device();
        pci_set_power_state(pci_choose_state());
}

.resume()
{
        pci_set_power_state(PCI_D0);
        pci_restore_state();
        /* the device's irq may have changed; the driver should handle that */
        pci_enable_device();
        pci_set_master();

        /* if using MSI, the device's vector may have changed */
        pci_enable_msi();

        request_irq();
        /* driver specific operations */
}

This is a typical implementation. Drivers can slightly change the order
of the operations, skip some of them, or add more driver specific
operations, but on the whole drivers should do something like this.

5. Resources
~~~~~~~~~~~~

PCI Local Bus Specification
PCI Bus Power Management Interface Specification

  http://www.pcisig.com