1 Devfs (Device File System) FAQ
2
3
4 Linux Devfs (Device File System) FAQ
5 Richard Gooch
6 3-JUL-2000
7
8 -----------------------------------------------------------------------------
9
10 NOTE: the master copy of this document is available online at:
11
12 http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html
13 and looks much better than the text version distributed with the
14 kernel sources.
15
16 There is also an optional daemon that may be used with devfs. You can
17 find out more about it at:
18
19 http://www.atnf.csiro.au/~rgooch/linux/
20
21 NEWSFLASH: The official 2.3.46 kernel has
22 included the devfs patch. Future patches will be released which
23 build on this. These patches are rolled into Linus' tree from time to
24 time.
25
26 A mailing list is available which you may subscribe to. Send
27 email
28 to majordomo@oss.sgi.com with the following line in the
29 body of the message:
30 subscribe devfs
31 The list is archived at
32
33 http://oss.sgi.com/projects/devfs/archive/.
34
35 -----------------------------------------------------------------------------
36
37 Contents
38
39
40 What is it?
41
42 Why do it?
43
44 Who else does it?
45
46 How it works
47
48 Operational issues (essential reading)
49
50 Instructions for the impatient
51 Permissions persistence across reboots
52 Dealing with drivers without devfs support
53 All the way with Devfs
54 Other Issues
55 Kernel Naming Scheme
56 Devfsd Naming Scheme
57 SCSI Host Probing Issues
58
59
60
61 Device drivers currently ported
62
63 Allocation of Device Numbers
64
65 Questions and Answers
66
67 Making things work
68 Alternatives to devfs
69
70
71 Other resources
72
73
74 -----------------------------------------------------------------------------
75
76
77 What is it?
78
79 Devfs is an alternative to "real" character and block special devices
80 on your root filesystem. Kernel device drivers can register devices by
81 name rather than major and minor numbers. These devices will appear in
82 devfs automatically, with whatever default ownership and
83 protection the driver specified. A daemon (devfsd) can be used to
84 override these defaults.
85
86 NOTE that devfs is entirely optional. If you prefer the old
87 disc-based device nodes, then simply leave CONFIG_DEVFS_FS=n (the
88 default). In this case, nothing will change. ALSO NOTE that if you do
89 enable devfs, the defaults are such that full compatibility is
90 maintained with the old device names.
91
92 There are two aspects to devfs: one is the underlying device
93 namespace, which is a namespace just like any mounted filesystem. The
94 other aspect is the filesystem code which provides a view of the
95 device namespace. The reason I make a distinction is because devfs
96 can be mounted many times, with each mount showing the same device
97 namespace. Changes made are global to all mounted devfs filesystems.
98 Also, because the devfs namespace exists without any devfs mounts, you
99 can easily mount the root filesystem by referring to an entry in the
100 devfs namespace.
101
102 The cost of devfs is a small increase in kernel code size and memory
103 usage: about 7 pages of code (some of that in __init sections) and 72
104 bytes for each entry in the namespace. A modest system has only a
105 couple of hundred device entries, so this costs a few more
106 pages. Compare this with the suggestion to put /dev on a
107 ramdisc.
108
109 On a typical machine, the cost is under 0.2 percent. On a modest
110 system with 64 MBytes of RAM, the cost is under 0.1 percent. The
111 accusations of "bloatware" levelled at devfs are not justified.
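
The figures above can be checked with a little arithmetic. The sketch below is only an illustration: the 4096 byte page size (as on x86) and the count of 200 entries are assumptions, and the function names are made up for this example.

```python
# Rough estimate of the devfs memory cost, using the figures quoted above.
PAGE_SIZE = 4096       # assumption: x86 page size in bytes
CODE_PAGES = 7         # "about 7 pages of code"
BYTES_PER_ENTRY = 72   # "72 bytes for each entry in the namespace"


def devfs_overhead_bytes(entries):
    """Return the approximate number of bytes devfs costs for `entries` nodes."""
    return CODE_PAGES * PAGE_SIZE + entries * BYTES_PER_ENTRY


def overhead_percent(entries, ram_bytes):
    """Return the overhead as a percentage of total RAM."""
    return 100.0 * devfs_overhead_bytes(entries) / ram_bytes


if __name__ == "__main__":
    ram = 64 * 1024 * 1024                       # a 64 MByte machine
    print(devfs_overhead_bytes(200))             # 200 entries is a hypothetical count
    print(round(overhead_percent(200, ram), 3))  # percentage of RAM
```

With these assumptions the total comes to 43072 bytes, which is roughly 0.064 percent of 64 MBytes.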
112
113 -----------------------------------------------------------------------------
114
115
116 Why do it?
117
118 There are several problems that devfs addresses. Some of these
119 problems are more serious than others (depending on your point of
120 view), and some can be solved without devfs. However, the totality of
121 these problems really calls out for devfs.
122
123 The choice is between a patchwork of inefficient user space solutions,
124 which are complex and likely to be fragile, and a simple and efficient
125 devfs which is robust.
126
127 There have been many counter-proposals to devfs, all seeking to
128 provide some of the benefits without actually implementing devfs. So
129 far there has been an absence of code and no proposed alternative has
130 been able to provide all the features that devfs does. Further,
131 alternative proposals require far more complexity in user-space (and
132 still deliver less functionality than devfs). Some people have the
133 mantra of reducing "kernel bloat", but don't consider the effects on
134 user-space.
135
136 A good solution limits the total complexity of kernel-space and
137 user-space.
138
139
140 Major&minor allocation
141
142 The existing scheme requires the allocation of major and minor device
143 numbers for each and every device. This means that a central
144 co-ordinating authority is required to issue these device numbers
145 (unless you're developing a "private" device driver), in order to
146 preserve uniqueness. Devfs shifts the burden to a namespace. This may
147 not seem like a huge benefit, but actually it is. Since driver authors
148 will naturally choose a device name which reflects the functionality
149 of the device, there is far less potential for namespace conflict.
150 Solving this requires a kernel change.
151
152 /dev management
153
154 Because you currently access devices through device nodes, these must
155 be created by the system administrator. For standard devices you can
156 usually find a MAKEDEV programme which creates all these (hundreds!)
157 of nodes. This means that changes in the kernel must be reflected by
158 changes in the MAKEDEV programme, or else the system administrator
159 creates device nodes by hand.
160 The basic problem is that there are two separate databases of
161 major and minor numbers. One is in the kernel and one is in /dev (or
162 in a MAKEDEV programme, if you want to look at it that way). This is
163 duplication of information, which is not good practice.
164 Solving this requires a kernel change.
165
166 /dev growth
167
168 A typical /dev has over 1200 nodes! Most of these devices simply don't
169 exist because the hardware is not available. A huge /dev increases the
170 time to access devices (I'm just referring to the dentry lookup times
171 and the time taken to read inodes off disc: the next subsection shows
172 some more horrors).
173
174 An example of how big /dev can grow is if we consider SCSI devices:
175
176 host        6 bits (say up to 64 hosts on a really big machine)
177 channel     4 bits (say up to 16 SCSI buses per host)
178 id          4 bits
179 lun         3 bits
180 partition   6 bits
181 TOTAL      23 bits
182
183
184 This requires 8 Mega (1024*1024) inodes if we want to store all
185 possible device nodes. Even if we scrap everything but id,partition
186 and assume a single host adapter with a single SCSI bus and only one
187 logical unit per SCSI target (id), that's still 10 bits or 1024
188 inodes. Each VFS inode takes around 256 bytes (kernel 2.1.78), so
189 that's 256 kBytes of inode storage on disc (assuming real inodes take
190 a similar amount of space as VFS inodes). This is actually not so bad,
191 because disc is cheap these days. Embedded systems would care about
192 256 kBytes of /dev inodes, but you could argue that embedded systems
193 would have hand-tuned /dev directories. I've had to do just that on my
194 embedded systems, but I would rather just leave it to devfs.
195 Another issue is the time taken to look up an inode when first
196 referenced. Not only does this take time in scanning through a list in
197 memory, but also the seek times to read the inodes off disc.
198 This could be solved in user-space using a clever programme which
199 scanned the kernel logs and deleted /dev entries which are not
200 available and created them when they were available. This programme
201 would need to be run every time a new module was loaded, which would
202 slow things down a lot.
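
The SCSI numbers above follow directly from the field widths. A minimal sketch of the arithmetic, using the bit counts from the table and the approximate 256 byte inode size quoted in the text (the helper names are invented for this example):

```python
# Field widths from the SCSI example above (in bits).
SCSI_FIELDS = {"host": 6, "channel": 4, "id": 4, "lun": 3, "partition": 6}
INODE_BYTES = 256  # approximate VFS inode size quoted above (kernel 2.1.78)


def node_count(fields, names=None):
    """Number of distinct device nodes addressable by the chosen fields."""
    names = fields.keys() if names is None else names
    bits = sum(fields[name] for name in names)
    return 1 << bits


def inode_storage_bytes(nodes):
    """Disc space taken by that many inodes, at INODE_BYTES each."""
    return nodes * INODE_BYTES


if __name__ == "__main__":
    print(node_count(SCSI_FIELDS))                       # all 23 bits
    reduced = node_count(SCSI_FIELDS, ["id", "partition"])
    print(reduced, inode_storage_bytes(reduced))         # id and partition only
```

All 23 bits give 8 * 1024 * 1024 nodes. Keeping only id and partition leaves 1024 nodes, or 256 kBytes of inodes.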
203
204 There is an existing programme called scsidev which will automatically
205 create device nodes for SCSI devices. It can do this by scanning files
206 in /proc/scsi. Unfortunately, to extend this idea to other device
207 nodes would require significant modifications to existing drivers (so
208 they too would provide information in /proc). This is a non-trivial
209 change (I should know: devfs has had to do something similar). Once
210 you go to this much effort, you may as well use devfs itself (which
211 also provides this information). Furthermore, such a system would
212 likely be implemented in an ad-hoc fashion, as different drivers will
213 provide their information in different ways.
214
215 Devfs is much cleaner, because it (naturally) has a uniform mechanism
216 to provide this information: the device nodes themselves!
217
218
219 Node to driver file_operations translation
220
221 There is an important difference between the way disc-based character
222 and block nodes and devfs entries make the connection between an entry
223 in /dev and the actual device driver.
224
225 With the current 8 bit major and minor numbers the connection between
226 disc-based c&b nodes and per-major drivers is done through a
227 fixed-length table of 128 entries. The various filesystem types set
228 the inode operations for c&b nodes to {chr,blk}dev_inode_operations,
229 so when a device is opened a few quick levels of indirection bring us
230 to the driver file_operations.
231
232 For miscellaneous character devices a second step is required: there
233 is a scan for the driver entry with the same minor number as the file
234 that was opened, and the appropriate minor open method is called. This
235 scanning is done *every time* you open a device node. Potentially, you
236 may be searching through dozens of misc. entries before you find your
237 open method. While not an enormous performance overhead, this does
238 seem pointless.
239
240 Linux *must* move beyond the 8 bit major and minor barrier,
241 somehow. If we simply increase each to 16 bits, then the indexing
242 scheme used for major driver lookup becomes untenable, because the
243 major tables (one each for character and block devices) would need to
244 be 64 k entries long (512 kBytes on x86, 1 MByte for 64 bit
245 systems). So we would have to use a scheme like that used for
246 miscellaneous character devices, which means the search time goes up
247 linearly with the average number of major device drivers on your
248 system. Not all "devices" are hardware, some are higher-level drivers
249 like KGI, so you can get more "devices" without adding hardware.
250 You can improve this by creating an ordered (balanced:-)
251 binary tree, in which case your search time becomes log(N).
252 Alternatively, you can use hashing to speed up the search.
253 But why do that search at all if you don't have to? Once again, it
254 seems pointless.
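
The table sizes quoted above can be reproduced with a short calculation. This sketch assumes each table entry holds two pointers (a driver name and a file_operations pointer); that layout is an assumption made here because it matches the quoted sizes, and the function name is hypothetical.

```python
def major_table_bytes(major_bits, pointer_bytes, pointers_per_entry=2):
    """Size of a directly indexed per-major table.

    pointers_per_entry=2 is an assumption: one name pointer plus one
    file_operations pointer per entry.
    """
    entries = 1 << major_bits
    return entries * pointer_bytes * pointers_per_entry


if __name__ == "__main__":
    print(major_table_bytes(16, 4))  # 16 bit majors, 32 bit pointers (x86)
    print(major_table_bytes(16, 8))  # 16 bit majors, 64 bit pointers
```

Under that assumption a 16 bit major needs 512 kBytes per table with 32 bit pointers and 1 MByte with 64 bit pointers, and there is one such table each for character and block devices.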
255
256 Note that devfs doesn't use the major&minor system. For devfs
257 entries, the connection is done when you look up the /dev entry. When
258 devfs_register() is called, an internal table is appended which has
259 the entry name and the file_operations. If the dentry cache doesn't
260 have the /dev entry already, this internal table is scanned to get the
261 file_operations, and an inode is created. If the dentry cache already
262 has the entry, there is *no lookup time* (other than the dentry scan
263 itself, but we can't avoid that anyway, and besides Linux dentries
264 cream other OS's which don't have them:-). Furthermore, the number of
265 node entries in a devfs is only the number of available device
266 entries, not the number of *conceivable* entries. Even if you remove
267 unnecessary entries in a disc-based /dev, the number of conceivable
268 entries remains the same: you just limit yourself in order to save
269 space.
270
271 Devfs provides a fast connection between a VFS node and the device
272 driver, in a scalable way.
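
To make the lookup path concrete, here is a toy user-space model of it. It is not kernel code: the class, its attributes and the string standing in for a file_operations structure are all invented for illustration. Only the behaviour (append on registration, scan the table on a cache miss, no scan on a cache hit) is taken from the description above.

```python
class DevfsModel:
    """Toy user-space model of the name-to-driver lookup described above."""

    def __init__(self):
        self._table = []       # internal table of (name, file_operations) pairs
        self._dcache = {}      # stands in for the dentry cache
        self.table_scans = 0   # how many times the internal table was scanned

    def register(self, name, file_operations):
        """Model of devfs_register(): append the entry to the internal table."""
        self._table.append((name, file_operations))

    def lookup(self, name):
        """Return the file_operations for `name`, or None if not registered."""
        if name in self._dcache:           # cached: no table scan needed
            return self._dcache[name]
        self.table_scans += 1
        for entry_name, fops in self._table:
            if entry_name == name:
                self._dcache[name] = fops  # "create the inode" and cache it
                return fops
        return None


if __name__ == "__main__":
    devfs = DevfsModel()
    devfs.register("null", "null_fops")    # placeholder for a real structure
    devfs.lookup("null")
    devfs.lookup("null")
    print(devfs.table_scans)               # the second lookup hit the cache
```

The point of the model is that the table only ever holds registered (available) devices, and that a repeated lookup never reaches the table at all.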
273
274 /dev as a system administration tool
275
276 Right now /dev contains a list of conceivable devices, most of which I
277 don't have. A devfs would only show those devices available on my
278 system. This means that listing /dev would be a handy way of checking
279 what devices were available.
280
281 Major&minor size
282
283 Existing major and minor numbers are limited to 8 bits each. This is
284 now a limiting factor for some drivers, particularly the SCSI disc
285 driver, which consumes a single major number. Only 16 discs are
286 supported, and each disc may have only 15 partitions. Maybe this isn't
287 a problem for you, but some of us are building huge Linux systems with
288 disc arrays. With devfs an arbitrary pointer can be associated with
289 each device entry, which can be used to give an effective 32 bit
290 device identifier (i.e. that's like having a 32 bit minor
291 number). Since this is private to the kernel, there are no C library
292 compatibility issues which you would have with increasing major and minor
293 number sizes. See the section on "Allocation of Device Numbers" for
294 details on maintaining compatibility with userspace.
295
296 Solving this requires a kernel change.
297
298 Since writing this, the kernel has been modified so that the SCSI disc
299 driver has more major numbers allocated to it and now supports up to
300 128 discs. Since these major numbers are non-contiguous (a result of
301 unplanned expansion), the implementation is a little more cumbersome
302 than originally.
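
The disc limits mentioned in this subsection come from how an 8 bit minor number is divided up. The sketch below assumes the low 4 bits of the minor select the partition (with 0 meaning the whole disc); that split is inferred from the figures of 16 discs and 15 partitions, and the function names are hypothetical.

```python
MINOR_BITS = 8       # minor numbers are 8 bits wide
PARTITION_BITS = 4   # assumption: low 4 bits of the minor select the partition


def discs_per_major():
    """Discs addressable with a single major number."""
    return 1 << (MINOR_BITS - PARTITION_BITS)


def partitions_per_disc():
    """Partitions per disc; one slot is used for the whole disc."""
    return (1 << PARTITION_BITS) - 1


def majors_needed(discs):
    """Major numbers required to address `discs` discs (rounded up)."""
    per_major = discs_per_major()
    return (discs + per_major - 1) // per_major


if __name__ == "__main__":
    print(discs_per_major(), partitions_per_disc())
    print(majors_needed(128))
```

So one major covers 16 discs of 15 partitions each, and supporting 128 discs takes 8 major numbers.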
303
304 Just like the changes to IPv4 to fix impending limitations in the
305 address space, people find ways around the limitations. In the long
306 run, however, solutions like IPv6 or devfs can't be put off forever.
307
308 Read-only root filesystem
309
310 Having your device nodes on the root filesystem means that you can't
311 operate properly with a read-only root filesystem. This is because you
312 want to change ownerships and protections of tty devices. Existing
313 practice prevents you using a CD-ROM as your root filesystem for a
314 *real* system. Sure, you can boot off a CD-ROM, but you can't change
315 tty ownerships, so it's only good for installing.
316
317 Also, you can't use a shared NFS root filesystem for a cluster of
318 discless Linux machines (having tty ownerships changed on a common
319 /dev is not good). Nor can you embed your root filesystem in a
320 ROM-FS.
321
322 You can get around this by creating a RAMDISC at boot time, making
323 an ext2 filesystem in it, mounting it somewhere and copying the
324 contents of /dev into it, then unmounting it and mounting it over
325 /dev.
326
327 A devfs is a cleaner way of solving this.
328
329 Non-Unix root filesystem
330
331 Non-Unix filesystems (such as NTFS) can't be used for a root
332 filesystem because they variously don't support character and block
333 special files or symbolic links. You can't have a separate disc-based
334 or RAMDISC-based filesystem mounted on /dev because you need device
335 nodes before you can mount these. Devfs can be mounted without any
336 device nodes. Devlinks won't work because symlinks aren't supported.
337 An alternative solution is to use initrd to mount a RAMDISC initial
338 root filesystem (which is populated with a minimal set of device
339 nodes), and then construct a new /dev in another RAMDISC, and finally
340 switch to your non-Unix root filesystem. This requires clever boot
341 scripts and a fragile and conceptually complex boot procedure.
342
343 Devfs solves this in a robust and conceptually simple way.
344
345 PTY security
346
347 Current pseudo-tty (pty) devices are owned by root and read-writable
348 by everyone. The user of a pty-pair cannot change
349 ownership/protections without being suid-root.
350
351 This could be solved with a secure user-space daemon which runs as
352 root and does the actual creation of pty-pairs. Such a daemon would
353 require modification to *every* programme that wants to use this new
354 mechanism. It also slows down creation of pty-pairs.
355
356 An alternative is to create a new open_pty() syscall which does much
357 the same thing as the user-space daemon. Once again, this requires
358 modifications to pty-handling programmes.
359
360 The devfs solution allows a device driver to "tag" certain device
361 files so that when an unopened device is opened, the ownerships are
362 changed to the current euid and egid of the opening process, and the
363 protections are changed to the default registered by the driver. When
364 the device is closed ownership is set back to root and protections are
365 set back to read-write for everybody. No programme need be changed.
366 The devpts filesystem provides this auto-ownership feature for Unix98
367 ptys. It doesn't support old-style pty devices, nor does it have all
368 the other features of devfs.
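
The auto-ownership behaviour can be summarised as a small state machine. The following is a toy model only: the class is invented for this example, and treating "closed" as "the last opener has closed" is an assumption, since the text does not spell out the multi-opener case.

```python
class AutoOwnedNode:
    """Toy model of a device node tagged for automatic ownership."""

    def __init__(self, default_mode):
        self.default_mode = default_mode  # protection registered by the driver
        self.uid, self.gid = 0, 0         # owned by root while unopened
        self.mode = 0o666                 # read-write for everybody
        self.open_count = 0

    def open(self, euid, egid):
        """Opening an unopened node hands it to the opening process."""
        if self.open_count == 0:
            self.uid, self.gid = euid, egid
            self.mode = self.default_mode
        self.open_count += 1

    def close(self):
        """Assumption: ownership reverts when the last opener closes."""
        if self.open_count == 0:
            raise ValueError("node is not open")
        self.open_count -= 1
        if self.open_count == 0:
            self.uid, self.gid = 0, 0
            self.mode = 0o666


if __name__ == "__main__":
    pty = AutoOwnedNode(default_mode=0o600)
    pty.open(euid=1000, egid=100)          # example user and group IDs
    print(pty.uid, pty.gid, oct(pty.mode))
    pty.close()
    print(pty.uid, pty.gid, oct(pty.mode))
```

Nothing in the opening programme has to change: the transitions happen entirely on open and close.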
369
370 Intelligent device management
371
372 Devfs implements a simple yet powerful protocol for communication with
373 a device management daemon (devfsd) which runs in user space. It is
374 possible to send a message (either synchronously or asynchronously) to
375 devfsd on any event, such as registration/unregistration of device
376 entries, opening and closing devices, looking up inodes, scanning
377 directories and more. This has many possibilities. Some of these are
378 already implemented.
379
380 See:
381 http://www.atnf.csiro.au/~rgooch/linux/
382
383 Device entry registration events can be used by devfsd to change
384 permissions of newly-created device nodes. This is one mechanism to
385 control device permissions.
386
387 Device entry registration/unregistration events can be used to run
388 programmes or scripts. This can be used to provide automatic mounting
389 of filesystems when new block device media is inserted into the
390 drive.
391
392 Asynchronous device open and close events can be used to implement
393 clever permissions management. For example, the default permissions on
394 /dev/dsp do not allow everybody to read from the device. This is
395 sensible, as you don't want some remote user recording what you say at
396 your console. However, the console user is also prevented from
397 recording. This behaviour is not desirable. With asynchronous device
398 open and close events, you can have devfsd run a programme or script
399 when console devices are opened to change the ownerships for *other*
400 device nodes (such as /dev/dsp). On closure, you can run a different
401 script to restore permissions. An advantage of this scheme over
402 modifying the C library tty handling is that this works even if your
403 programme crashes (how many times have you seen the utmp database with
404 lingering entries for non-existent logins?).
405
406 Synchronous device open events can be used to perform intelligent
407 device access protections. Before the device driver open() method is
408 called, the daemon must first validate the open attempt, by running an
409 external programme or script. This is far more flexible than access
410 control lists, as access can be determined on the basis of other
411 system conditions instead of just the UID and GID.
412
413 Inode lookup events can be used to authenticate module autoload
414 requests. Instead of using kmod directly, the event is sent to
415 devfsd which can implement an arbitrary authentication before loading
416 the module itself.
417 Inode lookup events can also be used to construct arbitrary
418 namespaces, without having to resort to populating devfs with symlinks
419 to devices that don't exist.
420
421 Speculative Device Scanning
422
423 Consider an application (like cdparanoia) that wants to find all
424 CD-ROM devices on the system (SCSI, IDE and other types), whether or
425 not their respective modules are loaded. The application must
426 speculatively open certain device nodes (such as /dev/sr0 for the SCSI
427 CD-ROMs) in order to make sure the module is loaded. This requires
428 that all Linux distributions follow the standard device naming scheme
429 (last time I looked RedHat did things differently). Devfs solves the
430 naming problem.
431
432 The same application also wants to see which devices are actually
433 available on the system. With the existing system it needs to read the
434 /dev directory and speculatively open each /dev/sr* device to
435 determine if the device exists or not. With a large /dev this is an
436 inefficient operation, especially if there are many /dev/sr* nodes. A
437 solution like scsidev could reduce the number of /dev/sr* entries (but
438 of course that also requires all that inefficient directory scanning).
439
440 With devfs, the application can open the /dev/sr directory
441 (which triggers the module autoloading if required), and proceed to
442 read /dev/sr. Since only the available devices will have
443 entries, there are no inefficiencies in directory scanning or device
444 openings.
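
From the application's side, the devfs approach reduces to reading one directory. A minimal sketch, assuming only the /dev/sr directory named above (the function name is invented, and the code returns an empty list on a system without such a directory):

```python
import os


def available_devices(directory):
    """Return the sorted entry names in `directory`, or [] if it is absent."""
    try:
        return sorted(os.listdir(directory))
    except (FileNotFoundError, NotADirectoryError):
        return []


if __name__ == "__main__":
    # /dev/sr is the directory named in the text; it only exists under devfs.
    for name in available_devices("/dev/sr"):
        print(os.path.join("/dev/sr", name))
```

There is no speculative open of each candidate node: every name returned corresponds to a device that is actually present.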
445
446 -----------------------------------------------------------------------------
447
448 Who else does it?
449
450 FreeBSD has a devfs implementation. Solaris 2 has a pseudo-devfs
451 (something akin to scsidev but for all devices, with some unspecified
452 kernel support). BeOS, Plan9 and QNX also have it. SGI's IRIX 6.4 and
453 above also have a device filesystem.
454
455 While we shouldn't just automatically do something because others do
456 it, we should not ignore the work of others either. FreeBSD has a lot
457 of competent people working on it, so their opinion should not be
458 blithely ignored.
459
460 -----------------------------------------------------------------------------
461
462
463 How it works
464
465 Registering device entries
466
467 For every entry (device node) in a devfs-based /dev a driver must call
468 devfs_register(). This adds the name of the device entry, the
469 file_operations structure pointer and a few other things to an
470 internal table. Device entries may be added and removed at any
471 time. When a device entry is registered, it automagically appears in
472 any mounted devfs.
473
474 Inode lookup
475
476 When a lookup operation on an entry is performed and there is no
477 driver information for that entry, devfs will attempt to call
478 devfsd. If still no driver information can be found then a negative
479 dentry is yielded and the next stage operation will be called by the
480 VFS (such as create() or mknod() inode methods). If driver information
481 can be found, an inode is created (if one does not exist already) and
482 all is well.
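
The order of events in a lookup can be sketched as follows. This is a user-space illustration, not the kernel implementation: the dictionary stands in for the driver information, and the `ask_devfsd` callback is a hypothetical stand-in for the call out to the daemon (which might, for example, load a module that then registers the entry).

```python
def lookup(name, drivers, ask_devfsd=None):
    """Model the lookup sequence: driver table, then devfsd, then negative."""
    if name in drivers:
        return ("inode", drivers[name])
    if ask_devfsd is not None:
        ask_devfsd(name, drivers)        # the daemon may cause a registration
        if name in drivers:
            return ("inode", drivers[name])
    return ("negative", None)            # the VFS moves on to create()/mknod()


if __name__ == "__main__":
    def fake_devfsd(name, drivers):
        # Hypothetical daemon action: pretend a sound module gets loaded.
        if name == "dsp":
            drivers[name] = "sound_fops"

    table = {}
    print(lookup("dsp", table, fake_devfsd))
    print(lookup("no-such-device", table, fake_devfsd))
```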
483
484 Manually creating device nodes
485
486 The mknod() method allows you to create an ordinary named pipe in the
487 devfs, or you can create a character or block special inode if one
488 does not already exist. You may wish to create a character or block
489 special inode so that you can set permissions and ownership. Later, if
490 a device driver registers an entry with the same name, the
491 permissions, ownership and times are retained. This is how you can set
492 the protections on a device even before the driver is loaded. Once you
493 create an inode it appears in the directory listing.
494
495 Unregistering device entries
496
497 A device driver calls devfs_unregister() to unregister an entry.
498
499 Chroot() gaols
500
501 2.2.x kernels
502
503 The semantics of inode creation are different when devfs is mounted
504 with the "explicit" option. Now, when a device entry is registered, it
505 will not appear until you use mknod() to create the device. It doesn't
506 matter if you mknod() before or after the device is registered with
507 devfs_register(). The purpose of this behaviour is to support
508 chroot(2) gaols, where you want to mount a minimal devfs inside the
509 gaol. Only the devices you specifically want to be available (through
510 your mknod() setup) will be accessible.
511
512 2.4.x kernels
513
514 As of kernel 2.3.99, the VFS has had the ability to rebind parts of
515 the global filesystem namespace into another part of the namespace.
516 This now works even at the leaf-node level, which means that
517 individual files and device nodes may be bound into other parts of the
518 namespace. This is like making links, but better, because it works
519 across filesystems (unlike hard links) and works through chroot()
520 gaols (unlike symbolic links).
521
522 Because of these improvements to the VFS, the multi-mount capability
523 in devfs is no longer needed. The administrator may create a minimal
524 device tree inside a chroot(2) gaol by using VFS bindings. As this
525 provides most of the features of the devfs multi-mount capability, I
526 removed the multi-mount support code (after issuing an RFC). This
527 yielded code size reductions and simplifications.
528
529 If you want to construct a minimal chroot() gaol, the following
530 command should suffice:
531
532 mount --bind /dev/null /gaol/dev/null
533
534
535 Repeat for other device nodes you want to expose. Simple!
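
If you have more than a couple of nodes to expose, the commands can be generated rather than typed. The sketch below only builds and prints the command lines; it does not run them, since mounting needs root. The node list and the /gaol path are examples, and the function name is made up. Note that a bind mount generally needs the target to exist beforehand.

```python
import os


def bind_mount_commands(nodes, gaol):
    """Build one `mount --bind` argument list per device node."""
    commands = []
    for node in nodes:
        target = os.path.join(gaol, node.lstrip("/"))
        commands.append(["mount", "--bind", node, target])
    return commands


if __name__ == "__main__":
    # The node list and the /gaol path are examples only.
    for argv in bind_mount_commands(["/dev/null", "/dev/zero"], "/gaol"):
        print(" ".join(argv))
```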
536
537 -----------------------------------------------------------------------------
538
539
540 Operational issues
541
542
543 Instructions for the impatient
544
545 Nobody likes reading documentation. People just want to get in there
546 and play. So this section tells you quickly the steps you need to take
547 to run with devfs mounted over /dev. Skip these steps and you will end
548 up with a nearly unbootable system. Subsequent sections describe the
549 issues in more detail, and discuss non-essential configuration
550 options.
551
552 Devfsd
553 OK, if you're reading this, I assume you want to play with
554 devfs. First you need to compile devfsd, the device management daemon,
555 available at
556 http://www.atnf.csiro.au/~rgooch/linux/.
557 Because the kernel has a naming scheme
558 which is quite different from the old naming scheme, you need to
559 install devfsd so that software and configuration files that use the
560 old naming scheme will not break.
561
562 Compile and install devfsd. You will be provided with a default
563 configuration file /etc/devfsd.conf which will provide
564 compatibility symlinks for the old naming scheme. Don't change this
565 config file unless you know what you're doing. Even if you think you
566 do know what you're doing, don't change it until you've followed all
567 the steps below and booted a devfs-enabled system and verified that it
568 works.
569
570 Now edit your main system boot script so that devfsd is started at the
571 very beginning (before any filesystem
572 checks). /etc/rc.d/rc.sysinit is often the main boot script
573 on systems with SysV-style boot scripts. On systems with BSD-style
574 boot scripts it is often /etc/rc. Also check
575 /sbin/rc.
576
577 NOTE that the line you put into the boot
578 script should be exactly:
579
580 /sbin/devfsd /dev
581
582 DO NOT use some special daemon-launching
583 programme, otherwise the boot script may not wait for devfsd to finish
584 initialising.
585
586 System Libraries
587 There may still be some problems because of broken software making
588 assumptions about device names. In particular, some software does not
589 handle devices which are symbolic links. If you are running a libc 5
590 based system, install libc 5.4.44 (if you have libc 5.4.46, go back to
591 libc 5.4.44, which is actually correct). If you are running a glibc
592 based system, make sure you have glibc 2.1.3 or later.
593
594 /etc/securetty
595 PAM (Pluggable Authentication Modules) is supposed to be a flexible
596 mechanism for providing better user authentication and access to
597 services. Unfortunately, it's also fragile, complex and undocumented
598 (check out RedHat 6.1, and probably other distributions as well). PAM
599 has problems with symbolic links. Append the following lines to your
600 /etc/securetty file:
601
602 1
603 2
604 3
605 4
606 5
607 6
608 7
609 8
610
611 This may potentially weaken security by allowing root logins over the
612 network (a password is still required, though). However, since there
613 are problems with dealing with symlinks, I'm suspicious of the level
614 of security offered in any case.
615
616 XFree86
617 While not essential, it's probably a good idea to upgrade to XFree86
618 4.0, as patches went in to make it more devfs-friendly. If you don't,
619 you'll probably need to apply the following patch to
620 /etc/security/console.perms so that ordinary users can run
621 startx.
622
623 --- /etc/security/console.perms.orig Sat Apr 17 16:26:47 1999
624 +++ /etc/security/console.perms Fri Feb 25 23:53:55 2000
625 @@ -14,7 +14,7 @@
626 # man 5 console.perms
627
628 # file classes -- these are regular expressions
629 -<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9]
630 +<console>=tty[0-9][0-9]* [0-9][0-9]* :[0-9]\.[0-9] :[0-9]
631
632 # device classes -- these are shell-style globs
633 <floppy>=/dev/fd[0-1]*
634
635
636 Disable devpts
637 I've had a report of devpts mounted on /dev/pts not working
638 correctly. Since devfs will also manage /dev/pts, there is no
639 need to mount devpts as well. You should either edit your
640 /etc/fstab so devpts is not mounted, or disable devpts from
641 your kernel configuration.
642
643 Unsupported drivers
644 Not all drivers have devfs support. If you depend on one of these
645 drivers, you will need to create a script or tarfile that you can use
646 at boot time to create device nodes as appropriate. The section
647 "Dealing with drivers without devfs support" describes this. The section
648 "Device drivers currently ported" lists the drivers which have
649 devfs support.
650
651 /dev/mouse
652
653 Many distributions configure /dev/mouse to be the mouse device
654 for XFree86 and GPM. I actually think this is a bad idea, because it
655 adds another level of indirection. When looking at a config file, if
656 you see /dev/mouse you're left wondering which mouse
657 is being referred to. Hence I recommend putting the actual mouse
658 device (for example /dev/psaux) into your
659 /etc/X11/XF86Config file (and similarly for the GPM
660 configuration file).
661
662 Alternatively, use the same technique used for unsupported drivers
663 described above.
664
665 The Kernel
666 Finally, you need to make sure devfs is compiled into your
667 kernel. Set CONFIG_DEVFS_FS=y and recompile your kernel. Next, you
668 need to make sure devfs is mounted. The best solution is to pass
669 devfs=mount at the kernel boot command line. You can edit
670 /etc/lilo.conf and add the line:
671
672 append = "devfs=mount"
673
674
675 This will make the kernel mount devfs at boot time onto /dev.
676
677 Now you've finished all the steps required. You're now ready to boot
678 your shiny new kernel. Enjoy.
679
680 Changing the configuration
681
682 OK, you've now booted a devfs-enabled system, and everything works.
683 Now you may feel like changing the configuration (common targets are
684 /etc/fstab and /etc/devfsd.conf). Since you have a
685 system that works, if you make any changes and it doesn't work, you
686 now know that you only have to restore your configuration files to the
687 default and it will work again.
688
689
690 Permissions persistence across reboots
691
692 If you don't use mknod(2) to create a device file, nor use chmod(2) or
693 chown(2) to change the ownerships/permissions, the inode ctime will
694 remain at 0 (the epoch, 12 am, 1-JAN-1970, GMT). Anything with a ctime
695 later than this has had its ownership/permissions changed. Hence, a
696 simple script or programme may be used to tar up all changed inodes,
697 prior to shutdown. Although effective, many consider this approach a
698 kludge.
699
700 A much better approach is to use devfsd to save and restore
701 permissions. It may be configured to record changes in permissions and
702 will save them in a database (in fact a directory tree), and restore
703 these upon boot. This is an efficient method and results in immediate
704 saving of current permissions (unlike the tar approach, which saves
705 permissions at some unspecified future time).
706
707 The default configuration file supplied with devfsd has config entries
708 which you may uncomment to enable persistence management.
709
710 If you decide to use the tar approach anyway, be aware that tar will
711 first unlink(2) an inode before creating a new device node. The
712 unlink(2) has the effect of breaking the connection between a devfs
713 entry and the device driver. If you use the "devfs=only" boot option,
714 you lose access to the device driver, requiring you to reload the
715 module. I consider this a bug in tar (there is no real need to
716 unlink(2) the inode first).
717
718 Alternatively, you can use devfsd to provide more sophisticated
719 management of device permissions. You can use devfsd to store
720 permissions for whole groups of devices with a single configuration
721 entry, rather than the conventional one entry per device.
722
723 Permissions database stored in mounted-over /dev
724
725 If you wish to save and restore your device permissions into the
726 disc-based /dev while still mounting devfs onto /dev
727 you may do so. This requires a 2.4.x kernel (in fact, 2.3.99 or
728 later), which has the VFS binding facility. You need to do the
729 following to set this up:
730
731
732
733 make sure the kernel does not mount devfs at boot time
734
735
736 create the /dev-state directory
737
738
739 add the following lines near the very beginning of your boot
740 scripts:
741
742 mount --bind /dev /dev-state
743 mount -t devfs none /dev
744 devfsd /dev
745
746
747
748 add the following lines to your /etc/devfsd.conf file:
749
750 REGISTER .* COPY /dev-state/$devname $devpath
751 CHANGE .* COPY $devpath /dev-state/$devname
752 CREATE .* COPY $devpath /dev-state/$devname
753
754
755
756 reboot.
757
758
759
760
761
762 Dealing with drivers without devfs support
763
764 Currently, not all device drivers in the kernel have been modified to
765 use devfs. Device drivers which do not yet have devfs support will not
766 automagically appear in devfs. The simplest way to create device nodes
767 for these drivers is to unpack a tarfile containing the required
768 device nodes. You can do this in your boot scripts. All your drivers
769 will now work as before.
770
771 Hopefully for most people devfs will have enough support so that they
772 can mount devfs directly over /dev without losing most functionality
773 (i.e. losing access to various devices). As of 22-JAN-1998 (devfs
774 patch version 10) I am now running this way. All the devices I have
775 are available in devfs, so I don't lose anything.
776
777 WARNING: if your configuration requires the old-style device names
778 (e.g. /dev/hda1 or /dev/sda1), you must install devfsd and configure
779 it to maintain compatibility entries. It is almost certain that you
780 will require this. Note that the kernel creates a compatibility entry
781 for the root device, so you don't need initrd.
782
783 Note that you no longer need to mount devpts if you use Unix98 PTYs,
784 as devfs can manage /dev/pts itself. This saves you some RAM, as you
785 don't need to compile and install devpts. Note that some versions of
786 glibc have a bug with Unix98 pty handling on devfs systems. Contact
787 the glibc maintainers for a fix. Glibc 2.1.3 has the fix.
788
789 Note also that apart from editing /etc/fstab, other things will need
790 to be changed if you *don't* install devfsd. Some software (like the X
791 server) hard-wires device names in its source. It really is much
792 easier to install devfsd so that compatibility entries are created.
793 You can then slowly migrate your system to using the new device names
794 (for example, by starting with /etc/fstab), and then limiting the
795 compatibility entries that devfsd creates.
796
797 MAKE SURE YOU INSTALL DEVFSD BEFORE YOU BOOT A DEVFS-ENABLED KERNEL!
798
799 Now that devfs has gone into the 2.3.46 kernel, I'm getting a lot of
800 reports back. Many of these are because people are trying to run
801 without devfsd, and hence some things break. Please just run devfsd if
802 things break. I want to concentrate on real bugs rather than
803 misconfiguration problems at the moment. If people are willing to fix
804 bugs/false assumptions in other code (e.g. glibc, X server) and submit
805 that to the respective maintainers, that would be great.
806
807
808 All the way with Devfs
809
810 The devfs kernel patch creates a rationalised device tree. As stated
811 above, if you want to keep using the old /dev naming scheme,
812 you just need to configure devfsd appropriately (see the man
813 page). People who prefer the old names can ignore this section. For
814 those of us who like the rationalised names and an uncluttered
815 /dev, read on.
816
817 If you don't run devfsd, or don't enable compatibility entry
818 management, then you will have to configure your system to use the new
819 names. For example, you will then need to edit your
820 /etc/fstab to use the new disc naming scheme. If you want to
821 be able to boot non-devfs kernels, you will need compatibility
822 symlinks in the underlying disc-based /dev pointing back to
823 the old-style names for when you boot a kernel without devfs.
824
825 You can selectively decide which devices you want compatibility
826 entries for. For example, you may only want compatibility entries for
827 BSD pseudo-terminal devices (otherwise you'll have to patch your C
828 library or use Unix98 ptys instead). It's just a matter of putting in
829 the correct regular expression into /etc/devfsd.conf.
830
831 There are other choices of naming schemes that you may prefer. For
832 example, I don't use the kernel-supplied
833 names, because they are too verbose. A common misconception is
834 that the kernel-supplied names are meant to be used directly in
835 configuration files. This is not the case. They are designed to
836 reflect the layout of the devices attached and to provide easy
837 classification.
838
839 If you like the kernel-supplied names, that's fine. If you don't then
840 you should be using devfsd to construct a namespace more to your
841 liking. Devfsd has built-in code to construct a
842 namespace that is both logical and easy to
843 manage. In essence, it creates a convenient abbreviation of the
844 kernel-supplied namespace.
845
846 You are of course free to build your own namespace. Devfsd has all the
847 infrastructure required to make this easy for you. All you need do is
848 write a script. You can even write some C code and devfsd can load the
849 shared object as a callable extension.
850
851
852 Other Issues
853
854 The init programme
855 Another thing to take note of is whether your init programme
856 creates a Unix socket /dev/telinit. Some versions of init
857 create /dev/telinit so that the telinit programme can
858 communicate with the init process. If you have such a system you need
859 to make sure that devfs is mounted over /dev *before* init
860 starts. In other words, you can't leave the mounting of devfs to
861 /etc/rc, since this is executed after init. Other
862 versions of init require a named pipe /dev/initctl
863 which must exist *before* init starts. Once again, you need to
864 mount devfs and then create the named pipe *before* init
865 starts.
866
867 The default behaviour now is not to mount devfs onto /dev at
868 boot time for 2.3.x and later kernels. You can correct this with the
869 "devfs=mount" boot option. This solves any problems with init,
870 and also prevents the dreaded:
871
872 Cannot open initial console
873
874 message. For 2.2.x kernels where you need to apply the devfs patch,
875 the default is to mount.
876
877 If you have automatic mounting of devfs onto /dev then you
878 may need to create /dev/initctl in your boot scripts. The
879 following lines should suffice:
880
881 mknod /dev/initctl p
882 kill -SIGUSR1 1 # tell init that /dev/initctl now exists
883
884 Alternatively, if you don't want the kernel to mount devfs onto
885 /dev then you could use the following procedure as a
886 guideline for how to get around /dev/initctl problems:
887
888 # cd /sbin
889 # mv init init.real
890 # cat > init
891 #! /bin/sh
892 mount -n -t devfs none /dev
893 mknod /dev/initctl p
894 exec /sbin/init.real "$@"
895 [control-D]
896 # chmod a+x init
897
898 Note that newer versions of init create /dev/initctl
899 automatically, so you don't have to worry about this.
900
901 Module autoloading
902 You will need to configure devfsd to enable module
903 autoloading. The following lines should be placed in your
904 /etc/devfsd.conf file:
905
906 LOOKUP .* MODLOAD
907
908
909 As of devfsd-v1.3.10, a generic /etc/modules.devfs
910 configuration file is installed, which is used by the MODLOAD
911 action. This should be sufficient for most configurations. If you
912 require further configuration, edit your /etc/modules.conf
913 file.
914
915 Mounting root off a devfs device
916 If you wish to mount root off a devfs device when you pass the
917 "devfs=only" boot option, then you need to pass in the "root="
918 option to the kernel when booting. If you use LILO, then you must have
919 this in lilo.conf:
920
921 append = "root=<device>"
922
923 Surprised? Yep, so was I. It turns out if you have (as most people
924 do):
925
926 root = <device>
927
928
929 then LILO will determine the device number of <device> and will write
930 that device number into a special place in the kernel image before
931 starting the kernel, and the kernel will use that device number to
932 mount the root filesystem. So, using the "append" variety ensures that
933 LILO passes the root filesystem device as a string, which devfs can
934 then use.
935
936 Note that this isn't an issue if you don't pass "devfs=only".
937
938 TTY issues
939 The ttyname(3) function in some versions of the C library makes
940 false assumptions about device entries which are symbolic links. The
941 tty(1) programme is one that depends on this function. I've
942 written a patch to libc 5.4.43 which fixes this. This has been
943 included in libc 5.4.44 and a similar fix is in glibc 2.1.3.
944
945
946 Kernel Naming Scheme
947
948 The kernel provides a default naming scheme. This scheme is designed
949 to make it easy to search for specific devices or device types, and to
950 view the available devices. Some device types (such as hard discs),
951 have a directory of entries, making it easy to see what devices of
952 that class are available. Often, the entries are symbolic links into a
953 directory tree that reflects the topology of available devices. The
954 topological tree is useful for finding how your devices are arranged.
955
956 Disc Devices
957
958 All discs, whether SCSI, IDE or whatever, are placed under the
959 /dev/discs hierarchy:
960
961 /dev/discs/disc0 first disc
962 /dev/discs/disc1 second disc
963
964
965 Each of these entries is a symbolic link to the directory for that
966 device. The device directory contains:
967
968 disc for the whole disc
969 part* for individual partitions
970
971
972 CD-ROM Devices
973
974 All CD-ROMs, whether SCSI, IDE or whatever, are placed under the
975 /dev/cdroms hierarchy:
976
977 /dev/cdroms/cdrom0 first CD-ROM
978 /dev/cdroms/cdrom1 second CD-ROM
979
980
981 Each of these entries is a symbolic link to the real device entry for
982 that device.
983
984 Tape Devices
985
986 All tapes, whether SCSI, IDE or whatever, are placed under the
987 /dev/tapes hierarchy:
988
989 /dev/tapes/tape0 first tape
990 /dev/tapes/tape1 second tape
991
992
993 Each of these entries is a symbolic link to the directory for that
994 device. The device directory contains:
995
996 mt for mode 0
997 mtl for mode 1
998 mtm for mode 2
999 mta for mode 3
1000 mtn for mode 0, no rewind
1001 mtln for mode 1, no rewind
1002 mtmn for mode 2, no rewind
1003 mtan for mode 3, no rewind
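
As an illustration only, the entry names above can be derived from the
mode number and the rewind behaviour. The following is a minimal sketch
of a shell helper; tape_entry is a hypothetical name and is not part of
devfs or devfsd. It merely encodes the table above.

```shell
# Hypothetical helper, not part of devfs or devfsd.
# Usage: tape_entry MODE rewind|norewind
# Prints the entry name inside a tape device directory.
tape_entry() {
    case "$1" in
        0) name=mt ;;
        1) name=mtl ;;
        2) name=mtm ;;
        3) name=mta ;;
        *) echo "tape_entry: mode must be 0-3" >&2; return 1 ;;
    esac
    case "$2" in
        rewind)   ;;                      # rewinding entries have no suffix
        norewind) name="${name}n" ;;      # no-rewind entries end in "n"
        *) echo "tape_entry: expected rewind or norewind" >&2; return 1 ;;
    esac
    printf '%s\n' "$name"
}
```

With this helper, the no-rewind mode 0 entry for the first tape would be
/dev/tapes/tape0/$(tape_entry 0 norewind), that is /dev/tapes/tape0/mtn.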
1004
1005
1006 SCSI Devices
1007
1008 To uniquely identify any SCSI device requires the following
1009 information:
1010
1011 controller (host adapter)
1012 bus (SCSI channel)
1013 target (SCSI ID)
1014 unit (Logical Unit Number)
1015
1016
1017 All SCSI devices are placed under /dev/scsi (assuming devfs
1018 is mounted on /dev). Hence, a SCSI device with the following
1019 parameters: c=1,b=2,t=3,u=4 would appear as:
1020
1021 /dev/scsi/host1/bus2/target3/lun4 device directory
1022
1023
1024 Inside this directory, a number of device entries may be created,
1025 depending on which SCSI device-type drivers were installed.
1026
1027 See the section on the disc naming scheme to see what entries the SCSI
1028 disc driver creates.
1029
1030 See the section on the tape naming scheme to see what entries the SCSI
1031 tape driver creates.
1032
1033 The SCSI CD-ROM driver creates:
1034
1035 cd
1036
1037
1038 The SCSI generic driver creates:
1039
1040 generic
1041
1042
1043 IDE Devices
1044
1045 To uniquely identify any IDE device requires the following
1046 information:
1047
1048 controller
1049 bus (aka. primary/secondary)
1050 target (aka. master/slave)
1051 unit
1052
1053
1054 All IDE devices are placed under /dev/ide, and use a similar
1055 naming scheme to the SCSI subsystem.
1056
1057 XT Hard Discs
1058
1059 All XT discs are placed under /dev/xd. The first XT disc has
1060 the directory /dev/xd/disc0.
1061
1062 TTY devices
1063
1064 The tty devices now appear as:
1065
1066 New name             Old-name             Device Type
1067 --------             --------             -----------
1068 /dev/tts/{0,1,...}   /dev/ttyS{0,1,...}   Serial ports
1069 /dev/cua/{0,1,...}   /dev/cua{0,1,...}    Call out devices
1070 /dev/vc/{0,1,...}    /dev/tty{1...63}     Virtual consoles
1071 /dev/vcc/{0,1,...}   /dev/vcs{1...63}     Virtual console capture devices
1072 /dev/pty/m{0,1,...}  /dev/ptyp??          PTY masters
1073 /dev/pty/s{0,1,...}  /dev/ttyp??          PTY slaves
1074
1075
1076 RAMDISCS
1077
1078 The RAMDISCS are placed in their own directory, and are named thus:
1079
1080 /dev/rd/{0,1,2,...}
1081
1082
1083 Meta Devices
1084
1085 The meta devices are placed in their own directory, and are named
1086 thus:
1087
1088 /dev/md/{0,1,2,...}
1089
1090
1091 Floppy discs
1092
1093 Floppy discs are placed in the /dev/floppy directory.
1094
1095 Loop devices
1096
1097 Loop devices are placed in the /dev/loop directory.
1098
1099 Sound devices
1100
1101 Sound devices are placed in the /dev/sound directory
1102 (audio, sequencer, ...).
1103
1104
1105 Devfsd Naming Scheme
1106
1107 Devfsd provides a naming scheme which is a convenient abbreviation of
1108 the kernel-supplied namespace. In some
1109 cases, the kernel-supplied naming scheme is quite convenient, so
1110 devfsd does not provide another naming scheme. The convenience names
1111 that devfsd creates are in fact the same names as the original devfs
1112 kernel patch created (before Linus mandated the Big Name Change).
1113
1114 In order to configure devfsd to create these convenience names, the
1115 following lines should be placed in your /etc/devfsd.conf:
1116
1117 REGISTER .* MKNEWCOMPAT
1118 UNREGISTER .* RMNEWCOMPAT
1119
1120 This will cause devfsd to create (and destroy) symbolic links which
1121 point to the kernel-supplied names.
1122
1123 SCSI Hard Discs
1124
1125 All SCSI discs are placed under /dev/sd (assuming devfs is
1126 mounted on /dev). Hence, a SCSI disc with the following
1127 parameters: c=1,b=2,t=3,u=4 would appear as:
1128
1129 /dev/sd/c1b2t3u4 for the whole disc
1130 /dev/sd/c1b2t3u4p5 for the 5th partition
1131 /dev/sd/c1b2t3u4p5s6 for the 6th slice in the 5th partition
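
To make the relationship between the two schemes concrete, here is a
minimal sketch of a shell function that converts a devfsd-style SCSI
disc name into the corresponding kernel-supplied name. The function name
sd_to_kernel_path is hypothetical. It assumes that the "part*" entries
in the device directory are numbered part1, part2 and so on, and it
rejects slice names because the kernel scheme described earlier does not
list them.

```shell
# Hypothetical helper, not part of devfs or devfsd.
# Usage: sd_to_kernel_path /dev/sd/c1b2t3u4[pN]
sd_to_kernel_path() {
    n='\([0-9][0-9]*\)'                       # one decimal number (BRE group)
    unit="c${n}b${n}t${n}u${n}"               # controller, bus, target, unit
    dir='/dev/scsi/host\1/bus\2/target\3/lun\4'
    out=$(printf '%s\n' "${1#/dev/sd/}" | sed -n \
        -e "s|^${unit}\$|${dir}/disc|p" \
        -e "s|^${unit}p${n}\$|${dir}/part\5|p")
    if [ -z "$out" ]; then
        echo "sd_to_kernel_path: cannot convert '$1'" >&2
        return 1
    fi
    printf '%s\n' "$out"
}
```

For example, sd_to_kernel_path /dev/sd/c1b2t3u4p5 prints
/dev/scsi/host1/bus2/target3/lun4/part5.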
1132
1133
1134 SCSI Tapes
1135
1136 All SCSI tapes are placed under /dev/st. A similar naming
1137 scheme is used as for SCSI discs. A SCSI tape with the
1138 parameters: c=1,b=2,t=3,u=4 would appear as:
1139
1140 /dev/st/c1b2t3u4m0 for mode 0
1141 /dev/st/c1b2t3u4m1 for mode 1
1142 /dev/st/c1b2t3u4m2 for mode 2
1143 /dev/st/c1b2t3u4m3 for mode 3
1144 /dev/st/c1b2t3u4m0n for mode 0, no rewind
1145 /dev/st/c1b2t3u4m1n for mode 1, no rewind
1146 /dev/st/c1b2t3u4m2n for mode 2, no rewind
1147 /dev/st/c1b2t3u4m3n for mode 3, no rewind
1148
1149
1150 SCSI CD-ROMs
1151
1152 All SCSI CD-ROMs are placed under /dev/sr. A similar naming
1153 scheme is used as for SCSI discs. A SCSI CD-ROM with the
1154 parameters: c=1,b=2,t=3,u=4 would appear as:
1155
1156 /dev/sr/c1b2t3u4
1157
1158
1159 SCSI Generic Devices
1160
1161 All SCSI generic devices are placed under /dev/sg. A similar naming
1162 scheme is used as for SCSI discs. A SCSI generic device with the
1163 parameters: c=1,b=2,t=3,u=4 would appear as:
1164
1165 /dev/sg/c1b2t3u4
1166
1167
1168 IDE Hard Discs
1169
1170 All IDE discs are placed under /dev/ide/hd, using a similar
1171 convention to SCSI discs. The following mappings exist between the new
1172 and the old names:
1173
1174 /dev/hda /dev/ide/hd/c0b0t0u0
1175 /dev/hdb /dev/ide/hd/c0b0t1u0
1176 /dev/hdc /dev/ide/hd/c0b1t0u0
1177 /dev/hdd /dev/ide/hd/c0b1t1u0
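
The table follows a simple pattern: the bus is the primary or secondary
channel and the target is master or slave. A minimal sketch of that
mapping in shell is shown here. The function name hd_to_devfsd_name is
hypothetical, and it deliberately covers only hda to hdd, since the
table above says nothing about further controllers.

```shell
# Hypothetical helper, not part of devfs or devfsd.
# Usage: hd_to_devfsd_name /dev/hda   (or just: hda)
hd_to_devfsd_name() {
    letter=${1#/dev/}
    letter=${letter#hd}
    case "$letter" in
        a) idx=0 ;;
        b) idx=1 ;;
        c) idx=2 ;;
        d) idx=3 ;;
        *) echo "hd_to_devfsd_name: only hda-hdd are covered" >&2; return 1 ;;
    esac
    # bus = primary(0)/secondary(1), target = master(0)/slave(1), unit is 0
    printf '/dev/ide/hd/c0b%dt%du0\n' $((idx / 2)) $((idx % 2))
}
```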
1178
1179
1180 IDE Tapes
1181
1182 A similar naming scheme is used as for IDE discs. The entries will
1183 appear in the /dev/ide/mt directory.
1184
1185 IDE CD-ROM
1186
1187 A similar naming scheme is used as for IDE discs. The entries will
1188 appear in the /dev/ide/cd directory.
1189
1190 IDE Floppies
1191
1192 A similar naming scheme is used as for IDE discs. The entries will
1193 appear in the /dev/ide/fd directory.
1194
1195 XT Hard Discs
1196
1197 All XT discs are placed under /dev/xd. The first XT disc
1198 would appear as /dev/xd/c0t0.
1199
1200
1201 SCSI Host Probing Issues
1202
1203 Devfs allows you to identify SCSI discs based in part on SCSI host
1204 numbers. If you have only one SCSI host (card) in your computer, then
1205 clearly it will be given host number 0. Life is not always that easy
1206 if you have multiple SCSI hosts. Unfortunately, it can sometimes be
1207 difficult to guess what the probing order of SCSI hosts is. You need
1208 to know the probe order before you can use device names. To make this
1209 easy, there is a kernel boot parameter called "scsihosts". This allows
1210 you to specify the probe order for different types of SCSI hosts. The
1211 syntax of this parameter is:
1212
1213 scsihosts=<name_1>:<name_2>:<name_3>:...:<name_n>
1214
1215 where <name_1>,<name_2>,...,<name_n> are the names
1216 of drivers used in the /proc filesystem. For example:
1217
1218 scsihosts=aha1542:ppa:aha1542::ncr53c7xx
1219
1220
1221 means that devices connected to
1222
1223 - first aha1542 controller - will be c0b#t#u#
1224 - first parallel port ZIP - will be c1b#t#u#
1225 - second aha1542 controller - will be c2b#t#u#
1226 - first NCR53C7xx controller - will be c4b#t#u#
1227 - any extra controller - will be c5b#t#u#, c6b#t#u#, etc
1228 - if any of the above controllers is not found - its reserved names will
1229 not be used by any other device.
1230 - c3b#t#u# names will never be used
1231
1232
1233 You can use ',' instead of ':' as the separator character if you
1234 wish. I have used the devfsd naming scheme
1235 here.
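
If you want to check which controller number each position in a
"scsihosts" string reserves, the rule described above can be sketched in
a few lines of shell. This is an illustration only: scsihosts_map is a
hypothetical name, and the function simply numbers the fields from zero,
treating an empty field as a number that is never used.

```shell
# Hypothetical helper, not part of the kernel or devfsd.
# Usage: scsihosts_map 'aha1542:ppa:aha1542::ncr53c7xx'
scsihosts_map() {
    [ -n "$1" ] || { echo "scsihosts_map: empty specification" >&2; return 1; }
    spec=$(printf '%s' "$1" | tr ',' ':')    # ',' is accepted as a separator too
    i=0
    while :; do
        name=${spec%%:*}                     # first remaining field
        if [ -n "$name" ]; then
            printf 'c%d %s\n' "$i" "$name"
        else
            printf 'c%d (never used)\n' "$i"
        fi
        case "$spec" in
            *:*) spec=${spec#*:} ;;          # drop the field just handled
            *)   break ;;                    # that was the last field
        esac
        i=$((i + 1))
    done
}
```

For the example string above this prints c0 aha1542, c1 ppa, c2 aha1542,
c3 (never used) and c4 ncr53c7xx, one per line. Any extra controllers
would be numbered from c5 onwards.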
1236
1237 Note that this scheme does not address the SCSI host order if you have
1238 multiple cards of the same type (such as NCR53c8xx). In this case you
1239 need to use the driver-specific boot parameters to control this.
1240
1241 -----------------------------------------------------------------------------
1242
1243
1244 Device drivers currently ported
1245
1246 - All miscellaneous character devices support devfs (this is done
1247 transparently through misc_register())
1248
1249 - SCSI discs and generic hard discs
1250
1251 - Character memory devices (null, zero, full and so on)
1252 Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
1253
1254 - Loop devices (/dev/loop?)
1255
1256 - TTY devices (console, serial ports, terminals and pseudo-terminals)
1257 Thanks to C. Scott Ananian <cananian@alumni.princeton.edu>
1258
1259 - SCSI tapes (/dev/scsi and /dev/tapes)
1260
1261 - SCSI CD-ROMs (/dev/scsi and /dev/cdroms)
1262
1263 - SCSI generic devices (/dev/scsi)
1264
1265 - RAMDISCS (/dev/ram?)
1266
1267 - Meta Devices (/dev/md*)
1268
1269 - Floppy discs (/dev/floppy)
1270
1271 - Parallel port printers (/dev/printers)
1272
1273 - Sound devices (/dev/sound)
1274 Thanks to Eric Dumas <dumas@linux.eu.org> and
1275 C. Scott Ananian <cananian@alumni.princeton.edu>
1276
1277 - Joysticks (/dev/joysticks)
1278
1279 - Sparc keyboard (/dev/kbd)
1280
1281 - DSP56001 digital signal processor (/dev/dsp56k)
1282
1283 - Apple Desktop Bus (/dev/adb)
1284
1285 - Coda network file system (/dev/cfs*)
1286
1287 - Virtual console capture devices (/dev/vcc)
1288 Thanks to Dennis Hou <smilax@mindmeld.yi.org>
1289
1290 - Frame buffer devices (/dev/fb)
1291
1292 - Video capture devices (/dev/v4l)
1293
1294
1295 -----------------------------------------------------------------------------
1296
1297
1298 Allocation of Device Numbers
1299
1300 Devfs allows you to write a driver which doesn't need to allocate a
1301 device number (major&minor numbers) for the internal operation of the
1302 kernel. However, there are a number of userspace programmes that use
1303 the device number as a unique handle for a device. An example is the
1304 find programme, which uses device numbers to determine whether
1305 an inode is on a different filesystem than another inode. The device
1306 number used is the one for the block device which a filesystem is
1307 using. To preserve compatibility with userspace programmes, block
1308 devices using devfs need to have unique device numbers allocated to
1309 them. Furthermore, POSIX specifies device numbers, so some kind of
1310 device number needs to be presented to userspace.
1311
1312 The simplest option (especially when porting drivers to devfs) is to
1313 keep using the old major and minor numbers. Devfs will take whatever
1314 values are given for major&minor and pass them on to userspace.
1315
1316 Alternatively, you can have devfs choose unique device numbers for
1317 you. When you register a character or block device using
1318 devfs_register you can provide the optional
1319 DEVFS_FL_AUTO_DEVNUM flag, which will then automatically allocate a
1320 unique device number (the allocation is separate for the character
1321 and block devices).
1322
1323 This device number is a 16 bit number, so this leaves plenty of space
1324 for large numbers of discs and partitions. This scheme can also be
1325 used for character devices, in particular the tty devices, which are
1326 currently limited to 256 pseudo-ttys (this limits the total number of
1327 simultaneous xterms and remote logins). Note that the device number
1328 is limited to the range 36864-61439 (majors 144-239), in order to
1329 avoid any possible conflicts with existing official allocations.
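
The quoted range follows directly from the major numbers, assuming the
usual split of a 16 bit device number into an 8 bit major and an 8 bit
minor: 144 * 256 = 36864 and 239 * 256 + 255 = 61439. A minimal sketch
of that arithmetic in shell is given here; the function names are
hypothetical and are not part of any devfs tool.

```shell
# Hypothetical helpers illustrating the 16 bit device number layout
# (8 bit major in the high byte, 8 bit minor in the low byte).
devnum_major() { echo $(( $1 / 256 )); }
devnum_minor() { echo $(( $1 % 256 )); }

# Succeeds if the device number lies in the range used for automatic
# allocation (majors 144-239).
devnum_is_auto() {
    [ "$1" -ge 36864 ] && [ "$1" -le 61439 ]
}
```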
1330
1331 Please note that using dynamically allocated block device numbers may
1332 break the NFS daemons (both user and kernel mode), which expect dev_t
1333 for a given device to be constant over the lifetime of remote mounts.
1334
1335 A final note on this scheme: since it doesn't increase the size of
1336 device numbers, there are no compatibility issues with userspace.
1337
1338 -----------------------------------------------------------------------------
1339
1340
1341 Questions and Answers
1342
1343
1344 Making things work
1345 Alternatives to devfs
1346
1347
1348
1349 Making things work
1350
1351 Here are some common questions and answers.
1352
1353
1354
1355 Devfsd is not managing all my permissions
1356
1357 Make sure you are capturing the appropriate events. For example,
1358 device entries created by the kernel generate REGISTER events,
1359 but those created by devfsd generate CREATE events.
1360
1361
1362 Devfsd is not capturing all REGISTER events
1363
1364 See the previous entry: you may need to capture CREATE events.
1365
1366
1367 X will not start
1368
1369 Make sure you followed the steps
1370 outlined above.
1371
1372
1373 Why don't my network devices appear in devfs?
1374
1375 This is not a bug. Network devices have their own, completely separate
1376 namespace. They are accessed via socket(2) and
1377 setsockopt(2) calls, and thus require no device nodes. I have
1378 raised the possibility of moving network devices into the device
1379 namespace, but have had no response.
1380
1381
1382
1383
1384
1385 Alternatives to devfs
1386
1387 I've attempted to collate all the anti-devfs proposals and explain
1388 their limitations. Under construction.
1389
1390
1391 Why not just pass device create/remove events to a daemon?
1392
1393 Here the suggestion is to develop an API in the kernel so that devices
1394 can register create and remove events, and a daemon listens for those
1395 events. The daemon would then populate/depopulate /dev (which
1396 resides on disc).
1397
1398 This has several limitations:
1399
1400
1401 it only works for modules loaded and unloaded (or devices inserted
1402 and removed) after the kernel has finished booting. Without a database
1403 of events, there is no way the daemon could fully populate
1404 /dev
1405
1406
1407 if you add a database to this scheme, the question is then how to
1408 present that database to user-space. If you make it a list of strings
1409 with embedded event codes which are passed through a pipe to the
1410 daemon, then this is only of use to the daemon. I would argue that the
1411 natural way to present this data is via a filesystem (since many of
1412 the events will be of a hierarchical nature), such as devfs.
1413 Presenting the data as a filesystem makes it easy for the user to see
1414 what is available and also makes it easy to write scripts to scan the
1415 "database"
1416
1417
1418 the tight binding between device nodes and drivers is no longer
1419 possible (requiring the otherwise perfectly avoidable
1420 table lookups)
1421
1422
1423 you cannot catch inode lookup events on /dev which means
1424 that module autoloading requires device nodes to be created. This is a
1425 problem, particularly for drivers where only a few inodes are created
1426 from a potentially large set
1427
1428
1429 this technique can't be used when the root FS is mounted
1430 read-only
1431
1432
1433
1434
1435 Just implement a better scsidev
1436
1437 This suggestion involves taking the scsidev programme and
1438 extending it to scan for all devices, not just SCSI devices. The
1439 scsidev programme works by scanning /proc/scsi
1440
1441 Problems:
1442
1443
1444 the kernel does not currently provide a list of all devices
1445 available. Not all drivers register entries in /proc or
1446 generate kernel messages
1447
1448
1449 there is no uniform mechanism to register devices other than the
1450 devfs API
1451
1452
1453 implementing such an API is then the same as the
1454 proposal above
1455
1456
1457
1458
1459 Put /dev on a ramdisc
1460
1461 This suggestion involves creating a ramdisc and populating it with
1462 device nodes and then mounting it over /dev.
1463
1464 Problems:
1465
1466
1467
1468 this doesn't help when mounting the root filesystem, since you
1469 still need a device node to do that
1470
1471
1472 if you want to use this technique for the root device node as
1473 well, you need to use initrd. This complicates the booting sequence
1474 and makes it significantly harder to administer and configure. The
1475 initrd is essentially opaque, robbing the system administrator of easy
1476 configuration
1477
1478
1479 insufficient information is available to correctly populate the
1480 ramdisc. So we come back to the
1481 proposal above to "solve" this
1482
1483
1484 a ramdisc-based solution would take more kernel memory, since the
1485 backing store would be (at best) normal VFS inodes and dentries, which
1486 take 284 bytes and 112 bytes, respectively, for each entry. Compare
1487 that to 72 bytes for devfs
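
To put the memory figures in the last point into perspective, here is a
small arithmetic sketch in shell. It only multiplies the per-entry sizes
quoted above (284 + 112 = 396 bytes for an inode plus dentry, against 72
bytes for devfs) by a number of entries. The function names are
hypothetical and the result is a rough best-case estimate, not a
measurement.

```shell
# Per-entry sizes as quoted in the text above.
VFS_BYTES_PER_ENTRY=$((284 + 112))    # VFS inode + dentry
DEVFS_BYTES_PER_ENTRY=72

# Usage: ramdisc_bytes N   /   devfs_bytes N   (N = number of entries)
ramdisc_bytes() { echo $(( $1 * VFS_BYTES_PER_ENTRY )); }
devfs_bytes()   { echo $(( $1 * DEVFS_BYTES_PER_ENTRY )); }
```

For 1000 entries this gives 396000 bytes for the ramdisc approach
against 72000 bytes for devfs.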
1488
1489
1490
1491
1492 Do nothing: there's no problem
1493
1494 Sometimes people can be heard to claim that the existing scheme is
1495 fine. This is what they're ignoring:
1496
1497
1498 device number size (8 bits each for major and minor) is a real
1499 limitation, and must be fixed somehow. Systems with large numbers of
1500 SCSI devices, for example, will continue to consume the remaining
1501 unallocated major numbers. USB will also need to push beyond the 8 bit
1502 minor limitation
1503
1504
1505 simply increasing the device number size is insufficient. Apart
1506 from causing a lot of pain, it doesn't solve the management issues
1507 of a /dev with thousands or more device nodes
1508
1509
1510 ignoring the problem of a huge /dev will not make it go
1511 away, and dismisses the legitimacy of a large number of people who
1512 want a dynamic /dev
1513
1514
1515 the standard response then becomes: "write a device management
1516 daemon", which brings us back to the
1517 proposal above
1518
1519
1520
1521 -----------------------------------------------------------------------------
1522
1523
1524 Other resources
1525
1526
1527
1528 Douglas Gilbert has written a useful document at
1529
1530 http://www.torque.net/sg/devfs_scsi.html which
1531 explores the SCSI subsystem and how it interacts with devfs
1532
1533
1534 Douglas Gilbert has written another useful document at
1535
1536 http://www.torque.net/scsi/scsihosts.html which
1537 discusses the scsihosts= boot option
1538
1539
1540 Douglas Gilbert has written yet another useful document at
1541
1542 http://www.torque.net/scsi/linux_scsi_24/ which
1543 discusses the Linux SCSI subsystem in 2.4.
1544
1545
1546 Johannes Erdfelt has started a discussion paper on Linux and
1547 hot-swap devices, describing what the requirements are for a scalable
1548 solution and how and why he's used devfs+devfsd. Note that this is an
1549 early draft only, available in plain text form at:
1550
1551 http://johannes.erdfelt.com/hotswap.txt.
1552 Johannes has promised an HTML version will follow.
1553
1554
1555