new dir module #916

Merged
merged 1 commit into from Sep 4, 2018
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,8 @@ This project adheres to [Semantic Versioning](http://semver.org/).
([#921](https://github.com/nix-rust/nix/pull/921))
- Added support for `SCM_CREDENTIALS`, allowing to send process credentials over Unix sockets.
([#923](https://github.com/nix-rust/nix/pull/923))
- Added a `dir` module for reading directories (wraps `fdopendir`, `readdir`, and `rewinddir`).
([#916](https://github.com/nix-rust/nix/pull/916))

### Changed
- Increased required Rust version to 1.22.1/
210 changes: 210 additions & 0 deletions src/dir.rs
@@ -0,0 +1,210 @@
use {Error, NixPath, Result};
use errno::Errno;
use fcntl::{self, OFlag};
use libc;
use std::os::unix::io::{AsRawFd, IntoRawFd, RawFd};
use std::{ffi, fmt, ptr};
use sys;

#[cfg(target_os = "linux")]
Contributor:

Can we use the cfg_if! macro here instead? It's a little more verbose but I find it easier to read an if-else rather than trying to determine that through consecutive cfg statements.

Contributor Author:

likewise

use libc::{dirent64 as dirent, readdir64_r as readdir_r};

#[cfg(not(target_os = "linux"))]
use libc::{dirent, readdir_r};

/// An open directory.
///
/// This is a lower-level interface than `std::fs::ReadDir`. Notable differences:
///   * can be opened from a file descriptor (as returned by `openat`, perhaps before knowing
///     if the path represents a file or directory).
///   * implements `AsRawFd`, so it can be passed to `fstat`, `openat`, etc.
///     The file descriptor continues to be owned by the `Dir`, so callers must not keep a `RawFd`
///     after the `Dir` is dropped.
///   * can be iterated through multiple times without closing and reopening the file
///     descriptor. Each iteration rewinds when finished.
///   * returns entries for `.` (current directory) and `..` (parent directory).
///   * returns entries' names as a `CStr` (no allocation or conversion beyond whatever libc
///     does).
pub struct Dir(
    // This could be ptr::NonNull once nix requires Rust 1.25.
    *mut libc::DIR
);

impl Dir {
    /// Opens the given path as with `fcntl::open`.
    pub fn open<P: ?Sized + NixPath>(path: &P, oflag: OFlag,
                                     mode: sys::stat::Mode) -> Result<Self> {
        let fd = fcntl::open(path, oflag, mode)?;
        Dir::from_fd(fd)
    }

    /// Opens the given path as with `fcntl::openat`.
    pub fn openat<P: ?Sized + NixPath>(dirfd: RawFd, path: &P, oflag: OFlag,
                                       mode: sys::stat::Mode) -> Result<Self> {
        let fd = fcntl::openat(dirfd, path, oflag, mode)?;
        Dir::from_fd(fd)
    }

    /// Converts from a descriptor-based object, closing the descriptor on success or failure.
    #[inline]
    pub fn from<F: IntoRawFd>(fd: F) -> Result<Self> {
        Dir::from_fd(fd.into_raw_fd())
    }

    /// Converts from a file descriptor, closing it on success or failure.
    pub fn from_fd(fd: RawFd) -> Result<Self> {
        let d = unsafe { libc::fdopendir(fd) };
        if d.is_null() {
            let e = Error::last();
            unsafe { libc::close(fd) };
            return Err(e);
        };
        Ok(Dir(d))
    }

    /// Returns an iterator of `Result<Entry>` which rewinds when finished.
    pub fn iter(&mut self) -> Iter {
        Iter(self)
    }
}

// `Dir` is not `Sync`. With the current implementation, it could be, but according to
// https://www.gnu.org/software/libc/manual/html_node/Reading_002fClosing-Directory.html,
// future versions of POSIX are likely to obsolete `readdir_r` and specify that it's unsafe to
// call `readdir` simultaneously from multiple threads.
//
// `Dir` is safe to pass from one thread to another, as it's not reference-counted.
unsafe impl Send for Dir {}
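
The `Send`-but-not-`Sync` choice above can be illustrated with a small std-only sketch. `Handle`, `bump`, and `bump_on_other_thread` are hypothetical names, and a boxed integer stands in for the `DIR*` that `Dir` wraps. A struct holding a raw pointer is neither `Send` nor `Sync` by default, so opting in to `Send` alone allows moving the value to another thread while still ruling out shared access from several threads.

```rust
use std::thread;

/// Hypothetical stand-in for `Dir`: owns a heap allocation through a raw pointer.
pub struct Handle(*mut u32);

// Raw pointers make the struct neither `Send` nor `Sync` by default.
// Only `Send` is opted in to: the value may move between threads, but
// `&Handle` still cannot be shared across threads.
unsafe impl Send for Handle {}

impl Handle {
    pub fn new(v: u32) -> Self {
        Handle(Box::into_raw(Box::new(v)))
    }

    /// Requires `&mut self`, so at most one caller touches the pointer at a time.
    pub fn bump(&mut self) -> u32 {
        unsafe {
            *self.0 += 1;
            *self.0
        }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Reclaims the allocation, much as `Dir` calls `closedir` on drop.
        unsafe { drop(Box::from_raw(self.0)) }
    }
}

/// Moves the handle into a new thread, uses it there, and returns the result.
pub fn bump_on_other_thread(mut h: Handle) -> u32 {
    thread::spawn(move || h.bump()).join().unwrap()
}
```

Without the `unsafe impl Send`, the `thread::spawn` call would be rejected at compile time.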

impl AsRawFd for Dir {
    fn as_raw_fd(&self) -> RawFd {
        unsafe { libc::dirfd(self.0) }
    }
}

impl fmt::Debug for Dir {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.debug_struct("Dir")
            .field("fd", &self.as_raw_fd())
            .finish()
    }
}

impl Drop for Dir {
    fn drop(&mut self) {
        unsafe { libc::closedir(self.0) };
    }
}

#[derive(Debug)]
pub struct Iter<'d>(&'d mut Dir);

impl<'d> Iterator for Iter<'d> {
    type Item = Result<Entry>;

    fn next(&mut self) -> Option<Self::Item> {
        unsafe {
            // Note: POSIX specifies that portable applications should dynamically allocate a
            // buffer with room for a `d_name` field of size `pathconf(..., _PC_NAME_MAX)` plus 1
            // for the NUL byte. It doesn't look like the std library does this; it just uses
            // fixed-sized buffers (and libc's dirent seems to be sized so this is appropriate).
            // Probably fine here too then.
            let mut ent: Entry = Entry(::std::mem::uninitialized());
            let mut result = ptr::null_mut();
            if let Err(e) = Errno::result(readdir_r((self.0).0, &mut ent.0, &mut result)) {
                return Some(Err(e));
            }
            if result == ptr::null_mut() {
                return None;
            }
            assert_eq!(result, &mut ent.0 as *mut dirent);
            return Some(Ok(ent));
        }
    }
}

impl<'d> Drop for Iter<'d> {
    fn drop(&mut self) {
        unsafe { libc::rewinddir((self.0).0) }
    }
}

/// A directory entry, similar to `std::fs::DirEntry`.
///
/// Note that unlike the std version, this may represent the `.` or `..` entries.
#[derive(Copy, Clone)]
pub struct Entry(dirent);

#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum Type {
    Fifo,
    CharacterDevice,
    Directory,
    BlockDevice,
    File,
    Symlink,
    Socket,
}

impl Entry {
    /// Returns the inode number (`d_ino`) of the underlying `dirent`.
    #[cfg(any(target_os = "android",
Contributor:

Can we use cfg-if here instead of these huge if/else blocks?

Contributor Author:

I tried this on July 5th, and the only way I could get it to work was by having two impl blocks, which makes the rustdoc output look weird. cfg_if can only be attached at the "item" level; apparently neither functions nor expressions are items.
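
One possible way around the two-`impl`-block problem, sketched with plain `#[cfg]` attributes so that it compiles without extra crates: select the platform-specific field in a free helper function, then call that helper from a single `ino` method. Free functions are items, so pairs like these could presumably also be wrapped in `cfg_if!`. `RawDirent`, `make_dirent`, and `raw_ino` are hypothetical stand-ins rather than part of this PR or of libc, and the Linux versus non-Linux split is a simplification of the real target list.

```rust
/// Hypothetical stand-in for `libc::dirent`: the inode field has a
/// different name depending on the target.
pub struct RawDirent {
    #[cfg(target_os = "linux")]
    pub d_ino: u64,
    #[cfg(not(target_os = "linux"))]
    pub d_fileno: u64,
}

// Item-level selection: exactly one function of each pair is compiled.
#[cfg(target_os = "linux")]
fn raw_ino(d: &RawDirent) -> u64 {
    d.d_ino
}

#[cfg(not(target_os = "linux"))]
fn raw_ino(d: &RawDirent) -> u64 {
    d.d_fileno
}

/// Builds a value for the demo, hiding the field-name difference.
#[cfg(target_os = "linux")]
pub fn make_dirent(n: u64) -> RawDirent {
    RawDirent { d_ino: n }
}

#[cfg(not(target_os = "linux"))]
pub fn make_dirent(n: u64) -> RawDirent {
    RawDirent { d_fileno: n }
}

pub struct Entry(pub RawDirent);

impl Entry {
    /// One method in one `impl` block, so rustdoc lists `ino` once.
    pub fn ino(&self) -> u64 {
        raw_ino(&self.0)
    }
}
```

The cost is an extra private function per platform-dependent accessor; whether that reads better than the long `cfg(any(...))` lists is a judgment call.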

              target_os = "emscripten",
              target_os = "fuchsia",
              target_os = "haiku",
              target_os = "ios",
              target_os = "l4re",
              target_os = "linux",
              target_os = "macos",
              target_os = "solaris"))]
    pub fn ino(&self) -> u64 {
        self.0.d_ino as u64
    }

    /// Returns the inode number (`d_fileno`) of the underlying `dirent`.
    #[cfg(not(any(target_os = "android",
                  target_os = "emscripten",
                  target_os = "fuchsia",
                  target_os = "haiku",
                  target_os = "ios",
                  target_os = "l4re",
                  target_os = "linux",
                  target_os = "macos",
                  target_os = "solaris")))]
    pub fn ino(&self) -> u64 {
        self.0.d_fileno as u64
    }

    /// Returns the bare file name of this directory entry without any other leading path component.
    pub fn file_name(&self) -> &ffi::CStr {
        unsafe { ::std::ffi::CStr::from_ptr(self.0.d_name.as_ptr()) }
    }

    /// Returns the type of this directory entry, if known.
    ///
    /// See platform `readdir(3)` or `dirent(5)` manpage for when the file type is known;
    /// notably, some Linux filesystems don't implement this. The caller should use `stat` or
    /// `fstat` if this returns `None`.
    pub fn file_type(&self) -> Option<Type> {
        match self.0.d_type {
            libc::DT_FIFO => Some(Type::Fifo),
            libc::DT_CHR => Some(Type::CharacterDevice),
            libc::DT_DIR => Some(Type::Directory),
            libc::DT_BLK => Some(Type::BlockDevice),
            libc::DT_REG => Some(Type::File),
            libc::DT_LNK => Some(Type::Symlink),
            libc::DT_SOCK => Some(Type::Socket),
            /* libc::DT_UNKNOWN | */ _ => None,
Contributor:

Wouldn't this be better to actually use DT_UNKNOWN and then have another catchall that maps to unreachable!()?

Contributor Author:

If the list of DT_ types is incomplete (which seems possible: I haven't read all the platforms' manpages, and a platform could add one anyway), what should this code do? With unreachable!, it'd panic. I'd prefer it return None. Calling code is supposed to handle DT_UNKNOWN gracefully, and it makes sense to me to handle an unexpected enum type in the same way.
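
The trade-off in this thread can be sketched without libc. The `DT_*` constants below are hard-coded to the values used on Linux, purely as an assumption to keep the example standalone; real code would take them from `libc`. `classify` is a hypothetical free function that mirrors the shape of `Entry::file_type`.

```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum Type {
    Fifo,
    CharacterDevice,
    Directory,
    BlockDevice,
    File,
    Symlink,
    Socket,
}

// Assumed values (as on Linux); real code would use the `libc::DT_*` constants.
const DT_UNKNOWN: u8 = 0;
const DT_FIFO: u8 = 1;
const DT_CHR: u8 = 2;
const DT_DIR: u8 = 4;
const DT_BLK: u8 = 6;
const DT_REG: u8 = 8;
const DT_LNK: u8 = 10;
const DT_SOCK: u8 = 12;

/// Maps a raw `d_type` byte to a `Type`, or `None` when the type is not known.
pub fn classify(d_type: u8) -> Option<Type> {
    match d_type {
        DT_FIFO => Some(Type::Fifo),
        DT_CHR => Some(Type::CharacterDevice),
        DT_DIR => Some(Type::Directory),
        DT_BLK => Some(Type::BlockDevice),
        DT_REG => Some(Type::File),
        DT_LNK => Some(Type::Symlink),
        DT_SOCK => Some(Type::Socket),
        // The filesystem did not report a type.
        DT_UNKNOWN => None,
        // A value this list does not cover: treated like "unknown" instead of
        // panicking, so callers fall back to `stat` in both cases.
        _ => None,
    }
}
```

Replacing the last arm with `unreachable!()` would turn any value missing from the list into a panic in the caller's directory walk. Linux headers, for example, also define `DT_WHT` (14), which the list above does not cover.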

        }
    }
}

impl fmt::Debug for Entry {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        f.debug_struct("Entry")
            .field("ino", &self.ino())
            .field("file_name", &self.file_name())
            .field("file_type", &self.file_type())
            .finish()
    }
}
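
The rewind-on-drop behaviour of `Iter` in this file can be modelled without libc. In the sketch below, `Cursor` is a hypothetical stand-in for `Dir` (a list of names plus a read position instead of a `DIR*`), and resetting the position stands in for `rewinddir`. It is an illustration of the pattern, not the PR's implementation.

```rust
/// Hypothetical stand-in for `Dir`: entries plus a read position.
pub struct Cursor {
    names: Vec<String>,
    pos: usize,
}

impl Cursor {
    pub fn new(names: &[&str]) -> Self {
        Cursor {
            names: names.iter().map(|s| s.to_string()).collect(),
            pos: 0,
        }
    }

    /// Borrows the cursor mutably, so only one iteration can be live at a time.
    pub fn iter(&mut self) -> Iter<'_> {
        Iter(self)
    }
}

pub struct Iter<'c>(&'c mut Cursor);

impl<'c> Iterator for Iter<'c> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        let item = self.0.names.get(self.0.pos).cloned();
        if item.is_some() {
            self.0.pos += 1;
        }
        item
    }
}

impl<'c> Drop for Iter<'c> {
    /// Plays the role of `rewinddir`: the next iteration starts from the top,
    /// even if this one stopped early.
    fn drop(&mut self) {
        self.0.pos = 0;
    }
}
```

Because the reset happens in `Drop`, an iteration abandoned partway (for example with `take`) still leaves the cursor at the start for the next caller.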
1 change: 1 addition & 0 deletions src/lib.rs
@@ -29,6 +29,7 @@ pub extern crate libc;
#[macro_use] mod macros;

// Public crates
pub mod dir;
pub mod errno;
#[deny(missing_docs)]
pub mod features;
1 change: 1 addition & 0 deletions test/test.rs
@@ -10,6 +10,7 @@ extern crate rand;
extern crate tempfile;

mod sys;
mod test_dir;
mod test_fcntl;
#[cfg(any(target_os = "dragonfly",
          target_os = "freebsd",
46 changes: 46 additions & 0 deletions test/test_dir.rs
@@ -0,0 +1,46 @@
extern crate nix;
extern crate tempfile;

use nix::dir::{Dir, Type};
use nix::fcntl::OFlag;
use nix::sys::stat::Mode;
use std::fs::File;
use self::tempfile::tempdir;

#[test]
fn read() {
    let tmp = tempdir().unwrap();
    File::create(&tmp.path().join("foo")).unwrap();
    ::std::os::unix::fs::symlink("foo", tmp.path().join("bar")).unwrap();
    let mut dir = Dir::open(tmp.path(), OFlag::O_DIRECTORY | OFlag::O_RDONLY | OFlag::O_CLOEXEC,
                            Mode::empty()).unwrap();
    let mut entries: Vec<_> = dir.iter().map(|e| e.unwrap()).collect();
    entries.sort_by(|a, b| a.file_name().cmp(b.file_name()));
    let entry_names: Vec<_> = entries
        .iter()
        .map(|e| e.file_name().to_str().unwrap().to_owned())
        .collect();
    assert_eq!(&entry_names[..], &[".", "..", "bar", "foo"]);

    // Check file types. The system is allowed to return DT_UNKNOWN (aka None here) but if it does
Contributor:

But why would it return DT_UNKNOWN? Does it do this in practice on any systems?

Contributor Author:

Yes, it does. From the Linux readdir(3) manpage:

    Currently, only some filesystems (among them: Btrfs, ext2, ext3, and ext4) have
    full support for returning the file type in d_type. All applications must properly
    handle a return of DT_UNKNOWN.
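
Handling an unknown type in calling code usually means falling back to a `stat`-style call. A minimal std-only sketch of that fallback follows; `Kind` and `kind_or_stat` are hypothetical names, and `std::fs::symlink_metadata` (which does not follow symlinks) stands in for whatever `stat` or `fstat` call a nix-based caller would actually use.

```rust
use std::fs;
use std::io;
use std::path::Path;

#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub enum Kind {
    Directory,
    File,
    Symlink,
    Other,
}

/// Returns the hinted type if the directory entry supplied one; otherwise
/// asks the filesystem, without following a symlink in the last component.
pub fn kind_or_stat(hint: Option<Kind>, path: &Path) -> io::Result<Kind> {
    if let Some(kind) = hint {
        // Cheap path: the directory entry already carried the type.
        return Ok(kind);
    }
    // Slow path: one extra system call per entry.
    let ft = fs::symlink_metadata(path)?.file_type();
    Ok(if ft.is_dir() {
        Kind::Directory
    } else if ft.is_file() {
        Kind::File
    } else if ft.is_symlink() {
        Kind::Symlink
    } else {
        Kind::Other
    })
}
```

The hint is trusted when present, so the extra call is only paid on filesystems that do not fill in `d_type`.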

    // return a type, ensure it's correct.
    assert!(&[Some(Type::Directory), None].contains(&entries[0].file_type())); // .: dir
    assert!(&[Some(Type::Directory), None].contains(&entries[1].file_type())); // ..: dir
    assert!(&[Some(Type::Symlink), None].contains(&entries[2].file_type())); // bar: symlink
    assert!(&[Some(Type::File), None].contains(&entries[3].file_type())); // foo: regular file
}

#[test]
fn rewind() {
    let tmp = tempdir().unwrap();
    let mut dir = Dir::open(tmp.path(), OFlag::O_DIRECTORY | OFlag::O_RDONLY | OFlag::O_CLOEXEC,
                            Mode::empty()).unwrap();
    let entries1: Vec<_> = dir.iter().map(|e| e.unwrap().file_name().to_owned()).collect();
    let entries2: Vec<_> = dir.iter().map(|e| e.unwrap().file_name().to_owned()).collect();
    assert_eq!(entries1, entries2);
}

#[test]
fn ebadf() {
    assert_eq!(Dir::from_fd(-1).unwrap_err(), nix::Error::Sys(nix::errno::Errno::EBADF));
}