03 Jan 2019

Lints, Syntax Parsing, and You

In my previous post I ran into a number of issues and confusion around clippy’s #[clippy::author] annotation and autogenerated code. Instead of continuing with clippy’s documentation, I’m going to jump over to llogiq’s blogpost on writing clippy lints and see what I can learn about lint implementation and the necessary datatypes.

Because clippy is a Rust toolchain component, and the clippy version on https://crates.io is no longer maintained, documentation tools like docs.rs are unavailable for browsing clippy’s types. Fortunately, Rust has a very strong offline documentation story, and generating documentation for a project is as simple as cargo doc --open. This can be a little tricky to get set up when you’re halfway through implementing a feature and looking for information in the documentation though, because the project has to successfully compile as part of generating the offline documentation.

Since clippy uses Rust compiler types to implement lints, we also need to have access to the documentation for rustc, Rust’s compiler, on the nightly toolchain; this can be accessed at https://doc.rust-lang.org/nightly/nightly-rustc/rustc/index.html. Unfortunately this set of compiler documentation is not available offline which seems like a gap in Rust’s documentation story.

Steps to Writing a Clippy Lint

I’ve broken the process I went through to implement this lint into steps to more clearly separate the different parts of a lint, the tools and processes involved at each stage, and problems and errors I encountered along the way as well as solutions I found where applicable.

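For quick reference, here is a list of the steps involved:

  • Step One: Write an Example
  • Step Two: Declare the Lint
  • Step Three: Register the Lint with Clippy
  • Step Four: Start Implementing Lint Passes
  • Step Five: Inspect and Interpret the AST
  • Step Six: Implement EarlyLintPass
  • Step Seven: Generate the .stderr File for the New Lint
  • Step Eight: Iterate
  • Step Nine: Run the Full Test Suite
  • Step Ten: Linting Clippy with Local Changes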

Step One: Write an Example

The first step in writing a lint, as with most code, is to write the piece that uses the code you’re going to write. If we were writing regular code this might take the form of one or more test cases, but because we’re writing a lint we want to instead write an example of the code we’re trying to lint against. I’ve already done this earlier when I tried to use the #[clippy::author] directive, so I’m going to reuse the same example while using llogiq’s blogpost as a guide for my own lint.

Here’s the small code example we want to lint against:

pub struct MyStruct {
    id: usize
}

impl MyStruct {
    pub fn get_id(&self) -> usize {
        self.id
    }
}

fn main() {
   let s = MyStruct { id: 42 };
   s.get_id();
}

Step Two: Declare the Lint

After writing the example to lint against, we need to actually define the lint; this is done using the declare_clippy_lint! macro from the clippy_lints crate. I tried compiling clippy after declaring the lint and ran into the following error:

cargo check
   Compiling clippy_lints v0.0.212
error: cannot find macro `declare_tool_lint!` in this scope
  --> clippy_lints/src/lib.rs:53:9
   |
53 |           declare_tool_lint! { pub clippy::$name, Warn, $description, report_in_external_macro: true }
   |           ^^^^^^^^^^^^^^^^^
   |
  ::: clippy_lints/src/getter_prefix.rs:3:1
   |
3  | / declare_clippy_lint! {
4  | |     pub GETTER_PREFIX,
5  | |     style,
6  | |     "prefixing a getter with `get_`, which does not follow convention"
7  | | }
   | |_- in this macro invocation

error: aborting due to previous error

error: Could not compile `clippy_lints`.
warning: build failed, waiting for other jobs to finish...
error: cannot find macro `declare_tool_lint!` in this scope
  --> clippy_lints/src/lib.rs:53:9
   |
53 |           declare_tool_lint! { pub clippy::$name, Warn, $description, report_in_external_macro: true }
   |           ^^^^^^^^^^^^^^^^^
   |
  ::: clippy_lints/src/getter_prefix.rs:3:1
   |
3  | / declare_clippy_lint! {
4  | |     pub GETTER_PREFIX,
5  | |     style,
6  | |     "prefixing a getter with `get_`, which does not follow convention"
7  | | }
   | |_- in this macro invocation

error: aborting due to previous error

error: Could not compile `clippy_lints`.

To learn more, run the command again with --verbose.

cargo can’t find the declare_tool_lint! macro in the current scope, which is not an error I was expecting to see because I didn’t think I was using that macro in my lint definition. It turns out that the definition of the declare_clippy_lint! macro uses the declare_tool_lint! macro, so this second macro must be brought into scope before the first can be used. I’m not very familiar with macros or metaprogramming in Rust, but after looking at a number of other lints it seems like all of them use the declare_tool_lint! macro so I will too.

Here is the lint code so far, including the first line which imports the macro that declare_clippy_lint! depends on:

use crate::rustc::declare_tool_lint;

declare_clippy_lint! {
    pub GETTER_PREFIX,
    style,
    "prefixing a getter with `get_`, which does not follow convention"
}
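The macro dependency described above can be reproduced in miniature. This is a minimal sketch, not clippy’s actual macro definitions: DemoLint, declare_tool_lint_demo! and declare_clippy_lint_demo! are made-up names for illustration. The point it shows is that the outer macro expands to a call to the inner macro, so the inner macro has to be visible wherever the outer one is invoked.

```rust
/// Hypothetical stand-in for the lint value the real macros generate.
pub struct DemoLint {
    pub name: &'static str,
    pub desc: &'static str,
}

// Inner macro: plays the role of `declare_tool_lint!` (illustrative only).
macro_rules! declare_tool_lint_demo {
    (pub $name:ident, $desc:expr) => {
        pub static $name: DemoLint = DemoLint {
            name: stringify!($name),
            desc: $desc,
        };
    };
}

// Outer macro: plays the role of `declare_clippy_lint!`. Its expansion is a
// call to the inner macro, which is resolved where the outer macro is used.
macro_rules! declare_clippy_lint_demo {
    (pub $name:ident, $desc:expr) => {
        declare_tool_lint_demo! { pub $name, $desc }
    };
}

declare_clippy_lint_demo! {
    pub GETTER_PREFIX,
    "prefixing a getter with `get_`, which does not follow convention"
}
```

If the definition of declare_tool_lint_demo! is removed, the compiler reports that it cannot find the inner macro at the point where the outer macro is invoked, which is the same shape as the error shown earlier.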

As part of declaring the lint, we also want to document the lint by describing what it does and include some examples of code that will not pass the lint and alternatives that will pass the lint rules. By following a standard convention, documentation written for each lint can be extracted and added to the lint list which provides a filterable list of all of clippy’s lints.

Rust provides a lot of tools for documenting Rust code, the cornerstone of which is documentation comments. Documentation comments support Markdown syntax for formatting and are used to generate browsable HTML pages, without requiring the author to maintain separate written documentation. Documentation comments also drive other tools such as docs.rs.

Here is a (very rough) initial documentation comment for the lint, including examples of bad and good code samples:

/// **What it does:** Checks for the `get_` prefix on getters.
///
/// **Why is this bad?** The Rust API Guidelines section on naming
/// [specifies](https://rust-lang-nursery.github.io/api-guidelines/naming.html#getter-names-follow-rust-convention-c-getter)
/// that the `get_` prefix is not used for getters in Rust code unless
/// there is a single and obvious thing that could reasonably be gotten by
/// a getter.
///
/// **Known problems:** Exceptions not yet implemented.
///
/// **Example:**
///
/// ```rust
/// // Bad
/// impl B {
///     fn get_id(&self) -> usize {
///         ..
///     }
/// }
///
/// // Good
/// impl G {
///     fn id(&self) -> usize {
///         ..
///     }
/// }
/// ```

Step Three: Register the Lint with Clippy

This is a largely automated process thanks to some clippy tooling as described in the How Clippy Works section of the contribution documentation. Simply (re-)run util/dev update_lints as necessary, which autogenerates the majority of clippy_lints/src/lib.rs to declare lints and lint groups.

Alternatively, this can be done manually by adding the lint to the correct lint group inside the register_plugins function.

Regardless of how the lint is added to the lint groups, the lint must be registered with the lint registry by introducing the lint as either an early or late lint pass. In the case of this lint, which will be implemented as an early lint pass, the following line adds it to the lint registry as declared inside the register_plugins function:

reg.register_early_lint_pass(box naming::GetterPrefix);
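To make the idea of registering a pass more concrete, here is a toy model of a registry that owns boxed lint passes and runs each of them over some input. It is only a sketch under heavy assumptions: DemoEarlyPass, DemoRegistry and GetterPrefixDemo are hypothetical names, and the real rustc registry and EarlyLintPass trait have a much larger surface than this.

```rust
/// Toy stand-in for an early lint pass (not rustc's `EarlyLintPass`).
trait DemoEarlyPass {
    fn name(&self) -> &'static str;
    /// Called once per method name; returns a warning if the name is rejected.
    fn check_method_name(&mut self, method: &str) -> Option<String>;
}

struct GetterPrefixDemo;

impl DemoEarlyPass for GetterPrefixDemo {
    fn name(&self) -> &'static str {
        "getter_prefix"
    }

    fn check_method_name(&mut self, method: &str) -> Option<String> {
        if method.starts_with("get_") {
            Some(format!("{}: `{}` is prefixed with `get_`", self.name(), method))
        } else {
            None
        }
    }
}

/// Toy registry: stores passes as trait objects, as the `box` expression
/// in the registration line above suggests the real one does.
#[derive(Default)]
struct DemoRegistry {
    early_passes: Vec<Box<dyn DemoEarlyPass>>,
}

impl DemoRegistry {
    fn register_early_lint_pass(&mut self, pass: Box<dyn DemoEarlyPass>) {
        self.early_passes.push(pass);
    }

    /// Runs every registered pass over every method name and collects warnings.
    fn run(&mut self, methods: &[&str]) -> Vec<String> {
        let mut warnings = Vec::new();
        for pass in self.early_passes.iter_mut() {
            for &method in methods {
                if let Some(warning) = pass.check_method_name(method) {
                    warnings.push(warning);
                }
            }
        }
        warnings
    }
}
```

Registering Box::new(GetterPrefixDemo) and calling run(&["get_id", "id"]) yields a single warning, for get_id.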

Step Four: Start Implementing Lint Passes

Now that the lint is defined and registered with clippy, we can start implementing the logic to lint against the example we implemented in Step One.

For most lints there are two traits that need to be implemented so the lint can be registered with rustc_plugin::registry::Registry: the LintPass trait and one of either the EarlyLintPass or the LateLintPass trait. LintPass is apparently necessary to provide descriptions of the possible lints the lint can emit, but in all of the lints I have looked at, the LintPass implementation always takes the form:

impl LintPass for Pass {
    fn get_lints(&self) -> LintArray {
        lint_array!(LINT_NAME)
    }
}

so I’m surprised there hasn’t been a #[derive] annotation written for it or some kind of macro that would reduce the repetition.
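As a rough illustration of what such a macro could look like, here is a sketch written against a toy trait. DemoLintPass and impl_demo_lint_pass! are made-up names, and the toy trait returns lint names as strings rather than rustc’s LintArray, so this is not something that exists in clippy or rustc as written.

```rust
/// Toy stand-in for rustc's `LintPass` trait (hypothetical, simplified).
trait DemoLintPass {
    fn get_lints(&self) -> Vec<&'static str>;
}

/// Hypothetical helper macro that writes the repetitive impl for us.
macro_rules! impl_demo_lint_pass {
    ($pass:ident => [$($lint:ident),* $(,)?]) => {
        impl DemoLintPass for $pass {
            fn get_lints(&self) -> Vec<&'static str> {
                // Each lint identifier is turned into its name as a string.
                vec![$(stringify!($lint)),*]
            }
        }
    };
}

struct GetterPrefix;

// One line replaces the hand-written impl block.
impl_demo_lint_pass!(GetterPrefix => [GETTER_PREFIX]);
```

With this in place, GetterPrefix.get_lints() returns a vector containing the single name GETTER_PREFIX.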

As for EarlyLintPass or LateLintPass, the choice of which trait to implement comes down to the kind of information a given lint needs about the code it is linting: EarlyLintPass methods only provide abstract syntax tree (AST) information, whereas LateLintPass methods are, as the name implies, executed later in the compilation process and contain type information. Since this lint is only interested in function names and checking them against known patterns, I’ve decided to implement the EarlyLintPass trait. There seems to be a lot of overlap in method signatures between the two types of lint pass, so if I need to switch to LateLintPass to get access to additional type information it should not be too difficult to transition over.

Step Five: Inspect and Interpret the AST

Almost everything up until now has been boilerplate for getting the lint set up and correctly registered with clippy’s infrastructure. Now we need to figure out how to make the lint match the undesired function names we’ve already written in our test. To do this, we first need to understand how rustc sees the code we have written. rustc represents source code as an abstract syntax tree (AST), and the lint works by walking that tree and matching the nodes that correspond to the undesired source code. To get a sense of the program structure, we can have rustc generate the AST and then inspect it manually: rustc has the -Z option for controlling various debug options (see rustc --help), and one of these options, -Z ast-json, tells rustc to print the AST as JSON and halt compilation.

The full command to print the AST as JSON, specifying the individual test for this lint whose code we want to inspect, is:

$ rustc tests/ui/naming.rs -L target/debug -Z ast-json

The initial output of this command is utterly unreadable because rustc prints it as a single line of JSON, so I’m going to use jq to pretty-print the AST so it’s easier to read:

{
    "module": {
        "inner": {
            "lo": 426,
            "hi": 611
        },
        "items": [
            [snip 1604 lines]
        ],
        "inline": true
    },
    "attrs": [],
    "span": {
        "lo": 426,
        "hi": 611
    }
}

In all, the AST for my 12-line test file ended up expanding to a 1627-line (pretty-printed) JSON file. The items key and its 1604 lines are where the actually interesting AST information is in the data structure, containing information about every identifier, implementation, attribute, etc in the code. Within the items array, the name of the function in the test, get_id, appears 5 times in various contexts such as "variant": "Impl" and "variant": "MethodCall". The following is what I believe to be the beginning of the AST for the get_id function:

{
    "id": 20,
    "ident": "get_id",
    "vis": {
        "node": "Public",
        "span": {
            "lo": 485,
            "hi": 488
        }
    },
    "defaultness": "Final",
    "attrs": [],
    "generics": {
        "params": [],
        "where_clause": {
            "id": 21,
            "predicates": [],
            "span": {
                "lo": 0,
                "hi": 0
            }
        },
        "span": {
            "lo": 0,
            "hi": 0
        }
    },
    ...
}

Specifically, the following are the nodes that describe the pub fn get_id text literals of the function signature:

"tokens": [
    {
        "variant": "Token",
        "fields": [
            {
                "lo": 485,
                "hi": 488
            },
            {
                "variant": "Ident",
                "fields": [
                    "pub",
                    false
                ]
            }
        ]
    },
    {
        "variant": "Token",
        "fields": [
            {
                "lo": 489,
                "hi": 491
            },
            {
                "variant": "Ident",
                "fields": [
                    "fn",
                    false
                ]
            }
        ]
    },
    {
        "variant": "Token",
        "fields": [
            {
                "lo": 492,
                "hi": 498
            },
            {
                "variant": "Ident",
                "fields": [
                    "get_id",
                    false
                ]
            }
        ]
    },
]
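The lo and hi numbers in these nodes line up with the lengths of the tokens they describe: 485 to 488 covers the three bytes of pub, 489 to 491 covers fn, and 492 to 498 covers get_id. A small sketch of that mapping, assuming the spans are half-open byte ranges into the source text (the Span struct and snippet function here are hypothetical helpers, not rustc types):

```rust
/// Half-open byte range, mirroring the `lo`/`hi` pairs in the JSON above.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    lo: usize,
    hi: usize,
}

/// Returns the text covered by `span`.
/// `base` is the absolute offset at which `src` begins.
/// Yields `None` if the span falls outside `src`.
fn snippet(src: &str, base: usize, span: Span) -> Option<&str> {
    let start = span.lo.checked_sub(base)?;
    let end = span.hi.checked_sub(base)?;
    src.get(start..end)
}
```

For example, with src set to "pub fn get_id" and base set to 485, the span 492..498 resolves to get_id.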

Now that the relevant sections of the AST have been identified, we can use this information to get a better understanding of how rustc (and thus tools like clippy) understands the Rust code written in the source file. This information can be useful at many different stages of lint implementation, particularly if we need to debug our lint or if our lint code is not matching the expected nodes.

Step Six: Implement EarlyLintPass

The EarlyLintPass trait requires the implementation of one of its provided methods to perform the lint work. Looking at the list of provided methods, however, I’m not really sure where to start or which method to implement. A lot of rustc’s internals seem to be chronically under-documented, which makes it very difficult to understand how different internals are used (what’s the difference between check_item and check_item_post, for example?) or the subtle (or not so subtle) differences between various types or methods.

My strategy for learning about the rustc internals so far has had two parts. The first is looking at existing lints and trying to work out what the internals do from each lint’s goals and code. The second is inspecting not the EarlyLintPass methods themselves but the types they receive as arguments, such as syntax::ast::Item or syntax::ast::Local, which are generally better documented than the methods that use them. These types also match the AST structure produced by rustc very closely, if not exactly. Once the relevant structure has been identified in the AST as above, it is easier to choose the correct method to implement: pick the one whose argument type carries the same information as the already-identified AST subtree(s).

Based on my understanding, it looks like the method that I want to implement is check_item which will yield syntax::ast::Item instances whose syntax::ast::ItemKind we can then match on for function declarations.

My initial implementation of the EarlyLintPass trait based on the above yielded the following:

impl EarlyLintPass for GetterPrefix {
    fn check_item(&mut self, cx: &EarlyContext<'_>, item: &ast::Item) {
        if let ast::ItemKind::Fn(..) = item.node {
            let name = item.ident.name;
            if name.as_str().starts_with("get_") {
                span_lint(cx, GETTER_PREFIX, item.span, "prefixing a getter with `get_` does not follow naming conventions");
            }
        }
    }
}

This implementation checks each Item in the AST and looks for nodes identified as ItemKind::Fn; that is, functions declared in the source. For matching nodes, the name is extracted from the node identifier and is checked to see if it starts with the get_ prefix. The span_lint function comes from clippy’s lint utils and is what actually contextualizes lint warnings and errors around the offending code and outputs the lint message.

However, it turns out that if let ast::ItemKind::Fn(..) only matches the main function name of the test case, rather than my expectation that it would match all functions declared within the file.
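The reason becomes clearer with a toy model of the tree. In the sketch below, Item, ItemKind and ImplItem are heavily simplified, hypothetical stand-ins for the syntax::ast types of the same names: top-level functions are items in their own right, while methods are nested inside an Impl item, so a hook that only looks at ItemKind::Fn never visits them.

```rust
/// Toy stand-ins for `ast::ImplItem`, `ast::ItemKind` and `ast::Item`.
struct ImplItem {
    ident: &'static str,
}

enum ItemKind {
    Fn,
    Struct,
    Impl(Vec<ImplItem>),
}

struct Item {
    ident: &'static str,
    kind: ItemKind,
}

/// Names seen by a `check_item`-style hook that matches only `ItemKind::Fn`.
fn fn_items(items: &[Item]) -> Vec<&'static str> {
    items
        .iter()
        .filter(|item| matches!(item.kind, ItemKind::Fn))
        .map(|item| item.ident)
        .collect()
}

/// Names seen by a `check_impl_item`-style hook that visits impl members.
fn impl_methods(items: &[Item]) -> Vec<&'static str> {
    items
        .iter()
        .flat_map(|item| match &item.kind {
            ItemKind::Impl(members) => members.iter().map(|m| m.ident).collect::<Vec<_>>(),
            _ => Vec::new(),
        })
        .collect()
}

/// The example file from Step One, expressed in the toy model.
fn example_file() -> Vec<Item> {
    vec![
        Item { ident: "MyStruct", kind: ItemKind::Struct },
        Item { ident: "MyStruct", kind: ItemKind::Impl(vec![ImplItem { ident: "get_id" }]) },
        Item { ident: "main", kind: ItemKind::Fn },
    ]
}
```

Running fn_items over example_file() finds only main, while impl_methods finds get_id, matching what the real lint pass observed.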

Instead, since we are interested in functions implemented for a type, we can use the check_impl_item method which will yield syntax::ast::ImplItem instances whose syntax::ast::ImplItemKind we can then match on for methods like so:

impl EarlyLintPass for GetterPrefix {
    fn check_impl_item(&mut self, cx: &EarlyContext<'_>, implitem: &ast::ImplItem) {
        if let ast::ImplItemKind::Method(..) = implitem.node {
            let name = implitem.ident.name;
            if name.as_str().starts_with("get_") {
                span_lint(
                    cx,
                    GETTER_PREFIX,
                    implitem.span,
                    "prefixing a getter with `get_` does not follow naming conventions"
                );
            }
        }
    }
}

The above implementation works very similarly to the initial implementation, but instead of checking Items looking for ItemKind::Fn, it instead looks at ImplItems in the AST and matches ImplItemKind::Method nodes (since implementation nodes also include const declarations, types and type aliases, traits, and macros in addition to methods).

To test an individual lint without the full clippy test harness (or to see println!s or other debugging statements more clearly), we can use the following clippy-driver incantation and specify a single UI test file, tests/ui/naming.rs:

$ CLIPPY_TESTS=true cargo run --bin clippy-driver -- -L ./target/debug tests/ui/naming.rs

This lint pass implementation results in the following (successful) lint warning:

warning: prefixing a getter with `get_` does not follow naming conventions
  --> tests/ui/naming.rs:15:5
   |
15 | /     pub fn get_id(&self) -> usize {
16 | |         self.id
17 | |     }
   | |_____^
   |
   = note: #[warn(clippy::getter_prefix)] on by default
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#getter_prefix

Step Seven: Generate the .stderr File for the New Lint

To test its behaviour, clippy uses UI tests to check that the output of the compiler is exactly as expected. The .stderr file clippy will use to check the newly implemented lint is automatically generated whenever the tests are run using cargo test, but rather than running all of the tests at this point we can run just the test for the individual lint by specifying TESTNAME=ui/naming where ui/naming is the UI test to run:

$ TESTNAME=ui/naming cargo test --test compile-test

To update the .stderr (and .stdout, if applicable) files in tests/ui/, we use the provided update script (the correct incantation for an individual lint can be found in the output of cargo test with a specified TESTNAME as we did above):

$ tests/ui/update-references.sh 'target/debug/test_build_base' 'naming.rs'

This will create the tests/ui/naming.stderr file for the lint.
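Conceptually, a UI test passes when the output captured from the compiler matches the stored reference file, and otherwise reports what changed. The following is a minimal sketch of that idea only; diff_output is a hypothetical helper, it ignores line order, and it is not how compiletest actually compares or normalizes output.

```rust
/// Compares expected and actual output line by line.
/// Lines only in `expected` are reported with a `-` prefix and lines only
/// in `actual` with a `+` prefix; an empty result means nothing differs.
/// Note: this is order-insensitive, unlike a real diff.
fn diff_output(expected: &str, actual: &str) -> Vec<String> {
    let expected_lines: Vec<&str> = expected.lines().collect();
    let actual_lines: Vec<&str> = actual.lines().collect();
    let mut diff = Vec::new();

    for line in &expected_lines {
        if !actual_lines.contains(line) {
            diff.push(format!("-{}", line));
        }
    }
    for line in &actual_lines {
        if !expected_lines.contains(line) {
            diff.push(format!("+{}", line));
        }
    }
    diff
}
```

When the reference file does not exist yet, the expected output is empty and every line of the actual output shows up as an addition, which is why a freshly written lint fails its UI test until the reference files are generated.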

Step Eight: Iterate

Now that the bare bones of the lint are implemented and it has stderr output as expected, the lint can be iterated on to add more functionality and identify possible false positives that need to be mitigated.

In the case of this lint, I added test cases that cover the exceptions to the naming rule and then introduced an additional check in the lint to see if the matched function name appeared in this exceptions list:

const ALLOWED_METHOD_NAMES: [&'static str; 5] = [
    "get",
    "get_mut",
    "get_unchecked",
    "get_unchecked_mut",
    "get_ref"
];

impl EarlyLintPass for GetterPrefix {
    fn check_impl_item(&mut self, cx: &EarlyContext<'_>, implitem: &ast::ImplItem) {
        if let ast::ImplItemKind::Method(..) = implitem.node {
            let name = implitem.ident.name.as_str().get();
            if name.starts_with("get_") && !ALLOWED_METHOD_NAMES.contains(&name) {
                span_lint(
                    cx,
                    GETTER_PREFIX,
                    implitem.span,
                    "prefixing a getter with `get_` does not follow naming conventions",
                );
            }
        }
    }
}
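The naming check itself does not depend on any compiler types, so it can be pulled out and exercised on plain strings. This is a sketch of the same condition used above; the function name violates_getter_convention is made up for illustration and does not appear in the lint.

```rust
/// The exceptions listed in the original issue.
const ALLOWED_METHOD_NAMES: [&str; 5] = [
    "get",
    "get_mut",
    "get_unchecked",
    "get_unchecked_mut",
    "get_ref",
];

/// Returns true if `name` has the `get_` prefix and is not an exact match
/// for one of the allowed names.
fn violates_getter_convention(name: &str) -> bool {
    name.starts_with("get_") && !ALLOWED_METHOD_NAMES.contains(&name)
}
```

Two details fall out of writing it this way. The bare name get never starts with get_, so it passes regardless of the list. And because the list is matched exactly, a name such as get_reference is still flagged even though it begins with an allowed name.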

Step Nine: Run the Full Test Suite

Up until now I have only been running tests for the lint I have been working on to speed up the feedback cycle. Before moving on, I want to run the complete test suite to make sure that the tests still pass with my additions and to find out whether any existing lints need to be changed to conform to the new lint.

In doing so I discovered a get_unit function defined in the unused_unit lint, but since this definition was localized to one lint file I was able to change the function name to pass the getter prefix lint and still maintain the lint’s original functionality. Changing this function name also meant that the tests/ui/unused_unit.stderr file was out of date, which was updated using the provided tests/ui/update-all-references.sh script.

Step Ten: Linting Clippy with Local Changes

The last step before submitting a pull request to rust-lang/rust-clippy is to make sure that all lints that have already been defined pass clippy (that is, there are no suggestions reported by clippy for its own codebase). Running clippy locally and addressing any issues found ahead of submitting a pull request will cut down on the feedback cycle and speed up the pull request review process. The recommended way to do this is by building clippy and then running it with all lint groups (including internal and pedantic) turned on:

$ cargo build
$ `pwd`/target/debug/cargo-clippy clippy --all-targets --all-features -- -D clippy::all -D clippy::internal -D clippy::pedantic

I was not able to run this command successfully, as it resulted in what appear to be dynamic linker errors on my machine:

$ clippy/target/debug/cargo-clippy --all-targets --all-features -- -D clippy::all -D clippy::internal -D clippy::pedantic
error: failed to run `rustc` to learn about target-specific information

Caused by:
  process didn't exit successfully: `clippy/target/debug/clippy-driver rustc - --crate-name ___ --print=file-names --crate-type bin --crate-type rlib --crate-type dylib --crate-type cdylib --crate-type staticlib --crate-type proc-macro` (signal: 6, SIGABRT: process abort signal)
--- stderr
dyld: Library not loaded: @rpath/librustc_driver-b630426988dbbdb0.dylib
  Referenced from: clippy/target/debug/clippy-driver
  Reason: image not found

However, even without running clippy locally, in the case of this getter prefix lint clippy would not pass local clippy. It turns out that there are a number of rustc methods that do not follow the API naming convention, and I will need to consult with other clippy developers to find out if we will need to add more exceptions to the lint than we had originally identified. The one method that is particularly problematic is LintPass::get_lints because it appears in every lint that has already been defined in clippy.

Submitting the Lint

Here’s the implemented lint in its entirety:

The UI test file tests/ui/naming.rs:

// Copyright 2018 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

pub struct MyStruct {
    id: usize
}

impl MyStruct {
    pub fn get_id(&self) -> usize {
        self.id
    }

    pub fn get(&self) -> usize {
        self.id
    }

    pub fn get_mut(&mut self) -> usize {
        self.id
    }

    pub fn get_unchecked(&self) -> usize {
        self.id
    }

    pub fn get_unchecked_mut(&mut self) -> usize {
        self.id
    }

    pub fn get_ref(&self) -> usize {
        self.id
    }
}

fn main() {
   let mut s = MyStruct { id: 42 };
   s.get_id();
   s.get();
   s.get_mut();
   s.get_unchecked();
   s.get_unchecked_mut();
   s.get_ref();
}

The lint implementation clippy_lints/src/naming.rs:

// Copyright 2018 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

use crate::rustc::lint::{EarlyContext, EarlyLintPass, LintArray, LintPass};
use crate::rustc::{declare_tool_lint, lint_array};
use crate::syntax::ast;
use crate::utils::span_lint;

/// **What it does:** Checks for the `get_` prefix on getters.
///
/// **Why is this bad?** The Rust API Guidelines section on naming
/// [specifies](https://rust-lang-nursery.github.io/api-guidelines/naming.html#getter-names-follow-rust-convention-c-getter)
/// that the `get_` prefix is not used for getters in Rust code unless
/// there is a single and obvious thing that could reasonably be gotten by
/// a getter.
///
/// The exceptions to this naming convention are as follows:
/// - `get` (such as in
///   [`std::cell::Cell::get`](https://doc.rust-lang.org/std/cell/struct.Cell.html#method.get))
/// - `get_mut`
/// - `get_unchecked`
/// - `get_unchecked_mut`
/// - `get_ref`
///
/// **Known problems:** None.
///
/// **Example:**
///
/// ```rust
/// // Bad
/// impl B {
///     fn get_id(&self) -> usize {
///         ..
///     }
/// }
///
/// // Good
/// impl G {
///     fn id(&self) -> usize {
///         ..
///     }
/// }
///
/// // Also allowed
/// impl A {
///     fn get(&self) -> usize {
///         ..
///     }
/// }
/// ```
declare_clippy_lint! {
    pub GETTER_PREFIX,
    style,
    "prefixing a getter with `get_`, which does not follow convention"
}

#[derive(Copy, Clone)]
pub struct GetterPrefix;

#[rustfmt::skip]
const ALLOWED_METHOD_NAMES: [&'static str; 5] = [
    "get",
    "get_mut",
    "get_unchecked",
    "get_unchecked_mut",
    "get_ref"
];

impl LintPass for GetterPrefix {
    fn get_lints(&self) -> LintArray {
        lint_array!(GETTER_PREFIX)
    }
}

impl EarlyLintPass for GetterPrefix {
    fn check_impl_item(&mut self, cx: &EarlyContext<'_>, implitem: &ast::ImplItem) {
        if let ast::ImplItemKind::Method(..) = implitem.node {
            let name = implitem.ident.name.as_str().get();
            if name.starts_with("get_") && !ALLOWED_METHOD_NAMES.contains(&name) {
                span_lint(
                    cx,
                    GETTER_PREFIX,
                    implitem.span,
                    "prefixing a getter with `get_` does not follow naming conventions",
                );
            }
        }
    }
}

Once the basic lint functionality was completed, I wanted to open a pull request against rust-lang/rust-clippy as soon as possible. This is my first time writing a lint for clippy, so I wanted to start getting feedback from other clippy developers and incorporate their suggestions to improve the lint code as well as my understanding of how clippy works. Once I got the lint working I also realized there were a number of decisions that I did not have enough Rust ecosystem or clippy-specific knowledge to make, so opening a pull request would be a great way to get those questions answered within the context of the code I had written for the lint.

My pull request for this getter prefix name lint can be found at rust-lang/rust-clippy#3616 which details the current state of the lint implementation and its progress getting merged into clippy.

There is still some work to be done before I feel confident this lint can be merged. First, I need to go through clippy more thoroughly and identify existing lints that need to be changed to pass the newly introduced lint; resolving the error I identified in Step Ten should get me well on the way to achieving this.

I would also like to implement some machine-applicable renaming suggestions that would remove the get_ prefix from method names so that functions that fail to meet this Rust API naming convention can be automatically fixed by rustfix. I’m looking into the mechanisms clippy provides to make these kinds of suggestions, and span_lint_and_sugg looks like a likely candidate for teaching rustfix about these renaming rules.
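The renaming rule itself is simple to state on plain strings. Here is a sketch of the string side only, with the hypothetical function name suggested_name; it does not use span_lint_and_sugg or any other clippy API, and how the result would be handed to rustfix is left open.

```rust
/// The exceptions listed in the original issue.
const ALLOWED_METHOD_NAMES: [&str; 5] = [
    "get",
    "get_mut",
    "get_unchecked",
    "get_unchecked_mut",
    "get_ref",
];

/// Returns the method name with the `get_` prefix removed, or `None` if the
/// name is allowed as-is, has no prefix, or would become empty.
fn suggested_name(name: &str) -> Option<&str> {
    if ALLOWED_METHOD_NAMES.contains(&name) {
        return None;
    }
    name.strip_prefix("get_").filter(|rest| !rest.is_empty())
}
```

For get_id this suggests id. A real suggestion would need more care than this: stripping the prefix from a name like get_type produces a keyword, and the new name could collide with an existing method or field accessor.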

Lastly I will of course need to implement the feedback on the pull request from other clippy developers so the lint will be accepted for inclusion in clippy.

Conclusion

I’m really glad I was able to write this lint for clippy. I had found the original issue, rust-lang/rust-clippy#1673, back in October 2018 and thought it was the perfect size for getting my feet wet writing a lint. Some parts of implementing this lint turned out to be more difficult than I had anticipated, most notably the sparseness or complete lack of rustc internals documentation. That was surprising given the Rust community’s focus on writing documentation and the generally high-quality documentation available across the ecosystem, but a little trial and error and some println! debugging pointed me in the right direction in the end.

A special thanks to Manish Goregaokar, Philipp Krones, Matthias Krüger, and hcpl for their input, feedback, and help at various points throughout this process, and llogiq for their blog post on writing clippy lints which was very helpful in finding my way to a working lint implementation.

There is a particular quote by Charles H. Spurgeon that I tend to associate with new years and new beginnings: “Begin as you mean to go on, and go on as you began”. The way that I began 2019 was writing Rust code, and I fully intend to go on writing Rust code for the rest of 2019 and beyond. I think that’s a worthwhile New Year’s resolution, don’t you?

06 Dec 2018

Yak Shaving in F♭

Following on from my introduction to clippy lints, this week I am beginning my journey of actually implementing a clippy lint.

As a refresher, I am implementing rust-lang/rust-clippy#1673:

To summarize, if the type has a get_foo method we should suggest naming it foo instead to follow the API Guidelines for Rust getter name conventions except for cases of:

  • get
  • get_mut
  • get_unchecked
  • get_unchecked_mut
  • get_ref

This should be enough to get me started on a style lint for this convention, I should have some time over the next couple days to start digging into this.

clippy’s Author Lint

In typical TDD fashion, I want to start with the test case, the code I want to lint against, so I can test my implementation and drive design. I’ve started off with the simple case of detecting the invalid style rather than worrying about whitelisting the exceptions identified.

pub struct MyStruct {
    id: u32
}

impl MyStruct {
    pub fn get_id(&self) -> u32 {
        self.id
    }
}

fn main() {
   let s = MyStruct { id: 42 };

   #[clippy::author]
   let id = s.get_id();
}

I’ve also added the #[clippy::author] annotation as suggested by clippy’s contributing documentation to generate a starting point for the lint.

Next, I have to run the test to produce a .stdout file with the code generated by the #[clippy::author] lint. The instructions say:

If the command was executed successfully, you can copy the code over to where you are implementing your lint.

Let’s try it out:

$ TESTNAME=ui/getter_prefix cargo test --test compile-test
    Finished dev [unoptimized + debuginfo] target(s) in 0.15s
     Running target/debug/deps/compile_test-f89d0316ceade355

running 1 test

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 29 filtered out


running 1 test
test [ui] ui/getter_prefix.rs ... FAILED

failures:

---- [ui] ui/getter_prefix.rs stdout ----
normalized stdout:
if_chain! {
    if let StmtKind::Decl(ref decl, _) = stmt.node
    if let DeclKind::Local(ref local) = decl.node;
    if let Some(ref init) = local.init
    if let ExprKind::MethodCall(ref method_name, ref generics, ref args) = init.node;
    // unimplemented: `ExprKind::MethodCall` is not further destructured at the moment
    if let PatKind::Binding(BindingAnnotation::Unannotated, _, name, None) = local.pat.node;
    if name.node.as_str() == "id";
    then {
        // report your lint here
    }
}


expected stdout:


diff of stdout:

+if_chain! {
+    if let StmtKind::Decl(ref decl, _) = stmt.node
+    if let DeclKind::Local(ref local) = decl.node;
+    if let Some(ref init) = local.init
+    if let ExprKind::MethodCall(ref method_name, ref generics, ref args) = init.node;
+    // unimplemented: `ExprKind::MethodCall` is not further destructured at the moment
+    if let PatKind::Binding(BindingAnnotation::Unannotated, _, name, None) = local.pat.node;
+    if name.node.as_str() == "id";
+    then {
+        // report your lint here
+    }
+}
+

The actual stdout differed from the expected stdout.
Actual stdout saved to /Users/scrust/devel/rust-clippy/target/debug/test_build_base/getter_prefix.stdout
To update references, run this command from build directory:
tests/ui/update-references.sh '/Users/scrust/devel/rust-clippy/target/debug/test_build_base' 'getter_prefix.rs'

error: 1 errors occurred comparing output.
status: exit code: 0
command: "target/debug/clippy-driver" "tests/ui/getter_prefix.rs" "-L" "/Users/scrust/devel/rust-clippy/target/debug/test_build_base" "--target=x86_64-apple-darwin" "-C" "prefer-dynamic" "-o" "/Users/scrust/devel/rust-clippy/target/debug/test_build_base/getter_prefix.stage-id" "-L" "target/debug" "-L" "target/debug/deps" "-Dwarnings" "-L" "/Users/scrust/devel/rust-clippy/target/debug/test_build_base/getter_prefix.stage-id.aux" "-A" "unused"
stdout:
------------------------------------------
if_chain! {
    if let StmtKind::Decl(ref decl, _) = stmt.node
    if let DeclKind::Local(ref local) = decl.node;
    if let Some(ref init) = local.init
    if let ExprKind::MethodCall(ref method_name, ref generics, ref args) = init.node;
    // unimplemented: `ExprKind::MethodCall` is not further destructured at the moment
    if let PatKind::Binding(BindingAnnotation::Unannotated, _, name, None) = local.pat.node;
    if name.node.as_str() == "id";
    then {
        // report your lint here
    }
}

------------------------------------------
stderr:
------------------------------------------

------------------------------------------

thread '[ui] ui/getter_prefix.rs' panicked at 'explicit panic', /Users/scrust/.cargo/registry/src/github.com-1ecc6299db9ec823/compiletest_rs-0.3.17/src/runtest.rs:2553:9
note: Run with `RUST_BACKTRACE=1` for a backtrace.


failures:
    [ui] ui/getter_prefix.rs

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 225 filtered out

test compile_test ... FAILED

failures:

---- compile_test stdout ----
thread 'compile_test' panicked at 'Some tests failed', /Users/scrust/.cargo/registry/src/github.com-1ecc6299db9ec823/compiletest_rs-0.3.17/src/lib.rs:89:22


failures:
    compile_test

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

error: test failed, to rerun pass '--test compile-test'

Wow, that’s a lot of output. More importantly, I don’t actually know if it worked. I see

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 29 filtered out

but then I see

The actual stdout differed from the expected stdout.
Actual stdout saved to /Users/scrust/devel/rust-clippy/target/debug/test_build_base/getter_prefix.stdout

and

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 225 filtered out

test compile_test ... FAILED

I also don’t see any generated .stdout file, so I’m going to assume there’s something wrong with my test case.

If I remove the author lint tag, I get a different result:

$ TESTNAME=ui/getter_prefix cargo test --test compile-test
    Finished dev [unoptimized + debuginfo] target(s) in 0.16s
     Running target/debug/deps/compile_test-f89d0316ceade355

running 1 test

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 29 filtered out


running 1 test
test [ui] ui/getter_prefix.rs ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 225 filtered out


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out

test compile_test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

This all seems to pass, but I still don’t see a generated .stdout file, and I’ve also lost any output the #[clippy::author] annotation would have given me.

I noticed a line from the earlier, possibly failed, test run:

To update references, run this command from build directory:
tests/ui/update-references.sh '/Users/scrust/devel/rust-clippy/target/debug/test_build_base' 'getter_prefix.rs'

What does this do? What’s a “reference”? clippy’s documentation doesn’t seem to mention tests/ui/update-references.sh at all, though there is a mention of tests/ui/update-all-references.sh which seems to update all of the existing .stderr files that drive the UI tests.

It turns out that running tests/ui/update-references.sh is necessary to actually write the .stdout file. I wasn’t expecting this extra step because, the way the instructions are phrased, I thought running the test would generate the .stdout file automatically. For the test I wrote, #[clippy::author] generated a .stdout file with the following code:

if_chain! {
    if let StmtKind::Decl(ref decl, _) = stmt.node
    if let DeclKind::Local(ref local) = decl.node;
    if let Some(ref init) = local.init
    if let ExprKind::MethodCall(ref method_name, ref generics, ref args) = init.node;
    // unimplemented: `ExprKind::MethodCall` is not further destructured at the moment
    if let PatKind::Binding(BindingAnnotation::Unannotated, _, name, None) = local.pat.node;
    if name.node.as_str() == "id";
    then {
        // report your lint here
    }
}

I’m not at all familiar with the if_chain! macro or any of these datatypes, so I definitely have some reading to do so I can understand what this snippet actually does. I do see if name.node.as_str() == "id"; which seems to match the id field on MyStruct which is about the only piece I understand without delving deeper.
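One thing that helps when reading this output: the if_chain! macro exists to flatten a pyramid of nested if and if let blocks into a flat list of conditions. Below is a minimal sketch of the nested form such a chain stands for. The Stmt, Local, and Expr types are hypothetical toy stand-ins that only mimic the shape of the compiler's data; they are not rustc's real types, and the function name is made up for illustration.

```rust
// Hypothetical stand-ins for the compiler's syntax tree; NOT rustc's real types.
enum Expr {
    MethodCall { method: String },
    Other,
}

struct Local {
    name: String,
    init: Option<Expr>,
}

enum Stmt {
    Local(Local),
    Other,
}

/// Returns the called method's name when `stmt` has the shape `let id = x.method();`.
fn method_bound_to_id(stmt: &Stmt) -> Option<&str> {
    // Each nested `if let` here corresponds to one condition line in an `if_chain!` block.
    if let Stmt::Local(local) = stmt {
        if let Some(init) = &local.init {
            if let Expr::MethodCall { method } = init {
                // The string comparison is against the name of the `let` binding.
                if local.name == "id" {
                    // This is where "report your lint here" would go.
                    return Some(method.as_str());
                }
            }
        }
    }
    None
}

fn main() {
    let matching = Stmt::Local(Local {
        name: "id".to_string(),
        init: Some(Expr::MethodCall { method: "get_id".to_string() }),
    });
    let not_a_call = Stmt::Local(Local { name: "id".to_string(), init: Some(Expr::Other) });

    println!("{:?}", method_bound_to_id(&matching));
    println!("{:?}", method_bound_to_id(&not_a_call));
    println!("{:?}", method_bound_to_id(&Stmt::Other));
}
```

Four levels of nesting for four conditions is exactly the rightward drift the macro is meant to avoid.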

Out of curiosity after reading a number of other lint tests, I decided to update main and added an assert statement:

fn main() {
    let s = MyStruct { id: 42 };

    #[clippy::author]
    let id = s.get_id();

    assert_eq!(id, 42);
}

and got yet another different result:

$ TESTNAME=ui/getter_prefix cargo test --test compile-test
    Finished dev [unoptimized + debuginfo] target(s) in 0.16s
     Running target/debug/deps/compile_test-f89d0316ceade355

running 1 test

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 29 filtered out


running 1 test
test [ui] ui/getter_prefix.rs ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 225 filtered out


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out


running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out

test compile_test ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

I’m not sure the #[clippy::author] annotation is going to help me too much in implementing this lint. Right now I don’t know enough about clippy or the datatypes it uses to make heads or tails of the generated code, and the results of my test are inconsistent depending on various combinations of assert_eq! and the author annotation. This definitely calls for more research, so until next week it looks like I’ll be getting to grips with some of rustc’s compiler internals.

25 Nov 2018

I See You Are Writing Some Rust

Would You Like Some Help With That?

In my last post I wrote a bit about code linting and code formatting, particularly in more modern programming languages like Rust and Go where such tools come first-class as part of the language’s toolchain. In addition to Rust’s rustfmt tool which formats Rust code according to style guidelines, Rust’s ecosystem also has a tool called clippy which is a much more opinionated tool “to catch common mistakes and improve your Rust code.”

Introducing clippy

 _____________________________________
/ I see you are writing some Rust.    \
\ Would you like some help with that? /
 -------------------------------------
 \
  \
    /  \
    |  |
    @  @
    |  |
    || |/
    || ||
    |\_/|
    \___/

clippy currently has 288 lints to help developers write better Rust code. The lints are broken down into various categories such as:

  • style: code that should be written in a more idiomatic way (e.g. if x.len() == 0 {...} could be re-written as if x.is_empty() {...})
  • correctness: code that is outright wrong or very useless (e.g. ensuring syntax when creating regexes)
  • complexity: code that is more complex than necessary (e.g. if x == true {...} could be re-written as if x {...})

as well as many others.
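To make the style and complexity examples concrete, here is a small sketch of the before and after forms. The function names are made up for illustration; the only point is that both versions behave the same and the second is the shape clippy would nudge you toward.

```rust
// Style: comparing `len()` to zero works, but `is_empty()` states the intent.
fn is_blank_verbose(items: &[u32]) -> bool {
    items.len() == 0
}

fn is_blank_idiomatic(items: &[u32]) -> bool {
    items.is_empty()
}

// Complexity: comparing a bool to `true` is redundant.
fn describe_verbose(ready: bool) -> &'static str {
    if ready == true { "ready" } else { "waiting" }
}

fn describe_idiomatic(ready: bool) -> &'static str {
    if ready { "ready" } else { "waiting" }
}

fn main() {
    let empty: Vec<u32> = Vec::new();
    let full = vec![1, 2, 3];

    // Both spellings agree on every input; only readability differs.
    assert_eq!(is_blank_verbose(&empty), is_blank_idiomatic(&empty));
    assert_eq!(is_blank_verbose(&full), is_blank_idiomatic(&full));
    assert_eq!(describe_verbose(true), describe_idiomatic(true));
    assert_eq!(describe_verbose(false), describe_idiomatic(false));
    println!("both forms agree");
}
```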

I’ve been interested in contributing to a Rust tooling project since I did my initial Rust project overview as part of Hacktoberfest 2018. Good tooling is a force multiplier in software development, and improving tooling - especially tooling that is “blessed” and supported by the core language team - can reach so many more people than small purpose-built tools set up for individual projects.

During Hacktoberfest, I ran across rust-lang/rust-clippy#1673 but because of the time constraints on Hacktoberfest contributions along with my other coursework I didn’t have time to claim the issue.

The full issue is as follows:

It is not idiomatic in Rust to have setters and getters. Make the field public instead. If the type only has a get_foo method but not a set_foo method, suggest naming it foo instead.

This seemed like a relatively simple lint to implement which would hopefully introduce me to a number of features clippy uses when analyzing Rust code and inspecting the representation that the rustc compiler sees before it generates build artifacts. Once Hacktoberfest was over and I cleared some work off my plate, I went back and asked if the lint was still up for grabs before I invested the time to attempt an implementation. Manish Goregaokar of the Rust dev tools team got back to me almost immediately:

Actually, I’m not sure if this really is valid rust style – setters and getters may be added to future proof an API, for example.

Manish raised the excellent point that getters and setters are in fact valid Rust style and I agreed, so I thought I was going to have to find another issue to work on and moved to close the issue.

I was worried that I would run into a similar situation with other issues I was interested in working on, so I reached out to Manish directly on the wg-clippy Discord channel and asked about another issue I was interested in working on:

@manishearth i was interested in picking up https://github.com/rust-lang/rust-clippy/issues/1673 but i agree with your comment that it may not be a desirable lint to have

i’m looking at https://github.com/rust-lang/rust-clippy/issues/2144 now, or if there’s another good first issue that’s up for grabs i’d definitely be interested in taking a look!

I got a response pretty quickly:

that seems fine!

However, activity on the original issue I was interested in had clearly caught some attention, and I got into a discussion with user hcpl about other use cases for the lint, specifically:

If the type only has a get_foo method but not a set_foo method, suggest naming it foo instead.

It turned out that there was already some precedent for this style in the Rust standard library, and the Rust API Guidelines have an entire section about Rust conventions for getter names. Except for the cases of:

  • get
  • get_mut
  • get_unchecked
  • get_unchecked_mut
  • get_ref

which have some special meanings in Rust related to data mutability, references, or unsafe code, the get_ prefix is not generally used in Rust. Searching for these exceptions also turned up an unstable feature relating to TypeId to support reflection that does include the get_ prefix even though it’s not supposed to, which goes to show that implementing this lint could be very valuable to help maintain style even in core Rust projects and the compiler.

After some good back-and-forth discussion with hcpl and Philipp Krones, I summarized the proposed refinements to the filed issue:

To summarize, if the type has a get_foo method we should suggest naming it foo instead to follow the API Guidelines for Rust getter name conventions except for cases of:

  • get
  • get_mut
  • get_unchecked
  • get_unchecked_mut
  • get_ref

This should be enough to get me started on a style lint for this convention, I should have some time over the next couple days to start digging into this.

With better clarity on what the lint should implement as well as some known exceptions, I was in a position to start getting the project set up and go through all the steps of onboarding onto a new project.
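As a rough sketch of the rule as summarized, the check boils down to something like the following. This is plain string handling on a method name, not clippy's implementation: suggested_getter_name and ALLOWED_NAMES are names I made up for illustration, and a real lint would work on the compiler's view of the code rather than on bare strings.

```rust
// Names that keep their `get` spelling, per the exceptions in the summary.
const ALLOWED_NAMES: [&str; 5] = [
    "get",
    "get_mut",
    "get_unchecked",
    "get_unchecked_mut",
    "get_ref",
];

/// Returns the name a `get_`-prefixed method would be renamed to, or `None`
/// when the method name is fine as it is.
fn suggested_getter_name(method: &str) -> Option<&str> {
    if ALLOWED_NAMES.contains(&method) {
        return None;
    }
    match method.strip_prefix("get_") {
        // `get_foo` becomes `foo`; a bare `get_` has nothing left to suggest.
        Some(rest) if !rest.is_empty() => Some(rest),
        _ => None,
    }
}

fn main() {
    for name in &["get_id", "get_mut", "id", "get"] {
        match suggested_getter_name(name) {
            Some(better) => println!("{}: consider renaming to `{}`", name, better),
            None => println!("{}: ok", name),
        }
    }
}
```

Keeping the exceptions in one list means the whitelist can grow without touching the matching logic.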

Working on clippy

To start implementing the lint, I had to go through all the usual steps of forking and cloning clippy and making sure I could build the project locally before I could start digging into code. After cloning the project, I went ahead and tried to build clippy locally:

$ cargo --version
cargo 1.32.0-nightly (1fa308820 2018-10-31)

$ cargo build

[snip]

error[E0050]: method `check_pat` has 4 parameters but the declaration in trait `rustc::lint::EarlyLintPass::check_pat` has 3
   --> clippy_lints/src/misc_early.rs:244:66
    |
244 |     fn check_pat(&mut self, cx: &EarlyContext<'_>, pat: &Pat, _: &mut bool) {
    |                                                                  ^^^^^^^^^ expected 3 parameters, found 4
    |
    = note: `check_pat` from trait: `fn(&mut Self, &rustc::lint::EarlyContext<'_>, &syntax::ast::Pat)`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0050`.
error: Could not compile `clippy_lints`.
warning: build failed, waiting for other jobs to finish...
error: build failed

Oops, that’s not good. I knew clippy relied heavily on features from the nightly release channel, and, as the name implies, the nightly channel is released every night with new changes and improvements. clippy must be making use of some new feature here and my nightly Rust is out of date. I updated Rust with rustup and then tried again to build clippy locally:

$ rustup update

[snip]

nightly-x86_64-apple-darwin updated - rustc 1.32.0-nightly (5aff30734 2018-11-19)
$ cargo --version
cargo 1.32.0-nightly (b3d0b2e54 2018-11-15)

$ cargo build

[snip]

Finished dev [unoptimized + debuginfo] target(s) in 2m 03s

Now that I knew I really had to watch the versions and make sure everything was up to date, I was ready to start thinking about implementing the lint.

A Few Days Later…

A challenge that I’ve been running into working on clippy is that because it relies so heavily on nightly compiler features, and both clippy and nightly are moving targets, my local clippy checkout can very quickly get out of date not only from upstream but also from the nightly release channel.

For example, I updated everything recently and got:

$ cargo build

   Compiling clippy_lints v0.0.212 (/Users/azure/devel/rust-clippy/clippy_lints)
error[E0615]: attempted to take value of method `abi` on type `rustc_target::abi::Align`
    --> clippy_lints/src/types.rs:1067:93
     |
1067 |                 if let Some(from_align) = cx.layout_of(from_ptr_ty.ty).ok().map(|a| a.align.abi);
     |                                                                                             ^^^
     |
     = help: maybe a `()` to call it is missing?

error[E0615]: attempted to take value of method `abi` on type `rustc_target::abi::Align`
    --> clippy_lints/src/types.rs:1068:89
     |
1068 |                 if let Some(to_align) = cx.layout_of(to_ptr_ty.ty).ok().map(|a| a.align.abi);
     |                                                                                         ^^^
     |
     = help: maybe a `()` to call it is missing?

error: aborting due to 2 previous errors

For more information about this error, try `rustc --explain E0615`.
error: Could not compile `clippy_lints`.

which is the change introduced in rust-lang/rust-clippy#3452.

For whatever reason, even after updating Rust and using the same tool versions as in this passing test for the above pull request the project does not build locally. Luckily reverting back to 61501b2810d887367d360025398dd9280c4bcd8b lets me compile the project so I can continue working without getting too out of date with upstream, but there’s a lot of churn and things often break for unexplained reasons.

I reached out to Matthias Krüger, the original author of #3453, on the wg-rust Discord channel to find out what was going on. It turns out that sometimes even the nightly release channel isn’t bleeding-edge enough to work on clippy lints, and a tool called rustup-toolchain-install-master is necessary to install compiler artifacts directly from Rust’s continuous integration pipeline that haven’t even been published to the nightly channel yet. This information is also documented in clippy’s CONTRIBUTING.md file, but it’s located almost at the bottom of the document, which is why I hadn’t run across it earlier. It is very true that it pays to read the documentation, and in many other projects asking why my local build was failing in this way would have been met with “RTFM”. However, my experiences in the Rust community have been nothing but positive: everyone I have interacted with has been very helpful, and even “big names” in the community are accessible and directly engaged in projects and contributor mentorship.

To Be Continued…

This week was all about finding my footing and getting my local environment set up to actually do development work on clippy. In the coming weeks I’ll tackle actually implementing the lint now that the requirements and goals have been fleshed out, and I hope to have something up for code review soon to get community feedback on improving the lint and catching anything I’ve missed.

First up: reading llogiq’s blogpost on writing clippy lints. Then I’ll create a starting point for my lint with clippy’s internal author lint as well as reading some existing lints to get a general idea of lint structure.

18 Nov 2018

Opinionated Formatting

This week in class we talked about code linting and code formatting using ESLint and Prettier. These kinds of tools automate a lot of the otherwise labour-intensive and easy-to-miss nitpicks reviewers often leave on pull requests, freeing up time to review much more important elements such as design and code structure. Many tech companies - Google, AirBnB, Microsoft to name a few - have their own code style guides, much in the same way that writing organizations (news organizations and scientific publishing, for example) have documents that outline how to maintain consistency in publications across many hundreds of outlets and thousands of writers, and the idea of a style guide is not new. However, the idea of automating this style checking to help developers maintain a consistent style has gained a lot more traction in recent years thanks in part to style and formatting tools coming standard and enabled by default in many modern programming languages.

Formatting as a First-Class Tool

I was first exposed to this idea of a universal formatter as part of a language’s toolchain when I started experimenting with Go. Go’s gofmt enforces a consistent style not just within one project, but across the entire ecosystem of Go code, making even foreign codebases more accessible because of their consistent style. This consistency helps to reduce visual noise, making it easier to focus on what the code says rather than how it’s formatted. This idea of a universal formatter appears in other languages like Rust’s own rustfmt, though there are many other tools for enforcing style that predate these tools, such as rubocop for Ruby and astyle for C and C++.

In Go and Rust, these formatters increase productivity dramatically because as long as you write syntactically correct code, it can be the ugliest code you have ever written and by passing it through the formatter (and many editors and IDEs will even format the source for you when you save!) you can leave it up to the formatter to make the code pretty and readable. This means less time fiddling with alignment, worrying about indentation, and dithering over where to break your function call chain to best communicate your intent. It also means that wars over format like tabs versus spaces are dead; the formatter is the absolute arbitrator of the correct style, and because the formatter is consistent across the entire ecosystem there is a lot of pressure for users to conform instead of trying to tweak the formatter to their own personal preferences.

We still have an open issue on supernova for implementing rustfmt as part of our continuous integration process that we hope to close out with a pull request relatively soon so we can be sure all the code contributed to supernova follows the same format as other projects.

Build Infrastructure Weirdness

Speaking of our continuous integration process, we ran into a very curious issue with 0xazure/supernova#24 this week where our builds started failing on the beta release channel. The build failure is caused by clippy’s new_ret_no_self lint which checks to ensure that, as a convention, new methods are used to make a new instance of a type and return that instance as the return value. In the issue, the build is only failing on the beta release channel which was surprising because I was expecting this lint to have failed the build on the stable release channel as well if it failed on beta. To further confuse the issue the build on nightly, which per our configuration for supernova is allowed to fail, was successful.

Digging into the problem some more, it looks like we are running into rust-lang-nursery/rust-clippy#3313, where the new_ret_no_self lint incorrectly triggers on a new function that does return Self, just wrapped in a container type or tuple.

Indeed, we can see this from our implementation of Config::new

pub fn new(mut args: env::Args) -> Result<Config, &'static str> {
    args.next();

    let username = match args.next() {
        None => return Err("No username provided"),
        Some(arg) => arg,
    };

    let token = args.next();

    Ok(Config { username, token })
}

which triggers the resulting lint failure

error: methods called `new` usually return `Self`
  --> src/lib.rs:18:5
   |
18 | /     pub fn new(mut args: env::Args) -> Result<Config, &'static str> {
19 | |         args.next();
20 | |
21 | |         let username = match args.next() {
...  |
28 | |         Ok(Config { username, token })
29 | |     }
   | |_____^
   |
   = note: `-D clippy::new-ret-no-self` implied by `-D warnings`
   = help: for further information visit https://rust-lang-nursery.github.io/rust-clippy/v0.0.212/index.html#new_ret_no_self

even though our return type is Result<Config, &'static str> which unwraps to Config on success and a static str when there is an error in creating a new instance.
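To see the shape the lint tripped over in isolation, here is a self-contained variant of the constructor. It is a sketch, not supernova's code: it assumes a generic iterator of Strings in place of env::Args so it can be exercised without a real command line, and it spells the return type as Result<Self, _> to make the wrapped Self explicit. The username "alice" is a placeholder.

```rust
#[derive(Debug, PartialEq)]
pub struct Config {
    username: String,
    token: Option<String>,
}

impl Config {
    // `new` returns `Self`, only wrapped in a `Result` so construction can fail.
    pub fn new(mut args: impl Iterator<Item = String>) -> Result<Self, &'static str> {
        // The first argument is the program name, which we skip.
        args.next();

        let username = match args.next() {
            None => return Err("No username provided"),
            Some(arg) => arg,
        };

        // The token is optional, so a missing value is not an error.
        let token = args.next();

        Ok(Config { username, token })
    }
}

fn main() {
    let args = vec!["supernova".to_string(), "alice".to_string()];
    match Config::new(args.into_iter()) {
        Ok(config) => println!("user: {}, token set: {}", config.username, config.token.is_some()),
        Err(message) => eprintln!("{}", message),
    }
}
```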

Investigating the Root Cause

An important part of build infrastructure is reproducibility: the ability to run a build with the same inputs and get the same outputs. Without reproducibility we have flaky tests that no one wants to run and worse, no one trusts. In the case of supernova we have a build matrix to test on all three release channels: stable, beta, and nightly, and we need to make sure testing on these channels happens in a predictable way.

It turns out the issue results from how clippy is installed in each environment. The recommended way to install clippy is as a rustup component using rustup component add clippy-preview. However, because clippy is published as a component for rustup rather than as some kind of version-pinned project dependency, this command does not install the same version of clippy across all release channels. This can be verified as follows:

$ cargo +stable clippy --version
clippy 0.0.212 (125907ad 2018-09-17)

$ cargo +beta clippy --version
clippy 0.0.212 (b1d03437 2018-10-19)

$ cargo +nightly clippy --version
clippy 0.0.212 (d8b42690 2018-11-04)

Note that while all of the build numbers are the same (v0.0.212), the commit hashes and dates are all different.

It is important to verify that the tool(s) you’re using to test or lint your project are the same version in all of your environments, otherwise you’ll end up with confusing build failures like the one we had here. In our case we are testing against beta and nightly to have an idea of future changes to the Rust compiler and any new lints that may get added in the future, so failures on anything but stable are nice-to-have information rather than complete show-stoppers. In other cases, or in different matrices, it’s even more important that the test environment is as consistent as possible and that the number of variables being changed is as small as possible to make tracing failures relatively simple.

Lint tools are great for catching low-hanging fruit in code review, but you can’t blindly trust them. When there is a failure, it takes a person’s knowledge of the project to determine whether it’s a problem with the submitted code, a problem with the tool configuration, or a false positive in the tool, as in this case with clippy’s new_ret_no_self lint.

Fixing the Problem

After reaching out to some friends in the #rust IRC channel, we decided to not run clippy on the beta toolchain to avoid more false positives like this in the future. We are keeping clippy enabled for the nightly release channel because nightly is allowed to fail on Travis: we will still investigate those failures, but they will not block landing any pull requests if for some reason nightly or clippy on nightly finds fault with our code.

I also recently filed 0xazure/supernova#32 to provide better visibility into the versions of tools we install to match how Travis prints out tooling versions for tools that come automatically installed with the Rust build environment. This should help us track down version discrepancies and make trouble-shooting failures much quicker.

After landing the above fix (and an extra tweak so we only run Travis against the master branch), our builds went fully green for the first time since we enabled Travis on the project! Setting up automated builds can take a lot of up-front effort, but it pays big dividends as the project grows to ensure the quality of the software being written. Now we just need some tests so we can verify our code is actually correct…

11 Nov 2018

Improving CLI Ergonomics

In my day-to-day, I use a lot of command-line tools. I generally find myself to be much more productive working in a non-GUI environment, mostly down to the fact that I can type much faster than I can move and aim a mouse and the less time I spend switching between keyboard and mouse the more time I can spend typing. Unfortunately, command-line tools have one major drawback: discoverability. If I am given a GUI application, I can click around, hover over different UI elements, and generally get a feel for how the interface is laid out. In a command-line tool, I don’t have any of these visual cues to help me learn the functionality; I have to start typing and see what happens. To supplement this lack of visual cues, good command-line tools generally have autocomplete snippets for your shell so you can type the command name and start hitting TAB to see what options are available at a particular time, as well as extensive man pages that describe all the various options, commands, and any sub-commands (though these man pages are often extraordinarily verbose and rarely provide useful examples of how the tool is commonly used, so alternatives such as tldr fill this gap).

I want supernova to be a good command-line tool, so I have been thinking about ways to improve discoverability and make the tool easier to use. A really interesting post by Jeff Dickey, an engineer at Heroku, crossed my timeline recently, titled 12 Factor CLI Apps. In his post, Jeff provides twelve principles to guide CLI design in a similar fashion to Heroku’s original Twelve-Factor App methodology. Out of the twelve principles, Principle 2 stood out to me, particularly in the context of supernova; Jeff suggests to “prefer flags to args” when designing a CLI. He says:

Sometimes args are just fine though when the argument is obvious such as $rm file_to_remove. A good rule of thumb is 1 type of argument is fine, 2 types are very suspect, and 3 are never good.

This rule of thumb got me thinking about supernova’s current calling convention. Currently, you would use supernova like this:

$ supernova <username> [<auth-token>]

This violates Jeff’s rule of thumb because <username> and <auth-token> are not the same type of argument, and it’s arguably even worse because the <auth-token> is optional here.
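To make the difference concrete, here is a rough sketch of what a flag-based calling convention could look like, hand-rolled with only the standard library. The Cli struct, the parse_cli function, and the -t/--token spelling are hypothetical names chosen for illustration, not supernova's code. The idea is that the username stays the single positional argument while the token moves behind a named flag, so the order on the command line no longer matters.

```rust
/// Parsed command line: one positional username plus an optional token flag.
#[derive(Debug, PartialEq)]
struct Cli {
    username: String,
    token: Option<String>,
}

fn parse_cli(args: &[&str]) -> Result<Cli, String> {
    let mut username: Option<String> = None;
    let mut token: Option<String> = None;
    let mut iter = args.iter();

    while let Some(&arg) = iter.next() {
        match arg {
            // A flag names its value, so it can appear before or after the username.
            "-t" | "--token" => {
                let value = iter.next().ok_or("--token requires a value")?;
                token = Some(value.to_string());
            }
            // The first bare argument is the username.
            _ if username.is_none() => username = Some(arg.to_string()),
            // Any further bare argument is ambiguous, so reject it.
            other => return Err(format!("unexpected argument: {}", other)),
        }
    }

    let username = username.ok_or("missing <username>")?;
    Ok(Cli { username, token })
}

fn main() {
    for args in &[vec!["alice", "--token", "s3cr3t"], vec!["--token", "s3cr3t", "alice"]] {
        match parse_cli(args) {
            Ok(cli) => println!("user: {}, token set: {}", cli.username, cli.token.is_some()),
            Err(message) => eprintln!("{}", message),
        }
    }
}
```

This sketch does no validation of unknown flags, which is one of the many details a dedicated argument-parsing crate handles for you.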

I opened 0xazure/supernova#16 to track this problem with our current calling convention. In it, I suggest using the clap crate to improve our CLI and provide the facilities to do command-line argument parsing.

Introducing Clap

Clap is a great crate that makes it super easy to add all kinds of CLI goodies to your command-line tool, including auto-generated help, version, and usage information, all of which are important to good CLIs, as Jeff highlights in Principle 1: Great help is essential.

Adding clap to the project would check off Principles 1 & 2, so I went ahead and created a pull request to add clap and improve supernova’s calling convention. All that was necessary was to create a new clap-based App with the desired arguments and flags, provide good help messages, and set <auth-token> to be an optional argument. Having done all that, this is what clap generates, all for 13 lines of code:

$ supernova --help
supernova 0.1.0

USAGE:
    supernova [OPTIONS] <USERNAME>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -t, --token <TOKEN>    Sets the authentication token for requests to GitHub

ARGS:
    <USERNAME>    The user whose stars to collect

Clap will print the name as well as the version with every --help request. I was a little wary of this at first, because while I definitely want to include this information in my CLI, I also don’t want the maintenance burden of having to remember to update the version string in main.rs every time a new version is published. However, clap exposes some very handy macros that make use of environment variables exported by cargo at build time to pull this information out of Cargo.toml, so these values will always be up to date with the crate’s metadata and don’t require setting the values in code.
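The mechanism underneath is worth a quick sketch. Cargo sets environment variables such as CARGO_PKG_NAME and CARGO_PKG_VERSION while compiling a crate, and the standard library's env! and option_env! macros read them at compile time. The helper below is hypothetical and deliberately uses option_env! with fallback values (my own assumption) so that it still compiles when built without cargo; clap's own macros do the equivalent lookup for you.

```rust
/// Name and version as recorded in Cargo.toml, or fallbacks when built without cargo.
fn name_and_version() -> String {
    // `option_env!` is evaluated at compile time and yields `Option<&'static str>`.
    let name = option_env!("CARGO_PKG_NAME").unwrap_or("unknown");
    let version = option_env!("CARGO_PKG_VERSION").unwrap_or("0.0.0");
    format!("{} {}", name, version)
}

fn main() {
    // Under cargo this prints the crate's own name and version from Cargo.toml.
    println!("{}", name_and_version());
}
```

Because the lookup happens at build time, bumping the version in Cargo.toml is the only change needed when publishing a new release.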

To actually extract all of the parsed arguments from clap, we call App::get_matches() which produces an ArgMatches struct we can query for specific arguments by name. Instead of parsing out each argument, I decided to try my hand at implementing the From trait to convert ArgMatches into supernova’s Config type so it can be passed directly to the next function call.

Traits in Rust

Before I talk about implementing the From trait, I want to quickly talk about what traits actually are, as well as why I chose to implement From instead of one of the other conversion traits.

The Rust Book explains traits like this:

A trait tells the Rust compiler about functionality a particular type has and can share with other types. We can use traits to define shared behavior in an abstract way.

If you are familiar with the idea of interfaces in other languages such as Java or Go, traits are a very similar idea.

There is an important caveat with traits. Again, from the Rust Book, Chapter 10.2, Implementing a Trait on a Type:

One restriction to note with trait implementations is that we can implement a trait on a type only if either the trait or the type is local to our crate.

This means that I will need to implement my conversion on a type that I control, Config, to convert from ArgMatches into Config.

The restriction on traits is very interesting, because while it is not a problem in my case it raises the question of how to convert from types that I do control into types defined in the standard library or in other crates. If I wanted to convert a Config object into a String for example, it might make sense to use the conveniently named Into trait. However, the documentation for Into states:

Library authors should not directly implement this trait, but should prefer implementing the From trait, which offers greater flexibility and provides an equivalent Into implementation for free, thanks to a blanket implementation in the standard library.

So, instead of implementing Into<String> for Config, I should implement From<Config> for String, and I get the equivalent Into implementation for free? It seems a bit backwards since the implementation is written for the destination type rather than for the type I own, but it is still allowed because Config is local to my crate. Thanks to the blanket implementation in the standard library, the same conversion can then be called either as String::from(config) or as config.into(). Note that this does not provide a conversion in the opposite direction; going from a String back to a Config would need its own From implementation. Still, it means we only have to write the conversion function once instead of once for each trait, and I’m definitely in favour of anything that reduces the amount of code I need to write.
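To make the relationship between From and Into concrete, here is a minimal sketch of a Config-to-String conversion. The fields on Config and the output format are assumptions for illustration only, not supernova’s actual definition.

```rust
/// Hypothetical stand-in for supernova's Config; the real fields may differ.
struct Config {
    username: String,
    token: Option<String>,
}

// Allowed by the trait restriction quoted above: `String` and `From` are
// foreign, but `Config` is local to this crate.
impl From<Config> for String {
    fn from(config: Config) -> String {
        if config.token.is_some() {
            format!("{} (authenticated)", config.username)
        } else {
            config.username
        }
    }
}

/// Uses the `Into<String>` implementation that the standard library's
/// blanket impl derives from the `From<Config> for String` impl above.
fn describe(config: Config) -> String {
    config.into()
}
```

Both String::from(config) and config.into() end up running the same function; only one direction of conversion is defined.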

The From Trait

Actually implementing the From trait was straightforward, other than the necessary addition of a lifetime annotation because of how ArgMatches is implemented on clap’s side. Implementing From also let me replace another method in Config which reduces the surface area of Config’s implementation.
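The shape of that implementation looks roughly like the sketch below. To keep the example free of external crates, ParsedArgs is a hypothetical stand-in for clap’s ArgMatches: like ArgMatches it carries a lifetime parameter and offers a value_of lookup by name, but everything else about it, along with Config’s fields and the argument names, is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for clap's `ArgMatches<'a>`; it borrows the
/// argument strings, which is why a lifetime parameter is needed.
struct ParsedArgs<'a> {
    values: HashMap<&'a str, &'a str>,
}

impl<'a> ParsedArgs<'a> {
    /// Looks up an argument by name, returning `None` if it was not supplied.
    fn value_of(&self, name: &str) -> Option<&'a str> {
        self.values.get(name).copied()
    }
}

/// Assumed shape of the Config type; the real fields may differ.
#[derive(Debug, PartialEq)]
struct Config {
    username: String,
    token: Option<String>,
}

// The lifetime is declared on the impl and threaded through the source type.
impl<'a> From<ParsedArgs<'a>> for Config {
    fn from(args: ParsedArgs<'a>) -> Self {
        Config {
            // USERNAME is a required argument in this sketch.
            username: args
                .value_of("USERNAME")
                .expect("USERNAME is required")
                .to_owned(),
            // TOKEN is optional, so it maps cleanly onto Option<String>.
            token: args.value_of("TOKEN").map(String::from),
        }
    }
}
```

With an implementation like this in place, the caller can write Config::from(args) and pass the result straight to the next function.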

I’m not totally happy with implementing From on Config because I had to pull clap into the library side of supernova instead of leaving argument parsing completely in main, so I may go back and change this implementation to provide better separation between data and argument parsing. Or I may decide to move Config out to its own module which would also increase separation of concerns. Either way, I was very happy to get my hands back on the keyboard writing Rust code this week, and I hope to be writing more in the near future!

04 Nov 2018

Open Source Level Up: Becoming a Maintainer

After a month of contributions following my return to open source, the next step in my journey is to really immerse myself in one or two projects and start making larger contributions. The open source course I’m taking is encouraging us to complete contributions to three “external” open source projects over the next 5 weeks, as well as make three contributions to “internal” projects during the same period. In this case, “external” refers to established projects in the open source community and “internal” refers to projects we as a class are starting ourselves to get a feel for being core maintainers of a project so we can gain experience on both sides of the process. As a class we had a brainstorming session to come up with ideas for internal projects we could start, and I donated one of my existing side-projects to the list, which has led to…

Level Up!

With absolutely zero fanfare, hoops to jump through, or documents to sign, I suddenly became the initial maintainer of one of our course’s internal projects. I was able to generate enough interest in my side-project that members of my class wanted to contribute, and the project was added to the official list of internal projects. The amazing thing about open source development is you don’t need anyone’s permission or to be told to create something; if you have an idea and enough people are interested in using and/or growing that idea, you have an open source project. The project itself grows in a very organic way as people find out about the project and drop in to see what’s going on. Some people just drop in and perhaps contribute a little bit or share their use-case for the project, while others get started and then decide to stick around for a long time; that is the nature of open source.

supernova

The project I am maintaining is 0xazure/supernova, and I would really encourage you to come check us out! We’re still in the early stages, but we would encourage any and all to take a look at what we’re building, make comments and submit issues or pull requests, and give us feedback. Even though it has been deemed an “internal” project for the course, we still welcome contributions from anyone who wants to get involved so stop by and file an issue if you have any questions.

supernova started out as a learning project for me to improve my understanding of the Rust programming language, a {,de}serializing library written in Rust called Serde, and the GitHub API. I am a project-oriented learner, so whenever I have some tools and techniques I’d like to learn or improve, I start a project that tries to use them so I can have a better understanding of their strengths and weaknesses. The initial implementation as a CLI tool was designed to pull stars data from the GitHub API and display it in a formatted list. I have a tendency to use stars like I would bookmarks so I have accumulated a lot of stars since I joined GitHub, but I have no great way to view all of my stars in one place (GitHub insists on paginating the list) or export them in various formats. supernova was only added to the internal projects list a week ago, but we already have a number of open issues to start growing supernova’s feature set.

First Week

In the first week we’ve been focusing on a lot of the low-hanging fruit for a new open source project: setting up the project infrastructure and documenting what we already have.

There are currently a few open issues for writing docs:

as well as the ever-present need to document existing code.

Within the first week we’ve also set up and enabled TravisCI for testing and linting our codebase thanks to Sean Prashad and made supernova more accessible to new contributors by converting existing documentation to Markdown thanks to Mordax. I really enjoyed landing these contributions; it’s really exciting to see something I started as a side-project to get better at writing Rust grow and take on a life of its own.

Next Week

In the next week or so we hope to have a lot of the initial setup completed so that we can move on to adding functionality. My personal goal for next week is to submit a pull request for improving the usability of supernova by following Guideline #2 of 12 Factor CLI Apps to “prefer flags to args”, which makes input to the CLI more explicit and clear, among other usability benefits.

I also need to start thinking about what external project(s) I want to contribute to over the next few weeks; I have some ideas from my contributions during Hacktoberfest, but I’ll need to make a decision soon so I can get set up to contribute.

31 Oct 2018

Hacktoberfest - Week Five

With this fifth week, Hacktoberfest 2018 comes to a close. I’ll be doing a retrospective on the whole month of contributions soon, but that is not this post. This week, I was able to make one of the contributions I had started back in week three, a documentation contribution to the Rust and WebAssembly Book.

As I mentioned in that post, I had filed rustwasm/book#127 to suggest some improvements to the explanation of src/utils.rs, a module that the rust-wasm project template provides to make working with Rust compiled to WebAssembly easier. It includes some debugging helpers in the form of a panic hook that prints well-formatted and helpful wasm error messages to console.error. Nick Fitzgerald, a member of the Rust dev tools team, responded to my issue and said that he would be happy to merge a pull request that did the rephrasing I had suggested. Since I had already done the work up-front when I filed my issue, I was able to get a pull request with the change up very quickly and it got merged the same day.

My change is live on the Rust and WebAssembly site right now, and it feels really great to see something that I wrote included in a project in which I’m excited to get even more involved. Big thanks to Nick for getting back to me about my proposed change and for the quick merge on rustwasm/book#129 to get it incorporated.

With that pull request opened (and merged), I was able to achieve my five pull requests for Hacktoberfest and secure my limited edition T-shirt. The T-shirt is nice, of course, but the real benefit of Hacktoberfest is to all the open source projects that took pull requests this past month, as well as for all the participants who were able to level up their programming skills while contributing to the open source community. More on that in the next post where I’ll summarize my experience participating in Hacktoberfest 2018 and talk about what I learned, what went well, what I would do differently next time, and a general reflection on Hacktoberfest itself.

22 Oct 2018

Hacktoberfest - Week Three

As of last week we are just about halfway through Hacktoberfest. Time is just flying by, and I can’t believe how quickly it’s passing. In my previous Hacktoberfest post I wrote about my slightly self-interested contribution to dplesca/purehugo, the theme I use for this blog, and how I used some of Hugo’s built-in functions to normalize URLs that used the .Site.BaseURL variable.

My contribution this week for Hacktoberfest is minor, but it started me on a path to encounter a number of issues with a common theme that should leave me with plenty of work to do over the next few weeks and possibly past the end of the month.

Theme Week: Documentation

There has been a trend over the last number of years for open source contributors to diversify how they contribute. Code is not the only kind of contribution maintainers hope to receive, and they are happy to receive contributions in the form of documentation, code examples, sample projects, and feedback from first-time users, among others.

For Hacktoberfest week three, I focused my efforts on project documentation, or a lack thereof, specifically from the perspective of a new user. So far during Hacktoberfest I haven’t made more than one contribution to the same project; instead I’ve jumped around quite a bit to different projects that use many different languages and project structures. This project hopping has given me some perspective on the onboarding and getting started instructions of a number of projects, which are critical for encouraging adoption and helping new users get up to speed on how to use or contribute to a project, application, or service. Documentation can also be tricky to get right the first, second, or even fifth time because it is hard to write a document that is useful for every audience, especially if the topic is technical in nature and the author(s) can’t make many assumptions about the audience’s technical background or ability. Many projects deal with this by dividing the documentation in two: one set of high-level documentation on getting the project up and running aimed at users, and another set of documentation that augments and extends this high-level documentation aimed at contributors and developers that provides much more technical detail.

This past week also reiterated the importance of contributing to projects in ways other than implementing new features, writing test cases, or fixing bugs in code. There are so many ways to contribute to open source projects, and one of the best ways new contributors can help a project is to verify the project’s onboarding flow by following basic steps like the project setup information and getting their development environment up and running. These are tasks that established contributors and maintainers do not have to do very often, so documentation can get out of date. New contributors are great at spotting problem areas in guides, tutorials, and example code that established contributors may no longer be able to see, which improves the onboarding experience and identifies areas where further explanation may be necessary.

Project One: rust-wasm

I have been excited about WebAssembly since I first heard about the work being done to support it in multiple browser engines. WebAssembly is a binary instruction format predominantly designed for the web (though its success on the web will surely spur adoption in other areas just as NodeJS forced us to rethink possible applications for JavaScript) that can be targeted by many other languages like C++, Rust, Go, and Java, among others. rust-wasm is the Rust initiative to provide compilation for the wasm target as well as support interoperation with existing JavaScript code both shipped in browsers and provided as libraries through tools like npm.

I wanted to get up and running with WebAssembly as a compilation target for Rust, so I decided to read some of the core documentation provided by the rust-wasm team: the Rust and WebAssembly Book. After taking a brief detour to set up a personal Homebrew tap for the wasm-pack tool (which I hope, once the prerequisite Rust version lands on the stable channel, I will be able to contribute back to homebrew/homebrew-core and rust-wasm), I was able to set up all of the necessary project dependencies to start working through the tutorial.

Hello, World!

The Rust and WebAssembly Book bases the tutorial off a project template to get new users up and running more quickly. It spends a portion of the introductory section talking about each file in the template and its purpose in the larger project. When I got to the section describing wasm-game-of-life/src/utils.rs, I was given this explanation:

The src/utils.rs module provides a couple included batteries that we will use later in the tutorial. We can ignore it for now.

— Rust and WebAssembly, 5.2 - Hello, World!

After the reasonably detailed explanations of other files in the project, this section stood out to me as one that could be improved. This section also makes reference to a philosophy of “batteries included”, which I had only ever encountered before in Python jargon. The idiom may not be familiar to readers who have not had similar exposure, or for whom English is not their primary language and the idiom is lost in translation.

Being a fresh set of eyes on the Rust and WebAssembly Book, I filed rustwasm/book#127 to discuss some suggestions for how this section could be improved without the need for jargon-heavy idioms and to provide more details about the particular module in question. At the time of publishing I have not heard back from anyone on the rust-wasm team, but I am hopeful that I will be able to submit a pull request before the end of the month to improve this part of the tutorial.

Typos and Transcription Errors

Reading through the rest of the page, I ran across the simple error of a duplicated word in the explanation of wasm-game-of-life/www/index.js. I submitted my correction as rustwasm/book#128 and it was accepted the same day. I even received a congratulatory message from one of the maintainers on my first contribution to the project which, as my first interaction with any of the project maintainers, is a nice personal touch that definitely makes me want to continue to contribute to rust-wasm.

Project Two: Diesel

I have been playing around with reimplementing a currently active PostgreSQL-backed project using Rust, and discovered Diesel as one of the ORM tools at the forefront of Rust development stacks. As with rust-wasm, I have not used Diesel in any of my projects, so I turned to the getting started guide Diesel provides. It wasn’t long after I got the diesel-cli tool installed that I ran into a discrepancy between the guide and the behaviour of the CLI tool, so I filed diesel-rs/diesel#1891 to detail my expectations along with the actual output the tool was producing, as well as some suggestions for improving the clarity of documentation in a few disparate but related areas of the codebase and guide.

I am of course very grateful for the guide because it is much more approachable - with many code samples and examples - than reading highly technical API documentation or man pages. However, it can be intimidating for a new user if they are following along with the steps in the guide and something unexpected happens; often the new user doesn’t know if they did something in the wrong order or missed an important step, or if the guide is simply out of date and they are not sure how to get around the outdated part to continue their progress. It is also common for code and documentation to get out of sync, resulting in confused and annoyed users. Unfortunately there isn’t a great technical solution to this issue and authors need to always take care to not only update code but also documentation, though Rust’s documentation tests are one example of a tool that can help by ensuring that example code included in comments is up to date and working.
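For readers who have not seen documentation tests before, here is a small sketch of what one can look like. The normalize_username function and the supernova crate name in the use line are hypothetical and exist only for this example. The example inside the doc comment is written as an indented code block, which rustdoc accepts as an alternative to the more common triple-backtick fence.

```rust
/// Normalizes a username by trimming whitespace and lowercasing it.
///
/// # Examples
///
///     // `supernova` is an assumed crate name for this sketch.
///     use supernova::normalize_username;
///
///     assert_eq!(normalize_username("  0xAzure "), "0xazure");
pub fn normalize_username(input: &str) -> String {
    input.trim().to_lowercase()
}
```

In a library crate, running cargo test compiles and runs the example in the doc comment alongside the regular tests, so if the function’s behaviour changes and the example is not updated, the test run fails instead of the documentation silently going stale.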

Project: Next

As I am still a new user of rust-wasm and Diesel, I am sure there are still many things I can contribute to these projects, both in terms of improving the onboarding experience for new users as well as other non-code contributions. I also have my eye on a few areas of the guide for Rocket, a web framework written in Rust that was the reason I started using Diesel. I’m hoping to hear back soon about the two issues I filed to improve documentation, and I’d like to take the time this month to shepherd them through to getting pull requests opened and merged. As for this week, I’ll be picking out my next few contributions for Hacktoberfest and might also start thinking about picking a project to stick with for a while now that I’ve worked on a number of projects so far this term.

04 Oct 2018

Hacktoberfest - Week One

Hacktoberfest, start! In my previous post I mentioned that I was hoping to find some interesting projects to contribute to this month, and so far I’ve come up with a pretty good shortlist of contributions I want to make.

To start the month off, I lent a hand to lk-geimfari/awesomo to reorganize their list of Rust projects. They had already reorganized the one for Python and suggested contributors use that as a model for the rest of the reorganizations. There are still a bunch of languages on the list in the tracking issue, so head on over and give them a hand!

The work itself was straightforward; contributors were asked to alphabetize the projects in the list, and provide a table of contents and headings based on the example. I think this was a great first issue for Hacktoberfest because my goal for this month is to contribute to at least two projects that use Rust, and what better way to find out about cool projects than working with a list chock full of them?

Alacritty

Alacritty was one of the first projects to catch my eye, not just because it was originally at the top of the list, but also because I saw an announcement post that Alacritty now supports terminal scrollback which introduced me to the existence of the project in the first place. Alacritty claims to be “the fastest terminal emulator in existence”, and uses the GPU for rendering to enable optimizations that aren’t possible using other terminal emulators. Alacritty is still very much in its infancy, with version 0.2.1 as the most recent release at the time of writing. Windows support is planned before a 1.0 release, so if this intrigues you and you’re a Windows developer have a look at the tracking issue and see if you can help out.

Diesel

Another project I was aware of before working with this awesome list of Rust projects is Diesel, a safe, extensible object-relational mapper (ORM) for Rust. When deciding on the technology stack for my capstone project this year, I investigated using the Rocket web framework written in Rust, and Diesel is one of the ORMs recommended in the getting started guide. Diesel provides a comfortable CLI experience for developers familiar with tools like ActiveRecord from Ruby on Rails and Sequelize in NodeJS, but drastically improves on both by leveraging Rust’s type safety and by “eliminat[ing] the possibility of incorrect database interactions at compile time.”

exa

exa is a modern replacement for the built-in Unix ls command which aims to have better defaults and more features. I have been using exa myself for a few weeks and I am really enjoying the experience so far. With ls I set up a number of shell aliases and functions to do things like enable human-readable sizes and always show colours; in exa these features are turned on automatically without any setup. I am all for customizability in those cases where it’s absolutely necessary, but sane defaults are almost always a better solution, especially with a tool that most of us use hundreds of times a day.

To be Continued

I definitely have some ideas about more projects that I’d like to contribute to after seeing the lk-geimfari/awesomo list, either this month for Hacktoberfest or on an ongoing basis. If Rust isn’t your language of choice, there are similar lists of projects in pretty much every popular language as well as some more obscure ones, so take a look, be inspired, and happy hacking!