Subject: unwise, for when unsafe isn't quite right


Date: Message-Id: https://www.5snb.club/posts/2021/unwise/

The Rust language has a clearly defined notion of unsafe code. In short, any function that can cause undefined behaviour when misused must be marked as unsafe, and any function that can never cause undefined behaviour, however it is used, should be safe. Therefore, a codebase consisting entirely of safe code can never cause undefined behaviour.

But what if the kind of unsafety you’re dealing with is not undefined behaviour, but instead something else, like leaking crypto keys, or SQL injection?

A common complaint about Rust is that it does nothing to prevent this. That is true, but perhaps we could re-use the notion of unsafe for these things?

A point on notation: I’ve tried to use unwise consistently as a general counterpart to unsafe.

Unsafe Functions

There are two ways a function can involve unsafety in Rust.

Being unsafe and calling unsafe code

An unsafe function is allowed to call other unsafe functions without an unsafe block. There are plans to change this; see Rust RFC 2585.

/// Gets a character
///
/// # Safety
/// * `idx` must be less than 4
unsafe fn get_char(idx: usize) -> char {
    let array = ['R', 'u', 's', 't'];
    array.as_ptr().add(idx).read()
}

This function is unsafe because it indexes into an array without a bounds check and reads a character from that array. Because it’s an unsafe function, it’s allowed to call add as well as read on pointers without an unsafe block of its own, as the whole function body is treated as an unsafe block.
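For comparison, here is a sketch of the same function with the `unsafe_op_in_unsafe_fn` lint from RFC 2585 denied. The name `get_char_explicit` is made up for this example. With the lint on, the body needs its own unsafe block even though the function is already unsafe.

```rust
/// Gets a character, with the RFC 2585 lint turned on for this function.
///
/// # Safety
/// * `idx` must be less than 4
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn get_char_explicit(idx: usize) -> char {
    let array = ['R', 'u', 's', 't'];
    // SAFETY: the caller promises `idx < 4`, so the pointer stays inside
    // `array` and points at an initialised `char`.
    unsafe { array.as_ptr().add(idx).read() }
}
```

The signature still tells callers the function is unsafe to call; the inner block only marks which operations inside it rely on the caller's promise.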

Calling unsafe, but being safe

Rust wouldn’t be a very useful language if everything you wanted to do was unsafe. So you can have safe functions that make use of unsafe functions. The safe function is meant to perform all checks to ensure that the unsafe code is only called when it is valid to do so.

For example, let’s write a safe wrapper around our unsafe get_char from before.

/// Gets a character
///
/// # Panics
/// Panics if `index` is not less than 4
fn get_char_safe(index: usize) -> char {
    assert!(index < 4, "Index out of bounds");

    unsafe { get_char(index) }
}
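A variation on the wrapper, sketched under the assumption that callers would rather handle a bad index than panic. The name `get_char_checked` is invented here; `get_char` is repeated so that the snippet compiles on its own.

```rust
/// Gets a character
///
/// # Safety
/// * `idx` must be less than 4
unsafe fn get_char(idx: usize) -> char {
    let array = ['R', 'u', 's', 't'];
    // SAFETY: the caller guarantees `idx < 4`.
    unsafe { array.as_ptr().add(idx).read() }
}

/// Hypothetical non-panicking wrapper: returns `None` for a bad index.
fn get_char_checked(index: usize) -> Option<char> {
    if index < 4 {
        // SAFETY: `index < 4` was checked on the line above.
        Some(unsafe { get_char(index) })
    } else {
        None
    }
}
```

Either way the safe function owns the check, so no caller can reach the unsafe read with an out-of-range index.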

Unwise Functions

Now let’s replicate these two notions of unsafe, but without using the actual unsafe blocks.

To do this, we’ll introduce a zero-sized marker type for a particular notion of unsafety. If a function takes this marker type as a parameter, it is considered unwise, and you can create a marker out of thin air if you explain why your usage of it is safe. If you have it as a parameter yourself, you’re free to pass it to other functions without needing to make it again.

/// This allows for uncontrolled SQL statements 
/// to be executed that don't exist in the source code.
struct SQLInjection(());

impl SQLInjection {
    fn trustme() -> Self {
        SQLInjection(())
    }
}

This creates a new form of unsafety, one that’s not related to memory safety but instead has a new property.
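Two properties of the marker can be checked directly. The sketch below assumes the type lives in its own module (called `markers` here, a made-up name) so that the private `()` field actually stops other code from constructing it by hand, and it confirms that the marker takes up no space.

```rust
mod markers {
    /// Marker for code that may execute SQL not present in the source.
    /// The private `()` field stops other modules from writing
    /// `SQLInjection(())` directly.
    pub struct SQLInjection(());

    impl SQLInjection {
        /// The only safe way to obtain a marker from outside this module.
        pub fn trustme() -> Self {
            SQLInjection(())
        }
    }
}

/// Size of the marker in bytes (helper invented for this example).
fn marker_size() -> usize {
    std::mem::size_of::<markers::SQLInjection>()
}
```

Because the marker is zero-sized, passing it around should cost nothing at runtime; it exists only for the type checker and for people reading the code.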

And now we can introduce SQL examples of the two functions listed above.

Being unwise without calling any unwise functions

This case has no direct counterpart among isolated unsafe functions, but here there is a boundary below which code is no longer marked as unwise.

/// Executes SQL from a string
/// 
/// # Unwise (SQL Injection)
/// This allows for arbitrary SQL to be executed,
/// and does no filtering on the input sql.
fn execute_sql(sql: &str, parameters: &[&dyn AsSql], _unwise: SQLInjection) {
    low_level_db::sql::execute(sql, parameters);
}

Calling unwise, but being wise

fn update_username_safe(user_id: u32, user_name: &str) {
    execute_sql(
        "UPDATE Users SET UserName=? WHERE Id = ?",
        &[&user_name, &user_id],
        // This is safe because the SQL being executed is known at compile time,
        // and user input is entered via parameters.
        SQLInjection::trustme(),
    );
}

Calling unwise and being unwise itself

Here we can see that you do need to explicitly pass the unwise marker on to any unwise calls, but it’s less noisy than creating it.

/// Executes SQL from a string and logs the SQL
/// 
/// # Unwise (SQL Injection)
/// This allows for arbitrary SQL to be executed,
/// and does no filtering on the input sql.
fn execute_sql_logged(sql: &str, parameters: &[&dyn AsSql], unwise: SQLInjection) {
    // `&dyn AsSql` is not known to implement `Display`, so log the count only.
    log::debug!("{} ({} parameters)", sql, parameters.len());

    // Pass the marker we were given on to the unwise callee.
    execute_sql(sql, parameters, unwise);
}
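The three cases can be put together in one sketch that compiles on its own. Several things here are assumptions and not part of the original: the `AsSql` trait and its `as_sql` method are stand-ins, `backend_execute` replaces `low_level_db::sql::execute` with a stub that returns a description of the call, `execute_sql_logged` is a made-up name for the logging variant, and every function returns a `String` only so the result can be inspected.

```rust
/// Stand-in for `AsSql`; the real trait is not shown in the text.
trait AsSql {
    fn as_sql(&self) -> String;
}

impl AsSql for &str {
    fn as_sql(&self) -> String {
        // Rendered as a quoted literal purely for display in this sketch.
        format!("'{}'", self.replace('\'', "''"))
    }
}

impl AsSql for u32 {
    fn as_sql(&self) -> String {
        self.to_string()
    }
}

/// Stub backend: describes the call instead of talking to a database.
fn backend_execute(sql: &str, parameters: &[&dyn AsSql]) -> String {
    let rendered: Vec<String> = parameters.iter().map(|p| p.as_sql()).collect();
    format!("{} [{}]", sql, rendered.join(", "))
}

struct SQLInjection(());

impl SQLInjection {
    fn trustme() -> Self {
        SQLInjection(())
    }
}

/// Unwise without calling unwise: runs whatever SQL it is given.
fn execute_sql(sql: &str, parameters: &[&dyn AsSql], _unwise: SQLInjection) -> String {
    backend_execute(sql, parameters)
}

/// Unwise and calling unwise: the marker is passed along, not re-created.
fn execute_sql_logged(sql: &str, parameters: &[&dyn AsSql], unwise: SQLInjection) -> String {
    eprintln!("executing: {} ({} parameters)", sql, parameters.len());
    execute_sql(sql, parameters, unwise)
}

/// Wise: the statement is a literal and user input travels as parameters.
fn update_username_safe(user_id: u32, user_name: &str) -> String {
    execute_sql_logged(
        "UPDATE Users SET UserName = ? WHERE Id = ?",
        &[&user_name, &user_id],
        // Wise because the SQL text is fixed at compile time.
        SQLInjection::trustme(),
    )
}
```

Only `update_username_safe` calls `trustme`, so that is the one place a reviewer needs to read the justification.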

Drawbacks

One important note is that unwise only works if you don’t have memory unsafety bugs. For example, you can simply create a SQLInjection using unsafe code. So if you suspect a bug caused by an unwise type, you have, in theory, to check all unsafe blocks in addition to wherever the type is constructed.
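To make the unsafe escape hatch concrete, here is a sketch of one way to forge a marker without calling `trustme`. The module name `sql` and the functions `needs_marker` and `forge_marker` are invented for the example.

```rust
mod sql {
    pub struct SQLInjection(());

    impl SQLInjection {
        pub fn trustme() -> Self {
            SQLInjection(())
        }
    }

    /// Consumes a marker; stands in for any unwise function.
    pub fn needs_marker(_unwise: SQLInjection) -> &'static str {
        "ran unwise code"
    }
}

/// Builds a marker without ever calling `trustme`.
fn forge_marker() -> sql::SQLInjection {
    // SAFETY (memory only): both types are zero-sized and have no invalid
    // values, so the transmute itself is sound. It still sidesteps the
    // audit trail that `trustme` is meant to provide.
    unsafe { std::mem::transmute::<(), sql::SQLInjection>(()) }
}
```

A search for `trustme` would miss `forge_marker` entirely, which is why the unsafe blocks have to be part of any audit.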

Additionally, there’s the potential for unmarked unwise functions, since this wasn’t baked into the language from early on, like unsafe was. Therefore, this works best for the unwise type being defined in the same crate that defines the ability to misuse it.

So in the case of a SQL injection type, it would be defined in a SQL library. The library would take care never to allow SQL injection through its own codebase, but it can’t guarantee anything about additional libraries. Any libraries that build on top of one that uses unwise should re-use the same definition consistently if needed.
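A rough sketch of that re-use, with modules playing the role of crates. Both `sql_lib` and `query_builder` are hypothetical names, as is `select_where`; `execute_sql` is simplified to return the SQL it was asked to run.

```rust
/// Plays the role of the SQL library crate that owns the marker.
mod sql_lib {
    pub struct SQLInjection(());

    impl SQLInjection {
        pub fn trustme() -> Self {
            SQLInjection(())
        }
    }

    /// Unwise entry point; returns the SQL it was asked to run.
    pub fn execute_sql(sql: &str, _unwise: SQLInjection) -> String {
        sql.to_string()
    }
}

/// Plays the role of a crate built on top of `sql_lib`.
mod query_builder {
    // Re-export the marker instead of defining a second, incompatible one.
    pub use super::sql_lib::SQLInjection;

    /// Unwise: splices `filter` into the statement, so it takes the marker.
    pub fn select_where(table: &str, filter: &str, unwise: SQLInjection) -> String {
        let sql = format!("SELECT * FROM {} WHERE {}", table, filter);
        super::sql_lib::execute_sql(&sql, unwise)
    }
}
```

Because `query_builder::SQLInjection` is the same type as `sql_lib::SQLInjection`, a marker created for one layer can be handed straight to the other.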