
Safety-Critical Software

NASA's Gerard Holzmann lays out a set of rules in the paper “The Power of 10: Rules for Developing Safety-Critical Code” that, when followed, help increase confidence in the correctness of software. This aids the development of safety-critical software systems by reducing ambiguity and complexity as much as possible. The rules are designed to make software easier to understand, analyse, and verify in safety-critical applications where failures can have severe consequences.

Rules

  1. All code must be restricted to a simple control flow that never includes goto statements, setjmp or longjmp constructs, or direct or indirect recursion.
  2. All loops must have a fixed upper bound on the number of iterations, regardless of the input data, making it trivially possible to prove statically that the loop cannot exceed a preset upper bound.
  3. Dynamic memory allocation is never allowed after initialization.
  4. No function should be longer than can be printed on a single sheet of paper, using one line per statement and declaration.
  5. Code assertion density should average at least two assertions per function. Assertions must check for anomalous conditions that should never occur during normal execution. All assertions must be side-effect-free and defined as Boolean tests. When an assertion fails, an explicit recovery action must be taken, such as returning an error condition to the caller. Any assertion that a static checking tool can prove can never fail or can never hold violates this rule.
  6. Declare all data objects at the smallest possible level of scope.
  7. A calling function must always check the return value of non-void functions, and all called functions must check the validity of all parameters provided by the caller.
  8. The use of the preprocessor must be limited to the inclusion of header files and simple macro definitions. Token pasting, variable argument lists (ellipses), and recursive macro calls are never allowed. All macros must expand into complete syntactic units. The use of conditional compilation directives must be kept to a minimum.
  9. The use of pointers must be restricted. Specifically, no more than one level of dereference should be used. Pointer dereference operations may not be hidden in macro definitions or inside typedef declarations. Function pointers are never permitted.
  10. All code must be compiled, from the first day of development, with all compiler warnings enabled in the most pedantic mode available. All code must compile without warnings. A daily check of all code with a static source code analyser must be performed, and all analyses must pass with zero warnings.

Real World Consequences

When safety-critical software fails, real-world consequences are realised. Some notable failures include the Therac-25 radiation therapy machine, whose race conditions led to lethal radiation overdoses; the Ariane 5 Flight 501 launch, destroyed after an unhandled floating-point conversion overflow; and the Patriot missile battery at Dhahran, whose accumulated clock drift caused it to miss an incoming missile.

Standards and Certification Processes

Standards and certification processes for safety-critical software are crucial to ensure correctness and reliable operation. Domain-specific standards such as DO-178C for aerospace, IEC 62304 for medical devices, and ISO 26262 for automotive software provide comprehensive guidelines, rules, and design processes across a system’s lifecycle, including requirements traceability, software design, coding practices, testing, and verification. Adherence to these standards is mandatory before a system can be certified for use in safety-critical contexts. Rigorous documentation, testing, and independent verification are required to ensure that the software meets the necessary safety requirements.

Implementation Example

An abstracted example of a software system that averages a sensor’s readings and reports the calculated value at a fixed interval can illustrate the importance of following an appropriate standard. When the system’s output is used for safety-critical decision making, it is vital to be able to verify the correctness of the software to increase confidence in the system’s ability to perform its intended function without failure. The following code snippets illustrate both a poor implementation in which little confidence can be placed, and an alternative implementation that increases confidence by adhering to the rules outlined above.

Never implement systems in ways like the following:
#include <stdio.h>
#include <stdlib.h>

int *samples = NULL;
int idx = 0;

void process_sensor(void) {
    while (1) {
        int v;
        if (scanf("%d", &v) != 1) break;

        if (v < 0) goto report;

        if (!samples) samples = malloc(sizeof(int) * 1000);
        samples[idx++] = v;
    }

    report:
    long sum = 0;
    for (int i = 0; i < idx; ++i) sum += samples[i];
    double avg = (double)sum / idx;
    printf("Avg: %f\n", avg);
}

int main() {
    process_sensor();
    free(samples);
    return 0;
}

Here a goto statement is used to jump to the report section, which makes the control flow harder to reason about. The unbounded loop can lead to resource exhaustion if the input is large. Dynamic memory allocation is used without a robust strategy for deallocation, which can lead to memory leaks. The function is long and includes multiple responsibilities, making it harder to understand and verify. Variables are not declared at the smallest scope, and the index is hidden and unchecked, which can lead to buffer overflows. There are no assertions or checks for anomalous conditions; when runtime errors occur the code lacks a clear recovery strategy. The return value of scanf is not handled robustly, which can lead to incorrect processing of input data.

Implementations such as the following increase confidence in the system’s correctness:
#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

#define SAMPLE_COUNT 16

static int sensor_read(void) {
    int v;
    if (scanf("%d", &v) != 1) return INT32_MIN;
    return v;
}

#define CHECK(cond, action) do { if (!(cond)) { action; } } while (0)

int compute_average(const int buf[], size_t n, double *out) {
    CHECK(buf != NULL, return -1);
    CHECK(out != NULL, return -1);
    CHECK(n > 0, return -2);

    long sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += buf[i];
    }
    *out = (double)sum / (double)n;
    return 0;
}

int main(void) {
    int samples[SAMPLE_COUNT];
    size_t count = 0;

    for (size_t i = 0; i < SAMPLE_COUNT; ++i) {
        int v = sensor_read();
        if (v == INT32_MIN) {
            break;
        }
        samples[count++] = v;
    }

    double avg;
    int rc = compute_average(samples, count, &avg);
    if (rc != 0) {
        fprintf(stderr, "compute_average failed (rc=%d)\n", rc);
        return 1;
    }

    printf("Average over %zu samples: %.3f\n", count, avg);
    return 0;
}

No goto statements are used, creating a clearer control flow. SAMPLE_COUNT establishes a fixed upper bound on the number of loop iterations, so it can be statically proven that the loop cannot run long enough to exhaust resources. Dynamic memory allocation is avoided entirely by using the fixed-size array samples. Functions are short and focused, each with a single responsibility. Assertions are implemented with the CHECK macro, which tests for anomalous conditions and takes an explicit recovery action by returning an error code. Variables are declared at the smallest possible scope, and all return values are checked so that errors are handled appropriately. Preprocessor use is limited to a simple macro definition for assertions, and pointer use is restricted to a single level of dereference. The code is written to compile cleanly with all warnings enabled, supporting static analysis and adherence to the rules outlined above.

When compiling this code, the most pedantic warnings should be enabled, and warnings should be treated as errors so that any warning fails the build:

gcc -std=c11 -Wall -Wextra -pedantic -Werror -o safe_avg safe_avg.c
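Rule 10 also calls for a daily pass of a static source code analyser with zero warnings. Any analyser the project standardises on will do; as one illustrative option, the open-source tool cppcheck can be invoked as:

```shell
# Illustrative daily static-analysis pass: enable all checks and
# return a non-zero exit code if any issue is reported.
cppcheck --enable=all --error-exitcode=1 safe_avg.c
```

Wiring a command like this into the build or CI pipeline makes the zero-warning requirement enforceable rather than aspirational.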