1 Best Practices in the Hypervisor

1.1 Handling unexpected conditions

1.1.1 Guidelines

Pass errors up the stack when the caller already expects to handle errors, and the state at the point where the error was discovered isn’t broken or isn’t too hard to fix.
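A minimal sketch of this first case, using entirely hypothetical names (demo_request, demo_set_slot(), demo_handle_request() and DEMO_NR_SLOTS are illustrative and not part of Xen). The helper validates its input before touching any state, so there is nothing to fix up on failure, and the caller already returns an int and simply propagates the error:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical guest-supplied request; not a real Xen structure. */
struct demo_request {
    unsigned int slot;
    unsigned long value;
};

#define DEMO_NR_SLOTS 8u

static unsigned long demo_table[DEMO_NR_SLOTS];

/*
 * Leaf helper: every check happens before any state is modified, so a
 * failure leaves demo_table[] exactly as it was.
 */
static int demo_set_slot(const struct demo_request *req)
{
    if ( req == NULL )
        return -EINVAL;

    if ( req->slot >= DEMO_NR_SLOTS )
        return -ERANGE;

    demo_table[req->slot] = req->value;

    return 0;
}

/* The caller already expects errors, so it passes them up unchanged. */
static int demo_handle_request(const struct demo_request *req)
{
    int rc = demo_set_slot(req);

    if ( rc )
        return rc;

    return 0;
}
```

Because no state is modified before the checks, no cleanup path is needed here. If validation had to happen partway through an update, the helper would also have to undo the partial update before returning.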

domain_crash() should be used when passing errors up the stack is too difficult, and/or when fixing up the state of a guest is impractical, but where fixing up the state of Xen will allow Xen to continue running. This is particularly appropriate when the guest is exhibiting behavior that a well-behaved guest shouldn’t.
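The following sketch illustrates the shape of such a path. It does not use the real Xen API: struct demo_domain and demo_domain_crash() are hypothetical stand-ins that only record the decision to crash, and the event queue is invented for illustration. The function has no way to return an error, the guest has done something a well-behaved guest never would, and the hypervisor-side state is cheap to put back into a sane shape:

```c
#include <stdbool.h>

/* Hypothetical stand-in for a guest domain; not Xen's struct domain. */
struct demo_domain {
    bool crashed;             /* set once the domain has been crashed */
    unsigned int pending;     /* events currently queued by the guest */
    unsigned int max_pending; /* limit a well-behaved guest never exceeds */
};

/* Hypothetical stand-in for domain_crash(): only records the decision. */
static void demo_domain_crash(struct demo_domain *d)
{
    d->crashed = true;
}

/* A void function: there is no return value to carry an error upwards. */
static void demo_queue_event(struct demo_domain *d)
{
    /* Nothing more to do for a domain that has already been crashed. */
    if ( d->crashed )
        return;

    if ( d->pending >= d->max_pending )
    {
        /*
         * Misbehaving guest. Restore the hypervisor-side state first,
         * then crash the guest rather than trying to repair it.
         */
        d->pending = 0;
        demo_domain_crash(d);
        return;
    }

    d->pending++;
}
```

The hypervisor-side bookkeeping is reset before the crash is recorded, so the rest of the system keeps running with consistent state even though this guest does not.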

BUG_ON() should be used when you can’t pass errors up the stack, and either continuing or crashing the guest would likely cause an information leak or privilege escalation vulnerability.

ASSERT() IS NOT AN ERROR HANDLING MECHANISM. ASSERT() is a way to move detection of a bug earlier in the programming cycle; it is a more-noticeable printk. It should only be added after one of the other three error-handling mechanisms has been evaluated for reliability and security.
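As a rough illustration of the last point, the sketch below pairs an assertion with real error handling. DEMO_ASSERT() is a hypothetical stand-in, not Xen's ASSERT(); it assumes an assertion that is only active in debug builds (here, when DEMO_DEBUG is defined) and compiles to nothing otherwise. demo_read_byte() and DEMO_BUF_LEN are likewise invented for the example:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for ASSERT(). Assumption: the check exists only
 * in debug builds (DEMO_DEBUG defined) and disappears otherwise.
 */
#ifdef DEMO_DEBUG
#define DEMO_ASSERT(p)                                          \
    do {                                                        \
        if ( !(p) )                                             \
        {                                                       \
            fprintf(stderr, "Assertion '%s' failed\n", #p);     \
            abort();                                            \
        }                                                       \
    } while ( 0 )
#else
/* Keeps the expression type-checked without ever evaluating it. */
#define DEMO_ASSERT(p) do { if ( 0 && (p) ) { } } while ( 0 )
#endif

#define DEMO_BUF_LEN 16u

static int demo_read_byte(const unsigned char *buf, unsigned int idx,
                          unsigned char *out)
{
    /* Debug builds: the bug is noticed here, loudly and early. */
    DEMO_ASSERT(idx < DEMO_BUF_LEN);

    /* All builds: the actual handling, chosen independently of the assert. */
    if ( idx >= DEMO_BUF_LEN )
        return -ERANGE;

    *out = buf[idx];

    return 0;
}
```

With the assertion compiled out, an out-of-range index is still rejected by the explicit check. Removing that check and relying on DEMO_ASSERT() alone would leave non-debug builds reading past the buffer.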

1.1.2 Rationale

It’s frequently the case that code is written with the assumption that certain conditions can never happen. There are several possible actions programmers can take in these situations:

In selecting which response to use, we want to achieve several goals:

The guidelines above attempt to balance these:

Note, however, that domain_crash() has its own traps: callers far up the call stack may not realize that the domain is now dying as a result of an innocuous-looking operation, particularly if no error is returned somewhere on the call stack between the initial function call and the failure. Using domain_crash() requires careful inspection and documentation of the code to make sure that all callers on the stack handle a newly-dead domain gracefully.
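The trap described above can be sketched as follows. All names are hypothetical: struct demo_domain, its crashed flag and demo_domain_crash() stand in for the real domain state and domain_crash(), and the choice of -EINVAL is arbitrary. Two void functions sit between the top-level operation and the failure, so the only way the top-level caller can learn that the domain died is to check explicitly:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a guest domain; not Xen's struct domain. */
struct demo_domain {
    bool crashed;      /* set once the domain has been crashed */
    unsigned int refs; /* references the guest currently holds */
};

/* Hypothetical stand-in for domain_crash(): only records the decision. */
static void demo_domain_crash(struct demo_domain *d)
{
    d->crashed = true;
}

/* Innocuous-looking helper. NB: may crash @d on reference underflow. */
static void demo_drop_ref(struct demo_domain *d)
{
    if ( d->refs == 0 )
    {
        demo_domain_crash(d);
        return;
    }

    d->refs--;
}

/* Intermediate layer returning void, so the failure is invisible here. */
static void demo_release(struct demo_domain *d)
{
    demo_drop_ref(d);
}

/*
 * Top-level operation. NB: demo_release() may crash @d, so check before
 * continuing to operate on the domain.
 */
static int demo_do_op(struct demo_domain *d)
{
    demo_release(d);

    if ( d->crashed )
        return -EINVAL;

    return 0;
}
```

The comments on demo_drop_ref() and demo_do_op() are as much a part of the sketch as the check itself: without them, a later caller of demo_release() has no hint that the domain may be dead when the call returns.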