In one of the mini demos (D-P13), it was shown that it’s possible to catch the SIGSEGV signal, which is sent after a segmentation fault, and continue a program’s execution. My question is, are there any situations where it would be useful to create your own handler for this signal, such as one that continues the program rather than exiting? What would the handler function have to do to ensure the program could continue successfully?
I'll provide 3 examples.
(1) One example is distributed shared memory (DSM). Suppose we want to simulate a shared global address space across processes running on different machines. When a process accesses an address within this address space that is not yet mapped locally, the access faults, and a SIGSEGV signal handler can map memory at that address, possibly after fetching the data the process expected there over the network. When the handler returns, the faulting instruction is re-executed and now succeeds, so the process resumes as if nothing had happened.
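To make the handler's job concrete, here is a minimal single-machine sketch of that idea in C. It is not a real DSM system: all names (dsm_init, dsm_region, dsm_fetch_page) are made up for illustration, and the "network fetch" is replaced by filling the page with a fixed byte. It assumes Linux, reserves the shared region up front with PROT_NONE so every first touch faults, and calls mmap from the handler, which works on Linux in practice but is not on the POSIX list of async-signal-safe functions.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one reserved "shared" region. */
static char  *dsm_base;
static size_t dsm_size;
static size_t dsm_page;

/* Stand-in for "fetch this page from the remote owner":
 * here we simply fill the page with a known byte. */
static void dsm_fetch_page(char *page, size_t len) {
    memset(page, 0x2a, len);
}

static void dsm_on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)ctx;
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)dsm_base;

    /* A fault outside our region is a genuine bug: restore the default
     * action and return, so the re-executed access kills the process. */
    if (dsm_base == NULL || addr < base || addr >= base + dsm_size) {
        signal(sig, SIG_DFL);
        return;
    }

    /* Map a real, writable page over the faulting page, then fill it. */
    size_t off  = (size_t)(addr - base);
    char  *page = dsm_base + (off - off % dsm_page);
    if (mmap(page, dsm_page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
        _exit(1);
    dsm_fetch_page(page, dsm_page);
    /* Returning re-executes the faulting instruction, which now succeeds. */
}

/* Reserve `size` bytes of inaccessible address space and install the handler.
 * Returns 0 on success, -1 on failure. */
int dsm_init(size_t size) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    void *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;
    dsm_base = base;
    dsm_size = size;
    dsm_page = (size_t)page;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = dsm_on_segv;
    sa.sa_flags = SA_SIGINFO;   /* needed to receive si_addr */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Start of the reserved region, or NULL before dsm_init succeeds. */
char *dsm_region(void) {
    return dsm_base;
}
```

After dsm_init(1 << 16) succeeds, reading or writing any byte of dsm_region() triggers one fault per page, and the code that touched the memory never notices. A real DSM would also have to track page ownership and invalidate or write back pages, which this sketch leaves out.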
(2) There's also a line of research called "failure-oblivious computing"; see, for instance, "Automatic Runtime Error Repair and Containment via Recovery Shepherding" by Long et al. (PLDI '14). In the authors' words:
"We present a new system, RCV, which enables applications to recover from divide-by-zero and null-dereference errors, continue on with their normal execution path, and productively serve the needs of their users despite the presence of such errors. RCV replaces the standard divide-by-zero (SIGFPE) and segmentation violation (SIGSEGV) signal handlers with its own handlers."
(3) A third example is (one possible) implementation of NullPointerException in Java. A JVM, which is required to turn null pointer dereferences into exceptions, has two choices: (a) check before each dereference whether the pointer is null and branch to the exception path if so, or (b) just do the access and, if the pointer was null, end up in the SIGSEGV signal handler and dispatch to the exception path from there. Since the explicit check would sit on the hot path while null dereferences are rare, it makes sense to just let the program trap.
You can test this on rlogin:
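import java.util.Random;

public class Null
{
    // Dereferences 'a' 10000 times; on rare iterations 'a' is null.
    static int loopdeloop(Random rnd) {
        int sum = 0;
        for (int i = 0; i < 10000; i++) {
            Object a = new Object();
            int r = rnd.nextInt();
            // Only a handful of the 2^32 possible values land in this
            // range, so the null case is very rare.
            if (33 <= r && r <= 50)
                a = null;
            sum += a.hashCode();    // throws NullPointerException if a == null
        }
        return sum;
    }

    public static void main(String[] av) {
        Random rnd = new Random(42);
        try {
            // Keep going until the rare null dereference finally happens.
            while (true)
                loopdeloop(rnd);
        } catch (NullPointerException e) {
            System.out.println("Caught NPE");
        }
    }
}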
Compile it with javac Null.java, then run it under strace:
$ strace -ff -o /tmp/X java Null
Caught NPE
and you'll see this in one of the per-thread trace files (/tmp/X.<pid>) that strace writes:
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x8} ---
rt_sigreturn({mask=[QUIT]}) = 0
write(1, "Caught NPE", 10) = 10
write(1, "\n", 1) = 1
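This indicates a segfault at address 0x8 (a null pointer plus a small offset, where the vtable pointer would be expected). The rt_sigreturn that follows shows the JVM's signal handler returning, after which execution continues on the exception path and prints the message.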
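The trap-then-dispatch idea from (3) can be imitated in plain C. The following is only a sketch under stated assumptions, not how any JVM is actually implemented: the names npe_install_handler and npe_checked_read are invented for illustration, and the "exception path" is simply a sigsetjmp recovery point that the handler jumps back to with siglongjmp. Dereferencing an invalid pointer is undefined behavior in C, so this relies on the platform (e.g. Linux) turning the access into a SIGSEGV.

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf npe_recovery_point;
static volatile sig_atomic_t npe_armed = 0;

static void npe_on_segv(int sig) {
    /* Fault inside a guarded access: jump to the "exception path". */
    if (npe_armed)
        siglongjmp(npe_recovery_point, 1);
    /* Any other fault is a real bug: fall back to the default action,
     * so the re-executed access terminates the process. */
    signal(sig, SIG_DFL);
}

/* Install the SIGSEGV handler. Returns 0 on success, -1 on failure. */
int npe_install_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = npe_on_segv;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Read *p into *out without checking p first.
 * Returns 0 on success, -1 if the access trapped. */
int npe_checked_read(const int *p, int *out) {
    /* Going through a volatile pointer variable keeps the compiler from
     * reasoning about p's value and optimizing the access away. */
    const int *volatile vp = p;

    /* Second argument 1: save the signal mask, so that SIGSEGV (blocked
     * while the handler runs) is unblocked again after siglongjmp. */
    if (sigsetjmp(npe_recovery_point, 1) != 0) {
        npe_armed = 0;
        return -1;              /* arrived here from the handler */
    }
    npe_armed = 1;
    *out = *vp;                 /* no explicit null check: just trap */
    npe_armed = 0;
    return 0;
}
```

After a successful npe_install_handler(), npe_checked_read(NULL, &v) returns -1 instead of crashing, and a later call with a valid pointer works normally. The fast path contains no comparison against NULL at all, which is the point of option (b). The global recovery point makes this sketch single-threaded; a real runtime would keep such state per thread.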