XPC: Architectural Support for Secure and Efficient Cross Process Call

This post contains my reading notes on XPC: Architectural Support for Secure and Efficient Cross Process Call, along with the process of building it.

Motivation

  • Traditional IPC remains critical for microkernels, yet it is inefficient
    • 18% - 39% of SQLite's time is spent on IPC
  • Most of the cycles of an IPC are spent on domain switching and message copying
    • domain switching includes context saving/restoring, capability checking and other IPC logic
    • message copying brings time-of-check-to-time-of-use (TOCTTOU) risks, TLB shootdowns and other costly operations

Figure (IPC-percentage): IPC occupies much of the CPU time (SQLite3 on seL4 on SiFive U500)

IPC in Zircon

Most system components run in userland. These components interact with each other via IPC.

Figure: Fuchsia's IPC usage

Namespace in Fuchsia

  • Each process has its own virtual filesystem, called a namespace
  • The namespace contains objects, which can be files, services, devices
  • These objects are ordered in a hierarchy, and accessible using file paths with the usual POSIX functions such as open()
  • The kernel has no understanding of the namespace, its hierarchy, or its objects. The only thing the kernel is aware of is handles.

Fuchsia’s Security Policies

  • Fuchsia uses ASLR (mandatory for userland), DEP, SafeStack, ShadowCallStack and AutoVarInit. The Zircon kernel is compiled with all of these by default.
  • Fuchsia has overloaded some basic programming constructs to prevent errors
    • e.g. [] in C++ performs mandatory range checks

Why IPC is so slow

  • Trap & Restore
    • Save the caller’s context and switch to the callee’s context
  • IPC Logic
    • Priority and capability checks
  • Process Switch
    • Memory accesses (user context, capabilities and the scheduling queue)
  • Message Transfer
    • zero-copy does not cover all circumstances

Figure: Clock cycles consumed by different phases of IPC

What is XPC

  • A hardware-assisted OS primitive
  • It has 4 goals:
    • direct switching without trapping into the kernel
    • secure zero-copy message passing
    • easy integration with existing kernels
    • minimal hardware modifications

How XPC Works

  • Hardware abstractions: x-entry and xcall-cap
    • Each x-entry has its own ID
    • xcall-cap for access control
  • New instructions that allow user-level code to directly switch across processes
  • New address-space mapping mechanism
    • Supports ownership transfer
    • One owner of the message at any time
  • Keeps the semantics of synchronous IPC
    • Zircon uses asynchronous IPC to simulate synchronous IPC
    • Improves the throughput of IPC with the relay-seg mechanism
    • Easier for multi-threaded applications because of the migrating-thread model

Design 1: XPC Engine

Figure: XPC-engine

  • The XPC Engine is the hardware support for two new primitives: x-entry and xcall-cap
  • An x-entry is bound to a procedure that can be invoked by other processes
    • A process can create multiple x-entries
    • All the x-entries are stored in an x-entry-table
    • A caller needs an xcall-cap to invoke an x-entry; it records the caller’s IPC capability for each x-entry
    • `xcall #reg`, where `#reg` holds an x-entry index

Design 2: Relay Segment

Design 3: Programming Model

Support For Microkernels

  • Capability Calculation
  • Split Thread State

Evaluation

  • Benchmarks: 5x - 141x IPC speedup for existing microkernels
  • Real applications: 1.6x - 12x speedup (SQLite, HTTP web server)
  • Hardware cost: 1.99% LUT overhead
